Selected publications
The Monte Carlo Transformer: a stochastic self-attention model for sequence prediction. HAL (Le Centre pour la Communication Scientifique Directe), 2020.