The Monte Carlo Transformer: a stochastic self-attention model for sequence prediction