Introduction to transformer neural networks

In my last two articles of the series "what it takes to learn transformer model for NLP", we discussed the shortcomings of simple encoder-decoder models and, in part -11 of the article, how the attention mechanism addresses them. In this article, we will discuss the transformer at a high level; in the subsequent articles we will dive deep into the internals of the transformer model.

Introduction to transformer - Part 1
