A jargon-free explanation of how AI large language models work

Want to really understand large language models? Here’s a gentle primer.

No one, not even the experts, has a complete grasp of the inner workings of LLMs, but researchers are working to gain a better understanding. This article tries to explain what is known about large language models (LLMs) like ChatGPT without using technical jargon or advanced math.
Some takeaways from the article:
- LLMs are trained on billions of words of ordinary language. Training boils down to predicting the next word in a sequence from the words that came before it (a toy sketch of this setup appears after this list).
- Word vectors are long lists of numbers that represent words in a way that lets LLMs reason about language (see the second sketch below).
- Transformers are the basic building blocks of LLMs; they use a technique called attention to learn how the words in a sentence or passage relate to one another (see the third sketch below).
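To make the first takeaway concrete, here is a minimal sketch in plain Python of how ordinary text can be turned into next-word prediction examples. The sentence and the three-word context window are invented for illustration; real LLMs train on vastly more text with much longer contexts.

```python
# A toy illustration (not a real training pipeline) of how text
# becomes next-word prediction examples. The sentence and window
# size are assumptions made for this sketch.
text = "the cat sat on the mat because the cat was tired".split()

window = 3  # hypothetical context length; real models use thousands of words
examples = []
for i in range(len(text) - window):
    context = text[i : i + window]  # the words seen so far
    target = text[i + window]       # the word the model must learn to predict
    examples.append((context, target))

for context, target in examples[:3]:
    print(context, "->", target)
# ['the', 'cat', 'sat'] -> on
# ['cat', 'sat', 'on'] -> the
# ['sat', 'on', 'the'] -> mat
```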
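The second takeaway can be illustrated with a toy example: each word maps to a list of numbers, and words with similar meanings end up with lists that point in similar directions. The four-dimensional vectors below are made up for the demonstration; real LLMs learn vectors with hundreds or thousands of dimensions.

```python
import math

# Made-up 4-dimensional word vectors; real LLM vectors are learned,
# not hand-written, and are far longer.
vectors = {
    "cat":    [0.9, 0.1, 0.8, 0.2],
    "kitten": [0.8, 0.2, 0.7, 0.3],
    "car":    [0.1, 0.9, 0.2, 0.7],
}

def cosine_similarity(a, b):
    """Measure how closely two vectors point the same way (1.0 = identical direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

print(cosine_similarity(vectors["cat"], vectors["kitten"]))  # high: similar meanings
print(cosine_similarity(vectors["cat"], vectors["car"]))     # lower: unrelated meanings
```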
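Finally, a bare-bones sketch of the attention idea behind the third takeaway: every word's vector is scored against every other word's vector, and those scores decide how much each word "pays attention to" the others. The two-dimensional vectors here are invented, and the learned query/key/value projections that real transformers use are omitted to keep the example short.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# One invented 2-dimensional vector per word in "the cat sat".
words = ["the", "cat", "sat"]
vectors = [[0.2, 0.1], [0.9, 0.6], [0.7, 0.8]]

for i, word in enumerate(words):
    # Score this word's vector against every word's vector (dot product)...
    scores = [sum(a * b for a, b in zip(vectors[i], other)) for other in vectors]
    # ...convert the scores into attention weights...
    weights = softmax(scores)
    # ...and blend all the vectors according to those weights.
    blended = [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(2)]
    print(word, "->", [round(x, 2) for x in blended],
          "via weights", [round(w, 2) for w in weights])
```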