Large Language Models History
What are Language Models?
A language model (LM) is a model that predicts the next word given the sequence of words that precede it.
Evolution of Language Models
The development of LMs can be broadly classified into five stages.
- Rule-based LM
- Statistical LM
- Neural LM
- Pre-Trained LM
- Large Language Models (LLM)
Rule-based Language Models
- Grammatical rules of a specific language were used to predict the next word in a sentence.
- E.g., in English, "I" will be followed by "am", not "are", and "They" can be followed by "have" or "are", following grammatical rules like these.
- However, there are many exceptions, and handling all the language rules is tricky.
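The rule-based idea above can be illustrated with a minimal sketch. The `RULES` table and both helper functions are hypothetical names made up for this example, and the table only encodes the two pronoun rules mentioned above; a real rule-based system would need far more rules and exception handling.

```python
# Toy hand-written rules (an assumption for illustration only):
# each subject pronoun maps to the words allowed to follow it.
RULES = {
    "i": {"am"},
    "they": {"have", "are"},
}


def allowed_next_words(previous_word):
    """Return the set of words the rules allow after `previous_word`."""
    return RULES.get(previous_word.lower(), set())


def is_allowed(previous_word, candidate):
    """Check whether `candidate` may follow `previous_word` under the rules."""
    return candidate.lower() in allowed_next_words(previous_word)


print(is_allowed("I", "am"))
print(is_allowed("I", "are"))
print(allowed_next_words("We"))
```

Any word without a hand-written entry (such as "We" here) yields an empty set, which hints at why covering a whole language with rules is hard.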
Statistical Language Models
- In this method, a large set of texts was analyzed, and the probability of each word appearing after a given sequence of words was estimated statistically.
- For example, how many times does "am" appear after "I"? That probability is compared with that of other words like "are" or "is".
- In a more advanced n-gram SLM, instead of estimating the probability from the single previous word, the preceding two words (a bigram) or three words (a trigram) were used to estimate the probability of the next word.
- However, in English, a single word can have multiple meanings depending on the context of the sentence. An SLM is not able to determine the context of the sentence.
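The counting approach described above can be sketched roughly as follows. This is a minimal illustration that conditions on the single previous word, not a production n-gram model: the function names and the `toy_corpus` sentences are assumptions invented for the example, and no smoothing is applied.

```python
from collections import Counter, defaultdict


def train_bigram_counts(corpus):
    """Count how often each word follows each preceding word."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts


def next_word_probabilities(counts, prev):
    """Convert the raw counts of words seen after `prev` into relative frequencies."""
    followers = counts.get(prev.lower(), Counter())
    total = sum(followers.values())
    if total == 0:
        return {}  # `prev` was never seen with a following word
    return {word: n / total for word, n in followers.items()}


# Hypothetical toy corpus, used only for illustration.
toy_corpus = [
    "I am happy",
    "I am here",
    "I was late",
    "they are here",
]

counts = train_bigram_counts(toy_corpus)
probs = next_word_probabilities(counts, "I")
print(probs)
```

In this toy corpus, "am" follows "I" in two of three cases, so it gets the highest probability, while "are" never follows "I" and gets no entry at all. Extending the context to the preceding two or three words would mean keying `counts` on tuples of words instead of a single word.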
Neural Language Models
- Building on word embeddings such as Word2Vec (Word to Vector), these models use neural networks to calculate the probability of the next word.
- Examples: RNN (Recurrent Neural Network), LSTM (Long Short-Term Memory)
Pre-Trained Language Models
- ELMo (context-aware word embeddings) and self-attention through the Transformer architecture raised the performance bar on NLP tasks. Examples: BERT and GPT-2
- Models were trained on a large amount of text, which increased their context awareness.
Large Language Models (LLM)
- There is a thin line between PLM and LLM.
- Scaling up the model size and training data size of PLMs revealed new emergent abilities. Examples: ChatGPT, LLaMA, Claude
- LLMs differ from PLMs broadly in three ways:
  - Emergent abilities
  - Prompting/conversational interface
  - To attain this scale, both engineering and research problems must be solved.