<aside> 💡 Every LLM has an upper bound on its short-term memory, referred to as its ‘Context Window’. It’s defined as the “maximum number of combined input & output tokens.”
⛔️: you can exceed this limit in the course of a long back-and-forth chat, and then things get very weird!
</aside>
<aside> 💡 This upper bound limits a model’s ability to hold a lucid back-and-forth conversation: once the window is exceeded, the system begins forgetting what was said way up at the top of the conversation. Context Windows are getting bigger all the time, and this will eventually be a non-issue. But for now, be aware of these constraints.
Artificial Analysis keeps a running list comparing the major providers and their respective attributes, like Context Window size.
</aside>
<aside> 💡 NOTE: a ‘token’ is roughly equal to a ‘word’, but they are not the same thing. If you want to know how many tokens your large PDF file will take up, OpenAI provides a handy/dandy tool called the Tokenizer. You can paste in your text, and it will show you exactly how it cuts words up into the tokens that LLMs consume and output. (For a programmatic version, see the sketch after this note.)
</aside>
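The Tokenizer also has a programmatic counterpart: OpenAI’s open-source `tiktoken` library. Here is a minimal sketch of counting tokens and checking a prompt against a context window before sending it. The `cl100k_base` encoding is the one used by several recent OpenAI models (other providers tokenize differently), and the specific limit numbers below are hypothetical placeholders; check your model’s documentation.

```python
import tiktoken

# cl100k_base is the encoding used by several recent OpenAI models;
# other model families use different encodings, so counts are approximate.
enc = tiktoken.get_encoding("cl100k_base")

def count_tokens(text: str) -> int:
    """Return the number of tokens this encoding produces for the text."""
    return len(enc.encode(text))

# The context window covers combined input AND output tokens,
# so leave headroom for the model's reply.
CONTEXT_WINDOW = 128_000       # hypothetical limit for illustration
RESERVED_FOR_OUTPUT = 4_000    # headroom for the model's answer

prompt = "Summarize the attached quarterly report..."
if count_tokens(prompt) > CONTEXT_WINDOW - RESERVED_FOR_OUTPUT:
    print("Prompt too long; chunk or summarize it first.")
else:
    print(f"Prompt uses {count_tokens(prompt)} tokens; safe to send.")
```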
The context window limitation can cause AI models to miss crucial contextual information, leading to misinterpretations or incomplete insights. This is particularly risky in industries where decisions based on AI analysis carry significant financial or operational impact.
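One tactical way to manage the overflow problem described above is a sliding window over the conversation history: keep only the most recent messages that fit a token budget, so the conversation never silently exceeds the limit. This is a minimal sketch, not a full solution; real chat APIs use role/content message objects with some per-message token overhead, which is ignored here.

```python
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

def trim_history(messages: list[str], budget: int) -> list[str]:
    """Keep the newest messages whose combined token count fits the budget.

    Walks backward from the most recent message, stops before overflow,
    then restores chronological order.
    """
    kept: list[str] = []
    used = 0
    for msg in reversed(messages):
        cost = len(enc.encode(msg))
        if used + cost > budget:
            break
        kept.append(msg)
        used += cost
    return list(reversed(kept))

history = ["...long opening question...", "...reply...", "...follow-up..."]
print(trim_history(history, budget=2_000))
```

Note the trade-off: this is exactly the “forgetting” behavior described earlier, since the oldest turns get dropped first. Summarizing earlier context before trimming is a common refinement.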
To mitigate these risks, businesses should: