Giving a chatbot memory undoubtedly enhances the user experience. Rethinking the memory capabilities of LLMs, which is far from simple, is exactly what a project called MemGPT sets out to accomplish, as outlined here. Until now, LLMs have been limited by how long a conversation they can hold in context. An illustration of this limitation is provided below:

MemGPT’s objective is to mitigate these limitations by emulating the memory management systems found in computer operating systems. MemGPT equips fixed-context Large Language Models (LLMs) with a tiered memory system, enabling the LLM to manage its own memory.
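The OS analogy can be made concrete with a toy sketch: a small "main context" that the LLM actually sees, backed by unbounded external storage that older messages are paged out to, much like an OS paging memory to disk. The class and method names below are illustrative assumptions, not the actual MemGPT API:

```python
from collections import deque

class TieredMemory:
    """Toy sketch of MemGPT-style tiered memory (illustrative, not the real API)."""

    def __init__(self, max_context_tokens: int):
        self.max_context_tokens = max_context_tokens
        self.main_context = deque()   # what the LLM actually sees
        self.external_context = []    # unbounded archival storage

    def append(self, message: str):
        self.main_context.append(message)
        # Evict oldest messages to external storage when over budget,
        # analogous to an OS paging memory out to disk.
        while self._tokens(self.main_context) > self.max_context_tokens:
            self.external_context.append(self.main_context.popleft())

    def search_external(self, query: str):
        # Naive substring retrieval standing in for semantic search.
        return [m for m in self.external_context if query.lower() in m.lower()]

    @staticmethod
    def _tokens(messages) -> int:
        # Crude token estimate: whitespace word count.
        return sum(len(m.split()) for m in messages)


mem = TieredMemory(max_context_tokens=6)
mem.append("my dog is called Rex")
mem.append("let's talk about the weather")
mem.append("it is sunny today")
# The oldest message no longer fits in main context, but the fact survives:
print(mem.search_external("rex"))
```

The point of the sketch is only the shape of the mechanism: a fixed window plus a retrieval path into evicted history, so no information is permanently lost when the context fills up.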

! The MemGPT team intends to integrate the project with AutoGen and open-source LLMs.


A demonstration of the deep memory retrieval task, which shows the distinctive value MemGPT brings:

A real-life example illustrating the enhanced chat experience MemGPT provides (MemGPT in action):


Key parts of MemGPT:

  • Main Context: The fixed-length input that the LLM receives. MemGPT parses the LLM’s text outputs and either yields control or executes a function call.
  • Function Calls: These move data between the main context and external context. When the LLM generates a function call, it can request that execution return to it immediately, allowing it to chain functions together.
  • Yield: When the LLM yields, it does not run again until triggered by an external event, such as a user message.
  • Virtual Context: MemGPT manages a virtual context, inspired by virtual memory in operating systems, to create an unbounded context for LLMs.
  • Applications: It is particularly useful in tasks like document analysis and multi-session chat where limited context windows of modern LLMs are a handicap.
  • Inspiration: The system draws inspiration from hierarchical memory systems in traditional operating systems.
  • Citation: The work is published as an arXiv preprint and is authored by Charles Packer, Vivian Fang, Shishir G. Patil, Kevin Lin, Sarah Wooders, and Joseph E. Gonzalez.
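The parse / function-call / yield loop described above can be sketched in a few lines. The JSON shape, the `request_heartbeat` flag, and the `archival_insert` function are illustrative assumptions for the sketch, not MemGPT's actual interface:

```python
import json

def process_llm_output(raw_output: str, functions: dict):
    """Toy sketch of the control flow: if the LLM emitted a JSON function
    call, execute it; if the call requests a heartbeat, return control to
    the LLM so calls can be chained. Otherwise yield until an external
    event (e.g. a user message) arrives."""
    try:
        call = json.loads(raw_output)
    except json.JSONDecodeError:
        return ("yield", raw_output)  # plain text: wait for the next event

    result = functions[call["function"]](**call.get("arguments", {}))
    if call.get("request_heartbeat"):
        return ("continue", result)   # chain: run the LLM again immediately
    return ("yield", result)

# Illustrative function that moves data from main to external context.
external_context = []

def archival_insert(content: str) -> str:
    external_context.append(content)
    return "OK"

state, _ = process_llm_output(
    json.dumps({"function": "archival_insert",
                "arguments": {"content": "user likes hiking"},
                "request_heartbeat": True}),
    {"archival_insert": archival_insert},
)
# state == "continue": the LLM is invoked again so it can chain another call
```

The heartbeat mechanism is what lets a single user message trigger a sequence of memory operations (search, then insert, then reply) before control finally yields back to the user.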

A comprehensive explanation (32-minute video):

The project: