Revolutionary AI Memory System Unveiled

2025. June 16. 07:00 UTC, Attila Fodor

Large Language Models (LLMs) are central to the pursuit of Artificial General Intelligence (AGI), yet they remain limited in how they manage memory. Today’s LLMs rely on knowledge embedded in their fixed weights plus a limited context window at inference time, which makes it hard to retain or update information over extended periods. Approaches such as Retrieval-Augmented Generation (RAG) bring in external knowledge, but they frequently lack a structured approach to memory. The result is forgotten past interactions, reduced adaptability, and memory that stays siloed on each platform. In short, current LLMs do not treat memory as a persistent, manageable, or shareable resource, which constrains their practical utility.

In response to these challenges, researchers from MemTensor (Shanghai) Technology Co., Ltd., Shanghai Jiao Tong University, Renmin University of China, and the Research Institute of China Telecom have collaboratively developed MemOS. This novel memory operating system positions memory as a fundamental resource within language models. A key component of MemOS is MemCube, a unified abstraction that oversees parametric, activation, and plaintext memory. MemOS facilitates structured, traceable, and cross-task memory handling, allowing models to continuously adapt, internalize user preferences, and maintain consistent behavior. This represents a significant shift, transforming LLMs from static generators into dynamic, evolving systems capable of long-term learning and coordination across various platforms.

As AI systems become increasingly complex, handling a diverse range of tasks, roles, and data types, language models must advance beyond mere text comprehension to encompass memory retention and continuous learning. Current LLMs’ deficiency in structured memory management restricts their capacity for adaptation and growth over time. MemOS addresses this by treating memory as a core, schedulable resource, enabling long-term learning through structured storage, version control, and unified memory access. Unlike conventional training methodologies, MemOS supports a continuous "memory training" paradigm, which blurs the distinction between learning and inference. Furthermore, it incorporates governance features, ensuring traceability, access control, and secure utilization within evolving AI systems.

MemOS is designed as a memory-centric operating system for language models. It treats memory not merely as stored data but as an active, evolving part of the model’s cognitive processes, and divides it into three types: Parametric Memory, the knowledge encoded in model weights through pretraining or fine-tuning; Activation Memory, the temporary internal states used during inference, such as KV caches and attention patterns; and Plaintext Memory, editable and retrievable external data such as documents or prompts. These memory types interact within a unified framework known as the MemoryCube (MemCube), which encapsulates both content and metadata and thereby enables dynamic scheduling, versioning, access control, and transformations across memory types. With this structure, LLMs can adapt, recall relevant information, and evolve their capabilities efficiently rather than acting as static generators.
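To make the idea of a unit that "encapsulates both content and metadata" more concrete, here is a minimal Python sketch. It is not the MemOS API: the names `MemoryType`, `MemCube`, `can_read`, and `revise`, as well as the choice of metadata fields (owner, version, readers), are assumptions made purely for illustration.

```python
from dataclasses import dataclass, field, replace
from enum import Enum


class MemoryType(Enum):
    PARAMETRIC = "parametric"   # knowledge encoded in model weights
    ACTIVATION = "activation"   # transient inference state, e.g. KV caches
    PLAINTEXT = "plaintext"     # editable, retrievable external data


@dataclass(frozen=True)
class MemCube:
    """Hypothetical container pairing memory content with its metadata."""
    memory_type: MemoryType
    payload: object
    owner: str
    version: int = 1
    readers: frozenset = field(default_factory=frozenset)

    def can_read(self, user: str) -> bool:
        # Access control: only the owner and listed readers may read.
        return user == self.owner or user in self.readers

    def revise(self, new_payload: object) -> "MemCube":
        # Versioning: return a new cube and leave the old one untouched.
        return replace(self, payload=new_payload, version=self.version + 1)


note = MemCube(MemoryType.PLAINTEXT, "User prefers metric units.", owner="alice")
updated = note.revise("User prefers metric units and 24-hour time.")
```

Making the cube immutable means every revision produces a new version while the earlier one stays available, which is one simple way to keep memory changes traceable.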

MemOS operates on a three-layer architecture: the Interface Layer processes user inputs and converts them into memory-related tasks; the Operation Layer manages the scheduling, organization, and evolution of the different memory types; and the Infrastructure Layer ensures secure storage, access governance, and collaboration among agents. All interactions within MemOS go through MemCubes, which keeps memory operations traceable and policy-driven. Through integrated modules such as MemScheduler, MemLifecycle, and MemGovernance, MemOS sustains a continuous and adaptive memory loop: from the initial user prompt, through memory injection during reasoning, to the storage of useful data for future use. This design not only enhances the model’s responsiveness and personalization but also ensures that memory remains structured, secure, and reusable.
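The loop described above (prompt, memory injection, storage for later use) can be sketched as a toy three-layer pipeline. Everything here is a simplifying assumption rather than the actual MemOS design: the class and method names are hypothetical, memory selection is a naive keyword overlap, and "governance" is reduced to per-user isolation plus an audit log.

```python
from dataclasses import dataclass, field


@dataclass
class InfrastructureLayer:
    """Stores memories per user and records every access in an audit log."""
    store: list = field(default_factory=list)
    audit_log: list = field(default_factory=list)

    def save(self, user: str, text: str) -> None:
        self.store.append((user, text))
        self.audit_log.append(("save", user))

    def load(self, user: str) -> list:
        self.audit_log.append(("load", user))
        return [text for owner, text in self.store if owner == user]


class OperationLayer:
    """Chooses which stored memories to inject (toy keyword-overlap scoring)."""

    def __init__(self, infra: InfrastructureLayer):
        self.infra = infra

    def select(self, user: str, query: str, limit: int = 2) -> list:
        words = set(query.lower().split())
        scored = [(len(words & set(m.lower().split())), m)
                  for m in self.infra.load(user)]
        ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: -s[0])
        return [m for _, m in ranked[:limit]]


class InterfaceLayer:
    """Turns a user prompt into memory operations and builds the model context."""

    def __init__(self, ops: OperationLayer):
        self.ops = ops

    def handle(self, user: str, prompt: str) -> str:
        memories = self.ops.select(user, prompt)   # memory injection
        context = "\n".join(memories + [prompt])
        self.ops.infra.save(user, prompt)          # keep for future turns
        return context


infra = InfrastructureLayer()
ui = InterfaceLayer(OperationLayer(infra))
first = ui.handle("alice", "I prefer metric units")
second = ui.handle("alice", "convert 5 miles using my preferred units")
```

In this sketch the second call finds the earlier statement about metric units and places it ahead of the new prompt, while a different user would see none of Alice's memories.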

In summary, MemOS is a memory operating system that positions memory as a central and manageable component within LLMs. In contrast to traditional models that predominantly rely on static model weights and short-term runtime states, MemOS introduces a unified framework for managing parametric, activation, and plaintext memory. Its core is MemCube, a standardized memory unit that supports structured storage, lifecycle management, and task-aware memory augmentation. This system facilitates more coherent reasoning, enhanced adaptability, and improved cross-agent collaboration. Future objectives for MemOS include enabling memory sharing across models, the development of self-evolving memory blocks, and the establishment of a decentralized memory marketplace to support continuous learning and intelligent evolution.

