This is how LLMs distort

With the development of artificial intelligence (AI), more and more attention is being paid to large language models (LLMs), which are now present not only in scientific research but also in many areas of everyday life, for example in legal work, health data analysis, and writing computer code. However, understanding how these models work remains a serious challenge, especially when they make seemingly inexplicable mistakes or give misleading answers.

A new study by MIT researchers draws attention to a little-known but important phenomenon: positional bias. Models tend to overemphasize information at the beginning or end of a text, while the middle is often pushed into the background. In practice, this means that an AI-based search tool is more likely to find the information you are looking for in a 30-page document if it appears on the first or last pages, and may miss it entirely if the relevant detail sits in the middle.

To explore the root of this phenomenon, the researchers developed a mathematical framework for analyzing the transformer architecture that underlies these language models. The architecture relies heavily on the attention mechanism, which allows the model to interpret each word in its textual context. In practice, however, this mechanism has limitations: for the sake of computational efficiency, many models restrict how many other words a given word can "pay attention" to. One such restriction, causal masking, lets each word attend only to the words that precede it, which structurally favors words at the beginning of a text even when they carry little meaning.
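To make the effect of causal masking concrete, here is a minimal sketch in Python (using numpy, not the study's own code). With a causal mask, each token may only attend to tokens that come before it, so even with perfectly uniform attention scores the earliest positions end up receiving the most total attention.

```python
import numpy as np

def causal_attention(scores: np.ndarray) -> np.ndarray:
    """Apply a causal mask to a (seq_len x seq_len) matrix of raw attention scores."""
    seq_len = scores.shape[0]
    future = np.triu(np.ones((seq_len, seq_len), dtype=bool), k=1)  # positions after each token
    masked = np.where(future, -np.inf, scores)                      # block attention to future tokens
    weights = np.exp(masked - masked.max(axis=-1, keepdims=True))   # row-wise softmax
    return weights / weights.sum(axis=-1, keepdims=True)

# With perfectly uniform scores, earlier positions still accumulate more attention,
# simply because every later token is allowed to look back at them.
uniform_scores = np.zeros((5, 5))
print(causal_attention(uniform_scores).sum(axis=0))  # roughly [2.28, 1.28, 0.78, 0.45, 0.20]
```

The column sums in the last line show the imbalance directly: the first position collects attention from every row, the last position from only one.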

The study also points out that these biases may stem not only from the architecture itself but also from the data used to train the models. If the training datasets overrepresent information found at the beginning of texts, that pattern can be absorbed into the model's behavior. Positional bias is therefore partly a technical issue and partly a data quality issue.

Experiments conducted by the researchers confirmed the phenomenon: in an information-retrieval task, when the position of the correct answer within the text was varied, model accuracy dropped sharply towards the middle of the text and then recovered slightly as the answer approached the end. In the literature, this pattern is known as the "lost in the middle" phenomenon.
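As an illustration of how such a measurement can be set up, the sketch below inserts one relevant sentence at varying depths inside filler text and checks whether the model's answer contains the expected string. The `ask_model` function is a hypothetical stand-in for whichever LLM API is being tested; this shows the general idea of a "lost in the middle" probe, not the study's actual protocol or data.

```python
from typing import Callable

def build_prompt(filler: list[str], fact: str, question: str, depth: float) -> str:
    """Insert the relevant sentence at a relative depth (0.0 = start, 1.0 = end) of the filler text."""
    idx = int(depth * len(filler))
    sentences = filler[:idx] + [fact] + filler[idx:]
    return " ".join(sentences) + "\n\nQuestion: " + question

def accuracy_by_depth(ask_model: Callable[[str], str], filler: list[str], fact: str,
                      question: str, expected: str, depths: list[float]) -> dict[float, bool]:
    """For each insertion depth, record whether the model's answer mentions the expected value."""
    return {d: expected in ask_model(build_prompt(filler, fact, question, d)) for d in depths}
```

Plotting the results of such a probe against depth typically produces the U-shaped accuracy curve described above.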

Although the problem is not new, the novelty of the study is that the researchers identified specific mechanisms that contribute to this bias and also made suggestions for mitigating it. These include rethinking masking techniques, reducing the number of attention layers, and deliberately applying positional encodings, which can help models interpret the text as a whole in a more balanced way.
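As background on the last point, the sketch below shows the classic sinusoidal positional encoding from the original transformer design, purely to illustrate what "positional encoding" refers to; the specific adjustments the study recommends are not reproduced here.

```python
import numpy as np

def sinusoidal_positional_encoding(seq_len: int, d_model: int) -> np.ndarray:
    """Return a (seq_len, d_model) matrix of position embeddings added to token embeddings."""
    positions = np.arange(seq_len)[:, None]                # (seq_len, 1) token positions
    dims = np.arange(0, d_model, 2)[None, :]               # (1, d_model/2) dimension pairs
    angles = positions / np.power(10000.0, dims / d_model) # one frequency per dimension pair
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)                           # even dimensions: sine
    pe[:, 1::2] = np.cos(angles)                           # odd dimensions: cosine
    return pe

print(sinusoidal_positional_encoding(4, 8).round(3))
```

Because each position gets a distinct, smoothly varying vector, the model can tell positions apart without any single region of the text being structurally privileged.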

It is important to emphasize that this phenomenon is not equally problematic in all areas of application. In text composition, for example, it is natural that the beginning and end of a text are given prominence. However, in applications where accurate data extraction or fair decision-making is the goal—such as in legal or medical contexts—these biases can have serious consequences.

Overall, the work of the MIT researchers is a step toward making artificial intelligence systems more transparent and reliable. It does not promise an immediate solution, and positional bias is not a serious problem in every setting, but a better understanding of it certainly brings us closer to the responsible and informed use of AI systems.
