Meta’s latest family of artificial intelligence models, Llama 4, brings significant innovations to multimodal model development. In addition to the two models currently available, Llama 4 Scout and Llama 4 Maverick, a far larger model called Llama 4 Behemoth is in development and is expected to excel at STEM-related tasks (science, technology, engineering, and mathematics).
Recently, many multimodal models have been introduced. These models can process and integrate different types of data—such as text, images, sound, and video—at the same time. This allows them to understand questions in a richer context and solve much more complex problems than previous text-only models. However, this strength can also be a drawback, as such models usually need far more resources than traditional single-modal systems. Meta addresses this issue with the Mixture of Experts (MoE) architecture used in the Llama 4 family. The MoE approach activates only a portion of the model for a given input, which improves efficiency and greatly reduces computational costs. This method is not unique to Llama 4; many large companies are following a similar trend. Nevertheless, Llama 4’s open-source strategy clearly sets it apart from its competitors.
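To make the routing idea concrete, the sketch below implements a toy top-k Mixture-of-Experts layer in PyTorch. The layer sizes, the number of experts, and the choice of top_k are illustrative only and do not reflect Llama 4’s actual configuration; the point is simply that each token passes through only a few experts, so only a fraction of the stored weights are used per input.

```python
# Minimal, hypothetical sketch of an MoE layer with top-k routing.
# Sizes and top_k are illustrative, not Llama 4's real configuration.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    def __init__(self, d_model=512, d_hidden=2048, num_experts=16, top_k=2):
        super().__init__()
        self.top_k = top_k
        # The router scores each token against every expert.
        self.router = nn.Linear(d_model, num_experts)
        # Each expert is an independent feed-forward block.
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_hidden), nn.GELU(),
                          nn.Linear(d_hidden, d_model))
            for _ in range(num_experts)
        )

    def forward(self, x):                          # x: (num_tokens, d_model)
        scores = self.router(x)                    # (num_tokens, num_experts)
        weights, indices = scores.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)       # normalize over the chosen experts
        out = torch.zeros_like(x)
        # Only the selected experts run for each token; the rest stay idle,
        # which is why "active" parameters are far fewer than total parameters.
        for slot in range(self.top_k):
            for e in range(len(self.experts)):
                mask = indices[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * self.experts[e](x[mask])
        return out

layer = MoELayer()
tokens = torch.randn(8, 512)
print(layer(tokens).shape)   # torch.Size([8, 512])
```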
As mentioned earlier, only the two smaller models in the family—Scout and Maverick—are currently available. Both have 17 billion active parameters, meaning that only 17 billion parameters are used to process any given token, yet each model contains far more parameters in total: 109 billion for Scout and 400 billion for Maverick. The gap comes from the MoE architecture: instead of running every token through the whole network, the model routes it to a small set of submodules, the so-called experts. Scout has 16 experts and Maverick has 128. Although Scout is the smaller of the two, it has a standout feature: a context window of up to 10 million tokens, which makes it well suited to analyzing long texts, documents, or large codebases. Maverick does not offer such an extensive context window, but several benchmarks show that it outperforms competitors such as GPT-4o and Gemini 2.0 Flash on reasoning and coding tasks while using fewer than half as many active parameters as DeepSeek V3.
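The relationship between total and active parameter counts can be illustrated with a few lines of Python. The shared/per-expert split below is invented purely for the example; Meta has not published this level of detail, so the numbers should not be read as Scout’s or Maverick’s real layout.

```python
# Rough illustration of why total and active parameter counts diverge in an MoE model.
# All numbers are hypothetical; they do not describe Llama 4's actual layer split.
def param_counts(shared, per_expert, num_experts, experts_per_token):
    total = shared + num_experts * per_expert          # everything stored in memory
    active = shared + experts_per_token * per_expert   # what one token actually touches
    return total, active

# With many experts but only a few routed per token, the total count grows
# with num_experts while the active count stays almost constant.
total, active = param_counts(shared=10e9, per_expert=6e9,
                             num_experts=16, experts_per_token=1)
print(f"total ≈ {total / 1e9:.0f}B, active ≈ {active / 1e9:.0f}B")
```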
Although Behemoth is still in development, Meta claims it will outperform GPT-4.5, Claude Sonnet 3.7, and Gemini 2.0 Pro in STEM-related tasks. Like its two smaller siblings, Behemoth uses the MoE design: it will have 288 billion active parameters and, with 16 experts, nearly 2 trillion parameters in total. Behemoth is also notable because Meta plans to use it to train smaller models, and it could be integrated into Meta services such as Messenger, Instagram Direct, and WhatsApp.
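Using Behemoth to train smaller models points to knowledge distillation, where a compact student model learns to match a larger teacher’s output distribution. The sketch below shows the standard soft-label distillation loss; it is a generic illustration, not Meta’s actual training recipe.

```python
# Generic knowledge-distillation loss: the student is trained to match the
# teacher's temperature-softened distribution plus the ground-truth labels.
# This is the textbook formulation, not Meta's published training setup.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    # Soft targets: KL divergence between student and teacher distributions.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    # Hard targets: ordinary cross-entropy on the true labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

# Toy usage with random tensors standing in for model outputs.
student = torch.randn(4, 32000)            # (batch, vocab)
teacher = torch.randn(4, 32000)
labels = torch.randint(0, 32000, (4,))
print(distillation_loss(student, teacher, labels).item())
```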