JetBrains Mellum Is Now Open Source

As of April 30, 2025, JetBrains has taken a major step forward in AI by open-sourcing Mellum, its custom language model for code completion. Previously available only in JetBrains’ commercial products, this 4-billion-parameter model is now freely accessible on Hugging Face, opening new doors for researchers, educators, and development teams.

JetBrains built Mellum from the ground up as a proprietary, large-scale model dedicated solely to helping software developers. Unlike general-purpose AI models that juggle many capabilities, Mellum is what JetBrains calls a "focal model": one tailored to excel at a single task, code completion.

Launched last year as part of the JetBrains AI Assistant, Mellum is already integrated into popular JetBrains IDEs such as IntelliJ IDEA and PyCharm. Because it is specialized, it can offer completions that fit the surrounding code better, improving both the speed and the accuracy of suggestions compared with earlier, general-purpose tools.

Deciding to open-source Mellum involved lengthy discussions at JetBrains. This isn’t just a fine-tuned copy of an existing open model; it’s a model they trained from scratch for their own products.

In the end, JetBrains chose open source to tap into community collaboration, which they believe will speed up development and lower costs. They point to how projects like Linux, Git, Node.js, and Docker thrived through open-source cooperation—and note that some open-source large language models now rival top industry offerings.

By releasing Mellum on Hugging Face, JetBrains invites researchers, teachers, and teams to explore a code-focused AI’s inner workings. This move aligns with the growing trend toward transparent and collaborative AI development.

Technically, Mellum is a multilingual, 4-billion-parameter model optimized for code completion. It uses a transformer architecture similar to LLaMA and was trained on about 4.2 trillion tokens drawn from freely licensed code repositories (such as those on GitHub) and from English Wikipedia, which helps it understand comments and documentation.

Mellum offers an 8,192-token context window and supports completion in many languages, including Java, Kotlin, Python, Go, PHP, C, C++, C#, JavaScript, TypeScript, CSS, HTML, Rust, and Ruby.
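
Since the weights are public, trying the base model takes only a few lines with the Hugging Face transformers library. The sketch below is a minimal example, assuming the JetBrains/Mellum-4b-base checkpoint published on Hugging Face; the prompt and generation settings are illustrative, not JetBrains' recommended configuration.

```python
# Minimal sketch: plain left-to-right code completion with Mellum.
# Assumes the JetBrains/Mellum-4b-base checkpoint on Hugging Face;
# the prompt and sampling settings here are illustrative only.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "JetBrains/Mellum-4b-base"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,  # half precision to cut memory use
    device_map="auto",           # GPU if available, otherwise CPU
)

# An unfinished function; the model continues it as a completion.
prompt = "def fibonacci(n: int) -> int:\n"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=64,
    do_sample=False,  # greedy decoding for reproducible output
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```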

| Model | HumanEval Infilling (single-line) | HumanEval Infilling (multi-line) | RepoBench 1.1 (2K context, py) | SAFIM (avg) |
|---|---|---|---|---|
| Mellum-4B-base | 66.2 | 38.5 | 28.2 | 38.1 |
| InCoder-6B | 69.0 | 38.6 | - | 33.8 |
| CodeLlama-7B-base | 83.0 | 50.8 | 34.1 | 45.0 |
| CodeLlama-13B-base | 85.6 | 56.1 | 36.2 | 52.8 |
| DeepSeek-Coder-6.7B | 80.7 | - | - | 63.4 |

Benchmarks show Mellum trails larger code-focused models such as CodeLlama in raw performance, but its small size keeps memory needs low: with 8-bit quantization, its 4 billion weights take roughly one byte each, or just over 4 GB of RAM, so it can run on fairly modest machines, whereas CodeLlama's 7-billion- and 13-billion-parameter versions need at least twice that.
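
To make that memory figure concrete, here is a hedged sketch of loading the model with 8-bit weights, assuming the bitsandbytes quantization integration in transformers (which requires a CUDA-capable GPU) and the same JetBrains/Mellum-4b-base checkpoint as above:

```python
# Sketch: loading Mellum with 8-bit weights via bitsandbytes so the
# weights fit in roughly 4 GB (4 billion parameters at 1 byte each).
# BitsAndBytesConfig is the standard transformers quantization hook;
# the checkpoint name is the Hugging Face release discussed above.
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

MODEL_ID = "JetBrains/Mellum-4b-base"

quant_config = BitsAndBytesConfig(load_in_8bit=True)  # 1 byte per weight

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    quantization_config=quant_config,
    device_map="auto",  # bitsandbytes needs a CUDA-capable GPU
)

# Rough check of the quantized footprint (weights only; the KV cache
# and activations add some overhead on top of this).
print(f"~{model.get_memory_footprint() / 1e9:.1f} GB")
```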
