JetBrains Mellum Is Now Open Source

As of April 30, 2025, JetBrains has taken a major step forward in AI by open-sourcing Mellum, its custom language model for code completion. Previously available only in JetBrains’ commercial products, this 4-billion-parameter model is now freely accessible on Hugging Face, opening new doors for researchers, educators, and development teams.

JetBrains built Mellum from the ground up as a proprietary, large-scale model dedicated solely to helping software developers. Unlike general-purpose AI models that juggle many features, Mellum is a “focus model” tailored to excel at one task: code completion.

Launched last year as part of the JetBrains AI Assistant, Mellum is already integrated into popular JetBrains IDEs such as IntelliJ IDEA and PyCharm, where it delivers faster and more accurate code suggestions. Because it is specialized, Mellum can offer completions that fit the surrounding code better, improving both speed and precision over earlier tools.

Deciding to open-source Mellum involved lengthy discussions at JetBrains. This isn’t just a fine-tuned copy of an existing open model; it’s a model they trained from scratch for their own products.

In the end, JetBrains chose open source to tap into community collaboration, which they believe will speed up development and lower costs. They point to how projects like Linux, Git, Node.js, and Docker thrived through open-source cooperation—and note that some open-source large language models now rival top industry offerings.

By releasing Mellum on Hugging Face, JetBrains invites researchers, teachers, and teams to explore a code-focused AI’s inner workings. This move aligns with the growing trend toward transparent and collaborative AI development.

Technically, Mellum is a multilingual, 4-billion-parameter model optimized for code completion. It uses a transformer architecture similar to LLaMA and was trained on roughly 4.2 trillion tokens drawn from permissively licensed code in public repositories (such as those hosted on GitHub) and from English Wikipedia, which helps it understand comments and documentation.

Mellum offers an 8,192-token context window and supports completion in many languages, including Java, Kotlin, Python, Go, PHP, C, C++, C#, JavaScript, TypeScript, CSS, HTML, Rust, and Ruby.
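
To make these specs concrete, here is a minimal sketch of loading the model from Hugging Face and requesting a completion with the transformers library. The repository id "JetBrains/Mellum-4b-base" and the generation settings are assumptions based on this article, not an official JetBrains example.

```python
# A minimal sketch, assuming the model is published on Hugging Face under
# the repository id "JetBrains/Mellum-4b-base" (an assumption, not an
# official JetBrains snippet).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "JetBrains/Mellum-4b-base"  # assumed repository id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Mellum is a completion model: pass the code written so far and let it
# continue, staying within the 8,192-token context window.
prefix = "def fibonacci(n: int) -> int:\n    "
inputs = tokenizer(prefix, return_tensors="pt")

outputs = model.generate(
    **inputs,
    max_new_tokens=64,  # keep suggestions short, as an IDE would
    do_sample=False,    # greedy decoding for stable, repeatable completions
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

In an IDE, requests like this fire continuously as the developer pauses typing, which is why completion latency and model size matter as much as raw benchmark scores.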

| Model | HumanEval Infilling (single-line) | HumanEval Infilling (multi-line) | RepoBench 1.1 (2K context, py) | SAFIM (avg) |
|---|---|---|---|---|
| Mellum-4B-base | 66.2 | 38.5 | 28.2 | 38.1 |
| InCoder-6B | 69.0 | 38.6 | - | 33.8 |
| CodeLlama-7B-base | 83.0 | 50.8 | 34.1 | 45.0 |
| CodeLlama-13B-base | 85.6 | 56.1 | 36.2 | 52.8 |
| DeepSeek-Coder-6.7B | 80.7 | - | - | 63.4 |

Benchmarks show Mellum trailing larger code-focused models such as CodeLlama in raw performance, but its small size pays off in memory: with 8-bit quantization, its 4 billion parameters take roughly one byte each, so the whole model fits in just over 4 GB of RAM and runs on modest machines. Under the same quantization, CodeLlama's 7-billion- and 13-billion-parameter versions need roughly 7 GB and 13 GB respectively.
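
For readers who want to try the low-memory setup, a sketch using the transformers BitsAndBytesConfig API is below. The repository id is again an assumption, and 8-bit loading requires the bitsandbytes package and a CUDA-capable GPU.

```python
# A sketch of low-memory loading with 8-bit quantization, assuming the same
# (hypothetical) repository id as above. Requires bitsandbytes and a CUDA GPU.
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "JetBrains/Mellum-4b-base"  # assumed repository id

quant_config = BitsAndBytesConfig(load_in_8bit=True)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,  # ~1 byte per parameter, so ~4 GB total
    device_map="auto",                 # spread layers across available devices
)
```

If no GPU is available, the unquantized model can still be loaded on CPU, at the cost of roughly twice the memory in 16-bit precision.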
