On April 30, 2025, JetBrains took a major step forward in AI by open-sourcing Mellum, its custom language model for code completion. Previously available only in JetBrains’ commercial products, the 4-billion-parameter model is now freely accessible on Hugging Face, opening new doors for researchers, educators, and development teams.
JetBrains built Mellum from the ground up as a proprietary, large-scale model dedicated solely to helping software developers. Unlike general-purpose AI models that juggle many features, Mellum is a “focus model” tailored to excel at one task: code completion.
Launched last year as part of the JetBrains AI Assistant, Mellum is already integrated into popular JetBrains IDEs—such as IntelliJ IDEA and PyCharm—to deliver faster, more accurate, and smarter code suggestions. Because it’s specialized, Mellum can offer completions that fit your code better, boosting both speed and precision compared to earlier tools.
Deciding to open-source Mellum involved lengthy discussions at JetBrains. This isn’t just a fine-tuned copy of an existing open model; it’s a model they trained from scratch for their own products.
In the end, JetBrains chose open source to tap into community collaboration, which they believe will speed up development and lower costs. They point to how projects like Linux, Git, Node.js, and Docker thrived through open-source cooperation—and note that some open-source large language models now rival top industry offerings.
By releasing Mellum on Hugging Face, JetBrains invites researchers, teachers, and teams to explore a code-focused AI’s inner workings. This move aligns with the growing trend toward transparent and collaborative AI development.
Technically, Mellum is a multilingual, 4-billion-parameter model optimized for code completion. It uses a LLaMA-style transformer architecture and was trained on roughly 4.2 trillion tokens drawn from permissively licensed code (from sources such as GitHub) and English Wikipedia text, which helps it understand comments and documentation.
Mellum offers an 8,192-token context window and supports completion in many languages, including Java, Kotlin, Python, Go, PHP, C, C++, C#, JavaScript, TypeScript, CSS, HTML, Rust, and Ruby.
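For anyone who wants to try the model outside of a JetBrains IDE, the weights can be pulled straight from Hugging Face. The sketch below shows one way to do that with the transformers library; the repository id and the generation settings are assumptions based on the public release, not an official JetBrains recipe.

```python
# A minimal sketch of trying Mellum outside the IDE with Hugging Face transformers.
# The repository id "JetBrains/Mellum-4b-base" and the generation settings are
# assumptions, not an official JetBrains recipe.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "JetBrains/Mellum-4b-base"  # assumed Hugging Face repo id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

# Plain prefix completion: ask the model to continue a partial Python function.
prefix = "def fibonacci(n: int) -> int:\n    "
inputs = tokenizer(prefix, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=48, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```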
The table below compares Mellum-4B-base with several open code models on common completion benchmarks:

| Model | HumanEval Infilling (single-line) | HumanEval Infilling (multi-line) | RepoBench 1.1 (2K context, Python) | SAFIM (avg.) |
| --- | --- | --- | --- | --- |
| Mellum-4B-base | 66.2 | 38.5 | 28.2 | 38.1 |
| InCoder-6B | 69.0 | 38.6 | - | 33.8 |
| CodeLlama-7B-base | 83.0 | 50.8 | 34.1 | 45.0 |
| CodeLlama-13B-base | 85.6 | 56.1 | 36.2 | 52.8 |
| DeepSeek-Coder-6.7B | 80.7 | - | - | 63.4 |
Benchmarks show Mellum trails larger code-focused models such as CodeLlama in raw performance, but its smaller size pays off in footprint: quantized to 8 bits, the 4-billion-parameter model needs just over 4 GB of RAM, so it can run on fairly modest machines, whereas CodeLlama’s 7B and 13B variants need at least twice that.
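As a rough illustration of that footprint, the sketch below loads the model with 8-bit weights through the transformers quantization integration. The bitsandbytes and accelerate packages, and the same assumed repo id as above, are additional requirements not mentioned in the release, and actual memory use will also depend on context length and runtime overhead.

```python
# A minimal sketch of loading Mellum with 8-bit weights so the 4B model fits in
# roughly 4 GB of memory. Assumes the bitsandbytes and accelerate packages are
# installed and reuses the assumed repo id from the earlier example.
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

MODEL_ID = "JetBrains/Mellum-4b-base"  # assumed Hugging Face repo id

quant_config = BitsAndBytesConfig(load_in_8bit=True)
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    quantization_config=quant_config,
    device_map="auto",  # let accelerate place layers on the available GPU/CPU
)

# Same plain prefix completion as before, now against the quantized model.
prompt = "class Stack:\n    def __init__(self):\n        "
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```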