DeepSeek aims to challenge OpenAI's o3 with its upgraded R1 model

DeepSeek R1-0528, the latest release from the Chinese company DeepSeek, represents a significant advance in the reasoning capabilities of artificial intelligence models. It is an improved version of the original DeepSeek R1 released in January. According to the company, DeepSeek R1-0528 already rivals OpenAI's o3 model in performance and approaches the capabilities of Google Gemini 2.5 Pro.

The model's reasoning and inference capabilities have improved significantly. This was achieved through increased computing resources, algorithmic optimization, and raising the average token usage per question from 12,000 to 23,000. As a result, the model performs markedly better across a range of tests; on the AIME 2025 benchmark, for example, its accuracy rose from 70% to 87.5%.

The DeepSeek R1-0528 architecture contains 685 billion parameters (up from 671 billion in the previous R1) and uses a Mixture-of-Experts (MoE) design, where only 37 billion parameters are active per token. The model's context window is 128K tokens, and it can generate a maximum of 64K tokens. It supports function calls and JSON output formats. In addition, the hallucination rate has been reduced, especially when rewriting and summarizing content. Its code generation capabilities have also been improved.
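To make these interface details concrete, here is a minimal sketch of calling the model through DeepSeek's OpenAI-compatible API with JSON output enabled. The endpoint URL and the "deepseek-reasoner" model identifier are assumptions taken from DeepSeek's public API documentation rather than details stated above, so check the current docs before relying on them.

```python
# Minimal sketch (assumptions noted below): querying DeepSeek R1-0528 through
# the OpenAI-compatible chat-completions API and requesting JSON output.
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_DEEPSEEK_API_KEY",      # placeholder, not a real key
    base_url="https://api.deepseek.com",  # assumed OpenAI-compatible endpoint
)

response = client.chat.completions.create(
    model="deepseek-reasoner",  # assumed identifier for the R1 reasoning model
    messages=[
        {"role": "system",
         "content": "Answer as a JSON object with 'answer' and 'summary' keys."},
        {"role": "user", "content": "What is 17 * 24?"},
    ],
    response_format={"type": "json_object"},  # JSON output mode
    max_tokens=4096,
)

print(response.choices[0].message.content)
```

Function calling is reportedly exposed through the same interface via the standard OpenAI-style tools parameter, but that, too, is worth verifying against the official documentation.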

The model has achieved remarkable results across benchmarks. In mathematical tasks, its performance matches or exceeds that of leading models such as OpenAI o3 and Google Gemini 2.5 Pro. In coding tasks on LiveCodeBench, it ranks behind OpenAI's o4-mini and o3 reasoning models. Its general reasoning has also improved, as shown by the jump in its GPQA-Diamond score from 71.5% to 81.0%.

DeepSeek has also released a smaller, distilled version of the main R1-0528 model, called DeepSeek-R1-0528-Qwen3-8B. It is based on Qwen3-8B and incorporates reasoning knowledge distilled from DeepSeek-R1-0528. Among open-source models it delivers outstanding performance: it outperforms Qwen3-8B by +10.0% and matches the performance of Qwen3-235B-thinking. It can be run on a single GPU with at least 40 GB of VRAM.
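As a rough illustration of that single-GPU claim, the sketch below loads the distilled checkpoint with Hugging Face transformers. The repository id and the bfloat16 setup are assumptions rather than details from the text; in bf16 the 8B parameters occupy roughly 16 GB, which leaves headroom on a 40 GB card for the long reasoning traces the model tends to generate.

```python
# Minimal sketch: running DeepSeek-R1-0528-Qwen3-8B locally with transformers.
# The repo id below is an assumption; verify it on the Hugging Face Hub.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-R1-0528-Qwen3-8B"  # assumed repository id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # ~16 GB of weights, fits a single 40 GB GPU
    device_map="auto",
)

messages = [{"role": "user",
             "content": "Prove that the sum of two even numbers is even."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=1024)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```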

What is WhoFi?

Wireless internet, or WiFi, is now a ubiquitous and indispensable part of our lives. We use it to connect our devices to the internet, communicate, and exchange information. But imagine if this same technology, which invisibly weaves through our homes and cities, could also identify and track us without cameras—even through walls. This is not a distant science fiction scenario, but the reality of a newly developed technology called WhoFi, which harnesses a previously untapped property of WiFi signals. To complicate matters, the term “WhoFi” also refers to an entirely different service with community-focused goals, so it's important to clarify which meaning is being discussed.

China’s Own GPU Industry Is Slowly Awakening

“7G” is an abbreviation that sounds almost identical to the word for “miracle” in Chinese. Whether this is a lucky piece of marketing or a true technological prophecy remains to be seen. What Lisuan Technology is presenting with the 7G106—internally codenamed G100—is nothing less than the first serious attempt to step out of Nvidia and AMD’s shadow. No licensing agreements, no crutches based on Western intellectual property—this is a GPU built from scratch, manufactured using 6 nm DUV technology in a country that is only beginning to break free from the spell of Western technology exports.

Anticipation is high for the release of GPT-5 — but what should we really expect?

OpenAI’s upcoming language model, GPT-5, has become one of the most anticipated technological developments in recent months. Following the release of GPT-4o and the specialized o1 models, attention is now shifting to this next-generation model, which—according to rumors and hints from company leaders—may represent a significant leap forward in artificial intelligence capabilities. But what do we actually know so far, and what remains pure speculation?

What Does the Rise of DiffuCoder and Diffusion Language Models Mean?

A new approach is now fundamentally challenging the linear, token-by-token paradigm of autoregressive text generation: diffusion language models (dLLMs), which generate content not sequentially but globally, through iterative refinement. But are they truly better suited to code generation than the well-established AR models? And what insights can we gain from DiffuCoder, the first major open-source experiment in this field?

Apple's New AI Models Can Understand What’s on Your Screen

When we look at our phone's display, what we see feels obvious—icons, text, and buttons we’re used to. But how does artificial intelligence interpret that same interface? This question is at the heart of joint research between Apple and Finland’s Aalto University, resulting in a model called ILuvUI. This development isn’t just a technical milestone; it’s a major step toward enabling digital systems to truly understand how we use applications—and how they can assist us even more effectively.

Artificial Intelligence in the Service of Religion and the Occult

Imagine attending a religious service. The voice of the priest or rabbi is familiar, the message resonates deeply, and the sermon seems thoughtfully tailored to the lives of those present. Then it is revealed that neither the words nor the voice came from a human being—they were generated by artificial intelligence, trained on the speaker’s previous sermons. The surprise lies not only in the capabilities of the technology, but also in the realization that spirituality—so often viewed as timeless and intrinsically human—has found a new partner in the form of an algorithm. What does this shift mean for faith, religious communities, and our understanding of what it means to believe?