Thinkless: Fighting the Growing Resource Demands of AI

In recent months, major tech companies have announced a series of reasoning features for their models. However, the immense resource requirements of these systems quickly became apparent, driving up the prices of the associated subscription services. Researchers at the National University of Singapore (NUS) have developed a new framework called "Thinkless", which could significantly transform how large language models (LLMs) handle reasoning tasks. This approach, created by Gongfan Fang, Xinyin Ma, and Xinchao Wang at the NUS xML Lab, enables AI systems to dynamically choose between simple and complex reasoning strategies, potentially reducing computational costs by up to 90%. The framework addresses a critical inefficiency in current AI reasoning methods and represents a major step toward more resource-efficient AI.

Large language models have demonstrated impressive reasoning abilities using techniques like chain-of-thought (CoT), which facilitates step-by-step logical reasoning. However, while CoT improves performance on complex tasks requiring multi-step reasoning, it also incurs significant computational costs. The extensive reasoning tokens generated during these processes increase latency, memory usage, and overall computational demand.

The core issue identified by the researchers is that not every problem requires complex reasoning. Many queries can be answered with simple responses, yet current models often apply the same elaborate reasoning process regardless of task complexity.

This one-size-fits-all approach leads to substantial computational waste, particularly when the tasks are simple. For example, asking an AI to solve a basic addition problem like "2+2" often triggers the same resource-intensive reasoning pipeline used for complex mathematical proofs.

This inefficiency becomes especially problematic at scale—when LLMs process millions of queries daily, every unnecessary inference step multiplies energy consumption, costs, and environmental impact. Recognizing this challenge, researchers at NUS proposed Thinkless as a solution to improve efficiency.

The Thinkless framework enables LLMs to adaptively select between short and long-form reasoning depending on task complexity and model capabilities. The system uses two control tokens: <short> for concise answers and <think> for detailed reasoning.

This binary control mechanism allows the model to dynamically determine the appropriate reasoning depth for each query.
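To make the mechanism concrete, here is a minimal inference-time sketch in Python. The `<short>` and `<think>` control tokens are the ones described above; the model identifier, decoding settings, and helper function are illustrative assumptions, not the authors' released code.

```python
# Minimal sketch of control-token routing at inference time, assuming a
# Hugging Face-style causal LM. The model ID below is a placeholder.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "org/thinkless-style-model"  # hypothetical identifier

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

def answer(query: str, max_new_tokens: int = 512) -> str:
    """Let the model emit a control token first, then decode the rest."""
    inputs = tokenizer(query, return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    text = tokenizer.decode(output[0], skip_special_tokens=False)
    # The first generated token is expected to be <short> or <think>;
    # it tells us which reasoning depth the model chose for this query.
    mode = "<think>" if "<think>" in text else "<short>"
    print(f"reasoning mode chosen: {mode}")
    return text
```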

At the heart of the Thinkless method is the novel Decoupled Group Relative Policy Optimization (DeGRPO) algorithm, which marks a significant improvement over traditional reinforcement learning methods. DeGRPO breaks down the learning objective into two key components:

  • Control token loss – Governs the selection of reasoning mode, helping the model learn when to use extensive reasoning and when to deliver direct answers.

  • Response loss – Improves the accuracy of generated responses, ensuring high performance regardless of the chosen reasoning strategy.

This decoupled structure enables fine-grained control over each objective and stabilizes model training.
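A toy sketch of this decomposition, written in the spirit of the paper rather than taken from it: `logp_ctrl` and `logp_resp` are the policy's log-probabilities for the control token and the response tokens, `advantage` is a group-relative advantage, and the weighting constant `alpha` is an assumed hyperparameter.

```python
import torch

def degrpo_loss(logp_ctrl: torch.Tensor,   # (batch,) log-prob of <short>/<think>
                logp_resp: torch.Tensor,   # (batch, resp_len) response log-probs
                advantage: torch.Tensor,   # (batch,) group-relative advantage
                alpha: float = 0.001) -> torch.Tensor:
    # Control-token loss: shapes when to reason at length vs. answer directly.
    ctrl_loss = -(advantage * logp_ctrl).mean()
    # Response loss: improves answer accuracy; averaging over response length
    # keeps long reasoning traces from dominating the gradient.
    resp_loss = -(advantage.unsqueeze(1) * logp_resp).mean()
    # Decoupling the two terms allows each to be weighted and monitored
    # independently, which is what stabilizes training.
    return alpha * ctrl_loss + resp_loss
```

The intuition for the split: in a coupled objective, the gradient from a single control token would be drowned out by hundreds of response tokens, so separating the terms gives the mode decision its own, appropriately scaled learning signal.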

The model is trained within a reinforcement learning framework, allowing it to learn from experience which problem types benefit from detailed reasoning and which can be efficiently addressed with minimal computation.

The research team developed several model variants, including a 1.5-billion-parameter reinforcement learning model called "Thinkless-1.5B-RL-DeepScaleR" and a warm-up variant named "Thinkless-1.5B-Warmup."

NUS researchers evaluated Thinkless on multiple benchmarks, including Minerva Algebra, MATH-500, and GSM8K. The results were impressive: the framework reduced reliance on long-chain reasoning by 50–90%, significantly enhancing the computational efficiency of inference in language models.

Crucially, this increase in efficiency was achieved without sacrificing accuracy—a key result, as previous attempts to reduce inference complexity have often led to performance degradation. The decoupled nature of the DeGRPO algorithm allows Thinkless to maintain high performance while substantially lowering the computational burden.

Thinkless is not the only approach targeting the computational demands of AI reasoning. Several other strategies have been developed to tackle similar challenges:

ThinkLess (a training-free method)

Confusingly, another research team led by Gengyang Li has developed a similarly named approach called "ThinkLess." However, this method differs fundamentally from NUS’s Thinkless framework as it does not require model training.

Instead of training the model to select reasoning strategies, this method terminates inference early while preserving output quality. The researchers observed that the final answer draws more on the reasoning terminator signal than on the individual reasoning steps that precede it. ThinkLess exploits this insight by emitting the terminator token earlier, skipping redundant reasoning while preserving the information the answer actually depends on.
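A rough sketch of how such early termination might look in code, assuming a model whose reasoning trace ends with a `</think>`-style terminator token. The token budget, token name, and two-stage decoding are illustrative assumptions, not the paper's exact procedure.

```python
import torch

def early_terminate(model, tokenizer, query: str, budget: int = 64) -> str:
    prompt_ids = tokenizer(query, return_tensors="pt").input_ids
    # Let the model reason only within a small token budget...
    partial = model.generate(prompt_ids, max_new_tokens=budget)
    # ...then force the reasoning terminator so decoding proceeds straight
    # to the final answer, skipping the rest of the chain of thought.
    end_ids = tokenizer("</think>", add_special_tokens=False,
                        return_tensors="pt").input_ids
    forced = torch.cat([partial, end_ids], dim=-1)
    final = model.generate(forced, max_new_tokens=256)
    return tokenizer.decode(final[0], skip_special_tokens=True)
```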

Early-Stopping Self-Consistency (ESC)

Proposed by Yiwei Li et al., ESC is a scalable sampling technique designed to reduce the cost of self-consistency (SC) in multi-step reasoning. Traditional SC draws a fixed, predefined number of samples, which drives up computation. ESC adjusts the sampling process dynamically and significantly reduces the average number of samples, by 33.8% to 84.2% depending on the benchmark, while maintaining comparable performance.
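The core loop is simple enough to sketch. In this minimal Python version, `sample_answer` stands in for drawing one full chain-of-thought sample; sampling stops as soon as a small window of answers is unanimous. The window and cap sizes are illustrative.

```python
from collections import Counter

def esc(sample_answer, window: int = 5, max_samples: int = 40) -> str:
    """Early-stopping self-consistency: sample in small windows."""
    answers = []
    while len(answers) < max_samples:
        batch = [sample_answer() for _ in range(window)]
        answers.extend(batch)
        # Stop early when the latest window shows zero disagreement.
        if len(set(batch)) == 1:
            break
    # Fall back to the usual majority vote over everything sampled so far.
    return Counter(answers).most_common(1)[0][0]
```

On easy questions the first window often agrees immediately, so the cost collapses to a handful of samples; that is the intuition behind the reported savings.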

Dynamic Chain-of-Thought (D-CoT)

D-CoT adapts both reasoning depth and the number of reasoning steps to reduce redundancy and delayed reward assignment during extended CoT reasoning. It introduces a state-compression mechanism, adaptive inference steps, and an importance-driven pruning strategy applied during autoregressive decoding. A partial reward estimator assesses the efficiency of inference blocks in real time, while macro- and micro-constraint buffers organize the process into a multi-level reasoning structure.
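Of the components listed, importance-driven pruning is the easiest to illustrate. The sketch below is a deliberately naive stand-in, assuming the reasoning trace is segmented into blocks and scored by some importance function; the actual D-CoT estimator is considerably more involved.

```python
# Naive importance-driven pruning: drop low-scoring reasoning blocks from the
# working context while preserving the order of the survivors.
def prune_reasoning(blocks: list[str], importance, keep_ratio: float = 0.5):
    scored = sorted(blocks, key=importance, reverse=True)
    kept = set(scored[: max(1, int(len(blocks) * keep_ratio))])
    return [b for b in blocks if b in kept]

# Example with a toy proxy score (block length), keeping the top half.
steps = ["Let x = 2.", "Hmm, wait.", "Then x + 2 = 4.", "So the answer is 4."]
print(prune_reasoning(steps, importance=len))
```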

As concerns grow about the carbon footprint of AI, these approaches mark a crucial step toward more environmentally sustainable AI systems. Moreover, more efficient reasoning is also key to deploying advanced AI capabilities on devices with limited computational resources.
