Beyond Language: The COCONUT AI System That Thinks Like Your Brain
“The limits of my language mean the limits of my world.”
- Ludwig Wittgenstein
When you’re solving a complex puzzle or navigating through a crowded street, you’re not constantly narrating your thoughts in words — your brain is processing tons of information and making decisions in abstract patterns, spatial relationships, and intuitive leaps. This wordless thinking is a fundamental aspect of human cognition; however, until now, artificial intelligence like Large Language Models (LLMs) has been constrained to reasoning through explicit language chains, forced to “think out loud” with every step. In a really cool new development, researchers at Meta have introduced COCONUT (Chain of Continuous Thought) in their paper “Training Large Language Models to Reason in a Continuous Latent Space”: a new approach that allows AI models to reason in abstract neural representations rather than words, much like we can. This innovation isn’t just about making AI more efficient — it is a shift in how we approach artificial intelligence, moving away from the limitations of language-based reasoning toward a more natural and powerful form of machine cognition. As we develop AI systems that can tackle increasingly complex challenges, this ability to think beyond the constraints of language could be key to achieving more sophisticated and human-like artificial intelligence.
The Problem with Language-Based AI Thinking
Current LLMs are limited in how they reason: they must state every step of their thinking process through language, in what’s known as “chain-of-thought” (CoT) reasoning. Imagine having to verbalize every single step when solving a math problem or choosing a move in chess — not just the key insights, but all the connecting thoughts and transitions. That would slow us down and put a linguistic box around what we can achieve. Information is often lost as we translate abstract ideas to their linguistic descriptions. This language-based reasoning is all today’s AI models can do, generating complete sentences and explanations even when much of this language serves only to maintain grammatical coherence rather than contribute to actual problem solving. It’s like forcing someone to narrate their entire thought process while solving a Rubik’s cube, when really, much of the solution comes from visual and spatial reasoning that doesn’t need words at all.
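To make that overhead concrete, here is a tiny sketch that counts how many tokens a model has to emit just to verbalize an ordinary chain-of-thought trace, compared with the answer alone. The arithmetic problem, the wording of the trace, and the choice of the GPT-2 tokenizer are all illustrative assumptions, not details from the paper:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")

# Every intermediate step of a chain-of-thought trace must be spelled out as tokens.
cot_trace = (
    "The train leaves at 3:40 pm and the ride takes 85 minutes. "
    "85 minutes is 1 hour and 25 minutes. "
    "3:40 pm plus 1 hour is 4:40 pm. "
    "4:40 pm plus 25 minutes is 5:05 pm. "
    "Answer: 5:05 pm"
)
answer_only = "Answer: 5:05 pm"

print(len(tokenizer(cot_trace).input_ids), "tokens to verbalize every step")
print(len(tokenizer(answer_only).input_ids), "tokens for the answer alone")
```

Every one of those connecting tokens costs a full forward pass through the model, whether or not it carries any real reasoning.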
This language-only approach is very different from how our brains work. Neuroimaging studies show that when people engage in some types of reasoning tasks — from mathematical problem-solving to spatial navigation — the brain regions responsible for language processing can remain largely inactive. Instead, other neural networks take the lead, processing information in abstract patterns and relationships that never get translated into words. This difference isn’t just academic — it reveals a significant inefficiency in how current AI systems reason. While we can directly manipulate abstract concepts and relationships in our minds, LLMs must constantly “translate” their reasoning into and out of natural language, adding unnecessary computational overhead and potentially limiting their ability to discover novel solutions that might be hard to express in words.
COCONUT: A New Approach to AI Reasoning
Instead of forcing AI models to reason through words, COCONUT (Chain of Continuous Thought) allows them to operate directly on their internal, continuous latent-space representations. Think of it like the difference between having to write out every step of a geometric proof versus being able to visualize and manipulate the mathematical concepts of angles and parallel lines directly in your mind. The system works by taking the AI’s internal neural state — its “thought” at a given moment — and feeding that directly back into the model as input for the next step of reasoning, without ever requiring it to be converted into words. This feedback process creates a continuous chain of abstract thoughts that can flow naturally from one to the next without translation.
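A rough sense of the mechanism can be given in a few lines of PyTorch. The sketch below is not the paper’s implementation: it skips COCONUT’s training procedure entirely, uses an off-the-shelf GPT-2 purely as a stand-in model, and picks an arbitrary number of latent steps. What it does illustrate is the core feedback loop, where the last hidden state is appended back onto the input sequence as an embedding instead of being decoded into a word:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

prompt = "Question: A train leaves at 3:40 pm and the trip takes 85 minutes. When does it arrive? Answer:"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids
embeds = model.get_input_embeddings()(input_ids)          # (1, seq_len, hidden)

num_latent_thoughts = 4   # hypothetical number of continuous-thought steps

with torch.no_grad():
    for _ in range(num_latent_thoughts):
        out = model(inputs_embeds=embeds, output_hidden_states=True)
        # The final hidden state at the last position is the "continuous thought".
        thought = out.hidden_states[-1][:, -1:, :]         # (1, 1, hidden)
        # Feed it straight back in as the next input embedding -- no token is decoded.
        embeds = torch.cat([embeds, thought], dim=1)

    # After the latent steps, switch back to ordinary token generation for the answer
    # (a single token here, just to keep the sketch short).
    out = model(inputs_embeds=embeds)
    next_token_id = out.logits[:, -1, :].argmax(dim=-1)

print(tokenizer.decode(next_token_id[0].item()))
```

In the actual paper the model is trained, stage by stage, to replace written chain-of-thought steps with these latent thoughts, and special markers tell it when to enter and leave latent mode; none of that training is reflected in this untrained sketch.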
COCONUT allows the AI to develop its own internal reasoning space, free from the constraints of human language, bringing several powerful advantages. First, it’s significantly more efficient, requiring fewer computational steps to reach solutions (and thus less energy as well). More importantly, the research showed that this approach allows the AI to consider multiple possible paths simultaneously — something that’s difficult to do when reasoning must be expressed linearly in words. The system can maintain several potential solutions in its abstract thought space and gradually refine them based on their promise, much like how humans often have multiple half-formed ideas that they’re evaluating simultaneously. In tests on complex reasoning tasks requiring extensive planning and search, COCONUT demonstrated superior performance while using fewer computational resources than traditional language-based approaches. By freeing AI from language, we might be opening the door to more sophisticated and nuanced forms of machine reasoning.
Parallels with Human Cognition
Consider how you catch a ball or ride a bicycle — these complex actions require sophisticated calculations of physics, timing, and spatial relationships, yet you perform them without any internal verbal commentary. This kind of wordless thinking isn’t limited to physical tasks; mathematicians often report thinking in abstract patterns and relationships rather than words, and artists frequently describe working through visual problems in purely visual terms. Forcing ourselves to verbalize our thought process can actually interfere with performance, as anyone who has tried to explain how they maintain their balance while walking can attest; athletes who overthink a well-practiced motion know this breakdown as the “yips”.
Our evolutionary past points the same way: we reasoned long before we developed language. Non-linguistic, abstract thought allows for faster processing of information, particularly in situations requiring quick decisions or complex spatial reasoning. When early humans needed to plan hunting strategies, navigate unfamiliar territories, or create tools, the ability to think in patterns, spatial relationships, and cause-effect sequences without the overhead of language would have been invaluable. Even in modern humans, some of our most sophisticated thinking happens in non-verbal forms — from the visual-spatial reasoning used by engineers and architects to the abstract pattern recognition employed by scientists and mathematicians. COCONUT mirrors this aspect of human cognition and suggests that we may be on the right path toward developing more capable and efficient artificial intelligence systems.
The Results
The proof is in the non-verbal pudding. The results of Meta’s COCONUT system were quite impressive, particularly in tasks requiring complex planning and multi-step reasoning. On logical reasoning problems that demanded extensive search and backtracking, the system not only outperformed traditional language-based AI approaches, but did so while using significantly fewer computational steps. For example, in one test called ProsQA that required navigating complex logical relationships, COCONUT achieved 97% accuracy compared to 77.5% for conventional methods, while using only a fraction of the computational tokens! This gain isn’t just about speed; it reflects a qualitatively different and more powerful way of approaching problem-solving.
Perhaps the most fascinating result was the emergence of sophisticated problem-solving strategies that were never explicitly programmed into COCONUT. The system naturally developed the ability to perform what researchers describe as a “breadth-first search”: considering multiple possible solutions simultaneously and gradually eliminating less-promising paths. It’s similar to how human experts often approach complex problems, maintaining several potential solutions in mind rather than committing to a single path too early. What’s particularly interesting is that this behavior emerged organically from the system’s architecture rather than from being trained in explicitly. Just as human brains naturally develop efficient problem-solving strategies through experience, COCONUT seemed to discover more effective ways of reasoning simply by being freed from the constraints of language-only thought.
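To picture what that search strategy means, here is an explicit toy version. COCONUT does this implicitly inside its latent space, and it also weights paths by how promising they look, which a plain breadth-first search does not capture; the graph and the goal below are invented purely for illustration:

```python
from collections import deque

# Toy knowledge graph (invented for illustration): each concept lists the
# concepts it can lead to. "start" is the question, "goal" is the answer.
graph = {
    "start": ["A", "B"],
    "A": ["C", "D"],
    "B": ["D", "E"],
    "C": [],
    "D": ["goal"],
    "E": [],
}

def breadth_first_path(graph, start, goal):
    """Keep every frontier path alive at once, dropping only dead ends."""
    frontier = deque([[start]])
    while frontier:
        path = frontier.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbor in graph.get(node, []):
            if neighbor not in path:       # avoid revisiting a concept
                frontier.append(path + [neighbor])
    return None

print(breadth_first_path(graph, "start", "goal"))   # ['start', 'A', 'D', 'goal']
```

A language-based chain of thought, by contrast, is more like a depth-first walk: it has to commit to one written path at a time and backtrack in words when that path fails.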
Future Implications
By demonstrating that AI can reason more effectively when freed from language constraints, Meta’s research opens new possibilities for developing artificial intelligence systems that think more like humans do. This enhanced reasoning could be particularly helpful in domains where human experts rely heavily on non-verbal reasoning, such as scientific discovery, architectural design, or medical diagnosis. Imagine AI systems that could process complex spatial relationships in protein folding, or reason about engineering problems using the same kind of intuitive physical understanding that human experts develop. This approach might also help bridge the gap between AI’s current capabilities and the kind of flexible, intuitive reasoning that we employ effortlessly.
The fact that both humans and AI systems seem to reason more effectively when operating in abstract spaces rather than through language alone suggests that there might be fundamental principles about how intelligence works that transcend the specific substrate (biological or artificial) in which it operates. Looking ahead, researchers are already exploring ways to expand this approach, including the possibility of pretraining AI models with continuous thoughts to enable more generalized reasoning capabilities. Another potential direction involves combining traditional language-based reasoning with continuous thought, allowing AI systems to switch between verbal and non-verbal modes of thinking depending on the task at hand — just like we do. This hybrid approach could lead to AI systems that are both more powerful and more natural in their problem-solving abilities.
Conclusion
The development of COCONUT is an important step in the evolution of artificial intelligence, suggesting that the path to more capable AI systems might lie beyond the constraints of language-based reasoning. Just as we don’t need to translate every thought into words to solve complex problems, AI systems may become more powerful when allowed to reason in abstract neural spaces without the constraints of language. This parallel between human and artificial intelligence suggests that certain principles of effective reasoning are universal, beyond the specific implementation details of biological or artificial systems. The future of AI reasoning might look less like the explicit verbal chains of thought we’re familiar with today, and more like the fluid, abstract reasoning processes that characterize human expert thinking. By letting AI systems develop their own internal representations and reasoning patterns, we may open the door to artificial intelligence that can tackle increasingly complex challenges with greater efficiency and sophistication.