Hypothesis: Inner Dialog Was a Key but Partial Exaptation for the Evolution of Language in Humans
The Reflective Exaptation Hypothesis is a synthesis of accounts of the origins of human language; it holds that inner dialog played a central role in refining the cognitive processes involved in speech.
Abstract
This article proposes the Reflective Exaptation Hypothesis, a new account of language evolution that positions inner dialog (inner speech)—a recursively structured, symbolically rich system of self-directed cognition—as the core substrate that was partially exapted into linguistic expression. Drawing on converging insights from evolutionary neuroscience, developmental psychology, and ecological theory, we argue that the internal structure of thought preceded and scaffolded the external structure of language. Unlike models that treat syntax as a communicative innovation or sudden computational leap, we propose that syntax was pre-adapted within inner dialog, and later externalized via coupling with expressive motor and social signaling circuits.
Our account builds upon Fitch’s (2011) exaptationist framework, which emphasizes neural reuse in motor and hierarchical planning systems, and extends it by identifying inner dialog, rather than imitation or planning per se, as the cognitively complete precursor to syntax. Integrating findings from Geva and Fernyhough (2019), we further ground our model in developmental evidence showing that the dorsal language stream matures in synchrony with the emergence of inner speech. Kolodny and Edelman’s (2018) Cognitive Coupling Hypothesis complements our view by offering a plausible ecological context, tool teaching, through which recursive thought systems and communication circuits were repeatedly co-activated and neurally integrated. Our model thus unites these perspectives to offer a functionally continuous, neuroanatomically plausible, and computationally tractable explanation of language origins. It generates specific testable predictions about the relationship between inner speech, syntax, neurodevelopment, and evolutionary context, reframing language not as a sudden invention, but as the externalization of structured thought.
Language, in this view, emerged through the coupling of this internal system with preexisting social signaling pathways. This coupling was further enabled by the generalization of fine motor control circuits, initially evolved for manual precision, to the articulatory apparatus. We synthesize developmental, neurological, comparative, and computational evidence to support this hypothesis, and we propose testable predictions to guide future inquiry into the layered origins of linguistic capacity.
Introduction
Language is a hallmark of human cognition, yet its evolutionary origins remain one of science’s most persistent enigmas. The dominant theories tend to frame language as an adaptation for communication. However, we offer a contrasting view: that language arose through the partial exaptation of a pre-linguistic cognitive system—inner dialog—that was originally non-communicative.
Inner dialog refers to the internally generated, often silent, stream of speech-like thoughts that humans use to plan actions, model scenarios, inhibit impulses, and recursively simulate outcomes. It was originally thought to emerge developmentally through the internalization of external speech (Vygotsky, 1987). Yet it takes on a life of its own so quickly, becoming a distinct cognitive workspace, that it appears anticipatory rather than purely a product of internalization.
We hypothesize that this system, having evolved for general-purpose reflective cognition, provided the structural and functional substrate for symbolic communication. Over time, it became partially coupled with expressive motor and social signaling circuits, enabling the externalization of structured thought as structured language. Language did not cause inner dialog, nor did inner dialog evolve for language. Rather, their coupling occurred because the structural affordances of inner dialog made it an ideal candidate for exaptive reuse as refined motor skill circuits emerged, optimized to control tool use.
Theoretical Framework: Exaptation and Cognitive Coupling
We ground our hypothesis in the evolutionary concept of exaptation (Gould & Vrba, 1982) and extend insights from neural reuse and cognitive coupling models (Fitch, 2011; Kolodny & Edelman, 2018). Inner dialog, rather than being an appendage of speech, is understood here as a core cognitive routine—deeply embedded in metacognition, prospective reasoning, and abstract modeling.
Its structure is recursive, combinatorial, and symbolic—qualities which render it inherently suited to expressing complex meanings. These properties likely predated the evolution of linguistic expression, making inner dialog a reservoir of latent syntactic potential. Rather than being selected for communication, language co-opted existing internal routines when the neural and ecological conditions permitted their expressive coupling.
Developmental and Neural Evidence
Developmental neuroscience supports a stepwise trajectory. The dorsal language stream—especially the arcuate fasciculus and Broca’s area (notably BA 45)—develops in parallel with the emergence of inner speech in children (Geva & Fernyhough, 2019). While the ventral stream is present at birth, the dorsal stream matures slowly, tracking the transition from overt private speech to silent inner dialog.
Crucially, BA 45 is not merely linguistic; it is involved in hierarchical sequencing, motor planning, and goal-directed abstraction. This multifunctional profile supports the notion that the cognitive architecture of language rests on general-purpose recursive circuits that were later co-opted for communicative use.
The Fine Motor Control Bridge
An often-overlooked link in language evolution is the role of fine motor control. Hominins evolved exceptional manual dexterity, enabling precise actions in tool use, foraging, and social grooming. These skills demanded real-time motor sequencing, feedback integration, and multilevel control.
We propose that these same fine-motor circuits—particularly within the primary motor cortex and supplementary motor areas—were later generalized to govern the vocal tract. This includes control of the lips, tongue, larynx, and velum. The repurposing of manual motor networks for vocal articulation enabled the externalization of inner dialog with sufficient resolution for phonological and syntactic complexity.
Importantly, this transition did not invent vocal control but reapplied existing precision systems to a new expressive domain. Articulation thus became a motor interface, capable of expressing the structures already present in inner dialog.
Exaptation Models in Language Evolution
Exaptation-based models of language evolution propose that non-linguistic systems were recruited for language. Fitch (2011) identifies motor hierarchies and vocal imitation as key precursors. Chomsky’s position—though controversial—suggests that language’s computational power emerged from non-communicative internal modeling systems, implying the existence of a Universal Grammar: an innate, species-specific generative architecture capable of recursive structure-building. This framework posits that a dedicated computational mechanism—most notably, the operation Merge—arose as a sudden evolutionary innovation.
In contrast, our hypothesis does not require such a domain-specific universal grammar. Instead, we propose that recursive, symbolic operations evolved incrementally within a general-purpose cognitive architecture, exemplified by inner dialog. These structures, shaped by pressures for planning, inhibition, and abstraction, were later partially exapted into external language. No single computational leap is necessary—only the gradual integration of preexisting modules, each with adaptive value independent of language.
Kolodny & Edelman (2018) propose the Cognitive Coupling Hypothesis, arguing that teaching and tool-making provided contexts where planning and communication circuits became functionally linked. We align with their structural premise but disagree on causality: inner dialog did not evolve for social coordination, nor is language simply an elaboration of teaching behavior. Instead, language emerged when existing reflective systems were partially exapted—not because they were already communicative, but because their structure lent itself to communication once expressive pathways matured.
Inner Speech and Symbolic Abstraction
Inner dialog ranges from fully specified verbalizations to condensed symbolic fragments (Fernyhough, 2004). These condensed forms often lack phonology but retain semantic abstraction and syntactic logic, enabling rapid thought traversal and recursive representation.
Such symbolic condensation—e.g., “if-then,” “before X,” or “not that but this”—forms the scaffolding of syntax. It allows for variable binding, nesting, and referential displacement—core features of linguistic expression. Once expressive channels (vocal or gestural) evolved to externalize these forms, structured language became inevitable. Thus, syntax was not “invented” for language—it was borrowed from the mind.
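The idea of nested symbolic fragments being externalized can be made concrete with a toy sketch. This is purely illustrative and is not a model from the literature cited here: the class names `Atom` and `Relation`, the `TEMPLATES` table, and the `externalize` function are hypothetical, and the three operators are simply the examples named above. It shows only that a recursively nested internal structure can be flattened into an ordered string by a small, separate expressive step.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Atom:
    """A condensed symbolic fragment with no internal structure."""
    label: str


@dataclass(frozen=True)
class Relation:
    """A binary relation whose slots can hold atoms or further relations."""
    operator: str  # one of the keys in TEMPLATES (hypothetical inventory)
    left: "Thought"
    right: "Thought"


Thought = Union[Atom, Relation]

# Assumed surface templates, one per condensed form mentioned in the text.
TEMPLATES = {
    "if-then": "if {0} then {1}",
    "before": "{0} before {1}",
    "not-but": "not {0} but {1}",
}


def depth(thought: Thought) -> int:
    """Nesting depth of the internal structure (an atom has depth 0)."""
    if isinstance(thought, Atom):
        return 0
    return 1 + max(depth(thought.left), depth(thought.right))


def externalize(thought: Thought) -> str:
    """Recursively flatten the nested structure into a linear string."""
    if isinstance(thought, Atom):
        return thought.label
    template = TEMPLATES[thought.operator]
    return template.format(externalize(thought.left), externalize(thought.right))


example = Relation(
    "if-then",
    Relation("before", Atom("rain"), Atom("dusk")),
    Relation("not-but", Atom("hunt"), Atom("shelter")),
)

print(depth(example))        # 2
print(externalize(example))  # if rain before dusk then not hunt but shelter
```

In this sketch the hierarchical structure exists before any linear output is produced, and the output step adds only ordering. That mirrors the claim being made, but it does not test it.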
Testable Predictions
Neurodevelopmental Disorders
Individuals with conditions that impair inner speech (e.g., autism, schizophrenia) will exhibit parallel deficits in syntax, working memory, and symbolic abstraction.
Neural Overlap
fMRI and lesion studies will show that inner dialog and syntactic production rely on shared regions and pathways, especially BA 45, the SMA, and the arcuate fasciculus.
Motor Reuse and Central Role in Autism
Neural pathways used for manual fine-motor sequencing will overlap with those controlling articulatory precision in speech, particularly in humans. Motor control deficits will play a central role in speech-related neurodevelopmental disorders, including autism.
Computational Simulation
Agents trained on internal planning and recursive reasoning, when paired with expressive modules, will spontaneously generate compositional output structures and learn to communicate with each other. If survival and reproduction are enhanced by sharing and acquiring information not directly observable by other agents, the trait will become fixed in the population.
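The fixation claim in the computational simulation prediction can be illustrated with a minimal population-genetic sketch. The code below is not the agent-based simulation proposed above. It substitutes a standard haploid Wright-Fisher model with selection, in which a heritable "information sharing" trait gives carriers a relative fitness of `1 + benefit`. All names and parameter values (`simulate_fixation`, `fixation_rate`, `pop_size`, `initial_freq`, `benefit`) are assumptions chosen for illustration.

```python
import random


def simulate_fixation(pop_size=200, initial_freq=0.05, benefit=0.1,
                      max_generations=5000, seed=0):
    """Haploid Wright-Fisher model with selection on a single trait.

    Returns (final_frequency, generations_elapsed). The run stops early
    once the trait is lost (frequency 0.0) or fixed (frequency 1.0).
    """
    rng = random.Random(seed)
    carriers = max(1, round(pop_size * initial_freq))
    for generation in range(1, max_generations + 1):
        freq = carriers / pop_size
        # Carriers have relative fitness 1 + benefit; non-carriers have 1.
        weighted = freq * (1.0 + benefit)
        p_carrier = weighted / (weighted + (1.0 - freq))
        # Each offspring independently inherits the trait with p_carrier.
        carriers = sum(1 for _ in range(pop_size) if rng.random() < p_carrier)
        if carriers == 0 or carriers == pop_size:
            return carriers / pop_size, generation
    return carriers / pop_size, max_generations


def fixation_rate(benefit, trials=200, **kwargs):
    """Fraction of independent runs in which the trait reaches fixation."""
    fixed = 0
    for trial in range(trials):
        final_freq, _ = simulate_fixation(benefit=benefit, seed=trial, **kwargs)
        if final_freq == 1.0:
            fixed += 1
    return fixed / trials


if __name__ == "__main__":
    print("neutral trait:   ", fixation_rate(0.0))
    print("beneficial trait:", fixation_rate(0.2))
```

Under these assumptions, a trait with no fitness effect is usually lost by drift, whereas one with a sizeable benefit fixes in most runs. A full test of the prediction would replace the fixed `benefit` with fitness that emerges from agents actually exchanging information.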
Positioning Within Prior Work
The Reflective Exaptation Hypothesis builds directly upon the exaptationist framework advanced by Fitch (2011), sharing the core premise that human language emerged from the evolutionary repurposing of non-linguistic neural systems—particularly those involved in motor planning, hierarchical sequencing, and vocal control. Like Fitch, we emphasize the importance of key neuroanatomical structures such as Broca’s area (especially BA 44/45) and the arcuate fasciculus, and we acknowledge the foundational role of motor circuits in the emergence of syntactic processing. Both models reject continuity-based, adaptationist accounts that posit a direct line from animal communication to human language. Instead, both propose that disparate cognitive and sensorimotor systems were independently shaped by evolutionary pressures and later co-opted into a composite functional network that supports language.
Where our hypothesis departs is in its framing of the computational and cognitive substrate that enabled this co-option. Most notably, we place inner dialog (or inner speech)—a recursively structured, symbolically compressed system used for introspective thought, planning, self-monitoring, and conceptual manipulation—at the center of language evolution. While Fitch identifies multiple candidate circuits (including those supporting vocal imitation, hierarchical motor control, and cortical control of phonation), our model proposes that the internal architecture of inner dialog was itself the primary precursor to syntax. It provided a naturally recursive, combinatorial representational system that could be partially exapted for external communication without the need for any communicative function in its initial form.
This inner dialog system, we argue, evolved for domain-general reflective cognition—allowing early hominins to simulate alternative actions, inhibit impulses, structure goals, and integrate knowledge across time. These functions are inherently syntactic and symbolic, albeit non-verbal in their earliest stages. Once vocal articulation systems matured, this structured cognition could be externalized with minimal modification. In other words, syntax did not evolve for communication—it was co-opted from an already structured internal system of thought.
A second major contribution of our hypothesis is the emphasis on motor generalization, particularly the repurposing of fine motor control circuits—originally evolved for manual dexterity in tasks such as tool use, grooming, and gestural manipulation—into articulatory control mechanisms governing the lips, tongue, larynx, and soft palate. This generalization did not occur in isolation but likely unfolded gradually through overlapping use in gesture and vocalization, allowing for increasing resolution in the expression of structured internal representations. Unlike models that treat speech control as a domain-specific innovation, we argue that speech emerged from existing motor systems under selective pressure for precision, feedback integration, and control—traits already well-developed in the hands.
This shift in expressive substrate enabled a modular convergence: a coupling of symbolic inner dialog, preexisting social signaling circuits, and fine-motor expressive capacities. Importantly, our framework does not require the emergence of Universal Grammar, Merge, or any domain-specific computational mechanism. Instead, we posit that symbolic recursion, referential abstraction, and structural nesting were properties of inner dialog well before they were ever externalized in speech. Language, then, did not arise through a punctuated genetic event or a single evolutionary “leap,” but through the incremental binding of independently evolved systems—each with its own adaptive rationale.
Finally, our hypothesis is distinguished by its developmental and computational tractability. While Fitch rightly emphasizes the role of comparative neuroanatomy and convergent evolution, our model invites additional lines of inquiry: longitudinal neurodevelopmental studies of inner speech emergence, neuroimaging of syntax and self-directed cognition, and simulation-based models of internal planning systems undergoing coupling with expressive channels. We predict, for instance, that populations with disrupted inner speech (e.g., in autism, schizophrenia, or post-lesion aphasia) will also exhibit deficits in syntactic abstraction and symbolic reference—even in the absence of motor or social impairments. Such predictions are testable and falsifiable, and they open the door to integrative empirical research.
The Cognitive Coupling Hypothesis (Kolodny & Edelman, 2018) offers a compelling model of how pre-existing neural systems for action planning and communication may have become functionally linked through repeated co-activation in ecologically relevant contexts, most notably the teaching of tool-making among kin. Their framework provides a plausible mechanism by which systems initially used for sequential motor control and social coordination could have become structurally and developmentally yoked, giving rise to the capacity for symbolic language. Our Reflective Exaptation Hypothesis is fully compatible with this view, but contributes a novel and essential specification: the content of the structure being exapted. We propose that what became externalized through this coupling was inner dialog, a recursively structured, symbolically rich, self-directed cognitive system that evolved independently of communicative demands. The Cognitive Coupling Hypothesis describes the how and where of neural convergence; we supply the what: the internal syntax that provided the substrate for linguistic expression. Together, the two models form a layered account of language evolution that is ecologically grounded, neuroanatomically plausible, and computationally constrained, while avoiding the need for domain-specific grammars or mutation-based leaps.
A Synthesis
The Reflective Exaptation Hypothesis offers a unifying synthesis between Fitch’s (2011) exaptationist model of syntax and Geva and Fernyhough’s (2019) developmental account of inner speech. While Fitch convincingly argues that human language arose through the repurposing of neural systems originally used for motor planning, vocal control, and hierarchical sequencing, he does not fully articulate the cognitive substrate that gives rise to syntactic structure itself. Our hypothesis builds on this foundation by identifying inner dialog—a recursively structured, symbolically condensed, and developmentally emergent system of self-directed cognition—as the primary functional precursor to syntax. Unlike general motor hierarchies or imitation circuits, inner dialog already encodes the hierarchical, compositional properties that define language, making it a natural candidate for exaptation into symbolic communication. This advances Fitch’s model by specifying the internal architecture that provided the representational raw material for syntax before it was expressed vocally.
In parallel, we integrate the developmental neurobiological findings of Geva and Fernyhough (2019), who demonstrate that inner speech emerges only as the dorsal language stream matures, particularly the fronto-temporal and fronto-parietal segments. This maturation mirrors our evolutionary account: just as children cannot engage in structured inner dialog without a developed dorsal stream, early hominins could not externalize symbolic thought without the neuroanatomical infrastructure to support it. Furthermore, their work emphasizes bidirectional causality—how sociocultural interaction shapes neural development—offering a mechanism by which inner dialog could evolve adaptively prior to its externalization. By combining Fitch’s evolutionary modularity with Geva and Fernyhough’s developmental trajectory, our hypothesis bridges the cognitive, neural, and evolutionary domains. We propose that language did not evolve to enable complex thought, but that it evolved by partially exapting the already structured substrate of inner dialog, made expressible through motor generalization and socially scaffolded communication.
In summary, while we are aligned with Fitch’s exaptationist logic and modular decomposition of language, the Reflective Exaptation Hypothesis offers a more integrated and functionally continuous model of language emergence, grounded in the symbolic architecture of inner thought. It proposes that language is not merely a repurposed signaling system but the external echo of an internal syntax, scaffolded by motor convergence and made tractable through modular reuse. This model brings cognitive, motoric, social, and symbolic systems into a common evolutionary framework, and it emphasizes how structure precedes expression, just as thought precedes speech.
Speculative Conclusion
Language did not evolve as a single-purpose solution to a communicative problem. It is the product of convergence: an emergent synthesis of inner dialog, fine-motor control, and social signaling systems, each with its own evolutionary history.
The Reflective Exaptation Hypothesis posits that language arose from the partial co-option of a system evolved for reflective thought. This coupling was made possible by shared structure—not shared origin. As these systems intertwined, symbolic communication became possible—not by design, but by the latent affordances already embedded in our neurocognitive architecture.
Language, then, is not a biological anomaly—it is a bricolage, an evolutionary patchwork of modules that, when aligned, opened the door to thought made public, mind made mobile, and history made possible.
References
Chomsky, N. (1980). Rules and representations. Columbia University Press.
[Overview of internal language (I-language), competence vs. performance, and UG concepts.]
Chomsky, N. (1995). The minimalist program. MIT Press.
[Introduces the Merge operation and the minimalist approach to Universal Grammar.]
Fernyhough, C. (2004). Alien voices and inner dialogue: Towards a developmental account of auditory verbal hallucinations. New Ideas in Psychology, 22(1), 49–68. https://doi.org/10.1016/j.newideapsych.2004.09.001
[Discusses expanded vs. condensed forms of inner speech.]
Fitch, W. T. (2011). The evolution of syntax: An exaptationist perspective. Frontiers in Evolutionary Neuroscience, 3, Article 9. https://doi.org/10.3389/fnevo.2011.00009
[Outlines exaptive origins of syntax and motor hierarchies.]
Geva, S., & Fernyhough, C. (2019). A penny for your thoughts: Children's inner speech and its neuro-development. Frontiers in Psychology, 10, Article 1708. https://doi.org/10.3389/fpsyg.2019.01708
[Reviews dorsal language stream development and inner speech emergence.]
Gould, S. J., & Vrba, E. S. (1982). Exaptation—a missing term in the science of form. Paleobiology, 8(1), 4–15. https://doi.org/10.1017/S0094837300004310
[Coined and elaborated the concept of exaptation.]
Kolodny, O., & Edelman, S. (2018). The evolution of the capacity for language: The ecological context and adaptive value of a process of cognitive hijacking. Philosophical Transactions of the Royal Society B: Biological Sciences, 373(1743), 20170052. https://doi.org/10.1098/rstb.2017.0052
[Proposes the “Cognitive Coupling Hypothesis” in language evolution.]
Vygotsky, L. S. (1987). Thinking and speech. In R. W. Rieber & A. S. Carton (Eds.), The collected works of L. S. Vygotsky (Vol. 1, pp. 39–285). Plenum Press.
[Seminal work on the internalization of external speech and the development of inner dialog.]
It makes sense that language would be a natural product of neural tissue that is already engaged in semiosic processes.