Doing things: reconstructing hominin cognitive evolution from the archeological record [version 1; peer review: 2 not approved]

Following Pain’s (2021) critical assessment of the prospects of minimal capacity inferences within cognitive archeology based on ‘classical’ cognitive science, I elaborate on the chances of these inferences within so-called embodied, embedded, extended, and enacted (4E) frameworks. Cognitive archeologists infer the cognitive abilities of past hominins from the remains found in the archeological record. Here they face the problem of choosing a theory from the cognitive sciences. Results vary considerably, depending on one’s cognitive theory, so choice matters. Where classical views conceive cognition as mainly involving representations and computing, more recent 4E approaches focus on interactions between environment, body, and brain: hence the same trace, like a stone tool, might require capacities like a mental ‘blueprint’ according to the former, but only environmentally guided perception according to the latter. Given this crucial choice of theory, what are the prospects of 4E then? I present a model of cognitive hominin evolution based on 4E and niche construction theory. Based on this model, I argue that we should be guardedly optimistic: contrary to first impressions, minimal capacity inferences work well within the 4E framework, and adopting 4E might give us a methodological advantage, too.


Introduction
Cognitive archeology is the brave attempt to infer the cognitive abilities of past people from their material remains, thereby exploring how the cognitive evolution of hominins, and eventually humans, has unfolded. This is no easy task. To explain the evolution of human cognition requires understanding which forces shaped and reshaped, over and over again, the cognitive underpinnings of hominin lifeways from several million to only a few thousand years ago. To yield such explanations, cognitive archeologists have a varied toolkit of inferences (cf. Abramiuk 2012; Currie 2018), but regarding how our ancestors thought, two kinds of inferences stand out: minimum capacity and cognitive transition inferences (cf. Pain 2021; Currie and Killin 2019; Killin and Pain 2021). The former infers cognitive capacities from material remains (often stone tools): what cognitive ability enabled the behavior required to produce the material trace we found in the record? The latter infers a change in cognitive abilities from a transition within the material record.
Yet the material remains aren't sufficient to license an inference from stone to mind. Cognitive archeologists must interpret them considering a cognitive theory or model. Here, choice matters, for the results vary enormously with one theory or the other. Take the two 'grand' families of theories currently discussed in the cognitive sciences: the classical or 'orthodox' (Wheeler 2005; cf. Malafouris 2020) information-processing approach views cognition as mainly involving representations and computing; more recent embodied, embedded, extended, and enacted (4E) approaches focus on interactions between environment, body, and brain to yield cognitive abilities instead. So, the same trace in the record, say a specific stone tool, might require capacities like a mental 'blueprint' or sophisticated long-term memory according to classical theories, but only environmentally guided perception and simple learning according to ideas from the 4E family.
Cognitive archeologists have no choice but to decide. Only by using a theory or model can they make inferences from artifact to cognitive capacity. And the choice of theory or model comes first. It is useless to switch between those 'grand' families of theories case by case, or to run through every trace from A to Z to see which theory would yield which result. Particular artifacts might fare better with one approach, but these theories are about general aspects of the human mind. So we can't switch from 4E to 'orthodox' views from one artifact to another. The human mind is either one or the other (or something altogether different, but we all have to await how the history of cognitive science will unfold). So, hominin cognitive evolution is a narrative about the unfolding of a mind as described by either 4E or classical theories. (What cognitive archeologists could contribute to the current understanding of human cognition is a different story. For example, they could show that, by and large, we yield better explanations of hominin cognitive evolution if we prefer 4E over 'orthodox' views (or vice versa). Such an endeavor could be, quite literally, an inference to the best explanation from the (best explanation of the) material record to which of the 'grand' theories on extant human minds gains more plausibility. But this is not the theme of this paper.) There is no way around it: cognitive archeologists must make their best guess about theory first and then get to work. So, if we want to yield evolutionary trajectories of hominin cognitive abilities, established (mainly) by the above inferences, which theory should one choose? This paper shall not decide this question. Instead, I want to elaborate on how inferences from the archeological record yield evolutionary trajectories of hominin cognition.
Section 2 introduces this 'inferential engine': I present the early cognitive archeology approach, which by and large used 'orthodox' cognitive science, along with its problems. It will be a reference for later sections. In Sections 3 to 5, I discuss this 'inferential engine' within the context of 4E. Section 3 introduces the 4E cognitive framework; Section 4 provides an example from the emergence of numeracy in the Neolithic as a 'proof of concept' for this framework. It shows the applicability of 4E but raises the question of how to infer 'thinking from things' within this framework. Section 5 will thus revisit these inferences. I argue that we should be guardedly optimistic: not only do these inferences work well within the new framework, but adopting 4E might give us a methodological advantage, too. I start by characterizing the explanation we aim for.

Lineage explanations
Cognition is a complex trait. Like any such feature, it is unlikely to have evolved in just one step. Instead, one must reveal how 'baseline' organisms (without the trait in question) could turn into organisms with the trait. Any evolutionary trajectory of such length consists of many steps, and the selection of all these steps couldn't happen in just a single event, either. There must have been an ongoing interplay between events, which selected for adaptions, and these adaptions then being starting points for the next step toward the trait of interest.
To account for such trajectories, we need so-called lineage explanations (Calcott 2008; Sterelny 2012b). Such explanations must make plausible how every single step along the line could have emerged. Each step must start from a platform: a scenario that sets the stage with a certain selective pressure and already includes a possible precursor to an adaption to this pressure. The step itself then consists of only a small transformation of this precursor: it is modified just enough to function as an adaptive variation. (This requires, of course, that there indeed is selective pressure for such a transformation, and that this adaptive response was within reach for the organism.) Once selected, the new stage will afterward have to perform as a platform for the next transformation.
The process becomes iterative, so a modified platform becomes a new stage, which functions as a platform for the following modification (and so on). Each step must be the transformed version of the previous setting, and each stage must be selected in its scenario. So to explain the emergence of a complex trait, we need an idea of a baseline and information about a series of small steps, where each step, in turn, consists of a platform, an event selecting for new variation, and an account of how a transformation of the platform could function as such a required variation (i.e., as an adaption to specific selective pressures). Only with these ingredients might we yield an evolutionary trajectory: from one state (without the trait) through a series of other states (with precursors of the trait in question) until a state with the trait emerges.
Take the manifold evolution towards complex eyes as an example (I follow Nilsson's (2009, 2013) exposition here). Eyeless animals did not turn into animals with complex eyes overnight. In a nutshell, it started with cells only "monitoring the ambient light intensity" (Nilsson 2013, 6; i.e., non-directional photoreception), which evolved into pigment cells capable of directional photoreception, followed by eyes capable of low-resolution vision. Finally, descending from this stage, a compound lensed eye capable of high-resolution vision for "detection, pursuit, and communication with other animals" (Nilsson 2013, 6) evolved. Each of the stages enumerated is a step toward complex eyes. Furthermore, each of these steps had to be selected for to be a platform for the next step.
Likewise with hominin cognition. To be sure, the evolution of the vertebrate eye covers a period of 600 million years (Lamb, Collin, and Pugh 2007). The specific characteristics of the modern human mind evolved much more quickly. Yet grant that the main explananda of cognitive archeology are "single individual components of cognition" (Wynn 2019, 501), say memory, spatial cognition, or language, and that their complete unfolding "occurred at several points in hominin evolution" (2019, 6): such traits are unlikely to evolve without precursors over such long time spans. Hence intermediate stages of precursors must have met selection pressures that plausibly occurred, so as to bring forth the (preliminary) final stage of the trait in question. Again, there must be a series of platforms transforming into new adaptions due to new selective pressures, time and again, until the trait in question could arise. In short, we need lineage explanations for hominin cognitive traits, too.
How does cognitive archeology achieve such an objective? In the remainder of this paper, I elaborate on how inferences made from the archeological record shall yield such lineage explanations. To start with, I present the early cognitive archeology approach.

Early cognitive archeology and its inferences
Brains alone enable cognition, according to classical cognitive science. If so, brains are the sole basis of cognitive capacities. These capacities, in turn, cause behavior, and behavior causes artifacts. If so, one can read off from an artifact first the behavior required (think Chaîne opératoire). One can then ask in a second step: what cognitive capacity is needed to produce this behavior?
This makes artifacts in the archeological record 'markers' for cognitive change within the hominin lineage. Artifacts function as indirect evidence of the occurrence of a particular mental capacity of their producer in the past. This first use of the archeological record is complemented by a second one: evidence about hominin brain evolution, as it emerges from the interpretation of fossilized skulls. Evidence about increasing brain size seems to correspond to the emergence of more complex artifacts, at least in a general pattern and over long timescales. Another example not based directly on artifacts is the well-known research by Robin Dunbar, which relates estimated growth in group size with increasing relative neocortical volume within the hominin lineage (Dunbar 1998). Bigger groups increase social complexity, a cognitive challenge one must cope with. Hence, if only brains realize cognition, increasing cognitive demand should manifest itself in changing brain size or structure (or an increase in neocortical volume, as in this case).
So, if we see a change in material culture and a corresponding change in brain size or structure, everything seems to match. Artifacts come into existence through the behaviors of their producers, and these behaviors become possible due to certain cognitive capacities they entertain, ultimately because their brains realize these capacities. Since skulls and stone artifacts both fossilize, we have the in-principle possibility to track change in the archeological record and hence the evolution of cognition. Two kinds of inferences are at play here.
The first inferential practice here is the so-called minimum capacity inference. This kind of inference is best characterized as inference to the best explanation (cf. Pain 2021, 248): the assumed cognitive capacity functions as the best explanation for how an agent could bring forth the behavior required to produce the desired artifact as an outcome. (It is, however, rather common to frame these inferences as a deductive modus ponens inference; cf. Abramiuk 2012, 143ff.) One must be parsimonious here: no more cognitive capacities than absolutely required shall be assumed. And eventually, which capacities are assumed to be necessary hinges on one's cognitive theory. Two different theories might explain differently how the same behavior is brought forth. For example, executing the behavior required to yield something as sophisticated as an Acheulean hand axe might involve entertaining 'mental blueprints' according to one theory, but only perception and learning of associations according to another (cf. Killin and Pain 2021; an excellent illustration of this is the case studies given in Killin and Pain 2022, as well as the discussion on the changing interpretation of Oldowan stone tools by Wynn 2023). So, the right choice of cognitive model matters (pace the overall tenet of Killin and Pain 2021), yet I will discuss the details later on (see Sections 4 and 5).
On an abstract level, such an inference runs as follows:
Analysis: Artifact A requires behavior B to be produced.
Inference to the best explanation: Assumed a cognitive capacity C, this would be the best explanation for how an agent could bring forth B (and so A).
Hence: The agent (as the producer of A) likely was capable of C.
This kind of inference is a building block for tracking the chronology of cognitive change.
Minimum capacity inferences allow us to identify the single stages within the trajectory, where the transition between these stages into one another is argued for by cognitive transition inferences, which are compounds of single minimum capacity inferences. Pain sketches them as follows:
(1) Run a minimal-capacity inference on a technology, Y.
(2) Run a minimal-capacity inference on a technology X, where the appearance of X predates the appearance of Y in the record.
(3) Identify a capacity, C, that is the best explanation for Y but not X.
(4) Infer that:
a. C was absent during the period X was produced; and
b. Y signals the emergence of C. (Pain 2021, 249f.)
Note that (3) is already included in (1), something imprecisely depicted by Pain. A more sophisticated account would also add something like this for (2):
(3') Identify a capacity, C₋₁, that is the best explanation for X (but not Y);
thereby changing the conclusion into:
(4) (b') The transition from X to Y signals the transition from C₋₁ to C.
In any event, the application is simple: A minimum capacity inference gives us an idea about the cognitive capacity required to produce a particular artifact. So two of these inferences on either side of a transition in the archeological record from one artifact to another should signal a change in underlying cognitive capacity.
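The logical shape of this schema can be made explicit in a short sketch. What follows is only an illustration under stated assumptions, not a method proposed by Pain or by cognitive archeologists: the names `Technology` and `cognitive_transition` are hypothetical, the dates are placeholders, and the capacity sets must be supplied by the analyst as the outcome of prior minimum capacity inferences (which is exactly where the choice of cognitive theory enters).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Technology:
    """A technology in the record (hypothetical illustration type)."""
    name: str
    first_appearance_kya: float   # thousand years ago; larger means older
    best_explanation: frozenset   # capacities assigned by a minimum capacity
                                  # inference; theory-dependent analyst input


def cognitive_transition(x: Technology, y: Technology) -> dict:
    """Compare two minimum capacity inferences, steps (1) to (4) plus (3')/(4b')."""
    # Step (2): X must predate Y in the record.
    if x.first_appearance_kya <= y.first_appearance_kya:
        raise ValueError("X must predate Y in the record")
    # Step (3): capacities that best explain Y but not X.
    emerging = y.best_explanation - x.best_explanation
    # Step (3'): capacities that best explain X but not Y.
    earlier = x.best_explanation - y.best_explanation
    # Step (4): absent while X was produced; signalled by the appearance of Y.
    return {"from": earlier, "to": emerging, "absent_during": x.name,
            "signalled_by": y.name}


# Placeholder inputs: abstract technologies X and Y with assumed dates.
x = Technology("X", 80.0, frozenset({"C_minus_1"}))
y = Technology("Y", 40.0, frozenset({"C"}))
result = cognitive_transition(x, y)
```

Note that the sketch does no inferential work of its own: two theories that assign different capacity sets to the same artifacts will produce different transitions, which is why the choice of cognitive framework cannot be postponed.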
Frederick Coolidge and Thomas Wynn offer an example of this strategy (Coolidge and Wynn 2018; also discussed in detail by Pain 2021, 250ff.; cf. also Sterelny 2017). They argue that one major leap in the cognitive development of Sapiens occurred due to enhanced working memory, caused by "an additive genetic mutation or an epigenetic event that affected the neural organization of the brain" (Coolidge and Wynn 2018, 233; also cited by Pain 2021). Coolidge and Wynn posit enhanced working memory as the basis for abilities like abstract reasoning, contingency planning, or symbolic thinking. Once such a sophisticated cognitive capacity has been identified as an underlying cause of modern human cognition, the question arises when this capacity emerged during our evolutionary history. Although they argue for this transition in a more complex manner than I illustrate here, at one point within their argumentation Coolidge and Wynn turn implicitly to a cognitive transition inference.
Consider the well-known figurine from Hohlenstein-Stadel, South Germany, ca. 40 kya. Since it depicts a lion-headed human, Coolidge and Wynn infer that the maker of this artifact must have been, at minimum, capable of abstract reasoning. The concepts of 'lion' and 'human' had to be combined into one abstract concept to yield a 'mental blueprint' for making this artifact. (As I only want to illustrate the 'inferential engine' at work here, I won't discuss this step in their overall argument.) Since abstract reasoning, according to their line of thought above, requires enhanced working memory, the Hohlenstein-Stadel figurine indicates the occurrence of this capacity. This leads to the question of whether another artifact, which predates this figurine, also requires something like abstract reasoning. Coolidge and Wynn discuss the equally well-known Blombos beads from South Africa (ca. 77 kya) as a possible candidate but infer that for these beads learning simple associations suffices (where the latter does not need any enhanced working memory) (cf. 2018, 262f.). Their cognitive transition inference in a sketch:
(1) Minimum capacity inference: (An artifact like) The Hohlenstein-Stadel figurine requires enhanced working memory.
(2) Minimum capacity inference: (An artifact like) The Blombos beads require learning of simple associations (only); where the appearance of the beads predates the appearance of the figurine in the record.
Of course, the groups associated with the beads and the figurine weren't directly related, so to mark the abstract nature of the inference above, I added 'an artifact like' in brackets; yet this does not change the implied chronology. Coolidge and Wynn eventually locate the emergence of enhanced working memory somewhere between 70 and 40 kya. To conclude, cognitive transition inferences allow one to sketch a chronology of the arrival of cognitive capacities, in principle allowing one to aim for lineage explanations of the capacity in question.
(Note that rather often, no information about the selective regime(s) is included, and thus no argument about what caused the adaptions. Yet this is needed to explain a trait's complete lineage. The (imagined) story in early cognitive archeology, therefore, rather often seems to be that as soon as new cognitive abilities arose by genetic variation, hominins explored and exploited new opportunities with them, including the new things these abilities allowed them to do. This way, positive selection for new possibilities drove hominin evolution in new directions.)
From a theoretical perspective, all this is good practice. The problem arises with the details. Three different kinds of issues are to be identified here. First, the fossil record of brain sizes and structures doesn't fit the record of material culture as straightforwardly as it once appeared (for a summary of the discussion see Sterelny 2021, 8557ff.; cf. also Kuhn 2020). The first appearance of stone tools is now associated with hominins 3.3 mya, long before brain size increased. Likewise, despite some coincidence of brain evolution in Erectines and Acheulean hand axes 1.7 mya, innovations such as the Levallois technique are now dated to 500 kya, with no corresponding change of brain size in the record. In general, the record does not reveal a unidirectional shift from simple to complex tools, along with increasing encephalization in the hominin lineage, but rather a fragile appearing and disappearing of tool technology with seemingly no direct link to brain evolution.
Second, a related problem is that we can track changes in material culture where it is relatively safe to assume that no significant brain size changes occurred. Two causes for this are discussed, the first one being demographic. Group size might affect change in material culture with no change in the underlying cognitive capacities of its members.
One possibility would be that small group sizes inhibit innovation. For example, Premo and Kuhn (2010) have argued that Middle Paleolithic hominins might have lacked the ability to innovate due to high extinction rates within their groups, causing "the demographic fragility of the small social groups in which they lived" (Premo and Kuhn 2010, 8; also discussed by Pain 2021). High extinction rates lead to loss of information, which hinders the transfer of required techniques; in the end, no resources to innovate remain. In the same vein, Sterelny and Hiscock (2014), amongst others, argue that bigger groups would accelerate change in material culture due to a higher ability to store and transmit information. Examples would be the onset of diversity in the material record in Middle Stone Age Africa 100 kya and the Upper Paleolithic (once so-called) 'cultural revolution' in Europe 40 kya, both associated with an increase in (local) population size, long after the speciation of Homo sapiens and their brains 300 kya. If true, demographic factors might explain a change in material culture without an underlying change in brain size or structure.
The second cause could be environmental in nature. Different environments might call for different toolkits to be deployed. Pain (2021) discusses the work of Torrence (2001) here: for example, variation in food availability should select for more varied toolkits to secure these resources (for further discussion, see also Sterelny 2017, 238f.). So environmental factors could play a role in the sophistication of toolkits, making them a possibility to be considered. Ultimately, the problems sketched so far undermine the direct inference from (change in) material record to (change in) cognition, at least given the assumption that brain size or structure is the sole unidirectional cause of cognitive capacity.
So, finally, the third problem might be a change in plausible accounts of cognition itself. As mentioned above, minimum capacity inferences depend on one's choice of cognitive framework; at first glance, they work well only with 'orthodox' cognitive science, assuming that there is a unidirectional causal link from brain structure to mental capacities, and from there to the artifacts uncovered in the archeological record. Yet within the last 30 years, the cognitive sciences have witnessed the rise of an alternative framework, the so-called 4E's (embodied, enactive, extended, and embedded). In line with at least some of these E's, new research also indicates high neuronal plasticity of human brains during both onto- and phylogeny (Anderson 2014). If true, then brains (alone) don't equal cognition, and tools (alone) no longer indicate cognitive capacities. Then, how can we infer cognitive abilities and track change within these abilities along evolutionary time scales? In the next section, I elaborate on this alternative framework before discussing its application in the remainder of this paper.

A new cognitive framework
Cognition isn't a complex feature only; it's also always the feature of someone. Turning this insight into a new framework of cognition has consequences. Such an agent has a body, and the physical make-up of this body matters to their abilities, including cognitive ones; hence cognition is embodied. Often, these abilities become possible only thanks to things external to one's brain; thus, cognition is extended. Likewise, interaction with one's environment, in general, plays an important role here; hence cognition is embedded. Even more radical, abilities aren't just stored as mental representations within one's brain, as something (previously) thought about and then executed: they only come into being through active engagement with the environment; hence cognition is enacted. And finally, if the new cognitive sciences are on the right track, cognition is a skill: something to be learned in a socially scaffolded environment. Cognition isn't inherited by genes alone but by developmental resources culturally transmitted. Given that this framework is correct, it will change how we tackle lineage explanations. Let me introduce it in some detail.
The body shapes cognition in at least two ways. First, one's physical make-up is not only controlled by cognition. Rather the other way round, bodily parts constitute cognitive processes as well. The typical example here is walking: balancing during locomotion requires constant feedback from physiological structures in direct contact with the ground (plus the inner ear), not cognitive control over physiological structures via amodal representations of how to walk. So, cognition emerges by feedback loops via one's body with the environment, not by top-down control of the extra-cranial environment (including one's body), an idea to be explored further below. Second, a body constitutes the way it can use opportunities to interact with its environment; the latter changes how cognition unfolds. For example, as Ben Jeffares (2010, 2013) argues, transforming a body from a predominantly obligate biped (think Ardipithecus) to a habitual biped like the Australopithecines also changes their cognitive biases. Since there is not much evidence about brain structure and material culture, Australopithecines are much neglected in accounts of cognitive evolution. Most often, they serve as a 'chimp-like' baseline only. However, if Jeffares is right, bipedalism allowed them to interact more with the world. Their activities become more autonomous, and range size increases while constraints on their behavior cease. For example, they can now carry things over more considerable distances or exploit materials differently through a new way of hand-eye coordination. All "(t)hese behaviors set up the potential for new cognitive skills" (Jeffares 2013, 12f.). So due to a change in their bodily form, Australopithecines were open to new cognitive forms: a changing physical make-up allows them to exploit new opportunities, where such a changing set-up realizes new cognitive make-ups.
This also changes how the external environment can be used for cognitive purposes in a small but significant subset of these activities. Suppose one makes marks on a sheet of paper to visualize the number of paragraphs still to write, then scratches out one after another. It is an example of using artifacts to accomplish a cognitive task: "physical objects made by humans for the purpose of aiding, enhancing or improving cognition" (Hutchins 1995, 126). Here two aspects stand out. First, some of the artifacts humans use aid cognition in the strong sense that without these aids, the cognitive process would be impossible. Hence the artifact is part of this ability. (Although, for our purposes here, we shouldn't descend into metaphysical debates about minds.) Second, such cognitive artifacts do not always foster or 'externalize' a preexisting 'internal' cognitive ability; sometimes, the task at hand gets solved by changing the cognitive abilities required to cope with it. Cole and Griffin (1980) pointed out that 'remembering' something by writing it down means that one transforms the task into a "different set of functional skills" (ibid.). In this case, writing it down involves visual skills once used for face recognition (reading) rather than memorizing complex themes, for example. Further examples of cognitive artifacts are calendars, writing systems, and mnemonic devices (items that help one remember things, like a string on your finger). Some have even argued that cultural practices like proverbs or rules of thumb are cognitive artifacts, too (Norman 1993). So we're surrounded by all kinds of cognitive artifacts which form part of our mental life.
Generally, the environment one is embedded in facilitates one's cognitive possibilities. For hominins, these environments have long been both social and self-created. I discuss both of them in turn.
Self-created environments play a massive part in hominin cognitive evolution. Here Niche Construction Theory enters. In a nutshell: organisms modify their environments, thereby changing the selective pressures on themselves. So organisms, to some extent, create the selective regime (to which they have to adapt) themselves. A prime example is selection for lactose tolerance in those Homo sapiens groups (and only those) which engaged in dairying (cf. Boivin 2008, 200f.). In niche construction theory, the term 'niche' refers to just this: a niche is always selective.
Furthermore, niche construction can result in evolutionary feedback: "organisms drive environmental change and organism-modified environments subsequently select organisms" (Laland and Brown 2006, 96). The latter again alter their environments anew for their descendants. Take Osvath and Gärdenfors (2004), who deploy niche construction to analyze the cognitive evolution of early hominins. According to them, a changing hominin niche 2.5 mya caused the coevolution of transport and planning. Transportation of raw materials and tools expanded at that time. Hence behavior sequences stretched more and more. Finally, according to Osvath and Gärdenfors, anticipatory planning became mandatory: hominins already planned these behavior sequences, yet to handle ever longer transports, they had to plan at a certain level in anticipatory mode. So here we have a case where a self-created niche of stone tool production and transportation imposed a cognitive challenge to which subsequent hominins had to adapt.
Yet it isn't only selection pressures that get culturally transmitted via niche construction. One final idea in 4E cognition is that of cognition being enacted. Cognition doesn't start with the passive reception of information, then acting on it to inform decisions, and finally becoming behavior in the world. Rather, cognition is already doing things: it is through ongoing interactions with one's environment that cognition gets realized. This way, cognitive abilities come into being by doing. Interactions with one's environment (with or without artifacts) become "transformational and not merely informational" (Di Paolo, Rohde, and De Jaegher 2010, 39; cited by Overmann 2019, 432). A famous example (to be elaborated in the next section) is the idea that manipulating mathematical symbols is already the cognitive part, not just an external representation of pre-established inner lines of thought. (In this case, being enacted makes the mind also embedded or extended, since without the outer material, the kind of cognition considered wouldn't be possible.) So, cognition isn't just doing: for humans, and most likely many hominins before us, it is doing with things (cf. Baggs, Raja, and Anderson 2020). Malafouris (2010) has applied this line of thought even to knapping, arguing that no mental 'blueprint' guides the making of a hand axe, for any "decision about where to place the next blow, and how much force to use, is not taken by the knapper in isolation; it is not even processed internally" (Malafouris 2010, 17). Rather: "The flaking intention is constituted, at least partially, by the stone itself … (as) an integral and complementary part of the intention to knap" (2010, cf. also his 2013, 175ff.). If so, cognition turns out to be a skill, an activity often involving transforming material, turning it into artifacts, or operating with already produced (cognitive) artifacts. Such activities have to be learned.
In the context of hominin cognitive evolution, this also has important implications. As Karola Stotz (2010, 2017) argued, it is not only selective niches we inherit; humans also inherit the resources to deploy and develop answers to these self-imposed challenges. Humans can construct new variations and pass them on to later generations via scaffolds of teaching and learning (Sterelny 2003, 2012a). These teaching and learning scaffolds are required for the "robust and reliable development of species-specific traits" (Stotz 2017, 5). Humans, and very likely even hominins much earlier in our lineage, are no exception here. Even more important for our context is that these scaffolds enable "developmental plasticity" (2017, 5), which allows for an adaptive response to selective pressures, including the self-imposed ones of niche construction theorists. Here 4E, especially the idea of enactive cognition and 'materiality' (extended), comes into play: developmental niche construction and 4E cognition in tandem can explain rapid adaption to selective pressures, theoretically even within less than one generation. As mentioned above, some have argued that cultural practices like proverbs or rules of thumb are cognitive artifacts. To repeat: these artifacts do not work because of their material properties only, but "are always embedded in larger socio-cultural systems that organize the practices in which they are used" (Hutchins 1995, 127). One has to embody rules for making these items, which makes social learning a prerequisite of their use. So not only selective regimes but also variations, some of them adaptions-to-be, including cognitive artifacts and skills, can come into being and be inherited 'self-made'.
To get down to the basics: a selective niche (or rather, the selective part of a niche) fosters the cognitive skills required to cope with it. Sometimes a cognitive task can be solved by exchanging one ability with another and adding a material medium. Both production and use of this material, however, get established by preexisting cognitive skills. (Preexisting cognitive skills allow one to create new cognitive artifacts, which, possibly in tandem with other preexisting skills, allow for the cognitive skill required. The latter explains the emergence of new cognitive skills out of only preexisting ones.) Once acquired, new cognitive skills allow one to change things already existing, adapt to what is coming next as future opportunities, or re-direct and channel slowly accelerating developments in otherwise impossible ways. And all these abstractly sketched doings are instances of a down-to-earth biological process: niche construction in both its selective and developmental manifestation.
Theories of 4E cognition are still far from becoming orthodox in the field. Yet granted they are on the right track, how does this change how we should offer lineage explanations of hominin cognitive evolution? One advantage is high phenotypic plasticity: if cognitive skills depend on and are formed by social environments and technologies, they can come into being quickly and change rapidly (cf. Jeffares 2013). It also allows for high adaptability to the evolving niches: if cognitive skills don't have to wait for a generation (or more) to come into existence, a population can meet its challenges more quickly. Hence, given we have enough archeological information about changing (selective) niches, and information about their inhabitants' potential resources to form new cognitive skills, cognitive change becomes easier to track than before. (I will elaborate on this in Section 5 below.) In contrast, within early cognitive archeology, as sketched in Section 2 above, change was caused by a relatively slow variation on the genetic base of neural structures. Also, the cognitive transition inferences discussed above potentially allowed for tracking cognitive change but couldn't give a detailed answer as to which selection pressure the change was a response to. Regarding the use of the archeological record to identify cognitive skills, however, things aren't as straightforward as they were before. For one thing, if cognition is distributed and scaffolded, the archeological record of the size and organization of hominin brains no longer serves as direct evidence of corresponding cognitive change (Sterelny 2017, 243). For another, if there is no longer a clear causal arrow from cognition to artifact, artifacts no longer straightforwardly equal cognitive capacities. Hence we also lose another source of evidence.
If so, how to construct reasonable lineage explanations based on 4E-cognition and Stotz's conception of developmental niche construction? Showing this new framework in action works best with cognitive history, where we have relatively clear evidence of cognitive artifacts. I sketch an example from the early neolithic to make a case in point.

An example: the cognitive life of clay
When cognition is a compound of skills, its emergence must be the stepwise assembling of its components, either by transforming previously existing elements or recombining old ones to yield novel outcomes. Given a certain niche as the baseline for a lineage explanation, the existing skills must be sufficient to function as a platform. A minor transformation or recombination must yield a new cognitive skill as a step from one niche to the other. This iterates, and skills already existing within this new niche must suffice as a platform for the next step. (Granted that there is indeed a need for, and thus selection for, these skills.) This way, skill after skill accumulates till everything is in place to explain the trait in question, be it a cognitive (compound of) skill(s) such as language, planning, mind reading, or numeracy.
Applying this scheme might look like the following case study on the emergence of numerical cognition in the Ancient Near East (ANE), 10,000-5,000 before present (BP). The case is well known to any cognitive archeologist: it was introduced by Denise Schmandt-Besserat (1992, 1999, 2009, 2010), discussed in detail from an enactivist viewpoint by Lambros Malafouris (2013; for a critical stance see Johnson and Everett 2021), and further elaborated by Karenleigh Overmann in recent years (2016, 2017, 2019). I rephrase this account of numerical cognition in the remainder of this section, adding the critical dimension of the selective regime(s) here and there to yield a complete lineage explanation.
Hunter-gatherers stroll around; but around 10,000 BP, some human groups in the Eurasian region became ever more locally bound. Agriculture emerged for the first time in the Fertile Crescent, from modern-day Syria to Iran. The reasons for this are multiple, scaffolded by other occurrences, and evolved over a more extended period than the once-used term 'neolithic revolution' might suggest. In any event, farming produced a surplus of food, with the need for storage, eventually yielding a new redistribution economy. This engendered social upheavals and cognitive ones as well.
A need arose to control these streams of goods. This raised seemingly trivial but challenging problems (cf. Schmandt-Besserat 2009, 152). Vast amounts of grain cannot be pushed around easily to 'see' who should get what. Unruly animals are no easier to handle in this regard. Sometimes, grain is still on the field and not yet delivered; thus, the actual transaction will occur in the future. This likewise fostered new cognitive challenges. First, one had to keep track of all the goods redistributed, even though it was not always easy to account for them. Furthermore, the number of goods soon exceeded any human ability to memorize them. So because of these new soil-bound activities, the early farmers required a new technique to cope with their reshaped way of life.
The ability to solve this problem was soil-bound, too, although in a slightly different sense: the new cognitive ability required was made possible by using small things made of clay, so-called tokens. They were counters used to administer the redistribution of various community needs (cf. Schmandt-Besserat 2009, 147). Albeit used for counting, they worked differently from our abstract numerals (where numbers are comprehended as detached from any particular object to be counted). In the beginning, there were about 12 different shapes to describe various goods, like cones, spheres, cylinders, disks, tetrahedrons, etc. Furthermore, each kind of token stood in a one-to-one correspondence to units of goods. For example, to account for three jars of oil, three ovoid tokens had to be selected to represent this amount of goods. Even for different quantities of the same good, different counters were used: for small and large units of grain, for example, cones and spheres, respectively.
Tokens are cognitive artifacts made of clay. They functioned differently from our way of counting, yet despite these differences, we have here the emergence of a new cognitive skill: counting and keeping track of items, realized by preexisting cognitive abilities, joined by a material artifact designed to fulfill this very task. Overmann (2019, 440) argues that counting with so-called 'restricted' numbers (up to 20, but very often only one, two, many) might have been there already, using either finger counting or a tally (basically something like a stick or bone with marks on it). 'Artifacts' to memorize items were likely also in play at that time, as Late Pleistocene hunter-gatherers certainly already used narratives to remember things (see Section 3 above). Yet tokens are abstract in a much more direct manner and hence communicate the information contained (number of goods) much more directly. We also see how one skill (i.e., memorizing by oral communication) might have been replaced by another one to fulfill a new cognitive task. Tokens are a case in point here: they 'build' a model with a straightforward one-to-one correspondence, thereby visualizing what to know rather than remembering it. This makes access to information in some sense much easier (cf. Schmandt-Besserat 1999, 25).
Yet another interesting feature of the token system is that one must learn how to use it to decode the information contained (cf. Schmandt-Besserat 1999, 25). The cognitive skills attached to it remain elusive without being initiated into this system. This is a prime example of the developmental niche argued for by Karola Stotz. Once a challenge comes up, and so the selective regime attached to it, a combination of already existing skills might solve the problem. But these skills -or the recombination thereof -must be taught, learned, and transmitted to become a staple adaption to the challenge in the long run.
Once established for counting and keeping track of goods, this new material artifact of tokens initiated new complex cognitive operations. For example, they quickly allowed for "patterning, the presentation of data in a particular configuration" (Schmandt-Besserat 1999, 25). So the ancient accountants could arrange the units according to a type of good, respective value, date of entry or expenditures, and so on. Furthermore, "just by moving or removing tokens(,) they could add, subtract, multiply, and divide" (Schmandt-Besserat 2009, 153). The latter operations are a new cognitive skill made possible by engaging with the material (cf. Overmann 2016, 2017, 2019). Compared to finger counting, for example, tokens allow for higher numbers counted and thus increase the likelihood of such complex algorithms like multiplication to become deployed (cf. Overmann 2019). The materiality of tokens, i.e., sets of freely arrangeable and combinable objects, fosters exploring the possible ways of combining, separating, bundling, and de-bundling them (cf. Overmann 2019, 446). Here, from an enactive viewpoint, "(b)rains can focus on what they do best: managing interactions with the world" (Anderson 2014, 232), thus eventually realizing these arithmetical operations. For "(c)ognitive processing emerges from -is indeed identical to -these iterated interactions" (2014, 232). This includes interactions with cognitive artifacts like tokens. As this new skill "stretched human cognition to cope with new levels of complexity" (Schmandt-Besserat 1999, 25), it also allowed for further regional development.
By 5,500 BP, the region's economies had become more extensive. Tokens, in turn, became more 'complex', and their numbers rose to 350. As urban workshops entered the production of goods, there were now tokens for raw materials like wool and copper, as well as crafted products, for example, bread, beer, garments, jewelry, or textiles. Some are iconic; they 'picture' the good they stand for. This is evidence that specialists crafted them for their purpose. Shortly after that, emerging urban settlements got involved in redistribution processes. The token system still functioned well despite the multiplication of both kinds and numbers of exchanged goods.
For the accountants of the urban settlements could cope with the new cognitive demand still using tokens, although the skill changed from mere tracking and counting to basic arithmetic operations. But again, looking back at the previous niche, all the basic skills required to realize this new skill were already there to meet this demand (as described above). Hence it would have been only a comparatively small step within our lineage explanation. This is not the only change, however.
About 5,300 BP, temple officials administered a redistribution of goods. By now, failing to deliver goods would lead to penalties for producers. Transactions between different producers and consumers became more complex, too. Any exchange was by now officially regulated under contracts, with severe consequences for breaches. To keep track of these debts, tokens were put into hollow clay balls, so-called 'envelopes', until the debt was paid. And to keep track of the content of these 'envelopes', the same tokens were pressed on the outside of them while still wet, leaving marks on the surface of the clay.
This practice of pressing tokens into clay turned out to be a momentous event. Gradually, three-dimensional tokens standing for specific amounts of goods became two-dimensional signs. For example, cones and spheres for grain became wedge-shaped and circular signs. Within a century, the envelopes were replaced by solid clay tablets; yet the impressions of the tokens on the surface remained. This yielded the next step, for "(b)y innovating a new way of keeping records of goods with signs, the envelopes created the bridge between tokens and writing" (Schmandt-Besserat 2009, 149). This new skill was direly needed.
With city states growing ever bigger, around 5,100 BP, the kinds and numbers of goods reached an unprecedented level. This challenged the administrative officials to transform their formerly used cognitive artifacts. They still needed new ways to keep track of all goods. The first change was brought about by no longer pressing tokens into the tablet. Instead, officials now used a pointed stylus to sketch the former forms directly on the surface. This opened up an essential possibility. From now on, the officials no longer recorded the ever-higher number of goods to be accounted for by repeating the token assigned to the good as often as necessary, for this would have produced long rows that are hard to survey. Instead, they used a specific sign for the good, preceded by signs to represent the numbers of this good. These are the first (proto-)numerals in Mesopotamia, and eventually, they made "obsolete the use of different counters and numerations to count different products" (Schmandt-Besserat 2009, 153). Interestingly, no new signs were invented, but the old signs for a small amount of grain (the wedge) and a large amount of grain (the circle) were now assigned arithmetical values, for example, '1' for the former and '10' for the latter. A quantity of '33 jars of oil' hence was presented as three circles (10 + 10 + 10), followed by three wedges (1 + 1 + 1), again followed by the sign for 'jar of oil' (based on the former token used for this good).
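The difference between the one-to-one token record and the later sign notation can be made concrete with a minimal sketch. It assumes the example values given above ('1' for the wedge, '10' for the circle); the function names and the list-of-strings representation are purely illustrative and are not taken from the archeological literature.

```python
def token_record(quantity, token):
    """One-to-one correspondence: one token per unit of the good."""
    return [token] * quantity


def proto_numeral_record(quantity, good_sign, circle_value=10, wedge_value=1):
    """Number signs first, then a single sign for the good.

    Assumption: circle = 10 and wedge = 1, as in the example in the text.
    """
    circles, remainder = divmod(quantity, circle_value)
    wedges = remainder // wedge_value
    return ["circle"] * circles + ["wedge"] * wedges + [good_sign]


old_style = token_record(33, "ovoid token")
new_style = proto_numeral_record(33, "jar of oil")

print(len(old_style))  # 33 tokens, one per jar
print(new_style)       # three circles, three wedges, then the sign for the good
```

Under these assumptions, the same quantity takes 33 tokens in the older system but only seven signs on the tablet, which illustrates why long rows of repeated tokens could be dropped.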
This new cognitive artifact involving (proto-)numerals was still needed for the same function, but it had to be adapted to the changing niche. Concretely, it had to adapt to ever more data to be manipulated to redistribute ever bigger streams of goods between ever more participants in the whole process. But again, all skills required for this were already there: recombined with this new artifact, this unique compound became a platform for the next step. Schmandt-Besserat summarizes this well: "When these cognitive skills had been internalized for several millennia, the human mind was ready for new strides in abstraction. Concrete counting with tokens was the necessary foundation for the invention of writing." (Schmandt-Besserat 2009, 153) In the long run, the transition from tokens to tablets led to both writing and mathematics. A change in material (and hence cognitive) artifact fostered new ways of abstraction and allowed for new behaviors (cf. Schmandt-Besserat 2009, 153; cf. also Overmann 2019, 447). As for mathematical operations, the split between (proto-)numerals and accounted commodities allowed scribes to explore further possibilities of what one can do with numbers. First used only for "recording and communicating numerical information" (Overmann 2019, 457), the sheer volume of numerical information recordable in a concise way via the new medium (e.g., multiplication tables) allowed the scribes of this information to eventually come up with new and more abstract ways to calculate with numbers than possible within the token system. As Overmann states: "This gave scribes more options for calculating than using tokens: They could additionally use information from tables or memory, a factor in developing new, complex algorithms for manipulating relations between numbers." (Overmann 2019, 457)
This niche then provided the next platform, for "numbers are a cognitive technology that enables the management of complexity, allowing for even greater complexity to emerge" (Overmann 2019, 450). So, in turn, this allowed for the next step within our lineage explanation toward complex numerical cognition.
Current 'enactivist' neuroscience might add evidence (cf. Anderson 2014, 232f.). As indicated above, Anderson has argued that 'doing math' is "a sensorimotor skill characterized by the iterative interactions with external symbols" (2014, 233). Following enactive cognition, getting engaged in mathematical reasoning is a case of practices with symbols, whereas these symbols function not as "abstract indicators of some mathematical statement to be reproduced in an inner language of thought but rather as icons" (2014, 233; cf. also Malafouris's (2013, 115) interpretation of the transition from tokens to tablets here). The spatial manipulation of these icons would then be part of mathematical reasoning, where changing the spatial properties likewise changes mathematical content (cf. also De Cruz 2012; De Cruz and De Smedt 2013; Menary 2015). If so, Anderson claims, mathematical reasoning could be disrupted when the spatial manipulation of these symbols is disrupted, thereby showcasing the underlying sensorimotor processes realizing this cognitive skill.
And indeed, there is evidence for this. Anderson (2014, 234-36; like Menary 2015, 14f.) discusses several experiments from Landy and colleagues, all indicating that cognitive processing varies with the spatial properties of the external artifacts used. For example, Landy and Goldstone found that spacing around operators in mathematical equations matters. In one study (Landy and Goldstone 2007a), participants were asked to write down equations by hand: interestingly, they placed numbers to be added farther apart than numbers to be multiplied, thus indicating that spacing might be a cue for processing the content of an equation. In the next step, Landy and Goldstone (2007b) asked participants to judge whether addition and multiplication equations are true or false. In one set of these equations, incorrectly doing addition before multiplication led to false judgments. Also, in this set, the spacing around '+' was closer than around '*' (contrary to the way the participants above would have written down the equation by hand, at least regarding spacing). The results matched the prediction that spacing is a cue. In this set, only 55% of the participants gave a correct judgment, compared to 80-90% with two other sets of equations, where spacing was either even (thus 'neutral') or 'consistent' (tighter around '*' than '+'). This suggests that "episodes of formal reasoning are indeed typically organized by attention-based interactions with external environments" (Landy and Linkenauger 2010, 2168; cited by Anderson 2014, 237). The latter is also supported by yet another experiment where the illusion of motion interferes with the way notations need to be moved on a page to solve the equation (for details, see Landy and Goldstone 2009), displaying again that problem-solving in mathematics appears to be a matter of how we arrange symbols on the page (cf. also Menary 2015, 14).
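What 'consistent' and 'inconsistent' spacing amount to can be illustrated with a small sketch. The numbers and function names below are illustrative assumptions and are not the stimuli used by Landy and Goldstone; the sketch only shows how tight spacing around '+' groups the operands that the precedence rule says should not be combined first.

```python
def precedence_value(a, b, c):
    """Value of a + b * c under the standard rule: multiplication first."""
    return a + b * c


def addition_first_value(a, b, c):
    """Value obtained when the addition is (incorrectly) carried out first."""
    return (a + b) * c


def render(a, b, c, tight):
    """Write a + b * c with narrow spacing around the operator given in `tight`."""
    plus = "+" if tight == "+" else " + "
    times = "*" if tight == "*" else " * "
    return f"{a}{plus}{b}{times}{c}"


consistent = render(2, 3, 4, tight="*")    # spacing agrees with precedence
inconsistent = render(2, 3, 4, tight="+")  # spacing suggests adding first

print(consistent, "=", precedence_value(2, 3, 4))
print(inconsistent, "=", precedence_value(2, 3, 4),
      "but reads like", addition_first_value(2, 3, 4))
```

In the inconsistent layout, the visual grouping and the formal rule pull apart, which is the situation in which the reported accuracy dropped.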
Manipulation is here of mathematical notations, not concrete objects like tokens (the former being 'marks on paper', whereas tokens rather resemble something like an abacus). However, interestingly, both consist of spatial manipulation of objects in one sense or another. If true, this might be additional evidence from current neuroscience (although given under an 'enactivist' interpretation of brain anatomy and activities) that the new cognitive framework is the right one to give proper explanations of why and how cognitive skills unfold in human populations (and history).
Interim conclusion. The new cognitive framework works best when cognitive artifacts are preserved and we have rich knowledge of (changing) selective regimes. In this case, we have enough information to make a convincing lineage explanation; and the idea that cognitive change is realized by acquiring new skills neatly explains the stepwise transition from one state to another. (Note, however, that no minimum capacity inferences were deployed in this example, and hence the characterized cognitive transitions here do not depend on compounds of them. This is probably because the 'artifacts' under consideration for these inferences would be streams of goods like goats or grain, but in principle, such inferences would be possible here, too.) Indeed, the specific case of ANE numerical cognition might be an inference to the best explanation for the new framework, in contrast to its 'orthodox' alternative. Early cognitive archeology runs into a 'representational puzzle' here: When the first (proto-)numerals were used, the same sign could change its value depending on the good accounted for; only a millennium later, the first 'real' abstract numbers seem to appear in the record (cf. Damerow, Englund, and Nissen 1988). But if genetically induced numerical cognition came first, why do we find a 'misuse' of (proto-)numerals in the beginning, not outright abstract numbers, as soon as there is a switch from tokens to tablets? This indicates that the new framework might be the better model of the human mind to work with.
Things become more complicated with the deep past, however. Already at 75-80 kya, with artifacts like the already mentioned 'Blombos beads', possible stages of a lineage explanation become ambiguous. In principle, these beads could be a cognitive artifact for counting just like tokens were, and hence form part of an explanation of the evolution of numerical cognition. There is reason to assume that hominins at that time would have been capable of this skill, and the materiality of the beads might allow for concepts like "one more" or "as many as" (another string of beads) (cf. Overmann 2019, 448). However, other uses are also possible (like being a social ornament, for example). Furthermore, in this specific case, a selective regime as an incentive to assume counting is ambiguous, too. For groups were rather small, so the subsistence needs of the group might not have necessitated counting, unlike in the case of the ANE people of 10,000 BP discussed above, where a redistribution economy had to be accounted for. As Overmann (2019, 438) convincingly argues, to go beyond restricted numbers, a group's "demographic density" must generate a corresponding need first. So interpreting these beads as a cognitive artifact for counting wouldn't be the most likely candidate explanation. One has to be careful since even if the materiality of an artifact allows for being a cognitive artifact, this doesn't imply that this possibility has also been realized. Yet ambiguities like this don't rule out any use of lineage explanations for the deep past (including our inferences discussed in Section 2). I discuss their prospects within the 4E framework in the final section.

Inferences revisited
Cognitive history might provide a proof of concept for the new framework, but persistent challenges remain regarding the deep past. Cognitive artifacts, if they existed, virtually never 'fossilized'. Language skills, the ability to follow norms, causal reasoning, teaching complex behavior sequences to others, and the like don't leave direct evidence in the archeological record. Likewise, traces indicative of such abilities are fragmentary. They leave much room for speculation.
Here, minimum capacity and cognitive transition inferences enter. Yet Malafouris (2020) locates archaeologists using them in the 'orthodox' camp. Historically, this is right, and even the philosophers discussing these inferences (again Pain 2021; Killin and Pain 2021; Currie and Killin 2019) tend toward representationalism or even computationalism. However, this 'inferential engine' can be deployed usefully by an enactivist archeologist, too.
Minimum capacity inferences are still a valuable tool within the 4E framework, too. First of all, the presence of a particular artifact tells us something about the minimum behavioral requirements at least some members of the group in question had to be capable of. They still show us with sufficient certainty which behavior sequences the maker of an artifact had to realize. The link between the behavior required to produce an artifact and the cognitive requirement to bring forth this same behavior is affected by the choice of cognitive theory in the background, but it leaves the former behavioral part of the inference unimpaired. As such, we get a 'base' from which to determine the cognitive demand for executing the required behavior sequences. (How this demand is formulated, however, might vary with one's choice of cognitive theory; see below.) So, the function of some artifacts as 'markers' for cognitive abilities remains intact. Furthermore, extracting behavior from artifacts yields enormous potential for future studies. Miriam Haidle's cognigram-account is a prime example here. To name just three, she has worked out meticulously the behavior sequences required to produce such different artifacts as a folded leaf (to scoop water) by Pan troglodytes, wooden spears by Heidelbergensians or a bow-and-arrow set by anatomically modern Sapiens, among many others (Haidle 2009, 2012; Lombard and Haidle 2012).
Things also look good concerning selective regimes. There are traces of the complexity a group had to cope with: in foraging, regarding their range size, ecological variety, and so on. These likely made up much of the selective niche. Furthermore, in terms of the overall complexity of a niche, they tell us something additional to minimum capacity inferences on the cognitive demand of this very niche. In sum: both the cognitive demand at a given niche and the selective regime of the previous niche, which had to select for the ability to fulfill this demand, can, to some extent, be inferred from the material record.
However, this only displaces problems. The question is how exactly a specific cognitive skill for an earlier identified demand was realized at a given niche. Within early cognitive archeology, the assumption was that a change in cognitive capacity equals a change in brain size or structure. Hence any change in the archeological record on fossilized skulls could be an argument for any cognitive change identified earlier within the archeological record on artifacts. Within the 4E framework, this simple covariance can't be assumed anymore.
Even more daunting, as argued in Sections 3 and 4, most advanced cognitive skills are compounds of many 'soft' cognitive artifacts and other skills. So given a specific cognitive demand is identified via an artifact, one must first analyze how a skill meeting this demand might have been realized out of other cognitive skills. Since 'soft' cognitive artifacts and skills generally didn't fossilize, unlike clay tablets with mathematical inscriptions, much of the information has to be obtained indirectly from the material record, with all uncertainties of such an endeavor attached. Nevertheless, even if fragmentary, attempts to construct a lineage explanation have their virtues: they order the available evidence in the most coherent way possible, and they build the base for further investigations by allowing new hypotheses to be formulated afterward (cf. Currie and Sterelny 2017).
This gives us the following road map. Given 4E-cognition, a composite of cognitive skills that meets the cognitive demand must be realized. This means:
(1) Given an artifact at niche N, infer the cognitive demand at N (by conducting a minimum capacity inference on this artifact).
(2) Analyze which cognitive skill could meet this demand, and which single components this skill could have been made of.
(3) Try to infer these single skills from the material record of a niche previous to N; i.e., identify the single components of a previous niche and whether they could create the new cognitive skill in a small step (given the previously identified cognitive demand).
(4) Finally, try to identify the selective regime of the previous niche and whether this regime selected for the cognitive demand in question.
This would be the reconstruction of one step within a lineage explanation: from a platform -the single components of a previous niche -and a selective regime within this previous niche towards a new skill (given the previously identified cognitive demand by a minimum capacity inference of a particular artifact) at the niche under consideration (see Figure 1).
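The structure of such a single step can be sketched schematically. The following is a minimal sketch under stated assumptions: the `Niche` class, the `reconstruct_step` function, and all skill labels are hypothetical placeholders loosely modeled on the token case, not a method proposed in the literature, and whether a step counts as 'small' is reduced here to the question of whether any component is missing from the platform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Niche:
    """A niche as far as it can be reconstructed from the material record."""
    name: str
    skills: frozenset            # component skills inferred for this niche
    selected_demands: frozenset  # cognitive demands its selective regime favors


def reconstruct_step(previous, demand, components):
    """Check one step of a lineage explanation toward the skill meeting `demand`.

    `components` are the single skills the new skill is assumed to be made of.
    """
    missing = components - previous.skills
    return {
        "platform_sufficient": not missing,
        "missing_components": missing,
        "selection_plausible": demand in previous.selected_demands,
    }


# Hypothetical labels; placeholders for illustration only.
early_farming = Niche(
    name="early farming niche",
    skills=frozenset({"restricted counting", "tallying", "working clay"}),
    selected_demands=frozenset({"tracking redistributed goods"}),
)

step = reconstruct_step(
    early_farming,
    demand="tracking redistributed goods",
    components=frozenset({"restricted counting", "working clay"}),
)
print(step)
```

A step would count as reconstructed only if the platform is sufficient and selection for the demand is plausible; a non-empty set of missing components marks where further evidence from the material record would be needed.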
Cognitive transition inferences also remain part of the 'inferential engine'. Given two artifacts A and A': If A predates A' and both are indicative of different cognitive demands, where the demand of A' exceeds the demand of A, an inference based on both these artifacts still would demonstrate a cognitive change from A towards A'. However, much information must be obtained to yield a complete lineage explanation within the 4E framework. Again, the cognitive skills meeting the demands required to make A and A' need to be analyzed. Being composites, based on which other skills were they realized? Regarding the cognitive skill to produce A, did the single components of this skill already exist in a niche prior to A? Did all components required to realize the skill necessary for A' already exist within the niche of A? Do we have evidence in the material record of A's prior niche that makes selection towards A (or the cognitive demand associated with A) likely? Do we have evidence in the material record of A's niche that makes selection towards A' (or the cognitive demand associated with A') likely? Given these pieces of information can be obtained, complex lineage explanations become possible as well, based on sequences of cognitive transition inferences. The evolution of anticipatory planning based on Oldowan and Acheulean stone tool production has already been mentioned in Section 3. Other examples (by the same author) would include causal reasoning (Gärdenfors and Lombard 2018, 2020) or teaching (Gärdenfors and Högberg 2017).
Here the authors apply a combination of different streams of evidence to reconstruct niches of certain hominins (e.g., from Homo habilis, Erectines, and so on), apply an analysis of the cognitive requirements to produce certain artifacts (e.g., that the Oldowan industry required "demonstrative teaching"), and eventually track cognitive transition through a change in the artifactual material record (e.g., the transition towards an artifact like bow and arrow as indicative of "Causal Network Understanding"; cf. Gärdenfors and Lombard 2018, 3). Gärdenfors and Lombard (2020) even highlight the importance of technology in forming hominin cognition during evolution, i.e., they apply the embedded mind thesis to their evolutionary trajectory.
To be sure, the approaches mentioned above are not enactivist. However, in principle, it would be possible to rephrase the 'cognitive part' of their explanations in terms of (the assemblage of) cognitive skills, roughly along the lines mentioned above and as exemplified in Sections 3 and 4. (Although this would also change the interpretation of the artifacts and require further information about possible precursor skills in their reconstruction of the hominin niches, of course.) To conclude, lineage explanations based on the 4E framework are possible even in the deep past, and there is also an important methodological implication involved here.
Thinking of cognitive capacities as complex skills makes any evolutionary explanation of them an entangled affair. To give an example (albeit still a rather abstract one): Suppose a cognitive skill that requires more or less explicit norm following -say, argumentation perhaps. Internalizing or publicly announcing norms itself is a cognitive skill. In this regard, Jonathan Birch (2021a, 2021b) has argued that Acheulean-style hand axe production implies the capacity for norms. If probable, any lineage explanation of our cognitive skill must thus be in accordance with his account (or otherwise convincingly argue against it).
There is a lesson to learn here. The point just made iterates, and here entanglement becomes a methodological system of 'checks and balances'. The new way we would construct lineage explanations within any 4E framework (as skills composed out of other skills, which have to function at different times as different platforms for additional skills to emerge) has consequences: it fosters the alignment of the chronologies of all single components. Our account of a specific skill must fit with other lineage explanations, for it will have to be in accordance with any plausible account (itself constructed as a lineage explanation) of every component of the skill under review.
In early cognitive archeology, following 'orthodox' cognitive science, a cognitive capacity is understood as a quasi-independent component of the human mind. In this picture, human cognition is an assembly of quasi-independent subsystems. Of course, interdependencies are considered, too: as mentioned above, Coolidge and Wynn argue that abstract reasoning depends on enhanced working memory; likewise, Ian Tattersall (2017) argues for an association of language with symbolic behavior (that is, for an inference from archeological evidence suggestive of symbolic behavior to the emergence of language). By and large, however, comprehending cognitive capacities as subsystems, initiated by genetically evolved brain structures first, makes them appear to be relatively stable 'units'. They give the impression of being examinable in relative isolation from the other cognitive components of one's overall cognitive make-up.
Cognitive skills are much more fragile in this regard (cf. Heyes 2018, 217f.). There are far more interdependencies to be considered. This pushes different accounts (which are, given the usual division of labor in our discipline, often worked out by various colleagues) into alignment with one another regarding their chronology of "cognitive events": at which time (and in which niche) which skill was realized and accessible to a population. In the end, this inter-connectedness of different cognitive skills makes any account of the 'package' of an enacted hominin mind much more coherent. We now have different chronologies of different skills, which must match each other across different lineage explanations. In other words, when constructing a lineage explanation for a specific cognitive skill, we must meticulously strive for "consilience" (Wilson 1998) with virtually every account of the individual skills that function as components of our skill. With so many checks and balances, we naturally progress to a much more balanced account.

Conclusion
To sum up the state of play: Minimum capacity and cognitive transition inferences are used by 'orthodox' and 4E cognitive archaeologists alike. Within an 'orthodox' framework, however, this implies that brains equal cognition and inferences made from artifacts are directly indicative of cognitive capacities. Hence any transition in artifacts should equal a transition in fossils indicative of changing brain encephalization, something no longer supported by the archaeological record (as far as we know; see Section 2). Within a 4E framework, this problem vanishes: in principle, one could argue that any change in the material record on artifacts is accompanied by a change in cognitive skill. This might be interpreted as an epistemological incentive for the new framework. Furthermore, although contested by some (cf. Brown 2021), 4E cognition allows for quick cognitive changes, even within the lifetime of one generation. Hence it makes cognitive transitions much more likely, compared to cognition as assumed in the background of early cognitive archeology, where random genetic variations must cause new brain structures first. So given enough knowledge of the cognitive challenges of a group and their potential to create new cognitive skills, establishing lineage explanations becomes increasingly possible.
Some problems remain. If cognitive demand in the past did not select for brain size and structure only, but for compounds of brains and external scaffolds, then, also within the 4E framework, it becomes more intricate to conjecture about why and how hominin brains evolved the way they did. Beyond that, distinguishing between different candidate explanations for a given cognitive skill remains the most challenging problem (cf. Moore and Brown 2022). Only careful consideration of the archeological record (and hopefully new findings here and there) might make a difference. In the end, given this endeavor's speculative and fragmentary nature, we might have to live with the fact that, eventually, we will have a small class of candidate explanations for a given trait and no further means to decide between them. However, the more lineage explanations of different hominin cognitive skills we try to reconstruct, the more we are forced to strive for a coherent overall picture of all their cognitive abilities. This in itself should select for more advanced accounts of hominin cognitive evolution in the future.

Data availability
No data are associated with this article.

This is an open access peer review report distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Alex Aston
Keble College, University of Oxford, Oxford, England, UK
The author provides an account of how 4E cognition can be synthesised with Niche Construction Theory in order to make minimal capacity inferences and discern cognitive lineages. Overall, I find the author's writing style very readable and the general structure of the argument to be coherent for the most part. However, though I am sympathetic to the author's goals, and believe that the paper has a great deal of potential, the paper is underdeveloped at present. More specifically, I think that the theoretical positions of the author are a bit muddied at times and I find the overall treatment of the various conceptual frameworks and the evidence to be a bit superficial. Thus, I recommend major revisions.
First and foremost, the author appears to subscribe to a functionalist account of 4E cognition based on their terminology, framing of the issues and use of evidence, despite indicating to the contrary with some of their comments. For example, in the first paragraph on page 8 (of the pdf version of the article) the author gives an example of functional extended cognition in the vein of Clark and Chalmers 1998 and then alludes to how artefacts might constitute a cognitive process that would be otherwise impossible. The author then goes on to describe such relationships in terms of a functional reorganisation of cognitive capacities to generate novel behaviours, e.g., numerical cognition. In the midst of this paragraph, the author states, "Although, for our purposes here, we shouldn't descend into metaphysical debates about minds." Unfortunately, the author cannot so glibly put aside the fact they have placed themselves in the midst of a wide-ranging and robust debate about the ontology of the mind. For those of us that understand material culture as constitutive of cognition on a phenomenological level, this framing is wholly inadequate. The author is well within reason to argue for a functionalist 4E position; however, the problem resides in the fact that the author seems to assume that this is the default position in 4E cognition. Thus, the author makes numerous sweeping claims which more radically minded cognitive archaeologists would reject. For example, on page 13 the author states "Cognitive artifacts, if they existed, virtually never 'fossilized'." Any cognitive archaeologist that subscribes to frameworks such as Material Engagement Theory or Radically Enactive Cognitive Archaeology would reject this claim outright. From these perspectives, all artefacts are constitutive elements of distinct cognitive processes. 
Of course, it is completely valid to reject such positions, but the author seems unaware of the full range of perspectives and debates in cognitive archaeology. Correspondingly, despite the author's engagement with 4E frameworks, they still appear to assume that material culture is an epiphenomenon of internal cognitive capacities, a position that most archaeologists employing 4E frameworks would reject. The author needs to engage with the nuances of these debates, and at least explain robustly why they choose an extended functionalism over both standard representationalist and more radical approaches to cognition. I shall include a list of readings at the end of my comments that I believe will help the author greatly improve their arguments.
At present, I shall focus on more specific issues I have with the paper. In general, I think that the first two sections need more effective signposting of the author's position, such as pointing out core axioms and assumptions and gesturing toward the critique they develop in section 3. For example, the discussion of "single individual components of cognition" in the fourth paragraph on page 4 could really use a bit more indication of the author's position and the direction of their argumentation.
As for section 3, I believe that the section needs a bit of restructuring, more thorough explication as well as more holistic integration of the various concepts employed. I personally recommend developing focused subsections dealing with the specific topics raised, such as the 4Es and Niche Construction, in more detail. At present, these concepts are treated fairly superficially; the discussion ignores a good deal of relevant literature and lacks an effective synthesis, particularly in terms of connecting all 4Es into a robust niche constructing framework. I'll return to this point with my recommended readings below. As for section 4 on numerical cognition, I shall largely defer to Dr Overmann's comments on the topic as she is an expert on the topic. However, I think that a more detailed analysis of the materials, action-perception dynamics and developmental implications is warranted. I will also point out the statement in paragraph 1 on page 10: "Hunter-gatherers stroll around; but around 10.000 BP, some human groups in the Eurasian region became evermore locally bounded." This is a rather facile simplification, and I would direct the author to Graeber and Wengrow's recent work for a solid overview of how this narrative is changing: Wengrow, D., & Graeber, D. (2015). Farewell to the 'childhood of man': ritual, seasonality, and the origins of inequality. Journal of the Royal Anthropological Institute, 21(3), 597-619. Graeber, D., & Wengrow, D. (2021). The dawn of everything: A new history of humanity.
A few final points, before I get into further reading recommendations. The author really needs to define their terminology more effectively; the uses of "scaffold" and "entanglement" particularly stand out to me as needing explication. Furthermore, when arguing for minimal capacity inferences, the author needs to effectively address a number of issues, such as potential issues of equifinality and convergence in the mosaics of hominin evolution (issues that are particularly relevant considering the emergence, disappearance and re-emergence of similar artefact assemblages across large spans of time, e.g., Howiesons Poort). Furthermore, I think the author should explain why an artefact should be viewed as an epiphenomenon of underlying capacities (particularly given the ambiguities in the record between artefacts and encephalisation as well as the author's embrace of a niche construction framework). Finally, I think that the author's discussion of cognition as skill has a great deal of merit, yet it is underdeveloped and needs to engage much more thoroughly with the existing literature. A few recommendations are: Garofoli, D. (2017). Holistic mapping: Towards an epistemological foundation for evolutionary cognitive archaeology. Journal of Archaeological Method and Theory, 24(4), 1150-1176.  (1), 65-89. Barona, A. M. (2021). The archaeology of the social brain revisited: rethinking mind and material culture from a material engagement perspective. Adaptive Behavior, 29(2).
I think that at least some engagement with the above readings will then allow for a significant refinement of the arguments put forward in the final two sections of the paper and a far more convincing conclusion.

For section 3, I recommend the following to help flesh out the relationship between 4E approaches and Niche
A few final points: Pg. 3 "Cognition is a complex trait." From my perspective, I see this framing as a category error, akin to saying metabolism is a complex trait. Cognition is a general category that encompasses variable and distinct expressions amongst different organisms.

Pg. 7 "Cognition isn't a complex feature only; it's also always the feature of someone." This is a vague and somewhat confusing sentence that needs more explication.

Are sufficient details of methods and analysis provided to allow replication by others? Partly
If applicable, is the statistical analysis and its interpretation appropriate? Not applicable

Colorado, USA
Overview. The article discusses two approaches in evolutionary cognitive archaeology (ECA). It describes them as taking, respectively, "orthodox" and 4E views of cognition. The 4E view is illustrated with a sustained discussion and analysis of Schmandt-Besserat's view of numeracy in the ancient Near East. The example is misplaced: While Schmandt-Besserat's views are widely known, she applied no theory, cognitive or otherwise, to guide her interpretation of artifacts associated with either numeracy or writing. This outdated work is a curious choice to illustrate the 4E ECA approach, especially when there are recent, well-developed 4E analyses of ancient Near Eastern numeracy and literacy to draw upon. Accordingly, a major revision is recommended. This reviewer is willing to review a revised manuscript that takes the comments below into consideration.

Introduction:
The introduction contains several assumptions and claims that are misleading or inaccurate:
○ P3: "Cognitive archeology is the brave attempt to infer the cognitive abilities of past people from their material remains, thereby exploring how the cognitive evolution of hominins - and eventually humans - has unfolded.": The opening statement presents a narrow view of (evolutionary*) cognitive archaeology (ECA), which is also concerned with the role of materiality in human cognition in 4E models of the human mind. Since the author expounds on 4E models, it is important to define cognitive archaeology accurately at the outset. (*Ideational cognitive archaeology is concerned with inferring ideas from past material culture.) The term "brave" comes across as subtly disparaging.
○ P3: "To explain the evolution of human cognition requires understanding which forces shaped and re-shaped over and again the cognitive underpinnings of hominin lifeways from several million to only a few thousand years ago." (italics added): Not all ECA ignores the past few thousand years (assuming that brains and behaviors are essentially modern and thus unlikely to shed much light on evolutionary questions); nor is this period coextensive with cultural rather than evolutionary change in cognition. Given ECA's concern with the role of materiality in cognition and the understanding that evolutionary change remains ongoing today (if difficult to discern), it is more accurate to say that ECA is concerned with cognitive evolution from the onset of tool use (currently understood as emerging 3.4 million years ago) to the present day and beyond.
○ P3: "So, the same trace in the record, say a specific stone tool, might require capacities like a mental 'blueprint' or sophisticated long-term memory according to classical theories, but only environmentally guided perception and simple learning according to ideas from the 4E family.": Rather than offering generic examples that imply a lack of familiarity with the literature, here the author might engage with ECA publications. For example, Coolidge and Wynn (e.g.,  argue from stone tools to constructs like working memory, expertise, and creativity, while Malafouris ( , 2021 uses stone tools and mark-making to question intentionality and skill, the emergence of meaning, and where the boundaries of the mind should be drawn. Note that Coolidge and Wynn interpret patterns of change in the archaeological record using conventional psychological constructs, while Malafouris is a 4E theorist.
○ P3: "Cognitive archeologists have no choice but to decide. Only by using a theory or model, can they make inferences from artifact to cognitive capacity. And the choice of theory or model comes first.": The author seems to suggest that ECA practitioners must necessarily choose either a standard psychological paradigm or a 4E approach. However, this is misleading. Certainly, within the cognitive sciences, the first two Es (embodied and embedded) have become commonplace, if not mainstream. Granted, the cognitive sciences may still consider the second two Es (extended and enactive) to be radical positions, but even here, there is ongoing research and debate. The point is simply that ECA, like the cognitive sciences, cannot be accurately described as falling into two distinct camps that are easily bifurcated. As a relatively new discipline, ECA is still experimenting with theories and methods, and merging (what the author describes as) the two approaches is ongoing (e.g., Wynn et al., 2021b; Overmann's work in numeracy and early writing systems).

Lineage explanations
For this section, the author might review work in ECA (e.g., Wynn et al., 2021a) for an orientation to its epistemology. That is, rather than explaining inferences in evolutionary biology generally, focus on the epistemology of ECA specifically.

Early cognitive archeology and its inferences
The author's characterization of cognitive archaeology in this section as "early" is not explained. 4E constructs are not replacing the use of classical/orthodox psychological and neuroscientific constructs within ECA. Further, given that Colin Renfrew (a 4E proponent) was an early pioneer of ECA, both approaches are equally old within the discipline.
○ P4: "So, if we see a change in material culture and a corresponding change in brain size or structure, everything seems to match.": One of the largest gaps in our understanding of human cognitive evolution is the lag between fossil indications of cognitive change (e.g., larger skulls indicating larger brains) and archaeological evidence of behavioral change (e.g., greater complexity of tool design). If the lag did not exist, it would be easy to say that bigger brains created better tools (when fossil change precedes archaeological change) or that tool use influenced brain change (when archaeological change precedes fossil change). Unfortunately, the pattern is unclear, as the author later notes. This ambiguity, however, does not argue in favor of one or the other; the relation is more likely to be highly intertwined and mosaic.
○ P4: "Artifacts come into existence through the behaviors of their producers, and these behaviors become possible due to certain cognitive capacities they entertain - eventually because their brains realize these capacities.": Here the author appears to subscribe to the idea that bigger brains make better tools. ECA also examines the reverse direction of influence, the idea that tools make minds (i.e., that tool use has an effect on cognition). See .
○ P5: The author's characterization of Coolidge and Wynn's Enhanced Working Memory (EWM) hypothesis is more caricature and straw man than not, and the hypothesis deserves to be taken more seriously.
Coolidge and Wynn draw on multiple technologies (not just the Blombos beads or Hohlenstein-Stadel figurine) to make their argument, and they also operationalize working memory so that its change can be identified archaeologically.
○ P5: "Frederik Coolidge": Coolidge's first name is misspelled. It is Frederick.
○ P6: "The (imagined) story in early cognitive archeology, therefore, rather often seems to be that as soon as new cognitive abilities arose by genetic variation, hominins explored and exploited new opportunities with them - including the new things they were allowed to do.": Coolidge and Wynn's EWM hypothesis posits an inheritable genetic mechanism (which, as the author correctly notes, remains unidentified) precisely because working memory is a highly heritable genetic trait.
○ P6: "Second, a related problem is that we can track changes in a material culture where it is relatively safe to assume that no significant brain size changes occurred.": This is a problem only if brain size increases are assumed to precede, index, and cause material change; note that ECA does not generally assume this to be the case. Further, ECA does not generally assume that demographic change is the only other factor with explanatory power.

A new cognitive framework
For this section, the author should appeal to the 4E literature to define the 4E terms: embodied ; embedded/situated Robbins & Aydede, 2009;), distributed ; extended ; and enactive (Clark, 1997;; representationalism .
○ P8: "First, some of the artifacts humans use aid for cognition in the strong sense that without these aides, the cognitive process would be impossible. Hence the artifact is part of this ability. (Although, for our purposes here, we shouldn't descend into metaphysical debates about minds.)": However, the debate about whether an artifact is part of a cognitive process is exactly what is at issue in the claim that minds are extended.

○ The discussion of extension is attenuated and unclear; instead of explaining and illustrating extension, the discussion is diverted into Niche Construction Theory, which as presented is not well tied to the concept of extension.

An example: the cognitive life of clay
○ P9: "Showing this new framework in action works best with cognitive history, where we have relatively clear evidence of cognitive artifacts.": The author needs to define what is meant by the terms "cognitive history" and "cognitive artifacts."
○ P10: Rather than accepting Schmandt-Besserat's (1992a, 1992b) account of Mesopotamian numbers at face value, the author should understand that her interpretation of clay tokens was performed in the absence of any guiding psychological or cognitive theory (beyond some tenuous and frankly indefensible appeals to the work of Lévy-Bruhl (e.g., his 1912 Les fonctions mentales dans les societes inferieures; see Schmandt-Besserat, 1982, p. 873)), without any understanding of numeracy and contemporary number systems, and with no thought of a 4E approach. The author should consult the many criticisms of Schmandt-Besserat's work (e.g., Chrisomalis, 2005; Zimansky, 1993). One of those criticisms is that there is little basis to conclude, as Schmandt-Besserat does, that numerical or counting tokens were used as far back as the 9th millennium BCE. Secure evidence of numerical tokens does not emerge until the mid-4th millennium BCE. Rather than explicating Schmandt-Besserat's outdated hypothesis at length, the author should explain the 4E approach as articulated by Malafouris and Overmann.
○ P11: "When these cognitive skills had been internalized for several millennia, the human mind was ready for new strides in abstraction. Concrete counting with tokens was the necessary foundation for the invention of writing." (Schmandt-Besserat 2009, 153): Schmandt-Besserat's hypothesis regarding the invention of writing is as outdated and theoretically ungrounded as her work in numbers. For a 4E explanation of how scripts and literacy emerge, see Overmann ( , 2022.
○ P12: "Manipulation is here of mathematical notations, not concrete objects like tokens (the former being 'marks on paper', whereas tokens rather resemble something like an abacus). However, interestingly, both consist of spatial manipulation of objects in some sense or the other.": The author appears to have missed the fact that Overmann treats written notations (both numerical and non-numerical) as material objects. See Overmann (2016bOvermann ( , 2022.
○ P12: "Interim conclusion. The new cognitive framework works best when cognitive artifacts are preserved and we have rich knowledge of (changing) selective regimes.": As previously noted, the 4E approach is not "new" in ECA. Further, the conditions of the last 10,000 years are quite different from those of the 3.4 million years that precede them, and this should be acknowledged. Specifically, behaviors and brains in this period are effectively modern, so it is easier to generalize insights gained with extant brains to ancient ones (in a way that is not true of generalizing from extant brains to erectines, habilines, or australopithecines). The archaeological record is also much richer, enabling a more detailed look at material change.
Finally, interactions with materiality are likely to engender cultural change in cognition, rather than evolutionary (genetically inheritable) change.
○ P12: "Indeed, the specific case of ANE numerical cognition might be an inference to the best explanation for the new framework, in contrast to its 'orthodox' alternative.": The author provides very little support for or insight into the 4E framework, since the preponderance of explanation is focused on Schmandt-Besserat (who would fall into what the author calls the orthodox alternative, noting again that her interpretation of the archaeological record is ungrounded by any cognitive theories) and not Malafouris and Overmann (who fall within the 4E camp).

Inferences revisited
○ P13: "Cognitive artifacts, if they existed, virtually never 'fossilized'. Language skills, the ability to follow norms, causal reasoning, teaching complex behavior sequences to others, and the like don't leave direct evidence in the archeological record.": The author needs to define what is meant by "cognitive artifact"; in the 4E approach, any artifact has the potential to play a role in cognition. A lot of work on causal reasoning and teaching based on the archaeological record has been done (e.g., Gärdenfors & Högberg, 2017a; Haidle, 2017; Osiurak & Reynaud, 2020a, 2020b).
○ P13: Since there is no discussion of minimum capacity or cognitive transition inferences in Section 4, the author has not established grounds for concluding that they belong to the classical approach, rather than the 4E approach.

Conclusion:
The interim sections (1-5) do not adequately support the conclusions as currently presented. After revising the interim sections, the author should revisit the conclusions.