deperplex

" Chomsky's approach to linguistics has effectively been disproven by AI: 1. general learning is enough for language (no innate language skill necessary), 2. language is fundamentally scruffy (not neat), 3. language is enough to learn language (no grounding)."

-- Joscha Bach, Nov. 6, 2024 on X

Disproven?

In the quote above, Bach focuses on confirmatory evidence -- what AI and humans can do rather than what they can't -- missing both the key argument in favor of Chomsky's computational-generative program and that program's argument against neural network LLMs as a natural language model. Bach's comment may also be exaggerating the success of neural networks, but let's set that one aside.

Focusing on confirmation while ignoring disconfirmatory evidence or counterfactual support is quite common in human reasoning. And Bach's is a common response to Chomsky: if AI can produce natural language better than a computational model, humans must not need an innate hardwired syntax program. Children could learn all of language simply from bottom-up empirical impressions, with no need for a top-down, evolutionarily preset computational grammar.

What Bach and his fellow engineers ignore is the limit problem that drives the whole Chomsky model. An account of human language-learning does not depend on which languages humans or machines can learn, but on which languages, if any, humans cannot learn, and which machines, if any, mechanically, structurally cannot learn those same languages while nonetheless being able to learn human languages. If there are such humanly impossible languages and such machines, then 1) it is highly probable that the human language faculty is isomorphic with those machines and 2) any machine that can parse or produce those languages manifestly cannot be a model of the human faculty, much less explain or even illuminate it.

This is why Chomsky repeats over and again, to the puzzlement and frustration of the AI engineering community, that AI technology tells us nothing about the human language faculty. Nothing.

There's a popular view, apparently held by engineers as well, that technology is a branch of science because technology depends on the discoveries in the sciences. But the goals of the natural sciences are not the goals of technology. Technologies are devised in order to accomplish practical tasks, often market-driven tasks or military ones. The goal of the natural sciences is to understand nature, not to manipulate nature into performing a task. The goal of understanding nature includes explaining what nature doesn't or can't do. That's not the task of technology, and it's one reason why AI alignment is such a challenge: there is no natural selection limiting what an AI learns to human subservience and harmlessness. For the natural sciences, those limits need to be explained.

Now, you ask, and should ask, are there languages that human children can't learn? Under current ethical conditions, we can't experiment to find out. But we can look at syntactic structures in known languages that speakers of those languages cannot parse. I provide examples below. First, I want to respond to a more pressing question: why has the generative model failed where AI technology has succeeded?

On this question, Bach holds an important clue. Language is scruffy. Even if it exhibits a core of recursive computational structures, it also has a lot of vagaries. Worse, the vagaries are not always isolated quirks. They could be recursive and productive, so they could behave and look just like algorithmic recursive productive computations.

In the 1990s, during my doctoral studies, it was pretty clear to me that it was possible and even probable that the data of language included variants that were nonstructural, mere irregular quirks. In the US, we say "go to school" but "go to the hospital" or "to a hospital". Brits say "go to hospital". To conclude from this contrast that there must be a deep syntactic structural difference with reflexes throughout the language seems way too much for a theory to explain. It's more likely a historical accident of semantics that has somehow seeped into one dialect and not the other: it could be that for Americans, schooling is an activity, but not hospitaling, and this is being reflected in the dialect; "hospital", like "congress", can be treated like a name as Brits do, unlike, say, "the capital", a common noun. But if a linguist can't definitively tell which datum belongs to the syntax machine and which is learnt by the general learning faculty, then all the data are effectively in a black box.

The clear evidence that there are structures that English speakers cannot parse (see examples below) convinced me that Chomsky's explanatory model was probably right. But because the data were all jumbled together in a black box, I was also convinced that the program of discovering the specifics of the model was doomed to failure -- despite being the right explanation.

As long as the evidence is behavioral (sentences), the model will be unfalsifiable if the model data are mixed with general learning data. Judgment will have to wait for psycholinguistic or neurological experiment to sift out the innate machine recursions from the general learning ones.

The sciences are not as straightforward as the public may assume. For example, Darwin's theory of evolution is entirely post hoc. It can't predict any creature, it can only give a kind of tautological post hoc just-so story: if the species exists, it has survived (tautologically obvious), and that survival depends on adaptation to the environment (that's the post hoc just-so story), explaining at once the divergence of species and their inheritance from their common origins (that's the revelation explaining all the data of zoology together with elegant simplicity, settling all the questions that the theological theory left unanswered like, why do all the mammals have two eyes, a spine and a butthole, the last being a particularly difficult and comical challenge to the image of the deity). Physicists can't predict the next move of a chess piece let alone the fall of a plastic bag. That's one reason to question pundit economists. A theory, even a great and powerful and pervasive theory, can be right without being predictive. In language, generativism can be right without being able to generate the language.

A technology, however, must work at least well enough so the bridge doesn't fall. If that means it's wise to use girders stronger than the science prescribes, you use them no questions asked. Or rather, the only question to ask would be costs. Note that such costs are utterly irrelevant to the value of a science, where the only "cost" is the simplicity of the theory (more accurately the probability, see the post on entropy and truth). You see the difference. Tech is about the practicalities of an innovation in a realm of action, science about understanding what already exists.

And this is why the AI neural network program (LLM chatbots) has succeeded where the top-down computational program has failed. A computational model can only produce by algorithm and can only parse by algorithm. It cannot by itself capture quirks without adding ad hoc do-dads to handle each quirky case, not a theory but a cat for every mousehole. A behavioral, empirical mimicry machine can learn to mimic vagaries just as well as algorithmic functional outputs. They are all equally outputs from the perspective of a behavioral monkey-see-monkey-do machine. The mimic isn't interested in the source of the behavior or its explanation. There's no causal depth to mimicry and no need for it. Algorithmic behaviors and quirky behaviors are equally just empirical facts to the mime.

This is not to disparage the methods AIs use to "learn" behaviors and generate like behaviors. Neural networks are an amazingly successful technology. And they may even be used in scientific research to understand language. But they are too successful to stand as a model of human language learning or, possibly, human intelligence as well.

So even though, or actually because, neural networks are restricted to a shallow, noncausal, impoverished empiricism, they can accurately reproduce the full range and complexity of language -- its scruffy idioms and recursive vagaries, as well as its algorithmic functional returned values -- whereas the top-down model could at best account for the returned functional values. But a top-down computational model can't do even that because the top-down model relies on the restricted data of returned values, and the linguist can't identify with certainty which data are restricted to the faculty's algorithm and which are ordinary human empirical behavioral learning.

There's an important Hayekian lesson here for social planning and economic policy: policy and planning algorithms are too narrow to predict social behaviors. But learning machines might not be much better since they are trained post hoc.

Chomsky's generativism was a Copernican shift for the sciences in the 20th century. He not only blew behaviorism out of the water, he returned thought and the mind to the sciences and philosophy. Now the zombie corpse of behaviorism in the form of LLMs has been resurrected from its swampy depths to walk again, but this time with a sophisticated information-theoretic learning technology.

None of this top-down computational modeling relies on or even mentions consciousness. It's only a question of whether humans compute or merely mimic with the exceptional range and sophistication of AI mimicry, ascending and descending gradients, and such Markov walks. The answer is that the means AI uses are too powerful to explain human language and maybe human learning in general.

As promised above, here's a handful of sentences showing what English speakers can and can't parse:

a. The drunk at the end of the bar is finally leaving. (English speakers can parse this easily)
b. The drunk on the chair with three legs at the end of the bar wants to talk to you. (again, it's easy to parse that it's the chair that's at the end of the bar, not the legs at the end, and not a drunk with three legs)
c. The drunk on the chair with three legs in the suit with the stripes at the end of the bar wants to talk to you. (again, easy to parse that it's the drunk here in the suit)

d. The drunk at the end in the suit of the bar wants to talk to you. (not just hard but impossible even if you know that it's intended to mean "the drunk in the suit" and "the end of the bar")

and it's not because "of" requires proximity:

e. The drunk on the chair in the T-shirt with three legs wants to talk to you. (no way to get the three-legged chair)

[Scroll down to the bottom for a mechanical diagram of the cross relation problem.]

There are no gradations in difficulty. a-c are automatic, d and e impossible, even though they're simpler than b or c (!) and even if you're told what they're supposed to mean!! That's a reliable sign of machine operation -- a structural failure. And in fact a push-down automaton can produce a-c, just as English speakers can. A push-down automaton mechanically can't parse d or e and neither can you.
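The stack constraint can be made concrete with a toy sketch. It is not a grammar of English and not a full push-down automaton, only an illustration of the stack discipline such a machine imposes. The function name `stack_attach`, the noun-by-noun encoding of sentences a-e, and the attachment lists are all assumptions made for this illustration: each noun records which earlier noun its phrase is meant to modify, and a new phrase may attach only to a noun still open on the stack.

```python
# Toy sketch (hypothetical encoding): each sentence is reduced to its nouns,
# and heads[i] is the index of the noun that noun i's phrase is meant to modify.
# None marks the subject noun, which modifies nothing.

def stack_attach(heads):
    """Return True if every intended attachment can be made using only a stack.

    A phrase may attach only to a noun still on the stack. Attaching to a
    noun lower in the stack closes (pops) everything above it for good.
    """
    stack = []
    for i, head in enumerate(heads):
        if head is None:
            stack = [i]            # the subject noun opens the stack
            continue
        while stack and stack[-1] != head:
            stack.pop()            # close phrases that sit above the target
        if not stack:
            return False           # target was already closed: a crossing relation
        stack.append(i)            # the new noun is now open for modification
    return True

# Nouns in order of appearance, with the intended attachments described above.
sentences = {
    # drunk, end(->drunk), bar(->end)
    "a": [None, 0, 1],
    # drunk, chair(->drunk), legs(->chair), end(->chair), bar(->end)
    "b": [None, 0, 1, 1, 3],
    # drunk, chair(->drunk), legs(->chair), suit(->drunk), stripes(->suit),
    # end(->drunk), bar(->end)
    "c": [None, 0, 1, 0, 3, 0, 5],
    # drunk, end(->drunk), suit(->drunk), bar(->end): "bar" needs the closed "end"
    "d": [None, 0, 0, 1],
    # drunk, chair(->drunk), T-shirt(->drunk), legs(->chair): "legs" needs the closed "chair"
    "e": [None, 0, 0, 1],
}

results = {label: stack_attach(heads) for label, heads in sentences.items()}
for label, ok in results.items():
    print(label, "parses with a stack" if ok else "fails: crossing relation")
```

Under these assumptions a-c go through and d and e fail at the last noun, however short the sentence is. The failure depends on the shape of the attachments, not on their number, which is the point of calling it structural.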

That is the counterfactual argument for generative grammar: the limit tells us what kind of machine the brain uses. A neural network LLM learning machine can produce d and e if it is trained on sentences of this kind. Therefore LLMs tell us nothing about why d and e are impossible for English speakers. LLMs are a successful technology, not a scientific explanation. LLMs are too successful to be informative of natural language learning.

The counterfactual above is not a proof, it's just support. It would be proof if an LLM actually produced a sentence with the structure of d and e. And that it doesn't may mean only that it hasn't encountered it in its training. If a child learned to produce or parse structures like d and e, these sentences would be less interesting. Any of these results would be relevant to our understanding of human learning and LLM learning. But Bach's 1-3 don't address the counterfactual at all. They are mere excitement over a technology that can do anything and everything. That's to miss the scientific goal entirely. In other words, the jury is still out on Chomsky's model, pace Bach.

Even a behavioral test like the Turing test would distinguish an LLM from human intelligence, since the LLM could perform what humans can't. It's the weakness of the Turing test that it tests for how stupid the machine must be, or appear to be, to emulate humans. It's ironic, but not surprising, that it would be used as a test of intelligence. Not surprising, because Turing's test reflects the despairing limitations of behaviorism prior to Chomsky's 1957 Syntactic Structures.

Here are diagrams that make it clearer what the machine structure can't do:

[The red X below indicates the cross relation that neither humans nor "push-down" automata can parse.]

Monday, July 21, 2025

self selection & the unchosen: identity, the spark bird, and the Freudian trap

a Bayesian approach to self-identity

A friend says that all the geniuses seem to have had a seminal moment in childhood that turned them towards their calling.

The allure of this explanation seems obvious. Except for the fans of Gladwell's 10,000 hours, there's something special about geniuses; not like the rest of us. There must be an explanation of their magic specialness. There must have been a cause, an igniting start for this astonishing and extraordinary career. There must have been a special moment.

I think this is all a fallacy, a failure to use Bayesian reasoning, compounded with a myth promoted by Freud that has somehow become an article of faith in folk psychology.

Geniuses, or talented celebrities of any stripe, are the focus of great popular interest. They are often interviewed by journalists who are not geniuses, just ordinary blokes who want to please their audience, or their editor's audience. What does the audience want? To know how the genius became genius. It's a disguised version of "why are you a genius and I'm not?" or more personally, "how come I'm just your average slob, damnit! How come you're so special?!"

And to answer this question, our ordinary-bloke journalist asks, "how did you become a genius?" Not in those words, of course, but more like "what got you interested in what you do?" "How did it all begin?" In other words, our vicarious spokesperson here is specifically asking the genius to come up with a beginning experience. And how does the genius do this? Why, by scanning her past as far back as possible to find that start. Seek and you shall find. There's almost always some event that will fit the bill. End result: any genius will have a beginning story and it will likely be nestled in childhood. It's a case of confirmation bias, or inductive fallacy -- they're basically the same -- and they both mistake evidence for science.

Why is this a failure to apply Bayesian reasoning? Consider all the events of childhood. Countless, aren't there? Most of them forgotten, no? That's just the point! Of all those events, many of them could have been sparks for interest. Now, how many of these produced no enduring interest? How many produced no impression at all? How many sparked a brief but not lasting interest? Well, all of them except the one that the genius held onto. In other words, it's not the event at all, it must be the potential interest already within the genius.
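The Bayesian point can be put as a small worked update. Every number here is invented purely for illustration, and the function name `posterior` is just a label for Bayes' rule. Let H be "the childhood event caused the calling" and E be "the genius, when asked, can name a childhood spark event". Because there are countless childhood events to choose from, E is nearly certain whether or not H is true, so observing E should barely move our belief in H.

```python
# Illustrative Bayes update. All probabilities are made-up assumptions,
# not data from any study.

def posterior(prior_h, p_e_given_h, p_e_given_not_h):
    """Bayes' rule: P(H | E) from the prior and the two likelihoods."""
    numerator = prior_h * p_e_given_h
    evidence = numerator + (1.0 - prior_h) * p_e_given_not_h
    return numerator / evidence

prior_h = 0.5            # assumed: undecided whether the event was the cause
p_e_given_h = 1.0        # if the event was the cause, a story is surely found
p_e_given_not_h = 0.95   # assumed: a story is almost always found anyway

updated = posterior(prior_h, p_e_given_h, p_e_given_not_h)
print(f"P(H) before the interview: {prior_h:.3f}")
print(f"P(H | origin story told):  {updated:.3f}")
```

With these assumed values the belief moves from 0.5 to roughly 0.51. An origin story that would be told either way is almost no evidence that the event was the cause.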

The relationship between the choice of story and the development of enduring interest is no doubt idiosyncratic and complex, including all the chances and complexities of environment, prenatal and postnatal. Or it may have been a driving interest from within, a polygenic inheritance. Whatever the case, there's no reason to believe that the event was the cause.

Among birders -- lovers of birds, bird-watchers, bird devotees, ornithologists -- there's a common experience they describe as their spark bird, the bird that ignited their interest in becoming a birder. The metaphor says a lot, I think. A spark without tinder is a flash in the pan. It's the tinder that makes the flame. No parade of birds, no matter how long, exotic, brightly colored or iridescent, flamboyant, weird, big or tiny, could spark my interest in birds. Is it because we had a bird at home for a little while, so I got used to them? But why didn't that bird initially spark me?

It's because of me. It's not because of the parade of flaunting birds. It seems ludicrous to think that the bird makes the birder. When looking at the events of childhood, it's just as important, maybe even more important, to look at all the events that made no impression. That's the base rate. That's the norm. Surely that's just as informative, if not more. What makes a genius is not the event, but the personality, the personality that ignored all the other big events, or dabbled in some of them and lost interest.

The Freudian myth that childhood events are responsible for our development ignores the normal. I'll be posting soon on this oblivious obvious of the ordinary. It's the normal stuff, the everyday, the 99% of childhood, that makes up 99% of experience. And even that is only background. Oneself is 100% of one. Freud liked to have an explanation for everything, even if he had to fictionalize one. No doubt it made him seem more valuable and smart, and it's the character of the charlatan to know more than everyone else. Charlatanism is the pretense of having extra-normal knowledge. But it's just a pretense.

The same experience can have radically different -- even opposite -- effects depending on the person. Much of psychology ignores the radical diversity of congenital personality. It's one of the deep beauties of Ruth Benedict's understanding of personality that she recognized the congenial origin of diversity. Anthropologists often mistake her view for "culture determines personality", which is consistent with current popular postmodern and also Marxist blank slate views that there's nothing innate that culture can't manipulate.

Benedict was diametrically and essentially opposed to such a view in her very heart. And in her reasoning mind and empirical research. She reasoned: if culture determined personality, there could be no deviance. And she empirically observed: every culture has its deviants. Conclusion: the personality of the culture cannot determine the personality of its members.

On her view, every culture has its own set of norms, a selection of possible human traits, so it has a kind of personality of its own. That's the personality of the culture, not the individuals born into it. Some are born with traits akin to their culture's norms, others born with traits further from them and still others too far from them to conform to the culture. Those last are the deviants. On her view, those who were born with exactly the traits of the culture are perfectly normal, are never criticized, and need never doubt themselves. In short, the perfectly normal are the psychopaths -- having self doubt is what it means to have a conscience, just what they lack. She's on the side of the deviants.

Notice that this account of psychopathy is not the current "they are born without a conscience." No. It's that they happen to have been born into a culture that 100% supports them. If they'd been born in another culture, they'd have self-doubts and a conscience. It's the accidental match between congenial nature (her expression) and the norms of the culture that leads to psychopathy. So, if you have self-doubts, are not satisfied with yourself, troubled by yourself -- that's good! It means you're not a psychopath! It's good that you don't fit in perfectly with your culture. After all, the cultural norms are not necessarily good. They are just the local norms. The Good, whatever that is, is beyond the relative norms of any local system of norms. Benedict is no relativist. The local norms are just a kind of compass to set the direction of individuals in the society, to facilitate their ability to succeed within that culture. Unfortunately, it can leave behind those who are born with traits too far from the norms to fit in. She instructs us that in native cultures, the deviants are often embraced as sacred -- weird, maybe, and certainly not normal, but special, supranormal, and to be respected as such. We, rigid as we are and all too arrogantly self-righteous, should learn from those cultures.

For her, innate diversity is the essence of human personality, which she describes as our distinctive "congenial nature". Far from endorsing cultural norms, she turns them upside down: the perfectly normal are the most dangerous individuals, the Dick Cheneys who can order the mass murder of thousands without hesitation, and still be considered normal and even respected.

It's not what happened to you as a child. It's the luck of being born as you in a culture not yours.

how we know what dogs are not saying and that the universe is not thinking

Category: the sociology of false beliefs

It's sweet that the New Age mystics want to believe that animal communication is as rich as ours, that the underground fungi modulating trees' nutrition form an ecosystem mind, that the earth itself is conscious and the vast universe deep in thought. Who wouldn't want to believe the animal kingdom and all of nature and even the whole of the universe "are all one" with us? Why not blame the artificiality of civilization on the human distinctions that stand as obstacles in our way towards universal connection? The New Agey are sweet people too, and no doubt in the naive past maybe everyone believed such harmonious speculations.

Their speculations are illusions, however appealing, and a little thought would show how wrong they are. And they are not just wrong, they are misleading, misdirecting their understanding of the world to the familiar wish-list of feel-goods and away from what is truly astounding and surprising about humans and how very different we are from the rest of the informational world, not only in our communication and understandings, but in identity and individuation as well.

Starting with language:

I've had this conversation so many times, whenever I explain how extraordinary human language is and how powerful, more powerful than they have ever imagined.

"But animals communicate too!" they object.

"So what are they saying?" I reply.

"We don't speak their language, so we don't know what they're saying."

"Oh, but yes you do."

Yes, we do. More important, we know what they're not saying. They're not discussing what happened yesterday, or what they expected to happen but didn't, or what might happen tomorrow, and what might not.

A dog growls. About yesterday? Unmistakably not. It's an expressive gesture of warning: you're too close, dangerously close. It's accompanied with a show of teeth, which couldn't be seen from a great distance. So you know the growl is about here and now. Is it communication? Of course. The whole purpose is to communicate -- immediately, very immediately, so you don't move any closer.

So why can't a dog growl about yesterday?

Here's the simple difference between expressive communication and symbolic language. We humans have a word for yesterday. It's "yesterday". It's not expressive of yesterday. Yesterday doesn't have a sound or even a look or color or scent. Yesterday is a notion, an idea. The word "yesterday" has no intrinsic relation to that idea, it's just the sound sequence English speakers use to talk about the day past. Dogs don't have this symbolic power. They're confined to expressive responses to the here and now. Whimper: "I'm hurt, be gentle with me." Bark: "Alert! Dog here, ready for action." Howl: "I'm alone over here" or something like that. I'm not completely fluent in dog.

So you do know what they're saying, and, more important, what they're not saying. Unless they're barking in telegraphic code, which is about as likely as that they might be aliens hiding here to spy on humans. Not likely, but no doubt someone out there will believe it. Because, you know, they are surveilling us.

A friend, a reader of poetry, insists that the power of language is its expressivity. What gets me is that the expressive power of language, of limited and little importance, should draw attention and admiration, while the truly and literally unimaginable power of symbolic language is utterly unrecognized.

Suppose you wanted to tell a friend about your dog, but you had no word "dog". You might try making a typical dog sound -- barking, or bow-wow, rufruf or "roog-roog". But how would your friend know that you're talking about your dog rather than talking about your dog's barking too much? Or that you're talking about all dogs barking too loud? So maybe you try mimicking your dog. This charade might actually work, as long as what you want to communicate is not what your dog did yesterday. Good luck on that charade! No wonder dogs don't express the past. Without a symbolic language, it would be a huge waste of time and effort for a useless communication about what isn't even present.

But that's exactly what human language facilitates. We not only talk about what happened yesterday, we can talk about what didn't happen yesterday (!), and what is unlikely to happen tomorrow. In fact, most of what we say is about what isn't present here and now. And that's for good reason -- we all can see what's here and now, why talk about it? "This is the elevator. This is the elevator button. I'm pressing the elevator button." No. In the elevator we talk about the weather. Outside, not the weather in the elevator.

The superpower of language is its ability to represent not the real, but the irreal -- what is not here, what doesn't exist, what couldn't exist, what might be, what was and what never was. The work I didn't finish. The book I have at home. Our mutual friend vacationing somewhere for the week. That's the superpower of language.

And what makes this possible? By now you recognize that it can't be its expressivity. So what's the secret? The secret is this: the distinctive sounds that we produce have no relation to what they mean. We don't use bow-wow to indicate a dog. We use an arbitrary sound sequence that doesn't sound like a dog, doesn't look like a dog and doesn't smell like a dog. And we have a distinct sound sequence for the dog's bark, the dog's coat, the dog's scent, the dog's size, the dog's color and a word for 'my'. It's this non-relation to the meaning that allows all sorts of nuanced meanings that couldn't be accomplished with expressive sounds. The foundation of the superpower is its arbitrary relation to the meaning -- that "dog" doesn't sound like a dog. This arbitrariness alone is what allows an abstract notion like yesterday to be attached to a sound sequence. Maybe the most impressive words in English are "no" and "not" and "nowhere" and "never". There's a familiar world of what's here, but these words allow us to designate the complementary set of everything else, all that isn't. That should blow your mind. Unless you're a dog, in which case it's just a command.

In the post on information faster than light, it's this ability for language -- symbolic communication -- that allows us to designate a shadow as a thing, when reductionist physics cannot treat it as a thing at all, but an absence that cannot move. It's our symbolism that allows us to talk about a shadow moving, and that's the only way a shadow can move faster than light. It's this level of complexity using a physical object -- for human language, it's a sequence of sounds -- arbitrarily to mean some idea that is not physical at all, allowing information to move faster than the universe's speed limit. That's beyond amazing. It's on the level of the miraculous -- the realm beyond the natural to the supernatural. Of course we've all known that ideas and thoughts are supernatural but it's symbolic language that bridges from the supernatural to the natural. Neither ideas nor thoughts can move at all, anywhere. Thoughts and ideas don't have a where. But symbols do and the information contained in them. It's this weird divorce from reality represented in the symbol, the symbol that crucially has no intrinsic relation to the thing it represents. It is all too easy to fail to see its power. Human civilization conducted mathematics for thousands of years before the Indians introduced the zero as a number.

We've been using language for many thousands of years, taking it for granted, without any clear understanding of how it works, even when using words as magical incantations, at once recognizing its supernatural force and misunderstanding entirely how that force works. The incantation, focusing on the sound sequence, which is arbitrary, ignores its relation to the idea, which is the supernatural part. Primitive magic has it all backwards.

You can see this in language today. We are inclined to think that athletics is a bit more serious than sports, even though they denote the same activities. Students in my classes thought "god" was a more formal and distinguished word than, say, "prostitute" and I had to point out that not only is "prostitute" a formal word for a ho, but "god" is a thoroughly common word, even a profane one as in "goddamnit", "God, what an idiot", "OMG, OMG!", "godawful", and the like, whereas "prostitute" never plays those roles. The social value of a word resides in the sound sequence, not in the idea. "God", whether you believe in it or not, is the loftiest possible idea by definition, though skeptics will quibble over "loftiest" as its equal, also by definition. In any case, it denotes the loftiest being. The sound sequence? It's just a sound sequence. In Romance it's one dialectal form or another of "deus", Arabic, "Allah", and in Russian the sequence is "Bog".

So: there are distinctions to be made. Expressive communication, rampant throughout the natural realm including animals and plants, is not at all like human symbolic language. Whatever the animals are thinking, and whatever the behaviors and adjustments and responses of stars and galaxies, they are not representing themselves symbolically. They are without the symbol "I". That alone should disabuse the New Age mystic of their assumptions that we are all one. It's only a mind filled with symbols that can represent "I" or "we" or "they" or "we are all one" for that matter, and more important symbolically, "We are not all one." Whatever "oneness" we might have with the rest of the universe, it's not articulable beyond the kind of symbolism we use with "shadow". It's a symbolic fiction, a nothing, a vacuum that we can fill with any emotions or predilections we choose, whether it be the joy of oneness or the emptiness of the great one pointless universe. It's all just metaphors of symbolic language.

And what about the earth and the universe? Can they think? Certainly inanimate objects respond to their environment and other objects, so they are all adjusting to each other. But again they are not symbolizing their responses. They're just behaving without cognizing their behaviors. There's a kind of hierarchy of information throughout the phenomenal world, and inanimate phenomena can behave in emergent ways beyond the reductionist laws of physics. Temperature, for example, arising from the random motion of molecules, is a classic case of an emergent property that can't be reduced to the behavior of any particular molecule; so is the direction of time's arrow, an emergent property of probability, not physical law. But even if inanimate objects might be regulating their internal structure to adjust to stimuli from outside, they do not symbolize or treat the objects around them as meaning tools to discuss the future of what will not happen or dream of nonexistent fictions or bemoan the absence of what was.

Ironically, mistaking behavior for thinking is now widely accepted among those who insist that AI is intelligent. Their justification is the Turing Test. But that test is itself a behavioral test, an expression of despair that we have no better means of understanding intelligence.

So what is intelligence? AI can refer to itself; it can have intentions; it can even lie (often in trying to give you the answer you want rather than an accurate answer).

Here's one answer to that question: symbolic representation and computability over those symbols. LLMs' learning is not computational. At least, not yet. So far it's still mimicry.

This weakness of neural networks was already well understood and predicted back in the 1990s. In my Ph.D. program in linguistics at the time it was The Big Debate. Fodor and Pinker (Pinker's trade book Words and Rules was specifically about this problem) argued that neural networks would not succeed in generating all and only the possible sentences of a language -- analogous to solving a math problem algorithmically -- but would merely approximate that set of possible sentences through mimicry.
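Here's a toy sketch of that contrast. Everything in it is my own illustrative assumption, not anything from Fodor or Pinker: the made-up "language" is n a's followed by n b's, the rule is a hypothetical `in_language` function, and the mimic is a crude table of observed symbol pairs, which is a far simpler learner than a neural network. The only point is what "all and only" means: the rule accepts unseen strings that fit and rejects everything else, while the mimic accepts anything that locally resembles what it has seen.

```python
def in_language(s: str) -> bool:
    """Rule: exactly n 'a's followed by exactly n 'b's, for some n >= 1."""
    n = len(s) // 2
    return n >= 1 and s == "a" * n + "b" * n


def train_bigrams(samples):
    """Mimicry: remember every adjacent symbol pair seen in the samples.

    '^' and '$' are hypothetical start and end markers.
    """
    seen = set()
    for s in samples:
        padded = "^" + s + "$"
        seen.update(zip(padded, padded[1:]))
    return seen


def bigram_accepts(seen, s: str) -> bool:
    """Accept a string if every adjacent pair in it was seen in training."""
    padded = "^" + s + "$"
    return all(pair in seen for pair in zip(padded, padded[1:]))


samples = ["ab", "aabb", "aaabbb"]  # toy training data
seen = train_bigrams(samples)

for candidate in ["aaaabbbb", "aab", "ba"]:
    print(candidate, in_language(candidate), bigram_accepts(seen, candidate))
```

Both the rule and the mimic accept the unseen "aaaabbbb", and both reject "ba". But the mimic also accepts "aab", which the rule excludes: it approximates the set without capturing the counting that defines it.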

Ironically, neural networks turned out to be more successful than generativist linguistics. A language is too compromised by structural noise, internal and external, for any single generative syntax to predict it completely, and yet humans nevertheless learn that noise, beyond the grammar. So mimicry can succeed in producing what a generative algorithm can’t, since humans use both mimicry and computational generativity.

I mention this because the language facts show something else essential: consciousness is irrelevant to the ability to generate language, since native speakers mostly aren’t conscious of the grammar by which they produce sentences. (This fact would not be available in math, as mathematicians work with the functional syntax overtly.) And since there’s lots of persuasive evidence from human neurology (Christof Koch’s work, for example) showing that, bizarre and illogical as it may seem, consciousness is post-decision, the moment of recognition or understanding is likely a mere epiphenomenon and not necessary. There must be some other means by which humans functionally distinguish the infinite application of the algorithm from the mere inductive likelihoods of empirical mimicry. It’s a debate as old as Plato -- an idea is a generative algorithm, a formulaic function ranging over not just the actual but the possible -- and a rebuke to Wittgenstein’s behaviorist games and family resemblances.

Why do the New Agey prefer to speculate about oneness and unity with nature -- the Franciscan nature, not the Darwinian "nature red in tooth and claw" of Tennyson? No doubt it makes them feel happy, positive, socially agreeable, and they may believe that it contributes to their health both mental and physical. More power to them! I hope it works for them. I'd like to believe that an appreciation of the power of symbolic language would dispel the "we are all one" ideology. But what if the New Age mystic fully appreciates the power of language, but views that power as an obstruction to mystical truth -- a cultural veil of categories distracting from oneness -- and yet also believes that animal and plant and cosmic communication is just like ours? To hold both is to hold a contradiction. Would resolving the contradiction make them better off as people? Maybe not. They might be better off living with contradictory views.

To sum up: dogs and planets are not talking about the irreal, which is mostly what we humans talk about. That may be all there is to the difference in intelligence. We might all -- I mean everything in the universe -- have some kind of consciousness, but that's not intelligence. And the evidence seems to indicate strongly that consciousness is not intelligent thinking, it's a post hoc response to decisions already made internally. Intelligence is the manipulation or computation of symbols, and symbols are themselves algorithmic. What we learn is a mess of mimicry, algorithmic syntax and abstractions tied to symbols. That symbol relation -- an object representing a property or set of properties -- is having thoughts, ideas or meanings, and that is thinking.

corporate elites are followers, not drivers

A student in my class explains that the market is full of sugary sweets because the elites have purposely lured us. They -- the elites have this name "they" -- give us sweets as children to hook us on sugar so that we want more as adults. Then we get fat and diabetic and become the stooges for Big Pharma. It's a conspiracy.

Another student insists that the reason women are sold jeans with fake or shallow pockets is because "they" -- fashion industry? -- want to compel women into buying expensive handbags. Corporate conspiracy.

A third student introduces all his comments with "I think the elites are trying to get us to...".

These analyses of the market seem reasonable. The pieces of the puzzles all fit. There's an inevitable logic to it. And these explanations are insightful, a credit to whoever came up with them. To accept such views is not only to be logical but to be insightful beyond the surface of the world, seeing into the depths of how everything actually works, and not just insightful but sort of heroic, since the way the world works is nefarious with greedy agents undermining the innocent, ordinary citizens who deserve better.

All very logical, insightful and heroic until you recognize the all too obvious. Why don't fashion designers market fake pockets to men? Why not induce soy bean addiction by feeding children tofu?

Because in western culture, men won't buy pants without practical pockets, plain and simple. (If you're interested in why pockets are so essential to western masculinity, take a look at this.) Why doesn't Big Agri addict the public to soy beans? Because the public doesn't like them enough. Because they're not sweet. Because humans like sweets and get addicted to them. Go look at hunter-gatherers like the Hadza and you'll see they smoke out beehives to steal their honey. Nobody's making money off that. They don't have a cabal of elites.

More of the obvious: if elites were running the show, why aren't the railroad magnates of the Gilded Age still calling the shots? How did the auto industry replace the railroads? Why aren't the auto CEOs all descendants of the rail magnates? And why are the largest corporations in the world not railroads, not cars, but info-tech?

Because the elites don't run the market. It's the consumer at the bottom, not the wealthy at the top. Schumpeter's creative destruction is driven by innovation and consumerism. The elites have a temporary effect on the market. If you want to look into the future, you'd need to know the innovations of the future, and you can't know those or they'd already exist (hat tip to David Deutsch). At best you can guess what people will want. That's not always so easy, since trends are like the law of unintended consequences. They are not only unpredictable, they are absurd. In my neighborhood, waiting an hour on line to get a novel bagel is now a sign of prestige. The trendy consumer actually brags about it. Some years ago Richard Ocejo wrote a dissertation on the trend in my neighborhood for recreations of authenticity (an absurd contradiction in itself) amidst gentrification -- not looking for the newest, most convenient and most efficient, but things like spending money and time on a visit to a barber for a shave: old-school, except the barber is a hipster. Only the most imaginative sci-fi authors could come up with these aberrations. And since social signals are most distinctive the further they are from utility, and deviations from utility are infinitely more various than practical conveniences, which are restricted to utilitarian efficiencies and non-signaling needs, only an infinity of sci-fi authors could predict, at best, the range of possibilities.

Any entrepreneur knows that in the marketplace, the consumer leads. The elites -- whoever they are -- follow. There are plenty of corporate shenanigans to manipulate the consumer, but that's more back-seat driving. The consumer, on the other hand, is a kind of monopoly. If bros want to wear nothing but T-shirts and jeans with at least five pockets, it's vastly easier to supply them with those than to cook up ploys to get them to like something else that might fail anyway, with all that investment lost. Marketing is all about figuring out what people want, what people will fall for. It's all about the psychology of what appeals to consumers. It's a study, not a practice of hypnotism or a bootcamp for militaristic coercion.

Conspiracy theorists assume that corporate elites are all working in concert, a monolith manipulating us, controlling us, deceiving us. Quite aside from the mysteriousness of who these elites are -- we'll get to that -- there's this suspicion that some cabal is in control and we're not.

Far from the truth. The nature of the market is that the capitalist follows the market and has only limited ability to manipulate. A cartel can choose to produce electric bulbs with planned obsolescence, but only if you want electric lights in the first place. And you want electric lights not because GE hypnotized you into desiring them. GE makes the bulbs because they are so incredibly useful to the consumer. Who is driving that market? It's the consumer -- you -- not them.

A friend and fellow blogger complains that AI will make billions for the billionaires, but won't make ordinary people's lives better. That's all backwards. No benefit to the consumer, no billions to the rich. That's what a market is all about, and that's why entrepreneurs get into the market -- the inexhaustible wants of the consumer. There will also be military contracts and plenty of other government projects, but the big money is in the vast distributed consumer market. Even many of those government projects are developed to placate constituencies.

The danger of the market is not elite manipulation. Quite the opposite. The consumer wants instant gratification, convenience, comfort, and too much of those are maladaptive to our evolution. Like sugar? Have as much as you want. Is it good? Oh so very good. Is it good for you? ...? With our tech, there's no more need to walk or orient oneself, no need to write or think about one's writing. Already my CUNY students tell me that an anecdote using the cardinal directions is meaningless to them. What other cognitive abilities will erode? We've lost so many skills since the agricultural revolution and even more since the industrial.

On the other hand, Gen Z have much easier access to information than any generation prior. But doesn't that just mean more fuel for their confirmation bias and polarization? That will be a tech-generated social dysfunction.

In the remote past, humans struggled to survive and did it together; now we struggle against our own personal excesses, desires, motivations and now, motivated cognitions. No doubt tech will someday resolve those as well. Someday, not today.

And so it is with all markets. It's the consumer that leads, the producer the sycophantic follower. The irony, of course, is that the scale of the deal means that while the consumer gets what it wants in real wealth (electric lights or a computer-cum-phone-cum-recorder-cum-camera-cum-screen-cum-map-cum-factotum in your pocket), the producer gets financial wealth, often obscene financial wealth. And why does the producer get so rich? Because the consumer likes it. And once the consumer has become dependent on it -- what the economists call sticky demand -- then the corporates can manipulate more effectively with, for example, planned obsolescence.

Consumer control is most evident in news media. Chomsky's Manufacturing Consent ignores the complexity of the information market. The NYTimes reader prefers to read the NYT rather than watch Fox because the reader -- the information consumer -- has already chosen what to expect. The NYTimes can at best give the consumer what it wants. To change the consumer would risk losing that market share of the audience. To lose the audience would be to lose the advertisers, who pay for access to that audience. It's not a conspiracy. It's a market. It's not the elites running the NYTimes, it's the NYTimes's consumer running the NYTimes.

(An acquaintance complains that the newspapers report only what their corporate donors want reported. Conspiracy thinking often begins with a lack of knowledge and a theory of distrust filling the vacuum of knowledge. Newspapers don't have corporate donors. They have advertisers looking for a specific audience. Each news venue targets a specific market share of the public. Catering to it includes spinning the news to conform to the audience's biases, and if there's competition for that market share, as there is between the WaPo and the NYT, they each have to cover any news item that might be interesting to the audience regardless of their bias, otherwise their competition might cover it and lure their audience away. In response to his complaint, I google-searched "union busting at Amazon and Tesla". Both WaPo and NYT reported with multiple articles; however, the Times reported on Amazon union busting more frequently than the WaPo, which is owned by Bezos, exec. chair of Amazon. Why does the WaPo report on it at all? Obviously, if it didn't it would lose market share, and embarrass itself as well. Why did the NYT report on it so frequently? Probably to embarrass WaPo and get an edge. It's a comedy of competition, not a corruption scheme paid for by mysterious donors. I venture to guess that my acquaintance never actually read the NYT or WaPo.)

The information market has the important consequence that propaganda does not change people's views. It confirms or validates the views they already have or already want to have. Propaganda can rally the faithful or direct their actions, but not convert. It can convince but not persuade, that is, it can give the readers more confidence in their bias. And a further consequence: censorship -- successful suppression of information -- may be more dangerous than propaganda. Propaganda can polarize dangerously, but censorship can leave the public entirely deceived, on the one hand, and distrustful of the censor, on the other. Censorship is the germ of conspiracy theory suspicion.

What the market shows is not an elite leading and manipulating us. It shows a systematic market failure: the market gives you what you want, you become dependent, and too much of what you want is not good for you. It's not a conspiracy. It's a great big stupid dog chasing its own tail into exhaustion. The invisible hand is not a dark, clever, deceptive secret ruse; it's a vacant-minded onanistic dysfunction. The invisible hand is secret -- a secret, addicted wanker.

But without it, you'd be building mud huts and living on the cusp of starvation.

Wednesday, July 2, 2025

the easy path to enlightenment & nirvana

Scott Aaronson wonders whether attaining enlightenment is worth it given the sacrifice of time. Are there better uses of one's once-in-a-lifetime time? I wholeheartedly agree with his decision to go with all the other pursuits and accomplishments of knowledge or investigation. After all, enlightenment is a selfish goal: it benefits only oneself, whereas understanding the world in its many systems and puzzles not only benefits others, it augments the mind beyond oneself. Nirvana isn't all it's cracked up to be.

But I disagree with Aaronson that there's a sacrifice. When I was 13 or 14 or so, I wondered about this enlightenment question after looking into the nirvana goal. The result of enlightenment was characterized, in several testimonies of those who claimed to have achieved it, as leaving experience just as it was except with no more striving to attain enlightenment. Being of a logical cast, I concluded that if I stop trying to achieve nirvana, I'd be in nirvana. And so it was!

From time to time -- when I'm frustrated with my own struggles -- I have to remind myself that I'm enlightened. Then I have a good laugh and get back to figuring out how to manage whatever struggle is before me. Nirvana, you know, doesn't write blog posts, and posts don't write themselves. :-) (Being in the state of nirvana is kind of useless, to be honest. At least for my purposes.)

I gotta say, though, nirvana is seriously lacking in personality. If you value nothing, grieve over nothing, worry over nothing, never doubt or ever want, what are you and who are you and why are you alive? I say, grieve and worry and doubt yourself and rage and be stupid. That's personality! Better yet, take a leaf from the Bhagavad Gita -- find your role in society and play that role to the hilt. That's freedom from your selfish concerns (including nirvana) and an opportunity to play the role with style, rendering all the depths and struggles and doubts as mere surface style. A Nietzschean Gita!

Maybe if I'd thought about enlightenment a few years earlier at an even more tender age, I could have been the Dalai Lama, but that would not be my preference anyway, so no regrets. Politics is not really my thing. I find politics tawdry. Cognitive science, linguistics, behavioral psych, evolutionary psych, information, semiology, complexity, mind -- all the topics listed in the blog title -- that's what I'm spending my nirvana on. It's a different notion of "enlightenment", the Western notion, synonymous with science (from the Latin "scire", to know), or in the context of this blog, to understand or explain.

two modes of information acquisition

Gamers and explorers, two modes of information acquisition:

Explorers want to understand. Gamers wanna win.

Hugo Mercier and Dan Sperber in The Enigma of Reason attempt to explain why a species reliant on its intelligence for its survival would be so given to self-delusions. Humans are distinctively reliant on cognition and reason, reliant more on figuring out how their environment works than on merely instinctively reacting to it. We uniquely explore and exploit our understanding of our environment, so much so that Pinker called the environment we're adapted to "the cognitive niche". Where other species exploit a jungle or savannah or shoreline or arboreal or underground niche, the human species exploits our own cognition, our understanding and theorizing about our environment, so that that cognition is our primary evolutionary niche given by natural selection. If cognition, rather than instinct, is our survival superpower, why is it that this human cognition-reliant species can be so given to cognitive bias -- intentionally false cognition -- and especially to confirmation bias (more accurately, as Keith Stanovich admonishes us, myside bias)? How can we be so reliant on reason over instinct, yet be so irrational? If our cognition is our environmental niche, and we are so given to the falsehoods of cognitive biases, how do we even survive? It's a big important question, not just for our species' survival, which I guess is important, but for the theory of natural selection, which is bigger than our species. So this cognitive information problem is a seriously big deal.

Their clever answer: we acquire information through a highly efficient and motivated means of advocacy. We progress by taking sides and defending them. Two heads are better than one, and competition in a zero-sum game is the very fierce essence of evolutionary progress. Human interactive nature "red in tooth and claw", intellectually. It's a brilliant answer!

With one giant flaw: zero-sum advocacy leads to increasing conflict between half-truths, not a synthesis of progress. Polarization, implied and maybe even required for the model, doesn't resolve into higher truths.

Fortunately, biased advocacy is not the only means of information acquisition for us. Curiosity, even obsessive curiosity, is not uncommon. The strange drive to understand an aspect of the environment is obvious in the history of the sciences, from Einstein's ten years probing one idea, to Newton poking his eye, to Archimedes pondering in the bath. Why so curious? From a natural selection perspective -- yes, evolutionary psychology again -- the more we understand about the world, the more we can predict potential dangers. The will to power is overvalued: the will to theorize is the pervasive drive of a cognitive species thrown in temporal space. Theorizing attempts to overpower time: the whole purpose of theory is to predict. And it gives us license to act. Fortune favors the wise; natural selection favors the predictive mind.

This presumably evolutionary drive has two flaws: it's not as immediately exigent, fierce and threatened as advocacy, so it's more complacent, and it doesn't contain within itself any check on its own bias. Exploring is fine fun and useful as far as it goes. Debating with an adversary will expose your narrowness pretty quick.

The difference shows the superpower of the advocacy model. It's a gamer's mentality, relentless and all-consuming engagement, a competitive sport in which discipline is enforced by the opposition team and the immediacy of losing. Constant engagement in battle -- not at all like a marathon run where you can zone out all alone.

Advocacy is also insufferable. If you have the mindset of an explorer, arguing with a gamer is almost useless. You're looking for deeper understanding from discussion, but much of what you get from a gaming advocate is superficial garbage, often not even accurate. Your adversary might be brilliant, incisive, fast and devastatingly critical, but informative? Your goal is to understand better, but theirs is not. Theirs is to win. What drives the explorer is an engagement with information for its own sake in a larger context of understanding the world beyond the explorer. The gamers are serving themselves within the narrow space of the competition. Like any competitive market, advocacy is a short-sighted goal.

The difference in motives plays out in social institutions as well. The market's creative destruction promotes innovation and progress, but also short-sighted excesses leading to collapses. By contrast, academia purports to have a longer vision. But it can fall prey to conformity more easily than a competitive market. The sciences try to strike a balance of adversarial criticism with a long view, but dissent is not natural to institutions, and the sciences are housed and nurtured in them.

In education, both models are possible. A student can accept information uncritically as an explorer, or question authority as an adversarial advocate. Not every young person accepts the religion of her parents.

That most believers choose the religion of their parents, however, presents a puzzle at the heart of the Mercier and Sperber advocacy model. Why and when and how do we choose our team to defend? If it's a social story, conformity should rule, and advocacy would be superfluous with no role to play except where the social authorities haven't decided. Then the question is, how do the advocates choose between sides? The sociology of false beliefs post suggests that wherever there are hierarchies in a society, there will be differences of interests and emotional investments. Those will be sources of polarization.

There are other, more specialized, modes of information acquisition including the Charlatan's Check (upcoming post), and there are modes that don't work, like propaganda (propaganda vs censorship: asymmetrical effects in escalation bias and polarization). The plain and obvious fact that Fox News doesn't persuade the NYTimes readers and the NYTimes has no power to persuade the Fox watcher demonstrates that propaganda has no persuasive power on the public. The public has already chosen its Kool-Aid by the time it opens the pages of the Times or turns on the TV to Fox. That choice can't come from the propaganda venue. It could be from peers, from social location, from a social aspiration or a trusted friend, but not from the propaganda.

Finally, there are also two internal strategies with deeply asymmetrical contrasting effects: confidence (self-assertion) and doubt (especially self-doubt). One is self-reflexive and the other is not. The difference is neither trivial nor merely qualitative. It's dimensional. Confidence is, in itself, one-dimensional and unidirectional. Doubt, on the other hand, questions its own direction and must consider many directions in the thought space, including its own doubt. This does not imply that doubters see all directions, of course, but they will likely see more than the confident will. Doubt is the explorer's navigator, confidence the gamer's.

Tuesday, September 16, 2025

best posts

complexity and AI: why LLMs succeeded where generative linguistics failed