Friday, November 21, 2025

sunk cost: loss aversion or Bayesian failure?

Loss aversion is an emotion shaped by natural selection and tied to survival. Loss has a finite bottom boundary -- no bananas means starvation and death -- whereas acquisition is unbounded, a superfluity. No one needs forty billion bananas, and using them takes some effort and imagination, like maybe using them for bribes towards world domination. The normal person wants to assure herself first that there's something to eat tonight. World domination later, if we're still interested after dinner.

So the sunk cost fallacy is an emotional attachment to what's been spent. But it is also a failure of Bayesian analysis of time. So you stay in the movie theater not only because you don't want to throw away the ticket you spent money on, but also because that emotional attachment -- loss aversion -- has blinded you to the time outside the theater. The ticket has focused you on the loss rate of leaving: 100% of the next hour will be lost. But that's forgetting all the value outside the theater. 

This Bayesian interpretation predicts that people whose time is extremely valuable -- people with many jobs, or jobs that have high returns whether in financial wealth or real wealth (personal rewards) -- are less likely to stay in the theater. Their focus will be trained on the time outside the theater. The losses will be adjusted for the broader context of the normal. We should expect that the very busy or very productive will be resistant to the fallacy. 

Of course, there are also the rich, who don't worry about throwing a ticket away because the marginal value of the money is low or worthless. But overall, the sunk cost fallacy should occur only with people who have time to waste, whose time is not pressingly valuable. The sunk cost fallacy may be an arithmetic fallacy of focus, not just an evolutionary psychology of risk-aversion. 
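Here's a back-of-the-envelope sketch of that arithmetic of focus (the numbers are mine, purely illustrative): once the ticket is sunk, it is the same under either choice and drops out; all that's left to compare is the next hour inside against the next hour outside.

```python
# A toy comparison (illustrative numbers, not data): the ticket price is sunk
# either way, so it never enters the comparison between staying and leaving.

ticket_price          = 15.0   # already spent -- identical under both choices
value_of_hour_inside  = 2.0    # assumed enjoyment of finishing a bad movie
value_of_hour_outside = 10.0   # assumed value of the best alternative use of the hour

stay  = value_of_hour_inside   # sunk cost excluded
leave = value_of_hour_outside  # sunk cost excluded

print("stay" if stay >= leave else "leave")   # "leave" -- the ticket never enters the math
```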

Freud and Haidt got it backwards: the unconscious is rational; the conscious mind is not

A friend insists that I'm disciplined, since he sees that I take time every day to work out on the gymnastics bars in the local park. I object: I work out because I enjoy getting out of the house. He concedes that we do what we enjoy without discipline. 

We both have it all wrong. I'm well aware that the opportunity cost of staying at home is, on most days, far greater than the opportunity cost of going to the park to socialize while practicing acrobatics. I know that I need to socialize every day and maintain my strength and agility. But that rational cost-benefit equilibrium never motivates me. I'm comfortable at home, I don't feel energetic enough to brave the cold -- there's any number of reasons to stay home. If I debate with myself over whether to go out, I will stay. The immediacy of laziness -- the comfort of now -- overcomes any rational equilibrium. So how do I get to the park? It's not discipline. I don't even understand what "discipline" means. Is there an emotion of discipline? Is it suppressing one's thinking -- including one's deliberating and second guessing and procrastinating and distracting oneself -- and just doing it? 

As mentioned in this post, the unconscious mind's rational intentions will make decisions without consulting the conscious mind, as long as the conscious mind is distracted. If I focus my conscious attention on going out, immediately I'm feeling comfortable and lazy and calling up all the reasons to stay home. Think about anything else unrelated, and soon enough it seems time to grab the wool sweater and go. Sometimes I'll watch myself grab the wool without knowing when I made the decision to grab it. It just happens.

It's the unconscious mind that knows what's best for my long-term goals. It's my conscious mind that's swayed by the emotions of now. Haidt treats emotions as the unconscious mind, sort of following Kahneman. This is a mistake inherited from Freud, in turn inherited from Plato's Phaedrus and popularized by the 19th-century romantics -- Schopenhauer, Wagner and Nietzsche, that whole crowd convinced that the uncontrolled emotions, dark, mysterious and dangerous, disturb and cloud the serenity and clarity of the reasoning awareness. 

This is all mythology, and religious mythology, self-punishing and confused. 

Awareness and emotions cohabit the now. This is obvious, a truism. "I feel, therefore I am" is equally definitive and necessary. There's little to distinguish between think, perceive, and feel. The awareness of the momentary environment feeds the emotions. The comfort of my chair is at once an awareness and an emotion. Any reasoned attempt to dissuade me from the comfort of my chair in the now for the sake of a merely imagined future will engage struggle, and the strength of perception will likely win. Every failed dieter, every procrastinator, every substance abuser, every phone zombie, every wanker knows this all too well. 

I once got off all sugar, not by struggling against my desire, but by distracting myself with the one thought that I knew would always distract me from that very desire: distracting myself with the desire itself. I spent my idle time thinking about all the sweets I most like, listing them, ordering them and categorizing them -- cookies, cakes, chocolates, ice creams -- and thought hard about which in each category I most wanted (childhood comfort favorites mostly beat fancy treats). You know, the opposite of "Don't think about a zebra!" That's a recipe for sure failure. But, "Okay, there's no way out of thinking about the zebra. Let us now then examine this zebra that is inhabiting our mental space," and pretty soon, sliding down with no struggle at all, you're too deep in...and you're enjoying it. 

It's the unconscious drives that are independent of the emotions and awarenesses of now. It's the unconscious mind that is the real decision maker. This detached, rational, disciplined, far-sighted unconscious mind is free from the emotions. It's the rational nagging mind of what I know I should do, and that I would do but for the interventions of my conscious, biased, instant-gratification emotions of the aware-now. 

The emotions are always immediate -- they are feelings and have to be felt in the now. The unconscious mind isn't in the now at all. It's a hidden subterfugeal world of long-term rational sabotages against my conscious will. Freud misplaced the conscience. It's not the superego, it's the subterego, the intuitive fast system that's thinking far ahead, working to keep me well against my will and motivated reasoning.

the spirituality paradox

Spirituality often cloaks itself in moral guise as shedding selfishness in favor of embracing the Other, whether it be other sentient beings or the world of inanimate phenomena: amor vincit omnia.

The goal of such spirituality is to transcend the self, but the purpose is to improve the self. So, for example, a spiritual cult or movement targets the individual member for the spiritual elevation of that individual. It's not a movement to save the cows and chickens, or preserve pristine nature. It's a movement to bring the individual's self to a higher spiritual state. Saving the chickens is a by-product. In other words, it's a selfish purpose with a selfless goal. 

From my biased perspective it's not merely contradictory and self-defeating (I mean the doctrine is defeating the doctrine -- a doctrine at cross-purposes to itself), but also self-serving, decadent and essentially degenerate. Yes, you have only one life to live, so there's plenty of incentive to perfect that life for itself -- I'm down with that, for sure -- but there are billions of others and possibly an infinity of other interests to pursue than this one self. Arithmetically, the others should win, were it not for the infinite value of one's own life. 

But here's the difference: attending to things beyond oneself also perfects or augments one's own meagre life. One path to transcendent enlightenment is studying the Other, instead of limiting oneself to navel-gazing. That's a path towards two infinities added together: the broad study of, say, the psychology of the Other will shed equal light on one's own psychology, while the study of, say, thermodynamics or information theory, will take you far beyond oneself. 

Arithmetically, two infinite series are no greater than one infinite series, but you can still see the advantage of an infinite series within yourself plus an infinite series of yourself and all the others: the infinitesimal within plus the infinite outside. 

love-fragility inequality

better to have loved and lost than never to have loved at all??

Romance might be the most wonderful experience in life. It is also the most precarious. Is the precariousness worse than the wonderfulness is good? Kahneman and Tversky and Thaler and Gilovich tell us that we're more risk-averse than benefit-embracing. The epigraph above must be a fiction. 

It's hard to measure such extreme emotions, but if it's true, as is widely reported, that losing a job is worse than losing a loved one, then maybe romance is an exception to behavioral psychology's "losing is twice as bad as gain is good". So "better to have loved and lost than never to have loved at all" is a good gamble, since there are worse things than losing in romance, but nothing better than loving. 
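As a toy version of that "twice as bad" rule (the coefficient and payoffs below are illustrative, not measured), here's what the standard loss-aversion arithmetic says about the gamble, and why the claim above amounts to saying romance escapes it:

```python
# A sketch of the behavioral economists' rule of thumb: losses are felt at
# roughly twice the weight of equal gains. Numbers are purely illustrative.

LAMBDA = 2.0   # loss-aversion coefficient: losses weigh about twice as much as gains

def felt_value(outcome: float) -> float:
    # gains count at face value, losses are amplified by LAMBDA
    return outcome if outcome >= 0 else LAMBDA * outcome

# the romance gamble: suppose the joy of loving is +10 and the pain of losing is -10
love_and_lose = felt_value(10) + felt_value(-10)   # 10 - 20 = -10
never_love    = 0.0

print(love_and_lose, never_love)
# Under the standard coefficient the gamble looks bad; the point of the post is
# that romance may be the exception where nothing outweighs the gain of loving.
```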

Tuesday, September 16, 2025

complexity and AI: why LLMs succeeded where generative linguistics failed

Chomsky's approach to linguistics has effectively been disproven by AI: 1. general learning is enough for language (no innate language skill necessary), 2. language is fundamentally scruffy (not neat), 3. language is enough to learn language (no grounding).

-- Joscha Bach, Nov. 6, 2024 on X

Disproven?

In the quote above, Bach focuses on confirmatory evidence -- what AI and humans can do rather than what they can't --  missing both the key argument in favor of Chomsky's computational-generative program and that program's argument against neural network LLMs as a natural language model. Bach's comment may also be exaggerating the success of neural networks, but let's set that one aside. 

Focusing on confirmation while ignoring disconfirmatory evidence or counterfactual support is quite common in human reasoning. And Bach's is a common response to Chomsky: if AI can produce natural language better than a computational model, humans must not need an innate hardwired syntax program. Children could learn all of language simply by bottom-up empirical impressions, no need for a top-down, evolutionarily preset computational grammar. 

What Bach and his fellow engineers ignore is the limit problem that drives the whole Chomsky model. An account of human language-learning does not depend on which languages humans or machines can learn, but on which languages, if any, humans cannot learn, and which machines, if any, mechanically and structurally cannot learn those same languages while nonetheless being able to learn human languages. If there are such humanly impossible languages and such machines, then 1) it is highly probable that the human language faculty is isomorphic with those machines, and 2) any machine that can parse or produce those languages manifestly cannot be a model of the human faculty, much less explain or even illuminate it. 

This is why Chomsky repeats over and again, to the puzzlement and frustration of the AI engineering community, that AI technology tells us nothing about the human language faculty. Nothing. 

There's a popular view, apparently held by engineers as well, that technology is a branch of science because technology depends on the discoveries of the sciences. But the goals of the natural sciences are not the goals of technology. Technologies are devised in order to accomplish practical tasks, often market-driven tasks or military ones. The goal of the natural sciences is to understand nature, not to manipulate nature into performing a task. The goal of understanding nature includes explaining what nature doesn't or can't do. That's not the task of technology, and it's one reason why AI alignment is such a challenge: there is no natural selection teaching AI to limit itself to human subservience and harmlessness. For the natural sciences, those limits need to be explained. 

Now, you ask, and should ask: are there languages that human children can't learn? Under current ethical conditions, we can't experiment to find out. But we can look at syntactic structures in known languages that speakers of those languages cannot parse. I provide examples below. First, I want to respond to a more pressing question: why has the generative model failed where AI technology has succeeded? 

On this question, Bach holds an important clue. Language is scruffy. Even if it exhibits a core of recursive computational structures, it also has a lot of vagaries. Worse, the vagaries are not always isolated quirks. They could be recursive and productive, so they could behave and look just like algorithmic recursive productive computations. 

In the 1990s, during my doctoral studies, it was pretty clear to me that it was possible and even probable that the data of language included variants that were non-structural, mere irregular quirks. In the US, we say "go to school" but "go to the hospital" or "to a hospital". Brits say "go to hospital". To conclude from this contrast that there must be a deep syntactic structural difference with reflexes throughout the language seems way too much for a theory to explain. It's more likely a historical accident of semantics that has somehow seeped into one dialect and not the other: it could be that for Americans, schooling is an activity, but hospitaling is not, and this is reflected in the dialect; "hospital", like "congress", can be treated like a name, as Brits do, unlike, say, "the capital", a common noun. But if a linguist can't definitively tell which datum belongs to the syntax machine and which is learnt by the general learning faculty, then all the data are effectively in a black box. 

The clear evidence that there are structures that English speakers cannot parse (see examples below) convinced me that Chomsky's explanatory model was probably right. But because the data were all jumbled together in a black box, I was also convinced that the program of discovering the specifics of the model was doomed to failure -- despite being the right explanation.

As long as the evidence is behavioral (sentences), the model will be unfalsifiable if the model data are mixed with general learning data. Judgment will have to wait for psycholinguistic or neurological experiment to sift out the innate machine recursions from the general learning ones. 

The sciences are not as straightforward as the public may assume. For example, Darwin's theory of evolution is entirely post hoc. It can't predict any creature, it can only give a kind of tautological post hoc just-so story: if the species exists, it has survived (tautologically obvious), and that survival depends on adaptation to the environment (that's the post hoc just-so story), explaining at once the divergence of species and their inheritance from their common origins (that's the revelation explaining all the data of zoology together with elegant simplicity, settling all the questions that the theological theory left unanswered like, why do all the mammals have two eyes, a spine and a butthole, the last being a particularly difficult and comical challenge to the image of the deity). Physicists can't predict the next move of a chess piece let alone the fall of a plastic bag. That's one reason to question pundit economists.  A theory, even a great and powerful and pervasive theory, can be right without being predictive. In language, generativism can be right without being able to generate the language. 

A technology, however, must work at least well enough that the bridge doesn't fall. If that means it's wise to use girders stronger than the science prescribes, you use them, no questions asked. Or rather, the only question to ask would be costs. Note that such costs are utterly irrelevant to the value of a science, where the only "cost" is the simplicity of the theory (more accurately the probability, see the post on entropy and truth). You see the difference. Tech is about the practicalities of an innovation in a realm of action; science is about understanding what already exists.

And this is why the AI neural network program (LLM chatbots) has succeeded where the top-down computational program has failed. A computational model can only produce by algorithm and can only parse by algorithm. It cannot by itself capture quirks without adding ad hoc doodads to handle each quirky case -- not a theory but a cat for every mousehole. A behavioral, empirical mimicry machine can learn to mimic vagaries just as well as algorithmic functional outputs. They are all equally outputs from the perspective of a behavioral monkey-see-monkey-do machine. The mimic isn't interested in the source of the behavior or its explanation. There's no causal depth to mimicry and no need for it. Algorithmic behaviors and quirky behaviors are equally just empirical facts to the mime. 

This is not to disparage the methods AIs use to "learn" behaviors and generate like behaviors. Neural networks are an amazingly successful technology. And they may even be used by scientific research to understand language. But they are too successful to stand as a model of human language learning or, possibly, of human intelligence as well. 

So even though, or actually because, neural networks are restricted to a shallow, non-causal, impoverished empiricism, they can accurately reproduce the full range and complexity of language -- its scruffy idioms and recursive vagaries, as well as its algorithmic functional returned values -- whereas the top-down model could at best account for the returned functional values. But a top-down computational model can't do even that, because the top-down model relies on the restricted data of returned values, and the linguist can't identify with certainty which data are restricted to the faculty's algorithm and which are ordinary human empirical behavioral learning. 

There's an important Hayekian lesson here for social planning and economic policy: policy and planning algorithms are too narrow to predict social behaviors. But learning machines might not be much better since they are trained post hoc. 

Chomsky's generativism was a Copernican shift for the sciences in the 20th century. He not only blew behaviorism out of the water, he returned thought and the mind back into the sciences and philosophy. Now the zombie corpse of behaviorism in the form of LLMs has been resurrected from its swampy depths to walk again, but this time with a sophisticated information-theoretic learning technology. 

None of this top-down computational modeling relies on or even mentions consciousness. It's only a question of whether humans compute or merely mimic with the exceptional range and sophistication of AI mimicry, ascending and descending gradients, and such Markov walks. The answer is that the means AI uses are too powerful to explain human language and maybe human learning in general. 

As promised above, here's a handful of sentences showing what English speakers can and can't parse:

a. The drunk at the end of the bar is finally leaving. (English speakers can parse this easily)

b. The drunk on the chair with three legs at the end of the bar wants to talk to you. (again, it's easy to parse that it's the chair that's at the end of the bar, not the legs at the end, and not a drunk with three legs)

c. The drunk on the chair with three legs in the suit with the stripes at the end of the bar wants to talk to you. (again, easy to parse that it's the drunk here in the suit) 

d. The drunk at the end in the suit of the bar wants to talk to you. (not just hard but impossible even if you know that it's intended to mean "the drunk in the suit" and "the end of the bar") 

and it's not because "of" requires proximity: 

 e. The drunk on the chair in the T-shirt with three legs wants to talk to you. (no way to get the three-legged chair)

[Scroll down to the bottom for a mechanical diagram of the cross relation problem.]

There are no gradations in difficulty. a-c are automatic, d and e impossible, even though they're simpler than b or c (!) and even if you're told what they're supposed to mean!! That's a reliable sign of machine operation -- a structural failure. And in fact a push-down automaton can produce a-c, just as English speakers can. A push-down automaton mechanically can't parse d or e, and neither can you. 
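Here's a rough illustration of that mechanical limit (my sketch, not a real parser): treat each modifier's attachment as an arc between word positions. A single stack can only match arcs that nest like brackets; two arcs that cross, as in the intended readings of d and e, can't be realized with a stack at all.

```python
# A minimal sketch (illustrative only): a push-down device can match
# dependencies only if they nest like brackets; crossing dependencies
# cannot be realized with a single stack.

def stack_parsable(arcs):
    """arcs: list of (start, end) word-position pairs, e.g. a noun and the
    modifier that attaches to it. Returns True if the arcs nest (a stack can
    match them), False if any two arcs cross."""
    for a1, b1 in arcs:
        for a2, b2 in arcs:
            # two arcs cross when one starts inside the other but ends outside it
            if a1 < a2 < b1 < b2:
                return False
    return True

# sentence b: each modifier sits inside or after the phrase it modifies -- nested
nested = [(0, 5), (1, 2), (1, 4)]        # hypothetical positions, nesting pattern
# sentence d, intended reading: "of the bar" must reach back to "end" across
# "in the suit" -- a crossing pattern
crossing = [(0, 3), (1, 4), (2, 5)]      # hypothetical positions, crossing pattern

print(stack_parsable(nested))    # True
print(stack_parsable(crossing))  # False
```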

That is the counterfactual argument for generative grammar: the limit tells us what kind of machine the brain uses. A neural network LLM learning machine can produce d and e if it is trained on sentences of this kind. Therefore LLMs tell us nothing about why d and e are impossible for English speakers. LLMs are a successful technology, not a scientific explanation. LLMs are too successful to be informative of natural language learning. 

The counterfactual above is not a proof, it's just support. It would be proof if an LLM actually produced a sentence with the structure of d and e. And the fact that it doesn't may mean only that it hasn't encountered such structures in its training. If a child learned to produce or parse structures like d and e, these sentences would be less interesting. Any of these results would be relevant to our understanding of human learning and LLM learning. But Bach's 1-3 don't address the counterfactual at all. They are mere excitement over a technology that can do anything and everything. That's to miss the scientific goal entirely. In other words, the jury is still out on Chomsky's model, pace Bach. 

Even a behavioral test like the Turing test would distinguish an LLM from human intelligence, since the LLM could perform what humans can't. It's the weakness of the Turing test that it tests for how stupid the machine must be, or appear to be, to emulate humans. It's ironic, but not surprising, that it would be used as a test of intelligence. Not surprising, because Turing's test reflects the despairing limitations of behaviorism prior to Chomsky's 1957 Syntactic Structures. 

Here are diagrams that make it clearer what the machine structure can't do:

[The red X below indicates the cross relation that neither humans nor "push-down" automata can parse.]



Monday, July 21, 2025

self selection & the unchosen: identity, the spark bird, and the Freudian trap

a Bayesian approach to self-identity

A friend says that all the geniuses seem to have had a seminal moment in childhood that turned them towards their calling. 

The allure of this explanation seems obvious. Except for the fans of Gladwell's 10,000 hours, there's something special about geniuses; not like the rest of us. There must be an explanation of their magic specialness. There must have been a cause, an igniting start for this astonishing and extraordinary career. There must have been a special moment.  

I think this is all a fallacy, a failure to use Bayesian reasoning, compounded with a myth promoted by Freud that has somehow become an article of faith in folk psychology. 

Geniuses, or talented celebrities of any stripe, are the focus of great popular interest. They are often interviewed by journalists who are not geniuses, just ordinary blokes who want to please their audience, or their editor's audience. What does the audience want? To know how the genius became genius. It's a disguised version of "why are you a genius and I'm not?" or more personally, "how come I'm just your average slob, damnit! How come you're so special?!" 

And to answer this question, our ordinary-bloke journalist asks, "how did you become a genius?" Not in those words, of course, but more like "what got you interested in what you do?" "How did it all begin?" In other words, our vicarious spokesperson here is specifically asking the genius to come up with a beginning experience. And how does genius do this? Why, by scanning her past as far back as possible to find that start. Seek and you shall find. There's almost always some event that will fit the bill. End result: any genius will have a beginning story and it will likely be nestled in childhood. It's a case of confirmation bias, or inductive fallacy -- they're basically the same -- and they both mistake evidence for science. 

Why is this a failure to apply Bayesian reasoning? Consider all the events of childhood. Countless, aren't there? Most of them forgotten, no? That's just the point! Of all those events, many of them could have been sparks for interest. Now, how many of these produced no enduring interest? How many produced no impression at all? How many sparked a brief but not lasting interest? Well, all of them except the one that genius held onto. In other words, it's not the event at all, it must be the potential interest already within the genius. 
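A toy Bayes calculation makes the point (the numbers are mine, purely for illustration): if nearly every childhood contains a candidate spark, then finding one in a genius's past barely raises the probability above the base rate.

```python
# Illustrative numbers, not data: how much does a childhood "spark" event
# raise the probability of a lifelong calling when nearly every childhood
# contains many candidate sparks?

p_calling       = 0.001   # assumed base rate of developing the enduring interest
p_spark         = 0.95    # assumed: almost everyone encounters a potential spark event
p_spark_given_c = 1.00    # anyone with the calling can find a spark story in hindsight

# Bayes: P(calling | spark) = P(spark | calling) * P(calling) / P(spark)
p_calling_given_spark = p_spark_given_c * p_calling / p_spark

print(p_calling_given_spark)   # ~0.00105 -- barely above the base rate
```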

The relationship between the choice of story and the development of enduring interest is no doubt idiosyncratic and complex, including all the chances and complexities of the environment, pre- and postnatal. Or it may have been a driving interest from within, a polygenic inheritance. Whatever the case, there's no reason to believe that the event was the cause. 

Among birders -- lovers of birds, bird-watchers, bird devotees, ornithologists -- there's a common experience they describe as their spark bird, the bird that ignited their interest in becoming a birder. The metaphor says a lot, I think. A spark without tinder is a flash in the pan. It's the tinder that makes the flame. No parade of birds, no matter how long, exotic, brightly colored or iridescent, flamboyant, weird, big or tiny, could spark my interest in birds. Is it because we had a bird at home for a little while, so I got used to them? But why didn't that bird initially spark me? 

It's because of me. It's not because of the parade of flaunting birds. It seems ludicrous to think that the bird makes the birder. When looking at the events of childhood, it's just as important, maybe even more important, to look at all the events that made no impression. That's the base rate. That's the norm. Surely that's just as informative, if not more. What makes a genius is not the event, but the personality, the personality that ignored all the other big events, or dabbled in some of them and lost interest. 

The Freudian myth that childhood events are responsible for our development ignores the normal. I'll be posting soon on this oblivious obvious of the ordinary. It's the normal stuff, the everyday, the 99% of childhood, that makes up 99% of experience. And even that is only background. Oneself is 100% of one. Freud liked to have an explanation for everything, even if he had to fictionalize one. No doubt it made him seem more valuable and smart, and it's the character of the charlatan to know more than everyone else. Charlatanism is the pretense of having extra-normal knowledge. But it's just a pretense.  

The same experience can have radically different -- even opposite -- effects depending on the person. Much of psychology ignores the radical diversity of congenital personality. It's one of the deep beauties of Ruth Benedict's understanding of personality that she recognized the congenial origin of diversity. Anthropologists often mistake her view as "culture determines personality", which is consistent with currently popular postmodern and also Marxist blank-slate views that there's nothing innate that culture can't manipulate. 

Benedict was diametrically and essentially opposed to such a view in her very heart. And in her reasoning mind and empirical research. She reasoned: if culture determined personality, there could be no deviance. And she empirically observed: every culture has its deviants. Conclusion: the personality of the culture cannot determine the personality of its members. 

On her view, every culture has its own set of norms, a selection of possible human traits, so it has a kind of personality of its own. That's the personality of the culture, not the individuals born into it. Some are born with traits akin to their culture's norms, others born with traits further from them and still others too far from them to conform to the culture. Those last are the deviants. On her view, those who were born with exactly the traits of the culture are perfectly normal, are never criticized, and need never doubt themselves. In short, the perfectly normal are the psychopaths -- having self doubt is what it means to have a conscience, just what they lack. She's on the side of the deviants.

Notice that this account of psychopathy is not the current "they are born without a conscience." No. It's that they happen to have been born into a culture that 100% supports them. If they'd been born in another culture, they'd have self-doubts and a conscience. It's the accidental match between congenial nature (her expression) and the norms of the culture that leads to psychopathy. So, if you have self-doubts, are not satisfied with yourself, troubled by yourself -- that's good! It means you're not a psychopath! It's good that you don't fit in perfectly with your culture. After all, the cultural norms are not necessarily good. They are just the local norms. The Good, whatever that is, is beyond the relative norms of any local system of norms. Benedict is no relativist. The local norms are just a kind of compass to set the direction of individuals in the society, to facilitate their ability to succeed within that culture. Unfortunately, it can leave behind those who are born with traits too far afield to fit in. She instructs us that in native cultures, the deviants are often embraced as sacred -- weird, maybe, and certainly not normal, but special, supranormal, and to be respected as such. We, rigid as we are and all too arrogantly self-righteous, should learn from those cultures. 

For her, innate diversity is the essence of human personality, which she describes as our distinctive "congenial nature". Far from endorsing cultural norms, she turns them upside down: the perfectly normal are the most dangerous individuals, the Dick Cheneys who can order the mass murder of thousands without hesitation and still be considered normal and even respected. 

It's not what happened to you as a child. It's the luck of being born as you in a culture not yours. 

how we know what dogs are not saying and that the universe is not thinking

Category: the sociology of false beliefs

It's sweet that the New Age mystics want to believe that animal communication is as rich as ours, that the underground fungi modulating trees' nutrition form an ecosystem mind, that the earth itself is conscious and the vast universe deep in thought. Who wouldn't want to believe the animal kingdom and all of nature and even the whole of the universe "are all one" with us? Why not blame the artificiality of civilization on the human distinctions that stand as obstacles in our way towards universal connection? The New Agey are sweet people too, and no doubt in the naive past maybe everyone believed such harmonious speculations. 

Their speculations are illusions, however appealing, and a little thought would show how wrong they are. And they are not just wrong, they are misleading, misdirecting their understanding of the world to the familiar wish-list of feel-goods and away from what is truly astounding and surprising about humans and how very different we are from the rest of the informational world, not only in our communication and understandings, but in identity and individuation as well. 

Starting with language: 

I've had this conversation so many times, whenever I explain how extraordinary human language is and how powerful, more powerful than they have ever imagined. 

"But animals communicate too!" they object.

"So what are they saying?" I reply. 

"We don't speak their language, so we don't know what they're saying." 

"Oh, but yes you do." 

Yes, we do. More important, we know what they're not saying. They're not discussing what happened yesterday, or what didn't happen but should have, or what they expected, or what might happen tomorrow, and what might not. 

A dog growls. About yesterday? Unmistakably not. It's an expressive gesture of warning: you're too close, dangerously close. It's accompanied with a show of teeth, which couldn't be seen from a great distance. So you know the growl is about here and now. Is it communication? Of course. The whole purpose is to communicate -- immediately, very immediately, so you don't move any closer. 

So why can't a dog growl about yesterday? 

Here's the simple difference between expressive communication and symbolic language. We humans have a word for yesterday. It's "yesterday". It's not expressive of yesterday. Yesterday doesn't have a sound or even a look or color or scent. Yesterday is a notion, an idea. The word "yesterday" has no intrinsic relation to that idea, it's just the sound sequence English speakers use to talk about the day past. Dogs don't have this symbolic power. They're confined to expressive responses to the here and now. Whimper: "I'm hurt, be gentle with me." Bark: "Alert! Dog here, ready for action." Howl: "I'm alone over here," or something like that. I'm not completely fluent in dog. 

So you do know what they're saying, and, more important, what they're not saying. Unless they're barking in telegraphic code, which is about as likely as that they might be aliens hiding here to spy on humans. Not likely, but no doubt someone out there will believe it. Because, you know, they are surveilling us.

A friend, a reader of poetry, insists that the power of language is its expressivity. What gets me is that the expressive power of language, of limited and little importance, should draw attention and admiration, while the truly and literally unimaginable power of symbolic language is utterly unrecognized. 

Suppose you wanted to tell a friend about your dog, but you had no word "dog". You might try making a typical dog sound -- barking, or bow-wow, rufruf or "roog-roog". But how would your friend know that you're talking about your dog rather than talking about your dog's barking too much? Or that you're talking about all dogs barking too loud? So maybe you try mimicking your dog. This charade might actually work, as long as what you want to communicate is not what your dog did yesterday. Good luck on that charade! No wonder dogs don't express the past. Without a symbolic language, it would be a huge waste of time and effort for a useless communication about what isn't even present. 

But that's exactly what human language facilitates. We not only talk about what happened yesterday, we can talk about what didn't happen yesterday (!), and what is unlikely to happen tomorrow. In fact, most of what we say is about what isn't present here and now. And that's for good reason -- we all can see what's here and now, why talk about it? "This is the elevator. This is the elevator button. I'm pressing the elevator button." No. In the elevator we talk about the weather. Outside, not the weather in the elevator. 

The superpower of language is its ability to represent not the real, but the irreal -- what is not here, what doesn't exist, what couldn't exist, what might be, what was and what never was. The work I didn't finish. The book I have at home. Our mutual friend off vacationing somewhere for the week. That's the superpower of language.

And what makes this possible? By now you recognize that it can't be its expressivity. So what's the secret? The secret is this: the distinctive sounds that we produce have no relation to what they mean. We don't use bow-wow to indicate a dog. We use an arbitrary sound sequence that doesn't sound like a dog, doesn't look like a dog and doesn't smell like a dog. And we have a distinct sound sequence for the dog's bark, the dog's coat, the dog's scent, the dog's size, the dog's color, and a word for 'my'. It's this non-relation to the meaning that allows all sorts of nuanced meanings that couldn't be accomplished with expressive sounds. The foundation of the superpower is its arbitrary relation to the meaning -- that "dog" doesn't sound like a dog. This arbitrariness alone is what allows an abstract notion like yesterday to be attached to a sound sequence. Maybe the most impressive words in English are "no" and "not" and "nowhere" and "never". There's a familiar world of what's here, but these words allow us to designate the complementary set of everything else, all that isn't. That should blow your mind. Unless you're a dog, in which case it's just a command. 
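A toy illustration of the arbitrariness point (my sketch, nothing more): a symbol is just an arbitrary key, so nothing stops us from attaching one to an abstraction like yesterday, or to negation, things no bark or charade could ever point at.

```python
# A toy symbol table (illustrative only): the keys bear no resemblance to the
# meanings, which is exactly what lets them stand for abstractions and absences.

lexicon = {
    "dog":       "a domestic canine",
    "yesterday": "the day before today",            # no sound, look, or scent
    "not":       "the complement of what follows",  # designates everything that isn't
}

# Composing symbols lets us talk about what isn't here and now.
utterance = ["not", "dog"]
print([lexicon[w] for w in utterance])
# ['the complement of what follows', 'a domestic canine']
```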

In the post on information faster than light, it's this ability for language -- symbolic communication -- that allows us to designate a shadow as a thing, when reductionist physics cannot treat it as a thing at all, but an absence that cannot move. It's our symbolism that allows us to talk about a shadow moving, and that's the only way a shadow can move faster than light. It's this level of complexity using a physical object -- for human language, it's a sequence of sounds -- arbitrarily to mean some idea that is not physical at all, allowing information to move faster than the universe's speed limit. That's beyond amazing. It's on the level of the miraculous -- the realm beyond the natural to the supernatural. Of course we've all known that ideas and thoughts are supernatural but it's symbolic language that bridges from the supernatural to the natural. Neither ideas nor thoughts can move at all, anywhere. Thoughts and ideas don't have a where. But symbols do and the information contained in them. It's this weird divorce from reality represented in the symbol, the symbol that crucially has no intrinsic relation to the thing it represents. It is all too easy to fail to see its power. Human civilization conducted mathematics for thousands of years before the Indians introduced the zero as a number. 

We've been using language for many thousands of years, taking it for granted, without any clear understanding of how it works, even when using words as magical incantations, at once recognizing its supernatural force and misunderstanding entirely how that force works. The incantation, focusing on the sound sequence, which is arbitrary, ignores its relation to the idea, which is the supernatural part. Primitive magic has it all backwards. 

You can see this in language today. We are inclined to think that athletics is a bit more serious than sports, even though they denote the same activities. Students in my classes thought "god" was a more formal and distinguished word than, say, "prostitute", and I had to point out that not only is "prostitute" a formal word for a ho, but "god" is a thoroughly common word, even a profane one as in "goddamnit", "God, what an idiot", "OMG, OMG!", "godawful", and the like, whereas "prostitute" never plays those roles. The social value of a word resides in the sound sequence, not in the idea. "God", whether you believe in it or not, is the loftiest possible idea by definition, though skeptics will quibble over "loftiest" as its equal, also by definition. In any case, it denotes the loftiest being. The sound sequence? It's just a sound sequence. In the Romance languages it's one dialectal form or another of "deus", in Arabic it's "Allah", and in Russian the sequence is "Bog". 

So: there are distinctions to be made. Expressive communication, rampant throughout the natural realm including animals and plants, is not at all like human symbolic language. Whatever the animals are thinking, and whatever the behaviors and adjustments and responses of stars and galaxies, they are not representing themselves symbolically. They are without the symbol "I". That alone should disabuse the New Age mystics of their assumption that we are all one. It's only a mind filled with symbols that can represent "I" or "we" or "they" or "we are all one" for that matter, and, more important symbolically, "We are not all one." Whatever "oneness" we might have with the rest of the universe, it's not articulable beyond the kind of symbolism we use with "shadow". It's a symbolic fiction, a nothing, a vacuum that we can fill with any emotions or predilections we choose, whether it be the joy of oneness or the emptiness of the great one pointless universe. It's all just metaphors of symbolic language. 

And what about the earth and the universe? Can they think? Certainly inanimate objects respond to their environment and to other objects, so they are all adjusting to each other. But again, they are not symbolizing their responses. They're just behaving without cognizing their behaviors. There's a kind of hierarchy of information throughout the phenomenal world, and even though inanimate phenomena can behave in emergent ways beyond the reductionist laws of physics -- temperature, for example, emerging from random molecular motion, is a classic case of an emergent property that can't be reduced to the behavior of any particular molecule, as is the direction of time's arrow, an emergent property of probability, not physical law -- even if inanimate objects might be regulating their internal structure to adjust to stimuli from outside, they do not symbolize or treat the objects around them as meaning tools to discuss the future of what will not happen, or dream of nonexistent fictions, or bemoan the absence of what was. 

Ironically, mistaking behavior for thinking is now widely accepted among those who insist that AI is intelligent. Their justification is the Turing Test. But that test is itself a behavioral test, an expression of despair that we have no better means of understanding intelligence. 

So what is intelligence? AI can refer to itself; it can have intentions; it can even lie (often in trying to give you the answer you want rather than an accurate answer). 

Here's one answer to that question: symbolic representation and computability over those symbols. LLMs' learning is not computational. At least, not yet. So far it's still mimicry. 

This weakness of neural networks was already well understood and predicted back in the 1990s. In my Ph.D. program in linguistics at the time it was The Big Debate. Fodor and Pinker (whose second trade book, Words and Rules, was specifically about this problem) argued that neural networks would not succeed in generating all and only the possible sentences of a language -- analogous to solving a math problem algorithmically -- but would merely approximate that set of possible sentences through mimicry. 

Ironically, neural networks turned out to be more successful than generativist linguistics. A language is too compromised by structural noise, internal and external -- noise that humans can nevertheless learn beyond the grammar -- for any single generative syntax to predict it completely. So mimicry can succeed in producing what a generative algorithm can't, since humans use both mimicry and computational generativity. 

I mention this because the language facts show something else essential: consciousness is irrelevant to the ability to generate language, since native speakers mostly aren't conscious of the grammar by which they produce sentences. (This fact would not be available in math, as mathematicians work with the functional syntax overtly.) And since there's lots of persuasive evidence from human neurology (Christof Koch's work, for example) showing that, bizarre and illogical as it may seem, consciousness is post-decision, the moment of recognition or understanding is likely a mere epiphenomenon and not necessary. There must be some other means by which humans functionally distinguish the infinite application of the algorithm from the mere inductive likelihoods of empirical mimicry. It's a debate as old as Plato -- an idea is a generative algorithm, a formulaic function ranging over not just the actual but the possible -- and a rebuke to Wittgenstein's behaviorist games and family resemblances.
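A small sketch of that Platonic point as I read it (my example, not the author's): a rule ranges over the possible, while mimicry can only replay what it has actually seen.

```python
# A generative rule vs. memorized behavior: the rule covers cases never
# observed; the mimic can only vouch for what's in its training set.

observed_evens = {2, 4, 6, 8, 10}          # what a mimic has encountered

def is_even_by_rule(n: int) -> bool:
    # the algorithm ranges over every case, actual or merely possible
    return n % 2 == 0

def is_even_by_mimicry(n: int) -> bool:
    # the mimic only recognizes what it has already seen
    return n in observed_evens

print(is_even_by_rule(1_000_002))     # True  -- the rule generalizes
print(is_even_by_mimicry(1_000_002))  # False -- outside the observed data
```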

Why do the New Agey prefer to speculate about oneness and unity with nature -- the Franciscan nature, not the Darwinian "nature red in tooth and claw" of Tennyson? No doubt it makes them feel happy, positive, socially agreeable, and they may believe that it contributes to their health, both mental and physical. More power to them! I hope it works for them. I'd like to believe that an appreciation of the power of symbolic language would dispel the "we are all one" ideology. But what if the New Age mystic fully appreciates the power of language, but views that power as an obstruction to mystical truth -- a cultural veil of categories distracting from oneness? And yet also believes that animal and plant and cosmic communication is just like ours. To hold both is to hold a contradiction. Would resolving the contradiction make them better off as people? Maybe not. They might be better off living with contradictory views. 

To sum up: dogs and planets are not talking about the irreal, which is mostly what we humans talk about. That may be all there is to the difference in intelligence. We might all -- I mean everything in the universe -- have some kind of consciousness, but that's not intelligence. And the evidence seems to indicate strongly that consciousness is not intelligent thinking, it's a post hoc response to decisions already made internally. Intelligence is the manipulation or computation of symbols, and symbols are themselves algorithmic. What we learn is a mess of mimicry, algorithmic syntax and abstractions tied to symbols. That symbol relation -- an object representing a property or set of properties -- is having thoughts, ideas or meanings, and that is thinking.