Sunday, November 7, 2021

On the nature of human language

This post, after a hiatus of so many years, will also be a little change of pace for the blog. Actually, after so many years, just posting is a "change of pace", but this will be a change of pace for other reasons.

The inspiration for this post came when I recently watched a lecture on the evolution of language by a distinguished scholar named Mark Pagel, which you can see for yourself here. He states some uncontroversial points about human history, but interwoven are a number of comments which seem odd, along with a general thread, taken as a given, that only late Homo sapiens could speak. For instance, in a discussion on Neanderthal/sapiens interbreeding he states, matter-of-factly, that we don't know if their children would have had language. Indeed, he is extremely explicit about this point - at about 29 minutes in he discusses the human family tree, including Neanderthals and Denisovans, and states very firmly that "not a single one of them has language, as far as we can guess". His criteria for this seem to be almost totally based on cultural toolkits, but for Neanderthals he has to have an extra reason to exclude them because they seem to have had a fairly sophisticated culture (although, at other times, he just discounts Neanderthal culture altogether). His extra "evidence" for this is that, although they had the same FOXP2 gene as us, it is affected by different regulatory genes. This accepts rather uncritically the idea that FOXP2 is a vital component of language communication in humans. It is perhaps possible that modern human languages rely on the gene, but it does not follow that language itself (whether spoken, or mixed-mode with signs, or whatever) requires the gene. In any case, this is still highly speculative and should be couched in terms that make this clear.

One of his major points supporting the idea that only Homo sapiens have ever had language is an image shown at around 57 minutes in. It shows the geographical and chronological distribution of human ancestors starting with Homo ergaster, then Homo erectus, neanderthalensis and up to sapiens. Even just the initial split of ergaster as separate from erectus is controversial. The image he shows demonstrates that sapiens spread over the whole world in a relatively brief period of time. It seems to me uncontroversial to suggest that sapiens were smarter and more adaptable than most of their forebears, although I'm not sure we can guarantee that there were no Neanderthals, for example, who were smarter than a modern-day human, given the wide range of abilities exhibited in any sufficiently large population. The small percentage of Neanderthal DNA in modern Europeans perhaps owes more to the Neolithic farming revolution than to any specific advantages of Homo sapiens over Homo neanderthalensis. In any case, the image also shows that Homo erectus lasted an extraordinarily long time and covered all of Asia, but almost nothing else (it is unclear from the image, but it seems to be indicating a relatively brief early presence in Africa as well). However, as a big fan of paleoanthropology, and of John Hawks in particular, I found that this didn't ring quite true. A quick check of Wikipedia reveals a much greater extent, which is to say all of Africa, Europe and Asia, including small islands that can only be reached by boat, showing a level of sophistication and planning which Mark Pagel ignores completely in his lecture. It's not like his lecture is so old that it couldn't include this new information either, as it's from 2019. Indeed, his distribution of Neanderthals is similarly too limited, as they were in Europe, the Levant and central Asia. Certainly one of the most famous Neanderthal finds is from well outside of Europe.
John Hawks has written on the topic of the "myth" that African populations lack Neanderthal ancestry. Wikipedia has a similar type of picture to Pagel's, but one which shows an African population of erectus persisting until Homo heidelbergensis appears, representing either an evolved state of Homo erectus or a replacement. The main point to note is that this picture, which seems to me, based on my other reading, to be closer to the "consensus" view, is quite a bit different from that presented as uncontroversial in Mark Pagel's lecture. It seems fairly safe to say that Homo erectus was pretty widespread. Not as widespread as sapiens, but we should also remember that the evidence for sapiens is from more recent time frames and thus more likely to be observed by us, whereas our ancestors used stone and wood, lived at much lower densities, and may often have lived in areas that are now flooded as a result of sea level rise at the end of the last ice age, so they are just far less likely to be observed by us today. A culture that makes extensive use of wood and hide, for example, will largely disappear without a trace after 100,000+ years, so we shouldn't take the distribution of skeletal and tool finds as being indicative of the whole range of the species, but as the minimum area they must have travelled across. To my mind then it would seem clear that both Homo erectus and Homo neanderthalensis were quite successful and widespread, erectus especially so. His almost cartoonish characterisation of the lives of Neanderthals seems to belong to another age altogether. I found myself wondering as I listened to him what he would conclude about Australian Aboriginals and their ability to speak any language at all, on the basis of the evidence that would survive 100,000 years. No rock art, highly decorated barks or dilly bags would be found, and no instruments (being as they were wood).
They had great stone tools which we might be able to determine were hafted, but erectus had pretty good stone tools too. Lost are all the dances, ceremonies, songs and other aspects of their intensely symbolic lives. And of course, gone would be all traces of language. Would Mark Pagel conclude they had never had language at all?

There is one funny point he makes in response to a question at about 1:22:00 in the lecture where he says "Nobody I think is saying Hebrew is the mother tongue", which segues nicely into the next source of inspiration for this post, which is Chomskyan linguistics in general, and the following strange lecture given by Chomsky in particular. The link with Mark Pagel's lecture comes at about 1:28 in Chomsky's lecture when he says that "the basic structure of language is, again, kind of like Proto-Semitic" and again at around 1:32:50 where he says "Uli(?) and Proto-Semitic are the Ur-languages", Ur-language being a Frankenstein borrowing of the German Ursprache which, in this context, means "the original language". In other words, Chomsky is very clearly suggesting that the ancestor of biblical Hebrew is, indeed, the "original language", at least in our brains, and every other language is just a remapping of this underlying, genetic, built-in grammar to "surface" forms. These "surface" forms are Chomsky's way of saying "written languages of the world". I'm hesitant to use the word speech because, as should be obvious to anyone with even passing familiarity with Chomsky's work in linguistics, he clearly seems to be "studying" a rarefied, pure, written form of languages and then trying, and, as can be seen from the history and results of Chomskyan linguistics, failing, to say something deep and meaningful about what real language is. In fact, although the above linked lecture of Chomsky's is from quite late in his career, it would seem to be the one that best summarises the underlying purpose of the enterprise in Chomsky's mind, and that is to prove that Hebrew is essentially the original language, or at least the basic language grammar, from which all modern languages are just deviations.
To my mind, this actually explains a lot about why he has pursued such a strange mathematical formalism as a mechanism for pontificating about the origins of language, one which completely eschews statistical methods, comparative linguistics or, indeed, any kind of field work or evidence. He seems to prefer a kind of mental source, pulling all the evidence he needs from the English language itself coupled with his knowledge of some other languages. He apparently spoke fluent Hebrew some 50 years ago, but now doesn't feel comfortable speaking it any more. I couldn't find a video of him speaking any other languages, but he claims reading fluency at least in a few besides English.

Just to clear things up at this point, I am in no way against Chomsky. I may have sympathy for his controversial political stances. I can't say for sure, as I've never read any of his books on those topics and his discourses in public honestly strike me as roughly as confused and tortuous as his linguistics, but I certainly don't hold his politics against him at all. In fact, I first really heard of him when I was a teenager and I heard about this wonderful new world of Generative Linguistics which, I read, held all the answers as to how language works and its origin. So I bought a little primer on the field. Unfortunately, the book left me extremely cold. It did its best to present the body of theory (really only hypotheses) in the field, and to discuss the problems and changes made in response to those problems. I don't know why it seemed so obvious to me, but not to many of those involved in the field, but it was clear that this was trying to understand written communication and the formalisms behind it, which is problematic to begin with, and also that it was just a very unsatisfying and unsatisfactory attempt at an explanation. Even in my teenage years, it seemed clear to me that it was as if I had asked for an explanation of all the amazing variety of life on Earth and, when I finally read a book purporting to explain it, the answer turned out to be something like "God did it". The explained mechanisms had no basis in biological reality and made little to no reference to the variety of languages available in the world. However, I wasn't deterred and just thought that I mustn't be understanding it properly and that in time, with more reading and research, I would understand it better.

I started an electrical engineering degree at university which gave me a solid understanding of higher mathematical concepts, including discrete maths. This seemed to possibly be the key! I think one of the books I read at this time even referenced one of Chomsky's results in computer "languages", which impressed me. However, it was also very clear to me that a formal mathematical language is an extremely different beast to a natural language. Still, Chomsky had a mighty intellectual reputation so I sought more information. At about this time I read two books in this area. One was Steven Pinker's very popular and highly praised The Language Instinct (from 1994) and the other was much less well known amongst the general public, a collection of academic papers from 1998 titled "An Introduction to Connectionist Modelling of Cognitive Processes". I think I may have read this latter first and I was truly fascinated by its biological approach and its focus on modelling and experiments, even though it was with toy systems at the time due to limited computing power. It demonstrated that complex behaviour and learning were possible with simple models, including the kind of learning which Chomsky had "proven" to be impossible using formal mathematical languages, namely his claim that human languages are impossible to learn due to a paucity of stimulus, a point of view which has been parodied as a "paucity of imagination" on Chomsky's part. It showcased small-scale models which showed human-infant-like learning patterns, etc. It blew my mind. I felt a great dawning in my brain. Here was a system of modelling with a biological basis which could show human-like traits. I could see that the limitations were due to the use of simplified models.
In retrospect, I can't understand why the field of connectionist modelling didn't immediately cause the whole field of Chomskyan linguistics to be abandoned and all that wasted funding to be poured into language preservation and connectionist modelling related studies. This was way back in 1998, but it took a while for anything much to show in linguistics. There are still linguistics departments today in thrall to Chomsky's unscientific field of study. It's unscientific in the sense that it is seemingly impossible to disprove, as the definitions can always be made slippery enough to evade any issue. My experience with Pinker's "The Language Instinct" was very much the opposite. I read it with a very open mind, expecting to be blown away, but found myself instead continually frustrated at the strawman arguments and false dichotomies, which seemed to me very often to be the result of a paucity of imagination as well. Even if it hadn't had those problems, I was still left wondering "so what?". It seemed to basically say that language just appeared in our brains due to a language module and then tried to find something in the brain which could be described thus. It seemed to have no real explanatory or predictive power, quite the opposite of the book on connectionist modelling from a few years later.

At about this time I also read Terrence Deacon's "The Symbolic Species". This was an extremely interesting and careful evaluation of the evolutionary history and neuroscience of human language, including a very careful categorisation of different forms of communication, pointing out that only humans have the highest level of complexity, "symbolic communication". It ended with Deacon's personal pet hypothesis on the evolution of language. It was an interesting take, although highly speculative of course. John Hawks did a great review of this book. The Symbolic Species and An Introduction to Connectionist Modelling were the first two books I read that took on Chomskyan linguistics head on with actual research, facts and a proper scientific viewpoint. They really resonated with me for both the approach and the results.

Since those days it has become clear to me that there are two broad groups of linguists in the world: those who study real-world languages and work on recording and preserving them, learning fascinating new insights along the way, and Chomskyan linguists who seem to be doing the same old ivory tower work, sometimes perturbed by facts from the outside world. The former group very rarely pop their heads up over the parapet because they see no advantage in starting an argument using evidence against what seems more religious than scientific. There have been a few notable exceptions, however. Daniel Everett was certainly one. I've read several of his books, which all attack the foundations of Chomskyan linguistics "from the inside" so to speak, as he himself was a former Chomsky devotee, but he has also worked on documenting remote, endangered languages, and this changed his worldview very starkly. Even his latest book, "How Language Began: The Story of Humanity's Greatest Invention", ostensibly on the topic of the origins of language itself (spoiler: he traces it back to Homo erectus, which I agree at least makes more sense from an evolutionary perspective, although the evidence will always be weak), does the same. He's also gone head to head with at least one Chomskyan in a live debate, which just reinforced for me the muddle-headed worldview that Chomskyan linguistics induces in its obviously intelligent adherents. It was somewhat like hearing a very intelligent person try to explain why they believe in a Young Earth.

Another very strong rejoinder to Chomskyan linguistics was delivered in a paper co-authored by a very highly respected Australian linguist, Nick Evans, who I first learnt of through his grammar of Bininj Kunwok. The paper is titled The Myth of Language Universals and I highly recommend reading it. It is a detailed, clever and careful piece-by-piece dissection of whichever parts of Universal Grammar and the Language Acquisition Device could be pinned down to a single definition for long enough to take apart. The other dissection of Chomsky is by an AI researcher, Peter Norvig, who has switched from the grammar-rules-based approach to a statistical learning approach to language modelling problems. It comes in from a very different direction, that of the philosophy of science itself and the results achieved by the different approaches. For anyone looking for the most comprehensive and comprehensible takedowns of the Chomskyan field of linguistic study, look no further than these two papers.

To wrap up this winding diatribe, I would like to recommend one of Jeffrey Elman's lectures on connectionist models for an update on the field and Michael Tomasello talking about his fascinating research on pinning down the underlying difference between humans and chimps which is the fundamental divide between us socially and, therefore, linguistically.

And, very finally, I recently watched another lecture on the evolution of languages, the link for which I currently can't find, which repeated the claim that the click languages of Africa represent an ancient language and that we all used to speak using clicks but have lost these. This arrogant view that "primitive" hunter-gatherers must not be innovating in language because their material culture hasn't changed rapidly is thoroughly destroyed, in my opinion, by Tom Güldemann's paper "Clicks, genetics, and “proto-world” from a linguistic perspective". If we can just drop these prejudices and strange mid-20th century ideologies from linguistics, our progress in studying, recording, preserving, understanding and modelling language can only accelerate!

1 comment:

  1. Fascinating post. Have you ever taken any inspiration from Whorf's essay: "Language, Mind, and Reality"?
