by Robin Manley
Leif Weatherby is an Associate Professor of German at New York University, where he directs the Digital Theory Lab. Robin Manley spoke with Dr. Weatherby about his latest book, Language Machines: Cultural AI and the End of Remainder Humanism (University of Minnesota Press, 2025), which argues that Large Language Models (LLMs) have effected a separation of cognition from language and computation in a form that corresponds to earlier structuralist theories.
Robin Manley: In the introduction to Language Machines, you argue that today’s prevailing frameworks for AI research all share a common theoretical mistake, which you call “remainder humanism.” This goes for the AI-skeptical linguistic arguments made by scholars like Emily Bender and Noam Chomsky, but also for the more social-scientific literature on AI bias and ethics, and even for the otherwise quite-opposed tradition of AI risk and alignment research. What is remainder humanism, and how does it limit these different approaches to thinking about LLMs?

Leif Weatherby: Remainder humanism is the term I use for the way we have painted ourselves into a corner theoretically. The operation is just that we say, “machines can do x, but we can do it better or more truly.” This sets up a kind of John-Henry-versus-machine competition that guides the analysis. With ChatGPT’s release, that kind of lazy thinking, which had prevailed since the early days of AI critique, especially as motivated by the influential phenomenological work of Hubert Dreyfus, hit a dead end. When machines can produce smooth, fluent, and chatty language, everyone with a stake in linguistic humanism freaks out. Bender retreats into the position that the essence of language is “speaker’s intent”; Chomsky claims that language is actually cognition, not words (he’s been doing this since 1957; his NYT op-ed from early 2023 uses examples from Syntactic Structures without adjustment).
But the other side are also remainder humanists. These are the boosters, the doomers, as well as the real hype people—and these amount to various brands of “rationalism,” as the internet movement around Eliezer Yudkowsky is unfortunately known. They basically accept the premise that machines are better than humans at everything, but then they ask, “What shall we do with our tiny patch of remaining earth, our little corner where we dominate?” They try to figure out how we can survive an event that is not occurring: the emergence of superintelligence. Their thinking aims to solve a very weak science fiction scenario justified by utterly incorrect mathematics. This is what causes them to devalue current human life, as has been widely noted.
Basically, we have two brands of “humanism” that both utterly fail to pay attention to the actual technologies at hand. The bright line between machine and human that they take as a premise prevents them from looking at language as such, and so allows them to go on letting either reference or intent be the essence of language. That’s a backhanded way of accepting the premise that cognition is utterly separate from culture, a premise that is itself the subject of serious scientific debate.
The position I take here is that LLMs don’t really solve that debate for human minds but do extend our ability to see truly cultural systems in action and give us some reasons to revisit that debate. Literary scholars dropped out of serious discussions of this type of issue about a generation ago, maybe even two, as I narrate polemically in chapter one, “How the Humanities Lost Language.” LLMs are calling us back into this world, to which we need to catch up fast.
RM: In order to overcome these pitfalls, you turn to French structuralism—a tradition that articulated similar theoretical concerns long before LLMs. In the first chapter, you contrast the structuralist account of language with two others: the “syntactical” approach developed by Chomsky that was so pivotal in the development of linguistics and the “statistical” view that has been more influential in cognitive and data science. To be very schematic: in the syntactical view, the output of an LLM is nothing more than a party trick, as it has no relation to the internal cognitive laws of language; whereas in the statistical view, the output of an LLM is indistinguishable from human writing. How does LLM output appear in your structuralist view? What does this suggest for how we think about the familiar question of intelligence?
LW: I’m not sure if the modelers would claim indistinguishability, though they seem bullish. It’s a weird area, which the book dwells on. On the one hand, we’re pretty sure these systems don’t do anything like what humans do to produce or classify language or images. They use massive amounts of data, whereas we seem to use relatively little; that data is highly structured, ours is not; etc. On the other hand, LLM outputs are indistinguishable from human writing in one important sense, which is that we have no tools to consistently identify which is which. Teachers and professors believe they can tell, but they’re basically wrong. We can often see little traces that are obvious—“sure, let me help you with that,” the overuse of the word “delve,” etc.—but studies show that neither human eyes nor quantitative tools can make this separation well. My university turned off the plagiarism software we previously relied on, Turnitin, almost as soon as ChatGPT was released, and hasn’t turned it or anything else back on.
We’re in this uncanny area where writing has been wrested away from us. It feels like language has gone the way of vision, music-making, and so many other essentially “human” capacities that were extended into media applications—gramophone, film, typewriter, to invoke Friedrich Kittler’s triumvirate of analog media. Language is an odd one, though, because it has no analog counterpart. It is the example of artificiality, made up of symbols that cut through its analog medium, sound. Because these systems work from text, there’s not even the element of sound; we’re faced with pure textual symbols (tokens) interacting with mathematical techniques of a particular variety. The math is relatively simple but not very elegant, and somehow all this results in language—real language, I claim. The talk out there about “synthetic text,” a phrase the AI critics use, makes no sense to me. Since Socrates, we have understood text as the demarcator of artificiality, of “synthesis.” All text and all language is synthetic—a point that I have developed in conversation with the philosopher Beatrice Fazi, who argues that these systems contain a structuralist distribution of concepts as well.
The reason I turn to structuralism is simple: European structuralism after Saussure thought of language as the totality of linguistic signs in their relation to each other. That totality is dynamic, but it is the whole that determines meaning, not “reference” alone. So the system isn’t built up of designations—“a rock,” “my mother,” and so on—but is systematic first. The system precipitates the referential and other local functions of language. LLMs are trained in such a way as to suggest that model before you get into any details. A massive amount of text is preprocessed and compressed into a giant matrix—the model—and you sample from that when you query the model. What is common between the structuralist view of language and LLMs is language as such—not psychology, not metaphysics, not logic. The very fact that we cannot distinguish between output from LLMs and output from humans—which is causing the “crisis” of writing, arts, and higher education—is evidence that we have basically captured language along its most essential axis. That does not mean that we have captured “intelligence” (we have not, and I’m not sure that that’s a coherent idea), and it doesn’t mean that we have captured what Feuerbach called the “species being” of the human; it just means that linguistic and mathematical structure get along, sharing a form located deeper than everyday cognition.
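[The train-then-sample picture Weatherby describes can be caricatured in a few lines of code. What follows is a toy bigram model, a sketch added for illustration, nothing like a transformer in scale or architecture, and every name in it is invented for this example. It shows the structuralist point in miniature: the “model” stores nothing but relations between tokens, and querying it means sampling from that compressed matrix of relations, with no designation or reference anywhere in sight.]

```python
import random
from collections import defaultdict

def train(corpus_tokens):
    """Compress a token stream into a matrix of transition counts.
    Each token's 'value' is nothing but its relations to other tokens."""
    counts = defaultdict(lambda: defaultdict(int))
    for prev, nxt in zip(corpus_tokens, corpus_tokens[1:]):
        counts[prev][nxt] += 1
    return counts  # the "model": pure relational structure, frozen after training

def sample(model, start, length, seed=0):
    """Query the frozen model: draw a continuation token by token,
    weighting each next token by how often it followed the last one."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length):
        nxt_dist = model.get(out[-1])
        if not nxt_dist:  # dead end: the last token never had a successor
            break
        tokens, weights = zip(*nxt_dist.items())
        out.append(rng.choices(tokens, weights=weights)[0])
    return out

corpus = "the cat sat on the mat the cat ate the rat".split()
model = train(corpus)
print(" ".join(sample(model, "the", 5)))
```

[Real LLMs replace the count matrix with billions of learned parameters and condition on long contexts rather than one previous token, but the two-phase shape, compress a corpus into a static structure, then sample from it at query time, is the same.]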
RM: Let’s stay with the first chapter for a moment. In a suggestive move, you claim that these different approaches to language are animated by a more fundamental philosophical disagreement over the appropriate model of empiricism and type of evidence for the study of language. You characterize Chomsky’s project as a Kantian one, in the sense that it searches for the transcendental laws and categories of language—the specifically human conditions of possibility for the experience of language in the first place. And you suggest that the statistical approach to language is a Humean one, locating meaning only in the observation of constant conjunction between words. By contrast, you claim that the structuralist approach to language is a dialectical one—noting parallels between Saussure’s account of linguistic value and Marx’s value theory. As you flag in a note, this is a striking claim, given that many of the most prominent figures in the structuralist tradition took themselves to be charting a path away from Hegel and Marx. Do you think this view leads you to a version of structuralism that departs in some ways from its original formulations? And how does this structuralist approach advance beyond the two alternatives you trace?
LW: No, I don’t think that I’m deviating from Saussure or Jakobson on the score of dialectics. I think that Saussure’s definition of the signifier-signified relationship is pretty clearly a variant of the internal contradiction that Marx locates in the commodity (use and exchange). Moreover, the idea that the value of a word is determined by its position in the total matrix of all words—that it has “no positive content”—is exactly how Marx thinks about the problem of value and price. All that goes back to some very complex stuff that emerged in Tübingen in the 1790s, when Hölderlin, Schelling, and Hegel were roommates.
There are a couple reasons that we tend not to think of structuralism as dialectical. The obvious one is that structuralism was narrated primarily by the post-structuralists, who were vehemently anti-Hegelian and anti-Marxist for reasons that I don’t totally understand. Take Jakobson: he tries to figure out the sound problem, and he digs into this extremely difficult area where meaning really resides in the indifference between sound and sign. In that area, according to his friend Claude Lévi-Strauss, he was a “dialectical” thinker. Together, both were also open to early language modeling: Jakobson was at some of the Macy conferences, where Claude Shannon’s work in this area was discussed, and he often refers to it. The second reason is that Lévi-Strauss famously rejected Marxism in favor of the “scientific” procedure of his newly-founded structuralism. But that rejection has little to do with dialectics; it has more to do with Sartre’s place in the French public sphere in the 1950s, the end of Western support for Stalin after 1956, and so on. A system characterized by a totality that we cannot apprehend, but which is expressed through constitutive, locally generative contradictions, is dialectical—full stop.
Neither of these aspects of structuralism ever appeared in my training in literary theory, and it was often assumed that these figures shared more with post-structuralism than with nineteenth-century thinkers like Marx, let alone Hegel. But that picture is skewed and outdated, and the facts just don’t bear it out. I think we’ll adjust that picture as we continue to historicize this stuff, but I didn’t want to get bogged down, so I wrote this chapter as a multi-pronged polemic.
My book provides a more complete account of why this view is superior to the transcendental and statistical-empirical ones. The short version is that the transcendental one always fumbles the actual empirical stuff—you get Chomsky and his followers essentially saying that words aren’t language—and the statistical-empirical one is positivist in such a way that it amounts to optimization without enlightenment. But the structuralist view, and dialectics itself, have lacked any serious engagement with, or development of, quantitative thought. I have an essay co-authored with Matthew Handelman on “digital dialectics” forthcoming, which makes the other half of this argument. The problem is that, in mainstream critical and literary theory of the twentieth century, encounters with real quantitative thought were always fleeting or glancing. We need to fix that, but not in such a way that we cede the core of the insight that comes from German Idealism, Marxism, and structuralism.
RM: The middle section of the book develops twin arguments about cognition and computational generation, both of which come to look like parts of a more general cultural-semiotic process. In chapter two, you interpret Warren McCulloch and Walter Pitts’ early work on neural nets as establishing the necessary formal characteristics of any possible intelligence, whether instantiated in a brain or a computer. And in chapter four, you follow Juan Luis Gastaldi’s argument that the power of computational neural nets today has confirmed the structuralist account of language. I’m interested in how these chapters sit alongside historical scholarship on the relationship between structuralism and computational accounts of language or mind, such as Bernard Geoghegan’s Code and Lydia Liu’s The Freudian Robot. How do you think about the relationship between your analysis and these more historical accounts? Geoghegan’s book, for example, develops a more skeptical position on this connection than your own, arguing that there is something ultimately technocratic about the structuralist analytical frame. How does your approach to these ideas differ from that one?
LW: I am separately working on a book about cybernetics and German Idealism. Throughout the history of the creation of the formalized cognitive sciences, from the start of analytical philosophy around 1900 and the invention of digital computing—which together led, by way of cybernetics, to the rise of computer and cognitive science separately—the issues that Kant laid on the table and Hegel elaborated have been neatly suppressed in such a way that they constantly return. In cybernetics, though, there was acute awareness of this problem. From Norbert Wiener to Gregory Bateson and Margaret Mead to Warren McCulloch and on to later figures, we find a consistent but subterranean focus on issues like the “synthetic a priori” and dialectical problems around symbol-use. To put it in modern terms, cybernetics was trying to answer the question of how to combine symbols and learning, or rationalism and empiricism. Because of the relatively deep classical training most of them had, they leaned on the most obvious example: Kant. This is a thumbnail of my historiographical argument, one that would sit alongside and perhaps differ a bit from Liu’s, Geoghegan’s, Hayles’s, and Dupuy’s accounts.
Language Machines is a different kind of book. I tell stories in it, but none of them pretends to be comprehensive, and most of them are not really original, which I document. It’s a theory book, pure and simple. At some point, we all got on this strange track where we try to do history and theory at the same time, and the resulting tendency has been to tease out “theoretical lessons.” I think that there are some pretty hard limits to that approach. Even looking beyond the question of how to square the theory-history circle in writing, there are questions of audience that make this pretty much impossible. This book, for better or for worse, is a thought experiment, and it’s one that stakes its claim on the idea that LLMs are massively influential and potentially informative technologies, while also primarily being tools for a pretty ho-hum story about the proletarianization of intellectual labor in capitalism.
As I’ve recently written, LLMs are a big update to the practice of bureaucracy in the modern world. Kittler’s idea that bureaucracy and poetry are two sides of the same coin gets a major swing here in favor of language as a way to control data and computation. The current panic arises from the deeply rooted assumption that what makes us truly human is language. Structuralism is the best account so far of the not-altogether-human aspects of language, just as Idealism is the best account of the not-altogether-human aspects of concepts. I just can’t work my way from stories about “Cold War subjectivity” or the “ontology of the enemy” or “data colonialism” to the kind of analysis I want to do here. I think that we need a more frankly theoretical approach that adapts our significant toolkit to the history of the present—say, 1970 to now—without hiding behind superannuated buzzwords. If AI is going to be cultural, then we need a straightforward cultural theory of AI that is in conversation with the other disciplines that have central stakes in the AI project.
RM: A central claim of chapters four and five is that language and computation share form—minimally, that they are commensurable and possibly interchangeable systems of meaning. This raised for me one of the classical knots in structuralist thought: history and the question of the diachronic. Saussure, of course, had to bracket the diachronic in order to stabilize the process of signification and fix language as a proper scientific object, whereas poststructuralist thinkers after him would give up on the stability of meaning and the claims to science in order to reintroduce historical transformation. How do you think about these problems in the history of structuralism, and do you think anything changes with those problems in light of your analysis of computation? Do you take there to be any relevant difference between the historical existences of language and computation?
LW: All contemporary neural net-based systems are trained on some usually enormous amount of data, but they don’t train during the actual interactions they carry out. The result is that you’re always querying a massive cross-section of synchronic data, one that is always being updated on the back end in a number of ways, but which is nevertheless a slice through a recent state of language as text. You could say its diachrony is heavily interrupted, filled with non-linguistic intensions like “safety” that sit very awkwardly with its mathematico-linguistic core. The way that the training updates happen is a bit obscure because it’s often a corporate secret to one extent or another, but it’s not happening in dialogue, which we take to be a central feature of human language development and use. The reasons for this discontinuity seem to trace back to Microsoft’s Tay chatbot, which learned to deny the Holocaust within hours of its release into the wilds of the internet. So, there are safety concerns that prevent product development of this kind. That is a general weakness of these extremely expensive systems: scientifically testing their limits is difficult because you have to work with the very limited set that actually exists. (It would cost hundreds of billions of dollars to do it properly and would have to be done primarily in academia. The federal funding environment, for all its friendliness to AI companies, isn’t exactly trending that way.)
My own intuition is that the distribution of meanings produced by language machines heavily favors the present. I’ve seen evidence of them “hallucinating” in a way that seems to come from particular meme-based language—stuff with high signal on the internet that might be unrepresentative of spoken English, let alone the historical record. But when I started to think about this during the writing of the book—and I was revisiting Saussure—I realized that this, too, is an almost exact match with his linguistics. As you say, he “bracketed” the diachronic, but it was a principled stand. He thought of meaning as deposited from earlier periods of use as a kind of sediment. We’re capable of—and I think academics and writers and readerly people are particularly prone to—thinking of history as informing or even dominating present usage (if you’re of a very broadly speaking Heideggerian, Herderian, Boasian or Sapirian, or even Wittgensteinian stripe, you’re probably going to take that precedence as an assumption). But Saussure criticized that view: the present extent of the system of values is the result of an accrual process, but it contains far more internal, systemic signal about meaning than any study of the past can show.
The only issue, of course, is that no one has access to the total distribution. LLMs are not the total distribution, but they’re a far larger chunk of it than we’ve ever before been able to see or play with. So even if you “only” train them on what is now approaching 10 trillion tokens and then don’t allow them to continue to learn, the compressed model contains a shocking majority of the entire written record (and thus diachronic signal) as well as a cross-section of our textual synchronic language. It’s important that they’re text machines. In the book, I avoid going into great depth about the difference between spoken and written language, basically because I think that, after a few thousand years of textual culture—and its spread through moveable type, the typewriter, the internet, and social media—we’re at a place where it’s pretty hard to grasp or locate the line between “language” and “text.” Maybe that line would have been easier to study in Socrates’s Athens. Today, cognitive science tries to neatly box off these questions about media into “culture,” but then suddenly everyone is willing to conflate compressed text data with cognition (or, as the critics do, dismiss that compression). These questions haunt both model and experiment, which attempt to exclude culture from precisely that place—text—where culture is most obviously deposited. That’s why I go after cognitive science polemically. For the first time, we now have quantitative access to the distribution of that cultural totality, however partial and imperfect; that is the event that I’m trying to describe. That’s what made me write the book.
RM: Your book calls for a return to structuralism and, relatedly, the development of a “general poetics” that would study all forms of meaning generation. What theoretical potentials do you think remain untapped in structuralism, and what new paths of development might we take them in today? Do you understand your call as running in parallel to other recent attempts to reinvigorate the structuralist and formalist traditions in literary and media studies, such as Anna Kornbluh’s The Order of Forms?
LW: We need to integrate a literary-theoretical approach—structural or otherwise—with cognitive and data science. I don’t think that it will be an easy or very comfortable process, and it might not work at all (which would be informative in itself). The glancing encounters of Jakobson, Barthes, and even Derrida with cybernetics need to be developed into theories that make sense of what we have before us now: numbers and words interoperating systemically without concepts producing the interaction or the outputs. My critique of Derrida basically just concerns his misreading of Saussure on the score of “writtenness.” Saussure’s notion of the signifier is plenty “material.” Derrida, together with his school, completely stopped paying attention to linguistics and to the concrete history of writing as a medium or through media. In his terms, these “restricted economies” are what give any theoretical thinking friction—without them, we end up with a practice of “close reading” that is hard to even define.
This book has a much smaller object than Sianne Ngai’s Our Aesthetic Categories or Anna Kornbluh’s Immediacy, but I regard those as the most significant recent works in literary theory. What makes them so powerful is that they both turn the crucial aspects of Jameson’s understanding of postmodernism into concrete analysis. However, I disagree with Kornbluh’s narrative of the “return of the imaginary” in immediacy. To my mind, the dialectic of the imaginary and the symbolic remains more or less as Lacan described it in Seminar II, when he was thinking through cybernetics using Hegel. It is more or less wrapped around feedback technologies like networked digital computers, just as Lacan laid out there. Kornbluh’s account is awesome; I’m just adding that—especially now, with the development of “multimodal” AI systems based on giant corpuses of text-image data—we need to develop a theory that focuses on that relationship between text and image. Structuralism, for me, is the impulse to investigate the systems of representation in question—in the multimodal case, quantity + image + language—without the motivation of the human-machine divide. Leaving behind remainder humanism does not mean saying humans and machines are the same; it means refusing to allow an obsessive compulsion over the putative difference between them to obscure analysis. I see that project as in harmony with anything downstream from Jameson’s historical aesthetics.
As I turned back to structuralism to make sense of language machines, I often pointed out to those in my Digital Theory Lab the need for this hybrid theory of language and image minus cognition. To date, the vocabulary we’ve come up with to address it is very limited. Historically, it seems like art history, media theory, and literature departments just haven’t gotten along well enough to produce it, and cognitive science has a limited interest in cultural issues of this type. I hope that this book captures the first moment of this new cultural type of AI with enough depth for other, similar projects to build on, even if by rejection or disagreement, so that we don’t miss the moment. There is a ton of talk about how AI threatens the humanities, and all of the usual discouraging institutional and economic reasons for that are correct. But, if we can muster the strength, I think that the humanities has a generational opportunity here, too. Data science crossed the Rubicon into our territory, and we need to answer the call—even if prevailing forces are against us.
Robin Manley is a PhD candidate in the Department of Rhetoric at the University of California, Berkeley, with affiliations to the Program in Critical Theory and the Center for Science, Technology, Medicine, and Society. He is also Coeditor in Chief of the journal Qui Parle: Critical Humanities and Social Sciences. He works on the intellectual history of cybernetics and its relation to post-Kantian philosophy, media theory, and social thought.
Edited by Zac Endter.
Featured image: Visualization of “attention” in an LLM, produced with BertViz and upscaled with AI.