What I’m watching: Cutting Edge Engineering Australia on YouTube
What I’m listening to: Can’t Buy a Thrill by Steely Dan
A hobby I enjoy: Bicycle mechanics
Chatter about ChatGPT is everywhere. In talking to localgov analysts and managers, I hear uncertainty about what it is and what it means. There’s anxiety about whether it will eliminate jobs, and the occasional giddy vignette about how ChatGPT just saved someone’s bacon on a deadline.
I’m mostly gloomy about ChatGPT. I’ve built, validated, and critiqued statistical models for a decade, and I’m used to them being oversold and misused. I think tools like ChatGPT will mostly make institutions less smart and less efficient, even as an increasingly challenging world demands the opposite. But it doesn’t have to be this way. Paradoxically, in a world where robots can write, I think the route to smart, efficient institutions runs through humans reading – slowly, intently, collectively, but selectively.
Large Language Models in Brief
First, to get clear on some terminology, from general to specific:
- Artificial Intelligence (AI) is a discipline going back to the 1950s interested in “whether or not it is possible for machinery to show intelligent behavior”.
- Generative AI refers to a recent family of statistical models in the AI tradition that generate media from prompts supplied by users. Generative AI includes products that generate images resembling photographs or artwork and videos such as “deepfakes”.
- Large Language Models (LLMs) are one kind of generative AI. They are gigantic statistical models fit to huge volumes of conversational text. They generate prose and poetry from prompts and through back-and-forth interaction with users.
- ChatGPT is one embodiment of an LLM from a company called OpenAI. ChatGPT might be ubiquitous like Google Chrome in ten years. But it might be dead like Netscape Navigator. LLMs are here to stay, even if ChatGPT isn’t.
LLMs Cannot Think
It’s important to take care in how we talk about LLMs. The way that ChatGPT writes in the first person about itself? The way it feeds you its output one word at a time, instead of all at once as produced under the hood? All on purpose. Clever product design, marketing, and reporting are being invested in the illusion that LLMs can think. They cannot. Propagating this illusion, at bottom, is a power grab by people in the technology industry.
We say that LLMs “hallucinate” when they employ faulty logic, cite nonexistent sources, and gaslight their users, among other sins – but LLM hallucination is not a treatable mental illness. If you fit a statistical model to human text, it will operate as designed: generating human-looking streams of words, irrespective of truth, and reproducing the nasty ways humans sometimes talk to each other.
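To make the “statistical model” point concrete, here is a toy sketch of my own (it has nothing to do with OpenAI’s actual systems, which are vastly larger and more sophisticated). It picks each next word based only on which words tended to follow it in some training text. The core move is the same as an LLM’s: predict a plausible next word, with no notion of truth anywhere in the mechanism.

```python
import random

# Tiny "training corpus" of made-up sentences; a real LLM trains on
# billions of words, but the principle is the same.
training_text = (
    "the council approved the budget . "
    "the council rejected the proposal . "
    "the budget passed after the vote ."
)

# Count which words follow which word.
follows = {}
words = training_text.split()
for prev, nxt in zip(words, words[1:]):
    follows.setdefault(prev, []).append(nxt)

def generate(start, n_words, seed=0):
    """Generate text by repeatedly sampling a statistically likely next word."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(n_words):
        options = follows.get(out[-1])
        if not options:
            break
        out.append(rng.choice(options))
    return " ".join(out)

print(generate("the", 8))
```

The output will look grammatical, and it may assert things the training text never said (the council approving the vote, say) – fluent, confident, and indifferent to whether it’s true.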
While LLM developers are working hard to fix the most problematic behaviors, these superficial repairs will never make LLMs think. One day, perhaps a machine will be able to think in the way that a human thinks: scrutinizing and synthesizing evidence, employing logic, applying moral reasoning, and judiciously hedging on uncertain conclusions. That machine will be completely different from an LLM.
Having observed how institutions operate and make decisions across sectors, I keep landing on an unexpected analogy for LLMs: Microsoft PowerPoint.
Before PowerPoint, slide presentations were hard to produce. Physical slides were chemically developed in a photographic lab, then loaded into a projector carousel. The effort of production encouraged painstaking work to get one’s thinking clear in advance. PowerPoint eliminated that effort; presentations could now be whipped up in mere hours. Output expectations rose to match the new tools: soon people were expected to produce PowerPoints quickly, even if the thinking was not ready. Visually compelling slides, a cinema-style meeting format, and a presenter’s social privilege made up for not getting things right.
This is the status quo for most of Corporate America, as well as for politically protected public institutions like the Pentagon. PowerPoint ate them whole long ago. I expect LLMs to do the same: they do for prose exactly what PowerPoint did for slides. These institutions will simply cede one more communication channel – the written space of emails, employee chats, and reports – to rushed nonsense that nobody can trust. Deluged with untrusted prose, people at these places will find it even harder to think straight. But the institutions will plod along, because they are big and rich: they have enough scale to waste a lot of smart people’s attention on getting anything done and still survive.
Small, lean institutions – like local governments – cannot afford to waste their human capital like this.
A Culture of Written Communication
The analogy of LLMs to PowerPoint points toward an alternative: the opposite of PowerPoint culture, a culture of written communication. As a data scientist at Capital One (where PowerPoint culture approaches a religion), I was startled when I learned how things were done at Amazon. In a letter to shareholders, Amazon’s founder Jeff Bezos wrote:
We don’t do PowerPoint (or any other slide-oriented) presentations at Amazon. Instead, we write narratively structured six-page memos. We silently read one at the beginning of each meeting in a kind of “study hall.” […] It would be extremely hard to write down the detailed requirements that make up a great memo. Nevertheless, I find that much of the time, readers react to great memos very similarly. They know it when they see it. The standard is there, and it is real, even if it’s not easily describable. […] A great memo probably should take a week or more [to write].
When I first learned about the Amazon six-pager, I was struck by the writing half of this workflow. It’s still crucial – writing itself forces people to confront and clarify their own thinking in a way that PowerPoint does not.
Today, now that robots can write, I notice the reading half much more. If your jurisdiction were to adopt six-page memos and in-person study halls today, LLM text would almost certainly wind up in those memos. Perhaps it even should.
But if a memo meets the approval of a room full of colleagues scrutinizing its contents, that means something. It likely means the author has gotten really clear about the facts and nuance of a matter, intensively editing any LLM material they used. Collective scrutiny bestows trust on a document; trust bestows agility and efficiency.
Study Hall at City Hall
Even before LLMs, we faced information overload. LLMs are making that problem far worse, promising to inundate every text medium with plausible-sounding, potentially misleading, robotic prose. Nobody can read all of it. But if people get together, choose to read the critical parts, and mark those parts safe through collective discussion, there will be a foundation of trustworthy documentation to build on.
There are no technical barriers to memos and study halls. In a world where robots can write, memos and study halls are the best response I’ve found so far.
Of course, if your team tries this, I’d love to hear about it.