An in-depth history of Large Language Models—and what their ubiquity, disruption, and creativity mean from a wider sociopolitical perspective.
In November 2022, ChatGPT swept the globe with a mixed frenzy of excitement and anxiety. Was this a step closer to reaching singularity or just another marvel in machine learning? Author Stephan Raaijmakers provides a comprehensive introduction to Large Language Models (LLMs), describing what exactly they are capable of from a technical and creative standpoint. This concise volume covers everything from the architecture of LLM neural networks to the limitations of LLMs to how our governments can regulate this technology. In explaining how exactly LLMs learn from data sets, Raaijmakers defangs the more sensational arguments we may be familiar with. Instead, he offers a more grounded approach to how this groundbreaking—and increasingly ubiquitous—form of artificial intelligence will shape our society for years to come.
It's written for executives, sits 18-24 months behind the state of the art, and misses developments critical to the period it covers, such as RAG and inference-time scaling. It pushes constructor theory without mentioning current memory improvements. The forward-looking part is weak and mostly states the obvious.
This does a very good job both of introducing (and covering quite a bit of) the subject and, most importantly, of providing a wide-ranging discussion of the limitations of these techniques from several perspectives.
It accomplishes this in a little over 200 pages of high-quality, accessible writing (which raises the odds that most people who start this book will actually read it to the end).
There is an extensive set of references for following up ideas in more detail.
It is part of MIT Press’s “Essential Knowledge” series of similarly excellent slim volumes on important scientific topics.
All in all, I feel that Large Language Models is a pretty solid book if you want an overview of what these systems are under the covers.
What I liked most is that the author always keeps it grounded. Raaijmakers keeps coming back to a simple truth: LLMs are designed to produce plausible-sounding language, not guaranteed facts. That one idea explains why they feel so capable, and why they can also mislead with confidently stated falsehoods.
From a practical point of view, the book is most useful when discussing LLM limitations. Hallucinations aren’t an edge case. Bias is not a side issue. And governance really matters because a very small number of private organisations control the data, compute, and design choices that shape how these tools work and behave (often with little transparency).
I also liked that the text doesn’t stop at 'how LLMs work.'
In a very introductory way, the book gets into how to use LLMs effectively and responsibly (chapters 5 through 7). Here, the discussion of grounding (e.g., RAG, tool use/function calling, graph-RAG, and stronger guardrails) is especially relevant if you’re trying to bring these tools into real AI workflows where reliability, accuracy, and trust matter.
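For readers new to the grounding idea the book discusses, here is a minimal sketch of the RAG pattern. This is not code from the book: the tiny corpus, the naive keyword-overlap retriever, and the `generate()` stub (standing in for a real LLM call) are all illustrative assumptions.

```python
import re

# Toy document store; a real system would index far larger collections.
CORPUS = [
    "LLMs are trained to predict plausible next tokens, not to verify facts.",
    "Retrieval-augmented generation grounds answers in retrieved documents.",
    "Guardrails filter or constrain model inputs and outputs.",
]

def tokens(text: str) -> set[str]:
    """Lowercase word tokens, stripped of punctuation."""
    return set(re.findall(r"[a-z]+", text.lower()))

def retrieve(query: str, k: int = 2) -> list[str]:
    """Rank documents by naive word overlap with the query (a stand-in
    for vector search in a production retriever)."""
    q = tokens(query)
    ranked = sorted(CORPUS, key=lambda d: len(q & tokens(d)), reverse=True)
    return ranked[:k]

def generate(prompt: str) -> str:
    """Placeholder for an actual LLM API call."""
    return f"[model would answer from]\n{prompt}"

def answer(question: str) -> str:
    # Build a prompt that grounds the model in retrieved context.
    context = "\n".join(f"- {doc}" for doc in retrieve(question))
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
    return generate(prompt)

print(answer("How does retrieval grounding reduce hallucinations?"))
```

Production systems swap the toy retriever for vector search and the stub for a real model call, but the shape of the loop, retrieve, build a grounded prompt, then generate, is the same.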
It’s not a particularly light read for pure businesspeople, and it’s not trying to be. Equally, it is not one for AI purists (who criticize it for its 'simplicity').
Seems the author cannot win either way, but, in my personal view, he does a decent job of steering a middle ground (which I think was his intent).
And if you want a clearer mental model of LLMs and how to think about designing effective use cases for generative AI, then this is probably worth your time, at least as a starting point.