Statistical Rethinking: A Bayesian Course with Examples in R and Stan builds readers’ knowledge of and confidence in statistical modeling. Reflecting the need for even minor programming in today’s model-based statistics, the book pushes readers to perform step-by-step calculations that are usually automated. This unique computational approach ensures that readers understand enough of the details to make reasonable choices and interpretations in their own modeling work. The text presents generalized linear multilevel models from a Bayesian perspective, relying on a simple logical interpretation of Bayesian probability and maximum entropy. It covers material ranging from the basics of regression to multilevel models. The author also discusses measurement error, missing data, and Gaussian process models for spatial and network autocorrelation. By using complete R code examples throughout, this book provides a practical foundation for performing statistical inference. Designed for both PhD students and seasoned professionals in the natural and social sciences, it prepares them for more advanced or specialized statistical modeling. Web Resource: The book is accompanied by an R package (rethinking) that is available on the author’s website and GitHub. The two core functions (map and map2stan) of this package allow a variety of statistical models to be constructed from standard model formulas.
I bought version 1 of this book, because I cannot currently afford/justify version 2. This may have something to do with the fact that it was a real slog to install the software which I needed in order to run the example code, or perhaps it was just that hard. Regardless, it was a teeth-grinding experience, but for some reason I prevailed in querying the internet (and in one case a real live person) as to how to get the many steps accomplished. If I had not been in as good a mood when I started, I might have given up on the book right then, before starting.
But, I'm so glad I persevered. Once set up and able to follow along, this book was a wonder.
There were chimpanzees deciding whether or not to share a banana with another chimp. There was an investigation of gender bias in admissions at UC-Berkeley. There was an investigation into the (real, but non-causal) relationship between how many Waffle Houses a state has, and its divorce rate, a correlation which is especially surprising given that it is stronger than the correlation to the marriage rate. There was a parable of Good King Markov and his island kingdom. There were frequent references to golems.
There was a fair amount of code, most of it in the language R, but every once in a while a bit in Stan. Many sections ended with a section called "Rethinking", and sometimes also with one called "Overthinking". There were a lot of charts and graphs, most of which the author provided the code to make (or alter) yourself.
There was a Nick Cave reference.
It managed to teach me a good bit about how to use Bayesian statistical modeling to tackle messy, real world data, and turn it into information. It did so in a way which was eminently readable, although I did have to take a break for a day every chapter (or less) simply because my brain was full of so much new stuff that I needed to absorb. But, I was always eager to get back to it the next day.
It is a masterwork, all the more impressive given the low standards of readability in the textbook field, generally.
This is a great book, one that will influence me for the rest of my life as a data scientist. The style is informal, adventurous and open. It very much treats statistics as an open discipline where many approaches can "make sense", and it's all just a big playground really. This style, which is somehow common among Bayesian statisticians, speaks to me much more than the rigid, one-correct-test thinking that you can find in some books on statistics. Statistical Rethinking really inspired me, and the explanations and examples, particularly those in chapters 12 and 13 on multilevel models, correlations and Gaussian processes, really improved my understanding of those topics.
I found a couple of issues with the book that make me not award it all five stars. I think these come from individual preferences and background, and for many this could be the perfect introduction to Bayesian modelling. In fact, I have recommended this book to many already. But, first of all, the arguments are sometimes a bit long and wordy. The writing is intellectual and creative, but I would have appreciated some succinctness here and there, perhaps some rigorous maths in the right places. Second, I don't really like the `rethinking` package. It's not that much more difficult to write `stan` programs as a beginner, and I think experience with `stan` is more valuable. I had a couple of annoyances with the package as well, where the error messages and behaviour were confusing. At one point it took me two hours to find out that I had used floats where only integers worked, because the error I got just wasn't clear at all. I think it's not a good choice to base a book on a package with only a single maintainer, in this case the author himself. As a last aside, the rest of the code is really old-school R, with no `dplyr` or `ggplot2`. This may be the right choice by the author for others, but I would have preferred the use of the `tidyverse` throughout the book.
The book is based on a course by the author, and the lectures can be watched on YouTube. I really enjoyed those lessons as well. All in all it was a great experience to work through this book and the exercises.
Third update: I am now at Chapter 11 and couldn't help but come here to say something. I have to say this book is really idiosyncratically written, and the arrangement of chapters is beyond me; it sometimes gives me the impression that the author is arranging the text in this way just to show off his stats knowledge (which should never be the driving force of a textbook in my view). The clarity of the content is not very stable: sometimes I find his explanation really clear, whereas other times it is super frustrating to understand him. In sharp contrast, I find Doing Bayesian Data Analysis by Kruschke much more accessible and logical.
Second update: Ok. A couple of weeks passed and I am now about to start Chapter 8. Happy to revise my rating to four stars, as I kinda got used to the style and went easy on myself by not trying to decode everything the author writes and all of his R code. For future readers, be mentally prepared for Chapter 4, as it is a hurdle to jump (I had to read it twice). Once you have passed that, the next few chapters become more accessible in comparison.
First review: I am only at the fourth chapter and willing to change my rating if I do get to finish it. TBH, reading all those perfect reviews makes me doubt myself... why am I not enjoying this book?? why do I keep having the feeling that I need to read other materials to complement it? As a matter of fact, I did have to read another book in parallel with it (A Student's Guide to Bayesian Statistics, much better from my experience!), as I don't feel the conceptual bit is being fully explained. The arrangement of content is also not entirely logical; it seems to aim at a shortcut to doing Bayesian modelling right away without laying a solid conceptual foundation for it. I do still appreciate the codes and sides, but the author's idiosyncratic style is really an acquired taste. At least for now.
BTW, I am reading together with a group of students motivated to learn stats (mostly with a frequentist background), and we go through the book and exercises very seriously. I can tell you I am not alone in the struggle. Perhaps this book is really only intended for purists who only learned Bayesian.
This book is a one-stop shop for learning statistical modeling.
The first six chapters demonstrate many of the concepts in Bayesian statistics and linear models, using fully-worked examples in R. Note that the R code leans heavily on STAN (through the rstan package) and the author's own rethinking package. This makes the examples small enough to be workable, and the mechanisms employed in the rethinking package are fully explained.
Chapters 7 through 12 gradually introduce new and more powerful modeling concepts, and things start to get complicated somewhere around chapter ten.
The last two chapters, 13 and 14, put the rest of the book into practice. This is where the models being developed finally start to feel useful, instead of somewhat contrived. I'd fallen out of the habit of doing the examples around chapter 8, but came back to it for these two. Well worth it.
McElreath is one of the new breed of statisticians calling for sanity and reproducibility (à la Statistics Done Wrong: The Woefully Complete Guide, How Not to Be Wrong: The Power of Mathematical Thinking), and his coverage of statistical modeling is not to be missed for the practitioner. He also happens to have written the only statistics textbook that I've read cover-to-cover (and among this CRC "Texts In Statistical Science" series, the only one I've even gotten halfway through).
In short: if you're doing stats, or learning to, read it.
Well written and a good introduction to Bayesian regression, even without any prior knowledge about regression. Highly recommend the accompanying YouTube series as well!
I was fortunate to read this excellent book. It's opinionated, often quotable, and a fair summary is that "you can imagine your own generative process, simulate data from it, write the model, and verify that it recovers the true parameter values. You don't have to wait for a mathematician to legalize the model you need." (page 376) Recommended.
"Thinking generatively—how the data could arise—solves many problems. Many statistical problems cannot be solved with statistics. All variables are measured with error. Conditioning on variables creates as many problems as it solves. There is no inference without assumption, but do not choose your assumptions for the sake of inference. Build complex models one piece at a time. Be critical. Be kind." (page 553)
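The simulate-then-recover workflow quoted from page 376 can be sketched in a few lines. This is only a minimal illustration and not code from the book: it is written in Python with the standard library rather than R, and the proportion model, the value `TRUE_P = 0.7`, the flat prior, and the helper names `simulate_tosses` and `grid_posterior` are all assumptions chosen for the example.

```python
import math
import random

TRUE_P = 0.7     # hypothetical "true" parameter of the generative process
N_TOSSES = 500   # hypothetical sample size


def simulate_tosses(p, n, seed=42):
    """Generative process: count successes in n independent trials."""
    rng = random.Random(seed)
    return sum(1 for _ in range(n) if rng.random() < p)


def grid_posterior(k, n, grid_size=999):
    """Grid-approximate the posterior of p under a flat prior.

    Works on the log scale and subtracts the maximum before
    exponentiating, so small likelihoods do not underflow to zero
    everywhere.
    """
    # Interior grid points only, so log(p) and log(1 - p) are finite.
    grid = [(i + 1) / (grid_size + 1) for i in range(grid_size)]
    log_post = [k * math.log(p) + (n - k) * math.log(1 - p) for p in grid]
    peak = max(log_post)
    weights = [math.exp(lp - peak) for lp in log_post]
    total = sum(weights)
    return grid, [w / total for w in weights]


# 1. Simulate data from the imagined process.
k = simulate_tosses(TRUE_P, N_TOSSES)

# 2. Fit the model to the simulated data.
grid, post = grid_posterior(k, N_TOSSES)

# 3. Check that the posterior sits near the value used to simulate.
post_mean = sum(p * w for p, w in zip(grid, post))
print(f"successes: {k} of {N_TOSSES}")
print(f"posterior mean of p: {post_mean:.3f} (simulated with {TRUE_P})")
```

If the posterior mean lands far from the value used in the simulation, the likely culprit is the model code rather than the data, which is the point of running the check before touching real observations.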
This book is an exemplary introduction to the Bayesian thought process. It's additionally quite good for practicing and learning R. When reading this, you will likely learn and have fun; it's rare to find both of these (or, quite often, just one) in one text. The tone is very conversational and friendly, and Dr. McElreath doesn't take himself too seriously. If you choose to use this book, I would strongly recommend his excellent lectures on YouTube that accompany the book. Overall, I would strongly recommend this for anyone with an interest in Bayesian statistics (which should be anyone with an interest in statistics in general).
Easily the best textbook I've ever read. Changed the way I think about statistics; I definitely agree that "When it comes to regression, multi-level regression deserves to be the default approach."
McElreath's approach of "subordinating statistics to science" is a refreshing and important take. I and other students learning with this text are in good hands.
Great book and introduction to Bayes. A little wordy, to be sure, but I got a lot of critical knowledge out of it, and was much more able to build Bayesian models after reading this.
Accessible, warm, and inviting. Though with a background in numerical methods including MCMC and HMC, it is sometimes a little difficult to see much of that beautiful math being hidden away. Honestly apart from the ornate descriptions of sampling techniques, this book is a fantastic introduction to statistics in both pure and applied fields. The exercises and examples are also worth working through, something which I rarely admit.
Haven't read it yet, just a quick skim, but I like the idea behind Ch 16: case studies where he starts with a scientifically-motivated mathematical model first and only adds statistical nuance later. For example, modeling humans as approximate cylinders in order to set up a hypothetical model for predicting weight from height -- first he starts with V = pi r^2 h, and then makes scientifically reasonable assumptions, and only then shows how it can be approximated by a GLM. The purpose is to contrast with the usual approach in stats textbooks, which is to toss all the variables into a GLM first and hope they are interpretable later. I don't think this math-modeling-first approach is *always* useful or even possible, but it is a handy tool to have in our arsenal. (And to be frank, I haven't seen the "add statistical nuance" piece very often in the applied math modeling courses or textbooks I have seen. For example, modeling population dynamics with Diff Eqs alone is a start, but only handles what happens *on average*; it's not enough to allow for statistical inferences, such as pinning down *how precise* your estimates are. So McElreath's Ch 16 approach should be a useful example for applied math textbooks too.)
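To make the cylinder idea concrete, here is a rough sketch under assumptions of my own, not the book's code. It assumes the radius is a fixed fraction of height and that weight is proportional to volume, so that W = k * pi * p^2 * h^3 and log W is linear in log h with slope 3. The constants `P_RATIO` and `DENSITY`, the noise level, and the helper names are hypothetical, and ordinary least squares on the log scale stands in for whatever Bayesian fit the chapter actually uses.

```python
import math
import random

P_RATIO = 0.07    # assumed radius as a fraction of height (hypothetical)
DENSITY = 1000.0  # assumed kg per cubic metre (hypothetical)
NOISE_SD = 0.1    # assumed multiplicative noise on the log scale


def cylinder_weight(height_m):
    """Weight of a cylinder with radius P_RATIO * height: k * pi * r^2 * h."""
    radius = P_RATIO * height_m
    volume = math.pi * radius ** 2 * height_m
    return DENSITY * volume


def simulate_people(n, seed=7):
    """Draw heights, then weights from the cylinder model with log-normal noise."""
    rng = random.Random(seed)
    heights = [rng.uniform(1.2, 1.9) for _ in range(n)]
    weights = [cylinder_weight(h) * math.exp(rng.gauss(0.0, NOISE_SD))
               for h in heights]
    return heights, weights


def fit_log_log(heights, weights):
    """Least-squares fit of log(weight) = intercept + slope * log(height)."""
    xs = [math.log(h) for h in heights]
    ys = [math.log(w) for w in weights]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


heights, weights = simulate_people(1000)
intercept, slope = fit_log_log(heights, weights)
print(f"estimated exponent on height: {slope:.2f} (cylinder model implies 3)")
print(f"estimated log constant: {intercept:.2f} "
      f"(model implies {math.log(DENSITY * math.pi * P_RATIO ** 2):.2f})")
```

The appeal of starting from the geometry is that the exponent and the constant both have a physical reading before any data arrive, instead of being coefficients to interpret after the fact.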
On the other hand, I'm not a fan of the really cutesy chapter titles. They make it hard to tell what's actually covered in the book, vs what isn't. His Ch 1.3 lists the four tools he's going to focus on: * Bayesian data analysis * Model comparison [by which I think he means assessing predictive performance, eg with cross-validation or things like AIC -- instead of focusing on the MOEs of our estimates as in classical statistical inference] * Multilevel models * Graphical causal models
That's all well and good -- these are important topics. But I disagree with his framing of classical statistics in Ch 1. More crucially, I'm disappointed not to see any in-depth coverage of data collection and study design. They're not even mentioned in the index!
So (again, based on a quick skim and not yet a thorough reading) it's hard for me to see how I'd use this book in teaching an introductory course, even if it were aimed at grad students.
First and foremost, this is a textbook. I spent about a year working through this book. I had an idea of what "Bayesian" meant beforehand, but no idea how to implement it for research. This book gets you to that point. Great.
Well, why do most readers rate it 5 stars and laud it? Perhaps because this textbook helps the reader learn and think more than any other textbook they've read. And uniquely, it tries to ensure you don't hate that process. McElreath never talks down to the reader, never assumes you're as comfortable with the math as he is, and never assumes that you should already know what he's talking about.
The value in the book is also that it is not just here's how you program something and make "data science" happen. The book is about Bayesian analysis, but it's the best book I've read for thinking about linking the use of statistics to science (social, natural, whatever) and the representation of real-world phenomena. The emphasis on causality, DAGs, and information theory were not what I was expecting to find in the book and I'm grateful they were covered in here (even if I still have just a basic grasp of them).
McElreath is also a good writer. Particularly in the first few chapters, a lot of the writing is conversational and actually fun to read. The later chapters don't really lose this aspect, but as the content builds upon itself, perhaps my brain just had less capacity to focus on the conversational parts.
McElreath has a clear philosophy about how science should be conducted and that becomes apparent when reading Statistical Rethinking. P-hacking is a no-no and thinking about how measurement affects statistical results is a lesson to be applied to all analysis.
I had a decent background in R and data analysis before reading this and would suggest those are helpful skills to have. I came across this book because Gelman's seemed just a touch too advanced for me, particularly for self-study. It's a great book for an applied researcher who is stepping past introductory quantitative methods.
The excellent writing in this book was a breath of fresh air compared to much of the genre of statistics textbooks. The author does a fantastic job explaining quantitative reasoning in clear English and developed a great set of metaphors and examples that are skillfully deployed throughout the book (e.g. models as golems; fitting of separate models rather than partially-pooled models as voluntary amnesia, etc.).
While I think the book's most valuable contributions are its clear explanations of statistical reasoning and methods, it also provides a useful introduction to the current state-of-the-art software tools available for analyzing Bayesian models (i.e. Stan and R).
This book is a great introduction to statistical thinking. While the emphasis is on Bayesian approaches, much of the content could also be used for thinking through frequentist analyses. A central point of this book is that many of us were taught statistics in a formulaic way: if you want to compare means between two groups, run a t test; if you need to analyze a contingency table, use a chi-square test, etc. But this approach often excludes the hypotheses and prior information we have about the scientific questions we are trying to answer. The approach McElreath presents puts the science front and center. I will be referencing this text for years to come.
This is a rich and informative book. However, I will caution beginner-level readers against starting with this book, whether they are picking it up to learn causal inference, Bayesian statistics, or multilevel modeling. In each of these areas, I have found other materials that are far more beginner-friendly. I think this book works best for those with an already-beginner level of competency in causal inference, Bayesian stats, and MLMs. This text will serve to deepen a beginner’s knowledge in each of these topics and synthesize them.
A true jewel in terms of content and writing style. Using Jorge Luis Borges "The Garden of Forking Paths" story as an allegory of the likelihood function is the most elegant way I've seen to begin a stats book. It's possible to feel the passion and knowledge of Dr. McElreath in every sentence of the book. This is one of those books that I will take with me to my lonely island and read over and over again.
A brilliant stats book. I have done a lot of stats but still learned a lot in this book. But especially for people who might have been put off by overly technical stats, this is the book to go. I like the applied nature of the book.
I am still reading the book, but I cannot resist expressing my gratitude to the author! Let it be your first book on Bayesian inference. Fuller review to follow.
This is a great entry level book to Bayesian statistics. I am an economist by training so I have a lot of experience with rigorous statistics courses, but always from a frequentist perspective. This book is not nearly on the level of rigor that I am used to, and rigor is something that I desire and think is important in order to achieve a deep level of understanding. However, I think this book is outstanding and is useful for two types of people:
1) Scientists (or non-scientists) with little statistical training. For them this book is a great and practical guide.
2) Scientists with a lot of frequentist statistical training, but with no exposure to Bayesian statistics. For such people, I would recommend that their final goal be to read a more rigorous book such as Gelman et al.'s Bayesian Data Analysis (note: I haven't yet read Gelman et al.). HOWEVER, I would recommend first reading McElreath's Statistical Rethinking to get acquainted with the concepts of Bayesian statistics before heading off into a much denser book. That is my plan. I breezed through Statistical Rethinking in a matter of days, and it greatly helped me in understanding the Bayesian perspective. I predict that the knowledge gained from Statistical Rethinking will make reading Gelman et al. or other tougher reads much faster and more accessible. In addition, though I do still desire a more in depth understanding, Statistical Rethinking has given me a lot of practical skills for conducting Bayesian analysis that I can already apply today.
I studied and wrote a thesis for a masters in econ from the last gasps of the spoiled 90s into Y2K, when my Econometrics 4XX prof, fresh from teaching freshman intro stats at MIT, introduced me to the wonder and trauma of the Gujarati text Basic Econometrics. Mr. post-doc MIT expected us, a handful of rube grad students at my distinctly-more-shitty-than-MIT, Mormon-church-owned private university [BYU main campus], to be ahead of his average MIT freshman on stats... Needless to say, it was rough going. Sufficient to make atheists of all involved.
Inauspicious start to a finance & econ career, but appetizing.
McElreath is a brilliant teacher, and writer. Rethinking is a pleasure to read, something I could never imagine saying about any other textbook I've had the misfortune of using in a vain attempt at learning something. I've not had to program anything since running STATA in Microsoft DOS - wow, I'm suddenly feeling really old - but learning R with real examples and data was actually fun and incredibly stimulating. Hopefully will forestall the Alzheimer's for a few more months.
This book has a very thorough, from-the-ground-up structure. And although the subject of Bayesian statistics was probably too hard for me, the book did a very good job of presenting it in the easiest-to-consume way.
There are plenty of examples in R, each described extensively. The book starts with a very basic definition of Bayesian probability as logic. Then come simple models, interactions between variables, and descriptions of models for inference and models for prediction. The book goes on to explain how to evaluate models, how to compare them, and what pitfalls to expect on the way to designing a good model.
The book explains that no amount of statistics can compensate for bad science, and highlights the spots where, no matter how good your statistical skill is, it would not be enough to make decent inferences.
Overall it is a very good book and I highly recommend reading it as an introductory course into Bayesian statistics.
Occasionally drawn away from the main course material by the specific examples and models he discusses, I think he should remain more even-handed (like Kimura and Hubbell, both of whom I think happen to be on the right side of the debates he brings up, with him on the wrong side). I am also not sure I am convinced by the pragmatic (or philosophical) case for Bayesianism over frequentism, though I don't have a strong opinion on such issues. In any case, I really enjoyed the lecture series on YouTube. I didn't really work through most of the coding exercises, so I can't evaluate how useful they are as pedagogical tools. It probably helps to get some practice actually building models, but he does most of this work, and does it well in my opinion. It almost feels repetitive, especially if one has some prior exposure to DAGs, but I think it's a good course overall.
I liked it. A nice and short (~450 pages) introduction to Bayesian modeling. I am a bit disappointed that the author created a high-level package that uses an R package (rstan) that then calls the package of interest ("STAN") to show some of the modeling principles. In a sense, this book is about modeling your approach to modeling and forming an intuitive, high-level understanding without many implementation complexities and details. However, it leaves you with a hunger for more and wondering whether other books/courses might have been better. I expected more hardcore explanations, and it seemed more intuition-based. I am ambivalent here.