
Software Design X-Rays: Fix Technical Debt with Behavioral Code Analysis

Are you working on a codebase where cost overruns, death marches, and heroic fights with legacy code monsters are the norm? Battle these adversaries with novel ways to identify and prioritize technical debt, based on behavioral data from how developers work with code. And that's just for starters. Because good code involves social design, as well as technical design, you can find surprising dependencies between people and code to resolve coordination bottlenecks among teams. Best of all, the techniques build on behavioral data that you already have: your version-control system. Join the fight for better code!

Use statistics and data science to uncover both problematic code and the behavioral patterns of the developers who build your software. This combination gives you insights you can't get from the code alone. Use these insights to prioritize refactoring needs, measure their effect, find implicit dependencies between different modules, and automatically create knowledge maps of your system based on actual code contributions.

In a radical, much-needed change from common practice, guide organizational decisions with objective data by measuring how well your development teams align with the software architecture. Discover a comprehensive set of practical analysis techniques based on version-control data, where each point is illustrated with a case study from a real-world codebase. Because the techniques are language neutral, you can apply them to your own code no matter what programming language you use. Apply research findings from social psychology to software development, ensuring you get the tools you need to coach your organization towards better code.

If you're an experienced programmer, software architect, or technical manager, you'll get a new perspective that will change how you work with code.

What You Need:

You don't have to install anything to follow along in the book. The case studies in the book use well-known open source projects hosted on GitHub. You'll use CodeScene, a free software analysis tool for open source projects, for the case studies. We also discuss alternative tooling options where they exist.

276 pages, ebook

Published March 8, 2018


About the author

Adam Tornhill

4 books, 28 followers

Ratings & Reviews


Community Reviews

5 stars: 61 (39%)
4 stars: 63 (41%)
3 stars: 22 (14%)
2 stars: 5 (3%)
1 star: 2 (1%)
Displaying 1 - 25 of 25 reviews
Mark Seemann
Author, 6 books, 438 followers
July 9, 2020
This is some next-level shit.

With his work on analysing historical source-control data, Adam Tornhill is doing something new. The idea is to extract data from e.g. Git and use the log to find patterns in how people actually work with the code. This is a technique that he already described in Your Code As a Crime Scene. This book continues that idea, but since there's a bit of overlap with the previous book, I found that the first 70 pages were mostly repetition.

The ideas presented here open up a new field of inquiry, and it's somewhat overwhelming. I imagine that it is similar to what the idea of Continuous Integration must have felt like decades ago. For example, when I worked at Microsoft in the late 2000s, our team had a dedicated build master whose only responsibility was to keep the daily build running. That was a full-time job. These days, we've learned enough about Continuous Integration, as well as Continuous Delivery, that we have tools for many such tasks. Most teams don't need a dedicated build master, but can handle configuration of CI/CD as part of their other duties.

It seems to me that the kind of work Adam here presents is in its infancy. I find many of the ideas stimulating and useful, but it also looks like a job for a specialist. Just like we've figured out how to make tools and services for CI/CD, Adam has developed a tool and service called CodeScene to encapsulate much of his knowledge. This makes much sense to me. If he doesn't do it, someone else will.

The book often refers to CodeScene and explains how to use it to perform a particular analysis. It makes sense to me to use the only tool available on the market, but it's still a weakness of the book. I don't much mind that the book could be viewed as a sales pitch for CodeScene (just like I didn't much mind the tie-in to Neo4j in Graph Databases), but unless you follow along in the actual CodeScene tool, the flow of the analyses is sometimes a bit hard to follow.

As I read through the book, I liked it more and more. The really visionary stuff comes in the second half of it. It's a worthwhile read, but you have to be patient with it.
Sandro Mancuso
Author, 2 books, 284 followers
May 6, 2020
As I read this book right after reading Your Code As a Crime Scene, it had a smaller impact, since there is some overlap in content. Overall this is a better book, and it was great to see the evolution in the ideas and tooling.

We are running quite a few software modernisation projects for our clients and the ideas and tools in this book will be very useful to us.
Jo
37 reviews, 8 followers
May 28, 2018
Oh my. It's been a while since I read a technical book this quickly.

The book's premise is simple: every software company is sitting on a veritable gold mine of data that remains largely untapped. This gold mine could guide you to hotspots in your software that are costing you money. It reveals where architectural choices made in the past are hindering daily work today. It can highlight collaboration problems across teams. It might even act as a virtual team member that alerts you when things start to get out of hand from a technical debt perspective.

What is this gold mine, you ask? VERSION CONTROL HISTORY.
The book outlines simple and actionable techniques to mine your version control history for insights like these. It also proposes some interesting refactorings and incremental redesign techniques I've never come across before (e.g. "splinter refactoring"). Every chapter ends with hands-on exercises on famous codebases like .NET Core, Erlang & Git.
What's more, the proposed techniques are simple and language-neutral so you can apply them on your own codebase as is.

I unleashed some of the simpler techniques on a new codebase I started working on this month and wow, it's like having a senior developer of the team walk you through all the pain points.

Highly recommended read. If you find yourself diving into new codebases on a regular basis, this is an essential tool to highlight potential problem areas quickly. I also highly recommend applying the techniques on your own codebase. You might be surprised where you're actually losing money!
Sebastian Gebski
982 reviews, 898 followers
December 4, 2017
Review based on beta version (B2.0 if I remember correctly).

I wasn't a fan of Adam's previous book ("Your Code as a Crime Scene"), but fortunately this one is significantly better. It's focused on a very interesting (& in fact quite unique) concept: getting insights into software quality (various kinds of technical debt) not from static analysis of the code but from patterns in the changes made to the codebase over time.

My initial reaction wasn't very enthusiastic - I've seen THINGS :) & know that theory seems quite predictable while in reality people do very odd things around their codebases, but the author has quite neatly backed up a lot of his theories with examples straight from large, public codebases (OSS). Again (like in his previous book) he mainly uses the tool he's working on (CodeScene), but this time it doesn't come across as black magic - you can easily follow the operations performed, and the tool is used mainly for illustration.

What else on the pros side?
I kinda measure the quality of tech books by the number of bookmarks I make while reading - I hadn't made this many in at least half a year. Plenty of quality remarks, quotations & statements - about normalization of deviance, "strategic" decisions, social loafing & many more.

Anything on the cons side then?
Yeah. Maybe (though I doubt it) it's a matter of my reviewing the beta version, but too often I had the feeling that Adam starts a chapter with a nice concept, a plan for the chapter, and a vision of the message he wants to send, but ... then the finished chapter feels inconclusive, left hanging without a conclusion or, in some cases, stopped midway. As if the author had forgotten what the point was, or had a good idea but didn't know what to do with it. In fact, that was really annoying & made the book feel far less practical & pragmatic than it could be.

Other than that, it's a really good book. And the only one on this particular topic - which in fact seems promising enough to give it at least some attention (& no, I'm not saying I'll replace my static analysis stuff with these ideas/tools).
Babak Ghadiri
32 reviews, 6 followers
March 18, 2022
In my view, this book presents and combines, in a fitting way, two of the most important avenues for improving software development (whether as an industry or case by case).
The first idea is that Git history can be used as a valuable source of information for analysis and improvement (for example, to identify the most important places to refactor, or to detect hidden couplings!).
(Use of data science)

The second idea is to show and emphasize the impact of human issues (individual and team) on software development. I think the attention paid to the human factor and to communication in software development (as opposed to technology and process) is far less than it should be, and one of the areas that could produce a leap in the quality and speed of software development in the future is for developers and managers to become familiar with psychology and use it. (Use of psychology)

The author studied psychology and is the creator of the CodeScene tool. The book is full of interesting and new points, which he demonstrates and examines in the code of large, well-known software projects. Overall, I think this field of Behavioral Code Analysis is very useful and has plenty of room for further work.

https://codescene.com/
Júlia Birkett
4 reviews, 1 follower
December 3, 2020
What I like best about this book is that I’ve never read anything similar. The analyses that the author presents and the explanations behind them are incredible! He teaches us how to take advantage of our version control to analyze our way of working and to help us make technical decisions (refactorings, technical debt prioritization...) and company decisions (hiring, knowledge silos...). Adam Tornhill writes in a cool way, which makes it an easy read. I didn’t read “Your Code As a Crime Scene” because everyone says the two overlap. Highly recommended!
Artur Skowroński
15 reviews, 2 followers
August 4, 2019
While I consider the first part far better than the second, such an analytic approach to code was a really refreshing, even enlightening, experience.

Many of the techniques are really actionable, and the focus on techniques rather than tools is definitely the most interesting part of this publication.
Nathan Brodsky
15 reviews, 11 followers
December 2, 2019
Easy to follow book. Very clear explanations without a bunch of fluff. Great answers to the whys that you ask when designing software.
December 3, 2019
I'd like to start with the following statement: this is one of the most valuable books on software engineering I've read to date. The rest of the review will hopefully explain why I think so.

The main idea of the book is that we can get lots of valuable information about our code, our architecture and our organization by mining the data in our version control systems. The reason is that these systems come closest to a log of how the code and architecture have evolved over the years. We can thus "replay" and filter this history to detect issues and pull some interesting metrics. For me, I sort of knew the data was there but lacked the guidance on how to make sense of that data. This book provides both - a way to mine the source control system (examples use Git) as well as advice on how to make this data useful.

The techniques to mine the source control data are presented in both a conceptual and a practical way. I really liked that, as the conceptual description allows translating these techniques to other source control systems, and the practical examples (with specific Git commands, ready to apply) allow you to try these techniques out.

The guidance on how to interpret this data is what makes this book stand out. The content can be roughly split into three main questions that the book tries to answer:

1. How to discover the files, methods, and components where paying back technical debt brings the biggest return of investment?
2. Does my architecture support the way my system evolves and if not, how to make it so?
3. Does my organizational design make it easy to work with my system the way it is structured and if not, what can I do about it?

This book might at first sight look like a "legacy code" book, but it very soon proves to be much more than that. For example, the second part not only shows how to detect suboptimal architectural decisions, it also discusses several architecture styles, their pros and cons, and how to use data to determine whether a particular style is for you. I had lots of fun confronting the author's data-driven conclusions with Robert C. Martin's Clean Architecture, which I consider a more principle-based book. The author himself does a lot of such comparisons - putting industry-accepted principles within the context of real data is what shines throughout Software Design X-Rays.

That is also why you should not expect a prescriptive approach here. The book does not describe any "silver bullets". Rather, it discusses various options, nuances and contexts. It also respects software development as an evolutionary process: it describes not only the desired outcome of a software engineering effort (e.g. what a good design should look like), but also situations where bad design or organizational decisions have taken their toll and where sometimes counter-intuitive measures may provide a better outcome. I saw a lot of this in the third part, where Adam suggests that, instead of fighting the way our code is structured with an idealized team breakdown, it might sometimes be better to organize teams around the structure of the code, no matter how broken it is.

As mentioned, the content goes way beyond just "legacy code", and if I could change the subtitle of the book, I would probably name it "Modern software engineering through the lens of data". You will not only learn from this book how to approach technical cleanups, you will also get valuable lessons from the evolution of existing, real-life projects. The examples use publicly available pieces of code written in several different programming languages as well as some polyglot projects.

Most of the techniques presented in the book are surprisingly simple, which makes them even more powerful: when reading through the book, I could not resist the urge to start implementing some of them myself (and I mostly did). I had lots of fun analyzing my pet projects, both with my own half-baked implementations and with the CodeScene tool that Adam has created with his team and which is free for limited use (but only with public Git repositories). Indeed, if you are seriously planning to dig into the contents of the book, you'd better familiarize yourself with this tool, as most of the examples and illustrations use it.

Another plus for me is that the content is presented in a lightweight form, with lots of illustrations and examples and occasional dose of industry humor. This is not an academic book, and even though it talks a lot about metrics, you can hardly find a math formula inside. All of this made the book an easy read for me.

Before I finish this review, several words of warning. First of all, this is the second book by Adam Tornhill on the topic of behavioral code evolution, succeeding Your Code as a Crime Scene. I did not read that book, but if you did, there might be some content here that you already know. Second, this is probably not a book for novice programmers, as is accurately stated on the cover. You need at least some experience with the problems the author is addressing to understand why they are relevant and to interpret the results of the techniques presented in the book.

Overall, I wholeheartedly recommend this book. It shifted and expanded my views on code, design, architecture and software organizations and I cannot help but look at these topics now from the perspective of product evolution they are meant to support.
Johnny
36 reviews, 1 follower
July 27, 2021
Great book that I should have read earlier. Really, if you are interested in productive modern software development, go get it. It's all about sustainable and well-explained metrics that you can gather and base your next actions on.
542 reviews, 9 followers
June 9, 2019
A great book about how to understand code without spending too much time reading it. If you are interested in the big picture, this book will be a great help. It explains in great detail how you can use Git to find out which parts of your code change often, who changed them, and which parts need attention and offer themselves as a good starting point for refactoring.

With all the commands and ideas on how to visualize them, you can go and use those tricks in your own projects. It is surprisingly easy and helps you to add facts to your suspicions on what part of the codebase should be improved.
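
As a rough illustration of the "which parts change often" idea, here is a minimal Python sketch, not taken from the book. It assumes the log was produced by something like `git log --pretty=format: --name-only`, which prints one changed file path per line with blank lines between commits; the function name and the sample data are hypothetical.

```python
from collections import Counter


def change_frequencies(log_text: str) -> Counter:
    """Count how many commits touched each file.

    Expects one file path per line, with blank lines between commits
    (the shape assumed for `git log --pretty=format: --name-only`).
    """
    counts = Counter()
    for line in log_text.splitlines():
        path = line.strip()
        if path:  # skip the blank separators between commits
            counts[path] += 1
    return counts


# Hypothetical sample log: three commits, made up for illustration.
sample_log = """\
src/core.py
src/util.py

src/core.py

src/core.py
docs/readme.md
"""

# Print the most frequently changed files first.
for path, revisions in change_frequencies(sample_log).most_common(3):
    print(f"{revisions:3d}  {path}")
```

Files at the top of such a list are only candidates; the review's point is that change frequency tells you where to look first, not what is wrong.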
Marek Kowalcze
26 reviews, 1 follower
January 28, 2021
Very interesting book about how to look at a legacy codebase. Too often we just analyse the code as a static snapshot and forget about the time dimension, which shows how it has been changing. Version control history can give us important insights into where the technical debt really is.

Highly recommended for anyone involved in software development.
Riley Holmes
60 reviews, 20 followers
January 30, 2019
I have not worked on a really large project before so have not experienced the kinds of growing pains / bottlenecks discussed here. But I really liked the methods presented for reading the version control history to find problematic pieces of code. I hope to revisit and use these techniques eventually if an appropriate situation arises.
Matt Eland
21 reviews, 7 followers
December 25, 2019
Original, innovative, and outright fantastic. On top of that, Adam's writing is succinct and spot on. This is a top recommendation for any dev thinking about maintainable code.
Tom
88 reviews, 11 followers
November 27, 2020
Excellent set of practical tools to use in supplying numerical data to support plans of attack to improve large codebases.
14 reviews, 5 followers
June 4, 2019

Some good technical discussion and quantitative analysis on how modularity can vary within a codebase, but also a lot of liberal bullshit about "social coding".
E.g., "The root cause of the Challenger Explosion wasn't technical - it was a social issue".

Wrong - how can this be a meaningful discussion? Every technical phenomenon is necessarily preceded by a social phenomenon, at least until AI takes over the world. There's no point in discussing whether the cause of a technical problem was ultimately social or ultimately technical, because it is always both. Tornhill's distinction between social and technical issues is misguided, in my opinion. When code is written by humans, they're the same damn thing.

The most valuable part of this book is not the discussion of git commands or the fancy d3 visualizations or the exercises (which I didn't do), but seeing what Tornhill counts. The metrics used to quantitatively analyze problems can also be used to analyze your own codebase in a rigorous way. For example, that you can measure code duplication by counting linked modifications to different files over many commits, or that writing modular code is most important for large files with many modifications. That last point might be obvious, but I had never seen it articulated so clearly before.
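
To make "counting linked modifications to different files over many commits" concrete, here is a minimal Python sketch, not taken from the book. It assumes each commit has already been reduced to a list of the file paths it touched; the function name and the commit data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def change_coupling(commits):
    """Count how many commits touched each pair of files together."""
    pairs = Counter()
    for files in commits:
        # Sort so that each pair has one canonical (a, b) ordering.
        for a, b in combinations(sorted(set(files)), 2):
            pairs[(a, b)] += 1
    return pairs


# Hypothetical history: each inner list is the set of files in one commit.
commits = [
    ["billing.py", "invoice.py"],
    ["billing.py", "invoice.py", "README.md"],
    ["invoice.py"],
    ["billing.py", "invoice.py"],
]

coupling = change_coupling(commits)
for (a, b), shared in coupling.most_common(1):
    print(f"{a} and {b} changed together in {shared} commits")
```

A pair that keeps changing together is only a hint of duplication or a hidden dependency; it still takes a look at the code to tell which.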
Supriya Srivatsa
41 reviews, 3 followers
May 17, 2020
The book has some amazing, really good ideas. It presents several facets of how something as everyday as Git can offer numerous in-depth insights into a codebase, its evolution and even the often underestimated social and organisational aspects of software engineering. The author also often brings in psychological constructs and how they relate to software engineering. I found these interesting, but they may not appeal to everyone. What I really think could have been better is the writing and structure of the book. While it brings forth amazing ideas and concepts, the writing comes across as highly repetitive, often iterating over the same concepts and ideas at different levels of abstraction.
76 reviews
August 5, 2018
Software Design X-Rays reads like 1/3 behavioral psychology, 1/3 git man page, and 1/3 technical paradigm introduction. It manages to eloquently and concisely introduce the idea that technical debt is a human rather than a technical problem, explain the interplay among various actors (programmers in this case), and provide a detailed set of instructions to visualize said interplay and technical debt.

Reading this made me realize how little I understand of the codebase lifecycle and the various design patterns proffered as organizational solutions.
7 reviews
February 2, 2020
yeah, not bad, gave me a few ideas and certainly corrected some misconceptions I may have had about software teams. it would be some hard work to implement some of the ideas in this book but probably worth it for larger software projects. it's also a bit of an advert for the author's services.

Summary: use information from your git logs to track behaviours and bottlenecks hindering project velocity. mostly around code complexity vs changes required over time.
February 26, 2021
Pragmatic tips on CLI based analyses to get you started quickly paired with in depth tips for deep dive into version control history.

Unexpectedly, the advice is paired with references to studies backing up the claims of validity for the methods employed in this book.

Unfortunately, the examples in this book to follow along on Adam’s public web page are often broken and unusable by now.
Bugzmanov
193 reviews, 37 followers
May 27, 2019
Between 3 and 4. It’s somewhat repetitive. And I couldn’t get more info out of this book after reading the first half.
Bibliography was definitely my favorite part. It mentions some awesome papers I was completely unaware of.
Zaki Shaheen
49 reviews, 6 followers
November 27, 2020
A good introduction to behavioral code analysis. This provided me with tools that help me navigate a legacy code base. While Adam is knowledgeable, he does seem to promote his product CodeScene a bit too much in the book, and a lot of the tools in the book are only available in CodeScene.
April 15, 2023
The book presents a view of software development from a sociological perspective. It's very practical and contains a lot of examples for code analysis.
It will change the way in which I approach software development.
Olena Sovyn
124 reviews, 37 followers
May 28, 2020
The book provides a new view of how meaningful tech debt can be detected based on data from repositories and version control systems. Refreshing reading that gives a new perspective.