I hated hated hated hated hated Data Smart by John W. Foreman. I’m being dramatic. But I did read the whole thing and now I think John is almost definitely a really cool guy who made a valiant effort teaching a challenging topic. If I ever found him in my particular Maryland suburb, I’d buy him a beer. Stick with me, I think this book is probably perfect for some folks–it just wasn’t for me.
So Foreman’s topic: Doing data science stuff.
Here’s the thing: when I bought the book, I wanted a better sense of the kinds of things I could do with various chunks of data that I already have. I was looking for something fairly high-level that dealt with the larger topic.
Somewhere between the quirky upside-down book cover, the subtitle “Using Data Science to Transform Information into Insight” and my bad habit of buying books based entirely on my gut, instead of the free sample, I just had the wrong impression about what I was getting into. And, oh how dearly I paid for it.
Okay, just kidding. As far as technical books are concerned, this one was pretty good. John writes in a way that makes calling him “Foreman” seem too formal. And it’s by design, I think. After the fifth consecutive page of Excel formulas about wine or pregnant ladies, John reminds readers that what they just saw was “badass.” And his enthusiasm is welcome. It’s a dry topic, and the thing is, some of the stuff he is doing is pretty damn cool.
There are 10 meaty chapters. Each chapter opens with a set-up data situation: a bunch of data about custom-crafted sword sales, for example, or data about people who may or may not be pregnant, or data about call center employees. Then John provides a goal for that data; for example, we want to guess at who’s pregnant, or which call center employees are true outliers. Finally, he introduces the technical methods he’ll be using and, with that, moves on to solving the challenge within Excel.
This is the primary place that I felt the text was lacking for a reader of my kind, at this moment in my life (What, John, you weren’t writing this book specifically for me? Well, I never!). As he introduces the methods, he moves into the problem solving before the real scope of the solution can be appreciated. I would have very much liked him to reframe solutions a number of times in very different contexts before getting into how to actually work with the formulas.
As each method and process is explained, there are screenshots in-line on the page for following along. These are a little challenging, but I was getting the gist. He also includes sample Excel spreadsheet data on the book’s website.
John’s explanations are almost all conversational, which made the book fairly easy to read simply for the flavor. However, I imagine that if I were following along with the included Excel spreadsheets on the computer, I would find it a little frustrating to keep my place in this style of writing.
The payoff comes at the end of each chapter where the final solution is found. There’s a kind of mystery unfolding with each formula. Based on the way I was reading the book, I think it would have been more helpful to see a deconstruction of the final solution first, then have it explained from the ground up.
The final chapter, which is the only chapter that ditches Excel, deals with R, an analytics-focused programming language. This was actually my favorite chapter. Although I didn’t follow along with the sample spreadsheets from the author’s website at any point in the book, I found that I had picked up the majority of what was necessary to keep up. I was able to appreciate the benefit of working outside of Excel in a specialized environment.
The Conclusion of the book sees John get a little more candid about data science and the industry. I enjoyed his insights there, and in some ways, it was more like the book I wanted to read from him than this one had been. But that’s just me.
This was a good book, most likely read by the wrong person at the wrong time. That being said, there’s some juicy stuff in here, especially if you’re super comfortable with Excel and have plenty of data floating around your hard drive. For those of us who are casual readers looking for something that examines the broad strokes, there’s probably something better suited to us. But probably not something quite as cool.