With the Statistics Playbook by Gary Sutton
This book will teach you how to use R in a different way than any other book out there. You will approach R concepts using publicly available NBA statistical data rather than prepared datasets, and learn how to combine various methods and techniques.
Read on to learn more.
Data is changing the way businesses and other organizations work. Back in the day, the challenge was getting data; now the challenge is making sense of it, sifting through the noise to find the signal, and providing actionable insights to decision-makers. Those of us who work in data, especially on the front end (statisticians, data scientists, business analysts, etc.), have many programming languages to choose from.
R is a go-to programming language with an ever-expanding upside for slicing and dicing large data sets, conducting statistical tests of significance, developing predictive models, producing unsupervised learning algorithms, and creating top-quality visual content. Beginners and professionals alike, up and down an organization and across multiple verticals, rely on the power of R to generate insights that drive purposeful action.
Why I am writing this book
My goal is to write a book that’s different from other, similar books, most of which are organized by method or technique. Such a book might contain one chapter on linear regression and another on handling missing data, for example. There’s nothing fundamentally wrong with that, but in the real world one doesn’t wrangle data for the sake of wrangling data; there’s always a larger reason. Statistics Playbook: Using R with Real NBA Data Sets repeatedly ties very different methods and techniques together and demonstrates how they work as one, using real data sets to achieve real outcomes.
The book is therefore organized into stories as much as it’s divided into chapters, and every story has a beginning and an end with a plot in between. Every story starts by loading packages and importing, introducing, and explaining real data, not packaged data, and every story ends with a series of unique insights. In the middle, data is wrangled, explored, analyzed, modeled, tested, and visualized, leveraging a combination of built-in R functions and some of the most advanced packages available.
Of course, the NBA is the common theme throughout. However, you don’t need to know who Steph Curry is to get the most out of my book. In fact, many of the concepts (the advantages of rest, the impacts of income inequality, etc.) are transferable to your work and everyday life.
What is inside this book?
This book provides end-to-end and step-by-step instructions for discovering and generating a series of unique and fascinating insights with R. This book is different from other R manuals you might already be familiar with, for the following reasons:
- The book does not use pre-packaged data sources, but rather publicly available data sets that require a tremendous amount of transformation, tidying, wrangling, and summarization. You will therefore be exposed to a wide range of complex, real-world data manipulation techniques that are not necessary when working with the clean data you get in other manuals.
- The book contains almost 300 visualizations. Some of these should be familiar to most of you, but many of them will surely be new. These visualizations (e.g., dendrograms, Sankey diagrams, pyramid plots, facet plots, and Lorenz curves) provide insights not possible with more conventional visualizations. You’ll get step-by-step instructions to create all of these and more.
- Whereas other manuals are organized by learning method, by package, or by course of action in a typical data analysis flow (or maybe some combination thereof), this book is instead organized by project. Each chapter represents a discrete project where, as in the real world, you are required to think and execute end-to-end in order to supply a new and actionable insight. There is no such thing as a data manipulation project or a data visualization project (except maybe at university). But there is always such a thing as a project in which you must perform all, or most, of the following activities: locate and acquire data, explore and analyze it, prep it, wrangle it, visualize it, test it, and model it.
- There is nevertheless a common theme throughout, and that is the NBA. R is used herein to reveal a series of unique and fascinating insights into the NBA. Sports and data have always gone hand in hand: has it ever been possible to intelligently discuss or write about sports without incorporating team and player statistics and other numeric data into the conversation? And now there’s plenty of publicly available NBA data to interrogate, and plenty of opportunity to provide new, actionable insights, as one should in any project.
What do you need to know to use this book?
Readers would ideally have the following skills and experience:
- Some (even minimal) exposure to R and/or some level of expertise with at least one other statistical software tool
- Some foundational knowledge of basic statistical concepts
- Some background in visualization best practices
- Some understanding of basketball and basketball statistics, which is helpful but not strictly necessary
Ideally, these readers would also have one or more of the following professional or educational backgrounds:
- Junior-level data scientists or data analysts with experience in R who are looking to upskill
- Mid-level or senior-level data scientists or data analysts who are more comfortable or more experienced with a statistical software tool other than R
- Undergraduate or graduate data science students
- Data-savvy basketball fans and others who work in basketball or other sports professions
Interested in learning more? You can check out the book here.