Development of a logistic regression model to predict the outcome of NBA games
Thesis posted on 05.05.2020, 00:00 by Zachary Campbell
The goal of this logistic regression project was to analyze a dataset of every NBA game from 2014 to 2018 and build a model that best predicts whether a team will win or lose a game, based on common statistics recorded at all basketball games. The candidate predictors were location (home or away), points scored, field goals made, field goals attempted, field goal percentage, three-pointers made, three-pointers attempted, three-point percentage, free throws made, free throws attempted, free throw percentage, offensive rebounds, total rebounds, assists, steals, blocks, turnovers, and total fouls. Each candidate predictor was tested to confirm that it differed significantly between wins and losses. Correlation matrices were also computed to check that a model would not suffer from multicollinearity among predictors. Several models were fit with different logistic regression methods, and each model's concordance, Akaike information criterion (AIC), and variance inflation factors (VIF) were compared to determine which model best predicted the outcome of an NBA game. The chosen model was then analyzed and interpreted to explain the meaning of its individual terms and of the model as a whole.
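The three comparison criteria named in the abstract can be illustrated with a minimal sketch. The abstract does not say which software or which data files were used, so everything below is an assumption for illustration only: the data are synthetic, only three of the listed predictors appear (`home`, `fg_pct`, `turnovers`), and the helper names (`fit_logistic`, `aic`, `concordance`, `vif`) are hypothetical. The logistic model is fit by Newton-Raphson with NumPy, concordance is computed as the share of win/loss pairs in which the win receives the higher fitted score, and VIF is computed as 1 / (1 - R²) from regressing each predictor on the others.

```python
import numpy as np


def fit_logistic(X, y, n_iter=25, tol=1e-8):
    """Fit logistic regression by Newton-Raphson. X must include an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))   # fitted win probabilities
        w = p * (1.0 - p)                        # Bernoulli variance weights
        grad = X.T @ (y - p)                     # score vector
        hess = X.T @ (X * w[:, None])            # observed information matrix
        step = np.linalg.solve(hess, grad)
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    return beta


def aic(X, y, beta):
    """AIC = 2k - 2 * log-likelihood, with k the number of fitted coefficients."""
    z = X @ beta
    loglik = np.sum(y * z - np.logaddexp(0.0, z))
    return float(2 * X.shape[1] - 2 * loglik)


def concordance(y, scores):
    """Share of (win, loss) pairs where the win has the higher score; ties count as 0.5."""
    diff = scores[y == 1][:, None] - scores[y == 0][None, :]
    return float((np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / diff.size)


def vif(X):
    """Variance inflation factor for each column of X (X has no intercept column)."""
    factors = []
    for j in range(X.shape[1]):
        target = X[:, j]
        others = np.column_stack([np.ones(len(X)), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ coef
        r2 = 1.0 - resid.var() / target.var()
        factors.append(1.0 / (1.0 - r2))
    return np.array(factors)


# Synthetic stand-in data (NOT the thesis dataset): one row per team-game.
rng = np.random.default_rng(0)
n = 500
home = rng.integers(0, 2, n).astype(float)       # 1 = home game, 0 = away
fg_pct = rng.normal(0.46, 0.05, n)               # field goal percentage
turnovers = rng.normal(14.0, 4.0, n)
true_logit = 0.4 * home + 20.0 * (fg_pct - 0.46) - 0.1 * (turnovers - 14.0)
win = (rng.random(n) < 1.0 / (1.0 + np.exp(-true_logit))).astype(float)

X = np.column_stack([home, fg_pct, turnovers])
X_full = np.column_stack([np.ones(n), X])        # add intercept
X_null = np.ones((n, 1))                         # intercept-only baseline

beta_full = fit_logistic(X_full, win)
beta_null = fit_logistic(X_null, win)

aic_full = aic(X_full, win, beta_full)
aic_null = aic(X_null, win, beta_null)
c_stat = concordance(win, X_full @ beta_full)
vifs = vif(X)

print("AIC (full model):", aic_full)
print("AIC (intercept only):", aic_null)
print("Concordance:", c_stat)
print("VIF per predictor:", vifs)
```

Under this reading, a lower AIC and a higher concordance favor one candidate model over another, while large VIF values flag predictors that are close to linear combinations of the others, such as field goals made alongside field goal percentage.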