Building NBA Insights AI — Weeks 3 to 5 Recap

Aykut Onat
Jul 6, 2025
Updated: Jul 10, 2025

When I started this project, I set out to answer one question: can machine learning reliably predict NBA game outcomes using only past stats and context? Weeks 3 through 5 were when that vision started turning into a working system.

Week 3 — Teaching the Model to Think Like a Coach

By Week 3, I had already collected and cleaned data from the 2023–2024 NBA season. It was time to train the model. But I didn’t want to just throw raw numbers at it. I engineered rolling averages — points, assists, rebounds, turnovers, and field goal percentages — to simulate how a coach might evaluate a team’s current form rather than season-long stats.
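A minimal sketch of how rolling-form features like these could be computed with pandas. The column names (TEAM, GAME_DATE, PTS), the helper name add_rolling_form, the 3-game window, and the toy rows are all illustrative assumptions, not the project's actual schema or settings. The key idea is shifting by one game so a feature never includes the game it is trying to predict.

```python
import pandas as pd


def add_rolling_form(games: pd.DataFrame, stat_cols, window: int = 3) -> pd.DataFrame:
    """Add per-team rolling means over each team's previous games.

    Hypothetical helper: column names and window size are assumptions.
    """
    # Sort so that "previous games" really are earlier in time.
    out = games.sort_values(["TEAM", "GAME_DATE"]).copy()
    for col in stat_cols:
        # shift(1) removes the current game from its own window, so the
        # feature only uses information available before tip-off.
        out[f"{col}_ROLL{window}"] = out.groupby("TEAM")[col].transform(
            lambda s: s.shift(1).rolling(window, min_periods=1).mean()
        )
    return out


# Made-up toy data: two placeholder teams, points per game.
games = pd.DataFrame(
    {
        "TEAM": ["AAA"] * 4 + ["BBB"] * 2,
        "GAME_DATE": pd.to_datetime(
            ["2023-10-25", "2023-10-27", "2023-10-30", "2023-11-01",
             "2023-10-24", "2023-10-26"]
        ),
        "PTS": [100, 110, 120, 130, 90, 100],
    }
)

form = add_rolling_form(games, ["PTS"], window=3)
print(form[["TEAM", "GAME_DATE", "PTS", "PTS_ROLL3"]])
```

Each team's first game has no history, so its rolling value is missing (NaN). How to handle those rows (drop them, or fill with a prior-season average) is a separate design choice.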

We experimented with two modeling approaches: logistic regression for interpretability and XGBoost for performance. Logistic regression gave us a baseline, but XGBoost, with its ability to capture non-linear relationships, quickly outperformed it.

Using these features, we trained our first binary classification model to predict win/loss outcomes. We split the data into training and testing sets and evaluated the model on accuracy and log loss.
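To make the two metrics concrete, here is a small dependency-free sketch of accuracy and binary log loss. The function names, the 0.5 decision threshold, and the toy labels and probabilities are assumptions for illustration. In practice a library implementation such as scikit-learn's would normally be used instead.

```python
import math


def accuracy(y_true, y_prob, threshold=0.5):
    """Share of games where the thresholded prediction matches the outcome."""
    preds = [1 if p >= threshold else 0 for p in y_prob]
    return sum(int(p == y) for p, y in zip(preds, y_true)) / len(y_true)


def log_loss(y_true, y_prob, eps=1e-15):
    """Mean negative log-likelihood of the true outcomes (lower is better)."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)


# Made-up example: 1 = win, 0 = loss, probabilities are predicted P(win).
y_true = [1, 0, 1, 1]
confident = [0.9, 0.1, 0.8, 0.7]
hesitant = [0.6, 0.4, 0.6, 0.6]

print(accuracy(y_true, confident), log_loss(y_true, confident))
print(accuracy(y_true, hesitant), log_loss(y_true, hesitant))
```

Both toy models pick every winner correctly, so their accuracy is identical, but the hesitant one has a higher log loss. That is why tracking both metrics is useful: accuracy scores the pick, log loss scores how well calibrated the probability behind it was.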

Video: Week 3 - Can We Predict NBA Wins? Which NBA Model Wins?

Week 4 — Putting the Model to Work

A model is only as useful as its interface — and Week 4 was all about making our model accessible and testable.

We built a Streamlit interface that allowed users to either manually input data or upload CSVs to simulate NBA matchups. The key function predict_game_result() made it possible to generate outcomes for individual games or entire seasons in batch mode.

We also created a batch inference pipeline to simulate entire weeks of the NBA schedule, calculating predicted wins and comparing them with actual outcomes.
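A rough sketch of what a batch simulation like this might look like. The real signature of predict_game_result() is not shown here, so the sketch accepts any callable that returns the home team's win probability. The dictionary keys, the toy_predict stand-in, and the three-game schedule are all hypothetical.

```python
from collections import Counter


def simulate_schedule(games, predict):
    """Tally predicted vs actual wins per team for a batch of games.

    `games`: list of dicts with assumed keys "home", "away", "home_won".
    `predict`: any callable returning the home team's win probability
    (a stand-in for the project's own prediction function).
    """
    predicted, actual = Counter(), Counter()
    correct = 0
    for game in games:
        home_pred = predict(game) >= 0.5
        predicted[game["home"] if home_pred else game["away"]] += 1
        actual[game["home"] if game["home_won"] else game["away"]] += 1
        correct += int(home_pred == game["home_won"])
    teams = sorted(set(predicted) | set(actual))
    # Map each team to (predicted wins, actual wins).
    table = {team: (predicted[team], actual[team]) for team in teams}
    return table, correct / len(games)


def toy_predict(game):
    # Placeholder model: favors the side with the better prior win rate.
    return 0.5 + (game["home_win_rate"] - game["away_win_rate"]) / 2


# Made-up schedule with placeholder team names.
week = [
    {"home": "AAA", "away": "BBB", "home_win_rate": 0.7, "away_win_rate": 0.5, "home_won": True},
    {"home": "BBB", "away": "CCC", "home_win_rate": 0.5, "away_win_rate": 0.6, "home_won": True},
    {"home": "CCC", "away": "AAA", "home_win_rate": 0.6, "away_win_rate": 0.7, "home_won": False},
]

table, acc = simulate_schedule(week, toy_predict)
for team, (pred_wins, real_wins) in table.items():
    print(team, pred_wins, real_wins)
print(f"accuracy: {acc:.2f}")
```

A per-team table of this shape is the kind of output that could feed a predicted-vs-actual wins chart.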

To visualize this data, we published Tableau dashboards displaying:

Predicted vs Actual Wins per team
Team-level shooting percentages in wins vs losses
Simulation performance charts over time

Week 5 — Smarter Features, Smarter Model

This was the week we leveled up — transforming our model into something context-aware and much more accurate.

We introduced several new game context features, including:

WIN_STREAK: Current streak to reflect momentum
REST_DAYS: Days since last game
Opponent_WinRate: Cumulative win rate of the opposing team up to that game
DateOrdinal: Numeric representation of game date for time-aware modeling
Team_WinRate: Rolling win rate of the team
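The features above could be derived in one chronological pass over the schedule. The sketch below is one possible reading, with several assumptions: the input keys ("home", "away", "home_won", "date") are hypothetical, WIN_STREAK counts consecutive wins and resets to 0 after a loss, and Team_WinRate is computed cumulatively for brevity rather than over a rolling window. Every feature is read from each team's state before the game is played, then the state is updated.

```python
from collections import defaultdict
from datetime import date


def build_context_rows(games):
    """Return one feature row per team per game, using only pre-game state."""
    state = defaultdict(lambda: {"wins": 0, "played": 0, "streak": 0, "last": None})
    rows = []
    for g in sorted(games, key=lambda g: g["date"]):
        sides = [
            (g["home"], g["away"], g["home_won"]),
            (g["away"], g["home"], not g["home_won"]),
        ]
        # First read both teams' pre-game state...
        for team, opp, won in sides:
            t, o = state[team], state[opp]
            rows.append({
                "team": team,
                "WIN_STREAK": t["streak"],
                "REST_DAYS": (g["date"] - t["last"]).days if t["last"] is not None else None,
                "Team_WinRate": t["wins"] / t["played"] if t["played"] else None,
                "Opponent_WinRate": o["wins"] / o["played"] if o["played"] else None,
                "DateOrdinal": g["date"].toordinal(),
                "WON": int(won),
            })
        # ...then update it with this game's result.
        for team, _, won in sides:
            t = state[team]
            t["wins"] += int(won)
            t["played"] += 1
            t["streak"] = t["streak"] + 1 if won else 0
            t["last"] = g["date"]
    return rows


# Made-up results with placeholder team names.
games = [
    {"home": "AAA", "away": "BBB", "home_won": True, "date": date(2024, 10, 22)},
    {"home": "CCC", "away": "AAA", "home_won": False, "date": date(2024, 10, 24)},
    {"home": "AAA", "away": "BBB", "home_won": False, "date": date(2024, 10, 26)},
]

rows = build_context_rows(games)
for row in rows:
    print(row)
```

Teams with no prior games get None for REST_DAYS and the win rates. How to fill those gaps is left open here.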

After updating our feature set, we retrained the model using GridSearchCV to fine-tune hyperparameters like max_depth, learning_rate, and n_estimators.
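For readers who have not used it, a grid search over those three hyperparameters might look roughly like this. To keep the sketch dependent on scikit-learn only, it swaps in GradientBoostingClassifier as a stand-in for XGBoost (XGBClassifier accepts the same three parameter names). The synthetic data, the grid values, the 3-fold setting, and the accuracy scoring are all assumptions, not the project's configuration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in for the engineered feature matrix and win/loss labels.
X, y = make_classification(n_samples=300, n_features=8, random_state=0)

# Illustrative grid; the values actually searched are not given in this post.
param_grid = {
    "max_depth": [2, 3],
    "learning_rate": [0.05, 0.1],
    "n_estimators": [50, 100],
}

search = GridSearchCV(
    GradientBoostingClassifier(random_state=0),  # stand-in for XGBoost
    param_grid,
    scoring="accuracy",
    cv=3,
)
search.fit(X, y)

print(search.best_params_)
print(round(search.best_score_, 3))
```

GridSearchCV fits every combination in the grid on each fold, so the cost grows multiplicatively with each added value. Here that is 8 combinations times 3 folds.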

We evaluated the new model on both the 2023–24 season (seen data) and the 2024–25 season (completely unseen data).

The results were promising: over 70% prediction accuracy on unseen data, a huge leap forward.

Stay Connected

This project is fully documented through videos and blog posts. You can follow every step:

Access Code: A step-by-step project implementation guide is available on my GitLab.

GitLab

YouTube Playlist:
