American Football (NFL)

This tutorial shows how to work with NFL tracking data from the Big Data Bowl competitions using the unravelsports package. With it you can:

- Load and process NFL tracking data
- Convert plays to graph structures
- Train Graph Neural Networks for play prediction
- Analyze player movements and formations

Interactive Notebook

A comprehensive Jupyter notebook walks through the entire process:

Big Data Bowl Guide
- Loading Big Data Bowl CSV files
- Converting to graphs
- Training GNN models
- Making predictions

Data Format

Big Data Bowl Data

The Big Data Bowl provides three main types of CSV files:

tracking_week*.csv: Player and ball tracking data, one file per week
- gameId: Unique game identifier
- playId: Play identifier (unique within a game)
- nflId: Player identifier
- frameId: Frame number within the play
- x, y: Position coordinates (yards)
- s: Speed (yards/second)
- a: Acceleration (yards/second²)
- dis: Distance traveled since the previous frame (yards)
- o: Orientation angle (degrees)
- dir: Direction of travel (degrees)

players.csv: Player information
- nflId: Player identifier
- height: Player height
- weight: Player weight
- position: Player position (QB, RB, WR, etc.)

plays.csv: Play-level information
- gameId, playId: Identifiers
- quarter: Quarter number
- down, yardsToGo: Down and distance
- possessionTeam: Team with possession
- offenseFormation: Formation name
- defendersInTheBox: Number of box defenders
- (and many more columns)
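
As a small illustration of how the s and dir columns relate, the sketch below splits a speed into x and y velocity components. It assumes that dir is given in degrees measured clockwise from the positive y-axis (so 90 degrees points along positive x); check the data dictionary of the release you are using before relying on this. The helper name velocity_components and the sample row are hypothetical and not part of unravelsports.

```python
import math


def velocity_components(speed: float, dir_deg: float) -> tuple[float, float]:
    """Split a speed into (vx, vy) components.

    ASSUMPTION: dir_deg is in degrees, measured clockwise from the
    positive y-axis, so 0 points along +y and 90 points along +x.
    """
    angle = math.radians(dir_deg)
    return speed * math.sin(angle), speed * math.cos(angle)


# Made-up example row, not real tracking data
row = {"x": 35.2, "y": 20.1, "s": 4.0, "dir": 90.0}
vx, vy = velocity_components(row["s"], row["dir"])
print(round(vx, 3), round(vy, 3))
```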

Basic Usage

Step 1: Load Data

Load the Big Data Bowl CSV files:

from unravel.american_football import BigDataBowlDataset

# Load data
bdb_dataset = BigDataBowlDataset(
    tracking_file_path="tracking_week_1.csv",
    players_file_path="players.csv",
    plays_file_path="plays.csv",
)

# View the data
print(bdb_dataset.dataset.head())

The resulting Polars DataFrame includes all tracking data merged with player and play information.
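
To make the merge concrete, here is a minimal standard-library sketch of the same idea: tracking rows are enriched with player attributes via nflId and with play attributes via (gameId, playId). This is only an illustration of the join keys, not how BigDataBowlDataset is implemented, and every value in the sample CSV text is made up.

```python
import csv
import io

# Made-up sample rows using the column names from the data format section
TRACKING_CSV = """gameId,playId,nflId,frameId,x,y,s
1,10,101,1,30.0,25.0,1.5
1,10,102,1,32.5,27.0,2.0
"""
PLAYERS_CSV = """nflId,height,weight,position
101,6-2,220,QB
102,5-11,205,RB
"""
PLAYS_CSV = """gameId,playId,quarter,down,yardsToGo,possessionTeam
1,10,1,2,7,KC
"""


def read_rows(text):
    """Parse CSV text into a list of dicts (all values stay strings)."""
    return list(csv.DictReader(io.StringIO(text)))


# Index the lookup tables by their join keys
players_by_id = {row["nflId"]: row for row in read_rows(PLAYERS_CSV)}
plays_by_key = {(row["gameId"], row["playId"]): row for row in read_rows(PLAYS_CSV)}

# Left-join player and play information onto each tracking row
merged = []
for row in read_rows(TRACKING_CSV):
    combined = dict(row)
    combined.update(players_by_id.get(row["nflId"], {}))
    combined.update(plays_by_key.get((row["gameId"], row["playId"]), {}))
    merged.append(combined)

print(merged[0]["position"], merged[0]["down"])
```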

Step 2: Add Labels and Graph IDs

For supervised learning, add labels and graph IDs:

from unravel.utils import add_dummy_label_column, add_graph_id_column

# Add dummy labels (replace these with real labels for actual tasks)
bdb_dataset.dataset = add_dummy_label_column(bdb_dataset.dataset)

# Create graph ID for each play
bdb_dataset.dataset = add_graph_id_column(
    bdb_dataset.dataset,
    by=["gameId", "playId"]
)
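
Conceptually, a graph ID built from gameId and playId ties every frame of a play to the same identifier. The sketch below shows that idea on plain dictionaries. The with_graph_id helper and the "gameId-playId" string format are hypothetical choices for illustration; the identifiers that add_graph_id_column actually produces may look different.

```python
from collections import defaultdict

# Made-up rows: two frames of one play, plus two other plays
rows = [
    {"gameId": 1, "playId": 10, "frameId": 1},
    {"gameId": 1, "playId": 10, "frameId": 2},
    {"gameId": 1, "playId": 11, "frameId": 1},
    {"gameId": 2, "playId": 10, "frameId": 1},
]


def with_graph_id(rows, by):
    """Return copies of rows with a 'graph_id' joined from the `by` columns."""
    return [
        {**row, "graph_id": "-".join(str(row[col]) for col in by)}
        for row in rows
    ]


labelled = with_graph_id(rows, by=["gameId", "playId"])

# Count how many frames share each graph ID
frames_per_graph = defaultdict(int)
for row in labelled:
    frames_per_graph[row["graph_id"]] += 1

print(dict(frames_per_graph))
```

Note that playId 10 appears in both games, which is why the key combines both columns.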

Step 3: Convert to Graphs

Convert tracking data to graph structures:

from unravel.american_football import AmericanFootballGraphConverter

converter = AmericanFootballGraphConverter(
    dataset=bdb_dataset,
    self_loop_ball=True,
    adjacency_matrix_connect_type="ball",
    adjacency_matrix_type="split_by_team",
    label_type="binary",
)

# Convert to PyTorch Geometric graphs
graphs = converter.to_pytorch_graphs()
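
To give a rough picture of what these settings could mean for a single frame, the sketch below builds a small adjacency matrix by hand. It is one plausible reading of the options, assumed for illustration only: "split_by_team" connects players to their teammates, "ball" connects the ball to every player, and self_loop_ball adds a ball-to-ball edge. The build_adjacency function is hypothetical, and the converter's actual construction may differ.

```python
def build_adjacency(teams, ball_label="ball", self_loop_ball=True):
    """Build an adjacency matrix from a list of per-node team labels.

    ASSUMED semantics (not taken from the library source):
    - nodes on the same team are connected, including to themselves
    - the ball is connected to every player
    - the ball connects to itself only if self_loop_ball is True
    """
    n = len(teams)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            i_ball = teams[i] == ball_label
            j_ball = teams[j] == ball_label
            if i_ball and j_ball:
                adj[i][j] = 1 if self_loop_ball else 0
            elif i_ball or j_ball:
                adj[i][j] = 1
            elif teams[i] == teams[j]:
                adj[i][j] = 1
    return adj


# Two offensive players, two defenders, and the ball
teams = ["offense", "offense", "defense", "defense", "ball"]
adjacency = build_adjacency(teams)
for matrix_row in adjacency:
    print(matrix_row)
```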

Step 4: Train a Model

Train a Graph Neural Network:

from unravel.utils import GraphDataset
from unravel.classifiers import PyGLightningCrystalGraphClassifier
import pytorch_lightning as pyl
from torch_geometric.loader import DataLoader

# Create dataset and split into train, test and validation sets (4:1:1 ratio)
dataset = GraphDataset(graphs=graphs, format="pyg")
train, test, val = dataset.split_test_train_validation(4, 1, 1)

# Create data loaders
train_loader = DataLoader(train, batch_size=32, shuffle=True)
val_loader = DataLoader(val, batch_size=32)
test_loader = DataLoader(test, batch_size=32)

# Initialize and train model
model = PyGLightningCrystalGraphClassifier(
    node_features=converter.n_node_features,
    edge_features=converter.n_edge_features,
    global_features=converter.n_graph_features,
)

trainer = pyl.Trainer(max_epochs=10)
trainer.fit(model, train_loader, val_loader)
trainer.test(model, test_loader)
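
The 4, 1, 1 arguments passed to the split above are read here as relative proportions for the train, test and validation sets. Under that assumption, the sketch below shows how such a ratio maps to set sizes. The split_by_ratio helper is hypothetical and only illustrates the arithmetic; it is not the library's splitting logic, which may also keep frames of the same graph ID together.

```python
import random


def split_by_ratio(items, train, test, val, seed=0):
    """Shuffle items and split them according to a train:test:val ratio."""
    items = list(items)
    random.Random(seed).shuffle(items)
    total = train + test + val
    n_train = len(items) * train // total
    n_test = len(items) * test // total
    # Whatever remains after train and test goes to validation
    return (
        items[:n_train],
        items[n_train:n_train + n_test],
        items[n_train + n_test:],
    )


# 60 stand-in graph indices split 4:1:1
train_ids, test_ids, val_ids = split_by_ratio(range(60), 4, 1, 1)
print(len(train_ids), len(test_ids), len(val_ids))
```

With 60 items this yields 40 training, 10 test and 10 validation items.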

Data Availability

Big Data Bowl data is released annually through Kaggle competitions:

- Big Data Bowl Homepage
- Previous years’ data is available for download
- Each release includes selected weeks from an NFL season
- Downloading requires a (free) Kaggle account