unravel.soccer.SoccerGraphConverter
- class unravel.soccer.SoccerGraphConverter
Convert soccer tracking data from Polars DataFrame to graph structures for GNN training.
This class transforms soccer tracking data into graph representations suitable for Graph Neural Networks. Each frame of tracking data becomes a graph with players and the ball as nodes, with edges representing spatial relationships or team affiliations.
The converter supports two GNN frameworks:
- PyTorch Geometric (recommended) via to_pytorch_graphs()
- Spektral (deprecated, Python 3.11 only) via to_spektral_graphs()
- Graph Structure:
Nodes: Players (home team, away team) and ball
Node Features: Position, velocity, acceleration, distances, angles (12 default features)
Edges: Defined by adjacency_matrix_type (team-based, spatial, or dense)
Edge Features: Distances, angles, relative velocities (6-7 default features)
Global Features: Optional match-level features attached to ball node
- Key Features:
Configurable node and edge feature engineering
Multiple adjacency matrix types (split_by_team, delaunay, dense)
Custom feature functions via decorators
Automatic padding for fixed-size graphs
Ball connection strategies (all players, carrier only, none)
Permutation invariance via random node ordering
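The team-based adjacency and the ball connection strategies listed above can be illustrated with a small dependency-free sketch. This is not the library's implementation: the helper name team_adjacency, its arguments, and the choice to connect each player to itself are assumptions made for illustration only. The connect values mirror the adjacency_matrix_connect_type options ('ball', 'ball_carrier', 'no_connection').

```python
from typing import List, Optional

BALL = "ball"  # marker used in this sketch for the ball node


def team_adjacency(
    teams: List[str],
    connect: str = "ball",
    carrier: Optional[int] = None,
    self_loop_ball: bool = False,
) -> List[List[int]]:
    """Build a 0/1 adjacency matrix where players link to teammates.

    Hypothetical helper: `teams` holds one entry per node, e.g.
    ["home", "home", "away", "ball"]. `carrier` is the index of the
    player in possession and is only used for connect="ball_carrier".
    """
    if connect not in ("ball", "ball_carrier", "no_connection"):
        raise ValueError(f"unknown connect type: {connect!r}")

    n = len(teams)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            ti, tj = teams[i], teams[j]
            if ti != BALL and tj != BALL:
                # Player-player edge: connected only within the same team
                # (this sketch also connects each player to itself).
                adj[i][j] = int(ti == tj)
            elif ti == BALL and tj == BALL:
                # Ball-ball entry: optional self loop.
                adj[i][j] = int(self_loop_ball)
            else:
                # Ball-player edge, governed by the connection strategy.
                player = j if ti == BALL else i
                if connect == "ball":
                    adj[i][j] = 1
                elif connect == "ball_carrier":
                    adj[i][j] = int(player == carrier)
                # "no_connection" leaves the entry at 0.
    return adj


adj = team_adjacency(
    ["home", "home", "away", "ball"],
    connect="ball_carrier",
    carrier=0,
    self_loop_ball=True,
)
for row in adj:
    print(row)
```

In this toy frame only the first home player (the assumed carrier) is linked to the ball, and the two teams form separate blocks.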
- Parameters:
dataset (KloppyPolarsDataset) – Polars dataset with tracking data. Must have been processed with add_graph_ids() and optionally add_dummy_labels().
chunk_size (int, optional) – Number of graphs to process simultaneously. Higher values use more memory but may be faster. Defaults to 20000.
non_potential_receiver_node_value (float, optional) – Node feature value (0-1) assigned to defending team players. Used to distinguish attackers from defenders. Defaults to 0.1.
edge_feature_funcs (List[Callable], optional) – Custom edge feature functions decorated with @graph_feature(type="edge"). If None, uses defaults. Defaults to None.
node_feature_funcs (List[Callable], optional) – Custom node feature functions decorated with @graph_feature(type="node"). If None, uses defaults. Defaults to None.
global_feature_cols (List[str], optional) – Column names from the dataset to use as graph-level features (e.g., match score, team ratings). Must be constant within each graph_id group. Defaults to empty list.
global_feature_type (Literal["ball", "all"], optional) – Where to attach global features. "ball" attaches to ball node only, "all" attaches to all nodes. Defaults to "ball".
additional_feature_cols (List[str], optional) – Extra columns from dataset to make available to custom feature functions (e.g., player height, position). Defaults to empty list.
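The requirement that global_feature_cols be constant within each graph_id group can be checked before constructing the converter. The following is a minimal sketch in plain Python over a list of row dictionaries; the helper name non_constant_global_features and the score_diff column are hypothetical, and the library may perform its own validation differently.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List, Tuple


def non_constant_global_features(
    rows: Iterable[Dict[str, Any]],
    graph_id_col: str,
    feature_cols: List[str],
) -> List[Tuple[Any, str]]:
    """Return (graph_id, column) pairs whose values vary within a graph.

    Hypothetical helper: an empty result means every candidate global
    feature is constant per graph and is safe to use as a graph-level value.
    """
    seen = defaultdict(set)
    for row in rows:
        for col in feature_cols:
            # Collect the distinct values observed per (graph, column).
            seen[(row[graph_id_col], col)].add(row[col])
    return sorted(key for key, values in seen.items() if len(values) > 1)


rows = [
    {"graph_id": 1, "score_diff": 0},
    {"graph_id": 1, "score_diff": 0},
    {"graph_id": 2, "score_diff": 1},
    {"graph_id": 2, "score_diff": 2},  # varies within graph 2
]
print(non_constant_global_features(rows, "graph_id", ["score_diff"]))
```

Here graph 2 would be flagged because score_diff takes two different values inside the same graph.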
- settings
Configuration for graph conversion including adjacency matrix type, padding, and feature settings.
- Type:
GraphSettingsPolars
- Raises:
ValueError – If dataset is not a KloppyPolarsDataset.
ValueError – If required columns (graph_id, label) are missing.
ValueError – If custom feature functions are not properly decorated.
Example
>>> from unravel.soccer import KloppyPolarsDataset, SoccerGraphConverter
>>> from kloppy import sportec
>>>
>>> # Load and prepare data
>>> kloppy_dataset = sportec.load_open_tracking_data(only_alive=True)
>>> polars_dataset = KloppyPolarsDataset(kloppy_dataset=kloppy_dataset)
>>> polars_dataset.add_dummy_labels(by=["frame_id"])
>>> polars_dataset.add_graph_ids(by=["frame_id"])
>>>
>>> # Create converter
>>> converter = SoccerGraphConverter(
...     dataset=polars_dataset,
...     self_loop_ball=True,
...     adjacency_matrix_connect_type="ball",
...     adjacency_matrix_type="split_by_team",
...     label_type="binary",
... )
>>>
>>> # Convert to PyTorch Geometric format
>>> graphs = converter.to_pytorch_graphs()
>>> print(f"Created {len(graphs)} graphs")
>>> print(f"Node features: {converter.n_node_features}")
>>> print(f"Edge features: {converter.n_edge_features}")
Note
For detailed configuration options, see GraphSettingsPolars. For custom features, see the graph_feature() decorator.
Warning
If not using padding (pad=False), graphs with incomplete player data (< 22 players) will be dropped. Use pad=True for variable-sized teams.
See also
KloppyPolarsDataset: Prepare tracking data.
GraphDataset: Wrap graphs for training.
graph_feature(): Create custom features.
../tutorials/soccer_gnn: Complete GNN training tutorial.
Graph FAQ: Detailed configuration guide.
- __init__(engine='auto', prediction=False, self_loop_ball=False, adjacency_matrix_connect_type='ball', adjacency_matrix_type='split_by_team', label_type='binary', defending_team_node_value=0.1, random_seed=False, pad=False, verbose=False, label_col=None, graph_id_col=None, sample_rate=None, dataset=None, chunk_size=20000, non_potential_receiver_node_value=0.1, edge_feature_funcs=<factory>, node_feature_funcs=<factory>, global_feature_cols=<factory>, global_feature_type='ball', additional_feature_cols=<factory>)
- Parameters:
engine (Literal['auto', 'gpu'])
prediction (bool)
self_loop_ball (bool)
adjacency_matrix_connect_type (Literal['ball', 'ball_carrier', 'no_connection'])
adjacency_matrix_type (Literal['delaunay', 'split_by_team', 'dense', 'dense_ap', 'dense_dp'])
label_type (Literal['binary'])
defending_team_node_value (float)
pad (bool)
verbose (bool)
label_col (str)
graph_id_col (str)
sample_rate (float)
dataset (KloppyPolarsDataset)
chunk_size (int)
non_potential_receiver_node_value (float)
edge_feature_funcs (List[Callable[[Dict[str, Any]], ndarray]])
node_feature_funcs (List[Callable[[Dict[str, Any]], ndarray]])
global_feature_type (Literal['ball', 'all'])
- Return type:
None
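The effect of the pad flag described in the warning above (frames with fewer than 22 players are dropped unless padding is enabled) can be pictured with a small sketch. This is only an illustration under stated assumptions: the helper pad_or_drop, the zero fill value, and the list-of-rows representation are hypothetical and do not describe how the library pads internally.

```python
from typing import List

Frame = List[List[float]]  # one feature row per player


def pad_or_drop(
    frames: List[Frame],
    n_players: int = 22,
    pad: bool = False,
    pad_value: float = 0.0,
) -> List[Frame]:
    """Keep complete frames; pad or drop frames with missing players.

    Hypothetical helper: with pad=False incomplete frames are skipped,
    with pad=True they are filled with constant rows up to n_players.
    """
    kept = []
    for rows in frames:
        missing = n_players - len(rows)
        if missing > 0:
            if not pad:
                continue  # incomplete frame is dropped
            width = len(rows[0]) if rows else 0
            # Build a new list so the caller's frame is left untouched.
            rows = rows + [[pad_value] * width for _ in range(missing)]
        kept.append(rows)
    return kept


complete = [[1.0, 2.0] for _ in range(22)]
incomplete = [[1.0, 2.0] for _ in range(21)]

print(len(pad_or_drop([complete, incomplete])))            # frames kept without padding
print(len(pad_or_drop([complete, incomplete], pad=True)))  # frames kept with padding
```

With pad=False only the complete frame survives; with pad=True both frames are kept and the short one is extended to 22 rows.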
Methods
__init__([engine, prediction, ...])
get_player_by_id(player_id)
get_players_by_team_id(team_id)
plot(file_path[, fps, timestamp, ...]): Plot tracking data as a static image or video file.
to_custom_dataset([include_object_ids]): Spektral requires a Spektral Dataset to load the data; for docs see https://graphneural.network/creating-dataset/
to_graph_dataset([include_object_ids]): Spektral requires a Spektral Dataset to load the data; for docs see https://graphneural.network/creating-dataset/
to_graph_frames([include_object_ids])
to_pickle(file_path[, verbose, ...]): Store the dict version of the graphs as a pickle; each graph is a dict with keys x, a, e, and y. To train with Spektral, feed the loaded pickle data to CustomDataset(data=pickled_data).
to_pyg_graphs([include_object_ids])
to_pytorch_graphs([include_object_ids]): Convert graph frames to PyTorch Geometric Data objects.
to_spektral_graphs([include_object_ids])
Attributes
adjacency_matrix_connect_type
adjacency_matrix_type
defending_team_node_value
engine
feature_opts
graph_frames
graph_id_col
label_col
label_type
pad
prediction
random_seed
return_dtypes
sample_rate
self_loop_ball
verbose
- dataset: KloppyPolarsDataset = None
- plot(file_path, fps=None, timestamp=None, end_timestamp=None, period_id=None, team_color_a='#CD0E61', team_color_b='#0066CC', ball_color='black', sort=True, color_by='ball_owning', anonymous=False, plot_type='full', show_label=True, show_ball_label=False, show_timestamp=True, next_closest_timestamp=False)
Plot tracking data as a static image or video file.
This method visualizes tracking data for players and the ball. It can generate either:
- A single PNG image (if either fps or end_timestamp is None, or both are None)
- An MP4 video (if both fps and end_timestamp are provided)
- Parameters:
file_path (str) – The output path where the PNG or MP4 file will be saved
fps (int, optional) – Frames per second for video output. If None, a static image is generated
timestamp (pl.duration, optional) – The starting timestamp to plot. If None, starts from the beginning of available data
end_timestamp (pl.duration, optional) – The ending timestamp for video output. If None, a static image is generated
period_id (int, optional) – ID of the match period to visualize. If None, all periods are included
team_color_a (str, default "#CD0E61") – Hex color code for Team A visualization
team_color_b (str, default "#0066CC") – Hex color code for Team B visualization
ball_color (str, default "black") – Color for ball visualization
color_by (Literal["ball_owning", "static_home_away"], default "ball_owning") – Method for coloring the teams:
- "ball_owning": Colors teams based on ball possession
- "static_home_away": Uses static colors for home and away teams
anonymous (bool, default False) – Whether to anonymize player labels
plot_type (Literal["pitch_only", "graph_only", "full"], default "full") – Type of plot to generate:
- "pitch_only": Shows only the soccer pitch visualization
- "graph_only": Shows only the graph features (node features, adjacency matrix, edge features)
- "full": Shows both pitch and graph visualizations
show_pitch_label (bool, default True) – Whether to show the label on the pitch visualization
show_pitch_timestamp (bool, default True) – Whether to show the timestamp on the pitch visualization
next_closest_timestamp (bool, default False) – When plotting a .png and the given timestamp does not exactly match a frame, use the next closest valid timestamp instead.
sort (bool)
show_label (bool)
show_ball_label (bool)
show_timestamp (bool)
- Returns:
The function saves the output file to the specified file_path and does not return a value.
- Return type:
None
Notes
Output file type is determined by parameters:
- PNG: Generated when either fps or end_timestamp is None, or both are None
- MP4: Generated when both fps and end_timestamp are provided
- Raises:
ValueError – If file extension doesn’t match the parameters provided (e.g., .mp4 extension but missing fps or end_timestamp, or .png extension with both fps and end_timestamp)
- Parameters:
file_path (str)
fps (int)
timestamp (duration)
end_timestamp (duration)
period_id (int)
team_color_a (str)
team_color_b (str)
ball_color (str)
sort (bool)
color_by (Literal['ball_owning', 'static_home_away'])
anonymous (bool)
plot_type (Literal['pitch_only', 'graph_only', 'full'])
show_label (bool)
show_ball_label (bool)
show_timestamp (bool)
next_closest_timestamp (bool)
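The PNG versus MP4 rule and the extension check described for plot() can be summarized in a short standalone sketch. The helper expected_plot_output is hypothetical and is not part of unravel; it only mirrors the documented rule so a file path can be sanity-checked before calling plot(). The standard-library timedelta stands in for the duration type here.

```python
from datetime import timedelta
from pathlib import Path
from typing import Any, Optional


def expected_plot_output(
    file_path: str,
    fps: Optional[int] = None,
    end_timestamp: Optional[Any] = None,
) -> str:
    """Return "png" or "mp4" following the documented rule.

    Hypothetical helper: a video is expected only when both fps and
    end_timestamp are given; otherwise a static image is expected.
    Raises ValueError when the file extension disagrees with that.
    """
    kind = "mp4" if fps is not None and end_timestamp is not None else "png"
    suffix = Path(file_path).suffix.lower().lstrip(".")
    if suffix != kind:
        raise ValueError(
            f"extension .{suffix} does not match expected output type .{kind}"
        )
    return kind


print(expected_plot_output("frame.png"))
print(expected_plot_output("clip.mp4", fps=25, end_timestamp=timedelta(seconds=10)))
```

Passing an .mp4 path with only fps set would raise ValueError in this sketch, matching the mismatch case listed under Raises.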