A comprehensive Python package for calculating Shapley values in cooperative game theory. Shapley values provide a mathematically principled method for fairly distributing payoffs among players based on their marginal contributions to coalitions.
The Shapley value is a solution concept from cooperative game theory that assigns a unique distribution (among the players) of a total surplus generated by the coalition of all players. This package provides four approaches:

| Class | When to use | Scales to |
|---|---|---|
| `ShapleyCombinations` | Pre-defined coalition values (exact) | ~15 players |
| `ShapleyValue` | Pre-defined coalition values, alternative algorithm | ~15 players |
| `ShapleyValueCalculator` | Dynamic evaluation function (exact, parallel) | ~20 players |
| `MonteCarloShapleyValue` | Dynamic evaluation function (approximate, parallel) | 100+ players |

```bash
pip install shapley-value
```
```python
from shapley_value import ShapleyCombinations

players = ['Alice', 'Bob', 'Charlie']
coalition_values = {
    (): 0,
    ('Alice',): 10, ('Bob',): 20, ('Charlie',): 30,
    ('Alice', 'Bob'): 50, ('Alice', 'Charlie'): 60, ('Bob', 'Charlie'): 70,
    ('Alice', 'Bob', 'Charlie'): 100,
}

calculator = ShapleyCombinations(players)
shapley_values = calculator.calculate_shapley_values(coalition_values)
# Shapley values for this game (rounded): {'Alice': 23.33, 'Bob': 33.33, 'Charlie': 43.33}
```
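As a cross-check that does not depend on the package, the Shapley values of the three-player game above can be computed directly from the textbook formula, which weights each marginal contribution by |S|! (n - |S| - 1)! / n!. The function name `exact_shapley` is hypothetical and this is only a minimal sketch, not the package's implementation; it assumes every coalition key lists players in the same order as `players`.

```python
from itertools import combinations
from math import factorial

def exact_shapley(players, coalition_values):
    """Exact Shapley values from a dict keyed by coalition tuples.

    Assumes each key lists its members in the same order as `players`.
    """
    n = len(players)
    values = {}
    for player in players:
        others = [p for p in players if p != player]
        total = 0.0
        for size in range(n):
            # Weight shared by every coalition S of this size that excludes `player`.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                # Rebuild S plus the player, in canonical player order.
                with_player = tuple(p for p in players if p in subset or p == player)
                total += weight * (coalition_values[with_player] - coalition_values[subset])
        values[player] = total
    return values

players = ['Alice', 'Bob', 'Charlie']
coalition_values = {
    (): 0,
    ('Alice',): 10, ('Bob',): 20, ('Charlie',): 30,
    ('Alice', 'Bob'): 50, ('Alice', 'Charlie'): 60, ('Bob', 'Charlie'): 70,
    ('Alice', 'Bob', 'Charlie'): 100,
}
shapley = exact_shapley(players, coalition_values)
print({p: round(v, 2) for p, v in shapley.items()})
# {'Alice': 23.33, 'Bob': 33.33, 'Charlie': 43.33}
```

The three values sum to 100, the value of the grand coalition, which is the efficiency property of the Shapley value.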
```python
from shapley_value import MonteCarloShapleyValue

def my_model(coalition):
    """Any callable: ML model, simulation, or analytic function."""
    return sum(coalition) ** 0.9 if coalition else 0.0

mc = MonteCarloShapleyValue(
    my_model,
    players=list(range(50)),  # 50 players: exact is infeasible
    num_samples=5000,
    random_seed=42,
    n_jobs=-1,                # use all CPU cores
)
values = mc.calculate_shapley_values()  # Dict[player, float]
```
Use `ShapleyCombinations` when every coalition value is already known:

```python
from shapley_value import ShapleyCombinations

players = ['Player1', 'Player2', 'Player3']
coalition_values = {
    (): 0,
    ('Player1',): 100, ('Player2',): 200, ('Player3',): 300,
    ('Player1', 'Player2'): 450,
    ('Player1', 'Player3'): 500,
    ('Player2', 'Player3'): 600,
    ('Player1', 'Player2', 'Player3'): 900,
}

calculator = ShapleyCombinations(players)
shapley_values = calculator.calculate_shapley_values(coalition_values)
print(shapley_values)
```
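Because a pre-defined game needs a value for each of the 2ⁿ coalitions, it can help to check the dictionary for gaps before computing anything. The helper below is a hypothetical sketch, not part of the package; it assumes keys are tuples whose members appear in the same order as `players`, as in the example above.

```python
from itertools import combinations

def missing_coalitions(players, coalition_values):
    """Return every coalition (tuple in `players` order) absent from the dict."""
    missing = []
    for size in range(len(players) + 1):
        for coalition in combinations(players, size):
            if coalition not in coalition_values:
                missing.append(coalition)
    return missing

players = ['Player1', 'Player2', 'Player3']
coalition_values = {
    (): 0,
    ('Player1',): 100, ('Player2',): 200, ('Player3',): 300,
    ('Player1', 'Player2'): 450,
    ('Player1', 'Player3'): 500,
    # ('Player2', 'Player3') deliberately left out for the demonstration
    ('Player1', 'Player2', 'Player3'): 900,
}
gaps = missing_coalitions(players, coalition_values)
print(gaps)  # [('Player2', 'Player3')]
```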
Use `ShapleyValueCalculator` when a callable computes the coalition value and you have at most ~20 players:

```python
from shapley_value import ShapleyValueCalculator

def profit_function(coalition):
    if not coalition:
        return 0
    return sum(coalition) + len(coalition) * 10  # synergy bonus

players = [100, 200, 300]
calculator = ShapleyValueCalculator(profit_function, players, n_jobs=-1)
shapley_values = calculator.calculate_shapley_values()
print(shapley_values)

raw_data = calculator.get_raw_data()      # detailed per-coalition data
calculator.save_raw_data('analysis.csv')  # export to CSV
```

`n_jobs` follows the scikit-learn convention: 1 = sequential, -1 = all cores, k = exactly k cores.
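To make the `n_jobs` convention concrete, the sketch below maps the three documented cases to a worker count. `resolve_n_jobs` is a hypothetical helper written for illustration; the package may resolve the value differently, and other negative values are rejected here only because this README does not describe them.

```python
import os

def resolve_n_jobs(n_jobs):
    """Map a scikit-learn-style n_jobs value to a concrete worker count."""
    cpu_total = os.cpu_count() or 1  # os.cpu_count() may return None
    if n_jobs == -1:
        return cpu_total             # all available cores
    if n_jobs >= 1:
        return n_jobs                # 1 = sequential, k = exactly k workers
    raise ValueError(f"unsupported n_jobs value: {n_jobs}")

print(resolve_n_jobs(1))   # 1
print(resolve_n_jobs(4))   # 4
```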
Use `MonteCarloShapleyValue` for large games (20+ players) where exact computation is infeasible.
Complexity is O(m × n), where m = `num_samples` and n = number of players,
versus O(2ⁿ) for exact methods.

```python
from shapley_value import MonteCarloShapleyValue

def complex_model(coalition):
    """Expensive callable, e.g. a trained ML model."""
    if not coalition:
        return 0.0
    return float(sum(p ** 1.2 for p in coalition))

players = list(range(1, 51))  # 50 players
mc = MonteCarloShapleyValue(
    complex_model,
    players=players,
    num_samples=5000,  # more samples -> lower error
    random_seed=0,     # set for reproducibility
    n_jobs=-1,         # parallelise across all CPU cores
)

# Core result
values = mc.calculate_shapley_values()

# Diagnose how many samples you need
convergence_df = mc.get_convergence_data()  # DataFrame shape (5000, 50)

# Inspect per-permutation marginal contributions
raw_df = mc.get_raw_data()  # columns: iteration, permutation, player, marginal_contribution
```
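The O(m × n) cost comes from permutation sampling: each sample shuffles the players once and evaluates the coalition as it grows one player at a time. The following is a minimal standard-library sketch of that general technique, not the package's actual code; `sample_shapley`, `additive_game` and `call_counter` are names made up for the illustration.

```python
import random

def sample_shapley(evaluate, players, num_samples, seed=None):
    """Estimate Shapley values by averaging marginal contributions
    over randomly sampled player orderings."""
    rng = random.Random(seed)
    totals = {player: 0.0 for player in players}
    for _ in range(num_samples):
        order = list(players)
        rng.shuffle(order)
        coalition = []
        previous = evaluate(coalition)  # value of the empty coalition
        for player in order:
            coalition.append(player)
            current = evaluate(coalition)
            totals[player] += current - previous
            previous = current
    return {player: total / num_samples for player, total in totals.items()}

call_counter = {"calls": 0}

def additive_game(coalition):
    """Toy game whose value is the sum of its members."""
    call_counter["calls"] += 1
    return float(sum(coalition))

players = [1, 2, 3, 4]
estimates = sample_shapley(additive_game, players, num_samples=200, seed=0)
print(estimates)               # {1: 1.0, 2: 2.0, 3: 3.0, 4: 4.0}
print(call_counter["calls"])   # 1000 = 200 samples x (4 players + 1)
```

In an additive game every marginal contribution equals the player's own value, so the estimate is exact here; for games with interactions the estimate carries sampling error that shrinks as `num_samples` grows.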
Suggested `num_samples` by number of players:

| Players | Suggested `num_samples` | Typical error |
|---|---|---|
| ≤ 10 | 1 000 | < 0.5 % |
| 10–30 | 5 000 | < 1 % |
| 30–100 | 10 000–50 000 | 1–3 % |

Use `get_convergence_data()` to plot running estimates and verify convergence
for your specific game.
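A running estimate is the mean of the marginal contributions seen so far, and one rough convergence check is how much that mean still moves over the last stretch of iterations. The sketch below uses synthetic data and only the standard library; `running_means` and `tail_spread` are hypothetical helpers, and it makes no assumption about the layout of the DataFrame returned by `get_convergence_data()` beyond what is stated above.

```python
import random
from itertools import accumulate

def running_means(samples):
    """Running mean after each sample: the quantity a convergence plot shows."""
    return [total / count
            for count, total in enumerate(accumulate(samples), start=1)]

def tail_spread(running, tail_fraction=0.2):
    """Max minus min of the running estimate over the final fraction of iterations."""
    start = int(len(running) * (1 - tail_fraction))
    tail = running[start:]
    return max(tail) - min(tail)

rng = random.Random(0)
# Synthetic marginal contributions for one player: true value 5.0 plus noise.
samples = [rng.gauss(5.0, 1.0) for _ in range(2000)]
running = running_means(samples)

early = tail_spread(running[:200])  # spread while the estimate is typically still moving
late = tail_spread(running)         # spread over the last 20 % of all 2000 iterations
print(f"final estimate {running[-1]:.3f}, early spread {early:.4f}, late spread {late:.4f}")
```

If the late spread is small relative to your accuracy target, the current `num_samples` is probably sufficient; otherwise increase it and check again.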
`n_jobs` quick reference:

```python
from shapley_value import MonteCarloShapleyValue

# f is your evaluation function; players is your list of players.

# Sequential (default: no overhead, good for small games / debugging)
mc = MonteCarloShapleyValue(f, players, n_jobs=1)

# All available CPU cores
mc = MonteCarloShapleyValue(f, players, n_jobs=-1)

# Exactly 4 cores
mc = MonteCarloShapleyValue(f, players, n_jobs=4)
```
Note: Permutations are generated in a fixed order before the parallel step, so
`random_seed` guarantees bit-identical results regardless of `n_jobs`.
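The idea behind that guarantee can be sketched in a few lines: draw every permutation from one seeded generator before any work is dispatched, then combine the results in submission order. This illustration uses the standard library's `ThreadPoolExecutor` rather than whatever backend the package uses, and the names `marginals_for` and `estimate` are hypothetical.

```python
import random
from concurrent.futures import ThreadPoolExecutor

def marginals_for(order, evaluate):
    """Marginal contribution of each player along one permutation."""
    coalition, previous, result = [], evaluate([]), {}
    for player in order:
        coalition.append(player)
        current = evaluate(coalition)
        result[player] = current - previous
        previous = current
    return result

def estimate(evaluate, players, num_samples, seed, workers):
    rng = random.Random(seed)
    # All permutations are drawn up front, in one thread, from one seeded generator.
    orders = [rng.sample(players, len(players)) for _ in range(num_samples)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in submission order, whichever worker finishes first.
        per_permutation = list(pool.map(lambda o: marginals_for(o, evaluate), orders))
    return {p: sum(m[p] for m in per_permutation) / num_samples for p in players}

def game(coalition):
    return sum(coalition) ** 0.9 if coalition else 0.0

players = list(range(1, 9))
sequential = estimate(game, players, num_samples=300, seed=42, workers=1)
parallel = estimate(game, players, num_samples=300, seed=42, workers=4)
print(sequential == parallel)  # True
```

Because the permutations and the order of summation are identical in both runs, the floating-point results match exactly and not just approximately.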
`ShapleyCombinations`

```python
class ShapleyCombinations:
    def __init__(self, players: List[Any]): ...
    def calculate_shapley_values(
        self, coalition_values: Dict[Tuple, float]
    ) -> Dict[Any, float]: ...
    def get_all_coalitions(self) -> List[Tuple]: ...
```
`ShapleyValueCalculator`

```python
class ShapleyValueCalculator:
    def __init__(
        self,
        evaluation_function: Callable[[List[Any]], float],
        players: List[Any],
        n_jobs: int = 1,  # sklearn-style: 1=seq (default), -1=all cores, k=k cores
    ): ...
    def calculate_shapley_values(self) -> Dict[Any, float]: ...
    def get_raw_data(self) -> pd.DataFrame: ...
    def save_raw_data(self, file_path: str) -> None: ...
```
`MonteCarloShapleyValue`

```python
class MonteCarloShapleyValue:
    def __init__(
        self,
        evaluation_function: Callable[[List[Any]], float],
        players: List[Any],
        num_samples: int = 1000,  # permutations to sample
        random_seed: Optional[int] = None,
        n_jobs: int = 1,  # sklearn-style parallelism
    ): ...
    def calculate_shapley_values(self) -> Dict[Any, float]: ...
    def get_convergence_data(self) -> pd.DataFrame: ...  # running estimates per iteration
    def get_raw_data(self) -> pd.DataFrame: ...  # per-permutation detail
```
`ShapleyValue`

Low-level calculator using the weighted-marginal-contribution formula directly.

```python
class ShapleyValue:
    def __init__(
        self,
        players: List[Any],
        coalition_values: Dict[Tuple[Any, ...], float],
    ): ...
    def calculate_shapley_values(self) -> Dict[Any, float]: ...
```
- `n_jobs` on both `ShapleyValueCalculator` and `MonteCarloShapleyValue`: 1 = sequential, -1 = all cores, k = exactly k cores
- `random_seed` in `MonteCarloShapleyValue` gives bit-identical output regardless of `n_jobs`
- `get_convergence_data()` lets you plot running estimates and tune `num_samples` for your accuracy target
- `get_raw_data()` and `save_raw_data()` for downstream analysis

Exact methods (`ShapleyCombinations` / `ShapleyValueCalculator`): exact computation enumerates all 2ⁿ coalitions and becomes impractical beyond ~20 players.

| Players | Coalitions | Sequential | Parallel (`n_jobs=-1`) | Speedup |
|---|---|---|---|---|
| 5 | 32 | < 0.001 s | < 0.001 s | 1× |
| 10 | 1 024 | ~ 6.6 s | ~ 1.1 s | ~ 6× |
| 12 | 4 096 | ~ 31 s | ~ 2.6 s | ~ 12× |
| 15 | 32 768 | ~ 4 min | ~ 20 s | ~ 12× |

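The coalition counts above are simply 2ⁿ, which is why exact methods stop scaling so quickly. A short standard-library sketch that enumerates the coalitions confirms the figures; `all_coalitions` is a hypothetical helper, not the package's `get_all_coalitions`.

```python
from itertools import combinations

def all_coalitions(players):
    """Yield every subset of `players`, from the empty coalition to the grand one."""
    players = list(players)
    for size in range(len(players) + 1):
        yield from combinations(players, size)

counts = {n: sum(1 for _ in all_coalitions(range(n))) for n in (5, 10, 12, 15)}
print(counts)  # {5: 32, 10: 1024, 12: 4096, 15: 32768}
```

Each extra player doubles the number of coalitions to evaluate, so going from 15 to 20 players multiplies the work by 32.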
`MonteCarloShapleyValue`: O(m × n) cost, scales to 100+ players. Parallelism is beneficial when the evaluation function is expensive (e.g. ML model inference, simulations).

| Players | `num_samples` | `n_jobs=1` | `n_jobs=-1` | Notes |
|---|---|---|---|---|
| 20 | 5 000 | < 1 s | < 1 s | cheap eval function |
| 50 | 5 000 | < 1 s | < 1 s | cheap eval function |
| 50 | 10 000 | ~ 2 s | ~ 0.7 s | expensive eval (~1 ms/call) |
| 100 | 10 000 | ~ 5 s | ~ 1.5 s | expensive eval (~1 ms/call) |

Rule of thumb: if your evaluation function is cheap (< 10 µs), use
`n_jobs=1` to avoid joblib process-spawn overhead. If it's expensive (ML model, simulation), set `n_jobs=-1` for maximum throughput.
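One way to apply this rule of thumb is to time the evaluation function on a representative coalition before choosing `n_jobs`. The helpers below are a hypothetical sketch using only the standard library. Treating everything at or above 10 µs as worth parallelising is a simplification made for the example, since the rule only describes the two extremes.

```python
import time

def mean_call_seconds(evaluate, coalition, repeats=1000):
    """Average wall-clock time of one evaluate(coalition) call."""
    start = time.perf_counter()
    for _ in range(repeats):
        evaluate(coalition)
    return (time.perf_counter() - start) / repeats

def suggest_n_jobs(seconds_per_call, threshold=10e-6):
    """Stay sequential below the threshold, otherwise use all cores."""
    return 1 if seconds_per_call < threshold else -1

def cheap_game(coalition):
    return float(sum(coalition))

per_call = mean_call_seconds(cheap_game, list(range(50)))
print(f"{per_call * 1e6:.2f} µs per call -> n_jobs={suggest_n_jobs(per_call)}")
```

Timings vary by machine, so it is worth measuring with a coalition size typical of your game.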
```bash
# Install dev dependencies
pip install -e ".[dev,examples,performance]"

# Run the full test suite (53 tests)
python -m pytest tests/ -v

# Run only stress / performance tests (with printed benchmark table)
python -m pytest tests/test_montecarlo_stress.py -v -s
```
The suite is organised into:

| File | Tests | Coverage |
|---|---|---|
| `test_calculator.py` | 11 | `ShapleyValue`, `ShapleyCombinations`, `ShapleyValueCalculator` |
| `test_montecarlo.py` | 29 | Correctness, reproducibility, parallelism, edge cases |
| `test_montecarlo_stress.py` | 13 | 20–100 player stress, timing bounds, benchmark table |

We welcome contributions!
1. Create a feature branch (`git checkout -b feature/amazing-feature`)
2. Commit your changes (`git commit -m 'Add amazing feature'`)
3. Push the branch (`git push origin feature/amazing-feature`)

Development setup:

```bash
git clone https://github.com/Bowenislandsong/shapley-value.git
cd shapley-value
pip install -e ".[dev,examples,performance]"
python -m pytest tests/
```
This project is licensed under the MIT License; see the LICENSE file for details.
If you use this package in your research or project, please cite it as:
```bibtex
@software{song2026shapley,
  author = {Song, Bowen},
  title = {Shapley Value Calculator},
  year = {2026},
  publisher = {GitHub},
  url = {https://github.com/Bowenislandsong/shapley-value},
  version = {0.0.9}
}
```
APA Format:
Song, B. (2026). Shapley Value Calculator (Version 0.0.9) [Computer software]. https://github.com/Bowenislandsong/shapley-value
For more citation formats see CITATION.cff.
- `v*` tag push; citation / version metadata
- `ShapleyValueCalculator` uses `n_jobs` (sklearn-style); doc and landing-page sync; example runner includes Monte Carlo example
- `MonteCarloShapleyValue` with sklearn-style `n_jobs`; comprehensive stress tests; convergence and raw-data diagnostics

For more information about Shapley values and cooperative game theory, see the Wikipedia article.