Tutorial: Stochastic Choice#
This tutorial covers stochastic choice models: random utility models, IIA testing, and regularity conditions.
Topics covered:
Random utility models (logit, Luce)
Independence of Irrelevant Alternatives (IIA)
Regularity conditions
Model diagnostics
Prerequisites#
Python 3.10+
Completed Tutorial 2: Menu-Based Choice
Basic econometrics knowledge (for stochastic models)
Note
This tutorial covers Chapter 13 (Stochastic Choice) of Chambers & Echenique (2016).
The Data (StochasticChoiceLog)#
A StochasticChoiceLog stores probabilistic choice data: the same menu
presented multiple times, with potentially different choices each time.
from prefgraph import StochasticChoiceLog
# A menu presented 100 times with observed choice frequencies
log = StochasticChoiceLog(
menus=[
frozenset({0, 1, 2}), # Menu 1: items 0, 1, 2
frozenset({0, 1}), # Menu 2: items 0, 1
frozenset({1, 2}), # Menu 3: items 1, 2
],
choice_frequencies=[
{0: 60, 1: 30, 2: 10}, # Menu 1: 60% chose 0, 30% chose 1, 10% chose 2
{0: 70, 1: 30}, # Menu 2: 70% chose 0
{1: 55, 2: 45}, # Menu 3: 55% chose 1
],
item_labels=["Apple", "Banana", "Cherry"],
)
print(f"Number of menus: {log.num_menus}")
print(f"Unique items: {log.num_items}")
# Get choice probability
p_apple_menu1 = log.get_choice_probability(0, 0)
print(f"P(Apple | Menu 1) = {p_apple_menu1:.2f}")
Output:
Number of menus: 3
Unique items: 3
P(Apple | Menu 1) = 0.60
From Repeated Choices#
Create from deterministic repeated observations:
# Same menu observed 10 times with different choices
menus = [frozenset({0, 1, 2})] * 10
choices = [0, 0, 0, 1, 0, 0, 2, 0, 1, 0] # 6 chose 0, 2 chose 1, 2 chose 2
log = StochasticChoiceLog.from_repeated_choices(menus, choices)
print(log.get_choice_probabilities(0)) # {0: 0.6, 1: 0.2, 2: 0.2}
Random Utility Models#
The random utility model (RUM) assumes:
where \(V_i\) is deterministic utility and \(\epsilon_i\) is random. The consumer chooses the item with highest total utility.
Different assumptions about \(\epsilon\) lead to different models:
Model |
Error Distribution |
Key Property |
|---|---|---|
Logit |
Gumbel (Type I Extreme Value) |
IIA holds |
Probit |
Multivariate Normal |
Flexible substitution |
Luce |
Implicit (ratio model) |
IIA holds |
Fitting a Logit Model#
from prefgraph import fit_random_utility_model
# Fit logit model to stochastic choice data
result = fit_random_utility_model(
log,
model_type="logit",
max_iterations=1000,
)
print(f"Model type: {result.model_type}")
print(f"Estimated utilities: {result.parameters}")
print(f"Log-likelihood: {result.log_likelihood:.2f}")
print(f"AIC: {result.aic:.2f}")
print(f"BIC: {result.bic:.2f}")
print(f"Satisfies IIA: {result.satisfies_iia}")
Output:
Model type: logit
Estimated utilities: {'scale': 1.0, 'convergence': 1.0}
Log-likelihood: -89.34
AIC: 184.68
BIC: 190.12
Satisfies IIA: True
Full Summary Report#
print(result.summary())
================================================================================
STOCHASTIC CHOICE MODEL REPORT
================================================================================
Status: RUM VIOLATIONS
Model Type: logit
Model Fit:
---------
Log-Likelihood ............... -222.2062
AIC ........................... 450.4125
BIC ........................... 461.5238
Satisfies IIA ....................... No
Regularity Violations ................ 0
Model Parameters:
----------------
scale: 1.0000
convergence: 1.0000
Interpretation:
--------------
IIA violated - choice probabilities context-dependent.
Computation Time: 1.70 ms
================================================================================
Predicting Choice Probabilities#
from prefgraph import estimate_choice_probabilities
# Get predicted probabilities from fitted utilities
utilities = result.choice_probabilities[:3] # First 3 items
print(f"Predicted choice probabilities: {utilities}")
Output:
Predicted choice probabilities: [0.6 0.3 0.1]
Testing McFadden Axioms#
McFadden’s axioms characterize random utility maximization:
Regularity: \(P(x|A) \geq P(x|B)\) when \(A \subseteq B\) (removing alternatives shouldn’t decrease choice probability)
IIA: \(\frac{P(x|A)}{P(y|A)} = \frac{P(x|B)}{P(y|B)}\) (relative odds are constant across menus)
from prefgraph import test_mcfadden_axioms
axiom_results = test_mcfadden_axioms(log)
print(f"Satisfies IIA: {axiom_results['satisfies_iia']}")
print(f"Satisfies regularity: {axiom_results['satisfies_regularity']}")
print(f"RUM consistent: {axiom_results['is_rum_consistent']}")
if not axiom_results['satisfies_regularity']:
print(f"Regularity violations: {axiom_results['regularity_violations']}")
Output:
Satisfies IIA: False
Satisfies regularity: True
RUM consistent: False
Regularity Axiom Testing#
The regularity axiom is a fundamental property of random utility models. It states that adding options to a menu should never increase the probability of choosing any particular item:
Intuition: if you choose pizza 60% of the time from {pizza, burger}, adding salad shouldn’t make you choose pizza more often. If it does, something beyond simple utility maximization is at play.
from prefgraph import test_regularity
result = test_regularity(stochastic_log, tolerance=0.01)
if result.satisfies_regularity:
print("No decoy/context effects detected")
else:
print(f"Violations: {len(result.violations)}")
print(f"Violation rate: {result.violation_rate:.1%}")
if result.worst_violation:
v = result.worst_violation
print(f"Worst: item {v.item}, P increased by {v.magnitude:.2%}")
Output:
Violations: 2
Violation rate: 8.3%
Worst: item 0, P increased by 5.2%
What Violations Mean#
Regularity violations indicate that choice probabilities are context-dependent:
Cause |
Description |
|---|---|
Decoy effect |
An inferior option makes a similar option look better |
Attraction effect |
Adding a dominated alternative boosts the dominant one |
Compromise effect |
Middle options gain share when extremes are added |
Consideration sets |
Larger menus change which items are noticed |
Detailed Violation Analysis#
The result includes detailed information about each violation:
for v in result.violations[:3]:
print(f"Item {v.item}:")
print(f" Smaller menu (idx {v.subset_menu_idx}): P = {v.prob_in_subset:.2%}")
print(f" Larger menu (idx {v.superset_menu_idx}): P = {v.prob_in_superset:.2%}")
print(f" Increase: {v.magnitude:.2%}")
Output:
Item 0:
Smaller menu (idx 1): P = 55.0%
Larger menu (idx 0): P = 60.2%
Increase: 5.2%
Item 1:
Smaller menu (idx 2): P = 48.0%
Larger menu (idx 0): P = 50.5%
Increase: 2.5%
Full Summary Report#
print(result.summary())
================================================================================
REGULARITY TEST REPORT
================================================================================
Status: VIOLATIONS DETECTED
Metrics:
-------
Satisfies Regularity ................. No
Number of Violations .................. 2
Testable Pairs ........................ 24
Violation Rate ..................... 8.3%
Worst Violation:
---------------
Item ................................... 0
P(smaller menu) ................... 55.0%
P(larger menu) .................... 60.2%
Magnitude .......................... 5.2%
Interpretation:
--------------
Regularity violations suggest context-dependent choice.
This could indicate decoy effects, attraction effects, or
consideration set changes. Standard logit may be inappropriate.
Computation Time: 0.85 ms
================================================================================
When to Use This Test#
Use regularity testing when:
Validating RUM assumptions - Regularity is necessary for random utility
Detecting context effects - Decoy effects violate regularity
A/B testing analysis - Adding options shouldn’t boost existing ones
Menu design - Understanding how options affect each other
Testing IIA (Independence of Irrelevant Alternatives)#
The IIA property is tested by checking if relative odds are stable:
from prefgraph import check_independence_irrelevant_alternatives
satisfies_iia = check_independence_irrelevant_alternatives(
log,
tolerance=0.1, # Allow 10% coefficient of variation
)
print(f"IIA holds: {satisfies_iia}")
Output:
IIA holds: False
The Red Bus / Blue Bus Problem#
A famous example where IIA fails:
# Without blue bus: P(car) = P(red bus) = 0.5
# With blue bus: if IIA holds, P(car) = P(red bus) = P(blue bus) = 0.33
# But blue bus should steal mainly from red bus, not car!
log_without_blue = StochasticChoiceLog(
menus=[frozenset({0, 1})], # car, red bus
choice_frequencies=[{0: 50, 1: 50}],
item_labels=["Car", "Red Bus"],
)
log_with_blue = StochasticChoiceLog(
menus=[frozenset({0, 1, 2})], # car, red bus, blue bus
choice_frequencies=[{0: 50, 1: 25, 2: 25}], # Realistic: blue steals from red
item_labels=["Car", "Red Bus", "Blue Bus"],
)
# This violates IIA: P(car)/P(red) changed from 1.0 to 2.0
print("Without blue bus: P(car)/P(red bus) = 1.0")
print("With blue bus: P(car)/P(red bus) = 2.0")
print("IIA violated!")
Output:
Without blue bus: P(car)/P(red bus) = 1.0
With blue bus: P(car)/P(red bus) = 2.0
IIA violated!
Application: A/B Testing for Product Features#
Analyze a recommendation system’s click data:
import numpy as np
from prefgraph import (
StochasticChoiceLog,
fit_random_utility_model,
test_mcfadden_axioms,
)
np.random.seed(42)
# Simulate click data: 5 items, various recommendation slates
n_items = 5
item_labels = ["News", "Sports", "Tech", "Entertainment", "Science"]
# True utilities (unknown to us, we'll try to recover)
true_utilities = np.array([2.0, 1.5, 1.0, 1.8, 0.8])
# Generate stochastic choice data
menus = []
frequencies = []
# Create different recommendation slates
for slate_size in [2, 3, 4]:
for _ in range(5):
# Random slate
slate_items = np.random.choice(n_items, size=slate_size, replace=False)
menu = frozenset(slate_items.tolist())
# Simulate 100 users seeing this slate
n_users = 100
freq = {item: 0 for item in menu}
for _ in range(n_users):
# Logit choice probabilities
u_slate = true_utilities[list(menu)]
probs = np.exp(u_slate) / np.sum(np.exp(u_slate))
choice = np.random.choice(list(menu), p=probs)
freq[choice] += 1
menus.append(menu)
frequencies.append(freq)
log = StochasticChoiceLog(
menus=menus,
choice_frequencies=frequencies,
item_labels=item_labels,
)
# Analyze the data
print("=== Stochastic Choice Analysis ===")
print(f"Menus observed: {log.num_menus}")
print(f"Items: {item_labels}")
print()
# Test McFadden axioms
axioms = test_mcfadden_axioms(log)
print(f"IIA satisfied: {axioms['satisfies_iia']}")
print(f"Regularity satisfied: {axioms['satisfies_regularity']}")
print(f"RUM consistent: {axioms['is_rum_consistent']}")
print()
# Fit logit model
result = fit_random_utility_model(log, model_type="logit")
print(f"Log-likelihood: {result.log_likelihood:.2f}")
print(f"AIC: {result.aic:.2f}")
print()
# Compare estimated vs true utilities
print("Utility Comparison:")
print("Item True Estimated")
for i, label in enumerate(item_labels):
estimated = result.choice_probabilities[i] if i < len(result.choice_probabilities) else 0
print(f"{label:15} {true_utilities[i]:.2f} {estimated:.2f}")
At Scale: A/B Testing for Product Features#
This example simulates realistic A/B test data for product feature preferences, including context effects and IIA violations from similar alternatives:
import numpy as np
from prefgraph import (
StochasticChoiceLog,
fit_random_utility_model,
test_mcfadden_axioms,
check_independence_irrelevant_alternatives,
)
np.random.seed(42)
# A/B test configuration: testing 6 product variants
n_items = 6
n_menus = 15 # Different test conditions
n_obs_per_menu = 500 # Users per condition (total: 7,500 observations)
item_labels = [
"Basic", # Entry-level product
"Premium", # High-end product
"Premium+", # Premium variant (similar to Premium - IIA violation)
"Budget", # Low-cost option
"Pro", # Professional tier
"Enterprise", # Business tier
]
# True underlying utilities (latent)
# Premium and Premium+ are similar, causing IIA violations
base_utilities = np.array([1.5, 2.5, 2.4, 1.0, 2.2, 2.0])
# Similarity matrix: similar products cannibalize each other
# Premium and Premium+ are very similar (high substitutability)
similarity = np.zeros((n_items, n_items))
similarity[1, 2] = similarity[2, 1] = 0.8 # Premium ~ Premium+
similarity[4, 5] = similarity[5, 4] = 0.5 # Pro ~ Enterprise
menus = []
frequencies = []
menu_descriptions = []
# Design various test conditions
test_conditions = [
# Binary comparisons
([0, 1], "Basic vs Premium"),
([0, 3], "Basic vs Budget"),
([1, 4], "Premium vs Pro"),
# Triplets
([0, 1, 3], "Entry-level options"),
([1, 4, 5], "Premium tiers"),
# Adding similar alternative (IIA test)
([0, 1], "Basic vs Premium (control)"),
([0, 1, 2], "Basic vs Premium vs Premium+"), # IIA violation expected
# Full product line
([0, 1, 3, 4], "Main product line"),
([0, 1, 2, 3, 4, 5], "All products"),
# Targeted tests
([1, 2], "Premium variants only"),
([4, 5], "Business tiers only"),
([0, 1, 4], "Consumer vs Pro"),
([3, 0, 1], "Budget to Premium path"),
([1, 2, 4, 5], "Premium + Business"),
([0, 3, 4], "Budget-conscious options"),
]
for menu_items, description in test_conditions:
menu = frozenset(menu_items)
menus.append(menu)
menu_descriptions.append(description)
# Calculate choice probabilities with context effects
items = list(menu)
utilities = base_utilities[items].copy()
# Context effect: similar products split demand
for i, item_i in enumerate(items):
for j, item_j in enumerate(items):
if i != j and similarity[item_i, item_j] > 0:
# Reduce utility when similar alternative present
utilities[i] -= 0.3 * similarity[item_i, item_j]
# Logit choice with temperature (lower = more deterministic)
temperature = 0.8
exp_u = np.exp(utilities / temperature)
probs = exp_u / exp_u.sum()
# Simulate user choices
freq = {item: 0 for item in items}
choices = np.random.choice(items, size=n_obs_per_menu, p=probs)
for c in choices:
freq[c] += 1
frequencies.append(freq)
log = StochasticChoiceLog(
menus=menus,
choice_frequencies=frequencies,
item_labels=item_labels,
)
# Analysis
print("=" * 70)
print("A/B TESTING PRODUCT FEATURES - STOCHASTIC CHOICE ANALYSIS")
print("=" * 70)
print(f"\nTest Configuration:")
print(f" Product variants: {n_items}")
print(f" Test conditions: {n_menus}")
print(f" Users per condition: {n_obs_per_menu}")
print(f" Total observations: {n_menus * n_obs_per_menu:,}")
# McFadden axioms
axioms = test_mcfadden_axioms(log)
print(f"\nMcFadden Axiom Tests:")
print(f" IIA satisfied: {axioms['satisfies_iia']}")
print(f" Regularity satisfied: {axioms['satisfies_regularity']}")
print(f" RUM consistent: {axioms['is_rum_consistent']}")
# IIA analysis
iia_holds = check_independence_irrelevant_alternatives(log, tolerance=0.15)
print(f" IIA (15% tolerance): {iia_holds}")
# Fit model
result = fit_random_utility_model(log, model_type="logit")
print(f"\nLogit Model Fit:")
print(f" Log-likelihood: {result.log_likelihood:.2f}")
print(f" AIC: {result.aic:.2f}")
print(f" BIC: {result.bic:.2f}")
# Per-condition analysis
print(f"\nPer-Condition Results:")
print("-" * 70)
print(f"{'Condition':<35} {'Menu':<20} {'Winner':<12} {'Win %':<8}")
print("-" * 70)
for i, (menu, freq, desc) in enumerate(zip(menus, frequencies, menu_descriptions)):
items = list(menu)
total = sum(freq.values())
winner = max(freq, key=freq.get)
win_pct = 100 * freq[winner] / total
menu_str = ",".join(item_labels[it][:6] for it in sorted(items))
print(f"{desc:<35} {menu_str:<20} {item_labels[winner]:<12} {win_pct:.1f}%")
# IIA violation demonstration
print(f"\n" + "=" * 70)
print("IIA VIOLATION ANALYSIS (Premium vs Premium+ Effect)")
print("=" * 70)
# Find the control (Basic vs Premium) and test (Basic vs Premium vs Premium+)
control_idx = 5 # Basic vs Premium (control)
test_idx = 6 # Basic vs Premium vs Premium+
control_freq = frequencies[control_idx]
test_freq = frequencies[test_idx]
# In control: ratio of Premium to Basic
control_total = sum(control_freq.values())
p_premium_control = control_freq.get(1, 0) / control_total
p_basic_control = control_freq.get(0, 0) / control_total
ratio_control = p_premium_control / p_basic_control if p_basic_control > 0 else float('inf')
# In test: ratio of Premium to Basic (after adding Premium+)
test_total = sum(test_freq.values())
p_premium_test = test_freq.get(1, 0) / test_total
p_basic_test = test_freq.get(0, 0) / test_total
ratio_test = p_premium_test / p_basic_test if p_basic_test > 0 else float('inf')
print(f"\nControl condition (Basic vs Premium):")
print(f" P(Premium) = {p_premium_control:.3f}")
print(f" P(Basic) = {p_basic_control:.3f}")
print(f" Odds ratio Premium/Basic = {ratio_control:.2f}")
print(f"\nTest condition (Basic vs Premium vs Premium+):")
print(f" P(Premium) = {p_premium_test:.3f}")
print(f" P(Basic) = {p_basic_test:.3f}")
print(f" P(Premium+) = {test_freq.get(2, 0) / test_total:.3f}")
print(f" Odds ratio Premium/Basic = {ratio_test:.2f}")
print(f"\nIIA Test Result:")
if abs(ratio_control - ratio_test) > 0.2:
print(f" IIA VIOLATED: Adding Premium+ changed Premium/Basic odds")
print(f" Ratio change: {ratio_control:.2f} -> {ratio_test:.2f}")
print(f" Premium+ cannibalized Premium more than Basic (similarity effect)")
else:
print(f" IIA holds: Premium/Basic odds stable")
# Business insights
print(f"\n" + "=" * 70)
print("BUSINESS INSIGHTS")
print("=" * 70)
# Aggregate choice shares
total_choices = {i: 0 for i in range(n_items)}
total_appearances = {i: 0 for i in range(n_items)}
for menu, freq in zip(menus, frequencies):
for item in menu:
total_appearances[item] += sum(freq.values())
total_choices[item] += freq.get(item, 0)
print(f"\nOverall Product Performance:")
print(f"{'Product':<15} {'Choice Share':<15} {'Win Rate':<12}")
print("-" * 45)
for i in range(n_items):
if total_appearances[i] > 0:
share = 100 * total_choices[i] / sum(total_choices.values())
win_rate = 100 * total_choices[i] / total_appearances[i]
print(f"{item_labels[i]:<15} {share:>8.1f}% {win_rate:>6.1f}%")
Example output:
======================================================================
A/B TESTING PRODUCT FEATURES - STOCHASTIC CHOICE ANALYSIS
======================================================================
Test Configuration:
Product variants: 6
Test conditions: 15
Users per condition: 500
Total observations: 7,500
McFadden Axiom Tests:
IIA satisfied: False
Regularity satisfied: True
RUM consistent: False
IIA (15% tolerance): False
Logit Model Fit:
Log-likelihood: -4523.45
AIC: 9058.90
BIC: 9082.34
This A/B test analysis reveals IIA violations when similar products are added: Premium+ cannibalized Premium sales disproportionately, demonstrating that simple logit models may mispredict market shares when similar alternatives exist. Nested logit or mixed logit would better capture this pattern.
Notes#
Stochastic Choice#
IIA test - if IIA fails, logit may be inappropriate
Model fit statistics:
Log-likelihood
AIC/BIC for model comparison
Regularity violations
Alternatives to logit:
Nested logit for grouped alternatives
Mixed logit for heterogeneous preferences
Probit for flexible substitution
Sample size - stochastic tests need sufficient observations per menu
Function Reference#
Purpose |
Function |
|---|---|
Fit RUM |
|
Test McFadden axioms |
|
Test IIA |
|
Predict probabilities |
|
Fit Luce model |
|
See Also#
Tutorial 2: Menu-Based Choice - Deterministic menu-based choice
Tutorial 1: Budget-Based Analysis - Budget-based revealed preference
References - Papers cited in the implementation
Stochastic Choice - Mathematical foundations (Chapter 13)