Quick Start¶
This guide shows you how to calibrate ranking scores in just a few lines of code.
Basic Example¶
import torch
from rankcal import TemperatureScaling, ece_at_k, reliability_diagram
# Your ranking scores and binary relevance labels
scores = torch.tensor([0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1])
relevance = torch.tensor([1, 1, 0, 1, 0, 0, 1, 0, 0])
# Fit a calibrator
calibrator = TemperatureScaling()
calibrator.fit(scores, relevance)
# Calibrate scores
calibrated = calibrator(scores)
# Evaluate calibration at top-k
ece = ece_at_k(calibrated, relevance, k=5)
print(f"ECE@5: {ece:.4f}")
# Visualize calibration
fig = reliability_diagram(calibrated, relevance, k=5)
fig.savefig("reliability.png")
What Just Happened?¶
- Input data: We have ranking scores (model confidence) and binary relevance labels (ground truth)
- Fit calibrator: The TemperatureScaling calibrator learns to adjust scores so they reflect true probabilities
- Transform scores: Calling calibrator(scores) applies the learned transformation
- Evaluate: ece_at_k measures how well calibrated the top-k scores are (see the sketch after this list)
- Visualize: The reliability diagram shows calibration quality graphically
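To see whether calibration actually helped, you can compare ECE@k on the raw and calibrated scores. The following is a minimal sketch that reuses the data from the basic example; it assumes ece_at_k also accepts uncalibrated scores, which the example above does not show.
import torch
from rankcal import TemperatureScaling, ece_at_k

scores = torch.tensor([0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1])
relevance = torch.tensor([1, 1, 0, 1, 0, 0, 1, 0, 0])

calibrator = TemperatureScaling()
calibrator.fit(scores, relevance)
calibrated = calibrator(scores)

# Compare calibration error at a couple of cutoffs; lower is better.
# Passing raw scores to ece_at_k is an assumption, not shown above.
for k in (3, 5):
    ece_raw = ece_at_k(scores, relevance, k=k)
    ece_cal = ece_at_k(calibrated, relevance, k=k)
    print(f"k={k}: ECE before {ece_raw:.4f}, after {ece_cal:.4f}")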
Choosing a Calibrator¶
Different calibrators have different tradeoffs:
from rankcal import (
TemperatureScaling, # Simple, 1 parameter
IsotonicCalibrator, # Non-parametric, robust
PiecewiseLinearCalibrator, # Flexible, differentiable
MonotonicNNCalibrator, # Most flexible, for complex patterns
)
# For most cases, start with IsotonicCalibrator
calibrator = IsotonicCalibrator()
calibrator.fit(scores, relevance)
calibrated = calibrator(scores)
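If you are not sure which calibrator suits your data, a quick comparison loop can help. The sketch below is illustrative only: it assumes each calibrator can be constructed with default arguments and shares the fit / __call__ interface used in this guide.
import torch
from rankcal import (
    TemperatureScaling,
    IsotonicCalibrator,
    PiecewiseLinearCalibrator,
    ece_at_k,
)

scores = torch.tensor([0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1])
relevance = torch.tensor([1, 1, 0, 1, 0, 0, 1, 0, 0])

# Fit each candidate and report ECE@5; lower is better.
# Default constructors are an assumption here, not guaranteed by the docs.
for cls in (TemperatureScaling, IsotonicCalibrator, PiecewiseLinearCalibrator):
    calibrator = cls()
    calibrator.fit(scores, relevance)
    ece = ece_at_k(calibrator(scores), relevance, k=5)
    print(f"{cls.__name__}: ECE@5 = {ece:.4f}")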
See the User Guide for detailed guidance on choosing a calibrator.
Next Steps¶
- Choosing a Calibrator - Decision tree for selecting the right calibrator
- Hyperparameters - Tuning calibrator parameters
- Evaluation - Understanding calibration metrics
- API Reference - Full API documentation