Utilities
Helper functions for generating test data and working with calibration.
Data Generation
generate_calibrated_data
Generate perfectly calibrated scores and labels.
Scores are drawn uniformly from (0, 1), and each label is sampled from a Bernoulli distribution with success probability equal to its score. As a result, P(label = 1 | score = s) = s, so the data is perfectly calibrated by construction.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `n_samples` | `int` | Number of samples to generate | `1000` |
| `seed` | `Optional[int]` | Random seed for reproducibility | `None` |
Returns:
| Type | Description |
|---|---|
| `Tuple[Tensor, Tensor]` | Tuple of (scores, labels) tensors |
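
The construction described above can be sketched in plain Python. This is an illustration only, not the library's implementation: `sketch_calibrated_data` and `positive_rate` are hypothetical names, the sketch returns lists rather than tensors, and `random.random()` draws from [0, 1) rather than the open interval.

```python
import random


def sketch_calibrated_data(n_samples=1000, seed=None):
    """Hypothetical stand-in: returns (scores, labels) as plain lists."""
    rng = random.Random(seed)
    # Uniform scores in [0, 1).
    scores = [rng.random() for _ in range(n_samples)]
    # Each label is 1 with probability equal to its own score.
    labels = [1 if rng.random() < s else 0 for s in scores]
    return scores, labels


def positive_rate(scores, labels, lo, hi):
    """Fraction of positive labels among samples with lo <= score < hi."""
    picked = [y for s, y in zip(scores, labels) if lo <= s < hi]
    return sum(picked) / len(picked)


scores, labels = sketch_calibrated_data(n_samples=20000, seed=0)
# For calibrated data, the positive rate inside a bin should be close to
# the bin's mean score (roughly 0.5 for the bin [0.4, 0.6)).
mid_rate = positive_rate(scores, labels, 0.4, 0.6)
```

Comparing the per-bin positive rate with the per-bin mean score is the same check a reliability diagram performs, so it works as a quick sanity test on generated data.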
generate_miscalibrated_data
Generate miscalibrated scores and labels.
Generates true probabilities, then distorts them with temperature scaling to produce the returned scores. Labels are sampled from the true (undistorted) probabilities, so the returned scores no longer match the observed label frequencies.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `n_samples` | `int` | Number of samples to generate | `1000` |
| `temperature` | `float` | Temperature for miscalibration (>1 = overconfident, <1 = underconfident) | `2.0` |
| `seed` | `Optional[int]` | Random seed for reproducibility | `None` |
Returns:
| Type | Description |
|---|---|
| `Tuple[Tensor, Tensor]` | Tuple of (miscalibrated_scores, labels) tensors |
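
One way to get the behaviour described above is sketched below in plain Python. The exact distortion the library applies is not stated here, so this sketch assumes the true probability's logit is multiplied by `temperature`, which matches the documented convention that values above 1 give overconfident scores. `sketch_miscalibrated_data` is a hypothetical name, and the sketch returns lists rather than tensors.

```python
import math
import random


def sketch_miscalibrated_data(n_samples=1000, temperature=2.0, seed=None):
    """Hypothetical stand-in: returns (miscalibrated_scores, labels) as lists."""
    rng = random.Random(seed)
    eps = 1e-6  # keep probabilities away from 0 and 1 so the logit is finite
    scores, labels = [], []
    for _ in range(n_samples):
        # True probability of the positive class.
        p = min(max(rng.random(), eps), 1.0 - eps)
        # Assumed distortion: scale the logit, then map back to (0, 1).
        logit = math.log(p / (1.0 - p))
        score = 1.0 / (1.0 + math.exp(-temperature * logit))
        # The label is sampled from the true probability, not the score.
        label = 1 if rng.random() < p else 0
        scores.append(score)
        labels.append(label)
    return scores, labels


mis_scores, mis_labels = sketch_miscalibrated_data(
    n_samples=20000, temperature=2.0, seed=0
)
# Among confident predictions, overconfident scores should exceed the
# observed positive rate.
confident = [(s, y) for s, y in zip(mis_scores, mis_labels) if s >= 0.8]
mean_confident_score = sum(s for s, _ in confident) / len(confident)
confident_positive_rate = sum(y for _, y in confident) / len(confident)
```

Under this assumption, `temperature=1.0` leaves the scores equal to the true probabilities, and values below 1 pull the scores toward 0.5.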