Utilities

Helper functions for generating test data and working with calibration.

Data Generation

generate_calibrated_data

Generate perfectly calibrated scores and labels.

Scores are drawn uniformly from (0, 1), and each label is sampled from a Bernoulli distribution with p = score. Because P(label = 1 | score = s) = s for every s, the data are perfectly calibrated by construction.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| n_samples | int | Number of samples to generate | 1000 |
| seed | Optional[int] | Random seed for reproducibility | None |

Returns:

| Type | Description |
| --- | --- |
| Tuple[Tensor, Tensor] | Tuple of (scores, labels) tensors |
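
The sampling scheme described for generate_calibrated_data can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the library's implementation: the name `sketch_calibrated_data` is hypothetical, and it returns plain Python lists rather than tensors so that it runs with the standard library alone.

```python
import random
from typing import List, Optional, Tuple


def sketch_calibrated_data(
    n_samples: int = 1000, seed: Optional[int] = None
) -> Tuple[List[float], List[int]]:
    """Hypothetical stand-in for generate_calibrated_data (lists, not tensors)."""
    rng = random.Random(seed)  # local RNG, leaves the global random state alone
    # random() draws from [0, 1); the chance of hitting exactly 0 is negligible.
    scores = [rng.random() for _ in range(n_samples)]
    # Bernoulli draw: each label is 1 with probability equal to its score.
    labels = [1 if rng.random() < s else 0 for s in scores]
    return scores, labels


scores, labels = sketch_calibrated_data(n_samples=20000, seed=0)

# Rough calibration check: within each of 5 equal-width score bins, the mean
# score should be close to the observed fraction of positive labels.
gaps = []
for b in range(5):
    lo, hi = b / 5, (b + 1) / 5
    idx = [i for i, s in enumerate(scores) if lo <= s < hi]
    mean_score = sum(scores[i] for i in idx) / len(idx)
    pos_rate = sum(labels[i] for i in idx) / len(idx)
    gaps.append(abs(mean_score - pos_rate))
    print(f"[{lo:.1f}, {hi:.1f}): mean score {mean_score:.3f}, positive rate {pos_rate:.3f}")

max_gap = max(gaps)
print(f"largest per-bin gap: {max_gap:.3f}")
```

For calibrated data the per-bin gaps should shrink toward zero as the sample size grows, which makes this kind of data a useful sanity check for a calibration metric.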

generate_miscalibrated_data

Generate miscalibrated scores and labels.

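Generates true probabilities, then distorts them with temperature scaling to produce the returned scores. Labels are sampled from the true probabilities, not from the distorted scores, so the scores generally no longer match the observed label frequencies.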

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| n_samples | int | Number of samples to generate | 1000 |
| temperature | float | Temperature for miscalibration (> 1 = overconfident, < 1 = underconfident) | 2.0 |
| seed | Optional[int] | Random seed for reproducibility | None |

Returns:

| Type | Description |
| --- | --- |
| Tuple[Tensor, Tensor] | Tuple of (miscalibrated_scores, labels) tensors |
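
The distortion described for generate_miscalibrated_data can be sketched as follows. This is an illustration under assumptions, not the library's code. The name `sketch_miscalibrated_data` is hypothetical, true probabilities are assumed to be uniform on (0, 1), and the logit is assumed to be multiplied by the temperature. That last choice was made only because it reproduces the documented direction (temperature > 1 gives overconfident scores); the more common convention of dividing logits by the temperature would reverse it.

```python
import math
import random
from typing import List, Optional, Tuple


def _sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def sketch_miscalibrated_data(
    n_samples: int = 1000, temperature: float = 2.0, seed: Optional[int] = None
) -> Tuple[List[float], List[int]]:
    """Hypothetical stand-in for generate_miscalibrated_data (lists, not tensors)."""
    rng = random.Random(seed)
    eps = 1e-6
    scores, labels = [], []
    for _ in range(n_samples):
        # ASSUMPTION: true probabilities are uniform; clamp so the logit is finite.
        p = min(max(rng.random(), eps), 1.0 - eps)
        # Labels follow the TRUE probability, not the distorted score.
        labels.append(1 if rng.random() < p else 0)
        logit = math.log(p / (1.0 - p))
        # ASSUMPTION: scaling the logit by `temperature` pushes scores toward
        # 0 or 1 when temperature > 1, matching the documented behaviour.
        scores.append(_sigmoid(temperature * logit))
    return scores, labels


scores, labels = sketch_miscalibrated_data(n_samples=20000, temperature=2.0, seed=0)

# Among high-confidence predictions, compare the mean score with the
# observed positive rate; overconfidence shows up as score > positive rate.
confident = [i for i, s in enumerate(scores) if s > 0.9]
mean_score = sum(scores[i] for i in confident) / len(confident)
pos_rate = sum(labels[i] for i in confident) / len(confident)
print(f"score > 0.9: mean score {mean_score:.3f}, positive rate {pos_rate:.3f}")
```

Under these assumptions, a temperature of 1.0 leaves the scores equal to the true probabilities up to floating-point error, so the same check would then show no systematic gap.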