Models Overview
All 7 models implement the BaseModel abstract interface and are accessible through the MODEL_REGISTRY.
Model Comparison
| Model | Type | Key Config | Strengths |
|---|---|---|---|
| Logistic Regression | Tabular | C=1.0, balanced weights | Fast, interpretable baseline |
| Random Forest | Tabular | 200 trees, max_depth=15 | Handles non-linear relationships |
| HMM | Sequence | 1-8 hidden states | Captures latent pitch states |
| AutoGluon | Tabular | good_quality preset | Automated model selection |
| LSTM | Sequence | 2-layer, hidden=64, window=8 | Long-range sequence dependencies |
| 1D-CNN | Sequence | 3 conv layers, kernel=3 | Local pattern detection |
| Transformer | Sequence | d_model=64, 4 heads, 2 layers | Self-attention over sequences |
BaseModel Interface
All models implement the following abstract interface:
```python
# Import path: from pitch_sequencing.models.base import BaseModel
from abc import ABC, abstractmethod

import numpy as np


class BaseModel(ABC):
    @property
    @abstractmethod
    def name(self) -> str:
        """Human-readable model name."""

    @property
    @abstractmethod
    def model_type(self) -> str:
        """'tabular' or 'sequence'; determines input shape."""

    @abstractmethod
    def fit(self, X_train, y_train, X_val=None, y_val=None, **kwargs):
        """Train the model."""

    @abstractmethod
    def predict(self, X) -> np.ndarray:
        """Return predicted class labels."""

    @abstractmethod
    def predict_proba(self, X) -> np.ndarray:
        """Return class probabilities (n_samples x n_classes)."""

    @abstractmethod
    def get_params(self) -> dict:
        """Return model hyperparameters."""
```
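To illustrate what a concrete implementation of this interface might look like, here is a minimal sketch of a majority-class baseline. `MajorityClassModel` is hypothetical and is not one of the seven registered models. The `BaseModel` defined in the snippet is a local stand-in that mirrors the interface so the example runs on its own; with the package installed you would import it from `pitch_sequencing.models.base` instead. Returning `self` from `fit` is also an assumption, borrowed from scikit-learn conventions.

```python
from abc import ABC, abstractmethod

import numpy as np


class BaseModel(ABC):
    """Local stand-in mirroring the documented interface (assumption)."""

    @property
    @abstractmethod
    def name(self) -> str: ...

    @property
    @abstractmethod
    def model_type(self) -> str: ...

    @abstractmethod
    def fit(self, X_train, y_train, X_val=None, y_val=None, **kwargs): ...

    @abstractmethod
    def predict(self, X) -> np.ndarray: ...

    @abstractmethod
    def predict_proba(self, X) -> np.ndarray: ...

    @abstractmethod
    def get_params(self) -> dict: ...


class MajorityClassModel(BaseModel):
    """Hypothetical baseline: always predicts the most frequent training label."""

    def __init__(self):
        self.classes_ = None
        self.priors_ = None

    @property
    def name(self) -> str:
        return "Majority Class"

    @property
    def model_type(self) -> str:
        return "tabular"

    def fit(self, X_train, y_train, X_val=None, y_val=None, **kwargs):
        # Store the sorted class labels and their empirical frequencies.
        self.classes_, counts = np.unique(y_train, return_counts=True)
        self.priors_ = counts / counts.sum()
        return self

    def predict(self, X) -> np.ndarray:
        # Repeat the most frequent class once per input row.
        return np.full(len(X), self.classes_[np.argmax(self.priors_)])

    def predict_proba(self, X) -> np.ndarray:
        # Same class-frequency row for every sample: (n_samples, n_classes).
        return np.tile(self.priors_, (len(X), 1))

    def get_params(self) -> dict:
        return {}


X = np.zeros((5, 3))                          # (n_samples, n_features)
y = np.array(["FF", "SL", "FF", "CH", "FF"])  # made-up pitch-type labels
model = MajorityClassModel().fit(X, y)
preds = model.predict(X)
probs = model.predict_proba(X)
```

Because every method of the interface is implemented, the class can be instantiated; leaving one out would raise a `TypeError` at construction time, which is the main benefit of declaring the interface with `ABC`.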
Input Shapes
- Tabular models (`model_type = "tabular"`): expect `(n_samples, n_features)` NumPy arrays or DataFrames
- Sequence models (`model_type = "sequence"`): expect `(n_samples, window_size, n_features)` 3D arrays
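One way to produce the 3D input that sequence models expect is to slide a fixed-length window over a 2D feature matrix. The sketch below assumes the rows are already ordered pitches from a single sequence. `make_windows` is a hypothetical helper, not part of the package, and the package's own windowing may differ (for example in how it treats game or at-bat boundaries).

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def make_windows(features: np.ndarray, window_size: int) -> np.ndarray:
    """Turn (n_rows, n_features) into (n_windows, window_size, n_features)."""
    # sliding_window_view appends the window axis last, giving
    # (n_windows, n_features, window_size); swap the last two axes.
    windows = sliding_window_view(features, window_size, axis=0)
    return windows.transpose(0, 2, 1)


tabular = np.arange(20 * 5, dtype=float).reshape(20, 5)  # (n_samples, n_features)
sequences = make_windows(tabular, window_size=8)
print(tabular.shape, sequences.shape)  # (20, 5) (13, 8, 5)
```

With 20 rows and a window of 8 there are 20 - 8 + 1 = 13 overlapping windows. The result is a read-only view of the original array, so call `.copy()` on it if the windows need to be modified.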
Model Registry
Models are accessed by name through the registry:
```python
from pitch_sequencing import get_model, MODEL_REGISTRY

# List all registered models
print(list(MODEL_REGISTRY.keys()))
# ['logistic_regression', 'random_forest', 'hmm', 'autogluon', 'lstm', 'cnn1d', 'transformer']

# Instantiate a model with config
model = get_model("lstm", {"hidden_size": 64, "num_layers": 2})
```
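The registry's internals are not shown here, but a name-to-class dictionary with a small factory function is a common way to get this behaviour. The following is a minimal sketch under that assumption. `register`, `DummyLSTM`, and the local `MODEL_REGISTRY` and `get_model` are hypothetical stand-ins for illustration, not the package's actual code.

```python
from typing import Callable, Dict, Optional

# Local stand-in for the package registry: maps a model name to its class.
MODEL_REGISTRY: Dict[str, type] = {}


def register(name: str) -> Callable[[type], type]:
    """Class decorator that adds a model class to the registry under `name`."""
    def decorator(cls: type) -> type:
        MODEL_REGISTRY[name] = cls
        return cls
    return decorator


def get_model(name: str, config: Optional[dict] = None):
    """Look up a model class by name and build it from a config dict."""
    if name not in MODEL_REGISTRY:
        known = ", ".join(sorted(MODEL_REGISTRY))
        raise KeyError(f"Unknown model '{name}'. Registered models: {known}")
    # Config keys are passed to the constructor as keyword arguments.
    return MODEL_REGISTRY[name](**(config or {}))


@register("lstm")
class DummyLSTM:
    """Placeholder that only stores its hyperparameters."""

    def __init__(self, hidden_size: int = 64, num_layers: int = 2):
        self.hidden_size = hidden_size
        self.num_layers = num_layers

    def get_params(self) -> dict:
        return {"hidden_size": self.hidden_size, "num_layers": self.num_layers}


model = get_model("lstm", {"hidden_size": 128})
print(model.get_params())  # {'hidden_size': 128, 'num_layers': 2}
```

Keeping construction behind a single factory means experiment scripts can select a model from a string in a config file, and an unknown name fails early with a message listing the valid choices.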