Options & Configuration#
This page documents all available options for configuring your spml2 workflow. You can set these in your options_user.py file.
Available Options#
Option |
Type |
Default |
Description / Possible Values |
|---|---|---|---|
test_mode |
bool |
False |
Enable test mode for quick runs (uses a small sample) |
debug |
bool |
False |
Enable debug mode (extra output, skips some models) |
target_name |
str |
‘target’ |
Name of the target column in your data |
test_df_size |
int |
1000 |
Number of rows for test DataFrame (if test_mode) |
test_ratio |
float |
0.20 |
Proportion of the dataset to use as test split |
root |
Path or str |
‘./input’ |
Root directory for data files |
real_df_filename |
str |
‘example.dta’ |
Main data file name (supports .dta, .parquet, .csv, .xlsx) |
output_folder |
Path or str |
‘Output’ |
Output folder for results |
numerical_cols |
list[str] or None |
None |
List of numerical columns (None = infer automatically) |
categorical_cols |
list[str] or None |
None |
List of categorical columns (None = infer automatically) |
data |
pandas.DataFrame or None |
None |
Pass a custom DataFrame directly (advanced use; otherwise data is loaded from file) |
stratify |
bool |
True |
Whether to stratify train/test splits (recommended for classification) |
random_state |
int |
42 |
Random seed for reproducibility |
raise_error |
bool |
True |
Raise errors while running models especially during plotting and getting feature importances (set False to suppress and continue) |
sampling_strategy |
str or float |
‘auto’ |
SMOTE sampling strategy (see imbalanced-learn docs) |
n_splits |
int |
5 |
Number of cross-validation splits |
shap_plots |
bool |
False |
Enable SHAP plots |
roc_plots |
bool |
True |
Enable ROC curve plots |
shap_sample_size |
int |
100 |
Number of samples for SHAP plots |
pipeline |
ImbPipeline or None |
None |
Custom pipeline (advanced users) |
search_type |
str |
‘random’ |
Hyperparameter search type (‘random’ or ‘grid’) |
search_kwargs |
dict or None |
None |
Additional kwargs for search (e.g., {‘verbose’: 1}) |
Example options_user.py#
from pathlib import Path
from spml2 import Options
from models_user import models
from imblearn.pipeline import Pipeline as ImbPipeline
from sklearn.preprocessing import StandardScaler
from imblearn.over_sampling import SMOTE
user_pipeline = ImbPipeline([
("preprocessor", StandardScaler()),
("smote", SMOTE(random_state=42)),
# Add more steps as needed
])
options = Options(
test_mode=False,
debug=False,
target_name="target",
test_df_size=1000,
test_ratio=0.20,
root=Path("./input"),
real_df_filename="example.dta",
output_folder="Output",
numerical_cols=None,
sampling_strategy="auto",
n_splits=5,
shap_plots=False,
roc_plots=True,
shap_sample_size=100,
pipeline=user_pipeline,
search_type="random",
search_kwargs={"verbose": 1},
)
print(options)
See the comments in options_user.py for more details and customization tips.