Using different Models
Although mother is built around CatBoost, it also supports other models from the ML community (for example, any scikit-learn estimator). RandomForest is already supported out of the box, and you can provide your own models as well.
Tip

The currently registered model classes are: `RandomForestClassifierMother`, `RandomForestRegressorMother`, `LassoClassifierBinaryMother`, `LassoClassifierMulticlassMother`, `LassoRegressorMother`, `CatboostClassifierMother`, `CatboostGaussianProcessRegressorMother`, `CatboostRankerMother`, `CatboostRegressorMother`.
RandomForestClassifierMother
A RandomForest classifier pipeline for the MOTHER framework, integrating hyperparameter optimization via Optuna and providing default parameter management. Inherits from both scikit-learn's RandomForestClassifier and the AbstractMotherPipeline for seamless integration with the MOTHER machine learning workflow.
get_hyperparameter_space
Defines the hyperparameter search space for RandomForestClassifier using Optuna.
Args:
    X: Feature matrix for training data.
    y: Target vector for training data.
    trial (Trial): Optuna trial object for suggesting hyperparameters.
    prefix (str, optional): Prefix to add to hyperparameter names. Defaults to "".

Returns:
    dict: Dictionary of hyperparameter names (with prefix) and their suggested values.
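To illustrate how such a search-space method is driven by an Optuna trial, here is a minimal sketch with a stand-in trial object. `DummyTrial` and the parameter ranges are illustrative assumptions, not the framework's actual values:

```python
class DummyTrial:
    """Stand-in that mimics the Optuna Trial method used below."""

    def suggest_int(self, name, low, high):
        # Deterministic stand-in for Optuna's sampler: return the midpoint.
        return (low + high) // 2


def get_hyperparameter_space(trial, prefix=""):
    # Hypothetical ranges; the real class defines its own.
    return {
        prefix + "n_estimators": trial.suggest_int(prefix + "n_estimators", 50, 500),
        prefix + "max_depth": trial.suggest_int(prefix + "max_depth", 2, 16),
    }


params = get_hyperparameter_space(DummyTrial(), prefix="model__")
print(params)  # {'model__n_estimators': 275, 'model__max_depth': 9}
```

The prefix lets the same model be nested inside a larger pipeline without hyperparameter name collisions.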
default_parameters
Returns the default hyperparameters for the RandomForestClassifier.
Args:
    prefix (str, optional): Prefix to add to hyperparameter names. Defaults to "".

Returns:
    dict: Dictionary of default hyperparameter names (with prefix) and their values.
For more information on the parent class, use `help(ml.get_model_class("RandomForestClassifierMother"))`.
Using Lasso with Hyperparameter Tuning
Providing your own Model
To make providing your own model as easy as possible, we provide the AbstractMotherPipeline class.
Bases: ABC
The abstract Mother pipeline is a conventional sklearn estimator/transformer but adds methods for hyperparameter definition. Furthermore, it ensures that non-sklearn classes and derived classes are compatible with the sklearn pipeline interface. This is done by implementing the get_params and set_params methods.
Source code in mother/ml/core.py
predict_uncertainty(X, **kwargs)

This method needs to be implemented for models that support uncertainty. Otherwise, the model falls back to self.predict(X).

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| X | DataFrame | Input features to predict the target value. | required |

Source code in mother/ml/core.py
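The contract described above can be sketched as follows. The names mirror the docs, but the bodies are illustrative assumptions, not the actual implementation in mother/ml/core.py:

```python
from abc import ABC, abstractmethod


class AbstractMotherPipelineSketch(ABC):
    """Illustrative stand-in for AbstractMotherPipeline."""

    @abstractmethod
    def get_hyperparameter_space(self, X, y, trial, prefix=""):
        """Return the Optuna search space as a dict of parameter names/values."""

    @abstractmethod
    def default_parameters(self, prefix=""):
        """Return default hyperparameters as a dict."""

    def predict_uncertainty(self, X, **kwargs):
        # Models without a native uncertainty estimate fall back to predict().
        return self.predict(X)


class ConstantModel(AbstractMotherPipelineSketch):
    """Trivial subclass to show the fallback behavior."""

    def get_hyperparameter_space(self, X, y, trial, prefix=""):
        return {}

    def default_parameters(self, prefix=""):
        return {}

    def predict(self, X):
        return [0.0] * len(X)


model = ConstantModel()
print(model.predict_uncertainty([[1], [2]]))  # [0.0, 0.0]
```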
Your own model just has to inherit from that class and implement the required functions that provide the hyperparameters you want to tune. For an example, see the implementation of the Lasso model. Since Lasso essentially has only one parameter to tune, the implementation is fairly easy.
```python
import logging
from typing import Literal, Mapping, Optional, Union

from optuna.trial import Trial
from sklearn.linear_model import Lasso, LogisticRegression

from mother.ml.core import AbstractMotherPipeline
from mother.ml.models import utils

module_logger: logging.Logger = logging.getLogger(__name__)


class LassoRegressorMother(Lasso, AbstractMotherPipeline):
    """
    MOTHER class for a LASSO regression including hyperparameter optimization
    """

    def get_hyperparameter_space(self, X, y, trial: Trial, prefix: str = "") -> dict:
        """
        Define the hyperparameter search space for Lasso regression.

        Parameters:
            X: array-like
                Feature matrix.
            y: array-like
                Target vector.
            trial: optuna.trial.Trial
                Optuna trial object for suggesting hyperparameters.
            prefix: str, optional
                Prefix to add to hyperparameter names.

        Returns:
            dict: Dictionary containing hyperparameter names and their suggested values.
        """
        return utils.add_prefix_to_dict_keys(
            {"alpha": trial.suggest_float(prefix + "alpha", 1e-6, 1e1, log=True)},
            prefix=prefix,
        )

    def default_parameters(self, prefix: str = "") -> dict:
        """
        Return the default hyperparameters for the Lasso model.

        Parameters:
            prefix: str, optional
                Prefix to add to hyperparameter names.

        Returns:
            dict: Dictionary containing default hyperparameter values.
        """
        return utils.add_prefix_to_dict_keys({"alpha": 1e-3}, prefix=prefix)

    def set_params(self, **params):
        """
        Set the parameters of the Lasso model.

        Parameters:
            **params: Keyword arguments for the parameters to set.
        """
        return super().set_params(**params)

    def get_params(self, deep=True) -> dict:
        return super().get_params(deep=deep)


class LassoClassifierBinaryMother(LogisticRegression, AbstractMotherPipeline):
    """
    MOTHER class for a LASSO classification including hyperparameter optimization
    """

    def __init__(
        self,
        penalty: Literal["l1"] = "l1",  # Lasso uses the L1 penalty
        *,
        dual: bool = False,
        tol: float = 0.0001,
        C: float = 1,
        fit_intercept: bool = True,
        intercept_scaling: float = 1,
        class_weight: Optional[Union[Mapping, str]] = "balanced",
        random_state: int = 42,
        solver: str = "liblinear",  # 'liblinear' supports the L1 penalty and is less complex
        max_iter: int = 3000,
        verbose: int = 0,
        warm_start: bool = False,
        n_jobs: Optional[int] = None,
    ) -> None:
        super().__init__(
            penalty,
            dual=dual,
            tol=tol,
            C=C,
            fit_intercept=fit_intercept,
            intercept_scaling=intercept_scaling,
            class_weight=class_weight,
            random_state=random_state,
            solver=solver,  # type: ignore
            max_iter=max_iter,
            verbose=verbose,
            warm_start=warm_start,
            n_jobs=n_jobs,
        )

    def get_hyperparameter_space(self, X, y, trial: Trial, prefix: str = "") -> dict:
        """
        Define the hyperparameter search space for Lasso classification.

        Parameters:
            X: array-like
                Feature matrix.
            y: array-like
                Target vector.
            trial: optuna.trial.Trial
                Optuna trial object for suggesting hyperparameters.
            prefix: str, optional
                Prefix to add to hyperparameter names.

        Returns:
            dict: Dictionary containing hyperparameter names and their suggested values.
        """
        return utils.add_prefix_to_dict_keys(
            {
                "C": trial.suggest_float(prefix + "C", 1e-6, 1e1, log=True),
            },
            prefix=prefix,
        )

    def default_parameters(self, prefix: str = "") -> dict:
        """
        Return the default hyperparameters for the Lasso model.

        Parameters:
            prefix: str, optional
                Prefix to add to hyperparameter names.

        Returns:
            dict: Dictionary containing default hyperparameter values.
        """
        return utils.add_prefix_to_dict_keys({"C": 1e0}, prefix=prefix)

    def set_params(self, **params):
        """
        Set the parameters of the Lasso model.

        Parameters:
            **params: Keyword arguments for the parameters to set.
        """
        return super().set_params(**params)

    def get_params(self, deep=True) -> dict:
        return super().get_params(deep=deep)


class LassoClassifierMulticlassMother(LassoClassifierBinaryMother):
    """
    MOTHER class for a LASSO classification with multiclass support.
    Inherits from LassoClassifierBinaryMother.
    """

    def __init__(self, **kwargs):
        module_logger.warning(
            """LassoClassifierMother selected. 'Saga' is used as solver.
            Scale input features beforehand to improve convergence."""
        )
        if "solver" in kwargs:
            module_logger.warning(
                "LassoClassifierMulticlassMother selected. 'Saga' is used as solver for multiclass problems."
            )
        # Use the 'saga' solver for multiclass support;
        # 'saga' supports the L1 penalty and is suitable for large datasets.
        kwargs["solver"] = "saga"
        super().__init__(**kwargs)
```
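The helper `utils.add_prefix_to_dict_keys` is used above but not shown. Its behavior can be sketched as follows; this is an assumption based on how it is called, not the actual implementation in `mother.ml.models.utils`:

```python
def add_prefix_to_dict_keys(d: dict, prefix: str = "") -> dict:
    """Return a copy of d with prefix prepended to every key."""
    return {prefix + key: value for key, value in d.items()}


print(add_prefix_to_dict_keys({"alpha": 1e-3}, prefix="model__"))
# {'model__alpha': 0.001}
```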
Registering your model using MotherModelRegistry
To register your own model and use it easily within the mother framework, register it with the provided decorator.
MotherModelRegistry
Singleton registry for dynamically discovering and managing model classes in the 'mother.ml.models' package.
This class scans the models directory for Python files matching the pattern 'm_*.py', imports them, and registers all classes that inherit from AbstractMotherPipeline. It provides mappings for model class names, lower-case lookups, and a list of supported algorithms. The registry is used to facilitate model discovery, retrieval, and algorithm support checks throughout the mother.ml package.
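The discovery step described above (scanning the models directory for files matching 'm_*.py') can be sketched roughly like this; the helper name and return convention are illustrative assumptions:

```python
from pathlib import Path
import tempfile


def discover_model_modules(models_dir: Path) -> list:
    """Return importable module names for all model files matching 'm_*.py'."""
    return sorted(path.stem for path in models_dir.glob("m_*.py"))


# Usage sketch with a throwaway directory standing in for mother/ml/models:
with tempfile.TemporaryDirectory() as tmp:
    models_dir = Path(tmp)
    (models_dir / "m_lasso.py").touch()
    (models_dir / "m_random_forest.py").touch()
    (models_dir / "helpers.py").touch()  # does not match the pattern, so skipped
    print(discover_model_modules(models_dir))
    # ['m_lasso', 'm_random_forest']
```

In the real registry, each discovered module is then imported and scanned for AbstractMotherPipeline subclasses.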
Attributes:

| Name | Type | Description |
|---|---|---|
| models_dir | Path | Path to the directory containing model modules. |
| model_classes | dict | Mapping of model class names to their class objects. |
| model_classes_lower | dict | Mapping of lower-case model class names to their canonical names. |
| supported_algorithms | list | List of supported algorithm names discovered from model files. |
Source code in mother/ml/__init__.py
list_registered_models()
List all registered models with their algorithms.
Source code in mother/ml/__init__.py
register_model(model_class, algorithm=None)
Register a model class manually.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| model_class | Type[AbstractMotherPipeline] | The model class to register. | required |
| algorithm | Optional[str] | Optional algorithm name. If not provided, it is derived from the class name. | None |
Source code in mother/ml/__init__.py
unregister_model(model_class_name)
Unregister a model class.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| model_class_name | str | Name of the model class to unregister. | required |
Source code in mother/ml/__init__.py
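A minimal registry with register/unregister semantics like those documented above might look like this; it is a sketch under assumed semantics, not the real MotherModelRegistry:

```python
class ModelRegistrySketch:
    """Toy stand-in for MotherModelRegistry's register/unregister behavior."""

    def __init__(self):
        self.model_classes = {}          # class name -> class object
        self.supported_algorithms = []   # algorithm names

    def register_model(self, model_class, algorithm=None):
        # When no algorithm is given, derive it from the class name
        # (the documented fallback; the exact derivation is assumed here).
        algo = algorithm or model_class.__name__.lower()
        self.model_classes[model_class.__name__] = model_class
        self.supported_algorithms.append(algo)

    def unregister_model(self, model_class_name):
        self.model_classes.pop(model_class_name, None)


class MyModel:
    pass


registry = ModelRegistrySketch()
registry.register_model(MyModel, algorithm="my_algo")
print("MyModel" in registry.model_classes)  # True
registry.unregister_model("MyModel")
print("MyModel" in registry.model_classes)  # False
```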
algo_is_supported(algorithm)
Check if the specified algorithm is supported.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| algorithm | str | Name of the algorithm to check. | required |

Returns:

| Name | Type | Description |
|---|---|---|
| bool | bool | True if supported, False otherwise. |
Source code in mother/ml/__init__.py
describe_model(name)
Get help text for a model class.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| name | str | Name of the model class. | required |

Returns:

| Name | Type | Description |
|---|---|---|
| str | str | Help text for the model class. |
Source code in mother/ml/__init__.py
get_available_algorithms()
Get a list of all supported algorithms.
Returns:

| Type | Description |
|---|---|
| List[str] | Names of supported algorithms. |
Source code in mother/ml/__init__.py
get_model_class(name)
Get a model class by name.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| name | str | Name of the model class. | required |

Returns:

| Name | Type | Description |
|---|---|---|
| Type | type[AbstractMotherPipeline] | The model class. |

Raises:

| Type | Description |
|---|---|
| KeyError | If the model class is not found. |
Source code in mother/ml/__init__.py
get_model_class_by_algorithm(algorithm)
Get model classes by algorithm name.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| algorithm | str | Name of the algorithm. | required |

Returns:

| Name | Type | Description |
|---|---|---|
| Type | list[Type[AbstractMotherPipeline]] | The model classes registered for the algorithm. |
Source code in mother/ml/__init__.py
get_model_class_by_algorithm_and_type(algorithm, model_type)
Returns the appropriate model class based on the algorithm and model type.
Source code in mother/ml/__init__.py
get_supported_models()
Get a list of all supported model class names.
Returns:

| Type | Description |
|---|---|
| List[str] | Names of supported model classes. |
Source code in mother/ml/__init__.py
register_model(algorithm=None)
Decorator to register a model class.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| algorithm | Optional[str] | Optional algorithm name. | None |

Example

```python
@register_model("my_algorithm")
class MyCustomMother(AbstractMotherPipeline):
    pass
```
Source code in mother/ml/__init__.py
See the following example of how to implement your own custom RandomForest classifier.
Example

```python
from mother import ml
from sklearn.ensemble import RandomForestClassifier


@ml.register_model("custom_rf")
class CustomRandomForestMother(RandomForestClassifier, ml.AbstractMotherPipeline):
    def get_hyperparameter_space(self, X, y, trial, prefix=""):
        # Include the prefix in the trial names as well, so nested models
        # do not collide in the same Optuna study.
        return {
            f"{prefix}n_estimators": trial.suggest_int(f"{prefix}n_estimators", 10, 100),
            f"{prefix}max_depth": trial.suggest_int(f"{prefix}max_depth", 3, 10),
        }

    def default_parameters(self, prefix=""):
        return {f"{prefix}n_estimators": 50, f"{prefix}max_depth": 5}


print(ml.get_model_class_by_algorithm("custom_rf"))
```