This rule raises an issue when a machine learning estimator or optimizer is instantiated without specifying its important hyperparameters.

== Why is this an issue?

When an estimator or an optimizer is instantiated, default values are used for any hyperparameters that are not specified explicitly.

Relying on the default values can lead to non-reproducible results across different versions of the library.

Furthermore, the default values might not be the best choice for the specific problem at hand and can lead to suboptimal performance.
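
For example (a minimal sketch; the estimator and the chosen values are illustrative assumptions, not requirements of the rule):

[source,python]
----
from sklearn.ensemble import RandomForestClassifier

# Noncompliant sketch: every hyperparameter is left at its default value,
# so behavior may silently change across scikit-learn versions.
clf = RandomForestClassifier()

# Compliant sketch: the hyperparameters this rule considers are set explicitly.
clf = RandomForestClassifier(min_samples_leaf=1, max_features="sqrt")
----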

Here are the estimators and the parameters considered by this rule:

[%header, format=dsv, stripes=even]
|===
Scikit-learn - Estimator:Hyperparameters

AdaBoostClassifier:learning_rate
AdaBoostRegressor:learning_rate
GradientBoostingClassifier:learning_rate
GradientBoostingRegressor:learning_rate
HistGradientBoostingClassifier:learning_rate
HistGradientBoostingRegressor:learning_rate
RandomForestClassifier:min_samples_leaf, max_features
RandomForestRegressor:min_samples_leaf, max_features
ElasticNet:alpha, l1_ratio
NearestNeighbors:n_neighbors
KNeighborsClassifier:n_neighbors
KNeighborsRegressor:n_neighbors
NuSVC:nu, kernel, gamma
NuSVR:C, kernel, gamma
SVC:C, kernel, gamma
SVR:C, kernel, gamma
DecisionTreeClassifier:ccp_alpha
DecisionTreeRegressor:ccp_alpha

MLPClassifier:hidden_layer_sizes
MLPRegressor:hidden_layer_sizes

PolynomialFeatures:degree, interaction_only
|===

[%header, format=dsv, stripes=even]
|===
PyTorch - Optimizer:Hyperparameters

Adadelta:lr, weight_decay
Adagrad:lr, weight_decay
Adam:lr, weight_decay
AdamW:lr, weight_decay
SparseAdam:lr
Adamax:lr, weight_decay
ASGD:lr, weight_decay
LBFGS:lr
NAdam:lr, weight_decay, momentum_decay
RAdam:lr, weight_decay
RMSprop:lr, weight_decay, momentum
Rprop:lr
SGD:lr, weight_decay, momentum
|===
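
A similar sketch for PyTorch (the model is a placeholder; the values are illustrative assumptions):

[source,python]
----
import torch
import torch.optim as optim

model = torch.nn.Linear(10, 2)  # placeholder model for illustration

# Noncompliant sketch: lr and weight_decay fall back to the library defaults.
optimizer = optim.Adam(model.parameters())

# Compliant sketch: the hyperparameters this rule considers are set explicitly.
optimizer = optim.Adam(model.parameters(), lr=1e-3, weight_decay=0.0)
----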

// How to fix it section

include::how-to-fix-it/scikit-learn.adoc[]

include::how-to-fix-it/pytorch.adoc[]

ifdef::env-github,rspecator-view[]

== Implementation specification
(visible only on this page)

Implementation will be quite tricky if we want to avoid false positives.

Abort if:

- in a Pipeline/make_pipeline used for hyperparameter search (see the sketch below)
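
A minimal sketch of the case to skip (the estimator and parameter grid are illustrative assumptions):

[source,python]
----
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# SVC() sets no hyperparameters here, but C is tuned by the search,
# so no issue should be raised on this instantiation.
pipe = make_pipeline(SVC())
grid = GridSearchCV(pipe, param_grid={"svc__C": [0.1, 1.0, 10.0]})
----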

https://github.com/SERG-Delft/dslinter/blob/main/dslinter/checkers/hyperparameters_scikitlearn.py#L48-L70[List of DSLinter estimators]

https://github.com/SERG-Delft/dslinter/blob/main/dslinter/checkers/hyperparameters_pytorch.py[List of DSLinter optimizers]

Possible baby step: only check some estimators (for example, the meta-learners).

Ignore parameters:

- n_jobs
- parameters whose names end in `param`?

With PyTorch, we should also check the case where an optimizer has per-parameter options set:

[source,python]
----
optim.SGD([
    {'params': model.base.parameters(), 'lr': 1e-2},
    {'params': model.classifier.parameters()}
], lr=1e-3, momentum=0.9)
----

We should not raise an issue if we cannot inspect the content of the list and the dictionaries, as in the sketch below.
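
A sketch of the opaque case (`build_param_groups` is a hypothetical helper introduced only for illustration):

[source,python]
----
import torch
import torch.optim as optim

model = torch.nn.Linear(10, 2)  # placeholder model for illustration

def build_param_groups(m):
    # Hypothetical helper: in real code this might live in another module,
    # and the dictionaries it returns may already set 'lr' per group.
    return [{'params': m.parameters(), 'lr': 1e-2}]

# The list content is opaque at the call site, so no issue should be raised
# even though no hyperparameter appears in this instantiation.
optimizer = optim.SGD(build_param_groups(model))
----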

=== Message

For scikit-learn estimators:

Specify important hyperparameters when instantiating a Scikit-learn estimator.

For PyTorch optimizers:

Specify important hyperparameters when instantiating a PyTorch optimizer.

=== Issue location

Primary: name of the estimator/optimizer

No secondary location.

=== Quickfix

There is a possible quickfix: add all the missing parameters with their default values.
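
For instance (a sketch; the defaults shown are scikit-learn's current ones and are an assumption that may change across versions):

[source,python]
----
from sklearn.svm import SVC

clf = SVC()                                    # before the quickfix
clf = SVC(C=1.0, kernel="rbf", gamma="scale")  # after the quickfix
----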

endif::env-github,rspecator-view[]

== Resources

=== Articles & blog posts

* Probst, P., Boulesteix, A. L., & Bischl, B. (2019). Tunability: Importance of Hyperparameters of Machine Learning Algorithms. Journal of Machine Learning Research, 20(53), 1-32.
* van Rijn, J. N., & Hutter, F. (2018, July). Hyperparameter importance across datasets. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (pp. 2367-2376).

=== Documentation

* PyTorch Documentation - https://pytorch.org/docs/stable/optim.html[torch.optim]

=== External coding guidelines

* Code Smells for Machine Learning Applications - https://hynn01.github.io/ml-smells/posts/codesmells/11-hyperparameter-not-explicitly-set/[Hyperparameter not Explicitly Set]