This rule raises an issue when an invalid nested estimator parameter is set on a Pipeline. == Why is this an issue? In the sklearn library, when using the `Pipeline` class, it is possible to modify the parameters of the nested estimators. This modification can be done by using the `Pipeline` method `set_params` and specifying the name of the estimator and the parameter to update separated by a double underscore ``++__++``. [source,python] ---- from sklearn.pipeline import Pipeline from sklearn.svm import SVC pipe = Pipeline(steps=[("clf", SVC())]) pipe.set_params(clf__C=10) ---- In the example above, the regularization parameter `C` is set to the value `10` for the classifier called `clf`. Setting such parameters can be done as well with the help of the `param_grid` parameter for example when using `GridSearchCV`. Providing invalid parameters that do not exist on the estimator can lead to unexpected behavior or runtime errors. This rule checks that the parameters provided to the `set_params` method of a Pipeline instance or through the `param_grid` parameters of a `GridSearchCV` are valid for the nested estimators. == How to fix it To fix this issue provide valid parameters to the nested estimators. === Code examples ==== Noncompliant code example [source,python,diff-id=1,diff-type=noncompliant] ---- from sklearn.pipeline import Pipeline from sklearn.decomposition import PCA pipe = Pipeline(steps=[('reduce_dim', PCA())]) pipe.set_params(reduce_dim__C=2) # Noncompliant: the parameter C does not exists for the PCA estimator ---- ==== Compliant solution [source,python,diff-id=1,diff-type=compliant] ---- from sklearn.pipeline import Pipeline from sklearn.decomposition import PCA pipe = Pipeline(steps=[('reduce_dim', PCA())]) pipe.set_params(reduce_dim__n_components=2) # Compliant ---- == Resources === Documentation * Scikit-Learn documentation - https://scikit-learn.org/stable/modules/compose.html#access-to-nested-parameters[Access to nested parameters] * Scikit-Learn documentation - https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.GridSearchCV.html#sklearn-model-selection-gridsearchcv[GridSearchCV reference] ifdef::env-github,rspecator-view[] (visible only on this page) == Implementation specification Verify that the keys of the dict passed to the set_params method of the Pipeline or the keys of the dict passed to the params_grid of a GridSearchCV, contain the name of the estimator and a parameter that exists on the estimator. We should refer to the stubs, to verify if a parameter is valid. The name of the estimator is set in the Pipeline constructor in both cases. The key should have the following shape __. === Message Provide valid parameters to the nested estimators. Secondary location message: The invalid parameter is set here. === Issue location The parameter passed as param_grid or the parameter passed to set_params Secondary location: The invalid parameter name (the key of the dictionary) endif::env-github,rspecator-view[]