![github-actions[bot]](/assets/img/avatar_default.png)
You can preview this rule [here](https://sonarsource.github.io/rspec/#/rspec/S6734/python) (updated a few minutes after each push). ## Review A dedicated reviewer checked the rule description successfully for: - [ ] logical errors and incorrect information - [ ] information gaps and missing content - [ ] text style and tone - [ ] PR summary and labels follow [the guidelines](https://github.com/SonarSource/rspec/#to-modify-an-existing-rule) --------- Co-authored-by: guillaume-dequenne-sonarsource <guillaume-dequenne-sonarsource@users.noreply.github.com> Co-authored-by: Guillaume Dequenne <guillaume.dequenne@sonarsource.com>
92 lines
3.1 KiB
Plaintext
92 lines
3.1 KiB
Plaintext
This rule raises an issue when the `inplace` parameter is set to `True` when modifying a Pandas DataFrame.
|
|
|
|
== Why is this an issue?
|
|
|
|
Using `inplace=True` when modifying a Pandas DataFrame means that the method will modify the DataFrame in place, rather than returning a new object:
|
|
|
|
[source,python]
|
|
----
|
|
df.an_operation(inplace=True)
|
|
----
|
|
|
|
When `inplace` is `False` (which is the default behavior), a new object is returned instead:
|
|
|
|
[source,python]
|
|
----
|
|
df2 = df.an_operation(inplace=False)
|
|
----
|
|
|
|
Generally speaking, the motivation for modifying an object in place is to improve efficiency by avoiding the creation of a copy of the original object. Unfortunately, many methods supporting the inplace keyword either cannot actually be done inplace, or make a copy as a consequence of the operations they perform, regardless of whether `inplace` is `True` or not. For example, the following methods can never operate in place:
|
|
|
|
* drop (dropping rows)
|
|
* dropna
|
|
* drop_duplicates
|
|
* sort_values
|
|
* sort_index
|
|
* eval
|
|
* query
|
|
|
|
Because of this, expecting efficiency gains through the use of `inplace=True` is not reliable.
|
|
|
|
Additionally, using `inplace=True` may trigger a `SettingWithCopyWarning` and make the overall intention of the code unclear. In the following example, modifying `df2` will not modify the original `df` dataframe, and a warning will be raised:
|
|
|
|
[source,python]
|
|
----
|
|
df = pd.DataFrame({'a': [3, 2, 1], 'b': ['x', 'y', 'z']})
|
|
|
|
df2 = df[df['a'] > 1]
|
|
df2['b'].replace({'x': 'abc'}, inplace=True)
|
|
# SettingWithCopyWarning:
|
|
# A value is trying to be set on a copy of a slice from a DataFrame
|
|
----
|
|
|
|
In general, side effects such as object mutation may be the source of subtle bugs and explicit reassignment is considered safer.
|
|
|
|
When intermediate results are not needed, method chaining is a more explicit alternative to the `inplace` parameter. For instance, one may write:
|
|
|
|
[source,python]
|
|
----
|
|
df.drop('City', axis=1, inplace=True)
|
|
df.sort_values('Name', inplace=True)
|
|
df.reset_index(drop=True, inplace=True)
|
|
----
|
|
|
|
Through method chaining, this previous example may be rewritten as:
|
|
|
|
[source,python]
|
|
----
|
|
result = df.drop('City', axis=1).sort_values('Name').reset_index(drop=True)
|
|
----
|
|
|
|
For these reasons, it is therefore recommended to avoid using `inplace=True` in favor of more explicit and less error-prone alternatives.
|
|
|
|
== How to fix it
|
|
|
|
To fix this issue, avoid using the `inplace=True` parameter. Either opt for method chaining when intermediary results are not needed, or for explicit reassignment when the intention is to perform a simple operation.
|
|
|
|
=== Code examples
|
|
|
|
==== Noncompliant code example
|
|
|
|
[source,python,diff-id=1,diff-type=noncompliant]
|
|
----
|
|
import pandas as pd
|
|
def foo():
|
|
df.drop(columns='A', inplace=True) # Noncompliant: Using inplace=True is error-prone and should be avoided
|
|
----
|
|
|
|
==== Compliant solution
|
|
|
|
[source,python,diff-id=1,diff-type=compliant]
|
|
----
|
|
import pandas as pd
|
|
def foo():
|
|
df = df.drop(columns='A') # OK: explicit reassignment
|
|
----
|
|
|
|
|
|
== Resources
|
|
=== Documentation
|
|
|
|
* Pandas Enhancement Proposal - https://github.com/pandas-dev/pandas/pull/51466[PDEP-8: In-place methods in pandas]
|