This rule raises an issue when the ``++pandas.DataFrame.values++`` is used instead of the ``++pandas.DataFrame.to_numpy()++`` method.
== Why is this an issue?
The ``++values++`` attribute and the ``++to_numpy()++`` method in pandas both provide a way to return a NumPy representation of the ``++DataFrame++``. However, there are some reasons why the ``++to_numpy()++`` method is recommended over the ``++values++`` attribute:
* *Future Compatibility:*
The ``++values++`` attribute is considered a legacy feature, while the ``++to_numpy()++`` is the recommended method to extract data and is considered more future-proof.
* *Data type consistency:*
If the ``++DataFrame++`` has columns with different data types, NumPy will choose a common data type that can hold all the data. This may lead to loss of information, unexpected type conversions, or increased memory usage. The ``++to_numpy()++`` allows you to select the common type manually, passing the ``++dtype++`` argument.
* *View vs Copy:*
The ``++values++`` attribute can return a view or a copy of the data depending on whether the data needs to be transposed. This can lead to confusion when modifying the extracted data. On the other hand, ``++to_numpy()++`` has ``++copy++`` argument allowing to force it always to return a new NumPy array, ensuring that any changes you make won't affect the original ``++DataFrame++``.
* *Missing values control:*
The ``++to_numpy()++`` allows to specify the default value used for missing values in the ``++DataFrame++``, while the ``++values++`` will always use ``++numpy.nan++`` for missing values.
== How to fix it
Use the ``++to_numpy()++`` method instead of the ``++values++`` attribute to get a NumPy representation of the ``++DataFrame++``.
=== Code examples
==== Noncompliant code example
[source,python,diff-id=1,diff-type=noncompliant]
----
import pandas as pd
df = pd.DataFrame({
'X': ['A', 'B', 'A', 'C'],
'Y': [10, 7, 12, 5]
})
arr = df.values # Noncompliant: using the 'values' attribute is not recommended