rspec/rules/S6714/python/rule.adoc
github-actions[bot] a62f01bd72
Create rule S6714: Passing a list to np.array should be preferred over passing a generator. (#2934)
You can preview this rule
[here](https://sonarsource.github.io/rspec/#/rspec/S6714/python)
(updated a few minutes after each push).

## Review

A dedicated reviewer checked the rule description successfully for:

- [ ] logical errors and incorrect information
- [ ] information gaps and missing content
- [ ] text style and tone
- [ ] PR summary and labels follow [the
guidelines](https://github.com/SonarSource/rspec/#to-modify-an-existing-rule)

---------

Co-authored-by: joke1196 <joke1196@users.noreply.github.com>
Co-authored-by: David Kunzmann <david.kunzmann@sonarsource.com>
2023-09-22 12:10:54 +02:00

64 lines
2.2 KiB
Plaintext

This rule raises an issue when a generator is passed to ``++np.array++``.
== Why is this an issue?
The creation of a NumPy array can be done in several ways, for example by passing a Python list to the `np.array` function.
Another way would be to pass a generator to the ``++np.array++`` function, but doing so creates a 0-dimensional array of objects and may not be the intended goal.
This NumPy array will have a have a data type (dtype) of ``++object++`` and could hold any Python objects.
One of the characteristics of NumPy arrays is homogeneity, meaning all its elements are of the same type.
Creating an array of objects allows the user to create heterogeneous array without raising any errors and creating such an array can lead to bugs further in the program.
[source,python]
----
arr = np.array(x**2 for x in range(10))
arr.reshape(1)
arr.resize(2)
arr.put(indices=1, values=3) # No issues raised.
----
The NumPy array `arr` shown above now holds 2 values: a generator and the number 3.
== How to fix it
To fix this issue, either:
* pass a Python list instead of a generator to the ``++np.array++`` function or,
* explicitly show the intention to create a 0-dimensional array of objects by either adding ``++Any++`` as the type hint of the generator or by specifying the ``++dtype++`` parameter of the NumPy array as ``++object++``.
=== Code examples
==== Noncompliant code example
[source,python,diff-id=2,diff-type=noncompliant]
----
arr = np.array(x**2 for x in range(10)) # Noncompliant: the resulting array will be of the data type: object.
gen = (x*2 for x in range(5))
arr = np.array(gen) # Noncompliant: the resulting array will be of the data type: object.
----
==== Compliant solution
[source,python,diff-id=2,diff-type=compliant]
----
from typing import Any
arr = np.array([x**2 for x in range(10)]) # Compliant: a list of 10 elements is passed to the np.array function.
arr = np.array(x**2 for x in range(10), dtype=object) # Compliant: the dtype parameter of np.array is set to object.
gen: Any = (x*2 for x in range(5))
arr = np.array(gen) # Compliant: the generator is explicitly type hinted with Any.
----
== Resources
=== Documentation
* NumPy Documentation - https://numpy.org/doc/stable/reference/typing.html#arraylike[ArrayLike]