133 lines
4.8 KiB
Plaintext
133 lines
4.8 KiB
Plaintext
This rule raises an issue when the identity operator is used with cached literals.
|
|
|
|
== Why is this an issue?
|
|
|
|
The identity operators ``++is++`` and ``++is not++`` check if the same object is on both sides,
|
|
i.e. ``++a is b++`` returns ``++True++`` if ``++id(a) == id(b)++``.
|
|
|
|
The CPython interpreter caches certain built-in values for integers, bytes, floats, strings, frozensets and tuples. When a value is cached, all its references are pointing to the same object in memory; their ids are identical.
|
|
|
|
The following example illustrates this caching mechanism:
|
|
|
|
[source,python]
|
|
----
|
|
my_int = 1
|
|
other_int = 1
|
|
|
|
id(my_int) == id(other_int) # True
|
|
----
|
|
|
|
In both assignments (to `my_int` and `other_int`), the assigned value `1` comes from the interpreter cache, only one integer object `1` is created in memory.
|
|
This means both variables are referencing the same object. For this reason, their ids are identical and `my_int is other_int` evaluates to `True`.
|
|
This mechanism allows the interpreter for better performance, saving memory space, by not creating new objects every time for commonly accessed values.
|
|
|
|
However this caching mechanism does not apply to every value:
|
|
|
|
[source,python]
|
|
----
|
|
my_int = 1000
|
|
|
|
id(my_int) == id(1000) # False
|
|
my_int is 1000 # False
|
|
----
|
|
|
|
In this example the integer `1000` is not cached. Each reference to `1000` creates an new integer object in memory with a new id.
|
|
This means that `my_int is 1000` is always `False`, as the two objects have different ids.
|
|
|
|
This is the reason why using the identity operators on integers, bytes, floats, strings, frozensets and tuples is unreliable as the behavior changes depending on the value.
|
|
|
|
Moreover the caching behavior is not part of the Python language specification and could vary between interpreters.
|
|
CPython 3.8 https://docs.python.org/3.8/whatsnew/3.8.html#changes-in-python-behavior[warns about comparing literals using identity operators].
|
|
|
|
This rule raises an issue when at least one operand of an identity operator:
|
|
|
|
* is of type ``++int++``, ``++bytes++``, ``++float++``, ``++frozenset++`` or ``++tuple++``.
|
|
* is a string literal.
|
|
|
|
If you need to compare these types you should use the equality operators instead `==` or `!=`.
|
|
|
|
=== Exceptions
|
|
|
|
The only case where the `is` operator could be used with a cached type is with "interned" strings.
|
|
The Python interpreter provides a way to explicitly cache any string literals and benefit from improved performances, such as:
|
|
|
|
* saved memory space.
|
|
* faster string comparison: as only their memory address need to be compared.
|
|
* faster dictionary lookup: if the dictionary keys are interned, the lookup can be done by comparing memory address as well.
|
|
|
|
This explicit caching is done through interned strings (i.e. `sys.intern("some string")`).
|
|
|
|
[source,python]
|
|
----
|
|
from sys import intern
|
|
|
|
my_text = "text"
|
|
intern("text") is intern(my_text) # True
|
|
----
|
|
|
|
Note however that interned strings don't necessarily have the same identity as string literals.
|
|
|
|
It is also important to note that interned strings may be garbage collected, so in order to benefit from their caching mechanism,
|
|
a reference to the interned string should be kept.
|
|
|
|
== How to fix it
|
|
|
|
Use the equality operators (`==` or `!=`) to compare ``++int++``, ``++bytes++``, ``++float++``, ``++frozenset++``, ``++tuple++`` and string literals.
|
|
|
|
=== Code examples
|
|
|
|
==== Noncompliant code example
|
|
|
|
[source,python,diff-id=1,diff-type=noncompliant]
|
|
----
|
|
my_int = 2000
|
|
my_int is 2000 # Noncompliant: the integer 2000 may not be cached, the identity operator could return False.
|
|
|
|
() is tuple() # Noncompliant: this will return True only because the CPython interpreter cached the empty tuple.
|
|
(1,) is tuple([1]) # Noncompliant: comparing non empty tuples will return False as none of these objects are cached.
|
|
----
|
|
|
|
|
|
==== Compliant solution
|
|
|
|
[source,python,diff-id=1,diff-type=compliant]
|
|
----
|
|
my_int = 2000
|
|
my_int == 2000
|
|
|
|
() == tuple()
|
|
(1,) == tuple([1])
|
|
----
|
|
|
|
|
|
== Resources
|
|
|
|
=== Documentation
|
|
|
|
* Python Documentation - https://docs.python.org/3.8/whatsnew/3.8.html#changes-in-python-behavior[Changes in Python behaviour].
|
|
* Python Documentation - https://docs.python.org/3/library/sys.html?highlight=sys.intern#sys.intern[sys.intern]
|
|
|
|
=== Articles & blog posts
|
|
|
|
* Adam Johnson's Blog - https://adamj.eu/tech/2020/01/21/why-does-python-3-8-syntaxwarning-for-is-literal/[Why does Python 3.8 log a SyntaxWarning for 'is' with literals?]
|
|
* Trey Hunner's Blog - https://treyhunner.com/2019/03/unique-and-sentinel-values-in-python/#Equality_vs_identity[Equality vs identity]
|
|
|
|
ifdef::env-github,rspecator-view[]
|
|
|
|
'''
|
|
== Implementation Specification
|
|
(visible only on this page)
|
|
|
|
=== Message
|
|
|
|
* Replace this "is" operator with "=="; identity operator is not reliable here.
|
|
* Replace this "is not" operator with "!="; identity operator is not reliable here.
|
|
|
|
|
|
=== Highlighting
|
|
|
|
Primary: the "is" or "is not" operator.
|
|
|
|
|
|
endif::env-github,rspecator-view[]
|