61 lines
2.5 KiB
Plaintext
61 lines
2.5 KiB
Plaintext
![]() |
This rule raises an issue when the argument `dayfirst` or `yearfirst` is set to `True` on `pandas.to_datetime` function with an incorrect string format.
|
||
|
|
||
|
== Why is this an issue?
|
||
|
|
||
|
The `pandas.to_datetime` function transforms a string to a date object. The string representation of the date can take multiple formats. To correctly parse these strings,
|
||
|
`pandas.to_datetime` provides several arguments to setup the parsing, such as `dayfirst` or `yearfirst`. For example setting `dayfirst` to `True` indicates to `pandas.to_datetime`
|
||
|
that the date and time will be represented as a string with the shape `day month year time`. Similarly with `yearfirst`, the string should have the following shape `year month day time`.
|
||
|
|
||
|
These two arguments are not strict, meaning if the shape of the string is not the one expected by `pandas.to_datetime`, the function will not fail and try to figure out which part of the string is the day, month or year.
|
||
|
|
||
|
In the following example the `dayfirst` argument is set to `True` but we can clearly see that the `month` part of the date would be incorrect. In this case `pandas.to_datetime` will ignore the `dayfirst` argument, and parse the date
|
||
|
as the 22nd of January.
|
||
|
|
||
|
[source,python]
|
||
|
----
|
||
|
import pandas as pd
|
||
|
|
||
|
pd.to_datetime(["01-22-2000 10:00"], dayfirst=True)
|
||
|
----
|
||
|
|
||
|
No issue will be raised in such a case, which could lead to bugs later in the program. Either the user made a mistake by setting `dayfirst` to `True` or the month part of the date is incorrect.
|
||
|
|
||
|
|
||
|
== How to fix it
|
||
|
|
||
|
To fix this issue either correct the string representation of the date to match the expected format, or remove the arguments `dayfirst` or `yearfirst`.
|
||
|
|
||
|
=== Code examples
|
||
|
|
||
|
==== Noncompliant code example
|
||
|
|
||
|
[source,python,diff-id=1,diff-type=noncompliant]
|
||
|
----
|
||
|
import pandas as pd
|
||
|
|
||
|
pd.to_datetime(["01-22-2000 10:00"], dayfirst=True) # Noncompliant: the second part of the date (22) is not a valid month
|
||
|
|
||
|
pd.to_datetime(["02/03/2000 12:00"], yearfirst=True) # Noncompliant: the year is not the first part of the date
|
||
|
|
||
|
pd.to_datetime(["03-14-2000 10:00"], dayfirst=True) # Noncompliant
|
||
|
----
|
||
|
|
||
|
==== Compliant solution
|
||
|
|
||
|
[source,python,diff-id=1,diff-type=compliant]
|
||
|
----
|
||
|
import pandas as pd
|
||
|
|
||
|
pd.to_datetime(["01-12-2000 10:00"], dayfirst=True) # Compliant: the date will be parsed as expected
|
||
|
|
||
|
pd.to_datetime(["2000/02/28 12:00"], yearfirst=True) # Compliant
|
||
|
|
||
|
pd.to_datetime(["03-14-2000 10:00"]) # Compliant
|
||
|
----
|
||
|
|
||
|
|
||
|
== Resources
|
||
|
=== Documentation
|
||
|
|
||
|
* Pandas documentation - https://pandas.pydata.org/docs/user_guide/timeseries.html#converting-to-timestamps[Converting to timestamps]
|