rspec/rules/S5843/description.adoc
Johann Beleites 903426703f
Create rule S5843[kotlin]: Regular expressions should not be too complicated (#416)
Co-authored-by: Margarita Nedzelska <margarita.nedzelska@sonarsource.com>
2021-11-03 16:33:50 +00:00


Overly complicated regular expressions are hard to read and maintain, and can easily cause hard-to-find bugs. If a regex is too complicated, consider replacing it, or parts of it, with regular code, or at least splitting it into multiple patterns.
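
As an illustration of moving part of a regex into regular code, the sketch below checks a `YYYY-MM-DD` string. The class name `IsoDateCheck` and the method `isPlausibleDate` are hypothetical names chosen for this example, and the range checks are deliberately rough (they do not account for month lengths).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
final class IsoDateCheck {
    // A single-regex version would need nested alternations, for example:
    //   \d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])
    // Here the regex only checks the overall shape of the input.
    private static final Pattern SHAPE = Pattern.compile("(\\d{4})-(\\d{2})-(\\d{2})");

    static boolean isPlausibleDate(String input) {
        Matcher m = SHAPE.matcher(input);
        if (!m.matches()) {
            return false;
        }
        int month = Integer.parseInt(m.group(2));
        int day = Integer.parseInt(m.group(3));
        // The range checks live in regular code instead of in the pattern.
        return month >= 1 && month <= 12 && day >= 1 && day <= 31;
    }
}
```

The pattern that remains contains only quantifiers at the top level, and the numeric constraints are easier to read and extend as plain comparisons.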
The complexity of a regular expression is determined as follows:
Each of the following operators increases the complexity by an amount equal to the current nesting level and also increases the current nesting level by one for its arguments:
* ``++|++`` - when multiple ``++|++`` operators are used together, the subsequent ones only increase the complexity by 1
* ``++&&++`` (inside character classes) - when multiple ``++&&++`` operators are used together, the subsequent ones only increase the complexity by 1
* Quantifiers (``++*++``, ``+``, ``++?++``, ``++{n,m}++``, ``++{n,}++`` or ``++{n}++``)
* Non-capturing groups that set flags (such as ``++(?i:some_pattern)++`` or ``++(?i)some_pattern++``)
* Lookahead and lookbehind assertions
Additionally, each use of the following features increases the complexity by 1 regardless of nesting:
* character classes
* back references
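
The scoring rules above can be walked through by hand on a few small patterns. The scores in the comments below are a reading of those rules under the assumption that the nesting level starts at 1 at the top of the expression; they are not output from the analyzer. The class name `ComplexityWalkthrough` is hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical holder class; scores in comments are hand-computed (assumption: nesting starts at 1).
final class ComplexityWalkthrough {
    // Quantifier + at nesting level 1 (+1), one character class (+1): total 2.
    static final Pattern WORD = Pattern.compile("[a-z]+");

    // First | at nesting level 1 (+1), the subsequent | adds only 1: total 2.
    static final Pattern CHOICE = Pattern.compile("a|b|c");

    // Quantifier * at nesting level 1 (+1); the | sits inside its argument,
    // so it is scored at nesting level 2 (+2): total 3.
    static final Pattern NESTED = Pattern.compile("(a|b)*");

    // One back reference (+1); the capturing group itself adds nothing: total 1.
    static final Pattern REPEAT = Pattern.compile("(a)\\1");
}
```

Note how `(a|b)*` scores higher than `a|b|c` even though it is shorter: nesting an operator inside a quantifier costs more than chaining operators at the same level.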
If a regular expression is split among multiple variables, the complexity is calculated for each variable individually, not for the whole regular expression. If a regular expression is split over multiple lines and each line is accompanied by a comment (either a Java comment or a comment within the regular expression), each line is treated individually; otherwise, the regular expression is analyzed as a whole.
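
A minimal sketch of both ways of splitting, assuming Java string concatenation is used to assemble the pattern. The class name `SplitRegex` and its constants are hypothetical names chosen for this example.

```java
import java.util.regex.Pattern;

// Hypothetical example class showing the two splitting styles described above.
final class SplitRegex {
    // Split among variables: per the rule text, each fragment is scored on its own.
    static final String YEAR = "[0-9]{4}";
    static final String MONTH = "(?:0[1-9]|1[0-2])";
    static final String DAY = "(?:0[1-9]|[12][0-9]|3[01])";
    static final Pattern DATE = Pattern.compile(YEAR + "-" + MONTH + "-" + DAY);

    // Split over multiple lines, each accompanied by a Java comment.
    static final Pattern TIME = Pattern.compile(
        "(?:[01][0-9]|2[0-3])"  // hours, 00 to 23
        + ":[0-5][0-9]"         // minutes
        + "(?::[0-5][0-9])?"    // optional seconds
    );
}
```

Naming the fragments also documents what each part of the expression is meant to match, which a single long literal cannot do.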