Overly complicated regular expressions are hard to read and maintain, and can easily cause hard-to-find bugs. If a regex is too complicated, consider replacing it, or parts of it, with regular code, or at least splitting it into multiple patterns.

The complexity of a regular expression is determined as follows:

Each of the following operators increases the complexity by an amount equal to the current nesting level and also increases the current nesting level by one for its arguments:

* ``++|++`` - when multiple ``++|++`` operators are used together, the subsequent ones only increase the complexity by 1
* ``++&&++`` (inside character classes) - when multiple ``++&&++`` operators are used together, the subsequent ones only increase the complexity by 1
* Quantifiers (``++*++``, ``+``, ``++?++``, ``++{n,m}++``, ``++{n,}++`` or ``++{n}++``)
* Non-capturing groups that set flags (such as ``++(?i:some_pattern)++`` or ``++(?i)some_pattern++``)
* Lookahead and lookbehind assertions
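
The nesting rule can be sketched as a small recursive tally over a hand-built regex tree. This is only an illustrative model, not the analyzer's implementation: the class `NestingComplexitySketch`, its `Node` type and its helper methods are hypothetical names, no regex parsing is attempted, and the sketch assumes that the outermost nesting level is 1, which the description above does not state explicitly.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified model of the nesting rule described above.
// Assumption: the outermost nesting level is 1.
final class NestingComplexitySketch {

    /** One node of a hand-built regex tree (no parsing is attempted here). */
    static final class Node {
        final boolean nestingOperator; // quantifier, flag group, lookaround, or a run of '|' / '&&'
        final int extraOperators;      // further '|' / '&&' in the same run, each costing 1
        final List<Node> children;

        Node(boolean nestingOperator, int extraOperators, Node... children) {
            this.nestingOperator = nestingOperator;
            this.extraOperators = extraOperators;
            this.children = Arrays.asList(children);
        }
    }

    /** A literal or other element that adds no nesting cost. */
    static Node leaf() {
        return new Node(false, 0);
    }

    /** A plain sequence or group without flags; assumed to add no cost. */
    static Node sequence(Node... parts) {
        return new Node(false, 0, parts);
    }

    static Node quantifier(Node argument) {
        return new Node(true, 0, argument);
    }

    /** n alternatives contain n - 1 '|' operators; all but the first cost 1 each. */
    static Node alternation(Node... alternatives) {
        if (alternatives.length < 2) {
            throw new IllegalArgumentException("an alternation needs at least two alternatives");
        }
        return new Node(true, alternatives.length - 2, alternatives);
    }

    /** Adds the current nesting level for each nesting operator, then recurses one level deeper. */
    static int complexity(Node node, int nestingLevel) {
        int total = 0;
        int childLevel = nestingLevel;
        if (node.nestingOperator) {
            total += nestingLevel + node.extraOperators;
            childLevel = nestingLevel + 1;
        }
        for (Node child : node.children) {
            total += complexity(child, childLevel);
        }
        return total;
    }

    public static void main(String[] args) {
        // a|b|c : first '|' costs 1 (level 1), second '|' costs 1
        System.out.println(complexity(alternation(leaf(), leaf(), leaf()), 1)); // 2
        // (a|b)* : '*' costs 1 (level 1), '|' costs 2 (level 2)
        System.out.println(complexity(quantifier(alternation(leaf(), leaf())), 1)); // 3
    }
}
```

Under these assumptions, ``++(a|b)*++`` scores higher than ``++a|b|c++`` even though it contains fewer operators, because the alternation sits one level deeper.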

Additionally, each use of the following features increases the complexity by 1 regardless of nesting:

* character classes
* back references
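
As a small illustration of the flat increments, the pattern below uses one character class and one back reference. The class and method names are hypothetical, and the tally in the comments is a hand count under the rules above, not analyzer output; it assumes that a capturing group adds nothing, since capturing groups are not among the listed features.

```java
import java.util.regex.Pattern;

// Hypothetical example; the tally is a hand count, not analyzer output.
final class FlatFeatureSketch {

    // [a-z]  character class  +1
    // \1     back reference   +1
    // Assumption: the capturing group itself adds nothing, as it is not a listed feature.
    static final Pattern DOUBLED_LETTER = Pattern.compile("([a-z])\\1");

    /** Returns true if the text contains the same lowercase ASCII letter twice in a row. */
    static boolean hasDoubledLetter(String text) {
        return DOUBLED_LETTER.matcher(text).find();
    }
}
```

Because these increments ignore nesting, wrapping the same pattern in a quantifier would add the quantifier's own cost but leave the two flat increments at 1 each.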

If a regular expression is split among multiple variables, the complexity is calculated for each variable individually, not for the whole regular expression. If a regular expression is split over multiple lines, each line is treated individually if it is accompanied by a comment (either a Java comment or a comment within the regular expression); otherwise, the regular expression is analyzed as a whole.
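
One way to apply this is to give each part of a pattern its own named constant, so that each part is scored on its own. The following is a minimal sketch; `SplitRegexSketch`, its constants and `isIsoDate` are hypothetical names chosen for illustration, and the pattern only checks the shape of a date, not whether the day exists in the given month.

```java
import java.util.regex.Pattern;

// Hypothetical example of splitting one pattern into separately named parts.
final class SplitRegexSketch {

    // Each part lives in its own variable and stays small enough to read at a glance.
    private static final String YEAR = "\\d{4}";
    private static final String MONTH = "(?:0[1-9]|1[0-2])";
    private static final String DAY = "(?:0[1-9]|[12]\\d|3[01])";

    static final Pattern ISO_DATE = Pattern.compile(YEAR + "-" + MONTH + "-" + DAY);

    /** Checks the yyyy-MM-dd shape only; it does not validate the calendar (e.g. 2024-02-30 passes). */
    static boolean isIsoDate(String text) {
        return ISO_DATE.matcher(text).matches();
    }
}
```

The named parts also document the intent of each fragment, which a single long literal cannot do.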