2021-09-27 15:21:34 +02:00
|
|
|
|
When using metacharacters like `\w` without the Unicode flag or when using hard-coded character classes like `[a-zA-Z]`,
|
|
|
|
|
letters outside of the ASCII range, such as umlauts, accented letters, or letters from non-Latin languages, won’t be matched.
|
|
|
|
|
This may cause code to incorrectly handle input containing such letters.
|
|
|
|
|
|
|
|
|
|
To correctly handle non-ASCII input, it is recommended to use the Unicode flag `\u` or Unicode character properties like `\p{L}`.
|
|
|
|
|
|
|
|
|
|
== Noncompliant Code Example
|
|
|
|
|
|
2022-02-04 17:28:24 +01:00
|
|
|
|
[source,php]
|
2021-09-27 15:21:34 +02:00
|
|
|
|
----
|
|
|
|
|
preg_match("/[a-zA-Z]/", "ö"); // returns 0
|
|
|
|
|
preg_match("/\w/", "ö"); // returns 0
|
|
|
|
|
----
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
== Compliant Solution
|
|
|
|
|
|
2022-02-04 17:28:24 +01:00
|
|
|
|
[source,php]
|
2021-09-27 15:21:34 +02:00
|
|
|
|
----
|
|
|
|
|
preg_match("/\w/u", "ö"); // returns 1
|
|
|
|
|
preg_match("/\p{L}/", "ö"); // return 1
|
|
|
|
|
----
|