
== Why is this an issue?

When using POSIX classes like ``++\p{Alpha}++`` without the ``++UNICODE_CHARACTER_CLASS++`` flag, or hard-coded character classes like ``++"[a-zA-Z]"++``, letters outside the ASCII range, such as umlauts, accented letters, or letters from non-Latin scripts, won't be matched. This may cause code to incorrectly handle input containing such letters.


To correctly handle non-ASCII input, it is recommended to use Unicode classes like ``++\p{IsAlphabetic}++``. When using POSIX classes, Unicode support should be enabled either by passing ``++Pattern.UNICODE_CHARACTER_CLASS++`` as a flag to ``++Pattern.compile++`` or by using ``++(?U)++`` inside the regex.


=== Noncompliant code example

[source,java]
----
Pattern.compile("[a-zA-Z]");
Pattern.compile("\\p{Alpha}");
----
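
As a rough illustration of the failure mode, the sketch below runs both patterns against a word containing non-ASCII letters. The class name `AsciiOnlyDemo`, the helper `isWord`, and the sample inputs are assumptions made up for this example. A `+` quantifier is added to each pattern so that whole words can be tested.

```java
import java.util.regex.Pattern;

public class AsciiOnlyDemo {
    // Hypothetical helper: true when the entire input matches the regex.
    public static boolean isWord(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    public static void main(String[] args) {
        String ascii = "Grosse";
        // German word with o-umlaut and sharp s, written with Unicode
        // escapes so the result does not depend on the source encoding.
        String german = "Gr\u00F6\u00DFe";

        System.out.println(isWord("[a-zA-Z]+", ascii));    // true
        System.out.println(isWord("[a-zA-Z]+", german));   // false
        System.out.println(isWord("\\p{Alpha}+", ascii));  // true
        System.out.println(isWord("\\p{Alpha}+", german)); // false
    }
}
```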


=== Compliant solution

[source,java]
----
Pattern.compile("\\p{IsAlphabetic}"); // matches all letters from all languages
Pattern.compile("\\p{IsLatin}"); // matches Latin letters, including umlauts and other non-ASCII variations
Pattern.compile("\\p{Alpha}", Pattern.UNICODE_CHARACTER_CLASS);
Pattern.compile("(?U)\\p{Alpha}");
----
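
A minimal sketch of how the compliant patterns behave on non-ASCII input. The class name `UnicodeAwareDemo`, the helper `matchesWord`, and the sample words are hypothetical and exist only for this example. A `+` quantifier is added to each pattern so that whole words can be tested.

```java
import java.util.regex.Pattern;

public class UnicodeAwareDemo {
    // Hypothetical helper: true when the entire input matches the pattern.
    public static boolean matchesWord(Pattern pattern, String input) {
        return pattern.matcher(input).matches();
    }

    public static void main(String[] args) {
        // Inputs are written with Unicode escapes so the result does not
        // depend on the source file encoding.
        String german = "Gr\u00F6\u00DFe";               // Latin script, o-umlaut and sharp s
        String greek = "\u03BB\u03CC\u03B3\u03BF\u03C2"; // Greek script, the word "logos"

        Pattern alphabetic = Pattern.compile("\\p{IsAlphabetic}+");
        Pattern latin = Pattern.compile("\\p{IsLatin}+");
        Pattern flagged = Pattern.compile("\\p{Alpha}+", Pattern.UNICODE_CHARACTER_CLASS);
        Pattern inline = Pattern.compile("(?U)\\p{Alpha}+");

        System.out.println(matchesWord(alphabetic, german)); // true
        System.out.println(matchesWord(alphabetic, greek));  // true
        System.out.println(matchesWord(latin, german));      // true
        System.out.println(matchesWord(latin, greek));       // false: Greek is not Latin script
        System.out.println(matchesWord(flagged, greek));     // true
        System.out.println(matchesWord(inline, greek));      // true
    }
}
```

Note that ``++\p{IsLatin}++`` is deliberately narrower than the other three variants: it accepts only characters of the Latin script, so it fits cases where other scripts should be rejected.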


ifdef::env-github,rspecator-view[]

'''
== Implementation Specification
(visible only on this page)

=== Message

* when using plain character classes: Replace this character range with a Unicode-aware character class.
* when using POSIX classes: Enable the "UNICODE_CHARACTER_CLASS" flag or use a Unicode-aware alternative.


include::../highlighting.adoc[]

endif::env-github,rspecator-view[]