\[~nicolas.harraudeau] - the https://github.com/SonarSource/sonar-clang/blob/master/src/core/analyzers/ControlCharacterCheck.cpp#L73[CPP implementation] looks at \n, \t, \r, 0, nonBreakableSpaceCharacterCode in combination with other characters and at characters with code < 32.
On the other hand, https://discuss.sonarsource.com/t/new-rule-multiple-language-remove-invisible-characters-from-your-string/512/5[this internal discussion] mentions all invisible space characters.
I guess we should aim to cover invisible spaces too, right? If so, should we mention that in the RSPEC and update the rule name (control characters and also invisible spaces)?
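
To make the scope question concrete, here is a minimal sketch of a check that covers both control characters and invisible spaces. It is not the sonar-clang logic: the function name `find_suspicious`, the `ALLOWED` set, and the choice of the Unicode general categories Cc, Cf, and Zs are all assumptions made for illustration.

```python
import unicodedata

# Hypothetical allow-list: whitespace controls that are normal in source text.
ALLOWED = {"\n", "\t", "\r"}


def find_suspicious(text):
    """Return (index, 'U+XXXX') for each control or invisible-space character."""
    hits = []
    for i, ch in enumerate(text):
        # A plain ASCII space is category Zs too, so skip it explicitly.
        if ch in ALLOWED or ch == " ":
            continue
        category = unicodedata.category(ch)
        # Cc: control characters (C0 and C1)
        # Cf: format characters, e.g. zero-width space U+200B
        # Zs: space separators, e.g. no-break space U+00A0
        if category in ("Cc", "Cf", "Zs"):
            hits.append((i, f"U+{ord(ch):04X}"))
    return hits
```

With these assumptions, `find_suspicious("a\u200bb")` reports the zero-width space at index 1, while ordinary text with newlines and tabs reports nothing. Whether Cf and Zs are the right categories for the rule is exactly the question raised above.
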
The Unicode https://en.wikipedia.org/wiki/C0_and_C1_control_codes[C1] character set (from \u0080 to \u009F) is treated as control characters by some languages (e.g. C# https://docs.microsoft.com/en-us/dotnet/api/system.char.iscontrol?view=netframework-4.8[Char.IsControl(Char)]). But this does not hold for all encodings. For example, the Windows-1250 code page encodes the euro sign (€, Unicode \u20AC) as byte 0x80, which is not a control character there. So even if source code is encoded in UTF-8, we cannot guess what characters in the \u0080-\u00FF range mean, because that depends on the code page originally used to write them before the conversion to UTF-8.
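
The ambiguity can be reproduced with a short sketch: the same byte 0x80 is decoded once as Windows-1250 and once as Latin-1, the way a conversion that assumes the wrong code page would. The helper name `describe` is hypothetical; the codecs are the standard Python `cp1250` and `latin-1`.

```python
import unicodedata

raw = bytes([0x80])

# Decoded with the code page the author actually used: the euro sign.
as_cp1250 = raw.decode("cp1250")
# Decoded as Latin-1: the byte maps straight to the C1 code point U+0080.
as_latin1 = raw.decode("latin-1")


def describe(ch):
    """Return the code point label and Unicode general category of ch."""
    return f"U+{ord(ch):04X}", unicodedata.category(ch)


print(describe(as_cp1250))  # currency symbol
print(describe(as_latin1))  # control character
```

After conversion to UTF-8 the two results are different, valid byte sequences, so the analyzer sees a legitimate U+0080 in the second case and has no way to tell that a euro sign was intended.
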