The Unicode characters set https://en.wikipedia.org/wiki/C0_and_C1_control_codes[C1] (from \u0080 to \u009F) is considered by some languages as control characters (e.g.: C# https://docs.microsoft.com/en-us/dotnet/api/system.char.iscontrol?view=netframework-4.8[Char.IsControl(Char)] ). But it's not true for all encodings. For example, Windows-1250 code page encodes euro sign (€ unicode \u20AC) by using \u0080, and it's not a control character. So if a source code is encoded using UTF-8, we can not guess what means characters between \u0080-\u00FF because it depends of the original code page used to write characters in this range before converting it to UTF-8.