Create rule S7034: "contains" should be used to test whether a substring is part of a string (CPP-4996) (#4096)

This commit is contained in:
github-actions[bot] 2024-08-05 11:50:59 +02:00 committed by GitHub
parent 0f8af61051
commit 0f722e6d0b
No known key found for this signature in database
GPG Key ID: B5690EEEBB952194
3 changed files with 179 additions and 0 deletions

View File

@ -0,0 +1,27 @@
{
"title": "\"contains\" should be used to test whether a substring is part of a string",
"type": "CODE_SMELL",
"status": "ready",
"remediation": {
"func": "Constant\/Issue",
"constantCost": "10min"
},
"tags": [
"since-c++23",
"clumsy"
],
"defaultSeverity": "Major",
"ruleSpecification": "RSPEC-7034",
"sqKey": "S7034",
"scope": "All",
"defaultQualityProfiles": [
"Sonar way"
],
"quickfix": "partial",
"code": {
"impacts": {
"MAINTAINABILITY": "MEDIUM"
},
"attribute": "CLEAR"
}
}

View File

@ -0,0 +1,150 @@
== Why is this an issue?
There are multiple ways and useful functions to determine whether a substring or a character is part of another string.
Historically, the standard {cpp} functions provide this feature through proxies that do not directly reflect the intent of the code:
* The `find` functions on `std::string` and `std::string_view` return an index into the searched string.
The substring or character is not present when the index is equal to `npos` (often spelled `-1`).
* The `std::strstr` and `std::strchr` functions (inherited from C) return a pointer into the searched string.
When they return `nullptr`, the substring or character is not present.
While these functions work, they complexify the code:
1. They do not work at the right abstraction level.
Instead of just returning a `bool` indicating whether the substring is present, they return a proxy from which the information can be decoded.
This requires some boilerplate code that clouds the indent of the code.
2. Their names do not reflect the performed task.
The names of the C functions are arguably hard to read and obscure.
They only make sense after reading their documentation.
Furthermore, `find` also has a name unrelated to the current task.
Thankfully, {cpp}23 introduces `contains` on `std::string` and `std::string_view` to simplify this task.
This rule raises an issue when it detects code that could be improved with the use of `contains`.
== How to fix it
=== Replacing `find(needle)` with `contains(needle)`
For `std::string` and `std::string_view`, `contains` is a drop-in replacement when the returned index from `find` is compared against `npos` or the magical value `-1`.
==== Noncompliant code example
[source,cpp,diff-id=1,diff-type=noncompliant]
----
void example(std::string const& haystack, std::string_view needle) {
if (haystack.find(needle) == std::string::npos) { // Noncompliant
// The needle was not found.
}
}
----
==== Compliant solution
[source,cpp,diff-id=1,diff-type=compliant]
----
void example(std::string const& haystack, std::string_view needle) {
if (!haystack.contains(needle)) {
// The needle was not found.
}
}
----
In this example, `haystack` could also be a `std::string_view`, and `needle` could be a single character, a C-string, or anything convertible to `std::string_view` such as a `std::string`.
=== Replacing `std::string_view::find(needle, index)` with `substr(index)` and `contains(needle)`
When calling `find` on a `std::string_view`, you can provide a second parameter that allows skipping some characters.
This allows you, for example, to look for some content in a string while ignoring some prefixes.
However, this is essentially putting two distinct actions into one and clouds the intent.
==== Noncompliant code example
[source,cpp,diff-id=2,diff-type=noncompliant]
----
void example(std::string_view haystack,
std::size_t skipPrefix,
std::string_view needle) {
if (haystack.find(needle, skipPrefix) == std::string::npos) { // Noncompliant
// The needle was not found after the prefix.
}
}
----
==== Compliant code example
[source,cpp,diff-id=2,diff-type=compliant]
----
void example(std::string_view haystack,
std::size_t skipPrefix,
std::string_view needle) {
if (haystack.substr(skipPrefix).contains(needle)) {
// The needle was not found after the prefix.
}
}
----
Because `std::string_view` is a lightweight class, the compliant solution remains efficient.
Furthermore, `substr` can also take a second parameter, allowing the operation to check for a substring while also ignoring a suffix.
=== Replacing `std::string::find(needle, index)`
// For std::string, we are waiting for https://wg21.link/p3044 sub-string_view from string
// before raising an issue and providing a proper fix.
Since `std::string::substr` creates a new string, which is an expensive operation, the previous solution should not be applied directly to ``std::string``s.
Here, the best option is probably to create a `string_view` from the `string` and _then_ use the previous solution.
This rule will not raise an issue since this is less direct, and the benefits of using `contains` become less obvious.
=== Replacing `strstr` and `strchr` with `contains`
Because the C-inherited functions `strstr` and `strchr` take null-terminated C-string as parameters, you can write equivalent yet more readable code by first converting your ``++char const*++`` into a `std::string_view`.
Thankfully, `std::string_view` is a lightweight object that can be created inline without impacting performance.
The following example illustrates how to replace `strstr` but the same process can be applied to `strchr`.
// We do not show examples with strchr to avoid entering the confusing realm of characters being represented as `int` but interpreted as `char`.
// Nor do we explain that strchr(haystack, '\0') cannot be expressed as string_view{haystack}.contains('\0') because, by construction, the former is always true and the latter is always false.
// We do not expect people to write such code anyway.
==== Noncompliant code example
[source,cpp,diff-id=3,diff-type=noncompliant]
----
void example(const char* haystack, const char* needle) {
if (std::strstr(haystack, needle) != nullptr) { // Noncompliant
// The needle was found.
}
}
----
==== Compliant solution
[source,cpp,diff-id=3,diff-type=compliant]
----
void example(const char* haystack, const char* needle) {
if (std::string_view{haystack}.contains(needle)) {
// The needle was found.
}
}
----
While the previous solution works and only requires modifying a single line of code, replacing the C-strings with `std::string_view` objects throughout your program would be better.
== Resources
=== Documentation
* {cpp} reference - https://en.cppreference.com/w/cpp/string/basic_string_view/contains[`std::string_view::contains`]
* {cpp} reference - https://en.cppreference.com/w/cpp/string/basic_string/contains[`std::string::contains`]
* {cpp} reference - https://en.cppreference.com/w/cpp/string/basic_string/find[`std::string::find`]
* {cpp} reference - https://en.cppreference.com/w/cpp/string/byte/strstr[`std::strstr`]
* {cpp} reference - https://en.cppreference.com/w/cpp/string/byte/strchr[`std::strchr`]
=== Related rules
* S6171 - "contains" should be used to check if a key exists in a container

View File

@ -0,0 +1,2 @@
{
}