CPP-4991 S5274 Rework rule to cover only optimization inhibiting moves and not all redundant move

2024-02-27 11:52:16 +01:00 · 2024-02-27 11:52:16 +01:00 · d747b76e63
commit d747b76e63
parent 0329489057
2 changed files with 189 additions and 22 deletions
--- a/rules/S5274/cfamily/metadata.json
+++ b/rules/S5274/cfamily/metadata.json
@ -1,11 +1,11 @@
 {
-  "title": "\"std::move\" should only be added when necessary",
+  "title": "\"std::move\" should not inhibit optimizations",
  "type": "CODE_SMELL",
  "code": {
    "impacts": {
      "MAINTAINABILITY": "MEDIUM"
    },
-    "attribute": "CLEAR"
+    "attribute": "EFFICIENT"
  },
  "status": "ready",
  "remediation": {
@ -15,9 +15,7 @@
  "tags": [
    "cppcoreguidelines",
    "performance",
-    "bad-practice",
-    "since-c++11",
-    "clumsy"
+    "since-c++11"
  ],
  "extra": {
    "replacementRules": [
--- a/rules/S5274/cfamily/rule.adoc
+++ b/rules/S5274/cfamily/rule.adoc
@ -1,29 +1,174 @@
+This rule reports an issue when the use of ``++std::move++`` prevents the copy elision optimizations from happening.
+
 == Why is this an issue?

-Usually, when copying an object, the source object is unchanged, meaning all resources owned by the source objects must be duplicated during the copy operation. If the source object is no longer used, this duplication is inefficient. Since {cpp}11, a move semantic mechanism has been added to detect such cases and replace the expensive copy with a much cheaper move operation that will transfer resources.
+Usually, when copying an object, the source object is unchanged, 
+meaning all resources owned by the source objects must be duplicated during the copy operation. 
+If the source object is no longer used, this duplication is inefficient. 
+Since {cpp}11, a move semantic mechanism has been added to detect such cases and replace the expensive copy with a much cheaper move operation that will transfer resources.
+
+The cornerstone of move semantics is detecting during a "copy" whether the source object will be reused or not. 
+This can be done explicitly by the user, by invoking `std::move` (or different casts to rvalue) on the object.
+In such case the user promises to the compiler that they won't care for the object's current value any longer.
+In addition, the compiler will implicitly use a move operation or skip copying the object in some situations.
+
+One case of optimization is that the copy will be elided or automatically turned into a move operation
+when a temporary object of type `T`:
+
+* is used to initialize a parameter or variable of type `T` or `const T`
+* is returned from the function that declares `T` or `const T` as return type
+
+[source,cpp]
+----
+class A {/* ... */};
+A create();
+
+void asParam(A a);
+
+A returnedFromFunc() {
+  // For all examples below, the object will not be copied.
+  // Either no copy or move will be performed (as guaranteed optimization since C++17)
+  // or a move operation will be used.
+  A a = create();
+  asParam(createA()); 
+  return A();
+}
+----
+
+Calling `std::move` on such an object is not only unnecessary but will also prevent the compiler from performing copy elision,
+and the rule raises issues in that case.
+[source,cpp]
+----
+class A {/* ... */};
+A create();
+
+void asParam(A a);
+
+A returnedFromFunc() {
+  // Move operations need to be performed, and cannot be elided.
+  A a = std::move(create());     // Noncompliant
+  asParam(std::move(createA())); // Noncompliant
+  return std::move(A());         // Noncompliant
+}
+----
+
+Another case of optimization is that under certain conditions, the local variable or function parameter is implicitly moved
+if it is directly returned (`return x`) from the function.
+
+In particular, when a variable of type `T` is returned directly from the function that declares `T` or `const T`
+as a return type:
+[source,cpp]
+----
+class A {/* ... */};
+
+A returnedLocalVar() {
+  A a = create();
+  // Variable a is automatically moved here
+  return a;
+}
+----
+
+These conditions overlap with the conditions under which copy elision optimization,
+referred to as Named Return Value Optimization (NRVO) can be performed by the compiler. 
+When this optimization is applied the local variable is returned without any copy or move operation being performed.
+
+In this case, adding `std::move` to the return statement will inhibit this optimization,
+and the rule raises an issue.
+[source,cpp]
+----
+class A {/* ... */};
+
+A returnedLocalVar() {
+  A a = create();
+  // Variable a is moved, but NRVO cannot be performed
+  return std::move(a); // Noncompliant
+}
+----


-The cornerstone of move semantics is detecting during a "copy" whether the source object will be reused or not. There are three situations:
+=== Why is the issue raised if my class does not have a move constructor?

-* The object is a temporary object with no name, and if it can't be named, it can't be used
-* The object is used in some specific places, such as a return statement
-* The user explicitly promises to the compiler that they won't care for the object's current value any longer. They do so by using the specific cast operation named ``++std::move++``.
+A move itself is not performing any object operation, and casting a source to `rvalue`.
+This leads to the constructor and assignment operator that accepts rvalue reference as a parameter -
+also referred to as move constructor and move assignment - to be selected by the overload resolution.
+However, when the class does not provide such a constructor,
+a copy constructor/assignment will be invoked respectively.

-If the user writes ``++std::move++`` in one situation that is already handled by the first two cases, it has two drawbacks:
+Such invocation of copy constructor may still be eliminated by copy elision optimizations,
+and thus redundant `std::move` calls, that inhibit such optimization, have a performance impact in such situations.

-* It is clumsy, useless code, which makes understanding the code more complex
-* In some cases, it can decrease performances because this can deactivate another optimization of the compiler, named copy elision.
+[source,cpp]
+----
+class OnlyCopyable {
+  OnlyCopyable(OnlyCopyable const&);
+  /* No move constructor */
+};
+OnlyCopyable create();

-When copy elision occurs, the object is neither copied nor moved (even if the copy/move constructors have side effects). The two objects are collapsed into only one memory location. The moment when copy elision occurs is compiler-dependent but is mandatory in the following cases:
+void test() {
+  // Forces a move operation, that invokes copy constructor
+  OnlyCopyable c1 = std::move(create()); // Noncompliant
+  // Copy elision eliminates invocation of the copy constructor
+  OnlyCopyable c2 = create(); // Compliant
+}
+----

+=== Why is an issue raised when passing an argument to a reference parameter?

-* in a return statement, if the returned object is a prvalue of the same class type as the function return type
-* in the initialization of a variable, if the initializer expression is a prvalue of the same class type as the variable type
+The copy elision optimization happens only if a new value is produced from the source,
+not if the parameter is a reference to the same type:

-This rule reports an issue when the use of ``++std::move++`` prevents the copy elision from happening.
+[source,cpp]
+----
+void process(A&& sink);

+void passArgument() {
+  // No move operation is triggered, as the parameter is a reference to A
+  process(std::move(create())); // Noncompliant
+  process(create());            // Compliant
+}
+----

-=== Noncompliant code example
+Such redundant calls to `std::move` are not inhibiting optimization at this point.
+However, when the `process` function is modified to accept `A` by value,
+it will prevent the compiler from eliminating the move operation altogether.
+To fully benefit from the performance impact of this change, 
+the maintainers would need to review and update all call sites and process functions,
+reducing the maintainability of the code.
+
+Moreover, if the parameter is a reference to a type to which the argument is converted,
+then copy elision may still happen when calling the converting constructor.
+[source,cpp]
+----
+class B {
+  // Converting constructor takes object B by value
+  B(A a); 
+};
+void processB(B&& sink);
+
+void passArgument() {
+  processB(create());            // Compliant
+  processB(std::move(create())); // Noncompliant, inhibits copy elision when initializing constructor parameter
+  // This call is equivalent to:
+  processB(B(std::move(create()))); // Noncompliant, inhibits copy elision when initializing constructor parameter 
+}
+----
+
+=== Why issues are not raised for all redundant moves?
+
+The requirements from performing an implicit move were relaxed in {cpp}20 and {cpp}23 standards,
+with some of them being applied retroactively.
+As a consequence depending on the standard and compiler versions, 
+a call to `std::move` may or may not be redundant in the return statement,
+and thus required for the code to be portable accross compilers.
+
+== How to fix it
+
+Remove the call to `std::move` flagged by the rule.
+
+=== Code examples
+
+==== Noncompliant code example

 [source,cpp,diff-id=1,diff-type=noncompliant]
 ----
@ -32,14 +177,12 @@ A getA();

 A f() {
  A a = std::move(getA()); // Noncompliant, prevents copy elision
-  std::vector<A> v;
-  v.push_back(std::move(getA())); // Noncompliant
  return std::move(a); // Noncompliant
 }
 ----


-=== Compliant solution
+==== Compliant solution

 [source,cpp,diff-id=1,diff-type=compliant]
 ----
@ -48,9 +191,35 @@ A getA();

 A f() {
  A a = getA(); // Compliant
+  return a; // Compliant
+}
+----
+
+
+==== Noncompliant code example
+
+[source,cpp,diff-id=2,diff-type=noncompliant]
+----
+class A {};
+A getA();
+
+void f() {
+  std::vector<A> v;
+  v.push_back(std::move(getA())); // Noncompliant
+}
+----
+
+
+==== Compliant solution
+
+[source,cpp,diff-id=2,diff-type=compliant]
+----
+class A {};
+A getA();
+
+void f() {
  std::vector<A> v;
  v.push_back(getA()); // Compliant
-  return a; // Compliant
 }
 ----