Create rule S6985 : Usage of "torch.load" can lead to untrusted code execution (#3976)

* Create rule S6985

* add implementation details

* Address review

* Update rule to include details about the weights_only parameter

* Remove unnecessary example

---------

Co-authored-by: ghislainpiot <ghislainpiot@users.noreply.github.com>
Co-authored-by: Ghislain Piot <ghislain.piot@sonarsource.com>
Co-authored-by: Sebastian Zumbrunn <sebastian.zumbrunn@sonarsource.com>
github-actions[bot] 2024-09-17 14:59:12 +02:00 committed by GitHub
parent 58c6c084e6
commit 7f75840e19
3 changed files with 91 additions and 0 deletions


@@ -0,0 +1,2 @@
{
}


@@ -0,0 +1,25 @@
{
"title": "Usage of \"torch.load\" can lead to untrusted code execution",
"type": "SECURITY_HOTSPOT",
"status": "ready",
"remediation": {
"func": "Constant\/Issue",
"constantCost": "15min"
},
"tags": [
"pytorch",
"machine-learning"
],
"defaultSeverity": "Major",
"ruleSpecification": "RSPEC-6985",
"sqKey": "S6985",
"scope": "All",
"defaultQualityProfiles": ["Sonar way"],
"quickfix": "infeasible",
"code": {
"impacts": {
"SECURITY": "HIGH"
},
"attribute": "CONVENTIONAL"
}
}


@@ -0,0 +1,64 @@
This rule raises an issue when `torch.load` is used to load a model.
== Why is this an issue?
In PyTorch, it is common to load serialized models using the `torch.load` function.
Under the hood, `torch.load` uses the `pickle` library to load the model and the weights.
If the model file comes from an untrusted source, an attacker can embed a malicious payload in it, and that payload is executed during deserialization.
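
The mechanism can be illustrated with the standard `pickle` module alone. This is a minimal sketch, not PyTorch code: the `Payload` class is hypothetical, and a harmless `eval("6 * 7")` stands in for whatever command an attacker would actually embed.

```python
import pickle


class Payload:
    """Hypothetical attacker-controlled object (illustration only)."""

    def __reduce__(self):
        # pickle stores this (callable, args) pair and calls it at load time.
        # A real attack would reference something like os.system here.
        return (eval, ("6 * 7",))


blob = pickle.dumps(Payload())

# Deserializing runs the embedded call; no Payload instance is ever created.
result = pickle.loads(blob)
print(result)  # prints 42
```

The call happens inside `pickle.loads` itself, before the caller has any chance to inspect what was loaded.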
== How to fix it
Use a safer alternative to load the model, such as `safetensors.torch.load_model`. Alternatively, PyTorch can be instructed to load only
the weights by setting the parameter `weights_only=True`. This restricts the unpickler to tensors, primitive types, and basic containers such as
dictionaries, so an arbitrary payload is rejected instead of executed. Note that using `weights_only=True` requires saving only the `state_dict` of a model instead of the whole model.
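
A minimal sketch of the `weights_only=True` route, assuming a PyTorch version that supports the parameter. The `torch.nn.Linear` layer stands in for a real model and an in-memory buffer stands in for the model file; both are assumptions made for illustration.

```python
import io

import torch

# Stand-in for a real model (assumption for this sketch).
model = torch.nn.Linear(4, 2)

# Save only the state_dict, not the whole model object.
buffer = io.BytesIO()
torch.save(model.state_dict(), buffer)
buffer.seek(0)

# Restricted loading: only tensors and basic containers are accepted.
state_dict = torch.load(buffer, weights_only=True)

# Rebuild the architecture in code, then restore the weights into it.
restored = torch.nn.Linear(4, 2)
restored.load_state_dict(state_dict)
```

With this approach the model class has to be available in the loading code, because the file carries the weights only.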
=== Code examples
==== Noncompliant code example
[source,python,diff-id=1,diff-type=noncompliant]
----
import torch
model = torch.load('model.pth') # Noncompliant: torch.load is used to load the model
----
==== Compliant solution
[source,python,diff-id=1,diff-type=compliant]
----
import torch
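import safetensors.torch  # the torch helpers live in this submodule
model = MyModel()  # MyModel: the application's own torch.nn.Module subclass
# The file must be in the safetensors format (for example written by safetensors.torch.save_model)
safetensors.torch.load_model(model, 'model.pth')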
----
== Resources
=== Documentation
* PyTorch documentation: https://pytorch.org/tutorials/beginner/saving_loading_models.html#save-load-entire-model[Save/Load Entire Model]
ifdef::env-github,rspecator-view[]
(visible only on this page)
== Implementation specification
All usages of torch.load
=== Message
Primary : Replace this call with a safe alternative
=== Issue location
Primary : name of the function call
=== Quickfix
No
endif::env-github,rspecator-view[]