Marco Borgeaud 8209548e54
Diff blocks: fix incorrect use for python (#2795)
Improvement identified in #2790.

Add a prefix to the diff-id when it is used multiple times in different
"how to fix it in XYZ" sections to avoid ambiguity and pedantically
follow the spec:

> A single and unique diff-id should be used only once for each type of
code example as shown in the description of a rule.

Obvious typos around `diff-type` were fixed.

An obvious extra use of diff blocks was removed.
2023-08-21 15:22:49 +02:00

76 lines
2.6 KiB
Plaintext

== How to fix it in the Python Standard Library
=== Code examples
The following noncompliant code is vulnerable to XPath injection because untrusted data is
concatenated to an XPath query without prior validation. Although `xml.etree.ElementTree` only
supports a subset of XPath syntax, exploitation can still be possible.
==== Noncompliant code example
[source,python,diff-id=11,diff-type=noncompliant]
----
from xml.etree import ElementTree
from flask import request
@app.route('/authenticate')
def authenticate():
username = request.args['username']
password = request.args['password']
expression = "./users/user[@name='" + username + "'][@pass='" + password + "']"
tree = ElementTree.parse('resources/users.xml')
if tree.find(expression) is None:
return "Invalid credentials", 401
else:
return "Success", 200
----
////
The above example would be exploitable by using e.g. `username=”admin][.!=”` and `password=”][.!='XXXXX”`
This creates the following query: `./users/user[@name='admin][.!='][@pass='][.!='XXXXX']`,
where `[.!='][@pass=']` and `[.!='XXXXX']` should match on everything, resulting in authentication as admin.
////
==== Compliant solution
[source,python,diff-id=11,diff-type=compliant]
----
import re
from xml.etree import ElementTree
from flask import request
tree = ElementTree.parse('resources/users.xml')
@app.route('/authenticate')
def authenticate():
username = request.args['username']
password = request.args['password']
if re.match("^[a-zA-Z0-9]*$", username) is None or re.match("^[a-zA-Z0-9]*$", password) is None:
return "Username or password contains invalid characters", 400
expression = "./users/user[@name='" + username + "'][@pass='" + password + "']"
tree = ElementTree.parse('resources/users.xml')
if tree.find(expression) is None:
return "Invalid credentials", 401
else:
return "Success", 200
----
=== How does this work?
As a rule of thumb, the best approach to protect against injections is to
systematically ensure that untrusted data cannot break out of the initially
intended logic.
include::../../common/fix/parameterized-queries.adoc[]
It is not possible to construct parameterized queries using only the Python Standard Library. +
Please use https://pypi.org/project/lxml/[lxml] instead, which allows for parameterized queries using the `ElementTree.xpath()` method.
include::../../common/fix/validation.adoc[]
In the example, we ensure that the username and password only contain alphanumeric characters by doing a regex match before executing the XPath query.