PyTorch’s Distributed RPC Framework Vulnerable to Remote Code Execution

A critical vulnerability has been found in the PyTorch machine learning library, allowing remote code execution. Designated CVE-2024-5480, the issue affects PyTorch’s Distributed RPC framework due to lack of function validation during RPC operations.

The Distributed RPC framework is used in distributed training scenarios; the flaw enables arbitrary command execution during multi-CPU RPC communication by abusing built-in Python functions.

NIST stated: “The vulnerability stems from insufficient restrictions on function calls when worker nodes serialize and send PythonUDFs (User Defined Functions) to the master node, which subsequently deserializes and executes the function without validation.”

Huntr, an AI and ML bug bounty platform, explained that worker nodes can serialize functions and tensors into PythonUDFs using specific functions during multi-CPU RPC communication, then send them to the master node.

Huntr elaborated: “The Master deserializes the received PythonUDF data and invokes _run_function. This allows the worker to execute the specified function, but due to the lack of restrictions on function calls, it’s possible to trigger remote code execution by calling built-in Python functions like eval.”

Remote attackers can exploit this vulnerability to compromise the master node orchestrating distributed training, potentially leading to the theft of sensitive AI-related data.

CVE-2024-5480 carries the maximum CVSS score of 10.0. It was reported on April 12th and impacts PyTorch 2.2.2 and earlier versions; the latest version of the machine learning library is currently 2.3.1.
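Given the affected range stated above (2.2.2 and earlier), operators can triage an installation with a simple version comparison. This is a minimal sketch; `is_affected` is a hypothetical helper, and it assumes plain `major.minor.patch` version strings:

```python
def is_affected(ver: str) -> bool:
    """Return True if a PyTorch version falls in the CVE-2024-5480 range (<= 2.2.2)."""
    # Compare the first three numeric components as a tuple.
    parts = tuple(int(p) for p in ver.split(".")[:3])
    return parts <= (2, 2, 2)

print(is_affected("2.2.2"))  # True: in the affected range
print(is_affected("2.3.1"))  # False: current release, outside the range
```

In practice one would feed this `torch.__version__` from the running environment and upgrade any installation for which it returns `True`.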

The researchers who discovered this vulnerability received a $1,500 bug bounty reward.