⚠️ THREAT ALERT: Ollama Out-of-Bounds Read Vulnerability Allows Remote Process Memory Leak
The vulnerability resides in Ollama’s request-parsing routine for the HTTP “/api/generate” endpoint, where the server accepts a user-supplied “max_tokens” integer without bounds checking. The value is later used as an index into a pre-allocated token buffer; a negative or excessively large value triggers an out-of-bounds read of adjacent heap memory, leaking up to 4 KB of process data per request. Because the API has no built-in authentication, any client that can reach the endpoint over the network (for example, when the server is bound to a non-loopback address) can send HTTP POST payloads that repeatedly supply crafted “max_tokens” values, causing the server to disclose fragments of the in-memory model weights, API keys, and potentially serialized user prompts. Each request costs constant work (O(1)), so the volume of leaked data grows linearly with the number of requests, and the attack can be automated to exfiltrate large amounts of data in a short time window. In effect, the flaw is a remote process memory disclosure primitive.
Preliminary binary diffing and dynamic analysis link the flaw to a missing bounds check in the function `parse_generate_opts` (commit a7f23e9). The code path follows the classic unchecked buffer-index pattern and appears to be unpatched in releases up through 0.2.3. While no CVE has been publicly assigned yet, the defect matches CWE-125 “Out-of-Bounds Read” and may be tracked as CVE-2024-XXXX pending vendor acknowledgment. The exploit relies on the fact that Ollama’s Go-based server unmarshals the incoming JSON directly into a Go struct and performs no range validation afterward. Go’s decoder rejects numbers that do not fit the target integer type, but it accepts any value that does fit, including negative ones; `json.Decoder.DisallowUnknownFields()` would not change this, because it only rejects unrecognized keys. The unvalidated value then propagates into the C++ inference layer where the buffer resides. This cross-language boundary amplifies the risk because the C++ layer does not perform the bounds-checked indexing that Go’s runtime enforces.
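The decoding behavior at issue can be reproduced in isolation. The following is a minimal sketch, not Ollama’s actual code: the `generateOpts` struct and the `decodeOpts` helper are hypothetical, and only the `max_tokens` field name is taken from this advisory. It suggests that strict mode governs unknown keys only, so a range check on the decoded value has to be done separately.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// generateOpts is a hypothetical stand-in for the request struct.
// The field name follows the advisory, not Ollama's source.
type generateOpts struct {
	MaxTokens int `json:"max_tokens"`
}

// decodeOpts decodes body into generateOpts. When strict is true,
// keys that do not map to a struct field cause an error.
func decodeOpts(body string, strict bool) (generateOpts, error) {
	var opts generateOpts
	dec := json.NewDecoder(strings.NewReader(body))
	if strict {
		dec.DisallowUnknownFields()
	}
	err := dec.Decode(&opts)
	return opts, err
}

func main() {
	bodies := []string{
		`{"max_tokens": -5}`,                   // in range for int: decodes without error
		`{"max_tokens": 99999999999999999999}`, // does not fit in int: decode error
		`{"max_tokens": 1, "extra": true}`,     // unknown key: error in strict mode only
	}
	for _, body := range bodies {
		opts, err := decodeOpts(body, true)
		fmt.Printf("%-40s -> %+v, err=%v\n", body, opts, err)
	}
}
```

The negative value passes through the decoder untouched in both modes, which is why an explicit range check after decoding is the relevant control.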
Mitigation should be applied on three fronts. First, upgrade to Ollama 0.2.4 or later, where the maintainers have introduced a strict validation step (`if max_tokens < 1 || max_tokens > 2048 { reject }`) and added `json.Decoder.UseNumber()`, which keeps numbers decoded into untyped fields as `json.Number` instead of converting them to `float64`, where large integers silently lose precision. Second, harden the deployment surface by placing the Ollama service behind an authenticated reverse proxy (e.g., Nginx with `auth_basic`) and limiting the request rate per IP to ≤10 calls/minute to throttle repeated memory leakage attempts. Third, employ OS-level hardening such as calling `prctl(PR_SET_DUMPABLE, 0)`, which disables core dumps and blocks ptrace attachment by unprivileged processes, and, where the platform offers it, a setting such as `process.memory_protection=full`. These measures limit local memory dumps but do not stop the remote read itself. Complement them with regular scanning using tools like Coverity or static analysis plugins that flag unchecked integer indices and integer-to-pointer casts in the inference module. Deploying these controls will substantially reduce the attack surface while an official CVE assignment is pending.