HAProxy 3.3 Chunked Encoding Bug: How We Found It Running S3 Storage

Mid-December. ZERO-Z3, our S3-compatible object storage service, was under application-level DDoS attack. L3/L4 mitigation upstream didn't help—the traffic was valid HTTP requests, just too many of them. We needed to rate-limit at the application layer.

HAProxy stick tables gave us what we needed: per-bucket and per-IP rate limiting that could throttle attackers while letting legitimate customers through. The rate limiting worked, but under sustained load we started seeing connection queue buildup and timeout issues.

HAProxy 3.3.1 had just been released on December 19th. The changelog looked like it addressed exactly our situation. We upgraded. It didn't go as planned.

The Architecture

ZERO-Z3 runs on Ceph with RadosGW providing the S3 API frontend. HAProxy sits in front for load balancing, SSL termination, health checking, and rate limiting via stick tables.

RadosGW uses HTTP chunked transfer encoding for many S3 API responses. According to Ceph PR #23940, operations like ListBucket, ListBuckets, and ListBucketMultiparts use chunked encoding to stream responses while generating them—rather than buffering the entire response to calculate Content-Length upfront.

Per RFC 9112 §7.1, chunked transfer encoding terminates with a zero-sized chunk (0\r\n) followed by an empty line (\r\n). This termination sequence is what HAProxy 3.3.1 mishandles.
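
The framing can be sketched in a few lines of Python. This is a toy encoder/decoder to illustrate the wire format, not HAProxy's implementation:

```python
# Minimal sketch of HTTP/1.1 chunked transfer coding (RFC 9112 section 7.1).
# Each chunk is "<size-in-hex>\r\n<data>\r\n"; the message terminates with a
# zero-sized chunk "0\r\n" followed by an empty line "\r\n".

def encode_chunked(parts: list[bytes]) -> bytes:
    out = b""
    for part in parts:
        out += f"{len(part):x}".encode() + b"\r\n" + part + b"\r\n"
    out += b"0\r\n\r\n"  # terminating zero-sized chunk + empty line
    return out

def decode_chunked(stream: bytes) -> bytes:
    body = b""
    pos = 0
    while True:
        eol = stream.index(b"\r\n", pos)
        size = int(stream[pos:eol], 16)
        if size == 0:
            return body  # zero-sized chunk ends the message
        body += stream[eol + 2 : eol + 2 + size]
        pos = eol + 2 + size + 2  # skip chunk data and its trailing CRLF
```

The terminator is metadata, not payload: a correct proxy must consume it when re-framing, which is exactly what went wrong here.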

The DDoS Situation

The attack started in early December. Volumetric at first—easy to filter upstream. Then it shifted to application-layer. Valid-looking S3 requests: bucket listings, HEAD requests, authentication probes. Coming from thousands of IPs.

L3/L4 DDoS mitigation can't help with application-layer attacks. The packets are valid. The TCP connections complete normally. The HTTP requests are well-formed.

HAProxy stick tables track request rates per source IP, per target bucket, or combinations. We set limits high enough that legitimate customers wouldn't hit them, but low enough to throttle distributed attack traffic.
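
A minimal sketch of what such a configuration looks like. The table size, window, and threshold here are illustrative assumptions, not our production values:

```haproxy
frontend s3_front
    bind :443 ssl crt /etc/haproxy/certs/
    # Track per-source-IP request rate over a 10-second window
    stick-table type ip size 1m expire 10m store http_req_rate(10s)
    http-request track-sc0 src
    # Throttle clients that exceed the per-IP limit (threshold is a placeholder)
    http-request deny deny_status 429 if { sc_http_req_rate(0) gt 200 }
    default_backend radosgw
```

Per-bucket limiting works the same way, keying the table on a sample of the request path instead of the source address.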

The rate limiting worked. But stick tables under heavy load meant connection queue buildup during peak attack periods.

Why HAProxy 3.3

HAProxy 3.3.0 released on November 26th, 2025, with 3.3.1 following on December 19th. The release announcement highlighted performance improvements directly relevant to our situation:

  • Relaxed stick-table locking: Updates batched and work delayed, reducing contention between tables and peers
  • Single-threaded stick-table expiration: Tasks that caused contention on many-CPU servers now run single-threaded
  • 64-byte aligned memory allocation: Objects aligned to cache line boundaries, reducing thrashing between threads
  • Default option abortonclose: Backends now stop processing requests from clients that already disconnected

With stick tables under heavy use for DDoS mitigation, these improvements looked directly applicable. We upgraded to 3.3.1 to get both the new features and initial bug fixes.

We tested in our lab with multiple S3 clients—Cyberduck, MinIO mc, rclone, aws-cli. Everything passed. We proceeded to production.

The Rollout

Rolling upgrade. One HAProxy node at a time. Watch the metrics: request success rate, latency percentiles, backend health, connection queue depth.

[Figure: Grafana dashboard showing HTTP response codes, all 2xx successes and zero 5xx errors, even while chunked encoding corruption was occurring]
HTTP response monitoring showed nothing wrong. All 2xx, zero 5xx. The corruption was invisible to standard metrics.

Every metric stayed stable. Queue depth improved—the stick-table optimizations were helping. By December 22nd, the entire HAProxy fleet was on 3.3.1.

The Symptoms

December 25th. Christmas morning. First support ticket: "Cyberduck is showing XML parse errors when I try to list my bucket."

We checked. Bucket existed. Permissions correct. We listed it ourselves—worked. Asked the customer to try again. It worked. Intermittent.

Within hours, more tickets came in:

  • MinIO client throwing errors on bucket operations
  • Rclone sync jobs failing
  • Veeam backup reporting S3 target issues

Different customers, different S3 client implementations. Same vague symptoms: intermittent failures, XML parsing issues, connection problems.

Our monitoring showed nothing wrong. Success rate still high. No HTTP 5xx errors. The bug was invisible to standard HTTP monitoring.

Root Cause Analysis

Packet captures revealed the issue. Comparing responses direct to RadosGW versus through HAProxy 3.3.1:

Direct to RadosGW (correct):

    HTTP/1.1 200 OK
    Transfer-Encoding: chunked
    Content-Type: application/xml

    1a4
    <?xml version="1.0"...><ListBucketResult>...</ListBucketResult>
    0

Through HAProxy 3.3.1 (intermittently corrupted):

    HTTP/1.1 200 OK
    Transfer-Encoding: chunked
    Content-Type: application/xml

    1a4
    <?xml version="1.0"...><ListBucketResult>...</ListBucketResult>
    0
    0

The chunked encoding terminator was being forwarded as part of the response body. Responses ended up with a trailing 0 character. XML parsers returned "Content is not allowed in trailing section."
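
The client-side failure is easy to reproduce: append a stray terminator to an otherwise valid body and any strict XML parser rejects it. Here is a small demonstration with Python's stdlib parser (the exact error message varies by parser; Cyberduck's Java stack words it differently):

```python
# A stray chunked terminator appended to the XML body makes strict parsers
# fail, while the HTTP status code is still 200.
import xml.etree.ElementTree as ET

clean = b'<?xml version="1.0"?><ListBucketResult></ListBucketResult>'
corrupted = clean + b"\r\n0\r\n\r\n"  # terminator leaked into the body

ET.fromstring(clean)  # parses fine
try:
    ET.fromstring(corrupted)
except ET.ParseError as e:
    print("parse error:", e)  # trailing bytes after the document element
```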

Per our bug report (GitHub issue #3230), the corruption affected 25–50% of requests, non-deterministically. Direct to RadosGW: always correct. Through HAProxy 3.2.10: always correct. Through HAProxy 3.3.1: intermittently corrupted.

Affected Versions

Issue observed: HAProxy 3.3.1
Working: HAProxy 3.2.10, 2.8.14
Root cause: Zero-copy forwarding regression—kop value not decreased when chunk sizes are emitted, causing HAProxy to announce chunk sizes larger than actual data sent

Fix Confirmed

The fix is in commit 529a8dbf. A new helper function h1s_consume_kop() now correctly updates the kop value during zero-copy forwarding. We've tested the fix and confirmed it resolves the issue. The patch is being backported to 3.3.

Workaround: Add tune.disable-zero-copy-forwarding to your global section.
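
Applied in the configuration, that looks like:

```haproxy
global
    # Workaround for the 3.3.1 chunked-encoding regression: disable the
    # zero-copy forwarding fast path (at some throughput cost)
    tune.disable-zero-copy-forwarding
```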

Resolution

Same day. Within hours of connecting the dots:

  1. Identified the bug through packet capture comparison
  2. Confirmed it wasn't our configuration by testing with minimal HAProxy config
  3. Rolled back to HAProxy 3.2.10 across the fleet
  4. Scaled horizontally with additional servers to handle the DDoS load
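
The minimal configuration in step 2 was roughly the following sketch; addresses, ports, and timeouts here are placeholders, not our production values. Even this bare pass-through reproduced the corruption on 3.3.1, ruling out our own tuning:

```haproxy
defaults
    mode http
    timeout connect 5s
    timeout client 30s
    timeout server 30s

frontend f
    bind :8080
    default_backend b

backend b
    server rgw 127.0.0.1:7480
```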

Total time from first ticket to full rollback: under 24 hours.

Bug Status: Fixed

Christopher Faulet (capflam) from the HAProxy team fixed this in commit 529a8dbf. The fix is being backported to 3.3. See GitHub #3230 for the full discussion.

What We Learned

HTTP Monitoring Has Blind Spots

The bug didn't manifest as HTTP errors. Status codes were 200. Connections completed. Corruption was inside the response body. Standard HTTP monitoring didn't catch it.

Lab Testing Has Limits

We tested with multiple S3 clients. The bug didn't appear. It manifests under conditions our lab didn't replicate.

LTS Versions Exist for a Reason

HAProxy 3.2 is LTS. We jumped to 3.3 for performance improvements and got a regression. LTS versions have had time to shake out these bugs.

Keep Rollback Ready

We had 3.2.10 packages ready and tested the downgrade path. Quick rollback kept resolution under 24 hours—on Christmas Day.

About ZERO-Z3

ZERO-Z3 is our S3-compatible object storage service. European data residency (DE, FI, NL), no per-request API fees, predictable egress pricing. Runs on our own infrastructure on AS215197.

Incidents like this are why we run our own infrastructure. When something breaks, we can see the full stack—from load balancer to storage backend.

Learn more about ZERO-Z3 →

Acknowledgments

Thanks to the HAProxy team for the quick turnaround on this fix. From bug report to confirmed fix in under two weeks—that's how open source should work.

If you're running HAProxy 3.3.x with backends that use chunked transfer encoding, either apply the workaround (tune.disable-zero-copy-forwarding) or wait for the backported fix.

Looking for S3-Compatible Storage?

ZERO-Z3 offers European data residency and predictable pricing. No per-request fees, no egress surprises.