Imagine a tiny tweak in how a system organizes data causing a massive global outage. That’s exactly what happened to Cloudflare’s popular 1.1.1.1 DNS service, and it all boils down to a decades-old ambiguity in how DNS records are ordered. But here’s where it gets controversial: Was this a simple oversight, or a deeper issue with how we interpret internet standards? Let’s dive in.
In a recent blog post titled What came first: the CNAME or the A record? (https://blog.cloudflare.com/cname-a-record-order-dns-standards/), Cloudflare explains how ambiguous wording in RFC 1034, the 1987 specification that defines core DNS concepts, led to a significant disruption. The issue? A routine update on January 8 altered the position of CNAME records in DNS responses, causing some clients to fail when resolving names. Most modern implementations ignore the order of records in the answer section, but Cloudflare discovered that certain ones still rely on CNAME records appearing before the records they point to.
And this is the part most people miss: the reordering was not a random bug but a side effect of optimizing memory usage in Cloudflare’s cache implementation. Sebastiaan Neuteboom, a systems engineer at Cloudflare, explained that the change was introduced on December 2, 2025, tested on December 10, and deployed globally starting January 7, 2026. The goal was to reduce memory allocations by appending the cached CNAME records to the existing answer list instead of building a new list with the CNAMEs at the front. As a result, CNAME records sometimes appeared at the bottom of responses, after the records they alias to, which broke older clients that expect CNAMEs first.
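To make the ordering change concrete, here is a minimal Python sketch of the two merge strategies described above. It is an illustration only, not Cloudflare’s code: the `Record` type and both function names are hypothetical, and a real resolver tracks far more state than this.

```python
from typing import List, NamedTuple


class Record(NamedTuple):
    """Hypothetical, simplified DNS resource record."""
    name: str
    rtype: str
    ttl: int
    data: str


def merge_cnames_first(cached_chain: List[Record], fresh: List[Record]) -> List[Record]:
    """Allocate a new list: cached CNAME chain first, then the fresh records."""
    answer = list(cached_chain)  # extra allocation
    answer.extend(fresh)
    return answer


def merge_append_in_place(cached_chain: List[Record], fresh: List[Record]) -> List[Record]:
    """Reuse the fresh list and append the cached CNAMEs to it.

    This avoids allocating a new list, but the CNAMEs now come last.
    Note that it mutates `fresh`.
    """
    fresh.extend(cached_chain)
    return fresh


# Sample data: one cached alias and one freshly resolved address record.
CACHED = [Record("www.example.com", "CNAME", 300, "cdn.example.net")]
FRESH = [Record("cdn.example.net", "A", 60, "192.0.2.1")]
```

With one cached CNAME and one fresh A record, the first function yields the order CNAME, A, while the second yields A, CNAME. Both answers contain the same records; only their position differs.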
Here’s how it works: when a DNS resolver encounters a CNAME record, it follows the chain of aliases until it reaches the final address records. Each link in the chain is cached with its own expiration time (TTL). If part of the chain expires, the resolver re-fetches only the expired portion and combines it with the parts that are still valid. Once that combining step began placing CNAMEs after the address records, clients that expected CNAMEs first started to fail, triggering the outage.
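Why would record order matter to a client at all? The sketch below contrasts a single-pass parser that assumes CNAMEs come first with one that tolerates any order. Both functions are hypothetical simplifications written for this article: records are plain `(name, type, data)` tuples, name comparison is exact rather than case-insensitive, and neither function is taken from a real client library.

```python
from typing import List, Tuple

Answer = List[Tuple[str, str, str]]  # (owner name, record type, data)


def resolve_sequential(qname: str, answer: Answer) -> List[str]:
    """Single pass that only accepts records for the name it currently expects.

    A CNAME updates the expected name, so any address record that appears
    before its CNAME is skipped and never revisited.
    """
    expected = qname
    addresses = []
    for name, rtype, data in answer:
        if name != expected:
            continue
        if rtype == "CNAME":
            expected = data
        elif rtype == "A":
            addresses.append(data)
    return addresses


def resolve_any_order(qname: str, answer: Answer, max_hops: int = 16) -> List[str]:
    """Collect all aliases first, follow the chain, then pick matching A records."""
    aliases = {name: data for name, rtype, data in answer if rtype == "CNAME"}
    final = qname
    for _ in range(max_hops):  # bounded so an alias loop cannot spin forever
        if final not in aliases:
            break
        final = aliases[final]
    return [data for name, rtype, data in answer if rtype == "A" and name == final]


# The same two records, in both orders.
CNAME_FIRST: Answer = [
    ("www.example.com", "CNAME", "cdn.example.net"),
    ("cdn.example.net", "A", "192.0.2.1"),
]
CNAME_LAST: Answer = list(reversed(CNAME_FIRST))
```

Given `CNAME_LAST`, the sequential parser returns an empty list because it skips the A record before it has seen the alias, while the order-independent parser returns the address for both inputs. The tolerant version costs an extra pass over the answer, which is a small price for not depending on unspecified ordering.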
Boldly put, this isn’t just Cloudflare’s problem; it’s a wake-up call for the entire industry. On Reddit and Hacker News, debates flared up. Some commenters argued that the RFC is inherently unclear, while others suggested Cloudflare had misinterpreted it. One user pointed to a lack of robust testing, while another invoked Hyrum’s Law: “With enough users, any observable behavior becomes a dependency.” Add Postel’s Law (“Be conservative in what you send, liberal in what you accept.”), which arguably neither side fully honored, since the resolver changed what it sent and the affected clients were strict about what they accepted, and the incident highlights the delicate balance between optimization and compatibility.
Cloudflare has since proposed an Internet-Draft (https://datatracker.ietf.org/doc/draft-jabley-dnsop-ordered-answer-section/) to clarify how CNAME records should be handled in DNS responses, aiming to prevent similar issues in the future. The outage was resolved within hours, but the conversation it sparked is far from over.
Here’s the thought-provoking question for you: Is the ambiguity in RFC specifications a relic of the past, or a sign that we need to rethink how we standardize and test critical internet infrastructure? Share your thoughts in the comments—let’s keep the discussion going.