A funny thing happened on the way to launching a website the other day. 

The project was for a large, global fintech corporation, and this was a big update for them. They are a highly technical group, as is their clientele, so a smooth, error-free launch was crucial to a positive reception. 

This site relies on NextJS, a highly sophisticated meta-framework built on the popular, open-source UI library React. For most clients, we recommend hosting this type of project on a third-party service like Vercel or Netlify. These platforms are specifically optimized for this type of architecture and come packed with features. 

For this client, it was important to take advantage of their own infrastructure, so they opted to self-host. 

Self Hosting NextJS

Now, self-hosting NextJS is not without its challenges, despite plenty of documentation on how to set things up. In fact, the complexity of self-hosting your own NextJS project is well established. There are some 12 different ways to go about hosting NextJS on AWS alone (Amplify, Lightsail, EC2, HTML export to S3, Lambda… you get the point). There are even projects like OpenNext that exist specifically because of how complex self-hosting can be. And which features or rendering mode(s) your app depends on can heavily influence the constraints you have to work within, too.

So, not surprisingly, despite rigorous QA in advance of launch, we discovered a really strange defect when moving to production. 

Upon the initial production deployment, the site appeared to perform flawlessly. It loaded fast, content looked great, third-party integrations were all firing, etc. But under certain circumstances, when a user hit their back button, instead of returning to the previous page they would be greeted with this kind of nasty, not-quite-JSON-looking response…

Yikes. What on earth? 

Definitely not a good look. And especially not as the corporate face of a tech-first financial services company. 

Troubleshooting 

What could explain why the user might get two different responses for the same page? 

We all panicked a little bit at first. I initially assumed this behavior was an edge case. But once we recreated the issue a few times, we knew this was a serious threat to the success of the project. Was this a particularly gnarly application error? Maybe something was jacking with the history object? Could this be a goofy result from some third-party script?

We instinctively jumped into the code but quickly realized we couldn’t recreate the problem locally. And when we deployed the project to Netlify, it worked as expected? WTF.

This led to some swift scrutiny of the differences between the hosting environments, and pretty quickly the Ample team zeroed in on the client’s CDN as a potential problem. 

What’s a CDN again?

Content Delivery Networks (aka CDNs) are powerful things. They allow us to defray the cost of serving lots of files to lots of users by efficiently caching and delivering them around the globe. This ensures an optimal experience for users while providing enormous scale at a predictable cost. 

Analogy time!

Imagine a global pizza chain. Instead of making pizzas in one central kitchen and delivering them across the world, they set up local kitchens in different cities. When you order a pizza, it comes from the kitchen closest to you, not the main one far away.

Similarly, a CDN stores copies of your website's content on multiple servers worldwide. When a user visits your site, the content is delivered from the server closest to them.

Thus, the benefits of a CDN, in the context of this analogy, are…

  • Speed: Just like a pizza arrives faster from a nearby kitchen, website data loads faster when it's served from a nearby server.

  • Reliability: If one kitchen is overloaded or out of ingredients, others can step in, ensuring you still get your pizza (or web content).

  • Scalability: The pizza chain can serve many customers simultaneously, just as a CDN can handle spikes in traffic.

So how would a CDN be implicated in this bizarre production defect anyway? Well, it has to do with a core feature of NextJS called “RSC payloads”. 

So many acronyms!

I know. I’ll make this brief. What’s an RSC payload, you ask?

RSC stands for React Server Component. Since the introduction of the App Router in version 13, NextJS has defaulted to using RSCs to render pages, which can drastically improve the performance of your site by doing more of the rendering work on the server and shipping less JavaScript to the browser. 
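
To make that concrete, here is roughly what a page looks like as a Server Component in the App Router. This is only an illustrative sketch; the route, the getRates helper, and the API URL are made up, not code from the client’s project.

    // app/rates/page.tsx -- in the App Router, this component is a Server
    // Component by default. The data fetching and rendering below run on the
    // server; the browser receives the result instead of running this code.
    // (Illustrative only: getRates() and the API endpoint are hypothetical.)
    async function getRates(): Promise<{ currency: string; rate: number }[]> {
      const res = await fetch("https://api.example.com/rates", { cache: "no-store" });
      return res.json();
    }

    export default async function RatesPage() {
      const rates = await getRates();
      return (
        <ul>
          {rates.map((r) => (
            <li key={r.currency}>
              {r.currency}: {r.rate}
            </li>
          ))}
        </ul>
      );
    }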

When you load a NextJS page and navigate around, your browser makes all these little requests for the server-rendered components. For each request, the server returns a JSON-like response that NextJS calls an “RSC payload”. NextJS uses these payloads to render and update components as you click around the site. The result is a crazy fast, almost instantaneous page load. 

These RSC payloads are essentially a serialized representation of the server-rendered components. Normally, you don’t even know this stuff is going on in the background, but for some reason, the back button caused NextJS to barf this crap all over the screen. 

But why?

So if you look at the network traffic for a NextJS project, you can plainly see requests returning RSC payloads all over the place. They’re denoted by a query string param and custom request headers but, more crucially, we see a Vary header attached to lots of the responses. 
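
If you want to see this for yourself, a quick sketch like the one below will surface it. The host and path are placeholders, and the exact header names can differ between NextJS versions, but the shape is the same: the router sends an RSC request header (plus a _rsc query param), and the response comes back with a Vary header naming it.

    // Rough sketch: request a page the way the NextJS router does during a
    // client-side navigation, then inspect the response headers.
    // The URL is a placeholder; the header names reflect what the App Router
    // typically uses, but they can differ between NextJS versions.
    async function inspectRscResponse() {
      const res = await fetch("https://www.example.com/pricing?_rsc=abc123", {
        headers: { RSC: "1" },
      });

      console.log(res.headers.get("content-type")); // e.g. "text/x-component" for an RSC payload
      console.log(res.headers.get("vary"));         // e.g. "RSC, Next-Router-State-Tree, Next-Router-Prefetch"
    }

    inspectRscResponse();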

What is the Vary header?

The Vary header tells the CDN (and any other cache along the way) to store separate versions of a response, keyed on the values of specific request headers. NextJS uses it to distinguish a normal HTML navigation from an RSC fetch for the same URL. Without this header, the CDN might send back the wrong version of the page (perhaps an RSC payload?!), leading to broken or confusing experiences because the server’s different responses for the same path aren’t correctly distinguished.
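
To make the mechanics concrete, here is a deliberately simplified sketch of how a cache that honors Vary builds its keys. No real CDN is implemented this way internally; it just illustrates why the same URL can (and should) map to more than one cached entry.

    // Conceptual only: with Vary in play, the cache key includes the values of
    // the request headers named by Vary, so an HTML navigation and an RSC fetch
    // for the same URL land in separate cache entries.
    function cacheKey(url: string, reqHeaders: Headers, vary: string[]): string {
      const varied = vary.map((h) => `${h}=${reqHeaders.get(h) ?? ""}`).join("&");
      return `${url}::${varied}`;
    }

    cacheKey("/pricing", new Headers(), ["RSC"]);              // "/pricing::RSC="  (HTML page)
    cacheKey("/pricing", new Headers({ RSC: "1" }), ["RSC"]);  // "/pricing::RSC=1" (RSC payload)
    // Strip the Vary header and both requests collapse onto a single key.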

This was a big clue. What if the client’s CDN was failing to properly observe the Vary header? That would totally explain why the user might get two different responses for the same path!

Beware the Defaults

We immediately started reviewing the documentation for the client’s Akamai CDN, and it didn’t take long to find the following… 

“Many web servers add Vary HTTP headers to content by default. This header lets applications alter content based on the browser in use or other client properties. Unfortunately, this makes it difficult to cache content for quicker delivery. As a best practices setting, AMD assumes that all content is cacheable and automatically removes the Vary HTTP header to allow for caching.” (Source)

Eureka!

Thanks to the defaults, Akamai was stripping the Vary header from responses, which resulted in incorrect responses for certain routes: the cache could no longer tell a full HTML page apart from the RSC payload for the same path. 

So when our users unwittingly hit the back button to return to the previous page, the CDN returned whichever response it had cached most recently for that route. This resulted in the user seeing the raw RSC payload instead of the fully rendered page. 

After we toggled off the “Remove Vary Header” feature, our application was happy again, and our users could safely backtrack as many times as they wanted. 

This experience highlights the importance of understanding how modern frameworks like Next.js rely on nuanced mechanisms like the Vary header. While CDNs are essential for speed and scalability, their default behaviors can conflict with dynamic applications, leading to issues like serving incorrect responses. 

The takeaway? Always consider how your CDN handles critical details like these headers, especially when working with server-rendered content. Sometimes, a small adjustment—like preserving Vary—can make all the difference in delivering that seamless user experience that makes your clients say WOW. 

And, whenever you’re next choosing a digital agency to work with, be sure that they’re a knowledgeable partner who can navigate unexpected challenges like this one.

Interested in moving to the JAMstack? Let's talk.
