Debugging stories: What’s that 404 error?

Here is a little story about resolving an issue with a web site that turned out not to be an issue with a web site :-)

A client approached me and asked whether I could look into an issue they were having with their web app. Multiple users, mainly on mobile devices, were reporting 404 Not Found errors when accessing the site’s domain.

Server error for some devices? 🤨

It sounded strange that the server would return a 404 only for some mobile devices. I tried to reproduce the issue but could not, either on my own devices (mobile or not) or on devices from an external device farm.

So, after double-checking that users were actually accessing the right site, asking them to clear their browser’s cache, and having them send in screenshots (which showed a default 404 page), I started familiarizing myself with what was running on the server and went through the web server log files. The result in short: I couldn’t find any of the errors there.

Reproducible, but no logs?

After a while, though, the client had a device at hand that could reliably reproduce the issue. I asked them to open the address that led to the 404 while I monitored the log file: nothing showed up. Either the device was not making an actual network request, or it was hitting another server. Logging itself was working: I had verified earlier that my own visits to the site produced entries in the server’s access and error logs.

404 – wtf?

Asking further about the problem, I learned that the issue would sometimes (!) go away when a user connected to a Wi-Fi network while still using the same device. While this also sounded weird, it made the problem much easier to narrow down.

Inspecting the request, finding the issue!

I tried to connect again with one of my devices using the mobile network and could finally see the same issue. I hooked the device up to my computer and looked at the network request my phone was sending (over the mobile network) and the answer the (a?) server was giving. This revealed what the real issue was.

Via the mobile network, the IPv6 address from the AAAA DNS record of the site’s domain was being used. On Wi-Fi, however, the device used the IPv4 address from the A record. This is not an issue by itself; it just depends on what a device or router supports and how it is configured. But it showed that someone had set the AAAA record to an address pointing to a different server, which, naturally, didn’t have the same resources 😱. This explained all previously reported issues:

  • Devices on mobile networks used the IPv6 address (Dual Stack, IPv6 preferred) and thus reached a totally different server
  • Most of the time, the problem would go away when switching from a mobile network to a (W)LAN; these were IPv4 connections
  • Some networks were apparently IPv6-only, meaning the problem persisted after switching to them
  • In some networks, desktop access worked but mobile access didn’t, even though it was the same network: it turned out that the mobile devices automatically switched to the cell network when it was faster (Wi-Fi Assist) 😈
  • Of course nothing would ever show up in the logs of the web server in question, as the faulty request wasn’t sent there
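
A quick way to spot this kind of mismatch is to list the IPv4 and IPv6 addresses a hostname resolves to and check that both belong to the server you expect. The following is only a minimal sketch, not what I ran at the time: the function names and the `example.com` placeholder are made up for illustration, and the system resolver may hide AAAA results on networks without IPv6 connectivity.

```python
import socket


def addresses_by_family(addrinfos):
    """Group socket.getaddrinfo() results into IPv4 and IPv6 address lists."""
    result = {"IPv4": [], "IPv6": []}
    for family, _type, _proto, _canonname, sockaddr in addrinfos:
        if family == socket.AF_INET:
            label = "IPv4"   # addresses that come from A records
        elif family == socket.AF_INET6:
            label = "IPv6"   # addresses that come from AAAA records
        else:
            continue
        address = sockaddr[0]
        if address not in result[label]:
            result[label].append(address)
    return result


def resolve(hostname, port=443):
    """Resolve a hostname via the system resolver (hypothetical helper)."""
    try:
        infos = socket.getaddrinfo(hostname, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        infos = []  # name did not resolve at all
    return addresses_by_family(infos)


if __name__ == "__main__":
    # "example.com" is a placeholder; use the affected domain instead.
    print(resolve("example.com"))
```

If the IPv6 list contains an address that does not belong to the web server, clients that prefer IPv6 will end up somewhere else.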

Correcting the IPv6 address in the AAAA record solved the issue – obviously. It’s always the little and easy things that go wrong :-)
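
To confirm such a fix, or to see the mismatch in the first place, one option is to request the site once per address and compare the status codes. The sketch below is an illustration under assumptions: it assumes the site is served over HTTPS on port 443, `status_via` and `parse_status` are hypothetical helper names, and the addresses in the usage example are documentation placeholders rather than the real ones.

```python
import socket
import ssl


def parse_status(status_line):
    """Extract the numeric code from a line like 'HTTP/1.1 404 Not Found'."""
    parts = status_line.split(" ", 2)
    if len(parts) < 2 or not parts[0].startswith("HTTP/"):
        raise ValueError(f"not an HTTP status line: {status_line!r}")
    return int(parts[1])


def status_via(ip, hostname, port=443, timeout=5.0):
    """Send 'HEAD /' to one specific IP address, using hostname for TLS and Host."""
    context = ssl.create_default_context()
    request = (
        f"HEAD / HTTP/1.1\r\nHost: {hostname}\r\nConnection: close\r\n\r\n"
    ).encode("ascii")
    # Connecting to the literal IP bypasses DNS, so the address family is fixed.
    with socket.create_connection((ip, port), timeout=timeout) as raw:
        with context.wrap_socket(raw, server_hostname=hostname) as tls:
            tls.sendall(request)
            line = tls.makefile("rb").readline().decode("iso-8859-1").strip()
    return parse_status(line)


if __name__ == "__main__":
    # Placeholder addresses from the documentation ranges; replace with the
    # values from the domain's A and AAAA records.
    for ip in ("192.0.2.10", "2001:db8::1"):
        try:
            print(ip, status_via(ip, "example.com"))
        except (OSError, ValueError) as exc:
            print(ip, "failed:", exc)
```

A 200 over IPv4 next to a 404 over IPv6 (or a certificate error, which also surfaces as an `OSError` here) points at the AAAA record rather than at the web app.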

I hope this story will help someone with a similar problem find the solution quicker.

Like to comment? Feel free to send me an email or reach out on Twitter.