How I lost more time than I want to admit
How one stupid thing cost me hours and hours of bad decisions, and why at some point I said that Lambdas are False Hope
This is a short story about how I lost hours and hours and had a lot of bad ideas about how to solve one issue.
There are two stories and two lessons from this. The first story is that for a very very long time, I wanted to build something. It’s some deep desire inside me. I lost count of how many brainstorming sessions with my close friend I had about ideas.
But recently I figured out it was procrastination all along. That’s when I decided to build something, and don’t worry, it’s not about AI. It’s about monitoring and fault tolerance. The first lesson is that if you want to build something, you should stop procrastinating and build a thing you are passionate about. You have one for sure.
I’ve been building it for a couple of weeks now. The application has a simple stack so far.
It uses Next on the frontend because I want server-side rendering (SSR), and I know the technology. Or I thought I knew it.
The backend is written in NodeJS using Firebase Functions, which are essentially Google Cloud Functions.
Cloudflare is in front of the application.
After some time I started deploying it. Everything worked like a charm, but at first I used JWT and stored the token in local storage. At the beginning I didn’t want to lose too much time on auth, I wanted to ship things.
But as you know, SSR can’t work in this situation, since the JWT lives in local storage and the server never sees it. The time came to fix my auth flow, and I started moving to a cookie. For it to work properly, you have to use getServerSideProps in Next, so that data is fetched and the page is rendered on the server.
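To make the cookie idea concrete, here is a minimal sketch of reading a session cookie on the server side. It is not my actual code: the cookie name `session`, the API URL, and the injected `fetchWithCookie` helper are all assumptions for illustration, and the small `SsrContext` type only stands in for the context object Next passes to getServerSideProps.

```typescript
// Minimal structural types so the sketch compiles without the `next` package.
// In a real project the context type would come from Next itself.
type SsrContext = { req: { headers: { cookie?: string } } };
type Fetcher = (url: string, cookieHeader: string) => Promise<unknown>;

// Hypothetical names, for illustration only.
const SESSION_COOKIE = "session";
const API_URL = "https://api.example.com/me";

// Parse a raw Cookie header ("a=1; b=2") into a name -> value record.
export function parseCookies(header: string | undefined): Record<string, string> {
  const cookies: Record<string, string> = {};
  if (!header) return cookies;
  for (const part of header.split(";")) {
    const idx = part.indexOf("=");
    if (idx === -1) continue; // skip fragments without a value
    const name = part.slice(0, idx).trim();
    const raw = part.slice(idx + 1).trim();
    let value = raw;
    try {
      value = decodeURIComponent(raw);
    } catch (_err) {
      // Malformed escape sequence: keep the raw value.
    }
    if (name) cookies[name] = value;
  }
  return cookies;
}

// Builds a getServerSideProps-style function. The fetcher is injected so the
// sketch does not depend on any particular HTTP client.
export function makeGetServerSideProps(fetchWithCookie: Fetcher) {
  return async function getServerSideProps(ctx: SsrContext) {
    const token = parseCookies(ctx.req.headers.cookie)[SESSION_COOKIE];
    if (!token) {
      // No session cookie: send the visitor to the login page.
      return { redirect: { destination: "/login", permanent: false } };
    }
    // Forward the cookie to the API so it can authenticate the request.
    const cookieHeader = SESSION_COOKIE + "=" + encodeURIComponent(token);
    const user = await fetchWithCookie(API_URL, cookieHeader);
    return { props: { user } };
  };
}
```

The point is only that the browser sends the cookie with the page request, so the server can read it, which it can never do with local storage.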
Things worked locally (the classic “it works on my machine” joke :)), but they started breaking on Firebase.
I started noticing timeouts on my app. Vercel has a timeout of 10s to return a response, otherwise you get a 504. I got 504s, and I hate them at this point. My first assumption was that my environment variables were wrong.
They looked OK. I checked the logs and there was literally nothing. So my next assumption was that something was happening with getServerSideProps, but I couldn’t find anything, it was really simple: just fetch data and pass it into the component.
After hours of investigation, I noticed that Firebase strips all my cookies and only leaves __session. Of course, my cookie wasn’t named like that, and I didn’t expect this random thing to happen. I should’ve read the documentation, although I wouldn’t have thought to search the documentation for how my cookie should be named 🙄.
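For anyone hitting the same wall, here is a minimal sketch of building a Set-Cookie header with the one name that survives. The helper name `buildSessionCookie` and the chosen attributes (HttpOnly, Secure, SameSite=Lax) are my assumptions for illustration, not something Firebase requires beyond the `__session` name itself.

```typescript
// The only cookie name that gets forwarded to the function, per the behaviour
// described above.
const FIREBASE_COOKIE_NAME = "__session";

// Hypothetical helper: builds the value for a Set-Cookie response header.
export function buildSessionCookie(token: string, maxAgeSeconds: number): string {
  return [
    FIREBASE_COOKIE_NAME + "=" + encodeURIComponent(token),
    "Max-Age=" + Math.floor(maxAgeSeconds),
    "Path=/",
    "HttpOnly", // not readable from client-side JS
    "Secure", // only sent over HTTPS
    "SameSite=Lax",
  ].join("; ");
}
```

In an HTTP handler this would be used roughly as `res.setHeader("Set-Cookie", buildSessionCookie(jwt, 3600))`, where `jwt` is whatever token your login endpoint issues.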
After switching to the __session cookie, things were working on the API. But I was still getting 504. At this point, I was like: “What the fuck is happening?”.
My next assumption was that something was wrong with the cold start of the function. If you don’t know, the cold start is the time your function takes to bootstrap. When you don’t have much traffic, it can take a lot of time depending on what you do inside it. The first request will be slow.
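One common way to keep the bootstrap cheap is lazy initialization: instead of doing expensive setup when the module loads, do it the first time a request actually needs it and reuse the result while the instance stays alive. This is a minimal sketch of that general technique, not necessarily what I tried; `lazy` and the `connectToDatabase` mentioned below are hypothetical names.

```typescript
// Wrap an expensive async initializer so it runs at most once per instance.
// The first caller triggers the work; later callers reuse the same promise.
export function lazy<T>(init: () => Promise<T>): () => Promise<T> {
  let cached: Promise<T> | undefined;
  return () => {
    if (cached === undefined) {
      cached = init(); // paid only by the first request after a cold start
    }
    return cached;
  };
}
```

Usage would look like `const getDb = lazy(() => connectToDatabase())` at module level and `const db = await getDb()` inside the handler. It doesn’t make the work disappear, it only keeps it out of the bootstrap and away from requests that never need it.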
So I tried to optimize for it. Not much improvement. I even tried to raise the memory/CPU of the cloud functions, since I was getting desperate. Didn’t help.
Since a cold start could still be the issue, even if it was far-fetched, I set all my lambdas to stay warm. Warm instances are a way to keep your lambdas running all the time, which defeats the entire purpose of lambdas. You essentially end up simulating a server, and at this point I could’ve just bought a VPS on Hetzner and hosted the app there. But warm instances didn’t help.
I was joking at this point that lambdas are False Hope (referencing Star Wars: A New Hope). I even planned to rename the newsletter to False Hope.
At this point, I started to move from lambdas to an Express application. I wrapped everything inside one Express app, wrote a Dockerfile, and deployed it to Cloud Run.
If you don’t know what Cloud Run is, it’s a simple way to run serverless containers on Google Cloud. The newer generation of Firebase Functions (gen2) runs on Cloud Run as well.
But of course, this didn’t help either, my app was still down. At this point, I ruled out the backend as the issue, rolled everything back, and deployed on Firebase Functions once again.
Coming down to the real issue
The only thing that was left was NextJS. I knew that it had to be a problem with it. But how the hell do I debug it? How can this be a problem, I’m not doing anything crazy…
With no errors and barely any logs on Vercel, I was left on my own. My one thought was that if I had to, I would migrate it to GCP too and host it on my own. Maybe that would help… Thank God I didn’t go that route, but I was really close to it. It wouldn’t have helped.
I started digging a little bit on the internet. People were saying a bunch of stuff, but none of it matched my issue. So my suspicions moved to my design library, React MUI.
MUI is great and helped me move fast, it felt really great working with it. But after a few minutes of digging, I saw in GitHub issues that people had problems with the way they import it.
MUI supports tree shaking by default. If you don’t know JS, tree shaking removes the things you don’t use from the final bundle, so you end up only with the JS you use.
This is how I was importing it, tree shaking should do its job.
But people were saying that they saw improvement if they did imports like this.
Since I was desperate, I just decided to change all imports and only import what I need.
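The two styles people usually compare for MUI are named imports from the top-level package (`import { Button } from '@mui/material'`) and one default import per component path (`import Button from '@mui/material/Button'`). Assuming that is the change in question, here is a minimal sketch of a helper that rewrites the first style into the second. The name `toPathImports` is hypothetical, and it only handles plain component names: non-component exports such as theme helpers, and `type` imports, would need to be fixed by hand.

```typescript
// Matches: import { A, B as C } from '@mui/material';
const BARREL_IMPORT = /import\s*\{([^}]*)\}\s*from\s*['"]@mui\/material['"];?/g;

// Rewrite top-level named imports into one path import per component.
export function toPathImports(source: string): string {
  return source.replace(BARREL_IMPORT, (_match: string, names: string) =>
    names
      .split(",")
      .map((name) => name.trim())
      .filter((name) => name.length > 0)
      .map((name) => {
        // Support "TextField as Input": import the export under its local alias.
        const [exported, local = exported] = name.split(/\s+as\s+/);
        return "import " + local + " from '@mui/material/" + exported + "';";
      })
      .join("\n")
  );
}
```

Running it over a file’s source text leaves every other import untouched, so it is easy to review the diff before committing.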
I deployed the app, and it took a lot less time to deploy. So I got optimistic, tried to open it and log in, and things loaded 5 times faster. 5 fucking times faster…
I had my “holy shit” moment. In hindsight it all makes sense: my app ended up with too much junk it didn’t need, and that made it take longer to bundle, render, and even send data to the client.
But I went in so many different directions, even changing the entire infrastructure. It wasn’t hard, it took me a couple of hours, and a big shoutout to Google for making it easy. Being able to pivot quickly to a different architecture helped me rule out issues.
I guess, after all, lambdas are not a False Hope.
My second lesson from this is that things won’t be obvious all the time, there will be obstacles along the way, and all of us make mistakes. I guess it’s more than one lesson.
But sometimes we need a sanity check before making a move. That’s the reason people talk to a rubber duck: it’s their sanity check, and I didn’t have mine. Maybe it’s time for me to buy my rubber duck.