This is the final part in a series on serverless computing, where I’ll compare my experience with each provider. For the previous parts click below:
Let me just start by saying that I love using both AWS and GCP, but each has its own strengths and weaknesses. Choosing a cloud provider can be a difficult decision. At a high level they look the same, offering 95% of the same services. It isn’t until you use one in earnest that you really get to know it, and if you don’t try the alternatives you never know whether the grass is greener.
In AWS it took us hours to get the environments set up the way we wanted them. I watched a number of videos on the topic and sometimes had to backtrack on how I had configured the IAM permissions. I also had to create environment-specific users for certain cases, like storing private keys for CodeCommit. Sometimes I ran into limitations with services that I wanted to act in a cross-account capacity but couldn’t, like when I wanted to use the Amplify console from a services account and deploy the front end and back end to each environment account. It turns out that you can’t do that.
If I did it again I could probably do it in under an hour. However, the journey to gain that knowledge wasn’t intuitive, and the actual usage once implemented felt a bit unnatural.
With GCP we had it all set up in 15-30 minutes. It took us a few extra minutes to figure out how to set up billing properly because of different permissions between user accounts. We also had to recreate the production environment when we realized that we hadn’t set it up as multi-zone for HA. Overall, though, it was a rather intuitive process, and switching between environments was seamless.
AWS has GCP outgunned when it comes to infrastructure automation. Both have the basic tooling to do it, but AWS has a lot more options to make it easier. For example, it offers a drag-and-drop designer (CloudFormation Designer), resource discovery (CloudFormer), drift detection, simplified templates (SAM), and code generation (Amplify).
That being said, our experience turned out differently than expected. The code generation was nice when it gave me the options I wanted, but a number of times I wanted to specify something in the CF template that the Amplify wizard didn’t expose. That left me with the standard conundrums around modifying generated code: take what the CLI gave me wholesale, or just use it to help inform template creation.
Another disappointment related to the hard limits placed on CF templates. Thankfully we decided at the project outset to create our entire data model; for AWS, this meant creating our entire GraphQL API. Generally we don’t have it that well defined at the beginning of a project, but in this instance we knew what it would be. Halfway through I was pretty impressed by how little code decoration I had to do and how much CF templating was generated for me. Then it broke.
Once I finished creating the whole GraphQL API it started blowing chunks. After some research I found this open issue: there is a size limit (~460KB) that my generated CF templates for the GraphQL API were hitting. Amplify only supports one GraphQL API, so I couldn’t split it. I tried minifying the generated CF template and got under the size limit, only to hit the 200-resource limit. I started removing elements from my data model and was just able to get under it.
It was working, but I was feeling kind of claustrophobic. I was also rather angry that Amplify’s GraphQL documentation didn’t come with a clear warning. What if I hadn’t been able to create my entire API up front and had hit the issue two months in?
My data model was not of an unreasonable size: about 25 entities with their associated mutations, queries, and subscriptions. That put me at about double the size limit and ~250 resources. Other people reported hitting the limit with as few as 12 entities. Honestly, this ended up being the straw that broke the camel’s back and made us entertain changing horses midstream.
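To make the scale concrete, here is a rough sketch of what one entity in a schema like ours looked like. The type and field names are hypothetical, but @model, @auth, and @connection are the Amplify GraphQL Transform directives in question:

```graphql
# Each @model type expands into roughly 8-10 CloudFormation resources:
# a DynamoDB table, an AppSync data source, resolvers, and the generated
# create/update/delete mutations, queries, and subscriptions. At ~25
# entities that multiplies out well past the 200-resource stack limit.
type Invoice
  @model
  @auth(rules: [{ allow: groups, groupsField: "tenant" }]) {
  id: ID!
  tenant: String!
  amount: Float!
  lineItems: [LineItem] @connection(name: "InvoiceLineItems")
}

type LineItem @model {
  id: ID!
  description: String!
  invoice: Invoice @connection(name: "InvoiceLineItems")
}
```

The appeal is obvious: a few lines of decoration produce the whole API. The catch is that the generated template’s size grows with every decorated type, and you have no real lever to shrink it.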
During this process I also had to keep tearing down and recreating stacks that somehow became tainted and would no longer deploy. I knew that I committed a few sins by changing something in the console that was defined by the stack, but that didn’t account for most of the instances. I’m sure they could be fixed but often it was easier to destroy and recreate. However, I knew that this was not a great option once I actually had things running.
With Firebase (GCP) we did infrastructure automation by using the CLI with rather minimal configuration. Firestore is schemaless so there wasn’t any upfront data model definition. There were a few configurations that we couldn’t automate but they were trivial, one-time changes so we created a Google doc to track them.
For AWS we had to start off with some reading about the difference between user pools and identity pools, and about general multi-tenant, multi-role strategies. The Cognito limits made us uncomfortable, but we found a happy place by using groups for tenants and custom attributes for roles. Not being able to set those custom attributes through the console pushed us toward developing our user admin earlier than we wanted. We also ran into these issues, which for brevity I won’t explain:
- Ability to set custom attributes in the CF template through the Amplify CLI
- Ability to AND authorization rules
- Ability to pass custom attributes in the auth token for the GraphQL resolver to reference
- Cognito user group limits
- Ability to perform an auth check using an attribute in a separate table
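The strategy above can be sketched as a pure function over the decoded ID token claims. The "cognito:groups" claim name follows Cognito’s token format; "custom:role" and the role values are our own assumptions, not Cognito built-ins:

```typescript
// Shape of the claims we care about in a decoded Cognito ID token.
interface TokenClaims {
  "cognito:groups"?: string[]; // one Cognito group per tenant
  "custom:role"?: string;      // role stored as a custom attribute
}

// The "AND" we wanted the resolver auth rules to express:
// the user must belong to the tenant's group AND hold the required role.
function canAccess(
  claims: TokenClaims,
  tenant: string,
  requiredRole: string
): boolean {
  const inTenant = (claims["cognito:groups"] ?? []).includes(tenant);
  const hasRole = claims["custom:role"] === requiredRole;
  return inTenant && hasRole;
}
```

The frustration was that while this check is trivial to write in application code, expressing it declaratively in the generated resolvers (and seeding the custom attribute in the first place) was where we kept hitting walls.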
For Firebase, the user admin is fairly simplistic, but the ability to extend it with Firestore let us easily create a custom solution for multi-tenant, multi-role. The rules were also very easy to set up to enforce it.
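As a rough illustration (the collection names and roles are hypothetical, not our exact rules), the Firestore flavor of multi-tenant, multi-role enforcement looks something like this:

```
rules_version = '2';
service cloud.firestore {
  match /databases/{db}/documents {
    // A membership doc per user per tenant, e.g. /tenants/acme/members/{uid},
    // holds that user's role for that tenant.
    function role(tenant) {
      return get(/databases/$(db)/documents/tenants/$(tenant)/members/$(request.auth.uid)).data.role;
    }
    match /tenants/{tenant}/invoices/{invoice} {
      allow read: if role(tenant) in ['member', 'admin'];
      allow write: if role(tenant) == 'admin';
    }
  }
}
```

Because the role lives in a plain Firestore document, an admin screen can change it with an ordinary write — no custom attribute APIs, no console limitations.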
For AWS, we used the Amplify CLI and Console to set up the hosting. It set up a domain for us by default (branchname.somerandomstringofcharacters.amplifyapp.com), which came with SSL. Setting up a custom domain was a pain. If the DNS had been managed by the same account we were using the Amplify console on, it would have been easy. However, the DNS was managed on the parent account, and we could never get each environment account set up with a custom subdomain. We even tried the instructions for pointing external DNS, but that didn’t work. So we planned to just buy a domain per environment.
This was an easier and more successful task with GCP. Firebase set up a default domain for us based on our project name (proj-name.firebaseapp.com), which was rather convenient. Honestly we could have just used that, as it was relevant and memorable. Our custom domain was managed by Google Domains. After some standard ownership verification we were able to set up a custom subdomain for each lower environment. SSL was set up by Firebase for both the default and custom domains.
In concept, I like AWS’s serverless database offering better than Firestore. GraphQL is hands-down the best developer experience for querying; it is like going to a buffet rather than having to order one thing at a restaurant. The fact that I can back the GraphQL API with DynamoDB, Lambda, or Elasticsearch opens up a world of possibilities. It is also really easy to set up via decorators on the GraphQL schema, and its ability to code-gen the Angular service to interface with the API is pretty awesome!
Yet the flexibility seems to come at a price. First off, it is noticeably slower than Firestore. Secondly, subscriptions will only be triggered by data going through the mutations. So if other sources in your cloud change your data without going through your GraphQL API, you won’t be notified.
Firestore is an altogether different animal, as it is a schemaless database. If you need to query across entities you have to make multiple trips to the server, and drilling down through nested documents and collections creates serial round trips. Thankfully the progression usually matches the workflow, so it isn’t that detrimental. There was a short period where we considered creating a GraphQL API for Firebase but decided to just hope that someone else would instead. Still, I like the security of knowing that I am notified if my data is changed by something else in my cloud. It also has a really nice interface in the console, and it is stupid fast.
Automated deployments (CI/CD)
For AWS we tried to use the Amplify console to deploy from one services account to each environment account. We figured out how to do that for the front-end but we couldn’t grant the right permissions to do that on the back-end. So we moved to using the Amplify console in each environment account and having them listen to different branches.
At first we used CodeCommit, but we had to set up different users in each environment account to house the SSH key for each developer; this seems to be a limitation of assuming a role across accounts. So we switched to GitLab for source control. The builds are honestly kind of slow: the initial build, which creates the stack, takes 20-25 minutes, and subsequent builds range from 3-6 minutes.
For GCP we just used GitLab CI and the Firebase CLI. Using their shared runners is rather slow (5-8 minutes even with caching), but deploying from local is pretty snappy. Also, Firebase keeps track of each hosting deploy, and you can jump back to previous versions through the console.
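For reference, a minimal .gitlab-ci.yml along these lines might look like the following. The image, stage name, project ID, and the FIREBASE_TOKEN CI variable are assumptions, not our exact config:

```yaml
deploy_production:
  image: node:10          # any image with Node works; firebase-tools comes from npm
  stage: deploy
  cache:
    paths:
      - node_modules/     # caching node_modules is what keeps runs in the 5-8 min range
  script:
    - npm ci
    - npx firebase deploy --only hosting --token "$FIREBASE_TOKEN" --project my-project-prod
  only:
    - master              # each environment can listen to its own branch
```

The `--token` value comes from running `firebase login:ci` once and storing the result as a protected CI variable, so the runner never needs an interactive login.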
We could have basically done the same solution for AWS as we did with GCP by using the Amplify CLI, so it isn’t much of a comparison. Honestly the main difference is that AWS CloudFormation started blowing chunks on the GraphQL API, which became a deal breaker for us.
We spent a solid week setting up the project in AWS. By the end I was following 6-8 issues on GitHub and hoping they would be fixed before I really needed them a month or two down the road. Considering the risk, we decided to do a time-boxed (5-hour) spike into switching to Firebase. In 3 hours we were staring at each other wondering if we had forgotten some detail. We had accomplished everything we set out to do, and it actually worked. The project felt simple again.
This experience highlights for me some strengths and weaknesses of each cloud provider. AWS seems to offer more services, with more knobs and dials you can turn to tune your solution. It can feel complex, but they offer many abstractions that get you 95% of the way there. You do always need to keep an eye out for those soft and hard limits, though, and you have to dig to find the gotchas.
GCP feels simpler, intuitive, and more cohesive. I don’t always have the same level of control but I honestly haven’t felt the need for it either. Even their beta offerings feel polished and thoughtful.
So in my situation GCP, or rather Firebase, was the clear winner. That won’t necessarily be true of every scenario, but I think the general characteristics of each cloud likely still hold true.
Erik is an agile software developer in Charlotte, NC. He enjoys working full-stack (CSS, JS, C#, SQL) as each layer presents new challenges. His experience involves a variety of applications ranging from developing brochure sites to high-performance streaming applications. He has worked in many domains including military, healthcare, finance, and energy.