
Cluster Autoscaling: The Story Behind the Feature

Last Updated: June 20, 2024

Last week, Starburst engineer Jack Klamer shared the release of an exciting new feature for Starburst Galaxy: Cluster Autoscaling. In this post, we take a step back to learn more about Jack, how he came to work on the cutting edge of Trino, and how Starburst Engineering brings these ideas to life.

If you were to make a simplistic generalization about the job of an engineer, it’s safe to say that one theme runs through every engineering discipline: the innate drive to build things. This desire does not pop up overnight. Many who pursue the craft have childhood memories of taking apart their pencils just to put them back together again, or some equivalent story that shows the eerily similar personality traits engineers tend to carry for most of their lives. Enter Jack Klamer, one of the Starburst Galaxy software engineers, who fits this description perfectly.

The Anatomy of an Engineer

Philosophically, Jack has wanted to build things for a “really long time”. Ever since he was a kid, he has identified with the job description of engineer, even before he really understood what it meant. Luckily for Jack’s inner child, I’d say he succeeded, since programming is one of the purest forms of building something, most of the time out of nothing. Most recently, the “something” that Jack and his team built was cluster autoscaling within Starburst Galaxy.

It takes confidence, teamwork, and a special personality to spend months working on a feature that, quite frankly, should go unnoticed. Sometimes the engineers creating these new capabilities do not get the credit they deserve for their work. So, let’s flip the script and peek behind the curtain to see just how amazing this Starburst Engineering team is from the perspective of one of their own. I had a fascinating conversation with Jack about his journey and the rollercoaster of emotions that comes with undertaking a project as big as cluster autoscaling. Throughout the conversation, the recurring themes were confidence, teamwork, and test-driven development.

The Origin Story

Jack started his career at Apple, where he got his introduction to the “Big Data Space”, and his passion for Data Mesh is what led him to Starburst. Jack is not just a Data Mesh believer but a true disciple of this data management approach, and he wanted to join a company that champions the same vision. Since he knew about Trino from working within the data ecosystem, and since Starburst has become an established leader in the Data Mesh category, the career move made perfect sense. The other contributing factor in joining Starburst was the engineers themselves. Jack wanted to work with smart people focused on solving the same problems he feels passionately about.

Jack found a place on the Starburst Galaxy team and, after a couple of bug fixes, moved into the big leagues. By luck of the draw, or because he wasn’t already in the throes of another project, he started on his autoscaling journey. I have my own conspiracy theories about the situation, because Jack got a question about Kubernetes autoscaling during his interview process, which seems eerily coincidental to me. Anyone in the technology world knows that if you answer the right question about a topic, you are doomed to own it for all of eternity, even if you have no idea what you are getting into when you start.

The path forward may not have been clear at the beginning, but there was a confidence from everyone involved that the team had the right skillset to create a viable solution. Jack had enough confidence in his technical abilities to know he was capable of taking on the challenge. Josh Howard, Jack’s manager, believed in Jack’s skills enough to push him towards this problem. And of course, Jack also had lots of confidence in his teammates to accurately review his code, and to tell him when he was wrong.

Creating a Feature

I’m sure you’ve heard the old mantra “failing to plan is planning to fail”, and the bigger the feature, the more applicable this mantra becomes. Jack spent weeks diligently reading code to learn the environment, because it is critical to understand the ripple effect of each change. Beyond the code changes themselves, grasping the full scope of cluster autoscaling meant working through substantial deployment, testing, and migration details. Jack admitted to me that, in many cases, his team’s suggestions for this part of the implementation were ones he hadn’t considered on his own. After gathering as much information as possible, Jack created a code plan that enabled testing at every step of the journey.

Crediting his backend blood, Jack started making code changes from the bottom up, which meant lots of these code changes were implemented without affecting the running code at all. The sheer volume of changes that had to be made in each place ended up creating quite a code safari, and only escalated the importance of a proper test plan. As the project continued, a majority of the time spent creating cluster autoscaling actually revolved purely around testing, and Jack realized that the tests he was trying to perform were novel compared to the original testing expectations. Here’s the thing. Failing to plan is planning to fail. But failing to pivot from the original plan is also a recipe for disaster.

Ready, Set, Test!

After hours and hours of haggling with the test stack, Jack pivoted from his original test plan and ended up collaborating with another Starburst team to create a test environment that could perform the task at hand. You may wonder why Jack dedicated so much of his time on this feature to diligently testing his code. The answer relates back to one commonality I’ve observed: the best programmers are genuinely test-driven developers.

I asked Jack how he would describe himself when it comes to testing, and he diagnosed himself as painstakingly meticulous. He spent so much time testing each individual change because he doesn’t like deploying code for scenarios he hasn’t proved out. The more we chatted, the more passionate about the subject he seemed, and we even got into some of the details of the extensive matrix he created to completely mock up his entire state machine. It’s a good thing Jack put this effort into testing, because, as everyone does when reviewing their own code, he found some errors that he remedied before the feature launched.
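For readers curious what a test matrix over a state machine can look like, here is a minimal sketch. It is not Jack’s actual matrix or the Galaxy implementation: the states, events, and transition rules below are all hypothetical, chosen only to illustrate the idea of writing down an expected outcome for every state and event combination and then checking that none is left uncovered.

```python
from enum import Enum
from itertools import product


class State(Enum):  # hypothetical cluster states, for illustration only
    STABLE = "stable"
    SCALING_UP = "scaling_up"
    SCALING_DOWN = "scaling_down"


class Event(Enum):  # hypothetical events, for illustration only
    LOAD_HIGH = "load_high"
    LOAD_LOW = "load_low"
    RESIZE_DONE = "resize_done"


# Hypothetical transition table: (state, event) -> next state.
TRANSITIONS = {
    (State.STABLE, Event.LOAD_HIGH): State.SCALING_UP,
    (State.STABLE, Event.LOAD_LOW): State.SCALING_DOWN,
    (State.SCALING_UP, Event.RESIZE_DONE): State.STABLE,
    (State.SCALING_DOWN, Event.RESIZE_DONE): State.STABLE,
    (State.SCALING_DOWN, Event.LOAD_HIGH): State.SCALING_UP,
}


def step(state, event):
    """Return the next state; pairs not in the table leave the state unchanged."""
    return TRANSITIONS.get((state, event), state)


# The test matrix, written out by hand: one expected outcome per cell.
EXPECTED = {
    (State.STABLE, Event.LOAD_HIGH): State.SCALING_UP,
    (State.STABLE, Event.LOAD_LOW): State.SCALING_DOWN,
    (State.STABLE, Event.RESIZE_DONE): State.STABLE,
    (State.SCALING_UP, Event.LOAD_HIGH): State.SCALING_UP,
    (State.SCALING_UP, Event.LOAD_LOW): State.SCALING_UP,
    (State.SCALING_UP, Event.RESIZE_DONE): State.STABLE,
    (State.SCALING_DOWN, Event.LOAD_HIGH): State.SCALING_UP,
    (State.SCALING_DOWN, Event.LOAD_LOW): State.SCALING_DOWN,
    (State.SCALING_DOWN, Event.RESIZE_DONE): State.STABLE,
}


def check_matrix(expected):
    """Return a list of problems: uncovered cells or mismatched outcomes."""
    problems = []
    for state, event in product(State, Event):
        if (state, event) not in expected:
            problems.append(f"uncovered: {state.name} x {event.name}")
        elif step(state, event) != expected[(state, event)]:
            problems.append(f"mismatch: {state.name} x {event.name}")
    return problems
```

Calling `check_matrix(EXPECTED)` returns an empty list when every cell is covered and agrees with `step`. The value of the approach is that a forgotten scenario shows up as an explicit "uncovered" entry rather than as a surprise in production.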

The extensive matrix was not the only extreme testing method developed for cluster autoscaling. Jack also built his own testing infrastructure specifically to test the code migration, which required a delicate level of precision. Building and testing the feature was only part of the challenge; migrating the feature into a production environment is a whole other beast. Ideally, this migration would entail hot swapping the live code “Indiana Jones style”, as Jack described it. Because of this dedication to proper development practices, Jack caught multiple migration issues that would have turned catastrophic had he not thoroughly vetted every part of the migration before execution. Even better, this migration infrastructure is being improved so that it can potentially serve as a safeguard for other engineers in the future.

Teamwork Makes the Dream Work

Jack led a fantastic effort to enable cluster autoscaling within Starburst Galaxy, and that effort was enhanced by the dedication of his teammates, specifically Eric Hwang, Pachu Shrinivas, and Neeraj Soparawala. Each teammate played a pivotal role in shaping the code, working through tough problems, and catching potential mistakes that had previously gone unnoticed. Credit goes to the village that spent hours reviewing and shaping the code to make this feature all that it can be.

Join Starburst Engineering

If you are looking to work for an organization that encourages you to produce meaningful and impactful work, join our engineering team. Jack, Eric, Pachu, and Neeraj are just a small subset of the talented coworkers I have gotten to interact with and watch from afar. Our team is proud of the environment we have created to empower each other to go the extra mile, and with that kind of support, the possibilities of what you can achieve are endless.
