This blog was written in collaboration with Bartosz Owczarek, Software Engineer at Starburst, and Antony Prasad Thevaraj, Sr. Partner Solutions Architect at AWS.
Starburst Enterprise is a versatile and easy-to-use analytics engine, but installing and configuring it on AWS can sometimes be challenging. Deploying Starburst in the AWS Cloud just became easier with the introduction of Starburst Enterprise for EKS.
Starburst has been available in the AWS Marketplace for some time as both a CloudFormation template (CFT) and an Amazon Machine Image (AMI) deployment. With Starburst Enterprise for EKS, a user can deploy a fully optimized installation of Starburst running on Kubernetes, combined with native EKS features like autoscaling and Spot instance integration, in a completely self-service manner. It’s a fully functioning version of the platform offered as a pay-as-you-go service, with all billing handled through your existing Amazon Web Services account.
To demonstrate how simple this can be, let’s walk through an actual deployment.
Prerequisites
To get started, you will need:
Install all of the listed components on your local machine and configure your client connection to your AWS environment (see the AWS documentation for how to do this).
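A quick way to confirm the command-line tools are in place is a small PATH check. This is only a sketch: the tool names below (aws, eksctl, kubectl, helm) are assumptions based on the commands this walkthrough relies on, and Lens is a desktop application that is not checked here.

```shell
# Print every tool from the argument list that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# The tool list is an assumption, not the article's official prerequisites.
missing=$(missing_tools aws eksctl kubectl helm)
if [ -n "$missing" ]; then
  printf 'Missing tools:\n%s\n' "$missing"
fi
```

If nothing is printed, all four tools were found. Running `aws sts get-caller-identity` afterwards is a simple way to confirm the AWS CLI is pointed at the account you expect.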
Setting up your EKS cluster
There are a number of ways you can set up an EKS cluster, but I think the easiest and most repeatable way is to use the eksctl utility. This Go application leverages CloudFormation to build out all the infrastructure required to support an EKS deployment.
A convenient eksctl template that you can use to deploy your own EKS cluster capable of running Starburst is available here. It’s preconfigured to set up a simple cluster with two node pools, one of which leverages both EC2 Auto Scaling groups and Spot instances. This is well suited to running Starburst with a balance of cost and performance while retaining a high degree of stability, but more on that later.
From the template, edit the placeholder variables in bold to suit your environment. This install uses modestly sized m5/m5a/m5ad EC2 machine types, so ensure you have capacity for these in your cloud region!
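For orientation, a template of this shape could look roughly like the sketch below. It is not the template linked above: the node group names, sizes, and the xlarge instance size are assumptions chosen for illustration, and only the overall layout (two node groups, one of them an autoscaling group on Spot capacity using m5/m5a/m5ad types) follows the description.

```shell
# Write a two-node-group eksctl ClusterConfig to a file.
# Arguments: cluster name, AWS region, output path (all placeholders).
write_cluster_config() {
  cat > "$3" <<EOF
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: $1
  region: $2
nodeGroups:
  # Small on-demand group (name and size are assumptions).
  - name: base
    instanceType: m5.xlarge
    desiredCapacity: 1
  # Autoscaling group running entirely on Spot capacity (assumed sizes).
  - name: workers-spot
    minSize: 1
    maxSize: 4
    instancesDistribution:
      instanceTypes: ["m5.xlarge", "m5a.xlarge", "m5ad.xlarge"]
      onDemandBaseCapacity: 0
      onDemandPercentageAboveBaseCapacity: 0
      spotInstancePools: 3
    iam:
      withAddonPolicies:
        autoScaler: true
EOF
}
```

Calling `write_cluster_config my-cluster us-east-1 cluster.yaml` writes the file with the name and region substituted. Treat the result as a starting point to compare against the real template rather than a replacement for it.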
Once you have copied down the template and edited it to suit your environment, go ahead and run this command to create the cluster:
An eksctl job run on the command line
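A minimal sketch of that cluster-creation command, assuming the edited template was saved as cluster.yaml (the file name is a placeholder):

```shell
# Create the cluster described in the edited eksctl template.
# $1: path to the template; defaults to the placeholder name cluster.yaml.
create_cluster() {
  eksctl create cluster -f "${1:-cluster.yaml}"
}
```

Running `create_cluster` starts the CloudFormation-backed build, which typically takes a while to finish.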
Once the deployment is complete, your local Kubernetes configuration file will be automatically updated. At this point you can fire up Lens, see the running cluster, and start to move on to the Starburst installation.
The Lens application view of the nodes in your cluster
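If you prefer the command line to Lens, the same check can be sketched with kubectl. This assumes eksctl has already made the new cluster the active context in your kubeconfig.

```shell
# Show which kubeconfig context is active, then list the cluster's nodes.
show_cluster() {
  kubectl config current-context
  kubectl get nodes -o wide
}
```

Calling `show_cluster` should print the context for the new cluster followed by one row per node.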
Installing Starburst
Now that you have a fully functioning EKS cluster, it’s time to start installing Starburst.
1. To begin, log in to the AWS Management Console.
2. Type ‘marketplace’ in the search bar and select ‘AWS Marketplace Subscriptions’.
3. Click on ‘Discover Products’ in the left navigation.
4. Search the Marketplace listings for ‘Starburst EKS’.
5. Select the ‘Starburst Enterprise for EKS PayGo’ option from the list and click through the subscribe and then configure pages.
6. On the configuration page, ensure that ‘Helm Chart CLI installation’ and the latest version of the software are selected, and then click on ‘Continue to Launch’.
7. The ‘Usage Instructions’ button reveals the detailed steps, which I have also included below for convenience:
- Create a Kubernetes namespace on the cluster for Starburst.
- Associate an IAM OIDC provider with your cluster.
- Create an IAM service account for your cluster.
- Create the AWS Marketplace pull secret (so we can download the Helm chart package).
- For older versions of Helm (before v3.8.0), set the HELM_EXPERIMENTAL_OCI environment variable to enable OCI registry support.
- Next, authenticate to the container registry (Amazon ECR).
- Then pull the Helm chart down from ECR.
- At this point you could just run the basic Helm install with all the defaults, but that doesn’t leave much room for customization. Let’s say I’d also like to point Starburst to my AWS Glue catalog, set my pod sizes to fit the nodes in my cluster, and ensure that I have autoscaling enabled for my Starburst worker pods.
To start, I will need to set up a YAML file, which I can then include as a values file in my Helm deployment. This is the values.yaml file that I will be using:
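The preparation steps listed above, up to pulling the chart, might look roughly like this on the command line. Every value here is a placeholder or an assumption: the cluster, namespace, and service account names are hypothetical, the policy ARN and registry host are assumed to match what the Marketplace ‘Usage Instructions’ show, and the chart path and version are left for you to copy from that page. The pull-secret step is left out because its exact form comes from those instructions. The last function dumps the chart’s default values, which is one way to see which keys a custom values.yaml can override.

```shell
# Placeholders and assumptions: replace with values from your environment
# and from the Marketplace "Usage Instructions" page.
CLUSTER_NAME="my-eks-cluster"
NAMESPACE="starburst"
SERVICE_ACCOUNT="starburst-sa"
METERING_POLICY_ARN="arn:aws:iam::aws:policy/AWSMarketplaceMeteringRegisterUsage"
ECR_REGION="us-east-1"
REGISTRY="709825985650.dkr.ecr.us-east-1.amazonaws.com"
CHART_REF="oci://$REGISTRY/<chart-path-from-usage-instructions>"
CHART_VERSION="<version>"

# Namespace, OIDC provider, and IAM service account.
prepare_cluster() {
  kubectl create namespace "$NAMESPACE"
  eksctl utils associate-iam-oidc-provider --cluster "$CLUSTER_NAME" --approve
  eksctl create iamserviceaccount \
    --name "$SERVICE_ACCOUNT" \
    --namespace "$NAMESPACE" \
    --cluster "$CLUSTER_NAME" \
    --attach-policy-arn "$METERING_POLICY_ARN" \
    --approve
}

# Log in to the registry and pull the chart archive into the current directory.
pull_chart() {
  # Only needed for Helm releases before v3.8.0.
  export HELM_EXPERIMENTAL_OCI=1
  aws ecr get-login-password --region "$ECR_REGION" |
    helm registry login --username AWS --password-stdin "$REGISTRY"
  helm pull "$CHART_REF" --version "$CHART_VERSION"
}

# Dump the chart's default values as a reference for a custom values.yaml.
# $1: path to the pulled chart archive (.tgz).
dump_default_values() {
  helm show values "$1" > default-values.yaml
}
```

Run `prepare_cluster` once per cluster, then `pull_chart`, then `dump_default_values` with the name of the downloaded archive. Settings such as the Glue catalog, pod sizes, and worker autoscaling should be checked against the keys in default-values.yaml before you write your own file.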
- Install Starburst through the Helm command. Note that I do not need to extract the tar package beforehand. I can just point Helm to it. Also, note where I’ve included my custom values.yaml file.
- Once the Helm deployment has completed, you should check Lens and verify that all the pods have started up correctly.
Running pods as viewed in Lens
- Lastly, since we are running the Starburst service in Kubernetes using the default ‘ClusterIP’ service type, which doesn’t allow direct access from the internet, we’ll just use Lens to port-forward the connection to our local machine.
The Starburst service running in Kubernetes
Forwarding the connection to my local machine
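The last three steps (install, verify, port-forward) also have command-line equivalents, sketched below. The release name, the namespace, and port 8080 are assumptions, and the service name passed to the last function should be whatever `kubectl get svc` reports in your namespace.

```shell
RELEASE="starburst"     # hypothetical Helm release name
NAMESPACE="starburst"   # namespace created for Starburst (assumed name)

# Install straight from the pulled archive, with no need to extract it.
# $1: path to the chart archive (.tgz), $2: custom values file.
install_starburst() {
  helm install "$RELEASE" "$1" --namespace "$NAMESPACE" --values "$2"
}

# List the pods so you can confirm they have all started.
check_pods() {
  kubectl get pods --namespace "$NAMESPACE"
}

# Forward local port 8080 to the service; 8080 is an assumed service port.
# $1: name of the Starburst service.
forward_ui() {
  kubectl port-forward --namespace "$NAMESPACE" "service/$1" 8080:8080
}
```

With the forward running, the web UI would be reachable at http://localhost:8080 for as long as the command stays in the foreground.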
Accessing the application
Once the application is installed and I have forwarded my connection using Lens, a browser window will pop up and I’ll be presented with a login screen. Since I am not running the application over HTTPS, I do not need to authenticate: I can enter whatever value I like as the username and get access to the application. Once inside, I will be able to see the Glue catalog (as long as I granted the appropriate permissions), alongside the tpch sample data.
The Starburst login screen
The Starburst application running in my browser
Final thoughts
The Starburst Enterprise for EKS offering in the AWS Marketplace provides a simplified self-service approach to getting set up with a Starburst cluster, and it offers an overall better experience than the existing CFT deployment. Everything you need to run an optimized, full version of the platform, using your own data sources and with billing handled through your existing AWS account, is already in place. All you need to do is ensure that you have an EKS cluster set up and execute the deployment steps outlined above. Then you’re off and running, ready to take advantage of the analytics engine for all your data.