Last Updated: 2024-04-16

Background

AWS PrivateLink provides private connectivity between virtual private clouds (VPCs), supported AWS services, and on-premises networks. Traffic over this connection is never exposed to the public internet, which makes PrivateLink well suited to securing client or tool connectivity, among other use cases.

Starburst Galaxy extends support for AWS PrivateLink across certain connections. This tutorial will guide you through configuring PrivateLink for a client or tool connection to a Starburst Galaxy cluster.

Scope of tutorial

In this tutorial, you will learn how to configure AWS PrivateLink for client or tool connectivity to Starburst Galaxy.

Learning objectives

Once you've completed this tutorial, you will be able to:

Prerequisites

About Starburst tutorials

Starburst tutorials are designed to get you up and running quickly by providing bite-sized, hands-on educational resources. Each tutorial explores a single feature or topic through a series of guided, step-by-step instructions.

As you navigate through the tutorial, you should follow along using your own Starburst Galaxy account. Mixing theory and practice in this way will help consolidate what you learn.

Background

If you are configuring PrivateLink for the first time, you are encouraged to work with a Starburst technical resource. This individual will work with you to set up the environment needed to complete the tutorial.

Contacting your technical resource

To be assigned this resource, you should reach out to your regular Starburst account team for assistance.

Working together

Once assigned, your Starburst technical resource will work with you to set up an environment where you can complete the tutorial.

Please review the following overview of this process before beginning the tutorial.

Your responsibilities:

Steps to add an additional Starburst Galaxy PrivateLink cluster in the same AWS VPC region:

Background

A wildcard DNS record is a type of DNS record that allows you to specify a wildcard character (*) as part of the domain name. This wildcard character matches any subdomain that has not been explicitly defined in the DNS zone.

For example, if you set up a wildcard DNS record for "*.example.com" to point to a specific IP address, then any subdomain of "example.com" that has not been explicitly defined (e.g., "sub1.example.com", "sub2.example.com", etc.) will automatically resolve to the specified IP address.

Wildcard DNS records are commonly used in scenarios where you want to route all subdomains of a domain to a single location or when you want to simplify DNS configuration for a large number of subdomains.

In this tutorial, you are going to use the wildcard DNS record to route all galaxy.starburst.io subdomains to a VPC Endpoint that you create in your AWS environment.
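The matching behavior described above can be sketched in a few lines of Python. This is a simplified model for illustration only, not how Route 53 or any real resolver is implemented: the resolve function and the records dictionary are hypothetical, and the IP addresses come from the reserved documentation range. It shows the two rules that matter here: an explicitly defined name wins, and any other subdomain falls through to the wildcard.

```python
from typing import Optional


def resolve(name: str, records: dict[str, str]) -> Optional[str]:
    """Return the target for a name: exact records win, then the closest wildcard."""
    name = name.rstrip(".").lower()
    if name in records:
        return records[name]
    labels = name.split(".")
    # Try progressively shorter parent domains, e.g. *.example.com, then *.com
    for i in range(1, len(labels)):
        wildcard = "*." + ".".join(labels[i:])
        if wildcard in records:
            return records[wildcard]
    return None


# Hypothetical zone contents, using documentation-only IP addresses
records = {
    "*.example.com": "192.0.2.10",
    "www.example.com": "192.0.2.20",
}

print(resolve("sub1.example.com", records))  # 192.0.2.10 (matched by the wildcard)
print(resolve("www.example.com", records))   # 192.0.2.20 (explicit record wins)
print(resolve("example.com", records))       # None (the wildcard does not cover the apex)
```

Real DNS wildcard semantics have additional edge cases that this sketch ignores.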

Wildcard DNS records and VPC Endpoints

Currently, you can only connect to a Starburst Galaxy PrivateLink cluster from one VPC per region. To set this up, you create one wildcard DNS record per region, and that record directs traffic to a single VPC Endpoint. Any request whose host name matches the wildcard record is therefore sent to that specific VPC Endpoint.

Step 1: Select appropriate Endpoint Service name

In the next section, you will create an Endpoint, and you will need to supply the correct Starburst Galaxy Endpoint Service name as part of the configuration. Locate your region from the list below, and copy the corresponding Endpoint Service name.

Background

Now it's time to create an Endpoint.

In the context of AWS PrivateLink, a VPC Endpoint allows users to securely connect their VPC to an Endpoint Service. In this case, you will be connecting to the Starburst Galaxy Endpoint Service.

Step 1: Start the Endpoint wizard

Step 2: Verify Starburst Galaxy Endpoint Service

It's time to start configuring your new Endpoint, beginning with a name tag and the Endpoint Service to connect to. This is where you verify the Starburst Galaxy Endpoint Service name you copied earlier in this tutorial.

Step 3: Select VPC and Subnets

Now it's time to select the VPC in which to create the Endpoint.

Step 4: Select security group

The final step in configuring your Endpoint is to choose a security group. Because this is a private Endpoint, the security group only needs to allow inbound traffic from the IP CIDR range of the hosts where your clients or tools are running.
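This tutorial uses the AWS console wizard, but it can help to see the same choices written down as data. The sketch below collects the Endpoint settings from Steps 2 to 4 into the keyword arguments that the boto3 EC2 calls create_vpc_endpoint and authorize_security_group_ingress accept. Everything specific in it is an assumption for illustration: the helper names are hypothetical, every ID and the service name are placeholders, port 443 is inferred from the https:// server URL used later in this tutorial, and PrivateDnsEnabled is set to False on the assumption that DNS is handled by the Route 53 records you create later. The code only builds dictionaries and does not call AWS.

```python
import ipaddress


def build_endpoint_request(service_name, vpc_id, subnet_ids, security_group_ids, name_tag):
    """Build keyword arguments for an interface VPC Endpoint request."""
    return {
        "VpcEndpointType": "Interface",
        "ServiceName": service_name,
        "VpcId": vpc_id,
        "SubnetIds": list(subnet_ids),
        "SecurityGroupIds": list(security_group_ids),
        # Assumption: DNS is handled by the Route 53 records created later
        "PrivateDnsEnabled": False,
        "TagSpecifications": [
            {
                "ResourceType": "vpc-endpoint",
                "Tags": [{"Key": "Name", "Value": name_tag}],
            }
        ],
    }


def build_ingress_rule(client_cidr, port=443):
    """Build one inbound rule that admits only the given IPv4 client CIDR."""
    # Raises ValueError if the CIDR is malformed or has host bits set
    network = ipaddress.IPv4Network(client_cidr)
    return {
        "IpProtocol": "tcp",
        "FromPort": port,  # assumption: clients connect over HTTPS
        "ToPort": port,
        "IpRanges": [
            {"CidrIp": str(network), "Description": "Starburst Galaxy clients"}
        ],
    }


# Placeholder values only; substitute the ones from your own AWS account
request = build_endpoint_request(
    service_name="com.amazonaws.vpce.us-east-1.vpce-svc-EXAMPLE",
    vpc_id="vpc-EXAMPLE",
    subnet_ids=["subnet-EXAMPLE1", "subnet-EXAMPLE2"],
    security_group_ids=["sg-EXAMPLE"],
    name_tag="galaxy-privatelink-endpoint",
)
rule = build_ingress_rule("10.0.0.0/16")
print(request["ServiceName"])
print(rule["IpRanges"][0]["CidrIp"])
```

Keeping the inbound rule limited to a single client CIDR mirrors the guidance above: nothing outside the hosts running your clients or tools needs to reach the Endpoint.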

Step 5: Obtain AWS Account ID

You will need to provide your AWS Account ID when you open a support ticket in the next step.

Step 6: Open support ticket

You are going to use the automated assistant in Starburst Galaxy to open a support ticket. In the ticket, give support the Account ID that you just copied and ask them to accept your Endpoint connection.

Step 7: Wait for Starburst to accept Endpoint connection

You need to wait for the Endpoint connection request to be accepted before moving on.
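If you would rather not refresh the console by hand, the wait can be modeled as a simple polling loop. The sketch below is an illustration under stated assumptions rather than a prescribed method: wait_until_available and its parameters are hypothetical names, and get_state stands in for whatever you use to read the Endpoint state (for example, a wrapper around the EC2 describe_vpc_endpoints call). State names are compared case-insensitively because their capitalization can differ between tools.

```python
import time
from typing import Callable

# States from which the Endpoint will not become available by waiting longer
TERMINAL_STATES = {"rejected", "failed", "expired", "deleting", "deleted"}


def wait_until_available(
    get_state: Callable[[], str],
    interval_s: float = 30.0,
    max_checks: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until the Endpoint is available; return False if checks run out."""
    for _ in range(max_checks):
        state = get_state().lower()
        if state == "available":
            return True
        if state in TERMINAL_STATES:
            raise RuntimeError(f"Endpoint entered state {state!r}")
        sleep(interval_s)
    return False


# Simulated sequence of states, with sleeping disabled for the demonstration
simulated = iter(["pendingAcceptance", "pending", "available"])
accepted = wait_until_available(lambda: next(simulated), sleep=lambda seconds: None)
print(accepted)  # True
```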

Step 8: Record DNS name and Subnet IPs

Later in this tutorial, you will be creating a DNS alias record for your Endpoint DNS name. You're going to record that name now and save it in a safe place for later use. You're also going to record your Endpoint Subnet IPs for validation in a later step.
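The two values recorded in this step also appear in the responses of the EC2 describe_vpc_endpoints and describe_network_interfaces calls. The sketch below shows one way to pull them out of response-shaped dictionaries. The helper names and all sample values are hypothetical, and the code assumes that the first DNS entry is the regional Endpoint name, which you should confirm against what the console shows.

```python
def endpoint_dns_entry(endpoint: dict) -> tuple[str, str]:
    """Return (DNS name, hosted zone ID) from the first DNS entry of an Endpoint."""
    # Assumption: the first entry is the regional (not zone-specific) DNS name
    first = endpoint["DnsEntries"][0]
    return first["DnsName"], first["HostedZoneId"]


def subnet_ips(network_interfaces: list[dict]) -> dict[str, str]:
    """Map each subnet ID to the private IP of the Endpoint interface in it."""
    return {ni["SubnetId"]: ni["PrivateIpAddress"] for ni in network_interfaces}


# Hypothetical sample data shaped like the AWS responses
sample_endpoint = {
    "VpcEndpointId": "vpce-EXAMPLE",
    "DnsEntries": [
        {"DnsName": "vpce-EXAMPLE.vpce-svc-EXAMPLE.us-east-1.vpce.amazonaws.com",
         "HostedZoneId": "ZEXAMPLE"},
    ],
    "NetworkInterfaceIds": ["eni-EXAMPLE1", "eni-EXAMPLE2"],
}
sample_interfaces = [
    {"NetworkInterfaceId": "eni-EXAMPLE1", "SubnetId": "subnet-EXAMPLE1",
     "PrivateIpAddress": "10.0.1.25"},
    {"NetworkInterfaceId": "eni-EXAMPLE2", "SubnetId": "subnet-EXAMPLE2",
     "PrivateIpAddress": "10.0.2.31"},
]

dns_name, zone_id = endpoint_dns_entry(sample_endpoint)
ips = subnet_ips(sample_interfaces)
print(dns_name)
print(sorted(ips.values()))
```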

Background

It's time to switch gears and work in the Starburst Galaxy UI. Your next task is to create a cluster that is only accessible via PrivateLink. When your client or tool uses PrivateLink to connect to Starburst Galaxy, it will be connecting to the cluster you create.

Step 1: Sign in to Starburst Galaxy

Step 2: Set your role

Starburst Galaxy separates users by role. Your current role is listed in the top right-hand corner of the screen.

Creating a cluster in Starburst Galaxy will require access to a role with appropriate privileges. Today, you'll be using the accountadmin role.

Step 3: Create a new cluster

It's time to create your cluster. In Starburst Galaxy, the Clusters pane is used for all cluster operations.

Step 4: Configure cluster

It's time to add some details to your cluster, such as the name, catalogs to include, and region.

Step 5: Configure Cluster type

Next you'll choose the execution mode, cluster size, and auto-suspend time period for your cluster.

Step 6: Configure cluster for PrivateLink

The option to make the cluster accessible only via PrivateLink is located in the Advanced settings menu.

Step 7: Copy cluster Host name

While you're in the Clusters pane, you need to record the Host name for your PrivateLink cluster for later use.

Background

Amazon Route 53 is a scalable and highly available Domain Name System (DNS) web service offered by Amazon Web Services. It routes end users to internet applications by translating human-readable domain names (like www.example.com) into the numeric IP addresses (like 192.0.2.1) that computers use to connect to each other.

In Route 53, a private hosted zone is a DNS zone that manages domain names and their DNS records within one or more Amazon VPCs. A private hosted zone can only be resolved from within the VPCs associated with it, making it ideal for internal resources that should not be accessible from the public internet.

When you create a private hosted zone in Route 53, you can define custom domain names (e.g., mycompany.local) and create DNS records (such as A, AAAA, or CNAME) for those domain names. These DNS records can then be used to route traffic within your VPC to the appropriate resources, such as EC2 instances, load balancers, or other services.

After you create the private hosted zone in Route 53, you'll be creating two DNS records. The first will route traffic to your PrivateLink cluster based on its host name. The second record will ensure that any redirects come back to the correct cluster.

Step 1: Create hosted zone

It's time to switch back to working in the AWS console. Hosted zones can be created from the Route 53 dashboard in AWS.

Step 2: Configure hosted zone

It's time to configure the hosted zone by providing the name of the domain you want to route traffic for and selecting the type of hosted zone you would like.

Step 3: Select VPC

You're almost finished with the private hosted zone configuration. The final step is to select a VPC to associate with the hosted zone.
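For reference, the three settings chosen in this section (domain name, private zone type, and associated VPC) correspond to the arguments of the Route 53 create_hosted_zone call in boto3. The sketch below only assembles those arguments and does not call AWS. The helper name is hypothetical, and the domain, VPC ID, and region shown are placeholders; use the domain name given in the step instructions.

```python
import uuid


def build_private_zone_request(domain, vpc_id, vpc_region, comment=""):
    """Build keyword arguments for creating a private hosted zone."""
    return {
        "Name": domain,
        # Associating a VPC at creation time is what makes the zone private
        "VPC": {"VPCRegion": vpc_region, "VPCId": vpc_id},
        # Any unique string; it lets AWS detect accidental duplicate requests
        "CallerReference": str(uuid.uuid4()),
        "HostedZoneConfig": {"Comment": comment, "PrivateZone": True},
    }


# Placeholder values only
zone_request = build_private_zone_request(
    domain="example.internal",
    vpc_id="vpc-EXAMPLE",
    vpc_region="us-east-1",
    comment="PrivateLink routing for Starburst Galaxy",
)
print(zone_request["Name"], zone_request["HostedZoneConfig"]["PrivateZone"])
```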

Background

Now that you've created a hosted zone, you can create the two required DNS alias records. After you've completed this section, you'll be ready to test your PrivateLink connection.

Step 1: Create first DNS alias record

This record will route traffic to your Starburst Galaxy PrivateLink cluster based on its host name.

You should be on the information page for the Private hosted zone you created in the previous section.

Step 2: Configure first record

It's time to configure your record by giving it a name and identifying where it should route to.

Example host name: erosas-privatelink-cluster-us-east-1.private.trino.galaxy.starburst.io

Example host name with domain name removed: erosas-privatelink-cluster-us-east-1
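The trimming shown in the two example lines above can be done mechanically. The sketch below strips a hosted zone domain from a cluster host name and then shows how the resulting record might be expressed as a Route 53 change batch for an alias record. The function names are hypothetical, the zone domain is inferred from the two example strings rather than stated by the tutorial, record type A is an assumption, and the Endpoint DNS name and hosted zone ID are placeholders for the values you recorded earlier.

```python
def record_name_for(host_name: str, zone_domain: str) -> str:
    """Strip the hosted zone domain from a fully qualified host name."""
    host = host_name.rstrip(".").lower()
    suffix = "." + zone_domain.strip(".").lower()
    if not host.endswith(suffix):
        raise ValueError(f"{host_name!r} is not inside zone {zone_domain!r}")
    return host[: -len(suffix)]


def alias_change_batch(fqdn: str, endpoint_dns_name: str, endpoint_zone_id: str) -> dict:
    """Build a change batch that aliases fqdn to a VPC Endpoint DNS name."""
    return {
        "Changes": [
            {
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": fqdn,
                    "Type": "A",  # assumption: IPv4 alias record
                    "AliasTarget": {
                        "HostedZoneId": endpoint_zone_id,
                        "DNSName": endpoint_dns_name,
                        "EvaluateTargetHealth": False,
                    },
                },
            }
        ]
    }


HOST = "erosas-privatelink-cluster-us-east-1.private.trino.galaxy.starburst.io"
ZONE = "private.trino.galaxy.starburst.io"  # inferred from the examples above

name = record_name_for(HOST, ZONE)
print(name)  # erosas-privatelink-cluster-us-east-1

batch = alias_change_batch(
    fqdn=f"{name}.{ZONE}",
    endpoint_dns_name="vpce-EXAMPLE.vpce-svc-EXAMPLE.us-east-1.vpce.amazonaws.com",
    endpoint_zone_id="ZEXAMPLE",
)
print(batch["Changes"][0]["ResourceRecordSet"]["Name"])
```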

Step 3: Create wildcard DNS record

This second record will ensure that any traffic redirects are routed back to your PrivateLink cluster.

us-east-1: *.aws-us-east1

us-east-2: *.aws-us-east2

us-west-1: *.aws-us-west1

us-west-2: *.aws-us-west2

ap-southeast-2: *.aws-ap-southeast2

ap-southeast-1: *.aws-ap-southeast1

ca-central-1: *.aws-ca-central1

eu-central-1: *.aws-eu-central1

eu-west-1: *.aws-eu-west1

eu-west-2: *.aws-eu-west2

af-south-1: *.aws-af-south1
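Every entry in the list above follows the same pattern: prefix the AWS region with "*.aws-" and drop the hyphen before the trailing number. The helper below, a hypothetical convenience rather than anything provided by Starburst, encodes that observed pattern. It is only known to hold for the regions listed above, so treat the list itself as authoritative.

```python
def wildcard_record_name(region: str) -> str:
    """Derive the wildcard record name for an AWS region, e.g. us-east-1."""
    prefix, sep, number = region.strip().lower().rpartition("-")
    if not sep or not prefix or not number.isdigit():
        raise ValueError(f"unexpected region format: {region!r}")
    # Drop the hyphen before the trailing number: us-east-1 -> us-east1
    return f"*.aws-{prefix}{number}"


print(wildcard_record_name("us-east-1"))       # *.aws-us-east1
print(wildcard_record_name("ap-southeast-2"))  # *.aws-ap-southeast2
```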

Background

You've completed all the necessary steps to configure AWS PrivateLink for client or tool connectivity to Starburst Galaxy. Your final task is to test the connection, using the Trino CLI as the test client.

Step 1: Run DNS lookup test

You'll need to confirm that this test returns the Subnet IPs from your VPC Endpoint. Recall that you copied those earlier in this tutorial. The command you use to run the DNS lookup test will depend on your operating system.

Windows users

nslookup <your_dns_name>

Mac/Linux users

dig <your_dns_name>
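The comparison between the lookup result and the Subnet IPs you recorded can also be scripted. The sketch below is one possible approach using only the Python standard library. The function names are hypothetical, the host name and IP addresses are placeholders, and the demonstration uses a stub resolver so that it runs without network access. Drop the resolver argument to perform a real lookup from a host inside your VPC.

```python
import socket
from typing import Callable, Iterable


def resolve_ipv4(host: str) -> set[str]:
    """Resolve a host name to its set of IPv4 addresses."""
    infos = socket.getaddrinfo(host, 443, family=socket.AF_INET, type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP
    return {info[4][0] for info in infos}


def lookup_matches(
    host: str,
    expected_ips: Iterable[str],
    resolver: Callable[[str], Iterable[str]] = resolve_ipv4,
) -> bool:
    """Return True if the host resolves only to the expected Endpoint Subnet IPs."""
    resolved = set(resolver(host))
    return bool(resolved) and resolved <= set(expected_ips)


# Placeholder values; use your cluster host name and recorded Subnet IPs
EXPECTED = ["10.0.1.25", "10.0.2.31"]


def stub_resolver(host: str) -> set[str]:
    """Stand-in for a real lookup so the example runs offline."""
    return {"10.0.1.25", "10.0.2.31"}


print(lookup_matches("cluster.example.internal", EXPECTED, resolver=stub_resolver))  # True
```

If the function returns False against a real lookup, the name is most likely resolving outside your private hosted zone, which is the same symptom you would see as unexpected addresses in the nslookup or dig output.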

Step 2: Test connection with the Trino CLI

If you haven't done so already, you can visit the Starburst Software Downloads site to get the latest version of the Trino CLI. These instructions use the trino-cli-429-e.0-executable.jar. The commands below refer to the downloaded file as trino, so either rename the file or substitute its full file name.

The command you use to connect with the Trino CLI will depend on your operating system.

Windows users

java -jar trino --server https://<your_dns_name> --user <your_galaxy_url>/accountadmin --password --debug

Mac/Linux users

./trino --server https://<your_dns_name> --user <your_galaxy_url>/accountadmin --password --debug

Step 3: Confirm connection with the Trino CLI

Let's confirm that the connection was successful by running a simple SQL command.

SHOW CATALOGS;

Tutorial complete

Congratulations! You have reached the end of this tutorial, and the end of this stage of your journey.

You're all set! Now you can use PrivateLink for client or tool connectivity to Starburst Galaxy.

Continuous learning

At Starburst, we believe in continuous learning. This tutorial provides the foundation for further training available on this platform, and you can return to it as many times as you like. Future tutorials will make use of the concepts used here.

Next steps

Starburst has lots of other tutorials to help you get up and running quickly. Each one breaks down an individual problem and guides you to a solution using a step-by-step approach to learning.

Tutorials available

Visit the Tutorials section to view the full list of tutorials and keep moving forward on your journey!
