Last Updated: 2024-04-16
Background
AWS PrivateLink provides private connectivity between virtual private clouds (VPCs), supported AWS services, and on-premises networks without exposing traffic to the public internet.
Starburst Galaxy supports AWS PrivateLink for some of its catalogs. In this tutorial, you will learn how to set up PrivateLink for Amazon Redshift.
Scope of tutorial
In this tutorial, you will learn how to configure AWS PrivateLink for Amazon Redshift.
Learning objectives
Once you've completed this tutorial, you will be able to:
Configure AWS PrivateLink for connectivity from Starburst Galaxy to your Redshift data warehouse.
Use PrivateLink to securely connect Starburst Galaxy to your Redshift data warehouse.
Prerequisites
You need a Starburst Galaxy account to complete this tutorial. Please see Starburst Galaxy: Getting started for instructions on setting up a free account.
This tutorial has a bring-your-own-storage requirement. Before proceeding with this lesson, you must already have an Amazon Redshift data warehouse set up.
About Starburst tutorials
Starburst tutorials are designed to get you up and running quickly by providing bite-sized, hands-on educational resources. Each tutorial explores a single feature or topic through a series of guided, step-by-step instructions.
As you navigate through the tutorial you should follow along using your own Starburst Galaxy account. This will help consolidate the learning process by mixing theory and practice.
Background
If you are configuring PrivateLink for the first time, you are encouraged to work with a Starburst technical resource. This individual will work with you to set up the environment needed to complete the tutorial.
Contacting your technical resource
To be assigned this resource, you should reach out to your regular Starburst account team for assistance.
Working together
Once assigned, your Starburst technical resource will work with you to set up an environment where you can complete the tutorial.
Please review the following overview of this process before beginning the tutorial.
Your responsibilities:
Obtain your Redshift endpoint.
Record the IP address(es) of your Redshift endpoint.
Allow the Starburst Galaxy AWS account principal to use the endpoint service.
Submit a Starburst Galaxy support ticket to request that they create an endpoint in the Starburst Galaxy AWS account.
Accept the endpoint connection in your AWS account.
Background
Understanding the Redshift PrivateLink architecture is important when completing the steps in this tutorial. In this section you will learn about this architecture and the way that Starburst Galaxy uses it to securely connect private clouds.
This tutorial also follows corresponding AWS documentation on the topic. It is recommended that you consult this documentation if you want to learn more about AWS PrivateLink in general.
Reference architecture
The following diagram illustrates a PrivateLink connection between the Starburst Galaxy VPC and the Amazon Redshift VPC.
Review the diagram and corresponding notes below for more information.
Once the PrivateLink configuration is complete, an endpoint is created in the Starburst Galaxy VPC (Source).
This endpoint connects to a Network Load Balancer located inside an endpoint service situated in the Redshift cluster VPC (Destination).
This establishes a private connection between Starburst Galaxy and Redshift, enabling PrivateLink functionality.
In this reference architecture, the Starburst Galaxy VPC is the source.
In this reference architecture, the Redshift VPC is the destination.
Background
Amazon Redshift can be deployed in two ways:
Redshift Serverless
Redshift Provisioned Clusters
In each case, the process of obtaining the necessary information for PrivateLink differs slightly. Despite this, the way the information is used remains very similar.
Option 1: Redshift Serverless
If you are using Redshift Serverless, you will need to obtain:
Endpoint
A minimum of 3 subnets
AWS Availability zone (AZ)
Option 2: Redshift Provisioned Clusters
If you are using Redshift Provisioned Clusters, you will need to obtain:
Endpoint
AWS Availability zone (AZ)
Select appropriate deployment option
To continue with this tutorial, select the appropriate deployment type matching your Redshift deployment.
Background
Redshift Serverless deployment is one of the two types of Redshift deployment.
In this implementation, users do not need to provision or manage the underlying infrastructure. Instead, AWS itself handles the infrastructure management. This includes autoscaling of the cluster based on workload demands.
Step 1: Sign in to AWS console
You're going to start by signing in to your AWS console.
Remember that this should be the AWS account containing the Redshift warehouse that you would like to connect using PrivateLink, so if you use multiple AWS accounts, ensure that you pick the correct one.
Sign in to your AWS account.
In the AWS console, enter Amazon Redshift in the search field.
Select Amazon Redshift from the drop-down menu.
Step 2: Select Redshift Serverless
Next, you're going to enter the Redshift Serverless section of AWS. You can get there from the left-hand navigation bar.
Expand the left-hand navigation bar.
Select Redshift Serverless.
Step 3: Select Namespace
In the Namespaces / Workgroups section, select your Namespace.
Step 4: Select Workgroup
Now that you're on the Namespace page, it's time to select your workgroup.
In the Workgroup name section, select your Workgroup.
Step 5: Copy your Redshift endpoint
Now it's time to gather some information from the AWS console. Later, you'll use this to connect your Redshift warehouse to Starburst Galaxy using PrivateLink.
The first pieces of information you need to gather are the endpoint and subnet IDs for your Redshift warehouse. Let's start by copying the Redshift endpoint, located in the General information section.
Copy the Endpoint. For example: docs-dev-redshift-serverless.594246526251.us-east-1.redshift-serverless.amazonaws.com:5439/dev
The endpoint is broken into 3 parts:
The DNS name starts from the beginning and goes up to but does not include the colon (ex. docs-dev-redshift-serverless.594246526251.us-east-1.redshift-serverless.amazonaws.com)
The Port is the second part. This is the number after the colon and before the forward slash (ex. 5439)
The database name is the characters after the forward slash (ex. dev).
Later in these instructions you'll be asked to either use or provide each of those three parts to Starburst.
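If you prefer to split the endpoint programmatically, the three parts described above can be extracted with a few lines of Python. This is a minimal sketch: the function and class names are illustrative and are not part of any Starburst or AWS tooling, and it assumes the endpoint always has the form DNS name, colon, port, forward slash, database name.

```python
from typing import NamedTuple


class RedshiftEndpoint(NamedTuple):
    """The three parts of a Redshift endpoint string (illustrative type)."""
    dns_name: str
    port: int
    database: str


def parse_redshift_endpoint(endpoint: str) -> RedshiftEndpoint:
    """Split '<dns-name>:<port>/<database>' into its three parts."""
    # Everything before the last colon is the DNS name.
    dns_name, _, rest = endpoint.strip().rpartition(":")
    # The port sits between the colon and the forward slash;
    # the database name is whatever follows the slash.
    port_text, _, database = rest.partition("/")
    if not dns_name or not port_text.isdigit() or not database:
        raise ValueError(f"unexpected endpoint format: {endpoint!r}")
    return RedshiftEndpoint(dns_name, int(port_text), database)


example = (
    "docs-dev-redshift-serverless.594246526251.us-east-1"
    ".redshift-serverless.amazonaws.com:5439/dev"
)
print(parse_redshift_endpoint(example))
```

The same helper works for a provisioned cluster endpoint, since it follows the same three-part layout.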
Step 6: Copy the subnets from your Workgroup
Now it's time to copy the subnet IDs from your Workgroup.
These are located in the Data access tab in the Network and security section.
Scroll down to the Data access tab.
Locate your Subnet.
Record the Subnet ID(s) listed there in your text editor.
Step 7: Obtain the IP addresses of your Redshift endpoint
Now that you have your Redshift endpoint recorded, you can use it to find its IP addresses. For Redshift Serverless, you will have at least two IP addresses.
You'll be using a terminal window to find the IP addresses. Again, you will be copying information into your text editor.
Copy the DNS name from your Redshift endpoint.
Open a Terminal window on your desktop.
Run one of the following commands to retrieve the IP addresses.
Note: The command you choose will depend on your operating system. Be sure to replace [redshift-dns-name] with your actual Redshift DNS name.
In Windows run the command nslookup [redshift-dns-name]
In Linux/macOS run the command dig [redshift-dns-name]
Record all IP addresses of your Redshift endpoint. In dig output they are listed in the ANSWER SECTION; in nslookup output they appear after Address or Addresses.
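As an alternative to nslookup or dig, the same lookup can be sketched in Python using only the standard library. The function name resolve_ipv4 is illustrative. The result depends on the DNS resolver of the machine where you run it, so treat it as a convenience and not as a replacement for the commands above.

```python
import socket
from typing import List


def resolve_ipv4(dns_name: str) -> List[str]:
    """Return the unique IPv4 addresses that dns_name resolves to, sorted."""
    results = socket.getaddrinfo(
        dns_name, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each result is (family, type, proto, canonname, sockaddr);
    # for IPv4, sockaddr is an (address, port) pair.
    return sorted({sockaddr[0] for *_, sockaddr in results})


# Example (replace the placeholder with your own Redshift DNS name):
# print(resolve_ipv4("[redshift-dns-name]"))
```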
Step 8: Navigate to Subnets
Now that you've obtained the IP addresses of your Redshift endpoint, you can use them to determine the availability zones. You will need this information when you create a load balancer later in this tutorial.
To do this, you're going to start by navigating to the Subnets menu, which is accessed through the VPC dashboard.
In the AWS console, select VPC.
Using the left-hand navigation menu, select Subnets.
Step 9: Determine the availability zones
Now it's time to use your Redshift subnets to determine the corresponding Availability Zones.
Using the subnets you copied out earlier, search for each subnet one at a time.
Click on the option for Subnet ID = to filter the results to your subnet.
Scroll to the right until you see the Availability Zone column.
Record the Availability Zone corresponding to the subnet.
Repeat this for each subnet listed for your Redshift Serverless workgroup.
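If you have many subnets, the console lookup above can also be done in a script. The following is a sketch that assumes you have the boto3 library installed and AWS credentials configured for the account that holds your Redshift deployment; the function names are illustrative. The first helper only reshapes a DescribeSubnets response, so it can be tried without AWS access.

```python
from typing import Dict, Iterable


def subnet_availability_zones(describe_subnets_response: dict) -> Dict[str, str]:
    """Map each subnet ID in a DescribeSubnets response to its Availability Zone."""
    return {
        subnet["SubnetId"]: subnet["AvailabilityZone"]
        for subnet in describe_subnets_response.get("Subnets", [])
    }


def lookup_subnet_zones(subnet_ids: Iterable[str]) -> Dict[str, str]:
    """Call EC2 DescribeSubnets for the given IDs (needs boto3 and credentials)."""
    import boto3  # imported here so the helper above works without boto3

    ec2 = boto3.client("ec2")
    response = ec2.describe_subnets(SubnetIds=list(subnet_ids))
    return subnet_availability_zones(response)
```

Calling lookup_subnet_zones with the subnet IDs you recorded earlier returns a dictionary you can paste into your text editor.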
Background
Redshift provisioned clusters enable users to create fully-managed data warehousing environments with customizable configurations using the AWS cloud platform.
With provisioned clusters, users have full control over cluster configuration and are responsible for managing the infrastructure, including scaling the cluster up or down based on workload demands.
Step 1: Sign in to AWS console
You're going to start by signing in to your AWS console.
Remember that this should be the AWS account containing the Redshift warehouse that you would like to connect using PrivateLink, so if you use multiple AWS accounts, ensure that you pick the correct one.
Sign in to your AWS account.
In the AWS console, enter Amazon Redshift in the search field.
Select Amazon Redshift from the drop-down menu.
Step 2: Select Provisioned clusters dashboard
Next, you're going to enter the Redshift provisioned clusters dashboard section of AWS. This can be accessed via the left-hand navigation bar.
Expand the left-hand navigation bar.
Select Provisioned clusters dashboard.
Confirm that you have entered the Provisioned clusters dashboard.
Step 3: Select Clusters menu
Now it's time to view specific provisioned clusters. AWS includes a specific section for this in the left-hand navigation menu.
Using the left-hand navigation menu, select Clusters.
Step 4: Select your cluster
Next, it's time to select your cluster from the Clusters menu. If you have multiple clusters, make sure to select the correct one.
Select your Cluster from the list.
Step 5: Record cluster endpoint
Now it's time to record your cluster's endpoint. Later, this will be used to connect the Redshift cluster to Starburst Galaxy.
Copy your Redshift Endpoint. For example: docs-dev-redshift.cniblknnqbe8.us-east-1.redshift.amazonaws.com:5439/dev
The endpoint is broken into 3 parts:
The DNS name starts from the beginning and goes up to but does not include the colon (ex. docs-dev-redshift.cniblknnqbe8.us-east-1.redshift.amazonaws.com)
The Port is the second part. This is the number after the colon and before the forward slash (ex. 5439)
The database name is the characters after the forward slash (ex. dev).
Later in these instructions you'll be asked to either use or provide each of those three parts to Starburst.
Step 6: Record cluster availability zone
Now it's time to copy your cluster's availability zone and paste it into a text editor. To do this, you're going to access the Network and security settings section located under the Properties tab.
Scroll down below the General information section.
Select the Properties tab.
In the Network and security settings section, locate your Availability Zone.
Record your Availability Zone for later use.
Step 7: Obtain the IP address of your Redshift endpoint
Now that you have your Redshift endpoint recorded, you can use it to find its IP address.
You'll be using a terminal window to do so. Again, you will be copying information into your text editor.
Copy the DNS name from your Redshift endpoint.
Open a Terminal window on your desktop.
Run one of the following commands to retrieve the IP address.
Note: The command you choose will depend on your operating system. Be sure to replace [redshift-dns-name] with your actual Redshift DNS name.
In Windows run the command nslookup [redshift-dns-name]
In Linux/macOS run the command dig [redshift-dns-name]
Record the IP address of your Redshift endpoint.
Background
Now it's time to set up a target group. A target group is responsible for routing incoming traffic from a load balancer to registered targets, which are typically instances, containers, or IP addresses.
The target group you create in this tutorial will route traffic to the IP address of your Redshift endpoint.
Step 1: Start the target group wizard
The AWS console includes a target group creation wizard. This allows you to quickly and easily create target groups. It is accessed using the EC2 dashboard.
Navigate to the EC2 dashboard in the AWS console.
Note: This can be done by searching for EC2 and clicking EC2 in the results list.
Using the left-hand navigation menu, expand Load Balancing and click Target Groups.
Click the Create target group button on the right.
Step 2: Provide a target group name
Now it's time to configure your new target group. AWS will ask you to select a target type and provide a meaningful name.
Select IP addresses as the target type.
Provide a meaningful Target group name.
Step 3: Configure target group
Next, you're going to configure your target group for use with your Redshift cluster. To do this, you're going to use some of the details that you copied into your text editor earlier in this tutorial.
From the Protocol dropdown, select TCP.
Set the Port to 5439.
Select IPv4.
Select the VPC for your Redshift cluster.
Using the Health check protocol dropdown menu, select TCP.
Click Next.
Step 4: Complete configuration process
For the final step, you're going to finish the configuration process and create the target group.
In the IPv4 address input field, enter the first IP address of your Redshift endpoint.
If you have additional IPv4 addresses, click the Add IPv4 address button and enter the next address of your Redshift endpoint. Repeat this until all of them are added.
In the Ports section, click the Include as pending below button.
Confirm that your Redshift cluster IP is now listed under Targets and that its Health status is listed as Pending.
Click Create target group.
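For readers who script their AWS setup, the wizard settings above correspond roughly to two Elastic Load Balancing API calls: one to create the target group and one to register the IP addresses. The sketch below assumes boto3 and configured AWS credentials; the function names and the default port of 5439 are illustrative, and the request builders are kept separate so you can inspect the parameters before sending anything.

```python
import ipaddress
from typing import Iterable


def target_group_request(name: str, vpc_id: str, port: int = 5439) -> dict:
    """Parameters mirroring the wizard: IP targets, TCP, IPv4, TCP health check."""
    return {
        "Name": name,
        "Protocol": "TCP",
        "Port": port,
        "VpcId": vpc_id,
        "TargetType": "ip",
        "IpAddressType": "ipv4",
        "HealthCheckProtocol": "TCP",
    }


def register_targets_request(
    target_group_arn: str, ip_addresses: Iterable[str], port: int = 5439
) -> dict:
    """One target per Redshift endpoint IP; raises ValueError on a malformed IP."""
    targets = [
        {"Id": str(ipaddress.IPv4Address(ip)), "Port": port} for ip in ip_addresses
    ]
    return {"TargetGroupArn": target_group_arn, "Targets": targets}


def create_target_group(name, vpc_id, ip_addresses, port=5439) -> str:
    """Send both requests with boto3 and return the new target group ARN."""
    import boto3  # imported lazily; the builders above do not need it

    elbv2 = boto3.client("elbv2")
    created = elbv2.create_target_group(**target_group_request(name, vpc_id, port))
    arn = created["TargetGroups"][0]["TargetGroupArn"]
    elbv2.register_targets(**register_targets_request(arn, ip_addresses, port))
    return arn
```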
Background
Now it's time to create a network load balancer. In AWS, a Network Load Balancer (NLB) is a service that automatically distributes incoming network traffic across multiple targets based on IP protocol data. These targets include Amazon EC2 instances, containers, and IP addresses. Load balancers can be configured across either a single AWS Availability Zone or multiple Availability Zones.
After configuring PrivateLink, an endpoint in the Starburst Galaxy VPC will connect to your Network Load Balancer using a service located in the Redshift cluster VPC.
Step 1: Start the load balancer wizard
Just like target groups, AWS includes a load balancer wizard to help make the creation of load balancers easy. Again, this is located in the EC2 dashboard.
Ensure that you are in the EC2 dashboard.
Using the left-hand navigation menu, select Load Balancers.
Click the Create load balancer button on the right side of the dashboard.
Step 2: Select load balancer type
AWS load balancers come in several different types. These include Application Load Balancers, Network Load Balancers, and Gateway Load Balancers.
For this tutorial, you're going to select the Network Load Balancer.
Select the Network Load Balancer by clicking the corresponding Create button.
Step 3: Name your load balancer
It's time to start configuring your new load balancer, starting with a name.
Enter your Load balancer name in the field provided.
Step 4: Configure the load balancer
Next, you're going to configure your load balancer for use with your Redshift cluster.
In the Scheme section, select Internal.
In the IP address type field, select IPv4.
In the VPC section, select your Redshift VPC.
Step 5: Select the availability zone and subnet(s)
Now it's time to select AWS availability zones (AZ) for your load balancer. These will be the same AZs that you recorded for your Redshift deployment earlier in this tutorial.
Select each Availability Zone corresponding to your Redshift deployment.
For each Availability Zone, select the corresponding Subnet. Recall that you recorded them earlier in the tutorial. An AZ can have more than one subnet, so make sure that you select the right ones.
Leave the Private IPv4 address field unchanged.
Step 6: Configure security group
Select a Security Group that has an inbound rule allowing the IP CIDR 172.16.0.0/16 for your Redshift port.
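If you are unsure whether an existing security group satisfies this requirement, you can check its inbound rules against the CIDR and port. The sketch below works on a single rule in the shape that the EC2 API returns under IpPermissions. The function name is illustrative, and the check only considers IPv4 CIDR ranges; rules that reference other security groups or prefix lists are not evaluated.

```python
import ipaddress


def rule_allows(rule: dict, cidr: str, port: int) -> bool:
    """Return True if an inbound rule covers the whole CIDR on the given TCP port."""
    protocol = rule.get("IpProtocol")
    if protocol == "-1":
        port_ok = True  # "-1" means all protocols and all ports
    elif protocol == "tcp":
        port_ok = rule.get("FromPort", 0) <= port <= rule.get("ToPort", 65535)
    else:
        port_ok = False
    if not port_ok:
        return False
    wanted = ipaddress.ip_network(cidr)
    # The rule matches if any of its ranges contains the wanted network.
    return any(
        wanted.subnet_of(ipaddress.ip_network(ip_range["CidrIp"]))
        for ip_range in rule.get("IpRanges", [])
    )


example_rule = {
    "IpProtocol": "tcp",
    "FromPort": 5439,
    "ToPort": 5439,
    "IpRanges": [{"CidrIp": "172.16.0.0/16"}],
}
print(rule_allows(example_rule, "172.16.0.0/16", 5439))
```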
Step 7: Configure port and target group
Now it's time to finish configuring your load balancer by setting its listener port and target group.
In the Protocol menu, select TCP.
In the Port field, enter 5439.
Using the Forward to drop-down menu, select the target group you just created.
Click Create load balancer.
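The load balancer settings chosen in Steps 4 through 7 can be summarized as API parameters. This is a sketch under the assumption that you use boto3; the function names are illustrative. Note that the console creates the listener together with the load balancer, whereas the API uses a separate call for each.

```python
from typing import Iterable


def load_balancer_request(
    name: str, subnet_ids: Iterable[str], security_group_ids: Iterable[str]
) -> dict:
    """Internal, IPv4 Network Load Balancer in the Redshift subnets."""
    return {
        "Name": name,
        "Type": "network",
        "Scheme": "internal",
        "IpAddressType": "ipv4",
        "Subnets": list(subnet_ids),
        "SecurityGroups": list(security_group_ids),
    }


def listener_request(
    load_balancer_arn: str, target_group_arn: str, port: int = 5439
) -> dict:
    """TCP listener that forwards to the target group created earlier."""
    return {
        "LoadBalancerArn": load_balancer_arn,
        "Protocol": "TCP",
        "Port": port,
        "DefaultActions": [{"Type": "forward", "TargetGroupArn": target_group_arn}],
    }


def create_load_balancer(name, subnet_ids, security_group_ids, target_group_arn) -> str:
    """Send both requests with boto3 and return the load balancer ARN."""
    import boto3  # imported lazily; the builders above do not need it

    elbv2 = boto3.client("elbv2")
    created = elbv2.create_load_balancer(
        **load_balancer_request(name, subnet_ids, security_group_ids)
    )
    arn = created["LoadBalancers"][0]["LoadBalancerArn"]
    elbv2.create_listener(**listener_request(arn, target_group_arn))
    return arn
```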
Step 8: Wait for load balancer to activate
That's it! Your load balancer is now being created. This process takes between three and five minutes.
Wait for the State to change from Provisioning to Active before moving to the next step.
Click the Refresh button to view status updates.
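Instead of clicking Refresh repeatedly, a script can poll for the state change. The helper below is a generic sketch: wait_for_state is an illustrative name, and get_state is any function you supply that returns the current state as a string. The commented example assumes boto3, where the API reports the state in lowercase (for instance provisioning and active).

```python
import time
from typing import Callable


def wait_for_state(
    get_state: Callable[[], str],
    target: str,
    interval_seconds: float = 15.0,
    timeout_seconds: float = 600.0,
) -> str:
    """Call get_state until it returns target, or raise TimeoutError."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        state = get_state()
        if state == target:
            return state
        if time.monotonic() >= deadline:
            raise TimeoutError(f"still {state!r} after {timeout_seconds} seconds")
        time.sleep(interval_seconds)


# Example with boto3 (assumes credentials and a load balancer ARN in lb_arn):
# import boto3
# elbv2 = boto3.client("elbv2")
# wait_for_state(
#     lambda: elbv2.describe_load_balancers(LoadBalancerArns=[lb_arn])
#     ["LoadBalancers"][0]["State"]["Code"],
#     "active",
# )
```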
Background
Now it's time to create an endpoint service.
In the context of AWS PrivateLink, an endpoint service allows you to expose services running in your VPC to other accounts within the same AWS region using a private connection.
Step 1: Start the endpoint service wizard
Like target groups and load balancers, AWS allows you to create an endpoint service using a wizard.
Navigate to the VPC dashboard in the AWS console.
Note: This can be done by searching for VPC in the AWS search bar and selecting VPC in the results list.
From the left-hand navigation menu, expand the Virtual private cloud menu, and select Endpoint services.
Click the Create endpoint service button on the right.
Step 2: Provide an endpoint service name
Begin by naming your endpoint service and choosing the load balancer type.
Enter your endpoint service Name in the field provided.
In the Load balancer type section, select Network.
Step 3: Configure endpoint service
Now it's time to configure your endpoint service. You're going to make sure that it connects with your network load balancer and uses the correct IP address.
Select the network load balancer that you created in this tutorial.
In the Supported IP address type section, select IPv4.
Click the Create button.
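The endpoint service settings above map onto a single EC2 API call. The following sketch assumes boto3; the function names are illustrative. It sets AcceptanceRequired to True on the assumption that you want to approve connection requests manually, which matches the acceptance step later in this tutorial.

```python
def endpoint_service_request(network_load_balancer_arn: str) -> dict:
    """Endpoint service backed by one Network Load Balancer, IPv4 only."""
    return {
        "NetworkLoadBalancerArns": [network_load_balancer_arn],
        "SupportedIpAddressTypes": ["ipv4"],
        # Assumption: connection requests are accepted manually.
        "AcceptanceRequired": True,
    }


def create_endpoint_service(network_load_balancer_arn: str) -> dict:
    """Create the service with boto3 and return its ID and service name."""
    import boto3  # imported lazily; the builder above does not need it

    ec2 = boto3.client("ec2")
    created = ec2.create_vpc_endpoint_service_configuration(
        **endpoint_service_request(network_load_balancer_arn)
    )
    config = created["ServiceConfiguration"]
    return {"ServiceId": config["ServiceId"], "ServiceName": config["ServiceName"]}
```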
Background
Time to switch gears. You've completed all of the steps that you can do on your own. Now it's time to work with the Starburst support team to finish the last steps.
Step 1: Enter the Starburst Galaxy ARN
In the last section of this tutorial, you created your endpoint service. At the end of that process, you were directed to a page that displays the details of that service.
You're going to use this section to input the Starburst Galaxy Amazon Resource Name (ARN).
Select the Allow principals tab under the Details box.
Select the Allow principals button.
Enter the following ARN in the ARN field: arn:aws:iam::179619298502:root
Select the Allow principals button.
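The same permission can be granted through the EC2 API. This sketch assumes boto3 and that you know the ID of your endpoint service; the function names are illustrative. The format check is a simple guard against paste errors and only accepts account root ARNs like the one shown above.

```python
import re

STARBURST_GALAXY_PRINCIPAL = "arn:aws:iam::179619298502:root"


def allow_principal_request(
    service_id: str, principal_arn: str = STARBURST_GALAXY_PRINCIPAL
) -> dict:
    """Parameters that add one allowed principal to an endpoint service."""
    # Guard against paste errors: expect an account root ARN with a 12-digit ID.
    if not re.fullmatch(r"arn:aws:iam::\d{12}:root", principal_arn):
        raise ValueError(f"unexpected principal ARN: {principal_arn!r}")
    return {"ServiceId": service_id, "AddAllowedPrincipals": [principal_arn]}


def allow_principal(service_id: str) -> None:
    """Send the request with boto3 (needs credentials for your AWS account)."""
    import boto3  # imported lazily; the builder above does not need it

    ec2 = boto3.client("ec2")
    ec2.modify_vpc_endpoint_service_permissions(**allow_principal_request(service_id))
```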
Step 2: Record Service name
Next, you will locate and copy the service name for your endpoint service. The Starburst support team will use this information to create the endpoint in Starburst Galaxy.
Scroll up, and copy the Service name.
Step 3: Open support ticket
You are going to use the automated assistant in Starburst Galaxy to open a support ticket and provide support with the Service name that you just copied. You will also need to provide the port your database is listening on and your preferred Starburst Galaxy PrivateLink configuration name.
Log in to Starburst Galaxy.
Click the support icon located at the bottom right of the screen.
Select Chat with technical support.
Select Submit a Support Ticket.
The automated assistant will ask you to provide your email address, first name, and last name.
When you receive the prompt to describe your issue, note that you would like support to create a private endpoint connection for you. Be sure to include the Service name you just copied, the port your database is listening on (ex. 5439), the Redshift database name (ex. dev), and your preferred Starburst Galaxy PrivateLink connection name.
Wait for Starburst support to confirm that they have created the Endpoint in Starburst Galaxy. This typically takes 24 to 48 hours.
Step 4: Select the Starburst Galaxy endpoint
Do not begin this step until you receive confirmation from Starburst support that the Starburst Galaxy endpoint has been created successfully.
Scroll down, and select the Endpoint connections tab.
Wait to see the connection listed. Note: You may need to click the Refresh button.
Select the endpoint from the list.
Step 5: Accept the endpoint connection request
Now that you've selected the Starburst Galaxy endpoint, it's time to accept the connection request.
Select the Actions drop-down menu.
Select Accept endpoint connection request.
Manually enter accept in the field.
Click the Accept button.
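Steps 4 and 5 can also be scripted. The sketch below assumes boto3; the function names are illustrative. It filters the connection list for requests that are still pending acceptance and, optionally, for a specific owner account. Using the account ID from the ARN you entered earlier as the owner is an assumption based on the endpoint being created in the Starburst Galaxy AWS account.

```python
from typing import List, Optional


def pending_endpoint_ids(
    describe_connections_response: dict, owner: Optional[str] = None
) -> List[str]:
    """Return endpoint IDs that are awaiting acceptance, optionally by owner."""
    return [
        connection["VpcEndpointId"]
        for connection in describe_connections_response.get("VpcEndpointConnections", [])
        if connection.get("VpcEndpointState") == "pendingAcceptance"
        and (owner is None or connection.get("VpcEndpointOwner") == owner)
    ]


def accept_pending_connections(service_id: str, owner: Optional[str] = None) -> List[str]:
    """Accept all pending requests for the service with boto3; return their IDs."""
    import boto3  # imported lazily; the helper above does not need it

    ec2 = boto3.client("ec2")
    response = ec2.describe_vpc_endpoint_connections(
        Filters=[{"Name": "service-id", "Values": [service_id]}]
    )
    endpoint_ids = pending_endpoint_ids(response, owner)
    if endpoint_ids:
        ec2.accept_vpc_endpoint_connections(
            ServiceId=service_id, VpcEndpointIds=endpoint_ids
        )
    return endpoint_ids
```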
Step 6: Confirm endpoint connection
That's it. The connection is now being created. This process takes between 1 and 3 minutes to complete.
When this process is complete, you are finished and ready to start using PrivateLink.
Wait for the State to change from Pending to Available.
Click the Refresh button to view status updates.
Tutorial complete
Congratulations! You have reached the end of this tutorial, and the end of this stage of your journey.
You're all set! Now you can use PrivateLink to configure access to data in your Redshift deployment.
Continuous learning
At Starburst, we believe in continuous learning. This tutorial provides the foundation for further training available on this platform, and you can return to it as many times as you like. Future tutorials will make use of the concepts used here.
Next steps
Starburst has lots of other tutorials to help you get up and running quickly. Each one breaks down an individual problem and guides you to a solution using a step-by-step approach to learning.
Tutorials available
Visit the Tutorials section to view the full list of tutorials and keep moving forward on your journey!