Last Updated: 2024-09-18
Starburst Galaxy uses clients and drivers to connect to a variety of applications, tools, and technologies. Organizations typically use these tools to meet their third-party integration needs.
Clients and drivers are also useful because they help abstract and simplify sometimes complicated underlying data architecture and logic. Overall, the use of clients and drivers expands the use case for Starburst Galaxy.
This tutorial will show you how to connect a Trino object instance to a Starburst Galaxy cluster using the trino-js-client driver. This will allow you to access all of the data configured inside the cluster's catalogs. The trino-js-client is a Trino client written in TypeScript operating in a Node.js environment using a JavaScript run-time environment.
Once you've completed this tutorial, you will be able to:
This tutorial is highly technical in nature.Because of this, it is recommended that you have a basic understanding of JavaScript, including its data structures like objects and arrays, before proceeding.
The project is built in Node.js.For this reason, you should understand both the relationship between Node.js and JavaScript and the process required to create and run projects in this environment before beginning.
Finally, you will need some knowledge and familiarity regarding package management systems, such as npm
. Prior familiarity with this topic will make it easier to install and manage the necessary dependencies needed to successfully complete this project.
Node.js is a JavaScript runtime environment that allows you to run JavaScript or TypeScript code outside of a browser environment. It is designed to make web and mobile development accessible. In this tutorial, you will be using Node.js to run and access the trino-js-client. This section will help you get your environment set up.
Let's begin by downloading an IDE of your choice. This will allow you to run your project locally. There are many popular IDEs, including Visual Studio Code or IntelliJ IDEA.
This step will ensure that Node.js is installed correctly on your local machine. To do this, you're going to use code from the Node.js website to download the required packages.
# installs nvm (Node Version Manager)
curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.40.0/install.sh | bash
# download and install Node.js (you may need to restart the terminal)
nvm install 20
# verifies the right Node.js version is in the environment
node -v # should print `v20.17.0`
# verifies the right npm version is in the environment
npm -v # should print `10.8.2`
Your terminal should return node
and npm
version numbers to ensure that your download is successful.
Next, you need to initialize a new Node project and create a JavaScript file that will allow you to interact with Trino.
If you haven't created a folder for your project to live, please do so now.
cd path/to/your/project-directory
npm init -y
This command creates a package.json
file that stores your project's basic information and tracks any dependencies you add.
.js
file to run logicNext, you need to create a .js
file that will allow you to run all of your connection and data retrieval logic.
index.js
in the root of your project directory.Now it's time to install the trino-js-client in your environment. You can use npm
or yarn
via the command line to download the client driver into your project.
It's important to note that yarn
requires separate download scripts. For more detailed information on how to download yarn
, refer to yarn's official docs.
Command for npm
:
npm install trino-client
Command for yarn
:
yarn add trino-client
Note: You can ignore any vulnerability warnings.
dotenv
Now it's time to download the dotenv
package. This package will help you store and reference confidential information.
npm
script in your terminal command line to install the package:npm install dotenv
Note: You can ignore any vulnerability warnings.
package.json
By default, Node.js treats files with .js
extension as an older module system. This system uses require()
statements to import modules. However, more modern versions of JavaScript will use import
statements instead. To make sure your file is up to date with the ES6 version of JavaScript, you need to update your package.json
file.
package.json
file, located in the root of your project directory.package.json
file:"type": "module"
Your package.json
file should now look like this:
{
"name": your project name,
"version": your project version,
"type": "module",
...
}
Now it's time to import packages. These allow you to utilize the downloaded dependencies within your project. In this case, you'll want to import the dotenv
and Trino client package.
package.json
file to ensure that the client and dotenv
have been downloaded properly. You should see the following code in your package.json file:"dependencies": {
"dotenv": "^16.4.5",
"trino-client": "^0.2.3"
}
index.js
file to properly import your Trino client and .env
packages.import {BasicAuth, Trino} from ‘trino-client';
import dotenv from ‘dotenv';
These import statements offer a global reference point to your dependencies that can be used throughout your project.
You will use the trino-js-client to connect to your Starburst Galaxy catalogs. In this section, you'll instantiate the Trino client object, enabling you to query and retrieve your data efficiently.
Let's take a look at the example given in the trino-js-client documentation:
const trino: Trino = Trino.create({
server: 'http://localhost:8080',
catalog: 'tpcds',
schema: 'sf100000',
auth: new BasicAuth('test'),
});
Notice how the server
parameter specifies the location of your catalogs. In this case, you will be replacing this with your Galaxy server URL. Additionally, the catalog
and schema
define the name and structure of the data source you're querying. Furthermore, the auth
parameters, which will authenticate your access to the server itself, will be replaced with your own Galaxy credentials.
Let's dive into setting up the client in your environment! These next few steps will outline how to find the necessary information to replace these placeholders with your own parameters.
.env
file to store username and passwordBefore we begin, you need to modify your environment variables to store your username and password. When dealing with sensitive credentials, it's important to add a layer of security in your project's root directory.
.env
file to store your username and password, authentication parameters needed for Galaxy connection. GALAXY_USERNAME='yourusername'
GALAXY_PASSWORD='yourpassword'
.env
to .gitignore
Now it's time to set up your .gitignore. Adding your .env
file to a .gitignore
lets you protect your sensitive data when using Github.
.env
file is configured with your environment variables, create a .gitignore
file and paste the following line of code into the file: # local env files
.env
Please refer to this article for more information, if needed.
Your Starburst server URL can be found in the Partner Connect section of your Starburst Galaxy account.
Note: It is important to make sure that you are selecting the correct cluster containing the catalogs you want to use for this tutorial.
index.js
You will need to reference hidden environment variables in your index.js
file. To do this, you'll instantiate your imported dotenv
module using the config
method.
index.js
file below the two import
statements:dotenv.config()
To access your environment variables, you're going to use the process.env
object in Node.js. To authenticate using your credentials, you'll use the BasicAuth
method from the trino-client.
index.js
file:const auth = new BasicAuth(process.env.GALAXY_USERNAME, process.env.GALAXY_PASSWORD)
Now it's time to populate the Trino object. This will allow you to establish a working connection to your Starburst Galaxy cluster and access the data that you will query. Recall the example Trino object from the trino-js-client documentation:
const trino: Trino = Trino.create({
server: 'http://localhost:8080',
catalog: 'tpcds',
schema: 'sf100000',
auth: new BasicAuth('test'),
});
You will now reconfigure this object to add your own credentials obtained from previous steps.
index.js
file.server
key with the Galaxy server URL you copied in an earlier step. Be sure to include the https://
at the beginning of the URL.auth
key with the previously configured auth
variable. const trino = Trino.create({
server: "https://yourusername-yourclustername.trino.galaxy.starburst.io",
catalog: "catalog name",
schema: ‘schema name',
auth: auth
});
Now it's time to manually configure the SSL options. This is the last step needed to instantiate the object. This important and crucial step ensures that your application will only accept connection requests from servers that have valid SSL authorization.
sslOptions
and set the parameter rejectUnauthorized
to true
. Copy and paste this line of code into your index.js
file, directly below the declaration of the auth
constant: const sslOptions = {
rejectUnauthorized: true
}
After this, you will set the ssl
key to the sslOptions
object you just created by including the key-value pair ssl: sslOptions
. This will properly ensure that your code remains secure and our data is protected.
auth: auth
. Don't forget to add a comma after the auth: auth
key-value pair.ssl: sslOptions
The completed index.js file should look similar to the following:
Great job! You have successfully connected to Starburst Galaxy and Trino. The next steps will involve testing your connection and querying your data.
So far, you have populated your Trino object, allowing it to accept the correct parameters. You have also set up your coding environment to host this object and created a pathway that allows it to store the logic needed to query your data.
Next, it's time to test your connection and query your data using an index file.
executeQuery
functionAt the start of this tutorial, we encouraged you to review the trino-js-client documentation. This documentation outlines the executeQuery()
function. You'll use this function to retrieve and manage your data.
executeQuery
function into your index.js
file.const executeQuery = async (query) => {
const iter = await trino.query(query);
const data = await iter
.map(results => results.data ?? [])
.fold([], (row, acc) => [...acc, ...row]);
console.log("testing return of query on server", data)
return data;
}
executeQuery
functionPassing a query string into executeQuery()
will execute the query via the trino.query()
method. The results are processed within an asynchronous loop and mapped to extract the data and fold (or flatten) them into a single array.
You will be using a basic query to first check to see if your object has access to the data stored in your Galaxy catalogs.
index.js
file to pass SHOW CATALOGS
as a string into the execute function. const testQuery = 'SHOW CATALOGS'
console.log(executeQuery(testQuery));
node index.js
It's time to execute more complex SQL queries. To do this, you're going to use a modified version of the previous step, replacing the testQuery
string with a modified one that retrieves data from the tpcds
catalog.
node index.js
command.const testQuery = `
SELECT
c.custkey AS customer_id,
c.name AS customer_name,
SUM(o.totalprice) AS total_order_amount
FROM
tpch.sf1.orders o
JOIN
tpch.sf1.customer c
ON
o.custkey = c.custkey
GROUP BY
c.custkey,
c.name
ORDER BY
total_order_amount DESC
`
console.log(executeQuery(testQuery));
Your terminal should log the return values, similar to the following:
You have now successfully run a SQL query in Node.js and confirmed that your Trino client is connected and properly referencing your data. This basic example demonstrates the power of Trino and its ability to integrate with a variety of different tools. You've officially laid the groundwork for any future projects or applications you wish to build using Node.js, Trino, and Starburst Galaxy.
Congratulations on finishing this tutorial! You should now have a comprehensive understanding of how to configure a Trino client within a Node.js environment to access data in Starburst Galaxy.
At Starburst, we believe in continuous learning. This tutorial provides the foundation for further training available on this platform, and you can return to it as many times as you like. Future tutorials will make use of the concepts used here.
Starburst has lots of other tutorials to help you get up and running quickly. Each one breaks down an individual problem and guides you to a solution using a step-by-step approach to learning.
Visit the Tutorials section to view the full list of tutorials and keep moving forward on your journey!