Last Updated: 2023-12-15
Starburst Galaxy includes a universal search feature that improves data discoverability across catalogs, schemas, tables, views, columns, data products, tags, owners, and contacts. This added transparency is backed up by role-based access control, giving full visibility to users with the necessary privileges while restricting access for everyone else.
This helps break down organizational knowledge silos, so data consumers can find, query, and analyze datasets more easily. It also assists data engineers and platform administrators by providing a global view across the data pipeline.
You need a Starburst Galaxy account to complete this tutorial. Please be sure to complete the tutorial titled Starburst Galaxy: Getting started before attempting this tutorial.
Upon successful completion of this tutorial, you will be able to:
Starburst tutorials are designed to get you up and running quickly by providing bite-sized, hands-on educational resources. Each tutorial explores a single feature or topic through a series of guided, step-by-step instructions.
As you navigate through the tutorial you should follow along using your own Starburst Galaxy account. This will help consolidate the learning process by mixing theory and practice.
The data engineers at Chryse Corp. aim to enhance the discoverability of datasets used by data analysts. To achieve this, they plan to add both tags and metadata. This will make it easier for data analysts to find and access the relevant data, improving overall data exploration and analysis processes.
You'll help them start this process by adding tags and metadata to their Starburst Galaxy datasets, focusing on the astronauts and missions tables in the demo schema of the sample catalog. Then, you'll use Starburst Galaxy's search features to find these datasets.
Universal search uses metadata to create a searchable index of different data assets across all catalogs and clusters connected with your account. This synopsis is meant as a jumping-off point for further data discovery and querying activities.
Notably, this universality extends to assets held across different clouds, whether it be AWS, Azure, or GCP. Although you can see the location of this data, you cannot transfer data across clouds and the results are based on metadata rather than the data itself.
Universal search is continuously updated on a streaming basis, so any changes you make within Starburst Galaxy take effect immediately. Changes made outside of Starburst Galaxy are picked up less frequently, by a batch process that runs approximately once every 24 hours.
The following video walks through the first two sections in this tutorial. It shows you how to create tags and add metadata to tables and columns.
You can choose to watch the video and follow along using your own account. Alternatively, if you prefer, you can skip the video and proceed directly to the step-by-step instructions provided later in the tutorial.
In Starburst Galaxy, you can add tags to data entities, including catalogs, schemas, tables, views, columns, data products, tags, owners, or contacts.
Universal search works in tandem with this, allowing users to filter their results based on the types of tags involved.
In this section of the tutorial, you'll create a set of tags to help data consumers find data. Specifically, you'll use the astronaut dataset to create a missions tag and two additional tags nested under missions, called personnel and info.
Sign into Starburst Galaxy in the usual way. If you have not already set up an account, you can do that here.
Only the data entity owner can add metadata to data entities. In this tutorial, you'll add metadata from the accountadmin role.
In Starburst Galaxy, tags are part of the Access control menu.
In Starburst Galaxy, you can create tags nested within other tags. This is exactly what you're going to do in this tutorial.
You'll begin by creating a top-level tag called missions. It will help data consumers identify data entities that contain mission information. Afterwards, you'll create two other tags nested inside the missions tag.
Now it's time to explore how to create nested tags.
You're going to start by creating a personnel tag nested under the missions tag that you created in the previous step.
Now it's time to create your second nested tag, info. This tag identifies columns that hold general mission information.
You've created three tags, but you can only see the missions tag listed in the tags section.
That's because missions is the only top-level tag you created. The other two tags are nested inside it.
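The relationship between top-level and nested tags can be sketched in a few lines of Python. This is only an illustrative model, assuming nested tags are written with dotted names such as missions.personnel (the form this tutorial uses later). The helper functions `top_level` and `children` are hypothetical and are not part of Starburst Galaxy.

```python
# Illustrative model only: tags as dotted names, not a Starburst Galaxy API.
tags = ["missions", "missions.personnel", "missions.info"]


def top_level(all_tags):
    """Return tags that are not nested under another tag."""
    return [t for t in all_tags if "." not in t]


def children(parent, all_tags):
    """Return tags nested directly under `parent`."""
    prefix = parent + "."
    return [
        t for t in all_tags
        if t.startswith(prefix) and "." not in t[len(prefix):]
    ]


print(top_level(tags))             # only the top-level tag is listed
print(children("missions", tags))  # the two nested tags
```

In this model, listing the top level shows only missions, which mirrors what the tags section displays.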
Universal search works by using metadata, but not all of your tables and columns have metadata from the outset. Luckily, Starburst Galaxy allows you to add metadata at any point to columns and tables.
In this section of the tutorial, you'll add metadata to the astronauts table and several of its columns. Later in this tutorial, you'll use this metadata with universal search.
To add metadata to a table, you need to select the table in the catalog explorer.
Remember that Starburst Galaxy uses the catalog.schema.table hierarchy. You're going to navigate down that hierarchy until you find the astronauts table.
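As a small illustration of the catalog.schema.table hierarchy, the sketch below splits a fully qualified name into its three levels. The `TableName` type and `parse_table_name` helper are hypothetical names introduced for this example, and the sketch assumes simple identifiers with no quoting or embedded dots.

```python
from typing import NamedTuple


class TableName(NamedTuple):
    """The three levels of the catalog.schema.table hierarchy."""
    catalog: str
    schema: str
    table: str


def parse_table_name(qualified: str) -> TableName:
    """Split a simple, unquoted catalog.schema.table name into its parts."""
    parts = qualified.split(".")
    if len(parts) != 3:
        raise ValueError(f"expected catalog.schema.table, got {qualified!r}")
    return TableName(*parts)


name = parse_table_name("sample.demo.astronauts")
print(name.catalog, name.schema, name.table)
```

Reading the parts from left to right gives the same order you follow in the catalog explorer: catalog first, then schema, then table.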
Expand the sample catalog, then the demo schema, and select the astronauts table.
Starburst Galaxy displays important basic information about the astronauts table.
You can access additional information about the table's metadata by expanding these details.
Starburst Galaxy shows you several metadata fields that you can edit. Each of these has a pencil icon next to it, allowing you to update or add additional metadata.
You're going to begin by adding to the table description, which is currently empty.
Now you're going to add additional metadata to the astronauts table by adding a tag.
Just like before, you can do this by selecting the corresponding pencil icon, this time in the Tags row.
You've added two types of metadata.
Now you're going to do the same thing by editing the Contacts field.
Now it's time to pivot towards looking at columns.
Starburst Galaxy lists each of the columns in the table, making it easy to add metadata at the column level. For this tutorial, you're going to add metadata to specific columns in the astronauts table.
Notice that each column in this table already has one tag listed. This is because you added the missions.personnel tag to the whole table, and each column inside the table has inherited it.
Let's add a tag to the mission_number column. To do this, select the + icon for the mission_number column.
You can add descriptions to columns just like you did with tables.
Let's add a description to the mission_number column to test it out. To do this, select the pencil icon for mission_number.
Now that you have added tags to the astronauts table and the columns inside it, it's time to explore how Starburst Galaxy reports tag usage.
You're going to start by looking at tags at the table level.
Notice that the nested tags missions.info and missions.personnel are listed as being in use, denoted by the 1.
Even though each column in the astronauts table inherited the missions.personnel tag, the tag is counted as in use only once because it was added to a single data entity.
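The difference between inherited tags and counted usage can be modeled with a short Python sketch. This is an assumption-laden illustration, not how Starburst Galaxy stores tags: the `direct_tags` mapping and `effective_tags` helper are hypothetical, the name column is used only as an example of an untagged column, and the sketch assumes missions.info is the tag that was added to mission_number.

```python
from collections import Counter

# Hypothetical in-memory model: tags applied DIRECTLY to each data entity.
direct_tags = {
    "sample.demo.astronauts": {"missions.personnel"},
    # Assumption: missions.info is the tag added to the mission_number column.
    "sample.demo.astronauts.mission_number": {"missions.info"},
}


def effective_tags(entity):
    """Return direct tags plus, for a column, tags inherited from its table."""
    tags = set(direct_tags.get(entity, set()))
    if entity.count(".") == 3:  # catalog.schema.table.column
        table = entity.rsplit(".", 1)[0]
        tags |= direct_tags.get(table, set())
    return tags


# Usage counts only direct assignments, so inheritance does not inflate them.
usage = Counter(tag for tags in direct_tags.values() for tag in tags)

print(sorted(effective_tags("sample.demo.astronauts.mission_number")))
print(sorted(effective_tags("sample.demo.astronauts.name")))
print(dict(usage))
```

Every column shows missions.personnel as an effective tag, but the usage count for that tag stays at 1 because only the table carries it directly.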
Starburst Galaxy also allows you to manage tag settings for column-level tags. This allows you to see where the tag is being used, remove the tag, and edit any of its settings.
You can manage these settings from the mission_number row.
The following video guides you through the remaining steps in this tutorial. Specifically, it shows you more information about using universal search.
You can choose to watch the video and follow along using your own account. Alternatively, if you prefer, you can skip the video and proceed directly to the step-by-step instructions provided later in the tutorial.
Universal search allows you to use keywords to find a number of different types of data entities. These include catalogs, schemas, tables, views, columns, data products, tags, owners, and contacts.
For the search to function properly, the keyword in the search term must match the beginning of a data entity's name, or the beginning of a segment that follows an underscore.
For example, the search term 'cust' would return a data entity named customer, but also an entity named profile_customer. However, a search for the term 'omer' or 'file' would not return either of these results because matching occurs only at the start of a name or of an underscore-separated segment.
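The matching rule described above can be approximated with a short Python sketch. This is a minimal model of the documented behavior, not Starburst Galaxy's actual implementation. The `matches` function is a hypothetical helper, and details such as case sensitivity are not covered here.

```python
def matches(term, name):
    """Return True if `term` is a prefix of `name` or of any
    underscore-separated segment of `name`."""
    if name.startswith(term):
        return True
    return any(segment.startswith(term) for segment in name.split("_"))


names = ["customer", "profile_customer"]

for term in ["cust", "omer", "file"]:
    hits = [n for n in names if matches(term, n)]
    print(term, hits)
```

With this rule, cust matches both names, while omer and file match neither, because they appear only in the middle or at the end of a segment.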
When a keyword matches a data entity, you can further filter the results by asset type, catalog, tag, contact, and owner.
Universal search is improving rapidly, and keyword searches will match with more metadata in the future. For even more information, review the documentation.
Universal search can be accessed in two different ways.
One of these is a keyboard shortcut: press / at any time.
It's time to test out universal search using the tags you added earlier in this tutorial.
Universal search works like many other search systems and involves keywords. You can choose how you want to filter the results. The default filter is datasets, but you can choose to filter by data products, tags, owners, or contacts.
Enter the search term missions and review the results for "missions".
Now it's time to search for one of the tags you added. This works in a similar way to searching for Datasets, but with one twist.
Let's explore that twist in more detail.
Universal search provides the best matches first, but sometimes it's necessary to dig deeper. This is where the Search results menu comes in, providing the full set of search results along with filter options.
Search filters can be updated. You can filter a search by asset type, catalog, tag, contact, and owner.
Right now you are filtering on the missions.personnel tag. Let's change the filter type to see how the results change.
You can also use the Catalog explorer to search for catalogs, schemas, tables, views, and columns.
This type of search works in a similar way to universal search, and uses the same matching process.
Time to try your first search in this field.
Congratulations! You have reached the end of this tutorial, and the end of this stage of your journey.
Now that you've completed this tutorial, you should have a better understanding of just how easy and convenient it is to use universal search in Starburst Galaxy.
At Starburst, we believe in continuous learning. This tutorial provides the foundation for further training available on this platform, and you can return to it as many times as you like. Future tutorials will make use of the concepts used here.
Starburst has lots of other tutorials to help you get up and running quickly. Each one breaks down an individual problem and guides you to a solution using a step-by-step approach to learning.
Visit the Tutorials section to view the full list of tutorials and keep moving forward on your journey!