A few days ago I had a meeting with a new customer, a large pharmaceutical company. The conversation went something like this:
Me: How is your adoption of Starburst Enterprise coming along with your users?
Them: Great, but we had to pause the deployment because once word spread to our user community that they could access all of these different systems using their favorite BI tool, the usage went through the roof and we realized that we needed to start limiting access to sensitive data.
Me: Ok, we need to implement Starburst Global Access Control ASAP!
Starburst Global Access Control provides a secure layer that ensures data is accessed only by authorized users. It checks predefined policies against every executed query, so that data is restricted or masked as needed to meet the strict security requirements that are a reality in today’s big data world. Paired with Starburst’s Event Logger, which preserves the details of every executed query, it lets a company meet the most stringent security requirements.
As the diagram below illustrates, any client that issues queries through Starburst Enterprise against the more than 30 available data sources is subject to the global access control policies.
In the diagram below, Starburst clusters are configured for global access control and pointed to an Apache Ranger instance that contains the security policies. Multiple Starburst clusters can use one Ranger instance, which greatly simplifies the security architecture.
Policies
Access Control
Access control policies grant users and groups access to catalogs, schemas, tables, procedures, queries and session properties. When global access control is enabled, access to all objects within a Starburst cluster is denied by default. This is an important feature because it prevents accidental exposure of data to unauthorized users. Access control policies are then added for different groups and users, and these can grant read-only access or full write and management privileges.
The two images below illustrate a policy that provides full access to the marketing database only to the Marketing group. If this were the only policy on this database, any user logging into Starburst who isn’t in the Marketing group would be unable to see any objects in the database or query it.
Access policies are a very powerful feature and need to be planned ahead of time to ensure proper access to all Starburst users.
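To make the deny-by-default behavior concrete, here is a minimal Python sketch of how such an evaluation could work. It is not Starburst's or Ranger's actual implementation: the `AccessPolicy` class, the `is_allowed` function, the `datalake` catalog name, the action names and the `Finance` group are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessPolicy:
    """Hypothetical allow rule: a group may perform some actions on a schema."""
    catalog: str
    schema: str
    group: str
    actions: frozenset


def is_allowed(policies, user_groups, catalog, schema, action):
    """Return True only if some policy explicitly grants the action."""
    for policy in policies:
        if (policy.catalog == catalog
                and policy.schema == schema
                and policy.group in user_groups
                and action in policy.actions):
            return True
    # No policy matched, so the request is denied by default.
    return False


# A single policy: full access to the marketing schema for the Marketing group.
policies = [
    AccessPolicy(
        catalog="datalake",  # assumed catalog name
        schema="marketing",
        group="Marketing",
        actions=frozenset({"select", "insert", "delete", "create", "drop"}),
    ),
]

print(is_allowed(policies, {"Marketing"}, "datalake", "marketing", "select"))
print(is_allowed(policies, {"Finance"}, "datalake", "marketing", "select"))
```

With only this one policy in place, a user outside the Marketing group matches nothing and is refused, which mirrors the behavior described above.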
Column Masking
Masking sensitive data is vital to any data access solution. Starburst Enterprise provides the ability to mask any column of any source for any user or group. Masking is typically used when a user or group needs access to a table that contains columns that must be partially or fully masked.
In the example below, the customer name in the customer table is masked only for the Marketing Intern group. If access policies exist for other users or groups, they will see the full customer name.
Masking policies act as insurance for companies, making sure sensitive data isn’t accessed by unauthorized users.
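The effect of a group-specific mask can be sketched in a few lines of Python. This is only an illustration of the idea, not how Starburst applies masks internally: `mask_partial`, `customer_name_for` and the choice to keep the first character visible are assumptions made for the example.

```python
def mask_partial(value, visible=1):
    """Keep the first `visible` characters and replace the rest with 'X'."""
    return value[:visible] + "X" * max(len(value) - visible, 0)


# Groups the (hypothetical) masking policy applies to.
MASKED_GROUPS = {"Marketing Intern"}


def customer_name_for(user_groups, name):
    """Return the customer name as a user in the given groups would see it."""
    if MASKED_GROUPS & set(user_groups):
        return mask_partial(name)
    return name


print(customer_name_for({"Marketing Intern"}, "Alice Smith"))  # masked
print(customer_name_for({"Marketing"}, "Alice Smith"))         # unmasked
```

Only members of the Marketing Intern group receive the masked value; every other group that has access to the table sees the full name.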
Row Level Filtering
Lastly, limiting the rows returned from a data source is a powerful feature. This type of policy restricts which rows users and groups can see.
In this policy, we limit queries against the nation table to filter out rows where nationkey = 18. This can be for regulatory purposes or to hide data from end users.
Row filters effectively add a “where” clause to queries against a table. This can be a very powerful tool for restricting data without implementing more complex solutions such as views or functions.
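The idea of a row filter as an appended “where” clause can be simulated with a small Python sketch. It uses an in-memory SQLite table rather than Starburst, and the `ROW_FILTERS` mapping, the `apply_row_filter` helper, the predicate `nationkey <> 18` and the sample rows are assumptions for illustration only.

```python
import sqlite3

# Hypothetical row-filter policies: table name -> predicate added to queries.
ROW_FILTERS = {"nation": "nationkey <> 18"}


def apply_row_filter(table):
    """Build a SELECT for `table`, adding the policy predicate if one exists."""
    query = f"SELECT * FROM {table}"
    predicate = ROW_FILTERS.get(table)
    if predicate:
        query += f" WHERE {predicate}"
    return query


# Small in-memory stand-in for the nation table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nation (nationkey INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO nation VALUES (?, ?)",
    [(17, "PERU"), (18, "CHINA"), (19, "ROMANIA")],
)

rows = conn.execute(apply_row_filter("nation")).fetchall()
print(apply_row_filter("nation"))
print(rows)
```

The user writes a plain query against the table, and the policy supplies the predicate, so no separate view has to be created and maintained.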
Example Company
To provide an example of how policies can be created and applied, we’ll look at an example company named Example Corp. This company has 3 departments with one sub-department under Marketing:
There are a few different approaches to creating policies, such as source-based or department/group-based.
In this example, I took the department approach, creating policies that grant the different departments access to the different source systems. The diagram below shows this approach.
I created an initial layout of policies by department. Since you cannot create duplicate policies on the same resources, I had to create an “All” policy that allows the different departments read-only access to the data lake:
Using these rules, I can control not only SQL access but also which catalogs and schemas are visible. For example, Marketing is only allowed to see the data lake and marketing databases. Notice the POC system doesn’t show up:
Lastly, sandboxes are a popular feature to offer Starburst users. A sandbox gives them a place where they can create sample, rollup and aggregation tables. With Starburst Global Access Control, this is a very simple task: add an allow rule for the sandbox location in the data lake and deny everyone else access to it.
In the image below, the policy allows full access to the “marketingsb” schema in the data lake, gives marketing interns read-only access and denies access to everyone else:
In conclusion, managing data access for a small or large group of users can be a challenge. Protecting your company data has become one of the most important aspects of data analytics, and Starburst Global Access Control provides a single place to define security policies and ensure they are met.
You can watch this webinar to learn more about Global Access Control at Starburst.
For more information about Starburst Global Access Control or any other questions, please contact us.