Managing access to data is a fundamental need in any organization, and it has only grown in importance and complexity with regulatory requirements like GDPR. At the same time, the volume of data, the variety of data types, and the number of people accessing data are all growing rapidly.
Traditional role-based access control (RBAC) relies on granting specific groups of people access to particular data assets, such as individual tables or columns. At large scale, managing access at this level becomes difficult and time-consuming.
That’s why we’re thrilled to announce the addition of attribute-based access control (ABAC) to Starburst Galaxy, which brings fine-grained governance and security to all of your cloud data sources. ABAC offers a dynamic and flexible access control framework that aligns security policies with an organization’s business-driven data privacy policies.
Why attribute-based access control (ABAC) is important
At its core, ABAC grants or denies access to resources based on various attributes associated with users, data assets, and other resources. These attributes can encompass a wide range of factors, including user roles, job titles, clearances, time of day, location, and even contextual data like device type or network security level. By leveraging this granular approach, large enterprises can implement fine-grained access control, letting them define precise rules for accessing sensitive information.
ABAC matters to large enterprises for two main reasons. First, it facilitates the implementation of the principle of least privilege (PoLP), ensuring that users are granted only the permissions necessary for their tasks. This helps minimize the potential attack surface and restricts unauthorized users from accessing critical data. Second, ABAC enables dynamic access control, allowing organizations to adapt security policies in real time as users’ roles and attributes change.
The agility unlocked with ABAC is vital for large enterprises that deal with complex organizational structures and frequently evolving access requirements.
The fundamentals of access control in Starburst Galaxy
Before diving in, let’s cover the basics. Galaxy’s access control is role-based — meaning every user acts in a specific role at any given time. Roles can be assigned to users, groups, or other roles. Users and groups can be synced with your preferred identity provider.
Roles are then given access to perform certain actions including:
- Administrative functions
- Creating, starting, and stopping clusters
- Managing metadata (including tags)
- Querying data
Here’s an example of what a marketing role may look like in Galaxy. You can see that users from the marketing department have all been added to the role via email.
How to implement ABAC in Starburst Galaxy
Step One – Tag your data
In order to create attribute-based access controls, we must first tag our data with some attributes. Tags are key to providing business context to access controls. It is important to note that Galaxy requires elevated privileges to tag assets and manage available tags. These can be found in the “Account” section of a role.
Tags have a two-level hierarchy that we’ll explore later. For now, create a tag that can be applied to data assets (catalogs, schemas, tables, and columns) by navigating to the tags section under access control.
The next step is to tag your data. Tags can be applied at every level in the data hierarchy – catalogs, schemas, tables, and columns.
Important: Tags “inherit” down this hierarchy. This means that if you tag a catalog, all of its schemas, tables, and columns will get that tag. However, tagging a column will apply that tag to just that column since it’s the lowest level of the hierarchy.
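To make the inheritance rule concrete, here is a minimal Python sketch. It is an illustrative model only, not Galaxy's implementation or API: the dotted asset paths, the `direct_tags` mapping, and the `effective_tags` helper are all hypothetical names chosen for this example.

```python
from typing import Dict, Set

# Hypothetical tags applied directly to assets, keyed by a dotted path of the
# form catalog.schema.table.column. The asset names are made up.
direct_tags: Dict[str, Set[str]] = {
    "example_catalog": {"marketing"},
    "example_catalog.example_schema": {"sales"},
    "example_catalog.example_schema.contacts.email": {"pii"},
}


def effective_tags(asset_path: str, direct: Dict[str, Set[str]]) -> Set[str]:
    """Return the tags on an asset plus the tags inherited from its ancestors."""
    parts = asset_path.split(".")
    tags: Set[str] = set()
    # Walk from the catalog down to the asset itself, collecting tags on the way.
    for depth in range(1, len(parts) + 1):
        ancestor = ".".join(parts[:depth])
        tags |= direct.get(ancestor, set())
    return tags
```

In this model, the schema ends up with both `marketing` (inherited from the catalog) and `sales` (applied directly), while the `pii` tag on the `email` column does not spread to sibling columns.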
Navigate to the catalogs section of Galaxy and choose the asset you want to tag. You can add tags directly on the object or in-line in its parent object. The example below is a catalog tagged with “marketing” and a schema tagged with “sales”. Note the inheritance from the catalog to the schema.
Step Two – Create policies
Now that we’ve tagged our data, we can create policies. Access is managed via roles, so navigate to the roles and privileges tab, choose the role you want to create the policy for, and click the policies tab to add a new policy.
Let’s break down each section you see in the create policy screen.
- Policy name and description: Describe the policy
- Scope: Limit where this policy will be applied. In this example, it will apply the policy to the entire demo_customer catalog and the entire demo schema in the google_cloud_storage catalog.
- Matching expression: Apply the policy when the expression evaluates to true. This is where tags come into play. The expression field currently supports a single function, has_tag(tag), along with the logical operators AND, OR, and NOT and parentheses () for grouping. For this example, the policy will apply to anything tagged with marketing but not anything tagged with pii or pii.*
- Privileges: These are the privileges that will be applied. Note that you can ALLOW or DENY privileges.
*Note the wildcard in the expression, which matches the tag hierarchy we built earlier.
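The logic of the example matching expression can be sketched in Python. This is a rough model under stated assumptions, not Galaxy's evaluator: the exact expression syntax is assumed to look roughly like `has_tag(marketing) AND NOT (has_tag(pii) OR has_tag(pii.*))`, and the wildcard is modeled here with shell-style matching from Python's standard `fnmatch` module, which may differ from how Galaxy treats it.

```python
from fnmatch import fnmatchcase
from typing import Set


def has_tag(asset_tags: Set[str], pattern: str) -> bool:
    """True if any tag on the asset matches the pattern ('*' is a wildcard)."""
    return any(fnmatchcase(tag, pattern) for tag in asset_tags)


def policy_matches(asset_tags: Set[str]) -> bool:
    """Model of: has_tag(marketing) AND NOT (has_tag(pii) OR has_tag(pii.*))."""
    return has_tag(asset_tags, "marketing") and not (
        has_tag(asset_tags, "pii") or has_tag(asset_tags, "pii.*")
    )
```

Under this model, an asset tagged only `marketing` matches, while an asset tagged `marketing` plus `pii`, or `marketing` plus a child tag such as the hypothetical `pii.email`, does not. Both `pii` and `pii.*` are needed because the pattern `pii.*` requires the dot and so does not match the bare parent tag.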
Policies can also optionally have an expiration date to automatically remove access at a certain day and time.
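As a rough illustration of how an optional expiration behaves, the sketch below models a policy that stops applying once its expiration time has passed. The `Policy` class and its `expires_at` and `is_active` names are hypothetical and are not part of any Galaxy API; the sketch also assumes timezone-aware timestamps so comparisons are unambiguous.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class Policy:
    """Hypothetical policy record with an optional expiration time."""
    name: str
    expires_at: Optional[datetime] = None  # None means the policy never expires

    def is_active(self, now: datetime) -> bool:
        """A policy applies only while the current time is before its expiration."""
        return self.expires_at is None or now < self.expires_at


# Example: temporary access that ends at midnight UTC on a chosen date.
campaign_access = Policy(
    name="temporary marketing access",
    expires_at=datetime(2030, 1, 1, tzinfo=timezone.utc),
)
```

A caller would pass the current time, for example `campaign_access.is_active(datetime.now(timezone.utc))`, and skip any policy for which the result is false.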