Starburst Galaxy is a price-performant, fully managed, multi-cloud data and analytics platform powered by Trino, a leading open-source distributed MPP SQL query engine. Starburst Galaxy is used for both interactive ad-hoc analytics and long-running workloads like batch and ETL/ELT, and offers high scalability and query completion rates even as the amount of data, query volume, and query complexity increases. The service runs federated queries across data lakes, cloud data warehouses, on-premises databases, and relational data management systems like PostgreSQL and MySQL. Galaxy also supports fault-tolerant execution, smart indexing and caching, Data Products, and universal search and schema discovery.
Amazon Athena, available in serverless and dedicated versions, is a query service that analyzes data in Amazon Web Services (primarily Amazon S3) using standard SQL for ad-hoc analytics. With Amazon Athena serverless, customers have no infrastructure to manage and pay only for the queries they run. Amazon Athena was originally built on a fork of Presto (PrestoDB version 0.217, released in January 2019).
“Athena laid our foundation, but growth demands prompted a shift to Starburst. With its Warp Speed indexing and caching, costs were cut by 70%, seamlessly aligning with our expansion. Starburst not only caters to our growth but elevates performance and optimization in our data engineering landscape.”
– Pankaj Arora, Associate Director of Data Engineering
“The bottom line is that Starburst Galaxy is a huge force multiplier for us. Based on my experience in previous roles, I’ve been able to accomplish what would’ve taken two to three engineers in half the time and one tenth of the cost [compared to Athena].”
– Anonymous, Director of Software and Engineering
“With Starburst, we can maximize the value of our data. We are now able to run queries on tables with terabytes of data in just a few seconds.”
– Staff Engineer, A Fortune 100 Cloud Computing Company
“We were using data in the way we could. It was getting more expensive, slower, and feeble. We had to change our approach and look for other ways of enabling our users without infrastructure penalties. We were over-run by the limitations of our latest solution [Athena]… Starburst gives us a single platform to explore more data through connectivity, maintain data quality and governance, and provide the data to all of our employees using their visualization tools of choice.”
– André Gortari, Data Engineering Manager at Banco Inter
Don’t take our word for it. Starburst is named #1 for Quality of Support and Ease of Use in G2 Crowd’s Grid Report based on real customer reviews. Additionally, customers said this about Starburst:
Going beyond key platform governance and management capabilities, a modern data and analytics platform empowers data teams with easy-to-use functionality that increases productivity without adding complexity. It allows businesses to use a range of existing investments in just a few clicks. It enables data teams to easily discover, create, govern, analyze and share federated data products from distributed data sets across the organization.
Starburst Galaxy
Amazon Athena (Serverless)
Automated AWS compute plane set-up
Automated data maintenance
Multi-cloud platform
Built-in data security
Data Products
Automated cluster management
Built-in real-time usage monitoring
Built-in query scheduler
Built-in Natural Language Processing *
Automated data lake optimization
Predictable pricing
Comparison based on publicly available information as of July 8, 2024.
* In preview. Contact us to learn more.
True data access empowers organizations with the ability to use all their data, no matter where it lives, across data lakes, data warehouses, and databases while having confidence in security and governance controls. True access is about meeting business needs on time while adhering to regulatory data sovereignty requirements. Your open lakehouse should free your data sources for analytics and AI, not confine them in another way.
Starburst Galaxy
Amazon Athena (Serverless)
Cloud data federation
On-premise data federation
AWS service account
Time-based policies
RBAC
ABAC
Column/Row masking
SSO via AWS IAM, Okta, Azure AD, and Google
Universal Search and schema discovery
Uses Trino connectors for federation
In platform universal search and schema discovery
Optimized first party connectors - parallelism, cached views, dynamic filtering, security, and authentication
Query sharing
Data Products sharing
Data profiling
Data lineage
Streaming ingest *
Comparison based on publicly available information as of July 8, 2024.
* In preview. Contact us to learn more.
Internet scale matters in an internet-powered world, but not every workload needs that power and performance. Your open data lakehouse powers modern data and analytics and puts control of performance and costs in your hands. It ensures high-performance scalability is available at the click of a button, or automatically when you need it most, while optimizing price-to-performance for all analytics workloads. It also instills confidence that queries will execute as scheduled, even at high concurrency.
Starburst Galaxy
Amazon Athena (Serverless)
Works with S3 Express One Zone
Ad-hoc and interactive queries
Results and repeated subquery caching *
High concurrency
Control over concurrency and prioritization
Fault Tolerant Execution
Built-in data catalog
Autoscales by adding more nodes per cluster
Customizable scaling for cost and performance optimization
Consistently executes long-running batch queries
Smart indexing and caching
Fine-grained resource management
Comparison based on publicly available information as of July 8, 2024.
* In preview. Contact us to learn more.
Open file and table formats are table stakes in providing optionality. Your open lakehouse goes beyond the fundamentals to ensure your business has full control over your data by accessing data where it lives across hybrid and multi-cloud data architectures, by allowing choice in cloud providers, security, and BI tools, and ensuring expert Trino support is available if and when your teams need it most.
Starburst Galaxy
Amazon Athena (Serverless)
OS Trino query engine
Supports popular open file formats
Supports Python
Supports hybrid and cloud data architectures
Supports data catalogs beyond AWS Glue
Runs on multiple clouds
Expert in-house Trino support
Natively run SQL on Iceberg, Delta Lake, Hudi, and Hive table formats
In platform capability to migrate Hive to Delta or Iceberg tables
Comparison based on publicly available information as of July 8, 2024.
* In preview. Contact us to learn more.
Access and analyze your data with the elastic scale and high performance your business demands. Take Starburst Galaxy for a free test drive, watch the on-demand demo (no form fill needed), or contact us.
Amazon Athena is AWS’s analytics engine that allows you to run queries against terabytes and petabytes of data in and around S3. You can use Athena to execute data warehouse-like SQL queries on data in your lake, access data from federated sources, prepare data for machine learning models, build distributed data reconciliation engines, and perform multi-cloud data analysis, although Athena itself runs only on Amazon Web Services.
Amazon Athena is not an ETL tool in the traditional sense, but it can be used to simplify ETL data pipelines using its federated SQL queries and user-defined functions. However, it is not uncommon for long-running queries like ETL jobs to fail without warning.
No, Amazon Athena and Amazon S3 are not the same. Amazon Simple Storage Service (S3) is a cloud storage service that lets you store and retrieve data (a cloud data lake). Amazon Athena, on the other hand, is what you would use to run queries against S3 data using standard ANSI SQL.
No, Amazon Athena is not a SQL server. It is an interactive query service that makes it easy to analyze data in Amazon S3 using standard SQL. Athena is serverless, so there is no infrastructure to manage, and you pay only for the queries that you run. Athena is built on top of Presto, a distributed SQL query engine, and can process large amounts of data in parallel. It supports a wide range of data formats, including CSV, JSON, ORC, Avro, and Parquet.
Athena allows you to create a ‘data warehouse’ like experience on Amazon S3. By defining schemas and running queries, you can efficiently organize and retrieve your data. Furthermore, AWS APIs make it possible to visualize the query results in business intelligence tools.
Rather than providing the native, built-in functionality you would expect of an analytics platform, Amazon Athena requires you to use several other AWS services in order to use and manage it effectively. Here are some of the services that are required to make Athena work effectively.
Amazon S3 – Amazon S3 (Simple Storage Service) is an object storage service that offers scalability, data availability, security, and performance. It allows you to store and retrieve any amount of data from anywhere on the web. You can accomplish these tasks using the simple and intuitive web interface of the AWS Management Console. Data is stored in S3 buckets, which are containers for objects that you store in Amazon S3. S3 is the primary data source for Amazon Athena.
AWS Glue – a fully managed, serverless data integration service that makes it easy to prepare and load data for analytics. Its Data Catalog is the primary data catalog for Amazon Athena. Starburst Galaxy lets you choose which data catalogs you use, including Starburst Gravity and AWS Glue.
AWS Lake Formation – a fully managed service that makes it easy to build, secure, and manage data lakes. Unlike Starburst Galaxy, which has these security and governance capabilities built in, Amazon Athena requires you to use AWS Lake Formation to define and enforce database, table, and column-level access policies when using Athena queries to read data stored in Amazon S3.
Amazon Redshift – this is AWS’s data warehouse. Similar to Starburst Galaxy, Amazon Athena also allows you to access data within Amazon Redshift.
AWS Lambda – You can create a Lambda function that uses the AWS SDK for Python (Boto3) to execute SQL queries on Amazon Athena. The function can be triggered by an event, such as an API Gateway invocation, to execute the query and return the query results.
Amazon DynamoDB – this is a fully managed NoSQL database service. The Amazon Athena DynamoDB connector (also available in the Starburst self-managed software offering) enables Athena to communicate with DynamoDB so that you can execute SQL queries on your tables.
AWS IAM – this is the Identity and Access Management service from AWS. Unlike the built-in capabilities with Starburst Galaxy, Amazon Athena uses AWS IAM policies as the primary means to restrict access to Athena operations. Users can create policies that grant or deny access to specific resources and configure permissions based on user roles or groups.
AWS Command Line Interface (CLI) – You can use the AWS CLI to interact with Amazon Athena. For example, you can use the aws athena start-query-execution command to run a query. You then need to poll with aws athena get-query-execution until the query has finished. Once it has, the result of that call also contains the location of the query result on S3, which you can then download with aws s3 cp.
You cannot save results from the AWS CLI directly, but you can specify a Query Result Location, and Amazon Athena will automatically save a copy of the query results in an Amazon S3 location that you specify. You could then use the AWS CLI to download that results file.
Amazon QuickSight – this is AWS’s business intelligence (BI) service. As it does with Starburst Galaxy, Amazon QuickSight retrieves data from Athena to enable visualization of the results of Amazon Athena SQL queries.
Amazon EMR (formerly known as Amazon Elastic MapReduce) – a managed cluster platform that simplifies running big data frameworks, such as Apache Hadoop and Apache Spark, on AWS to process and analyze vast amounts of data. You would use EMR and Athena together when building an Apache Iceberg data lake. You can use Amazon EMR Spark to create an Iceberg table and load sample data, then use Athena to query the table, perform schema evolution, and more with the AWS Glue Data Catalog.
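The AWS Lambda and AWS CLI entries above describe the same basic flow: start a query, poll until it finishes, then read the results or fetch the result file from S3. A minimal Python sketch of that flow follows. It assumes the clients passed in behave like Boto3's Athena and S3 clients. The helper names (run_query, split_s3_uri, download_results, make_lambda_handler), the polling defaults, and any bucket or query used with them are hypothetical and chosen only for illustration.

```python
import json
import time
from urllib.parse import urlparse


def run_query(athena, sql, output_location, poll_seconds=2.0, max_polls=150):
    """Start an Athena query, poll until it ends, return (query_id, result_uri)."""
    started = athena.start_query_execution(
        QueryString=sql,
        ResultConfiguration={"OutputLocation": output_location},
    )
    query_id = started["QueryExecutionId"]
    for _ in range(max_polls):
        execution = athena.get_query_execution(
            QueryExecutionId=query_id
        )["QueryExecution"]
        status = execution["Status"]
        if status["State"] == "SUCCEEDED":
            # On success, the execution record carries the S3 URI of the result file.
            return query_id, execution["ResultConfiguration"]["OutputLocation"]
        if status["State"] in ("FAILED", "CANCELLED"):
            reason = status.get("StateChangeReason", "no reason given")
            raise RuntimeError(f"Query {query_id} {status['State']}: {reason}")
        time.sleep(poll_seconds)
    raise TimeoutError(f"Query {query_id} did not finish after {max_polls} polls")


def split_s3_uri(uri):
    """Split 's3://bucket/some/key.csv' into ('bucket', 'some/key.csv')."""
    parsed = urlparse(uri)
    return parsed.netloc, parsed.path.lstrip("/")


def download_results(s3, result_uri, local_path):
    """Rough equivalent of `aws s3 cp <result_uri> <local_path>`."""
    bucket, key = split_s3_uri(result_uri)
    s3.download_file(bucket, key, local_path)


def make_lambda_handler(athena, sql, output_location, poll_seconds=1.0):
    """Build a Lambda-style handler that runs one fixed query and returns its rows."""

    def handler(event, context):
        query_id, _ = run_query(
            athena, sql, output_location, poll_seconds, max_polls=25
        )
        rows = athena.get_query_results(
            QueryExecutionId=query_id
        )["ResultSet"]["Rows"]
        # A cell may lack "VarCharValue" when the value is NULL, hence .get().
        table = [[cell.get("VarCharValue") for cell in row["Data"]] for row in rows]
        return {"statusCode": 200, "body": json.dumps(table)}

    return handler


# With real AWS credentials this could be wired up roughly as follows
# (the bucket name and query are placeholders, not values from this page):
#   import boto3
#   athena, s3 = boto3.client("athena"), boto3.client("s3")
#   _, uri = run_query(athena, "SELECT 1", "s3://example-results-bucket/athena/")
#   download_results(s3, uri, "results.csv")
```

Passing the clients in as arguments keeps the sketch testable without AWS access. In this version a failed or cancelled query raises an exception, so a real Lambda deployment would probably want to catch it and return an error response instead.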
Unlike Starburst, Amazon Athena does not offer capabilities to build, manage, secure, and share curated data sets in the form of federated data products.
Amazon Athena is serverless, so there is no infrastructure to manage, and you pay only for the queries you run.
Amazon Athena also recently introduced the ability to provision dedicated capacity for your Athena queries. With provisioned capacity, you can reserve a dedicated set of compute resources to run your queries. This shifts management responsibility to customers, and poorly managed capacity carries a high risk of wasted resources and rising costs.
© Starburst Data, Inc. Starburst and Starburst Data are registered trademarks of Starburst Data, Inc. All rights reserved. Presto®, the Presto logo, Delta Lake, and the Delta Lake logo are trademarks of LF Projects, LLC