Going, an up-and-coming travel technology company, ingests 50 PB of flight data a year to give its customers the most timely travel recommendations while optimizing its margins with a price-performant, easy-to-use lakehouse platform. With Starburst, Going ingests its Kafka streaming data into its data lakehouse using Trino and Apache Iceberg, two powerful technologies that together enable seamless real-time analytics on large-scale datasets.


Fully managed streaming analytics
A simple, powerful experience that goes from setup to value in just a few clicks.
Massive scale
Ingest up to 100 GB of Kafka data per second per Iceberg table without worrying about misconfigured schemas at the head of the line, temporary load spikes, or backlogs.
Exactly-once processing
Guaranteed exactly-once processing ensures that no messages are missed or duplicated, for optimal data integrity.
Data transformation
Automated transformation into a relational format to eliminate the need for complex ETL pipelines and reduce the time and resources required for data preparation.
Data maintenance
Automated compaction, deletes, orphaned-file removal, and statistics collection to reduce storage costs and improve performance.
Open streaming foundation
From data to query engine, the only lakehouse that offers optionality throughout the stack.
Multiple data sources
Ingest any Kafka-compliant stream, including Apache Kafka, Amazon Managed Streaming for Apache Kafka (Amazon MSK), and Confluent Cloud.
Own the data
Land data into Iceberg tables within the customer's Amazon S3 buckets.
Optionality in catalogs
Use Starburst Gravity, AWS Glue, or REST API-based Iceberg catalogs.
Interoperable and performant SQL
Powered by enhanced Trino to provide industry-leading, scalable, and price-performant SQL analytics.
Make decisions faster
Improve business agility with near real-time analytics, without the complexity.
Next best offer
Collect and analyze multiple data streams to delight customers with the most tailored and timely offers.
Anomaly detection
Monitor data streams to identify unusual patterns or outliers that could indicate issues such as network intrusion, system failures, or unusual behavior across network security, financial fraud, and more.
Predictive maintenance
Ingest and analyze up-to-the-minute IoT sensor data to prevent equipment failures with timely maintenance.
Inventory management
Continuously monitor global inventory levels and sales data to make near real-time restocking, promotions, and inventory allocation decisions.
Dynamic pricing
Adjust pricing in near real-time based on demand, inventory levels, and competitive pricing to maximize cart size and revenue.
Frequently Asked Questions
Accelerate analytics to be more responsive to evolving business needs with Streaming Ingestion in Starburst Galaxy.
What makes the Starburst Galaxy streaming ingestion service a powerful alternative?
As a fully managed and highly scalable service, verified at 100 GB/second, with minimal configuration and no infrastructure to run, Starburst automates most of the tasks typically required by alternative solutions. Furthermore, ingestion, parsing/transformation, and data maintenance are all managed by the same system, reducing the need for multiple tools. Starburst also guarantees exactly-once semantics, so records are neither missed nor written more than once to the Iceberg table, ensuring a high quality bar for streaming data.
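To make the exactly-once idea concrete, here is a minimal in-memory sketch of one common technique: committing the rows and the last consumed Kafka offset together, so that a redelivered batch is recognized and skipped. It illustrates the general pattern only and is not a description of how Starburst implements it; the `Table` class and its `commit` method are hypothetical names invented for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Table:
    """Hypothetical stand-in for a table that supports atomic commits."""
    rows: list = field(default_factory=list)
    committed: dict = field(default_factory=dict)  # partition -> last committed offset

    def commit(self, partition: int, batch: list) -> int:
        """Append unseen messages and advance the stored offset in one step.

        `batch` is a list of (offset, payload) tuples. Returns rows written.
        """
        last = self.committed.get(partition, -1)
        # Drop anything at or below the committed offset: it was already written.
        fresh = [(offset, payload) for offset, payload in batch if offset > last]
        if not fresh:
            return 0
        # In a real system these two updates must become visible atomically,
        # for example as part of a single table commit.
        self.rows.extend(payload for _, payload in fresh)
        self.committed[partition] = max(offset for offset, _ in fresh)
        return len(fresh)


table = Table()
batch = [(0, "a"), (1, "b"), (2, "c")]
first = table.commit(0, batch)                    # initial delivery
retry = table.commit(0, batch)                    # same batch redelivered after a failure
overlap = table.commit(0, [(2, "c"), (3, "d")])   # partially overlapping batch
```

In this sketch the retry writes nothing and the overlapping batch contributes only the one unseen message, so each record appears in `table.rows` exactly once.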
How does sizing work?
Starburst can scale with customer throughput requirements and can support up to 8 MB/s (uncompressed) per Kafka partition.
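As a rough sizing aid, the 8 MB/s per-partition figure above can be turned into a minimum partition count. This is a back-of-the-envelope sketch that assumes traffic is spread evenly across partitions and uses decimal units (1 GB = 1,000 MB); the function name `min_partitions` is made up for illustration.

```python
import math

MAX_MB_PER_SEC_PER_PARTITION = 8  # uncompressed, per the figure quoted above


def min_partitions(target_mb_per_sec: float) -> int:
    """Smallest partition count whose combined ceiling covers the target rate.

    Assumes load is evenly distributed across partitions (an idealization).
    """
    if target_mb_per_sec <= 0:
        raise ValueError("target throughput must be positive")
    return math.ceil(target_mb_per_sec / MAX_MB_PER_SEC_PER_PARTITION)


small = min_partitions(100)        # a 100 MB/s stream
large = min_partitions(100_000)    # 100 GB/s, taking 1 GB = 1,000 MB
```

Under these assumptions a 100 MB/s stream needs at least 13 partitions, and 100 GB/s needs at least 12,500. Skewed keys or bursty producers would call for extra headroom.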
Where is the data landed after it has been ingested by Starburst?
After ingesting data from Apache Kafka, Amazon MSK, or Confluent Cloud, Starburst lands the data in an Iceberg table that resides in the customer's Amazon S3 bucket. Once the data has landed in the S3 bucket, Starburst automates all the transformation and governance to make it ready to query in about a minute.
How is this feature priced?
Pricing for streaming ingestion is straightforward and starts at $0.065 per 8 MB/s pipe-hour. Check out details here.
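For a rough worked example of the starting price quoted above, the sketch below multiplies the per-pipe-hour rate by the number of 8 MB/s pipes and the number of hours. Rounding up to whole pipes and scaling linearly are assumptions made for illustration, not confirmed billing rules, and `estimate_cost` is a hypothetical helper.

```python
import math
from decimal import Decimal

PRICE_PER_PIPE_HOUR = Decimal("0.065")  # USD, the "starts at" price quoted above
PIPE_MB_PER_SEC = 8                     # capacity of one pipe


def estimate_cost(throughput_mb_per_sec: float, hours: float) -> Decimal:
    """Estimated USD cost, assuming whole pipes and linear scaling (both assumptions)."""
    pipes = math.ceil(throughput_mb_per_sec / PIPE_MB_PER_SEC)
    return PRICE_PER_PIPE_HOUR * pipes * Decimal(str(hours))


one_pipe_month = estimate_cost(8, 720)   # one pipe for a 30-day month
three_pipe_day = estimate_cost(20, 24)   # 20 MB/s rounds up to three pipes
```

Under these assumptions, one pipe running for 720 hours comes to $46.80, and a 20 MB/s stream for one day comes to $4.68. Actual charges may differ.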
What query engine is used to analyze the streaming data?
Starburst uses an enhanced version of Trino, a leading open-source MPP SQL query engine. Within the query engine, Starburst offers enhanced fault tolerance, smart indexing and caching via Warp Speed, and additional optimizations for caching and intelligent routing.
Start for Free with Starburst Galaxy
Up to $500 in usage credits included
Discover
Easily search across data sources and clouds to find the data you need.
Govern
Streamline data governance with built-in RBAC and ABAC.
Analyze
Run internet-scale workloads with the power of Trino.
Fast
Accelerate queries with smart indexing and caching technologies like Warp Speed.