Near real-time analytics

Ingest up to 100 GB/second from Kafka topics, land the data in Apache Iceberg, then transform and govern it so you can start querying within a minute.

Choose streaming ingestion or file ingestion (public preview), connect to your data source, and get automatic updates to your Iceberg tables in near real time.

Free Trial

Talk to the experts on how to get started with Streaming Ingest

Going, an up-and-coming travel technology company, ingests 50 PB of flight data a year to give its customers the most timely travel recommendations while optimizing its margins with a price-performant, easy-to-use lakehouse platform. With Starburst, Going ingests its Kafka streaming data into its data lakehouse using Trino and Apache Iceberg, two powerful technologies that together enable seamless real-time analytics on large-scale datasets.

“We needed a solution that is scalable, wouldn’t lock us into a specific vendor, and is cost efficient and easy to manage. With the Starburst partnership and its lakehouse platform, we have unlocked data for all employees to focus on providing near real-time value, from 50 PB of streaming data and growing, to our customers and spend less time on data management.”

Brian Kidwell

CEO and cofounder, Going

Fully managed streaming analytics

A simple yet powerful experience that goes from setup to value in just a few clicks

Massive scale

Ingest 100 GB per second of Kafka data per Iceberg table without worrying about head-of-line blocking from a misconfigured schema, temporary load spikes, or backlogs.

Exactly-once processing

Guaranteed exactly-once processing, so that no messages are missed or duplicated and data integrity is preserved.
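
To make the exactly-once idea concrete, here is a minimal sketch of one common way a sink can avoid duplicates when Kafka redelivers messages: it records the last committed offset per partition alongside the data and skips anything at or below that offset. This is a generic illustration, not a description of how Starburst implements its guarantee; the `IdempotentSink` class and its fields are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class IdempotentSink:
    """Toy in-memory sink (hypothetical) that tracks committed offsets per partition."""

    rows: list = field(default_factory=list)
    committed: dict = field(default_factory=dict)  # partition -> last committed offset

    def write_batch(self, partition: int, batch: list) -> int:
        """Write (offset, value) pairs, skipping offsets already committed.

        Returns the number of rows actually written.
        """
        last = self.committed.get(partition, -1)
        fresh = [(off, val) for off, val in batch if off > last]
        if not fresh:
            return 0
        # A real table format would publish the data files and the new offset
        # in a single atomic commit; here both updates simply happen together.
        self.rows.extend(val for _, val in fresh)
        self.committed[partition] = max(off for off, _ in fresh)
        return len(fresh)


sink = IdempotentSink()
sink.write_batch(0, [(0, "a"), (1, "b")])
sink.write_batch(0, [(1, "b"), (2, "c")])  # offset 1 is redelivered after a retry
print(sink.rows)  # ['a', 'b', 'c']
```

The redelivered message at offset 1 is dropped because its offset is not greater than the last committed one, so each record lands once.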

Data transformation

Automated transformation into a relational format to eliminate the need for complex ETL pipelines and reduce the time and resources required for data preparation.

Data maintenance

Automated compaction, deletes, orphaned-file removal, and statistics collection to reduce storage costs and improve performance.
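
For context, the sketch below lists the kind of statements someone would otherwise run by hand against an Iceberg table in open source Trino to get similar maintenance. It only builds the SQL strings. The table name and the 7-day retention are placeholder assumptions, and nothing here is meant to describe which operations the managed service runs or on what schedule.

```python
def maintenance_statements(table: str, retention: str = "7d") -> list[str]:
    """Return Trino Iceberg maintenance statements for one table (illustrative only)."""
    return [
        # Compaction: rewrite many small files into fewer, larger ones.
        f"ALTER TABLE {table} EXECUTE optimize",
        # Drop snapshots older than the retention window.
        f"ALTER TABLE {table} EXECUTE expire_snapshots(retention_threshold => '{retention}')",
        # Remove files no longer referenced by any snapshot.
        f"ALTER TABLE {table} EXECUTE remove_orphan_files(retention_threshold => '{retention}')",
        # Refresh table and column statistics for the query planner.
        f"ANALYZE {table}",
    ]


# "lake.flights.events" is a hypothetical catalog.schema.table name.
for stmt in maintenance_statements("lake.flights.events"):
    print(stmt)
```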

Open streaming foundation

From data to query engine, the only lakehouse that offers optionality throughout the stack.

Multiple data sources

Ingest any Kafka-compliant stream, including Apache Kafka, Amazon Managed Streaming for Apache Kafka (Amazon MSK), and Confluent Cloud.

Own the data

Land data into Iceberg tables within the customer's Amazon S3 buckets.

Optionality in catalogs

Use Starburst Gravity, AWS Glue, or REST API-based Iceberg catalogs.

Interoperable and performant SQL

Make decisions faster

Improve business agility by moving analytics closer to real time, without the complexity.

Next best offer

Collect and analyze multiple data streams to delight customers with the most tailored and timely offers.

Anomaly detection

Monitor data streams to identify unusual patterns or outliers that could indicate network intrusions, system failures, financial fraud, and more.

Predictive maintenance

Ingest and analyze up-to-the-minute IoT sensor data to prevent equipment failures with timely maintenance.

Inventory management

Continuously monitor global inventory levels and sales data to make near real-time restocking, promotions, and inventory allocation decisions.

Dynamic pricing

Adjust pricing in near real-time based on demand, inventory levels, and competitive pricing to maximize cart size and revenue.

Frequently Asked Questions

Accelerate analytics to be more responsive to evolving business needs with Streaming Ingestion in Starburst Galaxy.

What makes the Starburst Galaxy streaming ingestion service a powerful alternative?

As a fully managed and highly scalable service, verified at 100 GB/second, with minimal configuration and no infrastructure to run, Starburst automates most of the tasks that alternative solutions leave to you. Furthermore, ingestion, parsing/transformation, and data maintenance are all managed by the same system, reducing the need for multiple tools. Starburst also guarantees exactly-once semantics, so records are neither missed nor written more than once to the Iceberg table, ensuring a high quality bar for streaming data.

How does sizing work?

Starburst can scale with customer throughput requirements and can support up to 8 MB/s (uncompressed) per Kafka partition.
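
As a rough sizing aid, the per-partition ceiling above implies a minimum partition count for a given target throughput. The sketch below does that arithmetic. It assumes traffic is spread evenly across partitions and takes 1 GB as 1,000 MB; the function name is illustrative.

```python
import math

# Per-partition ceiling quoted above (uncompressed).
MAX_MB_PER_SEC_PER_PARTITION = 8


def min_partitions(target_mb_per_sec: float) -> int:
    """Smallest partition count whose combined ceiling covers the target throughput."""
    if target_mb_per_sec <= 0:
        raise ValueError("target throughput must be positive")
    return math.ceil(target_mb_per_sec / MAX_MB_PER_SEC_PER_PARTITION)


print(min_partitions(100))      # 100 MB/s -> 13 partitions
print(min_partitions(100_000))  # 100 GB/s -> 12500 partitions
```

Skewed keys or bursty producers would call for more headroom than this lower bound suggests.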

Where is the data landed after it has been ingested by Starburst?

After ingesting data from Apache Kafka, Amazon MSK, or Confluent Cloud, Starburst lands it in an Iceberg table that resides in the customer's Amazon S3 bucket. Once the data has landed, Starburst automates all the transformation and governance so that it is ready to be queried in about a minute.

How is this feature priced?

Pricing for streaming ingestion is straightforward and starts at $0.065 per 8 MB/s pipe-hour. Check out details here.
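
For a back-of-the-envelope estimate, the starting rate can be turned into a simple calculation. The sketch below assumes that usage is billed in whole 8 MB/s pipes and that cost scales linearly with pipes and hours. Neither assumption is confirmed here, and actual rates may be higher than the starting price.

```python
import math

PIPE_MB_PER_SEC = 8                       # size of one pipe, from the rate above
STARTING_RATE_USD_PER_PIPE_HOUR = 0.065   # "starts at" price, actual rate may differ


def estimate_cost_usd(throughput_mb_per_sec: float, hours: float) -> float:
    """Estimate ingestion cost, assuming whole pipes and linear scaling."""
    pipes = math.ceil(throughput_mb_per_sec / PIPE_MB_PER_SEC)
    return round(pipes * STARTING_RATE_USD_PER_PIPE_HOUR * hours, 2)


# 40 MB/s sustained for a 30-day month (720 hours) needs 5 pipes.
print(estimate_cost_usd(40, 720))
```

Under these assumptions, 40 MB/s sustained for 720 hours comes to about $234.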

What query engine is used to analyze the streaming data?

Starburst uses an enhanced version of Trino, a leading open-source MPP SQL query engine. Within the query engine, Starburst offers enhanced fault tolerance, smart indexing and caching via Warp Speed, and additional optimizations for caching and intelligent routing.

Start for Free with Starburst Galaxy

Up to $500 in usage credits included

Discover
Easily search across data sources and clouds to find the data you need.
Govern
Streamline data governance with built-in RBAC and ABAC.
Analyze
Run internet-scale workloads with the power of Trino.
Fast
Accelerate queries with smart indexing and caching technologies like Warp Speed.

More Deployment Options

Request Enterprise trial license key

Starburst’s mission is to free our customers to see the invisible and achieve the impossible