Cloud Data Warehouse

Each data warehouse implementation style has advantages and disadvantages.

Choosing an implementation requires a company to weigh a number of factors against each other.

Although priorities differ from one organization to another, common considerations include speed, control, scalability, reliability, security and governance, and cost.

Factors that influence on-premise and cloud data warehouses

Factor: Why it matters

Speed: Requires a balance between setup time and time-to-insights once setup is complete.
Control: Decisions about setup, implementation, and access.
Scalability: The needs of a business change over time, seasonally, and as a company grows.
Reliability: Backing up and accessing data is critical. How often will maintenance be required?
Security and governance: Handling data, especially personally identifiable data, requires care and compliance with regulations.
Cost: Whether you pay for your data warehouse upfront (CapEx) or as you go (OpEx) depends on your implementation and affects your company’s balance sheet and tax strategy.

Cloud data warehouse vs On-premise data warehouse

The comparison below outlines some of the advantages and disadvantages of on-premise and cloud data warehouse installations.

The comparison looks at the six factors listed above. Can you see any points where one installation type might be more advantageous for your organization?

Speed
On-premise data warehouse: Quicker to obtain insights if a company is in one location.
Cloud data warehouse: Quicker setup because no hardware needs to be installed and fewer team members need training. Quicker to obtain insights if a company is spread out and needs to transfer data among locations.

Control
On-premise data warehouse: The company has complete control.
Cloud data warehouse: Some decisions are left to the cloud vendor and may not be adjustable.

Scalability
On-premise data warehouse: No advantages over cloud data warehouses.
Cloud data warehouse: Easier to scale up and down because no hardware is required.

Reliability
On-premise data warehouse: Depends on your team.
Cloud data warehouse: Depends on the cloud provider. Some level of backup or disaster recovery is built in.

Security and governance
On-premise data warehouse: With a strong data access policy, on-prem is most secure. Some legal or contractual requirements may not allow the use of cloud providers.
Cloud data warehouse: Cloud vendors offer security guarantees and can restrict access to employees.

Cost
On-premise data warehouse: Avoids annual costs from cloud vendors. May be cheaper over time if resource procurement is carefully managed. No cost to query your own data, and your data belongs to you.
Cloud data warehouse: Lower upfront costs because you don’t need to pay for infrastructure. Potentially lower ongoing employee costs because you don’t need to hire employees with the skills to maintain and administer on-prem data warehouses. Lets you scale storage and server usage up and down when needed, which can lower costs if done correctly. No need to buy hardware that is used only during peak times.
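
The cost trade-off in the comparison above (higher upfront spend for on-prem, higher recurring spend for cloud) can be made concrete with a small break-even calculation. The sketch below is only an illustration: every figure in it is a hypothetical assumption, and the function names are invented for this example. It adds up cumulative cost year by year and reports the first year in which on-prem becomes cheaper overall.

```python
def cumulative_costs(upfront, annual, years):
    """Return the cumulative cost at the end of each year, from year 1 to `years`."""
    return [upfront + annual * year for year in range(1, years + 1)]


def break_even_year(on_prem, cloud):
    """Return the first 1-based year in which on-prem is strictly cheaper, or None."""
    for year, (on_prem_cost, cloud_cost) in enumerate(zip(on_prem, cloud), start=1):
        if on_prem_cost < cloud_cost:
            return year
    return None


# Hypothetical figures, for illustration only.
ON_PREM_UPFRONT = 500_000  # hardware and setup (CapEx)
ON_PREM_ANNUAL = 100_000   # staff, power, maintenance
CLOUD_UPFRONT = 20_000     # migration and setup
CLOUD_ANNUAL = 220_000     # usage-based fees (OpEx)

on_prem = cumulative_costs(ON_PREM_UPFRONT, ON_PREM_ANNUAL, years=8)
cloud = cumulative_costs(CLOUD_UPFRONT, CLOUD_ANNUAL, years=8)
year = break_even_year(on_prem, cloud)

print(f"On-prem becomes cheaper overall in year: {year}")
```

With different assumptions the break-even point moves or disappears entirely, which is why the comparison says on-prem "may" be cheaper over time.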

Using a data warehouse with a data lake

Data lakes are another type of data storage. They can hold non-standard and unstructured data without transforming it first, which helps with data integration. One challenge of keeping data in more than one type of storage is combining it when an analysis needs both.

How Starburst helps

Solutions like Starburst help with disparate data sets and make life easier for data scientists and data engineers. Starburst works with both data warehouses and data lakes so that you can query data in any location using SQL. This helps data consumers and business users overcome barriers and get more accurate data-driven insights on a dashboard.
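
The idea of one SQL query spanning two separate stores can be sketched with Python's built-in sqlite3 module. This is a stand-in, not Starburst's API: two attached in-memory SQLite databases play the roles of a warehouse and a lake, and the table names, columns, and rows are all hypothetical.

```python
import sqlite3

# One connection stands in for the query engine; the two attached
# in-memory databases stand in for the warehouse and the lake.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS warehouse")
conn.execute("ATTACH DATABASE ':memory:' AS lake")

# Hypothetical tables: curated customer records in the warehouse,
# raw click events in the lake.
conn.execute(
    "CREATE TABLE warehouse.customers (customer_id INTEGER PRIMARY KEY, region TEXT)"
)
conn.execute("CREATE TABLE lake.click_events (customer_id INTEGER, page TEXT)")
conn.executemany(
    "INSERT INTO warehouse.customers VALUES (?, ?)",
    [(1, "EMEA"), (2, "APAC")],
)
conn.executemany(
    "INSERT INTO lake.click_events VALUES (?, ?)",
    [(1, "/pricing"), (1, "/docs"), (2, "/pricing")],
)

# A single query joins data that lives in both stores.
query = """
SELECT c.region, COUNT(*) AS clicks
FROM warehouse.customers AS c
JOIN lake.click_events AS e ON e.customer_id = c.customer_id
GROUP BY c.region
ORDER BY c.region
"""
clicks_by_region = conn.execute(query).fetchall()
print(clicks_by_region)
```

The point of the sketch is the shape of the query: the analyst names both sources in one statement instead of exporting data from one store and loading it into the other before joining.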