Data Sovereignty
The cloud was supposed to drive innovation by enabling the free flow of data around the world.
This global vision has faded as countries assert principles of data sovereignty. Rather than building an efficient, unified cloud infrastructure, companies must manage multiple data silos optimized to comply with each country’s data regulations.
This guide will introduce data sovereignty, what’s driving countries to exert control over data, and a modern approach to analytics that unifies fragmented storage architectures.
What is the difference between data sovereignty and data residency?
Data sovereignty is the principle that data is subject to the laws of the jurisdiction where it is collected or stored. Data residency is a property of data that describes where it is stored. The term can also apply more broadly to the compliance practices a company follows in each geographical location.
Consider a Los Angeles-based direct-to-consumer company. It stores customer data on on-premises servers in Los Angeles and in Amazon cloud locations in Oregon and France. The company’s data resides in at least five jurisdictions:
- State of Oregon
- State of California
- United States of America
- France
- European Union
The company’s data residency practices will map which data is stored where to help it comply with data regulations in each jurisdiction.
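A residency map like the one described above can be sketched as a simple lookup from storage location to the jurisdictions that cover it. This is a minimal illustration, not a compliance tool: the location keys (`on_prem_los_angeles`, `cloud_oregon`, `cloud_france`) and the helper `jurisdictions_for` are hypothetical names invented for this example.

```python
# Hypothetical residency map for the example company.
# Keys are illustrative storage-location labels; values are the
# jurisdictions whose rules cover data stored at that location.
RESIDENCY_MAP = {
    "on_prem_los_angeles": ["State of California", "United States of America"],
    "cloud_oregon": ["State of Oregon", "United States of America"],
    "cloud_france": ["France", "European Union"],
}


def jurisdictions_for(locations):
    """Return the sorted, de-duplicated jurisdictions covering the given locations."""
    covered = set()
    for location in locations:
        covered.update(RESIDENCY_MAP[location])
    return sorted(covered)


# Iterating the dict yields its keys, so this covers every storage location.
all_jurisdictions = jurisdictions_for(RESIDENCY_MAP)
print(len(all_jurisdictions))  # 5
```

Keeping the map as data rather than logic means a new storage location only requires a new entry, and the same lookup can answer which rules apply to any subset of locations.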
Complicating matters further, residency is not the sole determinant of sovereignty. The data’s origin, or provenance, also matters. Many data privacy laws apply to data collected within a country’s jurisdiction, no matter where it is stored.
What are the pros and cons of data sovereignty?
Regulations based on data sovereignty benefit people, economies, and societies. Companies must bear the burden of compliance by implementing those protections.
Pros of data sovereignty
As American companies like Microsoft and Amazon came to dominate the cloud, people worldwide grew concerned about their data privacy. Data sovereignty gives countries control of data created and stored within their borders to protect their citizens.
Protecting personal data
Most data regulations protect individual privacy by giving people the right to decide how organizations collect and use their personal data.
Protecting data from outside jurisdictions
Driven by concerns about intelligence-gathering practices, such as those authorized by the USA PATRIOT Act, many nations expanded their definition of data sovereignty to include data localization, which requires data to remain within their borders.
Confidentiality and national security
Data sovereignty is not limited to rules protecting personal privacy. Some industries and government agencies also have data residency requirements that prevent the transfer of sensitive data beyond national borders.
Cons of data sovereignty
Data fragmentation is a side-effect of data sovereignty. A company can’t leverage economies of scale by consolidating its data storage. Instead, it must manage data in multiple locations and jurisdictions. This fragmented infrastructure has significant impacts on data management efficiency and business decision-making.
Increased costs
Data infrastructure costs rise as companies decentralize storage across multiple cloud providers and on-premises data centers. This fragmentation creates significant hidden costs due to the egress fees that cloud service providers charge when customers move data to another service.
Reduced performance
Centralized cloud storage offers significant improvements in network performance. Data moves within the cloud platform’s high-bandwidth, low-latency internal network. Stitching together national data silos creates bandwidth and latency penalties that can impact business performance.
Barriers to innovation
Data sovereignty adds friction to innovation by inhibiting data insights. Information in one region is no longer accessible to analysts in another without significant coordination. Data teams must develop ETL pipelines to process regional data sets in compliance with data regulations. Only the most critical projects will justify this time and expense, undermining data-driven business cultures.
How data sovereignty relates to data protection and data security
Data protection and data security are distinct concepts. Secure data is not necessarily protected. And protected data is not necessarily secure.
The purpose of data protection is to ensure the company always has access to the best quality data possible. These practices safeguard data integrity while allowing the recovery of data should those safeguards fail.
Data security’s purpose is to defend data and information systems from unauthorized access. Layers of security technologies and practices defend networks from external threats and data breaches. At the same time, authentication and authorization systems limit access to legitimate users.
Regulations based on data sovereignty principles set expectations for how companies protect and secure their data. Unfortunately, these expectations vary from country to country, making it impossible to set efficient company-wide policies. Instead, the data governance organization must coordinate protection and security policies that meet local requirements everywhere the company collects and stores data.
How data sovereignty impacts data privacy
Sovereignty allows political jurisdictions to grant data privacy rights to their citizens and to set the rules organizations must follow to protect those rights. As mentioned earlier, both residency and provenance determine which privacy regulations apply to the data a company collects.
Returning to our hypothetical direct-to-consumer company, it collects data from Californians visiting its website and stores the data on company servers in Los Angeles. Residency and provenance subject the company to the California Consumer Privacy Act (CCPA). Moving servers out of state would change the data’s residency, but not its provenance, so the company would still have to comply with CCPA.
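The residency-versus-provenance reasoning above can be sketched as a small lookup. This is a deliberately simplified illustration and not legal guidance: it assumes a regulation applies whenever its jurisdiction matches either where the data was collected or where it is stored. The names `Dataset`, `REGULATION_JURISDICTIONS`, and `applicable_regulations` are hypothetical, and Oregon is an assumed destination for the "out of state" move.

```python
from dataclasses import dataclass

# Simplified, illustrative lookup (not legal guidance): each regulation is
# tied to the jurisdiction that enacted it. Only CCPA is listed because it is
# the regulation discussed in the example; others could be added the same way.
REGULATION_JURISDICTIONS = {
    "CCPA": "California",
}


@dataclass(frozen=True)
class Dataset:
    name: str        # hypothetical dataset label
    provenance: str  # jurisdiction where the data was collected
    residency: str   # jurisdiction where the data is stored


def applicable_regulations(dataset):
    """Return regulations triggered by either provenance or residency."""
    return sorted(
        regulation
        for regulation, jurisdiction in REGULATION_JURISDICTIONS.items()
        if jurisdiction in (dataset.provenance, dataset.residency)
    )


# Data collected from Californians and stored in Los Angeles.
in_state = Dataset("web_visits", provenance="California", residency="California")
# Same data after the servers move out of state (Oregon is an assumption).
moved = Dataset("web_visits", provenance="California", residency="Oregon")

print(applicable_regulations(in_state))  # ['CCPA']
print(applicable_regulations(moved))     # ['CCPA']
```

Both calls return the same result because changing residency leaves provenance untouched, which is the point the example company runs into.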
How GDPR impacts data sovereignty and data localization
Three out of four countries have data localization laws requiring the local storage of data collected about their citizens.
China’s localization rules, for example, require in-country storage of all personal data and allow data exports only after a formal security review.
At the other extreme, trade agreements between Canada, Mexico, and the United States prohibit data localization.
The situation in Europe is murkier. Although the European Union’s General Data Protection Regulation (GDPR) does not explicitly require localization, this may be the safest path to compliance.
GDPR allows organizations to export EU citizens’ data provided “adequate protections” exist at the destination. The Court of Justice of the European Union (CJEU) ruled that the United States does not have those protections and struck down data-sharing agreements between the EU and the US government.
Data sovereignty undermines data-driven innovation
Ubiquitous internet access and the power of cloud computing were supposed to make international business operations more efficient. All data would flow into the cloud, which end users could mine for insights that drove better data-based decisions. At the same time, companies could replace local IT infrastructure with more cost-effective cloud computing architectures.
Data sovereignty upends these plans. All data is no longer the same. Companies must handle European, Canadian, Indian, Chinese, and American data differently, implementing data protection and security practices unique to each jurisdiction.
The impact of data sovereignty is not limited to international business. The United States has federal data protection laws for specific industries, such as the healthcare industry’s Health Insurance Portability and Accountability Act (HIPAA), but no overarching protections for American citizens’ data privacy.
America’s lack of a national data privacy regime is one reason for the CJEU’s decision. It also creates a fragmented regulatory landscape within the United States, because states including California, Colorado, Connecticut, Utah, and Virginia have enacted their own data privacy laws.
This national and international regulatory patchwork undermines data-driven decision-making. Companies must develop multiple data strategies to ensure compliance everywhere they do business. Even when data from other regions is accessible, it takes longer to analyze. Data scientists must settle for incomplete data sets for their advanced analytics projects. Ultimately, decision-makers cannot fully leverage their company’s vast data resources for innovative insights.
A gateway for global cross-cloud analytics
Starburst resolves data sovereignty challenges by creating a virtual abstraction layer that unifies disparate data sources into a single point of access. Starburst customers can create clusters with catalogs and data sources specific to each geography. Data always resides at the source, where governance systems can enforce locally appropriate access control policies.
Starburst Stargate links these clusters together to make data globally accessible. When an analyst in the United States generates a query that requires data from the European Union, Stargate pushes the query to the European cluster. There, Starburst applies fine-grained access control policies to ensure the query uses data in accordance with European privacy rules. Data can be aggregated and anonymized so the analyst in America never sees the personal data of European citizens.
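The general pattern described here, sending the query to the data and returning only aggregates, can be sketched in a few lines. This is a conceptual illustration only and does not use or represent Starburst's or Stargate's actual API. The rows, the column policy `PERSONAL_COLUMNS`, and the function `run_aggregate_in_region` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical column-level policy enforced inside the European cluster.
PERSONAL_COLUMNS = {"customer_email"}

# Hypothetical rows stored in the European cluster. In this sketch, row-level
# data is only ever read inside run_aggregate_in_region.
EU_ORDERS = [
    {"customer_email": "a@example.com", "country": "FR", "order_total": 120.0},
    {"customer_email": "b@example.com", "country": "FR", "order_total": 80.0},
    {"customer_email": "c@example.com", "country": "DE", "order_total": 50.0},
]


def run_aggregate_in_region(rows, group_by, measure):
    """Run a grouped sum where the data lives and return only the totals.

    Queries that group by or sum a personal column are rejected, so no
    row-level personal data is included in the result.
    """
    if {group_by, measure} & PERSONAL_COLUMNS:
        raise PermissionError("query touches personal columns")
    totals = defaultdict(float)
    for row in rows:
        totals[row[group_by]] += row[measure]
    return dict(totals)


# The remote analyst receives only the aggregated summary.
summary = run_aggregate_in_region(EU_ORDERS, group_by="country", measure="order_total")
print(summary)  # {'FR': 200.0, 'DE': 50.0}
```

The design choice worth noting is that the policy check runs on the side that holds the data, so the caller's location never has to be trusted with row-level records.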
Starburst customers use Stargate to generate data-driven insights across geopolitical boundaries while honoring data sovereignty principles anywhere they do business.