4 Best practices for developing and scaling data products

Published: May 15, 2023

It seems like everyone is talking about data products these days. They’ve emerged as the go-to solution for organizations that want to maximize the competitive value of their data. But there’s so much hype surrounding the term that you have to wonder if it’s just another buzzword full of sound and fury, signifying nothing.

But there’s more to data products than buzz. The concept is a meaningful step forward in the art and science of data management. Data products help us focus on what stakeholders need from data to recognize opportunities, spot problems, plan strategies, and otherwise contribute to the mission of their organization. In this article, we’ll look at why data products matter in modern analytics and the practical strategies that companies should consider for successful data product implementations.

Defining a data product

“What is a data product, that is the question:
Whether ’tis nobler in the mind to suffer
The slings and arrows of outrageous hype,
Or to take arms against a sea of vague definitions
And by opposing, clarify them”

DJ Patil, U.S. Chief Data Scientist from 2015 to 2017, described a data product in 2012 as “a product that facilitates an end goal through the use of data”. According to this definition, just about anything can be a data product, from a bank statement to a smartwatch. That’s not necessarily wrong, but it’s too broad to be helpful.

A data product is more than a product that uses data. A data product is a reusable data asset designed for a particular use and delivered according to agreed-upon standards and schedules. Let’s unpack this definition. 

  • Data asset. Data assets take many forms such as a data set, an interactive dashboard, a SQL view, an embedded recommendation engine, or a fraud detection model. Regardless of the form, a data asset is at the core of a data product’s value. 
  • Reusable. A data product is reusable in that data consumers, including other systems, can use it in different ways within the scope of its functionality. For example, users can provide request parameters, such as a date range or product categories, to generate results that meet their needs.
  • Designed for a particular use. A data product can’t solve every problem for everyone. Developers and stakeholders work together to address specific business issues and develop a solution iteratively, incorporating feedback along the way.
  • Delivered according to agreed-upon standards and schedules. Stakeholders must be able to trust that a data product will be accurate, timely, and reliable. Therefore, key features of a data product are the guarantees that the delivery team and stakeholders agree upon. 
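
To make the four parts of this definition concrete, here is a minimal Python sketch. It is an illustration only, not any product's API: the class names, the fields, and the toy in-memory rows are all assumptions made for this example.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class DeliveryGuarantee:
    """Standards agreed between the delivery team and stakeholders (hypothetical fields)."""
    refresh_schedule: str      # for example "daily by 06:00 UTC"
    max_staleness_hours: int   # how old the data is allowed to be
    min_completeness: float    # required share of non-null values, 0.0 to 1.0


@dataclass
class DataProduct:
    """A reusable data asset with a stated purpose and delivery guarantees."""
    name: str
    purpose: str               # the particular use the product is designed for
    rows: list                 # the data asset itself (a toy in-memory data set)
    guarantee: DeliveryGuarantee

    def query(self, start: date, end: date,
              categories: Optional[set] = None) -> list:
        """Reuse: consumers pass request parameters to get the slice they need."""
        return [
            row for row in self.rows
            if start <= row["day"] <= end
            and (categories is None or row["category"] in categories)
        ]
```

Two consumers can call `query` on the same product with different date ranges or category sets. That is the sense of "reusable" described above: one asset, many uses within its scope.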

The modern concept of a data product centers on product management techniques that put value delivery, as defined by stakeholders, front and center. It encourages cross-functional collaboration that breaks down silos and bolsters data fluency across the enterprise. 

Next, we’ll explore some best practices for creating value-driven data products.

#1 Getting started with data products: Start small and iterate

When getting started with data products, it’s essential to focus on one or two specific use cases with well-defined scope and business value. By targeting high-impact use cases, organizations generate quick wins and demonstrate the value of data products to stakeholders. This approach also affords organizations the opportunity to experiment with different tools and methodologies and determine what works best for them.

Concentrating on use cases with well-defined scope and business value is a key habit to establish for ongoing data product development. So is an iterative development process that encourages continuous collaboration between developers and business stakeholders. Teams need to work together to refine data products based on usage and feedback. The iterative process often reveals new understanding of requirements for both developers and users and ensures alignment with business needs.

#2 Developing data products: Assemble a multidisciplinary team

A diverse data product team with a mix of business and technical expertise is essential for developing data products. The skills required, depending on the nature of the data product, can include data science, data engineering, data analysis, and data visualization. It’s also critical that the team include members with product management skills and business domain knowledge, the latter best represented by stakeholders.

A larger organization can often assign one or more individuals to a data product team in each of the skill areas illustrated in the diagram below. Smaller companies often need people who bring multiple skills to the team and function in more than one role.

[Figure: Data product team knowledge and skills]

#3 Realizing the value of data products: The data product delivery platform

To realize the potential value of their data products, organizations must enable users to find, understand, access, and trust them. A data product delivery platform is therefore essential. It is a self-service portal that enables users to search or browse for suitable data products, learn about their potential uses, and access them or easily request access. Data consumers must also be able to evaluate product quality and reliability and determine whether they can trust the data for their needs. Therefore, the platform should provide metadata, documentation, and data quality measures so that users understand the context and limitations of the product.
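
As a rough illustration of the "find and understand" part of such a platform, the Python sketch below models a tiny in-memory catalog. `CatalogEntry`, `Catalog`, and their fields are hypothetical names chosen for this example. A real platform would add access requests, documentation pages, and persistent storage.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    """Metadata a consumer needs to judge a data product (hypothetical fields)."""
    name: str
    description: str
    owner: str
    tags: set = field(default_factory=set)
    quality: dict = field(default_factory=dict)  # for example {"completeness": 0.99}


class Catalog:
    """A toy self-service catalog: register products, then search them."""

    def __init__(self) -> None:
        self._entries = {}

    def register(self, entry: CatalogEntry) -> None:
        self._entries[entry.name] = entry

    def search(self, term: str) -> list:
        """Case-insensitive match on name, description, or tags, sorted by name."""
        needle = term.lower()
        matches = [
            entry for entry in self._entries.values()
            if needle in entry.name.lower()
            or needle in entry.description.lower()
            or any(needle in tag.lower() for tag in entry.tags)
        ]
        return sorted(matches, key=lambda entry: entry.name)
```

Keeping the quality measures on the entry itself lets a consumer evaluate trustworthiness at the moment of discovery, before requesting access.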

#4 Making more data available to users: Establishing data governance

Well-crafted data products and a good delivery platform will make more data available to more users. This can be transformative for an organization that has struggled to leverage its data. However, it can also devolve into chaos. So it’s imperative to establish data governance standards for metadata, documentation, and data quality that data products must meet before they’re made widely available. Automating enforcement of these standards is also critical to keep pace with the growing demand for data products. Here are some ways to automate enforcement:

  • Ensuring data quality. Automated data validation, profiling, and cleansing can help identify and correct data quality issues, ensuring that data products are built on accurate, reliable data.
  • Protecting data privacy and security. Automated data masking, encryption, and access controls can safeguard sensitive information, ensuring that data products meet privacy and security regulations.
  • Facilitating compliance. Automated data lineage, audit trails, and policy enforcement can help organizations demonstrate compliance with data regulations and industry standards, minimizing the risk of costly fines and reputational damage.
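
One way to picture automated enforcement is a pre-publication check that runs before a data product is listed. The Python sketch below is a simplified assumption, not a prescribed implementation: the required metadata keys, the 0.95 completeness threshold, the `email` column, and all function names are invented for illustration. The unsalted hash is only a stand-in for a real masking policy.

```python
import hashlib

# Hypothetical governance standards; a real organization would define its own.
REQUIRED_METADATA = ("owner", "description", "refresh_schedule")
SENSITIVE_COLUMNS = {"email"}
MIN_COMPLETENESS = 0.95


def mask(value: str) -> str:
    """One-way mask: replace a sensitive value with a short SHA-256 digest."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()[:12]


def completeness(rows: list, column: str) -> float:
    """Share of rows in which the column holds a non-empty value."""
    if not rows:
        return 0.0
    filled = sum(1 for row in rows if row.get(column) not in (None, ""))
    return filled / len(rows)


def governance_violations(metadata: dict, rows: list, required_columns: list) -> list:
    """Return human-readable violations; an empty list means the checks passed."""
    violations = []
    for key in REQUIRED_METADATA:
        if not metadata.get(key):
            violations.append(f"missing metadata: {key}")
    for column in required_columns:
        score = completeness(rows, column)
        if score < MIN_COMPLETENESS:
            violations.append(
                f"{column} completeness {score:.2f} below {MIN_COMPLETENESS}"
            )
    return violations


def publishable_rows(rows: list) -> list:
    """Return copies of the rows with sensitive columns masked."""
    return [
        {
            key: mask(str(value)) if key in SENSITIVE_COLUMNS and value is not None else value
            for key, value in row.items()
        }
        for row in rows
    ]
```

A publishing pipeline could refuse to list a product whenever `governance_violations` returns a non-empty list, and expose only the output of `publishable_rows` to consumers.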

Along with automation, it’s important to address the people and process aspects of data governance. Clearly define roles and responsibilities for data governance to ensure that all team members understand their part in maintaining data quality, security, and compliance. Encourage a culture of data governance through training, promoting awareness, and rewarding excellence. And finally, monitor and measure key data governance metrics, such as data quality issues identified and mitigated, reliability of data pipelines, and unused data access privileges that can be removed.

Conclusion: Data products are an essential tool for data-driven organizations

Treating data as a product ensures that teams keep stakeholder needs and business value top of mind. The most important things to consider when adopting the data product approach are:

  • Focus on use cases with well-defined scope and business value.
  • Assemble a multidisciplinary data product team with business stakeholder representation. 
  • Develop data products iteratively with frequent opportunities for stakeholder feedback.
  • Invest in a robust product delivery platform that makes data products easy to find, understand, access, and trust.
  • Establish automated data governance to ensure quality and compliance without getting in the way of innovation. 

With these approaches, organizations can harness the power of data products to drive analytics success and maintain a competitive edge in today’s data-driven world.
