Today’s digital world is an expanding frontier of emerging technologies. There are endless innovations that are inspired by data, informed by data, enabled by data, and that create value from data. To keep up with this digital revolution, more and more enterprises are adopting cloud services for a variety of IT functions, to the extent that modern approaches to building and running programs are often described as “cloud-native.” But cloud computing isn’t the final answer; it’s actually just the beginning.

According to Gartner, while only about 10 percent of enterprise-generated data is created and processed outside a traditional data center or cloud, this figure is expected to soar to 75 percent by 2025. The cloud alone simply isn’t efficient enough to keep up with the volume and velocity of data that enterprises will be faced with as time goes on. So what is the missing piece to keeping up? Computing at the edge. I had the pleasure of presenting this paradigm at Datanova for Data Scientists, a summit organized by Starburst.

Computing at the edge, or the intelligent edge, is the analysis of data and the development of solutions at the site where the data is generated. The ability to perform intelligence or knowledge discovery at the point of data collection is now critical in many applications, especially when driven by machine learning. By analyzing data at the source of collection, teams gain faster responses to events seen in the data stream, greater scalability because processing is distributed to the edge of the network, and cost savings from minimizing the bandwidth used. All of this adds up to being able to push new boundaries in analytics and to do more, faster.
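
To make the bandwidth point concrete, here is a minimal sketch, assuming a device that batches numeric sensor readings and sends only a compact summary upstream instead of every raw value. The function name `summarize_window` and the synthetic readings are illustrative assumptions, not part of any particular edge platform.

```python
import json
import statistics


def summarize_window(readings):
    """Reduce a window of raw sensor readings to a compact summary."""
    return {
        "count": len(readings),
        "mean": statistics.fmean(readings),
        "min": min(readings),
        "max": max(readings),
    }


# Hypothetical window of 1,000 temperature readings collected on the device.
raw_window = [20.0 + (i % 10) * 0.1 for i in range(1000)]
summary = summarize_window(raw_window)

# Compare what would travel over the network in each case.
raw_bytes = len(json.dumps(raw_window).encode("utf-8"))
summary_bytes = len(json.dumps(summary).encode("utf-8"))
print(f"raw payload: {raw_bytes} bytes, summary payload: {summary_bytes} bytes")
```

Which statistics to keep is a design choice that depends on what the downstream models need; the summary here is only one possibility.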

One of the most relevant edge intelligence applications for data science is model monitoring. When we deploy models in our environment, we’re deploying them wherever operations are happening, whether that is an ecommerce store, a self-driving vehicle, a camera, a drone, or an assembly line. Embedded in these technologies and processes are models that fuel the AI and determine the action that the AI will take. We need to monitor our models to verify that they’re behaving properly as things change in the world. We call this model monitoring, and what we’re really monitoring is not just the model itself, but the two components of a model: the input (the data that fuels the model) and the output (the thing that we are modeling, which we call the concept). When the input or the output changes, we call that data drift or concept drift, respectively.
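
As one illustration of monitoring the input side, the sketch below computes a population stability index (PSI) between a reference sample and a newer sample of a single numeric feature. PSI is a commonly used drift score, chosen here as an assumption; the article does not prescribe a specific metric. The function name `psi`, the bin count, and the synthetic Gaussian data are all hypothetical.

```python
import math
import random


def psi(reference, current, bins=10, eps=1e-6):
    """Population stability index of `current` relative to `reference`."""
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if all reference values are equal

    def fractions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), bins - 1)  # clamp out-of-range values into the edge bins
            counts[idx] += 1
        # Floor each fraction at eps so the logarithm below is always defined.
        return [max(c / len(values), eps) for c in counts]

    ref_frac = fractions(reference)
    cur_frac = fractions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_frac, cur_frac))


random.seed(0)
reference = [random.gauss(0.0, 1.0) for _ in range(5000)]  # data seen at training time
stable = [random.gauss(0.0, 1.0) for _ in range(5000)]     # same distribution
shifted = [random.gauss(1.0, 1.0) for _ in range(5000)]    # mean has drifted

print(f"PSI, stable input:  {psi(reference, stable):.4f}")
print(f"PSI, shifted input: {psi(reference, shifted):.4f}")
```

A frequently quoted rule of thumb treats PSI below roughly 0.1 as stable and above roughly 0.25 as a significant shift, but those thresholds are conventions and should be tuned per application. Detecting concept drift additionally requires comparing predictions against observed outcomes, which this sketch does not cover.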

Model monitoring for predictive analytics at the edge begins with the input, i.e., the data and how it is collected. I like to say that the Internet used to be a thing, but now things are the Internet. In the Internet of Things, “things” are embedded with sensors, software, and other technologies for the purpose of connecting and exchanging data with other devices and systems. The data that is collected at the edge often needs to be processed in real time in order to fuel predictive modeling or to reveal novel patterns in the data that may inspire questions we didn’t think to ask about the things that we are monitoring.

The output, or the concept that drives decisions and actions, manifests in edge applications. Some examples of edge applications are technologies like drones or self-driving cars, which operate autonomously through software-controlled plans and onboard edge sensors, including GPS. Another example, from the medical field, is the development of carbon nanotubes used in heart stents for people’s arteries. The carbon nanotubes are attached to sensors that measure the condition of the person who has the heart stent. The sensors pick up the patient’s blood pressure, pulse, body temperature, and a number of other reactions in the body to detect whether or not the person is under stress. Based on that information, electric impulses change the shape of the carbon nanotubes, molding them dynamically in real time to that specific patient’s needs. This is an example of using information to create a product, in this case a heart stent, that dynamically changes shape in response to input from these edge sensors. Technologists have a word for this: 4-D printing.

The application of predictive analytics at the edge leads to something called Event Science. Event Science is the application of data science and machine learning methods and algorithms to triaging events: determining which things are most critical for us to respond to and which things we don’t need to respond to. Event Science is part of the measurement, inference, prediction, and steering (taking action) process flow, which draws on all the things that we know as data scientists. We collect different types of data from edge sensors, we process it at the point of data collection to acquire edge intelligence, and then we take action based on it or let it fuel predictive models that may be used by other systems to take action. Triaging the events means that, from all this data that we collect, we apply algorithms and models to determine which patterns in the data are most meaningful: for example, which alerts are anomalies or deviations that require our attention (true alarms), and which ones do not need our attention (false alarms). Getting those alerts right is the goal of Event Science, with the objective of minimizing false alarm fatigue. Data deviations, data drifts, and concept drifts may be a forewarning of some sort of emergent behavior. We apply our machine learning to infer what is going on in the data. We make a prediction. Then, from that prediction, we make a decision on what action to take.
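
The triage step can be sketched in a few lines. The example below assumes the simplest possible rule: score each incoming event by its z-score against a baseline of normal readings, and raise an alert only when the deviation is large. The function name `triage`, the threshold of 3.0, and the sample values are hypothetical; a real system would likely use a learned model rather than a fixed threshold.

```python
import statistics


def triage(baseline, events, threshold=3.0):
    """Label each event 'alert' or 'ignore' by its z-score against the baseline.

    Assumes the baseline has at least two values and is not constant.
    """
    mean = statistics.fmean(baseline)
    stdev = statistics.stdev(baseline)
    results = []
    for value in events:
        z = (value - mean) / stdev
        label = "alert" if abs(z) >= threshold else "ignore"
        results.append((value, round(z, 2), label))
    return results


# Hypothetical readings observed during normal operation.
baseline = [10.1, 9.9, 10.0, 10.2, 9.8, 10.0, 10.1, 9.9]
# New events arriving at the edge device.
events = [10.05, 12.5, 9.7]

for value, z, label in triage(baseline, events):
    print(f"value={value} z={z} -> {label}")
```

Raising the threshold reduces false alarms at the cost of missing smaller true deviations, which is the trade-off behind false alarm fatigue.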

But what kind of messages are we getting, and how do we decipher them? To illustrate this, here are three analytics application examples at the Intelligent Edge:

  1. Sentinel Analytics: The sentinel is the thing that gives you the warning, the alert. It uses sensors and data to watch the things that should be watched. The sentinel is the guard at the guard post who tells you either that things are fine or that we need to sound the alarm.
  2. Precursor Analytics: This is the detection of emergent behavior that produces the early warning signs of risky events, such as data drifts, concept drifts, change points, and emerging trends. When you attach the edge intelligence to the sensor and give people, processes, or products access to those early warnings, you could call the forecasts built from those precursor signals FaaS (forecasting-as-a-service) Analytics.
  3. Cognitive Analytics: This is a sort of “interestingness discovery” application that discovers the right questions you should be asking of your data. For example, when I see an anomaly, a trend, a correlation, or something else that’s changing, what should I be doing? Asking the right questions of your data at the right time and in the right context is Cognitive Analytics.
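
The change points mentioned under Precursor Analytics can be illustrated with a classic technique. The sketch below is a minimal two-sided CUSUM detector, chosen here as an assumption since the article names no specific algorithm. The function name `cusum_change_point`, the `slack` and `threshold` parameters, and the synthetic stream are hypothetical.

```python
def cusum_change_point(values, target, slack=0.5, threshold=5.0):
    """Return the index at which a sustained shift away from `target` is flagged.

    Accumulates deviations beyond `slack` in each direction and reports the
    first index where either running sum exceeds `threshold`, or None.
    """
    pos, neg = 0.0, 0.0
    for i, x in enumerate(values):
        pos = max(0.0, pos + (x - target - slack))  # evidence of an upward shift
        neg = max(0.0, neg + (target - x - slack))  # evidence of a downward shift
        if pos > threshold or neg > threshold:
            return i
    return None


# Hypothetical sensor stream: 20 readings near 10.0, then a sustained jump to 12.0.
steady = [10.0 + (0.2 if i % 2 else -0.2) for i in range(20)]
stream = steady + [12.0] * 20

print("steady stream flagged at:", cusum_change_point(steady, target=10.0))
print("shifted stream flagged at:", cusum_change_point(stream, target=10.0))
```

Because small fluctuations within `slack` never accumulate, the detector stays quiet during normal operation and flags the shift a few samples after it begins, which is the kind of early warning a precursor signal is meant to provide.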

In summary, data is increasingly being generated at the edge and needs to be processed at the edge rather than in the data center, since a real-time response is needed in many applications. There are many ways to look at edge computing and ultimately it is the future of analytics. We aren’t getting rid of the cloud anytime soon, but adding edge computing to your cloud strategy can extend its reach, providing a local point of presence, thus reducing application latency, response time, and bandwidth costs. I hope you’ll consider the potential benefits of edge computing and incorporate it into your data and analytics strategies going forward.

 
