“Real-Time” Data: What Does it Mean?
In data, the term “real time” is difficult to discuss and even more difficult to define. Everyone has their own definition, which makes the conversation interesting, to say the least, especially when that conversation is between technical people and those who are more business-oriented.
To the business person, real time can simply mean “right now,” which leaves a lot of room for interpretation for the technical folks. I think there are basically two forms of real-time problems that we commonly solve with technology. First: traffic lights. Second: the weather.
Traffic lights: Real time or else
If you are approaching a busy intersection, a traffic light changing color is as important as real-time data gets. That data, the change from yellow to red, needs to be accurate and timely, or the consequences are potentially severe. But once our driver/decision maker is through the intersection, the data becomes outdated and irrelevant, unless a record of that data can be useful in another scenario.
Traffic lights are temporal data, which usually requires technology that is extremely specific to the decisions being made. Traffic lights, sensors, and schedules serve one purpose: aiding the decision-making process, and doing so with great efficiency, speed, and accuracy. In the technology world, the words “message”, “event”, and “stream” indicate an opportunity to report real-time data. Doing so requires specialized software and hardware, and all the rules need to be programmed.
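To make “all the rules need to be programmed” a little more concrete, here is a minimal Python sketch of a rule applied to a single traffic-light event. Everything in it is an assumption made up for illustration: the `LightEvent` type, the `MAX_AGE_SECONDS` freshness window, and the `decide` function are hypothetical names, not part of any real traffic system or product. The only point is that an event is acted on right away and ignored once it is stale.

```python
from dataclasses import dataclass

# Hypothetical freshness window: an event older than this is treated as stale.
MAX_AGE_SECONDS = 2.0


@dataclass(frozen=True)
class LightEvent:
    intersection: str
    color: str        # "green", "yellow", or "red"
    timestamp: float  # seconds since an arbitrary epoch


def decide(event: LightEvent, now: float) -> str:
    """Return a driver action for one traffic-light event."""
    age = now - event.timestamp
    if age > MAX_AGE_SECONDS:
        # The driver is already through the intersection; the data is outdated.
        return "ignore"
    if event.color == "red":
        return "stop"
    if event.color == "yellow":
        return "prepare_to_stop"
    return "go"
```

The same red-light event yields “stop” when it is one second old and “ignore” when it is ten seconds old, which is the “real time or else” property in miniature.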
The weather: It’s all relative
The weather, however, invites comparison: the temperature “right now” usually won’t be too far off from when I checked it in the morning (I hope). It gets interesting when compared to other facts, such as the record high or seasonal trends. The decisions made from this type of data usually tolerate some lack of accuracy and/or timeliness. For example: if you arrive at work and it is nice and sunny, there is a good chance that it could be downright hot when you head home. But maybe you know from historical weather information that the rain this time of year usually starts at 4:30 PM, so you ought to pack an umbrella.
The technology solutions for historical comparisons are more generic and somewhat commoditized. Database, data warehouse, data lake, and business intelligence (my favorite) are all terms the public has at least some familiarity with.
Traffic and weather, as you need it
In most cases, as humans, we want to combine the weather and the traffic. We know it’s been snowing for two days, so we will either work from home or take the SUV to work. But, that could change the second you look outside and see that the snow melted overnight. There is a constant tension between the two information sources, and we naturally “figure it out” based on the context.
Because the demands and usages are different, they require equally different technical solutions. “Traffic” data can be stored as facts and dimensions in a Data(base|warehouse|lake|report) storage system, but it takes work to get it there. The Traffic data source has a specific purpose, and its data needs to be transformed before it can be used in a statistical/historical context. That transformation takes time, so the historical view always lags behind the moment we perceive the light changing.
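As a rough sketch of the “work to get it there,” the Python below rolls raw light-change events up into hourly rows that look like a small fact table. The event shape (intersection, color, epoch seconds), the `to_fact_rows` function, and the column names are all hypothetical, chosen only for this example; a real pipeline would have its own schema and tooling.

```python
from collections import Counter
from datetime import datetime, timezone


def to_fact_rows(events):
    """Aggregate raw (intersection, color, epoch_seconds) events into hourly counts."""
    counts = Counter()
    for intersection, color, ts in events:
        # Truncate each event's timestamp to its UTC hour (the time dimension).
        hour = datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m-%d %H:00")
        counts[(intersection, hour, color)] += 1
    # One row per (intersection, hour, color), with the count as the measure.
    return [
        {"intersection": i, "hour": h, "color": c, "changes": n}
        for (i, h, c), n in sorted(counts.items())
    ]
```

Once the events are in this shape they can be compared across hours and days like weather data, but only after the batch has been collected and processed, which is exactly the delay described above.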
The term “real time” is indeed challenging, but with a little context, we can make much better decisions about what it means to us. It seems simple on the surface, and today’s technology landscape helps immensely, but to solve these challenges effectively, we need to understand the nature of what we want to find out, and why we need to know.
Matillion ETL: Helping organizations move closer to “real time” decision making
In order to make fast, accurate decisions, you need shared, secure, analytics-ready datasets. Matillion ETL helps data teams move faster and be more productive with a low-code interface and repeatable workflows that significantly decrease data preparation time.
About the author
Aaron Segesman is a former Solution Architect for Matillion. He has been working with data for 20 years and is passionate about quality data solutions and customer success.