Data Health Continued – Narratives and the Dirty Event
- by Tallgrass
- May 10, 2021
Finding the truth in data is messy, but it is key to the survival of a business.
What is the cost of a question?
The answer is not as simple as you would think. Knowing the correct question is more important than finding the correct answer, right? You would hope so, but in a narrative-driven exchange the answer is assumed, predefined, and sadly even agreed upon. With that, the best way to keep a narrative in place is to not allow the wrong questions to be asked. Simply put, a quantified narrative passes for the truth.
So why are narratives “defendable” and so popular today? Or are they?
There are many ways our processes devolve and lose the cohesion to function efficiently. When this happens, it opens the door to human interpretation and, yes, narratives. Narratives can range from honest, popular assumptions rooted in poorly maintained systems to a complete farce where data is manufactured or even engineered to hide bias. Let’s call this data dirty via the event that created it; in other words, “the dirty event.”
The Dirty Event
When data is created and designed with a bias that betrays the intent it was captured for, it is dirty. Here are some common ways dirty data is made via events:
- Shape the question
- Shape the answer
- Drop-and-add data techniques – numerator madness
- Population controls for known behavior at capture or source
- Population weighting within metrics, typically to correct data bias – true sausage making
- Semantics – capturing data where known narratives do and do not exist
- And many more…
It’s a given that data shouldn’t be blindly trusted. Any decision made from information that cannot cite its source and intent should be dropped. So from this point on, we will examine data bias in a light devoid of “dirty data.” The good news is that the same techniques we use to monitor data event health can be configured to spot data betraying its purpose and biasing results.
There is Hope
Is like-event data created equally? No, and accepting this fact is a solid first step toward acknowledging how best to create systems that align the core moving parts to drive decisions. The premise is simple – build transparency around each and every event that is captured. As data is related to other sources, ensure taxonomies and their intent are rationalized and outputs are traceable. When inequality of output exists based solely on a data’s source within a short amount of time, it should be considered suspect. This, along with other proprietary methods, allows a modern AI to build a simple ruleset.
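To make the premise concrete, here is a minimal, hypothetical sketch in Python of the "suspect by source" rule above. The field names (`source`, `value`) and the deviation threshold are invented for illustration; this is not the proprietary ruleset, just one simple way to flag like-events whose outputs diverge based solely on where they came from:

```python
from collections import defaultdict
from statistics import mean

def flag_suspect_sources(events, threshold=0.25):
    """Flag sources whose like-event outputs diverge from the overall mean.

    events: iterable of dicts with hypothetical keys 'source' and 'value'.
    A source is suspect when its mean output differs from the overall mean
    by more than `threshold` (expressed as a fraction of the overall mean).
    """
    by_source = defaultdict(list)
    for event in events:
        by_source[event["source"]].append(event["value"])

    # Pool every value to establish the cross-source baseline.
    overall = mean(v for values in by_source.values() for v in values)

    suspects = []
    for source, values in by_source.items():
        if overall and abs(mean(values) - overall) / abs(overall) > threshold:
            suspects.append(source)
    return suspects
```

A real system would compare distributions over time windows rather than simple means, but the idea is the same: inequality of output attributable only to the source is the signal.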
Is your organization narrative-based?
Truth requires no narrative. Narratives are distractions and business killers. Look no further than Sears, Kmart, and JCPenney, just to cite a few. What is the opportunity loss of the wrong answer?
Today, all the processes we help optimize and automate have tremendous gravity, and gravity carries real opportunity loss and cost.
This is still a technical discussion, by the way. I’m talking about the questions that aren’t asked, the answers never found, and the micro corrections that never occur. The irony is that leadership that allows an assumed narrative to remain unchallenged is trapped by its own organization. It’s frustrating for them and for us. No more frustrating, though, than for the savvy analysts and mid-managers who live with the unintended consequences.
To optimize is to seek the truth.
A process is optimal when it’s nothing more than a conduit of like function. Here we’ve minimized the friction of its inputs and outputs. Everything that moves takes the path of least resistance to its final state. Nothing is left behind. It’s positive, true, and optimal all at once.
This is one reason why we have gotten so good at costing what is untrue. You don’t have to perform a bottom-up optimization to identify what we have coined “Operational Failure.” Every optimization effort will surface operational opportunities; being able to uniquely define them allows us to monetize their impact as they diminish.
Operational failure is nothing more than a small observable event that is not optimal to operations. Failure might be too strong a word, but we need some gravity here. A single instance is observable; many instances, or a change in velocity over time, are actionable.
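The observable-versus-actionable distinction above can be sketched as a simple windowed rate check. This is a hypothetical illustration with invented thresholds, not our production logic: one failure does nothing, but a pile-up in one window, or a rate that jumps versus the previous window, trips the rule:

```python
def is_actionable(timestamps, window=3600.0, count_threshold=5, velocity_ratio=2.0):
    """Decide whether a stream of operational-failure timestamps warrants action.

    timestamps: sorted event times in seconds.
    Actionable when the latest window holds `count_threshold` or more
    instances, or when the event rate grows by `velocity_ratio` or more
    versus the preceding window of equal length.
    """
    if not timestamps:
        return False
    end = timestamps[-1]
    recent = [t for t in timestamps if t > end - window]
    prior = [t for t in timestamps if end - 2 * window < t <= end - window]

    if len(recent) >= count_threshold:
        return True  # many instances in a single window
    if prior and len(recent) / len(prior) >= velocity_ratio:
        return True  # velocity change over time
    return False  # a lone instance is observable, not actionable
```

Feeding a check like this into a closed loop is what turns isolated observations into prescriptions.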
When used within a closed loop like our Alert – Prescribe – Comply methodology, compliance is automatically achieved.
Again, the cost of the answer is quantifiable, but we also need to look at the technical cost in the next edition.
… more soon.