When my kids were teenagers, we used to play a game at my house that I called “Where are my keys?” My kids never put their car keys in the same place when they came home, which caused a panicked search minutes before they needed to leave for school. My wife is a very organized individual, and she always said, “The keys go in the key bowl. If you put your keys in the key bowl, you will always know where to find them.” Simple but elegant logic. Imagine if we scaled that idea to our neighborhood. All the houses have a key bowl, and all the keys are labeled with the purpose they serve. I could walk into any house, assuming we have a defined security model, and always find the key I need. Scale it to the town, the county, the state, and the country. If we define the model of how we put our data away on a large scale and govern that model, then we improve our ability to find that data when we need it.
Putability greatly improves findability. If that seems like an odd phrase, let me explain. The world is being overrun with data. In 2017, IBM stated that 90% of all data had been created in the previous two years. CIO Magazine estimated that 80-90% of that data is unstructured. This suggests that we are drowning in data and lack the ability to get real knowledge from it. If data is truly the oil of the Enterprise, how do we transform data into knowledge? The formula is simple; the execution is hard. Putability is the work we do to store data. At the point of data generation and all along its journey through the data pipeline, we need to classify, trace, transform, tag, augment, and enrich our data. If we improve the putability of the data, then we increase the findability of that data. When a user needs to discover data, we can contextualize their search. We can use relevancy tuning to see what people close to them have looked at. We help them boil the ocean of data down to the critical elements they need, when they need them. We supply them with relevant, quality data that they can use to complete their assigned tasks. If we control how we put our data away, then we can greatly improve our ability to find the right, accurate data when we need it. Better yet, we can have our data find us when the time is right.
IDC reported in 2016 that workers spend ~30% of their workday looking for or recreating data that already exists. Add to that statistic the likelihood that the recreated data does not match the original, and we start to turn our data lakes into data swamps.
So, what is the answer? I contend it is governance: the keys always go in the key bowl. People cringe when we use the word governance because it implies rigor and control, and they believe it will stifle innovation and freedom. With a proper governance model in place, your organization will see an explosion in its ability to get value from data and to innovate, because data will be turned into knowledge that lets users focus on the work and not on searching for and wrangling data.
Start small and grow. Identify a few projects, not the easiest and not the hardest. Have a solution architect work with the teams to see how they generate the data. Is it machine-generated? Do humans enter data into an application or spreadsheet? How raw is the data? How much contextualization needs to be applied to increase the data quality? Can we trace the data from point of origin through the data pipeline process to the storage location? Tagging, enriching, and classifying data at the point of origin greatly increases the data putability.
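To make the idea of tagging, classifying, and tracing data at the point of origin concrete, here is a minimal sketch in Python. Everything in it is hypothetical and for illustration only: the `DataRecord` shape, the `put_away`, `enrich`, and `find` helpers, and the classification labels are assumptions, not part of any particular tool or of a prescribed model.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class DataRecord:
    """A payload plus the metadata attached as it moves through the pipeline."""
    payload: dict
    source: str            # where the data was generated (hypothetical label)
    classification: str    # assumed labels, e.g. "public", "internal", "restricted"
    tags: set = field(default_factory=set)
    lineage: list = field(default_factory=list)  # ordered trace of pipeline steps


def put_away(payload, source, classification, tags):
    """Tag and classify a record at the point of origin (the 'key bowl')."""
    record = DataRecord(payload, source, classification, set(tags))
    record.lineage.append(("created", source, datetime.now(timezone.utc)))
    return record


def enrich(record, step, extra_tags):
    """Add tags at a later pipeline step and note that step in the lineage."""
    record.tags |= set(extra_tags)
    record.lineage.append((step, record.source, datetime.now(timezone.utc)))
    return record


def find(catalog, wanted_tags, clearance):
    """Return records that carry every wanted tag and that the user may see."""
    wanted = set(wanted_tags)
    return [
        r for r in catalog
        if wanted <= r.tags and r.classification in clearance
    ]


catalog = [
    put_away({"temp_c": 71.5}, "line-3-sensor", "internal", ["telemetry", "plant-a"]),
    put_away({"salary": 1}, "hr-spreadsheet", "restricted", ["hr"]),
]
enrich(catalog[0], "unit-normalization", ["celsius"])

hits = find(catalog, ["telemetry", "celsius"], clearance={"public", "internal"})
```

The point of the sketch is the ordering: because the tags, classification, and lineage are attached when the record is created and enriched, the `find` step reduces to a simple filter.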
The solution architect documents and defines the data generation process across the putability model. They then pivot to how the data needs to be consumed, its findability. How will users expect to discover this data? What context does it need? Who is the audience? What are the security requirements? Will the data be used in reports? Machine Learning models? Application development efforts? Can we trace who is accessing the data? Can we meter and monitor the data consumption, so we do not impact the network?
The solution architect will work across the identified projects to generate a series of patterns and practices for the data generation and consumption models: putability and findability. They will engage with the Enterprise Architecture team to ensure usability across the organization. This begins the creation of a Data Office. Data Stewards take ownership of the rules of putability and findability. As more projects are added to the Data Office, a governing model starts to emerge. Over time, most, if not all, data generated in the organization will be under management. The Data Office will monitor the effectiveness of the data management process through analytics. They will track key metrics like bounce rates, failed searches, and data freshness ratings. Data engineers in the Data Office will work to improve search results and, with them, findability.
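The metrics named above can be sketched as simple ratios. The definitions below are assumptions chosen for illustration: a failed search is one that returns no results, a bounce is a search that returned results but where the user opened none of them, and freshness is the share of datasets updated within a chosen window. The `SearchEvent` type and function names are hypothetical; a real Data Office would settle on its own definitions.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class SearchEvent:
    query: str
    result_count: int     # number of results returned for the query
    opened_result: bool   # did the user open at least one result?


def failed_search_rate(events):
    """Share of searches that returned nothing (assumed definition)."""
    if not events:
        return 0.0
    return sum(1 for e in events if e.result_count == 0) / len(events)


def bounce_rate(events):
    """Share of searches with results where the user opened none (assumed definition)."""
    with_results = [e for e in events if e.result_count > 0]
    if not with_results:
        return 0.0
    return sum(1 for e in with_results if not e.opened_result) / len(with_results)


def freshness_rating(last_updated, now, max_age):
    """Share of datasets updated within max_age of now (assumed definition)."""
    if not last_updated:
        return 0.0
    return sum(1 for ts in last_updated if now - ts <= max_age) / len(last_updated)


events = [
    SearchEvent("keys", 3, True),
    SearchEvent("kyes", 0, False),   # typo in the query, nothing found
    SearchEvent("bowl", 5, False),   # results shown, none opened
    SearchEvent("car", 2, True),
]
now = datetime(2024, 1, 10)
updates = [datetime(2024, 1, 9), datetime(2023, 12, 1), datetime(2024, 1, 5)]

failed = failed_search_rate(events)
bounced = bounce_rate(events)
fresh = freshness_rating(updates, now, timedelta(days=7))
```

Tracked over time, a rising failed-search rate would hint at gaps in tagging, while a rising bounce rate would hint that results are being found but are not relevant.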
The Data Office will begin to document tagging models and curation processes. Taxonomies and ontologies will emerge to improve putability and findability. A data management life cycle will take shape. As the organization works to mature the Data Office, the Data Office will work with other domains in the Enterprise to federate their data. This will create a mesh of knowledge that lubricates the company with the oil of data, improving decision-making and analysis for greater efficiency and effectiveness. Data will morph into wisdom.
The tactical question of how this gets done can paralyze an organization. Teams will start to thrash on tools and approaches. Siloed groups will want to define their own models and will be reluctant to share and hand over their data. They will be willing to tag and classify their data if they can use their own languages to do so. An overarching governance model, empowered and supported by Leadership, is critical to success. An Enterprise Architecture group defined at the organization level can give oversight and guidance to the data management model and ensure its growth and maturity.
Organizations must work to harness the power of their data. Start small and grow into a common model of how you put your data away. This will increase the organization’s ability to discover, find, and get knowledge out of the data, and create a flywheel that accelerates and promotes agility and innovation. Putability equals Findability.