Another day, another data management metaphor... Gartner, for example, cited the “data fabric” as one of their “Top 10 Data and Analytics Technology Trends That Will Change Your Business.” The term refers to a way of organizing data sources to create uniform access, wherever the data may be. We’ll get to the “fabric” imagery in a second. The rise of the data fabric reflects its importance in how businesses work with data to achieve their objectives. Thus, it’s worthwhile to explore data fabrics and what they might mean for your business data. Plus, there might be a better metaphor out there.
What is a Data Fabric?
The data fabric is a metaphor in a long series of such linguistic constructs that attempt to make the abstract nature of computing into something relatable from the physical world. Just as a computer network has nothing to do with nets and a digital file is not a paper file folder, so, too, the data fabric is not a cloth creation. It provides useful symbolism, nonetheless.
A data fabric is a combination of architectures, interfaces (e.g. APIs) and schemas that enable streamlined, high-integrity access to data. The concept arose as data managers began to see that corporate data was being stored in a bewildering variety of repositories, which were often difficult to access. Imagine, if you will, that you were looking at a spray of dots, something like a Jackson Pollock canvas. Each dot represents a place where you keep or stream data for your organization. The dots might signify IoT devices, cloud data stores, on-premises databases, data lakes and so forth.
It’s a big mess, basically. You can’t find what you want, and when you do, it’s hard to connect. Now, visualize a solution that spins a thread between all of these dots, connecting them with you, and with one another. It’s a very dense set of threads. Soon, the threads start to resemble a cohesive fabric. Wherever you touch the fabric, you can instantly access any dot it covers.
In practical terms, a data fabric can be a custom-developed solution or a product. There are different kinds of data fabrics for various use cases, as well. Some are designed to enable smooth data management inside a data center—optimizing the so-called “machine-to-machine” data traffic, which can be larger in volume than human-to-machine data exchanges. Others are meant to facilitate easy access between multiple data analytics platforms and disparate data stores.
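To make the "touch the fabric, reach any dot" idea a little more concrete, here is a minimal Python sketch of a single access layer sitting in front of several data sources. Everything in it is hypothetical and for illustration only: the `DataFabric` class, the source names `iot_sensors` and `crm_db`, and the sample records are assumptions, not part of any real product or of the Gartner definition. A real fabric would add governance, schemas and security on top of this.

```python
from typing import Callable, Dict, Iterable, List

Record = Dict[str, object]


class DataFabric:
    """Toy registry (hypothetical) that exposes many sources behind one read() call."""

    def __init__(self) -> None:
        # Maps a source name to a zero-argument function that yields records.
        self._sources: Dict[str, Callable[[], Iterable[Record]]] = {}

    def register(self, name: str, reader: Callable[[], Iterable[Record]]) -> None:
        """Attach a new "dot" (data source) to the fabric."""
        self._sources[name] = reader

    def sources(self) -> List[str]:
        """List every source the fabric currently covers, in sorted order."""
        return sorted(self._sources)

    def read(self, name: str) -> List[Record]:
        """Read from any registered source through the same interface."""
        if name not in self._sources:
            raise KeyError(f"unknown source: {name}")
        return list(self._sources[name]())


# Illustrative sources only; real readers would call devices, databases or APIs.
fabric = DataFabric()
fabric.register("iot_sensors", lambda: [{"device": "t-1", "temp_c": 21.5}])
fabric.register("crm_db", lambda: [{"customer": "acme", "region": "emea"}])

for name in fabric.sources():
    print(name, fabric.read(name))
```

The design choice worth noticing is that callers only ever see `read(name)`. Where the data lives, and how it is fetched, stays hidden behind the registered reader.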
2019 Market Guide for Data and Analytics by Gartner
The Evolution of the Data Fabric
For context, the data fabric represents another step in the sophistication of data management. It follows the advent of the data warehouse, data lake and operational data store. It also updates the notion of the data hub. Data hubs combine microservices, APIs or Point-to-Point (P2P) data services with distributed data management to cover geographic or domain “regions.” Data fabrics take the whole capability further.
Why Would You Want a Data Fabric?
A data fabric is becoming increasingly necessary because, as Gartner puts it, “Organizations are successfully deploying a greater number of large and more-complex data and analytics implementations than in any previous period.” From their perspective, growth in data management and analytics markets demonstrates the value of the data fabric’s ability to deliver consistent, governed and semantically harmonized information assets.
Citing the connection between data management, analytics and business outcomes, Gartner further notes, “The convergence of data management technologies will enable the rise of common platforms for new and differentiated managed data services exchanges” (i.e. data fabrics). They add, “This is being driven by the need for consumption, modeling and effective visualization of a growing and varied source of information assets.”
Data Fabric Use Cases
Companies are putting data fabrics to work in a growing array of use cases. These include:
- Dynamic data engineering—Data engineers can build a data fabric architecture once and then reuse it as needed. In this process, they act as facilitators of a flexible system, rather than designers of a data integration solution. The fabric can be optimized as a data engineering pipeline that spans multiple clouds and other data sources.
- Governed/trusted data science—Given that data scientists tend to demand an end-to-end lineage and understanding of their algorithms and data models for the sake of efficiency and compliance, a data fabric gives them much-needed data fusion outputs that alert them to expanding data assets. The data scientist gains visibility into active metadata affecting data in their models, e.g. performance optimization, data quality, design and lineage.
- Logical data warehouse architecture—Data fabrics liberate data management from the need to confine data to a physical consolidation and static semantic interpretation.
In our business, we helped a large consumer packaged goods (CPG) company implement a data fabric that enabled them to move from an IT-driven analytics culture to a business-driven culture with faster time to insight and increased business interaction with data engineers and data scientists. This helped them reduce cycle time, take on more projects and rapidly onboard acquisitions into their analytics ecosystem.
Additionally, we worked with a major financial services company on the implementation of an Azure data fabric that increased their agility. Where it once took multiple quarters to deliver new business functionality, the first project using their new data fabric delivered value from requirements to production in under 60 days. This, in turn, enabled a new revenue stream that immediately began delivering ROI to their business.
Done Right, It’s More of a Data Trampoline, Anyway
If we’re going to keep slinging computer science metaphors here, let’s go for it. It’s time to take the data fabric to the next level: the “Data Trampoline.” Unlike threads strung loosely between data source points, the data trampoline is a tightly woven mesh. It lets you spring into action, leaping into death-defying acts of data analytics and business-facing data management. (It’s a metaphor, okay? It’s not completely perfect...)
To be serious, in our experience working with companies on data fabric design and implementation, we’re seeing an ever more important role for metadata. After all, metadata provides for overall comprehension and performance optimization of any data asset used by the organization.
For instance, statistical metadata may relate to rates of data access as well as usage by platform, user and use case. Metadata lets system managers stay on top of the overall system’s physical capacities and the utilization of the infrastructure components. With metadata, it’s possible to track and improve the reliability of data access connections, which seldom rises above 80%, according to Gartner. Actively tracking metadata and including it in the data fabric makes the fabric more relevant and responsive. It enables data scientists and architects to be more dynamic in their setup and use of the fabric. Inclusion of metadata turns the data fabric into the data trampoline, an enabler of previously unheard-of feats of data science.
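As a rough illustration of what tracking this kind of statistical metadata could look like, the Python sketch below counts access attempts and failures per source and derives a simple reliability ratio. The `AccessMetadata` class, the `warehouse` source name and the `flaky_reader` function are all hypothetical, and the reliability formula (successful calls divided by total calls) is an assumption made for this example, not Gartner’s definition.

```python
from collections import defaultdict
from typing import Callable, Dict, Optional


class AccessMetadata:
    """Hypothetical tracker for per-source access counts and failures."""

    def __init__(self) -> None:
        self.attempts: Dict[str, int] = defaultdict(int)
        self.failures: Dict[str, int] = defaultdict(int)

    def call(self, source: str, reader: Callable[[], object]) -> object:
        """Run a reader, recording the attempt and any failure, then re-raise."""
        self.attempts[source] += 1
        try:
            return reader()
        except Exception:
            self.failures[source] += 1
            raise

    def reliability(self, source: str) -> Optional[float]:
        """Share of successful calls for a source, or None if never accessed."""
        attempts = self.attempts.get(source, 0)
        if attempts == 0:
            return None
        return (attempts - self.failures.get(source, 0)) / attempts


def flaky_reader() -> object:
    # Stand-in for a connection that is currently down.
    raise ConnectionError("source unavailable")


meta = AccessMetadata()
meta.call("warehouse", lambda: [1, 2, 3])
meta.call("warehouse", lambda: [4])
try:
    meta.call("warehouse", flaky_reader)
except ConnectionError:
    pass  # The failure is already recorded in the metadata.

print(meta.attempts["warehouse"], meta.reliability("warehouse"))
```

With numbers like these collected continuously, a team could spot which connections drag reliability down and prioritize fixes accordingly.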
We have extensive experience working with companies on the design and implementation of data fabrics. Now, we can even offer the mythical data trampoline, too, by giving you new metadata management capabilities and new forms of data integration. If you want to understand the potential for a data fabric in your business, let’s talk.