In our previous post we explained the need to drive greater business clarity in business data and information capabilities. Here we describe our own practices so you can see if your data architecture practice is truly business focused—or if your organization is merely creating “boxes” to put your data in.
Our data architect “forefathers” provided us with principles, techniques, and methodologies for creating data assets that describe the business in a way the business itself would recognize. For instance, applying data normalization rules to the entity relationship model results in purely defined and described business entities. We should appreciate the ingenuity of entity relationship modeling, which eliminates the possibility of the insert, update, and delete anomalies that plague denormalized databases simply by modeling the business accurately.
We need only apply these principles, techniques, and methods in a slightly more creative way to turn data architecture into a discipline that clarifies the data and information capabilities contained within the company.
The two practices in this article are fundamental to the data architecture discipline. Any other practice is merely creating a “box” for data. We will continue this series with more practices to present a full vision of how world-class data architecture practices can make your data assets work for you. If you stick with us through the series, you will see in our final article how all these practices come together for use in defining, planning, and understanding your business information architecture.
A forthcoming illustrated white paper will explain the practices discussed throughout the series in greater detail.
First Practice: Clarify the data foundation, model the business!
Just modeling physical tables or creating a physical data model isn’t data architecture any more than creating a box to live in constitutes building architecture. The principle that there is a union between design form and function is the foundation on which all design disciplines operate, including data architecture. This principle, when fully understood and acted on, results in ultimate usability and value to clients.
If we do not pay full attention to the business function required of our data design, we will not know if the resulting data solution fulfills any function other than that of a container. It is almost assured that it won’t allow for direct business consumption without significant manipulation of content.
True business logical modeling documents all of the contextual differences in content and relationship rules based on the business narrative. The business narrative is the description of how the business functions, and it provides the only valid context for business data design. Many data modelers just don't have enough connection to the business, or enough understanding of business modeling, to develop the necessary business narrative and model in this fashion.
The hallmark of a proper business logical model is extensive depiction of subtyping to explain the attributes and detailed relationship contexts that apply to business entities. Without this model, use of the data by employees and accurate measurement of performance become difficult, if not impossible.
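As a rough illustration of what subtyping captures, here is a minimal Python sketch. The entities (`Party`, `Person`, `Organization`, `EmploymentRelationship`) and their attributes are hypothetical examples chosen for illustration, not taken from any particular model, and a real business logical model would be expressed in a modeling tool rather than in code.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Party:
    """Supertype: attributes shared by every party the business deals with."""
    party_identifier: str
    party_name: str


@dataclass
class Person(Party):
    """Subtype: attributes that only make sense for an individual."""
    birth_date: date


@dataclass
class Organization(Party):
    """Subtype: attributes that only make sense for a legal entity."""
    tax_registration_identifier: str


@dataclass
class EmploymentRelationship:
    """Relationship rule from the narrative: an Organization employs a Person.

    Typing the two ends as subtypes (rather than as Party) records business
    context that a generic Party-to-Party link would lose.
    """
    employer: Organization
    employee: Person

    def __post_init__(self) -> None:
        # Dataclasses do not enforce annotations at runtime, so check explicitly.
        if not isinstance(self.employer, Organization):
            raise TypeError("employer must be an Organization")
        if not isinstance(self.employee, Person):
            raise TypeError("employee must be a Person")


# Hypothetical sample data.
acme = Organization("ORG-1", "Acme Ltd", "TAX-001")
pat = Person("PER-1", "Pat Example", date(1990, 1, 1))
employment = EmploymentRelationship(employer=acme, employee=pat)
```

The point of the sketch is that the subtype, not the container, carries the business rule: a model with only a generic `Party` table could store the same rows but could not say which relationships are valid.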
The business narrative we capture in the business logical model, which includes relationship phrasing, is precise and detailed. It needs to be, so that we can correctly organize business data into information without many trials and numerous errors. In order to organize information, such as in a dimensional model, we need to understand the business state of data required for information organization. Data, as sourced from many operational systems, does not constitute a business state. The business logical model provides the specification of the required state and the business rules by which we need to organize information.
Second Practice: Clarify the data content, use best practice naming methodology
The principles we use in naming entities and attributes (as expressed through a data architecture's models and metadata) make a huge difference in the volume of information imparted to the data architecture end user.
Every practice Clarity employs as part of logical modeling is intended to maximize the information conveyed by review of the model. Details that cannot be conveyed by the graphical content of the model are found in the model metadata and direct model annotation. With strong graphics, far greater insight can be conveyed through the model than many modelers realize. But it is the naming and the methodology used that set the context for the graphical depiction. Clarity’s naming practices shed the greatest possible illumination on the graphical presentation, ensuring accuracy of the business data definition.
Naming any entity or attribute has to be based on the identity of that entity or attribute. Notice that we don’t say the name establishes the identity, which is the way many modelers think about data identity.
The real basis of data identity is found in the definition that describes the content of the entity or attribute. The name needs to be created to summarize that definition.
Following the same principles, the business description (the major component that sets the specification for the content when we are designing a new system) drives identity, and it too needs to be named in a self-referential, descriptive way.
When naming, we always use three word or phrase components that promote understanding of the three facets of the content needed by the model's end user:
- Context: the subject (the prime word or phrase)
- Aspect: the characteristic of interest (the qualifier word or phrase)
- Function: the purpose from the perspective of a user or user group (the class word or phrase)
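To make the three components concrete, here is a minimal Python sketch that composes an attribute name from a context, an aspect, and a function. The `AttributeName` class and the sample name are hypothetical illustrations, and joining the parts with spaces in that order is an assumption about presentation rather than a stated rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttributeName:
    context: str   # prime word or phrase: the subject
    aspect: str    # qualifier word or phrase: the characteristic of interest
    function: str  # class word or phrase: the purpose for the user

    def __post_init__(self) -> None:
        # Every name needs all three facets; reject blanks early.
        for part_name in ("context", "aspect", "function"):
            if not getattr(self, part_name).strip():
                raise ValueError(f"{part_name} must not be empty")

    def __str__(self) -> str:
        return f"{self.context} {self.aspect} {self.function}"


# Hypothetical attribute: the subject is the customer, the characteristic
# of interest is credit risk, and the functional use is a score.
name = AttributeName(context="Customer", aspect="Credit Risk", function="Score")
print(name)
```

Keeping the three parts separate until the name is rendered makes it easy to review each facet on its own, for example to list every attribute whose function is "Score".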
There are nuances to the way we define each of these components, but the functional component (class words and phrases) is the most misunderstood by data and governance professionals. Class words or class phrases describe the functional use of the attribute, not the data type. We allow the functional use to imply the data type. To be clear, we use class words like “Identifier”, “Ratio”, “Score”, “Rank”, “Percent”, “Factor”, or “Coefficient”, and never use the class word “Number” or “NUM”. Each of the former defines an unambiguous functional use, completing a definition founded in unambiguous context and aspect wording.
We also use “Description”, “Definition”, “Remarks”, and “Note” instead of “String” or “Text” as class words. The former set tells us the functional use of the text contained in the attribute. “String” and “Text” explain only the data type.
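A naming review along these lines could be partly automated. The sketch below is a hypothetical check, not a description of any existing tool. It takes the class words listed above as the approved set and rejects the data-type words named above. The data type each class word implies (`IMPLIED_DATA_TYPE`) is an assumption made for illustration, as is treating the last word of the name as the class word.

```python
# Assumed mapping from functional class word to the data type it implies.
# The class words come from the text; the implied types are illustrative.
IMPLIED_DATA_TYPE = {
    "Identifier": "text",
    "Ratio": "decimal",
    "Score": "decimal",
    "Rank": "integer",
    "Percent": "decimal",
    "Factor": "decimal",
    "Coefficient": "decimal",
    "Description": "text",
    "Definition": "text",
    "Remarks": "text",
    "Note": "text",
}

# Class words that name only a data type, compared case-insensitively.
DATA_TYPE_CLASS_WORDS = {"number", "num", "string", "text"}


def implied_data_type(attribute_name: str) -> str:
    """Return the data type implied by the name's class word.

    Assumes the class word is the last word of the attribute name.
    """
    words = attribute_name.split()
    if not words:
        raise ValueError("attribute name must not be empty")
    class_word = words[-1]
    if class_word.lower() in DATA_TYPE_CLASS_WORDS:
        raise ValueError(
            f"'{class_word}' names a data type, not a functional use"
        )
    if class_word not in IMPLIED_DATA_TYPE:
        raise ValueError(f"'{class_word}' is not an approved class word")
    return IMPLIED_DATA_TYPE[class_word]


score_type = implied_data_type("Customer Credit Risk Score")
```

With this check, a name such as "Order Line Number" is rejected, which prompts the modeler to decide whether the attribute is really an identifier, a rank, or something else.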
In our upcoming white paper, we will explore in greater detail the discipline of proper entity and attribute naming. This discipline drives greater understanding through the names that appear in the model and fosters greater business clarity for the content of the data or information asset.
So far we have established the two fundamental practices of true data architecture. Our next article will examine how we apply these practices to make our physical data model understandable to the end user.
Written by Don Gooldy
Senior principal and data architect with 24 years of database design and system architecture experience, with 15 of those years leading Business Intelligence/Data Warehousing efforts. His solutions architect qualifications are grounded in a foundation of business aligned data architecture fundamentals.