Three-Part Series: Article 2
In the prior article I questioned whether the continued comparison of Relational models to Star Schema models is still relevant with the advent of Snowflake Computing’s cloud-built data warehouse.
We began to explore the origins of the relational model, based on Codd’s recognition that database change anomalies would be eliminated if database tables mirrored individual business functions that data describe.
Here we examine the distinct architectural foundations of both the Business Architecture Model and the Star Schema, to understand the reasons they are used today. This will allow us to determine, in article three, if those reasons are still relevant with Snowflake.
Test drive the data warehouse built for the cloud:
All Data is Relational
Data’s natural and logical organization is relational because the business architecture that data describes is one of relationships between functions of the business, as established by business processes. The business architecture is the pattern, in which the business functions (the actors, agreements, business activities, organizations, etc.) interact with one another in accordance with the company’s business processes.
We model data to emulate this business architecture, relationally, to ensure that our data is not only in harmony with the business architecture, but also to ensure data does not have a consumption bias, limiting its use. A database that is defined to reflect the business architecture has ultimate flexibility of use, allowing the business to effectively understand, measure and control any aspect of the business that data describes.
Data solutions that are conformed to the definition of the business architecture will naturally answer any question about the business functions and those function’s interactions within the business architecture. SQL’s declarative language is designed for just this purpose.
In the relational model, data describes the business functions themselves and the functions’ relationship to one another. The relational model, when designed according to the business architecture, provides the consistent business pattern and foundation needed for any relevant measurement and analytics creation.
The Business Architecture Model vs Information Model
The star schema is NOT a data model at all. It’s an information model. Its purpose is to measure the business function by organizing measurements (facts) of the business function, in association with other relevant business functions (dimensions), giving the measurement context. In the dimensional model, data describes the measurements of the business function.
Because of this, the star schema is incapable of describing the business architecture. It denormalizes data physically, merging business functions with one another, fixing specific consumption pathways. Relationships between business functions, established by business process, are eliminated in favor of direct aggregation and filtering pathways, which creates more bias.
This consumption bias is the source of the star schema’s greatly enhanced performance capability over the relational model on Legacy RDMBS technology. All the processing that must occur to answer business questions of a common consumption perspective against the relational business model is baked into the star schema organization and data processing.
However, the bias that enables this performance also makes the star schema inflexible to answer questions outside of the consumption perspective the star was designed for.
The denormalizations and the accompanying data processing that creates the star schema’s information make the schema resistant to change or extension as well. Transforming operational source content to a true business state of information (not just dimensional shaping of operational data) typically requires significant, complex processing code. Changing these implementations can be so costly that organizations resist updating them to fulfill new or changed requirements.
As a result, Star Schemas often become stove pipe solutions (narrow use with little flexibility or adaptability to new requirements).
In my final article, the technology limits that gave rise to the Star Schema are explained and how the arrival of MPP’s technology began to change the equation of model selection. We conclude with Snowflake's disruption of the MPP market place and why Snowflake Computing has made the debate between business architecture model and star schema irrelevant. Learn more in the case study: Improved performance and scalability for BI and analytics.
Can't wait to read the entire blog series? Skip the line and download the white paper today.
Written by Don Gooldy
Senior principal and data architect with 24 years of database design and system architecture experience, with 15 of those years leading Business Intelligence/Data Warehousing efforts. His solutions architect qualifications are grounded in a foundation of business aligned data architecture fundamentals.
Topics: Data Warehouse Modernization