Three-Part Series: Article 1
A colleague recently asked me to author an article on the differences between a “3NF” model and a star schema model on Snowflake Computing’s cloud-built data warehouse . We should instead examine whether this comparison is still relevant, with Snowflake’s entry into the data warehouse and analytics market.
Snowflake’s query optimization and scaling capability doesn’t favor one of these over the other, unless one considers that Snowflake removes the vertical and horizontal scaling limits of the SMP class server that gave rise to the star schema in the first place.
In this series of three articles we explore the architectural characteristics of these two model types, as well as the role SMP and MPP technology has played in shaping their use. Finally, we see how Snowflake Computing is disrupting both the SMP and MPP markets, completely changing our architectural perspective of these models.
Test drive the data warehouse built for the cloud:
3NF—An Engineering Characterization
“3NF” is a technical characterization of a transactional database’s normalization state. The term leads one to believe that data modeling is a technical engineering exercise.
Relational model or business architecture model better reflects the model’s functional business narrative foundation.
The purpose of normalization is to accurately define and describe the business functions (business entities) that data describes. Yes, normalization is intended to eliminate update and delete anomalies. Codd recognized that database change anomalies would be eliminated if database tables mirrored individual business functions that data describe. Each business entity possesses only attributes that uniquely identify and describe just the singular business function the entity represents. It also holds attributes, describing the entities relationship to other business functions. By creating a database modeled on business in this way, no data redundancy can exist in the database.
Defining data according to the business function it describes, and establishing data’s context within the business architecture, has been the central tenet of relational modeling from its origin. It fulfills Codd’s goals to make data understandable to business, which makes data directly usable by business consumers.
Thanks to updated techniques by James Martin, relational modeling is among the most effective means available to describe and communicate data, according to the business architecture it describes.
Next in this series, we highlight the foundational characteristics of the business architecture model and star schema model, distinguishing them architecturally.
In the final article, we see how MPP server technology has begun to impact the architectural evaluation in choosing between these model solutions. That article completes by showcases Snowflake Computing’s disruption of the MPP marketplace by fully eliminating the need to compromise data solution flexibility, while significantly reducing implementation and administration costs. All while enabling combined functionality of the co-located data warehouse and data lake.
Can't wait to read the entire blog series? Skip the line and download the white paper today.