The data warehouse (DW) has proven to be the main ingredient for business intelligence (BI) and advanced analytics. Its very success, though, has left companies constantly struggling to meet Service Level Agreements (SLAs) while scaling and adapting to additional large internal and external data sources. Despite the business advantages of an integrated view of data across the enterprise, simply running a DW on-premise has become costly and requires constant maintenance and monitoring. The cloud data warehouse has emerged as an attractive alternative: it is easier to manage, scale and modify, and it allows the organization to focus on data and business value instead of maintenance and performance. However, not all cloud data warehouses are equal.
What is a Cloud Data Warehouse?
To understand how a cloud data warehouse works, it is first necessary to grasp the basics of data warehouses in general. A data warehouse is a repository of integrated, historical business data. Its core value is realized when data analysis can be performed without degrading performance or missing SLA requirements.
By having an integrated, historical view of data across enterprise transactions, the data warehouse supports ad-hoc analysis, BI, extracts, analytics and other needs.
Until about five years ago, data warehouses were hosted in on-premise systems. Like any large database, they had their own compute and storage infrastructure. Infrastructure administrators looked after the hardware and network supporting the DW. Database administrators took care of the inner workings, tuning and adjusting as new demands were made, while trying to make sure existing workloads were not impacted. This arrangement worked and provided immense value to the business. But today, there are better options.
A cloud data warehouse puts all of the infrastructure into a public cloud hosting environment. The database software and analytics tools run in the cloud. The database servers, network and storage volumes are all virtualized and available when needed.
Benefits of the Cloud Data Warehouse
The cloud data warehouse, delivered as Software-as-a-Service (SaaS), has several advantages over the on-premise version. The costly infrastructure, support, and maintenance that come with an on-premise model are removed. Teams no longer have to manage and maintain the hardware and software, and tuning and general administration are not required either. With a cloud data warehouse, teams can focus on the data and the value it adds to the business.
An on-premise warehouse requires an initial capital expense (CapEx) for hardware and software. In addition, operating expenses (OpEx) are required to pay employees to maintain the warehouse. In comparison, a cloud-based warehouse costs about one third to one half as much; the cost is incurred “upon use” and is treated as OpEx.
Providers of cloud data warehouses shoulder the CapEx burden themselves and rent the assets to their customers. This is similar to a real estate developer who constructs a building and rents out its apartments. Treating the data warehouse as OpEx allows companies to conserve cash and reduce the upfront investment.
Elasticity is also a benefit of the cloud data warehouse. Available capacity is “limitless”, as far as the customer is concerned. It can be scaled up, or scaled down, as needed and when needed. If the customer needs 10 terabytes of data warehouse storage and additional compute capacity in January for a big analytical project, but only needs 3 terabytes on an ongoing basis, this can be done in the cloud. If that same customer decides they need 100 terabytes and compute capacity to match in July, for some surprise reason (like litigation or compliance), they can also put that capacity online when needed.
In contrast, getting even a fraction of this capacity on premise can require months of effort. Capacity that has been added generally cannot be “removed” because, in most cases, it has been purchased and is owned.
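The elasticity example above can be put into rough numbers. The sketch below is only an illustration: it takes the storage figures from the example (3 terabytes ongoing, 10 in January, 100 in July) and compares the terabyte-months consumed when capacity follows demand with the terabyte-months held when the peak is provisioned for the whole year. The function names are hypothetical, compute is ignored, and no pricing is assumed.

```python
# Monthly storage demand in terabytes, taken from the example in the text:
# 3 TB on an ongoing basis, 10 TB in January, 100 TB in July.
BASELINE_TB = 3
monthly_tb = {month: BASELINE_TB for month in range(1, 13)}
monthly_tb[1] = 10   # January analytical project
monthly_tb[7] = 100  # July surprise (litigation or compliance)


def elastic_tb_months(demand):
    """Terabyte-months used when capacity follows demand each month."""
    return sum(demand.values())


def fixed_tb_months(demand):
    """Terabyte-months held when peak capacity is owned all year.

    Simplifying assumption: purchased capacity cannot be removed.
    """
    return max(demand.values()) * len(demand)


elastic = elastic_tb_months(monthly_tb)
fixed = fixed_tb_months(monthly_tb)
print(f"Elastic: {elastic} TB-months, fixed at peak: {fixed} TB-months")
```

Under these assumptions the elastic model uses 140 terabyte-months over the year, while provisioning for the July peak ties up 1,200. Actual savings depend on the provider's pricing and on how compute is billed.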
Working with the Right Cloud Data Warehouse
The cloud data warehouse, in general, provides advantages to businesses that embrace the cloud model. However, not every cloud data warehouse is equally advantageous. For example, simply lifting and shifting an existing data warehouse into the cloud may not fit the “data architecture” supported by some of the cloud data warehouse software.
Such a mismatch may require a larger spend for migration and more expenditure to meet SLA requirements. For example, depending on the software, a data warehouse moved to the cloud may require the same effort to tune, monitor, and maintain as an on-premise system. Most cloud data warehouse software tools have their roots in on-premise systems. Others were on-premise systems “migrated” to the cloud. In either case, they don’t take advantage of cloud architecture and innovations.
Snowflake addresses these challenges. Snowflake was built specifically to be hosted in the cloud and to take advantage of technology and solutions only available in the cloud. The Snowflake cloud data warehouse architecture features centralized, scaled-out storage. To this, Snowflake adds multiple, independent compute clusters. It is a multi-cluster, shared storage model. With this cloud-native architecture, multiple compute clusters can execute independent workloads simultaneously, making efficient use of cloud resources and managing costs without affecting each other’s performance.
While others may be able to provide separation of compute and storage, Snowflake allows for separation of compute over “shared” storage, a key advantage. This allows for separate, dedicated, “right”-sized computing for each of the functions (e.g., ETL, ad-hoc analysis, BI) that the data warehouse supports. Using cloud storage also provides whatever capacity is required, whether “limitless”, burstable, or permanent.
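As a rough illustration of dedicated, right-sized compute per function, the sketch below generates one Snowflake CREATE WAREHOUSE statement for each workload. The warehouse names, sizes, and suspend timeouts are hypothetical choices made for this example, not recommendations; the statements are only built as strings and are not executed against any account.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    """One function served by its own compute cluster (hypothetical values)."""
    warehouse_name: str
    size: str                  # a Snowflake warehouse size, e.g. 'SMALL'
    auto_suspend_seconds: int  # idle time before the warehouse suspends


def create_warehouse_sql(workload: Workload) -> str:
    """Build a CREATE WAREHOUSE statement for a single workload."""
    return (
        f"CREATE WAREHOUSE IF NOT EXISTS {workload.warehouse_name} "
        f"WITH WAREHOUSE_SIZE = '{workload.size}' "
        f"AUTO_SUSPEND = {workload.auto_suspend_seconds} "
        "AUTO_RESUME = TRUE "
        "INITIALLY_SUSPENDED = TRUE;"
    )


# Assumed sizing: heavier compute for ETL, lighter for ad-hoc analysis.
workloads = [
    Workload("ETL_WH", "LARGE", 60),
    Workload("ADHOC_WH", "SMALL", 120),
    Workload("BI_WH", "MEDIUM", 300),
]

statements = [create_warehouse_sql(w) for w in workloads]
for statement in statements:
    print(statement)
```

Because each warehouse reads the same shared storage, the ETL, ad-hoc, and BI clusters can be sized and suspended independently of one another.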
Clarity Insights is one of Snowflake’s top partners. The partnership with Snowflake spans multiple industries, including manufacturing, financial services, media and retail. Working together, Clarity leverages the Snowflake cloud data warehouse to support clients’ cloud-based modernization efforts.
Written by Ali Sajanlal
Snowflake CoE Lead