A Future State Big Data Solution For a Big Data Giant

Clarity Insights Big Data Consultants Help Improve Performance

Challenge

A System Slowing Everyone Down 

This client had more than 1 billion customers a day interacting with 2 million advertisers on its site. This generated a lot of data: 22 terabytes, to be precise.

As a result, the client has hundreds of petabytes of raw data stored in Hadoop with multiple terabytes of new data ingested every hour. Fast-paced growth and increasing data ingestion were causing data management challenges. In order to scale and meet growing business needs, the client needed a better solution.

 

Solution

Rearchitecting the system

In order to improve the system's performance, Clarity:

  • Architected scalable, end-to-end processes to consume large volumes of complex data from Hive, Scribe and 3P APIs
  • Integrated datasets into Hive, Vertica and 3P APIs
  • Built data integration pipelines for advertising campaigns
  • Debugged Big Data ETL pipelines
  • Developed custom UDFs for data transformation

Outcome

Faster, cheaper, better 

  • Significant cost savings compared with the traditional legacy environment.
  • Development of a next-generation platform for user insight and ad-hoc analytics.
  • Support for EDW-class, structured analytics for multiple petabytes on a multi-node EDW cluster.
  • Sample analytics, including a monetization roadmap to increase demand on the social media platform and insights for increasing revenue from existing customers through media-mix optimization.

Technologies

Vertica
Hadoop
