A Future State Big Data Solution For a Big Data Giant

Clarity Insights Big Data Consultants Help Improve Performance

Challenge

A System Slowing Everyone Down 

This client had more than 1 billion customers a day interacting with 2 million advertisers on its site. This generated a lot of data: 22 terabytes, to be precise.

As a result, the client had hundreds of petabytes of raw data stored in Hadoop, with multiple terabytes of new data ingested every hour. Fast-paced growth and increasing data ingestion were causing data management challenges. To scale and meet growing business needs, the client needed a better solution.

 

Solution

Rearchitecting the system

In order to improve the system's performance, Clarity:

  • Architected scalable, end-to-end processes to consume large volumes of complex data from Hive, Scribe and 3P APIs
  • Integrated datasets into Hive, Vertica and 3P APIs
  • Built data integration pipelines for advertising campaigns
  • Debugged Big Data ETL pipelines
  • Developed custom UDFs for data transformation

Outcome

Faster, cheaper, better 

  • Significant cost savings compared with the traditional legacy environment.
  • Development of a next-generation platform for user insight and ad-hoc analytics.
  • Support for EDW-class, structured analytics for multiple petabytes on a multi-node EDW cluster.
  • Sample analytics, including a monetization roadmap to increase demand on the social media platform and insights to increase revenue from existing customers through media-mix optimization.

Technologies

  • Vertica
  • Hadoop
