BUSINESS SITUATION
We automated a data ingestion application that integrates API-based input sources across third-party and owned data sets
SGA APPROACH
- We wrote API-based Python scripts to extract data from multiple sources, including third-party data such as Nielsen, along with ad sales, campaign promotion, and program schedule data
- We built a staging area in AWS S3 and then loaded the staged data into tables using a unifying process log ID
- We used Python and Spark to process tags and fill in missing show data using fuzzy matching, producing the final consumable tables
- We also built process-monitoring dashboards in Tableau to track the entire data-processing lifecycle
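The fuzzy-match and process-log-ID steps above can be sketched roughly as follows. This is a minimal illustration, not the production code: it uses Python's standard `difflib` in place of Spark, and the function names, the `show` field, and the 0.8 similarity cutoff are all assumptions made for the example.

```python
import difflib
import uuid


def new_process_log_id() -> str:
    # Hypothetical: one ID per ingestion run, so every loaded row can be traced back to it.
    return uuid.uuid4().hex


def fill_show_title(raw_title, known_titles, cutoff=0.8):
    """Return the closest known title, or None when nothing is similar enough."""
    lookup = {title.lower(): title for title in known_titles}
    matches = difflib.get_close_matches(
        raw_title.lower().strip(), list(lookup), n=1, cutoff=cutoff
    )
    return lookup[matches[0]] if matches else None


def tag_rows(rows, known_titles):
    """Attach a cleaned show title and a shared process log ID to each row."""
    log_id = new_process_log_id()
    return [
        {
            **row,
            "show_clean": fill_show_title(row["show"], known_titles),
            "process_log_id": log_id,
        }
        for row in rows
    ]


if __name__ == "__main__":
    schedule = ["Evening News", "Morning Show"]
    for tagged in tag_rows([{"show": "Evning News"}, {"show": "zzzz"}], schedule):
        print(tagged)
```

Rows that find no match keep `show_clean` as `None`, so they can be reviewed rather than silently mapped to the wrong show.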
ENGAGEMENT
We developed APIs that provide stakeholders with data on the channel-switching behavior of live audiences across TV networks and demographics
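As a rough illustration of the kind of figure such an API might return, the sketch below counts network-to-network switches per demographic from a viewing log. The event layout (viewer ID, demographic, timestamp, network) and the function name are hypothetical and chosen only for this example.

```python
from collections import Counter
from itertools import groupby
from operator import itemgetter


def count_switches(events):
    """Count channel switches keyed by (demographic, from_network, to_network).

    events: iterable of (viewer_id, demographic, timestamp, network) tuples.
    """
    counts = Counter()
    # Sort by viewer, then by time, so consecutive events belong to one viewer in order.
    ordered = sorted(events, key=itemgetter(0, 2))
    for _, viewer_events in groupby(ordered, key=itemgetter(0)):
        viewer_events = list(viewer_events)
        for prev, cur in zip(viewer_events, viewer_events[1:]):
            if prev[3] != cur[3]:  # the network changed between two consecutive events
                counts[(cur[1], prev[3], cur[3])] += 1
    return counts


if __name__ == "__main__":
    log = [
        ("v1", "18-34", 1, "A"),
        ("v1", "18-34", 2, "B"),
        ("v1", "18-34", 3, "B"),
        ("v2", "35-54", 1, "A"),
        ("v2", "35-54", 2, "A"),
    ]
    print(count_switches(log))
```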
BENEFITS AND OUTCOMES
- Our Web Scheduler triggers the data ingestion, cleansing, and processing scripts
- The UI team uses the results to build intuitive visualizations and generate insights
- Our solution improved the efficiency of delivering endpoint results
KEY TAKEAWAYS
- We built a data-cleansing layer that automatically cleans the data using built-in process logic tailored to media-domain data
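One way to picture a rule-based cleansing layer like this is as an ordered list of small functions applied to each record. The sketch below is illustrative only: the specific rules, the field names (`network`, `air_time`), and the alias table are assumptions, not the actual process logic.

```python
from typing import Callable, Optional

Record = dict
Rule = Callable[[Record], Optional[Record]]  # a rule returns None to drop the record


def strip_strings(rec: Record) -> Record:
    # Trim stray whitespace from every string field.
    return {k: v.strip() if isinstance(v, str) else v for k, v in rec.items()}


def normalize_network(rec: Record) -> Record:
    # Hypothetical alias table mapping spelling variants to one canonical network name.
    aliases = {"abc network": "ABC", "a.b.c.": "ABC"}
    name = rec.get("network")
    if isinstance(name, str):
        rec = {**rec, "network": aliases.get(name.lower(), name.upper())}
    return rec


def require_air_time(rec: Record) -> Optional[Record]:
    # Drop records that have no air time, since they cannot be placed on a schedule.
    return rec if rec.get("air_time") else None


RULES = [strip_strings, normalize_network, require_air_time]


def cleanse(records, rules=RULES):
    """Apply each rule in order; keep a record only if every rule returns it."""
    cleaned = []
    for rec in records:
        for rule in rules:
            rec = rule(rec)
            if rec is None:
                break
        else:
            cleaned.append(rec)
    return cleaned


if __name__ == "__main__":
    raw = [
        {"network": " abc network ", "air_time": "20:00"},
        {"network": "NBC", "air_time": ""},
    ]
    print(cleanse(raw))
```

Keeping each rule as a separate function makes it straightforward to add or reorder domain-specific checks without touching the driver loop.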