Operations Management for Disney’s Data Foundation Platform
The Challenge: Legacy Data Fragmentation
Before partnering with Info Services, the Disney team faced significant hurdles in unifying diverse data feeds for their digital platforms. The legacy metadata management system was struggling to scale during major content releases, causing delays in data availability across several digital channels.
- Scaling Constraints: The EC2-based metadata store required manual intervention to handle traffic spikes, leading to performance bottlenecks.
- Operational Complexity: Managing fragmented data pipelines across different business units was resource-intensive and prone to configuration drift.
The Solution: Data Foundation & Metadata Mart API
We collaborated with the Disney architecture team to design and build a modern, serverless Data Foundation Platform. This transformation focused on high-throughput ingestion and global metadata accessibility.
- Metadata Mart API: Built using Node.js 18.x on the Serverless Framework, leveraging AWS AppSync for a managed GraphQL interface and AWS Lambda for compute.
- Unified Ingestion: Developed robust Spark-based ingestion pipelines to unify historic and ongoing data feeds into a central repository.
- Global Resilience: Deployed Amazon DynamoDB Global Tables to ensure active-active multi-region support with sub-200ms ingestion latency.
- Snowflake Integration: Designed and implemented downstream data marts in Snowflake to support advanced analytics and reporting.
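To illustrate how a Metadata Mart lookup could be served, the following is a minimal sketch of the logic inside a Lambda function that resolves an AppSync GraphQL field. The field name `getTitleMetadata`, the `TitleMetadata` shape, and the in-memory `store` standing in for the DynamoDB table are hypothetical and are not taken from the actual implementation. A deployed handler would typically be asynchronous and query DynamoDB rather than a local map.

```typescript
// Hypothetical record shape; the real Metadata Mart schema is not described here.
interface TitleMetadata {
  titleId: string;
  name: string;
  updatedAt: string; // ISO 8601 timestamp
}

// Minimal subset of the event AppSync passes to a direct Lambda resolver.
interface ResolverEvent {
  arguments: { titleId?: string };
  info: { fieldName: string };
}

// In-memory stand-in for the DynamoDB table (assumption, for illustration only).
const store = new Map<string, TitleMetadata>([
  ["t-001", { titleId: "t-001", name: "Example Title", updatedAt: "2024-01-01T00:00:00Z" }],
]);

// Routes a GraphQL field to its lookup. Returns null when nothing matches,
// so the caller sees a null field instead of an error.
function handler(event: ResolverEvent): TitleMetadata | null {
  if (event.info.fieldName !== "getTitleMetadata") {
    throw new Error(`Unsupported field: ${event.info.fieldName}`);
  }
  const { titleId } = event.arguments;
  if (!titleId) {
    throw new Error("titleId is required");
  }
  return store.get(titleId) ?? null;
}
```

Keeping the lookup logic separate from the data source makes it straightforward to unit test the resolver without deploying it.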
Managed Services: 24/7 Platform Operations
Once the platform was operational, Info Services took over full responsibility for its maintenance and continuous optimization as a Managed Service Provider.
- Proactive Pipeline Monitoring: We monitor the health of all Spark ingestion jobs and AppSync API metrics around the clock. Using Amazon CloudWatch and CloudWatch Synthetics canaries, we verify end-to-end metadata flow every 5 minutes.
- Change Management & IaC: All environment updates, including WAF rules and IAM policies, are managed through Terraform and AWS CodePipeline. This ensures every change is audited and repeatable.
- Security Governance: We manage secrets centrally via AWS Secrets Manager and perform regular compliance checks using AWS Config to prevent resource misconfiguration.
- Quality Assurance Automation: Our team manages automated QA processes, including security scans and data integrity checks, to ensure the Metadata Mart remains reliable during spikes of more than 10,000 events per minute.
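An end-to-end flow check of the kind described above can be reduced to three steps: write a marker record, read it back through the API, and compare the elapsed time against a latency budget. The sketch below covers only the evaluation step. The names `CanarySample`, `evaluateCanary`, and `shouldAlert` are hypothetical, the code does not use the CloudWatch Synthetics API, and reusing the sub-200ms ingestion latency figure as the probe budget is an assumption.

```typescript
// One probe: when the marker record was written and when (if ever) it became
// readable through the API. All names here are hypothetical.
interface CanarySample {
  writtenAtMs: number;
  visibleAtMs: number | null; // null means the record never appeared
}

type CanaryStatus = "PASS" | "SLOW" | "FAIL";

// Assumption: the probe budget mirrors the sub-200ms ingestion latency target.
const DEFAULT_BUDGET_MS = 200;

// Classifies a single probe against the latency budget.
function evaluateCanary(sample: CanarySample, budgetMs: number = DEFAULT_BUDGET_MS): CanaryStatus {
  if (sample.visibleAtMs === null) return "FAIL";
  const elapsedMs = sample.visibleAtMs - sample.writtenAtMs;
  if (elapsedMs < 0) return "FAIL"; // clock skew or corrupt timestamps
  return elapsedMs < budgetMs ? "PASS" : "SLOW";
}

// Alerts only when the most recent probes are all unhealthy, so a single
// slow run does not page the on-call engineer.
function shouldAlert(history: CanaryStatus[], consecutive: number = 3): boolean {
  if (history.length < consecutive) return false;
  return history.slice(-consecutive).every((status) => status !== "PASS");
}
```

With a 5-minute probe interval, requiring three consecutive unhealthy runs means an alert fires roughly 10 to 15 minutes after a sustained problem begins. A stricter setup could alert immediately on `FAIL` and apply the consecutive rule only to `SLOW`.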
Results & Business Value
The move to a managed operations model has allowed Disney to shift its internal focus from infrastructure maintenance to content innovation.
- High Availability: The platform maintained 99.9% uptime during high-profile holiday releases.
- Operational Efficiency: Automated deployments and proactive monitoring have substantially reduced manual operational tasks compared to the legacy environment.
- Massive Scalability: The serverless architecture now handles heavy metadata ingestion peaks without throttling or system degradation.