MLaaS Platform
Enterprise Machine Learning as a Service platform on AWS (Lambda, API Gateway, SNS, SQS, S3, Athena, RDS, and Apigee) delivering scalable ML model deployment and serving for multiple engineering teams.
- Scalable ML serving
- Event-driven architecture
- Apigee + Logz integration
- Multi-team adoption
Overview
Designed and built a Machine Learning as a Service (MLaaS) platform on AWS that standardizes how ML models are deployed, served, and monitored across Dish Network. The platform provides a unified interface for ML teams to ship models to production without managing infrastructure.
Architecture
The platform is built on an event-driven, serverless architecture:
Sync:   Client → Apigee API Gateway → AWS API Gateway → Lambda (sync inference) → response
Async:  Client → Apigee API Gateway → AWS API Gateway → SQS → Lambda workers
        Lambda workers → S3 (results) → Athena (analytics)
        Lambda workers → SNS (completion notifications)
Synchronous inference: Low-latency requests go through API Gateway → Lambda, with results returned in real time.
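A minimal sketch of what a synchronous inference handler behind API Gateway (Lambda proxy integration) can look like. The `predict` stub and the field names (`features`, `score`) are illustrative, not the platform's actual model interface:

```python
import json

def predict(features):
    # Stand-in for a real model; assume it returns a scored result.
    return {"score": round(sum(features) / (len(features) or 1), 4)}

def handler(event, context=None):
    """Synchronous inference entry point.

    With Lambda proxy integration, API Gateway delivers the request
    body as a JSON string under event["body"] and expects a dict with
    "statusCode" and "body" back.
    """
    try:
        payload = json.loads(event.get("body") or "{}")
        features = payload["features"]
    except (json.JSONDecodeError, KeyError):
        return {
            "statusCode": 400,
            "body": json.dumps({"error": "expected JSON body with 'features'"}),
        }
    return {"statusCode": 200, "body": json.dumps(predict(features))}
```

Keeping the handler a pure function of the event makes it unit-testable without any AWS infrastructure, which supports the test-coverage point under Results.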
Asynchronous inference: High-volume or long-running jobs are queued via SQS, processed by worker Lambdas, and results stored in S3 with SNS notifications on completion.
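The async path can be sketched as a worker that drains an SQS batch, writes each result to S3, and publishes an SNS notification. Here the S3 put and SNS publish are injected as callables (`put_object`, `publish`) so the logic stays testable; in production these would be boto3 client calls. Job fields and the key layout are assumptions:

```python
import json

def process_records(event, put_object, publish):
    """Worker-Lambda sketch for the async inference path.

    `event` is the SQS batch Lambda receives ({"Records": [...]}); each
    record body is a JSON job. `put_object(key, body)` stands in for an
    S3 put, `publish(message)` for an SNS publish.
    """
    completed = []
    for record in event.get("Records", []):
        job = json.loads(record["body"])
        result = {"job_id": job["job_id"], "score": sum(job["features"])}
        key = f"results/{job['job_id']}.json"
        put_object(key, json.dumps(result))  # persist result to S3
        publish(json.dumps({"job_id": job["job_id"], "key": key}))  # notify via SNS
        completed.append(job["job_id"])
    return completed
```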
Data layer: RDS (PostgreSQL) stores model metadata, deployment configs, and job history. Athena provides SQL analytics over inference logs in S3.
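Analytics over the S3 inference logs can be driven through the boto3 Athena client. The table name, partition column, and metric below are illustrative, not the platform's actual schema:

```python
def latency_query(table, day):
    """Build an example Athena query over day-partitioned inference logs.

    Column and table names here are hypothetical.
    """
    return (
        f"SELECT model_name, approx_percentile(latency_ms, 0.95) AS p95_ms "
        f"FROM {table} WHERE dt = '{day}' GROUP BY model_name"
    )

def run_query(athena, database, output_s3, sql):
    """Submit a query via a boto3 Athena client; returns the execution id.

    Athena writes result files to `output_s3`; completion is polled
    separately with get_query_execution (omitted here).
    """
    resp = athena.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_s3},
    )
    return resp["QueryExecutionId"]
```

Passing the client in as a parameter keeps the submission logic testable with a stub in place of a live Athena connection.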
API Management: Integrated Apigee for external API product management, rate limiting, and developer portal. Added Logz for centralized log monitoring.
Results
- Standardized ML deployment across multiple engineering teams
- Reduced model deployment time from days to hours
- Apigee + Logz integration improved API observability and reliability
- Platform handles variable inference workloads with auto-scaling Lambda
- Comprehensive unit test coverage for all ETL and serving components
Tech Stack
Compute: AWS Lambda, EC2
API: AWS API Gateway, Apigee
Messaging: AWS SNS, SQS
Storage: AWS S3, RDS (PostgreSQL)
Analytics: AWS Athena
Monitoring: AWS CloudWatch, Logz
Language: Python, SQL