Enterprise DAB CI/CD Controller
A centralized, governed Databricks Asset Bundle (DAB) deployment system supporting multi-team, multi-environment bundle promotion — the first automated CI/CD framework of its kind at Dish Network.
Multi-team support
Multi-environment promotion
First DAB CI/CD at Dish
TDD adopted team-wide
Overview
Designed and built Dish Network's enterprise Databricks Asset Bundle (DAB) CI/CD controller, a centralized deployment system that gives the entire data engineering organization automated, reproducible, and governed bundle promotion workflows.
Problem
Before this project, Databricks workloads (jobs, pipelines, notebooks) were deployed manually or through ad hoc scripts. There was no standardized way to promote changes across development, staging, and production environments. Each team had its own approach, making cross-team collaboration difficult and production deployments risky.
Solution
Phase 1: Initial DAB Pipeline
Built the team's first Databricks Asset Bundle CI/CD pipeline using GitLab CI/CD. Established the foundation for automated bundle validation, testing, and deployment.
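As a rough illustration of the validate, test, deploy sequence, the sketch below shows what a CI job script could call. It assumes the Databricks CLI is on the runner's PATH and that unit tests live in a `tests/` directory; the function names and the `tests/` path are hypothetical and are not taken from the actual pipeline.

```python
import subprocess
from typing import Callable, List


def bundle_commands(target: str) -> List[List[str]]:
    """Return the commands for one pipeline run, in execution order."""
    return [
        # Check the bundle configuration for the chosen target.
        ["databricks", "bundle", "validate", "-t", target],
        # Run unit tests (assumed to live under tests/).
        ["python", "-m", "pytest", "tests/"],
        # Deploy the bundle to the chosen target.
        ["databricks", "bundle", "deploy", "-t", target],
    ]


def run_pipeline(target: str, runner: Callable = subprocess.run) -> None:
    """Run each stage; a failing stage stops the stages after it."""
    for cmd in bundle_commands(target):
        # check=True makes subprocess.run raise CalledProcessError on a
        # non-zero exit code, so deploy never runs after a failed stage.
        runner(cmd, check=True)
```

Passing `runner` as a parameter keeps the stage ordering testable without a Databricks workspace: a test can substitute a fake that only records the commands.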
Phase 2: Enterprise DAB Controller
Designed and led implementation of a centralized controller that governs DAB deployments across the organization:
- Multi-environment promotion: Automated dev → staging → production promotion with approval gates
- Centralized governance: Single controller manages permissions, environment configs, and deployment policies
- Bundle validation: Pre-deployment validation of bundle structure, permissions, and resource configurations
- Rollback support: Automated rollback to previous bundle versions on deployment failure
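The promotion, approval-gate, and rollback behavior listed above can be sketched as a small state machine. This is a minimal model under stated assumptions, not the controller's actual code: the `Controller` class, the `deploy` and `approved` callbacks, and the choice of which environments are gated are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

PROMOTION_ORDER = ["dev", "staging", "production"]
GATED = {"staging", "production"}  # assumption: these need approval


@dataclass
class Controller:
    deploy: Callable[[str, str], bool]    # (env, version) -> success
    approved: Callable[[str, str], bool]  # (env, version) -> gate passed
    live: Dict[str, str] = field(default_factory=dict)  # env -> last good version

    def promote(self, version: str) -> List[str]:
        """Promote a version through each environment, stopping early
        at a closed approval gate or a failed deployment."""
        log: List[str] = []
        for env in PROMOTION_ORDER:
            if env in GATED and not self.approved(env, version):
                log.append(f"{env}: awaiting approval")
                break
            previous = self.live.get(env)
            if self.deploy(env, version):
                self.live[env] = version
                log.append(f"{env}: deployed {version}")
                continue
            # Deployment failed: redeploy the last known-good version.
            if previous is not None:
                self.deploy(env, previous)
                log.append(f"{env}: failed, rolled back to {previous}")
            else:
                log.append(f"{env}: failed, nothing to roll back to")
            break
        return log
```

In this sketch a failure in one environment halts promotion, so later environments never receive a version that failed earlier in the chain.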
Key Design Principles
Infrastructure as Code: All Databricks resources (jobs, clusters, pipelines) are defined as YAML bundle configs and versioned in Git.
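Because bundle configs are plain data in Git, ordinary code can inspect them before anything is deployed. The sketch below mirrors a minimal `databricks.yml` as a Python dict and runs a few structural checks on it. The bundle name, job name, required target list, and `check_bundle` helper are illustrative assumptions; a check like this would complement `databricks bundle validate`, not replace it.

```python
from typing import Any, Dict, List

# Python mirror of a minimal databricks.yml; all names are illustrative.
BUNDLE: Dict[str, Any] = {
    "bundle": {"name": "example_bundle"},
    "resources": {
        "jobs": {
            "nightly_load": {
                "name": "nightly_load",
                "tasks": [{"task_key": "load"}],
            },
        },
    },
    "targets": {
        "dev": {"mode": "development", "default": True},
        "staging": {},
        "production": {"mode": "production"},
    },
}

# Assumption: every bundle must declare all three environments.
REQUIRED_TARGETS = ("dev", "staging", "production")


def check_bundle(cfg: Dict[str, Any]) -> List[str]:
    """Return a list of structural problems; an empty list means OK."""
    problems: List[str] = []
    if not cfg.get("bundle", {}).get("name"):
        problems.append("bundle.name is missing")
    targets = cfg.get("targets", {})
    for target in REQUIRED_TARGETS:
        if target not in targets:
            problems.append(f"target '{target}' is missing")
    for key, job in cfg.get("resources", {}).get("jobs", {}).items():
        if not job.get("tasks"):
            problems.append(f"job '{key}' has no tasks")
    return problems
```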
Test-Driven Development: Led adoption of test-driven development across multiple projects. Each bundle ships with unit tests for its transformation logic (Python or Scala), and the tests run in isolated Databricks environments.
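A unit test for transformation logic might look like the following pytest-style sketch. The `dedupe_latest` transformation and its row layout are hypothetical examples, and the code uses plain Python structures so it runs anywhere; the project's own tests ran in isolated Databricks environments.

```python
from typing import Dict, Iterable, List


def dedupe_latest(rows: Iterable[Dict]) -> List[Dict]:
    """Keep only the most recently updated row for each id."""
    latest: Dict[int, Dict] = {}
    for row in rows:
        current = latest.get(row["id"])
        if current is None or row["updated_at"] > current["updated_at"]:
            latest[row["id"]] = row
    # Sort by id so the output order is deterministic and easy to assert on.
    return sorted(latest.values(), key=lambda r: r["id"])


def test_dedupe_latest_keeps_newest_row_per_id():
    rows = [
        {"id": 2, "updated_at": "2024-01-01", "value": "a"},
        {"id": 1, "updated_at": "2024-01-02", "value": "b"},
        {"id": 2, "updated_at": "2024-01-03", "value": "c"},
    ]
    assert dedupe_latest(rows) == [
        {"id": 1, "updated_at": "2024-01-02", "value": "b"},
        {"id": 2, "updated_at": "2024-01-03", "value": "c"},
    ]
```

In a test-first workflow the test function is written before `dedupe_latest` exists, fails, and then drives the implementation.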
Results
- Delivered the first fully automated Databricks deployment pipeline at Dish Network
- Reduced deployment time from hours (manual) to minutes (automated)
- Enabled multi-team, multi-environment workflows with a single centralized system
- Established TDD practices adopted across multiple data science projects
Tech Stack
Platform: Databricks, Databricks Asset Bundles (DAB)
CI/CD: GitLab CI/CD
Languages: Python, Scala, YAML
Cloud: AWS (S3, IAM, VPC)
Testing: pytest, ScalaTest