Create Music Group

Lead Data Engineer

Posted 23 Days Ago
Remote
Hiring Remotely in Canada
Mid level
The Lead Data Engineer will design and implement CMG's Medallion 2.0 data platform, collaborating with teams to ensure optimal data architecture and quality.

Established in 2015, Create Music Group is a leading music and entertainment company. The company operates as a record label, distribution company, and entertainment network that generates over 15 billion music streams each month on DSPs. Named #2 on the Inc. 5000 list of America's fastest-growing companies in 2020, the company has grown exponentially by leveraging its owned IP alongside its media and technology platform. It works with superstar artists, major and independent record labels, and global media brands. It operates a number of companies, including Label Engine, one of the largest independent music distribution platforms in the world, with over 75,000 artists and 5,000 label clients; and Flighthouse, a digital entertainment brand focused on Gen Z, with more than 300 million followers across social media. Create Music Group is based in Hollywood, CA and has 400 employees worldwide.


Job Summary

The Lead Data Engineer will play a central role in the buildout of CMG's next-generation data platform. This is a high-ownership role on a small, senior team, working directly with the SVP of Data & AI to design and implement a scalable lakehouse architecture on Google Cloud Storage (GCS) and Databricks, spanning bronze, silver, and gold layers. The role emphasizes domain-driven design, data contracts, and proactive communication with both internal stakeholders and external vendors.


Responsibilities

  • Lead the technical design and implementation of CMG's Medallion 2.0 lakehouse architecture — bronze ingestion, silver transformation, and gold domain layers — built on GCS and Databricks (Delta Lake), with clear data contracts at each boundary
  • Design and manage data pipelines using Astro (Airflow), PySpark, and Delta Live Tables, ensuring reliability and scalability across ingestion and transformation layers
  • Govern the lakehouse using Databricks Unity Catalog — managing access controls, data lineage, and schema enforcement across domains
  • Apply domain-driven design principles to partition and model data domains (e.g., royalty, asset, artist, distribution)
  • Collaborate with the analytics team to ensure the gold layer reflects real business needs — reducing workarounds
  • Coordinate with external vendors (e.g., DataArt) and internal stakeholders across DevOps, product, and analytics
  • Proactively identify architectural risks, data quality issues, and dependency blockers with proposed resolutions
  • Maintain clear, impact-first documentation and status updates for both technical and non-technical stakeholders
  • Other duties as assigned

Qualifications

  • 4+ years of data engineering experience, with at least 1–2 years focused on data platform or lakehouse architecture
  • Hands-on experience with Databricks — including Delta Lake, PySpark, and ideally Unity Catalog
  • Experience with GCS or equivalent cloud object storage as a lakehouse foundation layer
  • Hands-on experience with domain-driven design applied to data modeling
  • Strong command of SQL and at least one transformation framework (dbt preferred)
  • Experience with medallion or lakehouse architectures (bronze/silver/gold or equivalent)
  • Familiarity with GCP-native tooling — Pub/Sub, Dataflow, or Dataplex a plus
  • Excellent written communication — able to write design docs non-engineers can understand and status updates executives can act on
  • Demonstrated ability to work independently in ambiguous environments
  • Track record of flagging risks early with proposed solutions

Nice to have: Experience in music/media/entertainment data; familiarity with data contracts or schema validation (Unity Catalog, Great Expectations, dbt tests); experience with external dev vendors


Pay Scale

  • $120,000 - $150,000 CAD per year
  • The final compensation within this range will be determined based on the candidate’s experience, skills, and overall fit for the role.

Top Skills

BigQuery
Dataflow
Dataplex
dbt
GCP
Pub/Sub
SQL
