ML Data Engineer (Feature Pipeline & ETL)

Posted 2 Days Ago
Canada
Mid level
Security • Software • Cybersecurity
The Role
The ML Data Engineer will develop and maintain feature engineering pipelines using Databricks, manage ETL processes, and ensure optimized data quality and performance. Responsibilities include supporting the ML lifecycle, designing low-latency data pipelines, and collaborating with teams to align data transformation efforts with business needs.
Summary Generated by Built In

Role Overview:

McAfee is seeking a skilled ML Data Engineer to join our Consumer ML team, specializing in creating robust feature engineering ETL pipelines tailored for machine learning applications. This role requires hands-on experience with Databricks, a solid understanding of the medallion architecture, and expertise in developing, deploying, and managing scalable data pipelines for low-latency model serving.
The ideal candidate will also have experience supporting the end-to-end ML lifecycle, including model training and experiment tracking, with MLflow experience as a strong asset. As part of our AI and Machine Learning team, you will be instrumental in enabling advanced analytics and delivering personalized user experiences.
This is a remote position based in Canada. We will only consider candidates in Canada and are not offering relocation assistance at this time.

About the role:

  • Feature Engineering & Data Integration: Develop and maintain end-to-end ML feature engineering pipelines using Databricks, ensuring data is consistently structured to support ML models effectively.
  • Pipeline Development & Management: Integrate diverse data sources (clickstreams, user behaviour, demographic data, etc.) and tailor data integration processes to optimize data quality and performance.
  • Medallion Architecture Expertise: Build ETL/ELT pipelines that follow the bronze, silver, and gold layers of the medallion architecture, ensuring efficient data structuring for ML workflows.
  • Model Training & Experiment Tracking: Support ML model training and calibration through optimized data pipelines, using MLflow for experiment tracking, model versioning, and performance monitoring.
  • Query Optimization & Low Latency Pipelines: Design and implement optimized queries and low-latency data pipelines to support real-time and batch model inference in production.
  • CI/CD & Deployment: Apply CI/CD best practices to ensure smooth and efficient pipeline deployments, with automated testing for consistent performance.
  • Data Governance & Compliance: Ensure pipelines meet security and compliance standards, particularly for PII, and manage metadata and master data across the data catalogue.
  • Collaboration: Work closely with data scientists, data stewards, and other teams to align data ingestion and transformation efforts with business requirements.

About you:

  • Experience: Minimum 4 years in data engineering, focusing on ML feature engineering, ETL pipeline development, and data preparation for machine learning.
  • Databricks & Medallion Architecture: Proven expertise in managing ETL/ELT pipelines on Databricks, with a solid understanding of the medallion architecture.
  • ML Lifecycle & MLflow: Familiarity with the ML lifecycle and experience using MLflow for model training, calibration, and experiment tracking are highly desirable.
  • Spark & Big Data Technologies: Advanced skills in Apache Spark for big data processing and analytics.
  • Programming & Querying: Strong skills in Python for data manipulation and in SQL for query optimization and performance tuning.
  • Low Latency Data Pipelines: Experience in building and optimizing pipelines for low-latency model inference and serving in production environments.
  • CI/CD & System Integration: Familiarity with continuous integration and deployment practices for ETL/ELT pipeline development.
  • Data Pipeline Management: Expertise in managing data pipelines, ensuring adherence to security, compliance, and best practices.
  • Metadata & Master Data Management: Competency in managing metadata and master data within a technical data catalogue.
  • You are a detail-oriented ML Data Engineer passionate about building scalable, efficient data pipelines tailored for machine learning.
  • You thrive in a collaborative environment, working effectively with cross-functional teams to drive data-driven insights and personalized solutions.
  • You are proactive in troubleshooting, monitoring, and optimizing data pipelines to support high-performance ML models in production.


Company Overview

McAfee is a leader in personal security for consumers. Focused on protecting people, not just devices, McAfee consumer solutions adapt to users’ needs in an always-online world, empowering them to live securely through integrated, intuitive solutions that protect their families and communities with the right security at the right moment.

Company Benefits and Perks:

We work hard to embrace diversity and inclusion and encourage everyone at McAfee to bring their authentic selves to work every day. We offer a variety of social programs, flexible work hours and family-friendly benefits to all of our employees.

  • Bonus Program
  • Pension and Retirement Plans
  • Medical, Dental and Vision Coverage
  • Paid Time Off
  • Paid Parental Leave
  • Support for Community Involvement

We're serious about our commitment to diversity, which is why McAfee prohibits discrimination based on race, color, religion, gender, national origin, age, disability, veteran status, marital status, pregnancy, gender expression or identity, sexual orientation, or any other legally protected status.

Top Skills

Python
The Company
Ottawa, ON
7,996 Employees
On-site Workplace

What We Do

McAfee is a global organization with a 30-year history and a brand known the world over for innovation, collaboration and trust. McAfee’s historical accomplishments are founded upon decades of threat and vulnerability research, product innovation, practical application and a brand that individuals, organizations and governments have come to trust.
