Maven Robotics

ML Infrastructure Engineer

Posted Yesterday
In-Office or Remote
Hiring Remotely in CA
Senior level
Design, build, and operate ML infrastructure powering data, compute, artifacts, and orchestration across cloud and on-prem. Own backend services, storage, observability, security, and developer tools; collaborate with cloud/compute providers and lead reliability and scaling efforts.
The summary above was generated by AI

Company Overview

Maven Robotics is building the world’s leading general-purpose robots and providing physical AI solutions for the most challenging industrial autonomy tasks.

Operating in stealth, we are assembling a team of world-class innovators who think from first principles. Our mission is to achieve human-level task success rates in complex environments, even when faced with limited fine-tuning data or evolving robotic hardware. We value unwavering truth-seeking, humility, and relentless determination.

Role Description

We are looking to recruit an exceptional Infrastructure Engineer to own and build the backend systems that power machine learning at Maven Robotics. In this role, you will design and scale the core infrastructure used by our AI and robotics teams to manage data, run compute workloads, store artifacts, monitor systems, and support rapidly growing engineering workflows.

You should be excited about distributed systems, backend services, data infrastructure, GPU compute, and high-reliability internal platforms. The ideal candidate has successfully built and operated similar systems before and can independently drive complex infrastructure projects from architecture through production operation. The underlying systems may be sophisticated, but the interfaces and workflows they expose should be reliable, intuitive, and easy for engineers to use.

In this role you will:

  • Own the architecture, implementation, reliability, and evolution of Maven's machine learning infrastructure.
  • Build backend services and platforms for managing data, artifacts, jobs, logs, metadata, and compute resources across cloud and on-premise environments.
  • Design scalable systems for workload orchestration, storage, observability, security, and infrastructure automation.
  • Build intuitive internal tools and abstractions that make complex infrastructure easy for engineers to use.
  • Lead technical and commercial discussions with cloud and ML compute providers, including capacity planning, performance, reliability, and cost.

Qualifications

Must-have:

  • Significant experience designing, building, and operating production backend, distributed, or compute infrastructure.
  • A track record of independently owning complex infrastructure projects from architecture through deployment and ongoing operation.
  • Strong programming ability in Python, Go, Rust, C++, or a similar backend or systems language.
  • Experience operating GPU compute infrastructure and orchestrating distributed workloads using Kubernetes, Ray, ZenML, or similar systems.
  • Experience designing and operating storage systems, observability platforms, infrastructure-as-code, and secure access controls.
  • Experience managing large-scale GPU fleets or hybrid cloud and on-premise compute environments.
  • Experience building internal developer platforms, CLIs, SDKs, or other self-service infrastructure tools.
  • Strong technical judgment, leadership, and communication skills, with the ability to drive decisions across teams and external partners.
  • Self-starter attitude with the ability to identify priorities and deliver durable solutions in a fast-paced startup environment.

Nice-to-have:

  • Familiarity with GPU architecture, accelerator-aware software design, and profiling compute-intensive workloads.
  • Exposure to infrastructure supporting large-scale robot learning workloads, including policy training, simulation, and multimodal data pipelines.
  • Familiarity with SOC 2 controls, security practices, and audit readiness.

