Interface AI

Lead DevOps / Platform Engineer

Posted 12 Days Ago
In-Office
7 Locations
Senior level
Design and maintain AI platform infrastructure, focusing on reliability, observability, and developer experience, while managing complex workloads and automation.
The summary above was generated by AI

Banking is being reimagined, and customers expect every interaction to be easy, personal, and instant.

We are building a universal banking assistant that millions of U.S. consumers can use to transact across all financial institutions and, over time, autonomously drive their financial goals. Powered by our proprietary BankGPT platform, this assistant is positioned to displace age-old legacy systems within financial institutions and own the end-to-end CX stack, unlocking a $200B opportunity and potentially replacing multiple publicly traded companies.

Ultimately, our mission is to drive financial well-being for millions of consumers.

With over two-thirds of Americans living paycheck to paycheck, 50% holding less than $500 in savings, and only 17% financially literate, we aim to put financial well-being on autopilot to help solve this problem.


About the Role

We are looking for a Lead Platform Engineer to design, build, and evolve our core AI platform infrastructure. This role is at the intersection of software engineering, infrastructure automation, and platform reliability, enabling product and AI teams to ship faster with confidence.

You will design developer-facing platforms, define standards for reliability and observability, and help scale complex workloads like LLM orchestration, vector databases, and event-driven systems.

This is a hands-on role where you’ll shape the foundational components that power our multi-product ecosystem — Sphere (Voice AI), Orbit (Chat AI), and Nexus (Employee Copilot).

What You’ll Do
  • Platform Architecture: Design, implement, and maintain core platform services and internal APIs for scalable, multi-tenant workloads.
  • Developer Experience: Build internal developer platforms (IDP) that streamline CI/CD, environment provisioning, and observability across teams.
  • System Reliability: Architect for fault tolerance, auto-scaling, and zero-downtime deployments for distributed microservices and AI pipelines.
  • Infrastructure as Code: Own and extend Terraform/Crossplane configurations to standardize provisioning across environments.
  • Performance & Observability: Implement deep observability (OpenTelemetry, Prometheus, Grafana) for tracing, metrics, and proactive alerting.
  • Service Orchestration: Manage Kubernetes, Helm, and service mesh (Istio/Linkerd) to ensure secure and efficient service communication.
  • Platform APIs: Build and evolve backend services in Go/Node.js/Python for internal orchestration, configuration, and workload routing.
  • AI Platform Integration: Collaborate with AI teams to optimize LLM workflows, caching strategies, and retrieval pipelines for low-latency inference.
  • Automation: Write high-quality scripts/tools in Python/Go to automate operational tasks, resilience testing, and rollout management.
  • Cross-Functional Partnership: Work with Product, DevOps, and Security to ensure every platform capability meets performance, compliance, and reliability goals.

What You’ll Bring
  • 6–9 years of engineering experience, with at least 3 years in platform, infrastructure, or DevOps-heavy roles.
  • Strong proficiency in at least two backend languages (Go, Node.js, or Python).
  • Hands-on experience with Kubernetes, Helm, Terraform, and declarative infrastructure management.
  • Deep understanding of distributed systems, container orchestration, and microservice communication.
  • Proficiency in AWS cloud architecture (EKS, S3, RDS, Lambda, IAM, VPC).
  • Proven experience with observability and tracing systems (OpenTelemetry, Prometheus, Grafana).
  • Experience with CI/CD pipeline design (Jenkins, GitHub Actions, ArgoCD, GitOps workflows).
  • Exposure to AI/ML or data-intensive systems, including model serving, vector databases, or RAG pipelines.
  • Knowledge of networking, service mesh, and security controls in production-grade environments.
  • Strong debugging and performance tuning skills; ability to reason about failure modes and resilience.
  • Excellent collaboration skills — able to partner with developers, product managers, and AI researchers effectively.

Why Join Us
  • Build core platform systems that power one of the fastest-growing AI companies in fintech.
  • Shape developer experience, infrastructure standards, and reliability practices for an AI-first ecosystem.
  • Collaborate with top-tier engineers, AI researchers, and architects on large-scale distributed systems.
  • Work in a high-trust, fast-growth environment where innovation meets real-world impact.

Compensation

  • Compensation is expected to be between $170,000 and $200,000. Exact compensation may vary based on skills and location.

What We Offer

  • 💡 100% paid health, dental & vision care
  • 💰 401(k) match & financial wellness perks
  • 🌴 Discretionary PTO + paid parental leave
  • 🏡 Remote-first flexibility
  • 🧠 Mental health, wellness & family benefits
  • 🚀 A mission-driven team shaping the future of banking

At interface.ai, we are committed to providing an inclusive and welcoming environment for all employees and applicants. We celebrate diversity and believe it is critical to our success as a company. We do not discriminate on the basis of race, color, religion, national origin, age, sex, gender identity, gender expression, sexual orientation, marital status, veteran status, disability status, or any other legally protected status. All employment decisions at interface.ai are based on business needs, job requirements, and individual qualifications. We strive to create a culture that values and respects each person's unique perspective and contributions. We encourage all qualified individuals to apply for employment opportunities with interface.ai and are committed to ensuring that our hiring process is inclusive and accessible.

Top Skills

ArgoCD
AWS
GitHub Actions
Go
Grafana
Helm
Jenkins
Kubernetes
Node.js
OpenTelemetry
Prometheus
Python
Terraform
