nesto

Senior SRE Developer

Posted 2 Days Ago
Remote
Hiring Remotely in Canada
Senior level
Drive SRE initiatives to improve platform reliability, performance, and automation. Build observability (Datadog), enhance CI/CD and infra-as-code (Pulumi, ArgoCD), guide teams on SLOs and incident response, participate in on-call rotation, and collaborate on design and capacity planning for a cloud-native mortgage platform.
The summary above was generated by AI

Our mission is to provide a positive, empowering, and transparent property financing experience that is simple from start to finish. Our team consists of skilled technology experts, caring mortgage specialists, and a diverse marketing team, all working together to lead change in the mortgage industry.

At nesto, we're proud of:

  • Our clients love our positive, empowering, and transparent mortgage financing experience. Our 4.5-star Google reviews speak for themselves!
  • We won the 2023 & 2024 CLA Lender of the Year award, recognizing our excellence in lending services.
  • We are a B Corp certified organization, highlighting our dedication to making a positive impact on our society and our planet.
  • Our highly skilled, diverse, and collaborative team makes everything possible!
  • Our Mortgage Cloud platform gives financial institutions full access to nesto's proprietary technology, powering a better client experience from start to finish.

About

Since the beginning, we've been committed to creating a modern, cloud-native tech platform (based on Google Cloud) that adheres to the industry's highest development and operations standards. Our front-end is written in TypeScript with React. Our back-end is written in Go and consists of loosely coupled, Docker-containerized microservices communicating via RESTful APIs and pub-sub queues. These containers are orchestrated using Kubernetes and monitored with APM via Datadog. Our CI/CD pipelines are automated using GitHub Actions, ArgoCD, and our own open-source config management tool, Joy. All infrastructure configuration is managed as Infrastructure as Code via Pulumi (similar to Terraform, but using a rich and extensible programming language and API; in our case, TypeScript), ArgoCD, and Crossplane.


We strive to implement the best emerging DevOps practices, such as GitOps, Internal Developer Platforms, and Infrastructure as Code. We believe that DevOps goes beyond the mere automation of pipelines; it should encompass all areas of the business, empowering its actors and stakeholders to actively contribute to the organization's overall success.


We are looking for a passionate Site Reliability Engineering (SRE) Developer who shares our core principles and wishes to join our dedicated Platform team. You’ll collaborate closely with our development teams to help ensure system stability and performance through various initiatives as we continue to scale our modern mortgage platform of the future.


What you'll be doing

You will apply software engineering principles to operations, focusing on the core SRE objectives of stability, performance, and automation. Crucially, you will act as a force multiplier by empowering our development teams.

  • Service Reliability & Observability: Drive SRE efforts to enhance the platform's overall resilience, uptime, performance, and stability.
    • Design and implement end-to-end active monitoring, alerting, and detection of issues using tools like Datadog.
    • Gather, process, and analyze metrics, logs, and traces from systems and applications to assist in performance tuning and fault finding.
    • Continuously improve our system's performance and scalability, including investigating and implementing solutions to current system bottlenecks (e.g., sharding strategy, database architecture, microservice communication).
  • Automation & Developer Velocity: Create sustainable systems and services through extensive automation to eliminate operational bottlenecks and reduce toil.
    • Improve our CI/CD pipelines (GitHub Actions) and deployment flow (Kubernetes, ArgoCD) by increasing application and infrastructure observability.
    • Enhance our infrastructure automation (Pulumi, TypeScript, Crossplane) and mature our cloud infrastructure to ensure it is secure, compliant, and cost-effective.
  • Knowledge Sharing & Developer Empowerment: Foster a culture of quality and ownership by enabling developers to manage their services with confidence.
    • Guide and train development teams on best practices for observability, alerting configurations, and defining Service Level Objectives (SLOs) for critical services.
    • Improve the clarity and adoption of our incident management and analysis process.
    • Support developers during on-call and critical situations, transferring knowledge so they can provide timely assistance and remediation solutions during production incidents.
  • Cross-Team Collaboration & Platform Ownership: Work closely with development teams to integrate operational considerations and consult on system design, capacity planning, and architectural conversations.
    • Participate in system design consulting, platform management, and capacity planning.
    • Provide primary operational support and engineering for large-scale distributed software applications.
    • Take part in the on-call rotation for Platform engineering.

Who we’re looking for

  • You have 5+ years of relevant technical experience with a significant portion in a Site Reliability Engineering, DevOps, or Production Engineering role.
  • You have deep experience in cloud infrastructure with any of the top cloud providers, ideally Google Cloud and/or AWS.
  • You are proficient in programming/scripting languages, with experience in Go (Golang) and/or TypeScript/JS being highly desirable.
  • You are an expert with key cloud-native technologies: Kubernetes, Docker, Helm, ArgoCD, and Infrastructure as Code (Pulumi/Terraform).
  • You have hands-on experience with monitoring tools like Datadog or similar systems.
  • You have prior experience working within a high-growth B2B or SaaS business model.
  • You possess exceptional interpersonal, written, and oral communication skills that support effective collaboration across engineering teams.
  • You possess a highly analytical mindset, with the ability to see both the big picture and the small details, and thrive on collaboration.
  • Ideally, you have knowledge or interest in security best practices such as vulnerability scanning and intrusion detection.

The Reward

  • The A-Team: Work alongside high-performing talent in the industry.
  • Accelerated Growth: The slope of your learning curve here will be vertical. You will touch more production systems in one year than you would in five years at a bank.
  • Top-Tier Coverage: Premium benefits plan fully paid by nesto, including comprehensive insurance and unlimited access to telemedicine and mental health services for you and your family.
  • Rest & Recharge: 4 weeks of vacation to ensure you stay at peak performance.
  • Best-in-Class Tools: Access to the resources and tech you need to execute without friction.
  • Working Framework: The environment that makes you productive and enables teamwork (Hybrid model).

Diversity and Inclusion

At nesto, we believe that creativity and collaboration are the result of a diverse team. We are committed to fostering a culture of diversity, equity, inclusion, and belonging, and we strongly encourage women, people of color, LGBTQIA+ individuals, and individuals with disabilities to apply. We are committed to creating a workplace that is inclusive and welcoming to all.


Top Skills

TypeScript, React, Go (Golang), Docker, Kubernetes, Helm, ArgoCD, GitHub Actions, Datadog, Pulumi, Terraform, Crossplane, Joy, Google Cloud (GCP), AWS, REST, Pub/Sub
