About Us
Outspeed is creating the most lifelike conversational voice systems to augment human-computer interaction. We are building the infrastructure and tools to unlock applications in therapy, coaching, companionship and gaming.
Outspeed is led by a team of researchers and engineers with experience at MIT, Google, and Microsoft. Our team is based in San Francisco and values empathy, deep technical knowledge, and autonomy.
Janak earned his bachelor's and master's degrees from MIT and built grid-scale AI algorithms and infrastructure at Autogrid (acquired by Schneider). Sahil is a published researcher who led AI infrastructure efforts at Google and Microsoft.
Role Description
As a Founding Engineer, you will play a pivotal role in shaping the company's technical direction.
You will train state-of-the-art TTS models, experiment with new architectures for prosody, expressiveness, and multilingual support, and optimize inference for ultra-low latency and scalability. You will measure and push boundaries on speech quality (e.g., MOS, intelligibility, naturalness) while engaging directly with customers to integrate feedback into product improvements. You will also contribute heavily to the architectural roadmap, ensuring our systems are robust, flexible, and future-proof.
This role requires a proactive individual comfortable working in a fast-paced environment with frequently shifting priorities.
Benefits
Competitive salary + equity
Health, dental, and vision insurance
Bonuses based on performance
We are a company founded by immigrants, and we are committed to providing support to immigrant workers throughout their journey.
Requirements
Experience in end-to-end ML application development, including data engineering, model tuning, and model serving
Strong expertise in TTS models, with prior experience training or fine-tuning large-scale models for speech synthesis
Familiarity with speech quality evaluation (MOS, intelligibility, expressiveness, latency)
Prior experience with audio processing pipelines, vocoders (e.g., HiFi-GAN, WaveRNN), or speech-to-speech systems
Strong understanding of ML development technologies such as PyTorch, Transformers, and CUDA
Nice-to-haves
Enjoy moving fast and making a large business impact