Tiger Analytics is looking for a skilled and innovative Machine Learning Engineer with hands-on experience in Google Cloud Platform (GCP) and Vertex AI to design, build, and deploy scalable ML solutions. You will play a key role in operationalizing machine learning models and driving the end-to-end ML lifecycle, from data ingestion to model serving and monitoring.
Key Responsibilities:
- Develop, train, and optimize ML models using Vertex AI, including Vertex Pipelines, AutoML, and custom model training.
- Design and build scalable ML pipelines for feature engineering, training, evaluation, and deployment.
- Deploy models to production using Vertex AI endpoints and integrate with downstream applications or APIs (see the deployment sketch after this list).
- Collaborate with data scientists, data engineers, and MLOps teams to enable reproducible and reliable ML workflows.
- Monitor model performance and set up alerting, retraining triggers, and drift detection mechanisms.
- Utilize GCP services such as BigQuery, Dataflow, Cloud Functions, Pub/Sub, and GCS in ML workflows.
- Apply CI/CD principles to ML models using Vertex AI Pipelines, Cloud Build, and GitOps practices.
- Implement model governance, versioning, explainability, and security best practices within Vertex AI.
- Document architecture decisions, workflows, and model lifecycle clearly for internal stakeholders.
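To give a concrete flavour of the deployment work described in the responsibilities above, here is a minimal sketch of registering a trained model and serving it from a Vertex AI endpoint with the google-cloud-aiplatform SDK. The project, region, artifact path, serving container image, and example feature values are placeholders, not specifics of this role.

```python
# Minimal sketch: register a trained model and deploy it to a Vertex AI endpoint.
# Project, region, artifact path, and serving image below are placeholder values.
from google.cloud import aiplatform

aiplatform.init(project="my-gcp-project", location="us-central1")

# Register the exported model artifacts in the Vertex AI Model Registry.
model = aiplatform.Model.upload(
    display_name="churn-classifier",
    artifact_uri="gs://my-bucket/models/churn/",
    serving_container_image_uri=(
        "us-docker.pkg.dev/vertex-ai/prediction/sklearn-cpu.1-3:latest"  # placeholder prebuilt image
    ),
)

# Deploy to a managed endpoint for online (real-time) prediction.
endpoint = model.deploy(machine_type="n1-standard-4")

# Downstream applications call the endpoint for real-time inference.
prediction = endpoint.predict(instances=[[0.2, 1.5, 3.1, 0.0]])
print(prediction.predictions)
```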
Required Skills:
1. Advanced Generative AI
- Advanced RAG, including graph-based hybrid retrieval (illustrated in the sketch after this list)
- Multimodal agents
- Deep knowledge of agentic frameworks such as ADK and LangChain
- Fine-tuning and distillation
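As a rough illustration of graph-based hybrid retrieval, the toy sketch below combines dense similarity search with a one-hop graph expansion over chunk relationships. It is deliberately framework-agnostic: the hard-coded corpus, embeddings, and edge map stand in for whatever embedding model, vector store, and graph store a project would actually use.

```python
# Toy, self-contained sketch of graph-based hybrid retrieval for RAG.
# Plain numpy vectors and dicts stand in for an embedding model, a vector store,
# and a graph database.
import numpy as np

# Tiny corpus: chunk id -> (embedding, text)
chunks = {
    "c1": (np.array([0.9, 0.1]), "Vertex AI hosts models behind managed endpoints."),
    "c2": (np.array([0.8, 0.3]), "Endpoints autoscale with traffic."),
    "c3": (np.array([0.1, 0.9]), "BigQuery stores the feature tables."),
}
# Knowledge-graph edges between chunks (e.g. shared entities).
graph = {"c1": ["c2"], "c2": ["c1"], "c3": []}

def hybrid_retrieve(query_vec: np.ndarray, k: int = 2) -> list[str]:
    def sim(v: np.ndarray) -> float:
        # Cosine similarity between a chunk embedding and the query embedding.
        return float(v @ query_vec / (np.linalg.norm(v) * np.linalg.norm(query_vec)))

    # Stage 1: dense retrieval - top-k chunks by embedding similarity.
    seeds = sorted(chunks, key=lambda cid: sim(chunks[cid][0]), reverse=True)[:k]

    # Stage 2: graph expansion - follow edges from the seeds so related context
    # missed by pure similarity search is still pulled in.
    candidates = set(seeds)
    for cid in seeds:
        candidates.update(graph.get(cid, []))

    # Stage 3: re-rank the expanded candidate set and keep the top k passages.
    ranked = sorted(candidates, key=lambda cid: sim(chunks[cid][0]), reverse=True)
    return [chunks[cid][1] for cid in ranked[:k]]

print(hybrid_retrieve(np.array([1.0, 0.2])))
```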
2. Python Expertise
- Expert in Python with strong OOP and functional programming skills
- Proficient in ML/DL libraries: TensorFlow, PyTorch, scikit-learn, pandas, NumPy, PySpark
- Experience with production-grade code, testing, and performance optimization
3. GCP Cloud Architecture & Services
- Proficiency in GCP services such as the following (a brief usage sketch follows this list):
  - Vertex AI
  - BigQuery
  - Cloud Storage
  - Cloud Run
  - Cloud Functions
  - Pub/Sub
  - Dataproc
  - Dataflow
- Understanding of IAM and VPC
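As a small usage sketch for the services listed above, the snippet below pulls a feature table from BigQuery into pandas ahead of model training. The project, dataset, table, and column names are placeholders.

```python
# Minimal sketch: load training features from BigQuery into a pandas DataFrame.
# Project, dataset, table, and column names are placeholders.
from google.cloud import bigquery

client = bigquery.Client(project="my-gcp-project")

query = """
    SELECT customer_id, tenure_months, monthly_spend, churned
    FROM `my-gcp-project.analytics.churn_features`
    WHERE snapshot_date = '2024-01-01'
"""

# Run the query and materialise the result as a DataFrame.
features = client.query(query).to_dataframe()

X = features.drop(columns=["customer_id", "churned"])
y = features["churned"]
print(X.shape, y.shape)
```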
4. API Development & Integration
- Experience designing and building RESTful APIs using FastAPI or Flask
- Ability to integrate ML models into APIs for real-time inference (see the serving sketch after this list)
- Experience implementing authentication, logging, and performance optimization
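A minimal sketch of the API work described above: a FastAPI service that loads a trained model once at startup and exposes it for real-time inference. The model file and feature schema are placeholders.

```python
# Minimal sketch: serve a trained model for real-time inference with FastAPI.
# The model file and request schema are placeholders.
import joblib
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI(title="scoring-api")
model = joblib.load("model.joblib")  # trained scikit-learn pipeline, loaded once at startup

class ScoringRequest(BaseModel):
    tenure_months: float
    monthly_spend: float

class ScoringResponse(BaseModel):
    churn_probability: float

@app.post("/predict", response_model=ScoringResponse)
def predict(req: ScoringRequest) -> ScoringResponse:
    # Assemble the feature vector in the order the model was trained on.
    proba = model.predict_proba([[req.tenure_months, req.monthly_spend]])[0][1]
    return ScoringResponse(churn_probability=float(proba))

# Run locally with: uvicorn main:app --reload
```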
5. System Design & Scalability
- Experience designing end-to-end AI systems with scalability and fault tolerance in mind
- Hands-on experience developing distributed systems, microservices, and asynchronous processing (see the Pub/Sub sketch after this list)
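As an illustration of asynchronous processing on GCP, the sketch below publishes scoring requests to a Cloud Pub/Sub topic and consumes them in a separate worker, decoupling producers from inference workers. Project, topic, and subscription names are placeholders.

```python
# Minimal sketch: asynchronous processing with Cloud Pub/Sub.
# Project, topic, and subscription names are placeholders.
import json
from concurrent import futures
from google.cloud import pubsub_v1

PROJECT = "my-gcp-project"

# Producer side: publish a scoring request and return immediately.
publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path(PROJECT, "scoring-requests")
publisher.publish(topic_path, json.dumps({"customer_id": "42"}).encode("utf-8")).result()

# Worker side: pull messages and process them independently of the producer.
subscriber = pubsub_v1.SubscriberClient()
subscription_path = subscriber.subscription_path(PROJECT, "scoring-requests-worker")

def handle(message: pubsub_v1.subscriber.message.Message) -> None:
    payload = json.loads(message.data.decode("utf-8"))
    # ... run inference and write results downstream ...
    message.ack()  # acknowledge so Pub/Sub does not redeliver the message

streaming_pull = subscriber.subscribe(subscription_path, callback=handle)
try:
    streaming_pull.result(timeout=30)  # keep the worker alive for a short demo window
except futures.TimeoutError:
    streaming_pull.cancel()
```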
This position offers an excellent opportunity for significant career development in a fast-growing and challenging entrepreneurial environment with a high degree of individual responsibility.