Senior MLOps Engineer
We are looking for a skilled and motivated Senior MLOps Engineer to join our team at Quantori. In this role, you will standardize and implement tools that help ML Scientists develop and deploy models more effectively. Our initiative works with scientific stakeholders to move their modeling work to the cloud and to provide MLOps solutions that streamline the entire process, from model creation to inference and monitoring.
Location:
Quantori is an international team: our colleagues work both from the office and remotely from all over the world.
Responsibilities:
- Design, enhance, scale, and maintain machine learning solutions from conception to deployment in a production environment.
- Collaborate with the MLOps Initiative engineering and product management teams to translate scientific and technical requirements into scalable ML systems.
- Architect MLOps pipelines using orchestration frameworks to streamline data preparation, training, deployment, and the entire machine learning model lifecycle.
- Contribute to machine learning architecture to support scalable and repeatable model training and deployment.
- Design and implement robust continuous monitoring systems for deployed models to track performance, data drift, and anomalies, enabling model retraining.
- Facilitate the creation of automated processes for model validation and testing.
- Ensure best practices in code quality, version control, and CI/CD for data and machine learning pipelines.
- Work with computational scientists to understand and leverage domain-specific software libraries and frameworks.
- Perform code reviews and refactoring to ensure high-quality software.
- Write technical documentation.
What we expect:
- Expertise in cloud services and containerization technologies/platforms, particularly AWS and Kubernetes.
- Hands-on experience in orchestrating and optimizing scaled ML pipelines on Kubernetes.
- Proficiency in ML frameworks (e.g., TensorFlow, PyTorch/Lightning), Python, MLOps technologies (e.g., Weights & Biases, AWS SageMaker, Ray), and job scheduling frameworks (e.g., Slurm, AWS Step Functions).
- Familiarity with software engineering best practices, including agile development, code reviews, build processes, testing, and operations.
- Experience with distributed computing and big data technologies (e.g., Batch, Ray, Spark).
- Familiarity with distributed training systems.
- Familiarity with large language model development is a plus.
- Strong communication skills, effective teamwork, and critical thinking abilities.
- Degree in Computer Science, Statistics, or a relevant field.
- Ability to work Pacific Time (PST) hours.
We offer:
- Strong management and technical expertise
- Four-month contract with possible extension based on project needs and performance
- Flexible working hours
If you don't see an open position that suits your skill set or professional background but you are interested in working with us, please send your CV to career@quantori.com. We will try to find something special and interesting for you!