Lead MLOps Engineer
Job Title: Lead MLOps Engineer
Location: Hyderabad, India
Summary:
We are seeking a highly motivated and
experienced Lead MLOps Engineer to join our team and lead the development and
operation of our Machine Learning Operations (MLOps) infrastructure. You will
be responsible for building robust, scalable, and automated pipelines for
deploying, monitoring, and managing our AI/ML models, with a particular focus
on Conversational AI, RAG, NLP & LLM systems. This role requires a strong
blend of technical expertise in MLOps principles, DevOps practices, and cloud
technologies.
Responsibilities:
- Pipeline Design & Development: Architect,
build, and maintain end-to-end CI/CD pipelines for the deployment and
operation of AI/ML models, specifically focusing on Conversational AI, RAG
(Retrieval-Augmented Generation), NLP (Natural Language Processing), and
LLM (Large Language Model) systems.
- Infrastructure Management: Design
and manage the underlying infrastructure required to support our MLOps
workflows, including Kubernetes clusters, containerized environments, and
cloud resources.
- Model Deployment & Monitoring: Implement
robust model deployment strategies and monitoring solutions to ensure high
availability, performance, and accuracy of deployed models.
- DevOps Practices: Champion
DevOps best practices throughout the ML lifecycle, fostering a culture of
automation, collaboration, and continuous improvement.
- Cloud Platform Expertise: Leverage
cloud platforms (AWS, Azure, GCP) to build scalable and cost-effective
MLOps solutions.
- Collaboration & Mentorship: Collaborate
closely with data scientists, machine learning engineers, and software
developers to ensure seamless integration of ML models into production
environments. Mentor junior engineers in MLOps best practices.
- Performance Optimization: Identify
and implement optimizations to improve the efficiency and performance of
our MLOps pipelines and deployed models.
- Security & Compliance: Ensure
that all MLOps processes and infrastructure adhere to security and
compliance requirements.
Qualifications:
- Education: Bachelor’s degree
in Computer Science or an equivalent field, with a strong foundation in
AI/ML operations and data science operations.
- Experience: 7+ years of
overall experience in software development, with a focus on MLOps
principles and practices.
- Containerization & Orchestration: Extensive experience with containerization technologies
(Docker) and orchestration platforms (Kubernetes).
- DevOps Expertise: Strong
command of DevOps practices and CI/CD tooling (e.g., Jenkins, GitLab
CI, Azure DevOps).
- Cloud Proficiency: Proven
experience with cloud platforms (AWS, Azure, or GCP) and related tooling,
including Linux, Git, Docker, Terraform, and Kubernetes.
- ML Model Deployment & Monitoring: Experience with model deployment and monitoring tools and
techniques.
- Programming Skills: Proficient
in Linux shell scripting, Jenkins, Terraform, Ansible, and Python.