You will bridge the gap between Data Science and Operations. Your goal is to operationalize complex ML models for defense applications, ensuring that AI-driven insights reach the field with zero-downtime reliability and full traceability.
Key Responsibilities:
- End-to-End Lifecycle: Build and manage the full ML lifecycle, from experiment tracking through model deployment and retraining.
- Containerization: Deploy and operate ML workloads using Docker and Kubernetes (OpenShift).
- Automation: Implement ML-specific CI/CD pipelines (e.g., CML, Kubeflow Pipelines) to automate the promotion of models to production.
- Observability: Set up specialized monitoring for model drift, data quality, and prediction accuracy.
- Scalability: Architect distributed systems for large-scale model inference.