Role Summary
We are seeking a Generative AI Systems Engineer to design, evaluate, and optimize Vision-Language Model (VLM) systems for real-world applications.
This role requires a combination of:
- Model understanding
- Experimental rigor
- Systems and production thinking
You will work on benchmarking, fine-tuning, and deploying multimodal models, with a strong emphasis on tradeoff analysis across accuracy, latency, and cost.
Key Responsibilities
Model Evaluation & Benchmarking
- Evaluate pretrained VLMs on domain-specific datasets
- Define and justify appropriate evaluation metrics
- Analyze model behavior, including systematic failure modes
Model Adaptation & Fine-Tuning
- Implement parameter-efficient fine-tuning techniques (e.g., LoRA, QLoRA)
- Optimize training under data and compute constraints
- Make data-centric and model-centric improvements with clear justification
Experimental Rigor
- Design controlled experiments to compare baseline vs improved models
- Quantify improvements across accuracy, latency, and cost
- Provide clear, defensible explanations for observed outcomes
System Design & Deployment
- Architect scalable inference pipelines for multimodal models
- Optimize for:
  - low latency
  - high throughput
  - cost efficiency
- Implement serving layers (API/service) with reproducible environments
Data Engineering
- Build pipelines to process and align:
  - images
  - textual queries
  - structured metadata
- Analyze dataset characteristics, including biases and distribution gaps