We are seeking an experienced and visionary Sr. Technologist to join our organization in Milpitas, United States. This role is director‑equivalent in scope and impact, with a strong emphasis on technical leadership and hands‑on innovation rather than on managing a large team.
In this pivotal role, you will establish and lead an AI Systems & Performance Lab focused on identifying and eliminating performance bottlenecks in AI systems across real‑world workloads, software stacks, and diverse hardware configurations. You will define the lab’s technical vision, develop rigorous methodologies, and deliver insights that directly influence system architecture, platform strategy, and future hardware designs.
The ideal candidate brings deep expertise in AI workloads and system performance, paired with the ability to translate complex technical findings into clear architectural and business impact.
Essential Duties and Responsibilities:
AI Lab Vision & Technical Leadership
- Establish and articulate a clear technical vision and charter for the AI Lab, focused on AI workload characterization and performance bottleneck analysis.
- Define the research agenda, success metrics, and execution model aligned with organizational objectives and future technology directions.
- Serve as the technical authority for AI systems performance and workload analysis.
AI Workload & System Bottleneck Analysis
- Drive end‑to‑end analysis of AI training and inference workloads, spanning models, frameworks, runtimes, and full system stacks.
- Identify performance bottlenecks across compute, memory hierarchy, interconnects, storage, power, and software layers.
- Develop repeatable methodologies for root‑cause analysis across heterogeneous platforms (CPU, GPU, NPU, and custom accelerators).
Benchmarking & Infrastructure Development
- Architect and oversee the development of benchmarking, profiling, and workload characterization frameworks.
- Evaluate AI performance across multiple hardware configurations and system topologies, ensuring rigor, reproducibility, and scalability.
- Establish best practices for performance measurement, regression tracking, and comparative analysis.
Cross‑Functional Influence
- Partner closely with hardware, platform, and system architecture teams to translate workload insights into actionable design recommendations.
- Collaborate with AI software, product, and business teams to ensure performance findings inform roadmap and investment decisions.
- Influence next‑generation system and silicon direction using data‑driven insights from real workloads.
Technical Mentorship & Thought Leadership
- Provide technical mentorship and guidance to a small, highly skilled team of engineers and researchers.
- Drive thought leadership through internal publications, external papers, or patents in AI systems and performance engineering.
- Represent the AI Lab in technical reviews, architecture forums, and executive briefings.
Responsible AI & Governance
- Ensure AI performance research and system recommendations align with responsible and ethical AI principles.
- Incorporate considerations around efficiency, sustainability, and scalability into workload and system analysis.