We are building a next-generation storage platform for AI infrastructure that combines high-performance flash, accelerator technologies, and advanced storage software. The goal is a step-function improvement in cost, power efficiency, density, and scalability for AI-era data-center storage.
We are seeking a Networking Software Expert, RDMA/RoCE, to define and drive the low-latency network data path of this platform, enabling efficient scaling from node-level communication up to large-scale infrastructure deployment.
This role sits at the intersection of storage, networking, and system architecture, shaping how low-latency networking, advanced NIC technologies, software services, and hardware capabilities come together in a tightly optimized system. The role will also help strengthen Sandisk's knowledge of data-center networking and distributed infrastructure architecture across the Architecture organization.
Responsibilities:
- Drive the low-latency networking software architecture and implementation for a groundbreaking storage platform targeting step-change improvements in cost, power, density, and scalability
- Define and implement high-throughput, low-latency data paths across storage nodes and larger-scale deployments
- Own the software architecture around advanced NIC integration, queueing models, completion handling, memory registration strategy, and zero-copy data movement
- Analyze and optimize end-to-end network behavior, including latency, throughput, CPU efficiency, congestion sensitivity, and tail behavior at scale
- Help define how data, metadata, and control-plane traffic are partitioned across the platform
- Work closely with architecture, hardware, firmware, software, and silicon teams on HW/SW partitioning decisions and opportunities for networking acceleration
- Drive performance bring-up and debugging in real high-speed Ethernet environments
- Contribute to transport-level design decisions for multi-node communication and large-scale system scaling
- Build and optimize robust software for error handling, recovery, observability, and production readiness in RDMA environments
- Contribute to technical direction, coding standards, and architectural quality across the networking and storage stack