Own the roadmap for cluster-level inference efficiency: routing and cache.
Partner with engines, kernels, and perf teams to ship measurable wins.
Build quantitative perf models; measure wins before shipping.
Run the team's on-call, incidents, and deploy safety procedures.
Hire, coach, and grow the team; align with model launches and scaling.
Clarify commitments across seams; ensure dependencies are understood.

5+ years engineering management, leading production infra at scale
Deep systems background: load balancing, scheduling, cache-coherent state, high-perf networking
Shipped performance improvements in large-scale systems with metrics
Experience running production infra: on-call, incidents, capacity events, deploy discipline
Results-oriented; balance throughput, latency, stability, velocity
Build cross-team relationships; seam role; curious about ML systems

Competitive benefits package
Optional equity donation matching
Generous vacation and parental leave
Flexible working hours
Office space in San Francisco
Visa sponsorship available