Engineering Manager, Inference Routing and Performance

Skills

Own the roadmap for cluster-level inference efficiency: routing and cache.
Partner with engines, kernels, and perf teams to ship measurable wins.
Build quantitative perf models; measure wins before shipping.
Run the team's on-call, incidents, and deploy safety procedures.
Hire, coach, and grow the team; align with model launches and scaling.
Clarify commitments across seams; ensure dependencies are understood.

5+ years engineering management, leading production infra at scale
Deep systems background: load balancing, scheduling, cache-coherent state, high-perf networking
Shipped performance improvements in large-scale systems with metrics
Experience running production infra: on-call, incidents, capacity events, deploy discipline
Results-oriented; balance throughput, latency, stability, velocity
Build cross-team relationships; seam role; curious about ML systems

Competitive benefits package
Optional equity donation matching
Generous vacation and parental leave
Flexible working hours
Office space in San Francisco
Visa sponsorship available

Job Type: Remote

Salary: Not Disclosed

Experience: Entry

Duration: Months
