Business Services & Consulting • all cities, UT 45
Staff Principal Performance Engineer (45)
all cities, UT 45 • On-site • Posted 3 hours ago
About the Role
Staff Principal Performance Engineer, Remote
Define and implement performance engineering strategies for our Generative AI full stack, including services, applications, LLMs, RAG pipelines, and related infrastructure. Lead performance testing, profiling, and analysis efforts to identify and resolve performance bottlenecks. Establish and maintain performance benchmarks and SLAs for critical AI services. Provide technical leadership and mentorship to performance engineering team members.
Analyze and improve LLM inference performance, including latency, throughput, and resource utilization. Develop and implement strategies for LLM capacity planning and scaling. Collaborate with AI researchers to evaluate and improve LLM architectures and training techniques for performance. Optimize LLM inference through techniques such as quantization, distillation, and optimized kernel implementations.
What You'll Do
Define and implement performance engineering strategies for our Generative AI full stack, including services, applications, LLMs, RAG pipelines, and related infrastructure.
Lead performance testing, profiling, and analysis efforts to identify and resolve performance bottlenecks.
Establish and maintain performance benchmarks and SLAs for critical AI services.
Provide technical leadership and mentorship to performance engineering team members.