Performance engineer

Employer Active

1 Vacancy
Job Location

San Francisco, CA - USA

Monthly Salary

Not Disclosed

Job Description

About this role
Writer is seeking a highly skilled and motivated Principal Performance Engineer to lead the performance optimization of our cutting-edge Generative AI technology stack. This role is critical in ensuring the scalability, efficiency, and reliability of our Large Language Models (LLMs) and Retrieval Augmented Generation (RAG) systems. You will be a key driver in identifying and resolving performance bottlenecks, optimizing resource utilization, and ensuring a seamless user experience. You will work closely with our AI research, software engineering, and infrastructure teams to deliver world-class AI solutions.


Your responsibilities

  • Performance Leadership:

    • Define and implement performance engineering strategies for our full Generative AI stack, including services, applications, LLMs, RAG pipelines, and related infrastructure.

    • Lead performance testing, profiling, and analysis efforts to identify and resolve performance bottlenecks.

    • Establish and maintain performance benchmarks and SLAs for critical AI services.

    • Provide technical leadership and mentorship to performance engineering team members.

  • LLM Capacity and Tuning:

    • Analyze and improve LLM inference performance, including latency, throughput, and resource utilization.

    • Develop and implement strategies for LLM capacity planning and scaling.

    • Collaborate with AI researchers to evaluate and improve LLM architectures and training techniques for performance.

    • Optimize LLM inference through techniques such as quantization, distillation, and optimized kernel implementation.

  • RAG Performance Optimization:

    • Design and implement performance tests for RAG pipelines, including retrieval, ranking, and generation components.

    • Identify and optimize performance bottlenecks in RAG systems, such as database queries, vector search, and document processing.

    • Evaluate and optimize RAG system architectures for scalability and efficiency.

    • Tune vector databases for optimal recall and latency.

  • Infrastructure Optimization:

    • Collaborate with infrastructure teams to optimize hardware and software configurations for AI workloads.

    • Evaluate and recommend new technologies and tools for performance monitoring and analysis.

    • Develop and maintain performance dashboards and reports to track key metrics.

    • Optimize GPU utilization and memory management for LLM inference.

  • Collaboration and Communication:

    • Work closely with AI researchers, software engineers, and product managers to ensure performance requirements are met.

    • Communicate performance findings and recommendations to stakeholders at all levels.

    • Stay up-to-date with the latest developments in Generative AI and performance engineering.

Is this you?

  • Education:

    • Bachelor's degree in Computer Science, Engineering, or a related field (Master's preferred).

  • Experience:

    • 10 years of experience in performance engineering with a focus on large-scale distributed systems.

    • 2 years of experience working with AI/ML technologies.

    • Proven experience in performance testing, profiling, and analysis of complex software systems.

    • Deep understanding of NLP architectures, training, and inference.

    • Experience with vector databases and search technologies.

    • Experience with cloud computing platforms (e.g., AWS, Azure, GCP) and containerization technologies (e.g., Docker, Kubernetes).

    • Strong programming skills in Python.

    • Experience with performance analysis tools (e.g., profilers, debuggers, monitoring tools).

  • Skills:

    • Strong analytical and problem-solving skills.

    • Excellent communication and collaboration skills.

    • Ability to work in a fast-paced and dynamic environment.

    • Passion for AI and a desire to push the boundaries of performance engineering.



Benefits & perks (US Full-time employees)

Writer is an equal-opportunity employer and is committed to diversity. We don't make hiring or employment decisions based on race, color, religion, creed, gender, national origin, age, disability, veteran status, marital status, pregnancy, sex, gender expression or identity, sexual orientation, citizenship, or any other basis protected by applicable local, state, or federal law. Under the San Francisco Fair Chance Ordinance, we will consider for employment qualified applicants with arrest and conviction records.

By submitting your application on the application page, you acknowledge and agree to Writer's Global Candidate Privacy Notice.

Employment Type

Full-Time
