Senior Systems Engineer, Workers AI

Cloudflare

Job Location:

San Francisco, CA - USA

Monthly Salary: Not Disclosed
Posted on: Yesterday
Vacancies: 1 Vacancy

Job Summary

About Us

At Cloudflare, we are on a mission to help build a better Internet. Today the company runs one of the world's largest networks that powers millions of websites and other Internet properties for customers ranging from individual bloggers to SMBs to Fortune 500 companies. Cloudflare protects and accelerates any Internet application online without adding hardware, installing software, or changing a line of code. Internet properties powered by Cloudflare all have web traffic routed through its intelligent global network, which gets smarter with every request. As a result, they see significant improvement in performance and a decrease in spam and other attacks. Cloudflare was named to Entrepreneur Magazine's Top Company Cultures list and ranked among the World's Most Innovative Companies by Fast Company.

At Cloudflare, we're not looking for people who wait for a polished roadmap; we're looking for the builders who see the cracks in the Internet that everyone else has simply learned to live with. We value candidates who have the instinct to spot a normalized problem and the AI-native curiosity to create a solution using the latest tools. Our culture is built on iteration: leveraging AI to ship faster today and make it better tomorrow, while ensuring that every improvement, no matter how small, is shared across the team to lift everyone up. If you're the type of person who values curiosity over bureaucracy and believes that AI is a partner in solving tough problems to keep the Internet moving forward, you'll fit right in.

Available Locations: Austin, TX or London, UK (Hybrid)

About the role

You'll design and build the core infrastructure that powers AI inference across Cloudflare's global network (real-time voice, frontier open LLMs, and customer-deployed models) running on a heterogeneous fleet of GPUs and next-generation accelerators in hundreds of cities worldwide. Working alongside AI/ML engineers, hardware partners, and Cloudflare product teams, you'll solve hard problems in distributed systems and high-performance computing: sub-second model cold starts, multi-accelerator workload scheduling, efficient KV cache management, and a model deployment platform serving both Cloudflare and customers bringing their own models. We're building an AI inference platform embedded in the fabric of the Internet, something that doesn't exist yet, and this role puts you at the center of it. We're looking for high-agency systems engineers who are energized by foundational infrastructure problems and want to define how AI runs at the edge of the network.

Role Responsibilities

  • Develop and maintain core components of the serverless inference platform to ensure high availability and scalability for Cloudflare users.
  • Optimize the model scheduling system to significantly increase efficiency and resource utilization across our inference infrastructure.
  • Implement improvements to the inference request routing logic to enhance overall performance and reduce latency for end-users.
  • Drive significant, measurable improvements in the platform's reliability and resilience by identifying and mitigating systemic risks.
  • Expand and refine the observability stack, including metrics, logging, and tracing, and fine-tune alerts to proactively identify and resolve production issues.
  • Lead complex cross-functional technical projects from initial concept and design through final deployment and operationalization.
  • Mentor junior engineers and actively contribute to a strong, collaborative engineering culture within the team.

Role Requirements

Must-Have Skills

  • Experience in systems engineering with a focus on distributed, high-performance systems.
  • Expert proficiency in Rust programming, particularly in an asynchronous environment.
  • Deep understanding of and hands-on experience with relevant networking and application protocols (e.g., TCP, HTTP, WebSocket).
  • Experience with scaling and performance optimization techniques, including load balancing and caching, in a distributed environment.

Nice-to-Have Skills

  • Demonstrable experience with container orchestration platforms, specifically Kubernetes and/or Nomad.
  • Familiarity with the challenges and architectures involved in large-scale inference serving (e.g., LLM and diffusion models).

What Makes Cloudflare Special

We're not just a highly ambitious, large-scale technology company. We're a highly ambitious, large-scale technology company with a soul. Fundamental to our mission to help build a better Internet is protecting the free and open Internet.

Project Galileo: Since 2014, we've equipped more than 2,400 journalism and civil society organizations in 111 countries with powerful tools to defend themselves against attacks that would otherwise censor their work, technology already used by Cloudflare's enterprise customers, at no cost.

Athenian Project: In 2017, we created the Athenian Project to ensure that state and local governments have the highest level of protection and reliability for free, so that their constituents have access to election information and voter registration. Since the project began, we've provided services to more than 425 local government election websites in 33 states.

1.1.1.1: We released 1.1.1.1 to help fix the foundation of the Internet by building a faster, more secure, and privacy-centric public DNS resolver. This is available publicly for everyone to use; it is the first consumer-focused service Cloudflare has ever released. Here's the deal: we don't store client IP addresses, never, ever. We will continue to abide by our privacy commitment and ensure that no user data is sold to advertisers or used to target consumers.

Sound like something you'd like to be a part of? We'd love to hear from you!

This position may require access to information protected under U.S. export control laws, including the U.S. Export Administration Regulations. Please note that any offer of employment may be conditioned on your authorization to receive software or technology controlled under these U.S. export laws without sponsorship for an export license.

Cloudflare is proud to be an equal opportunity employer. We are committed to providing equal employment opportunity for all people and place great value in both diversity and inclusiveness. All qualified applicants will be considered for employment without regard to their, or any other person's, perceived or actual race, color, religion, sex, gender, gender identity, gender expression, sexual orientation, national origin, ancestry, citizenship, age, physical or mental disability, medical condition, family care status, or any other basis protected by law. We are an AA/Veterans/Disabled Employer.

Cloudflare provides reasonable accommodations to qualified individuals with disabilities. Please tell us if you require a reasonable accommodation to apply for a job. Examples of reasonable accommodations include, but are not limited to, changing the application process, providing documents in an alternate format, using a sign language interpreter, or using specialized equipment. If you require a reasonable accommodation to apply for a job, please contact us via e-mail at or via mail at 101 Townsend St., San Francisco, CA 94107.


Required Experience:

Senior IC

About Company

Make employees, applications and networks faster and more secure everywhere, while reducing complexity and cost.