Senior Data & MLOps Engineer

CoreWeave Europe

Job Location:

London - UK

Monthly Salary: Not Disclosed
Posted on: 12 hours ago
Vacancies: 1 Vacancy

Job Summary

CoreWeave is The Essential Cloud for AI. Built for pioneers by pioneers, CoreWeave delivers a platform of technology, tools, and teams that enables innovators to build and scale AI with confidence. Trusted by leading AI labs, startups, and global enterprises, CoreWeave combines superior infrastructure performance with deep technical expertise to accelerate breakthroughs and turn compute into capability. Founded in 2017, CoreWeave became a publicly traded company (Nasdaq: CRWV) in March 2025. Learn more at.
We're proud to be a Living Wage accredited employer.

What You'll Do:

The Data Science team is focused on developing an advanced reliability platform. This system covers various aspects of data processing and analysis, including data intake, deriving meaningful metrics, identifying unusual patterns, predicting potential issues, finding slow processes in distributed systems, and using automated analysis to determine causes. We collaborate closely with internal teams like Fleet Infrastructure and AI Platform to enhance system stability, optimize resource use, shorten resolution times, and maintain service availability and financial performance.

About the role:

As a Senior Data & MLOps Engineer, you will design and scale the infrastructure supporting the GPU Intelligence Platform. This involves building pipelines for handling data, features, and model training, and for delivering insights and predictions for system health and optimization. You will transition the system from initial prototypes to a production environment operating across the fleet, focusing on scalability, separating real-time service from periodic processing, and dynamic resource management based on system load and data frequency. You will architect and deploy these scalable distributed services using orchestration technologies.

Key responsibilities:

  • Design and implement scalable data ingestion pipelines.
  • Build feature processing and baseline computation systems.
  • Productionize models for prediction and detection.
  • Develop and operate low-latency services and robust offline workflows.
  • Architect horizontally scalable services with clear separation between components, leveraging orchestration for distribution.
  • Implement monitoring and feedback loops for continuous model and signal improvement.
  • Collaborate with Platform teams to integrate operational signals into monitoring and diagnostics.
  • Implement a scalable solution for mitigation and structured analysis.

Who You Are:

  • 7 years of experience in data engineering, distributed systems, MLOps, or infrastructure ML roles in production environments.
  • Proven experience building high-throughput streaming or telemetry pipelines (e.g., Kafka, Pulsar, Kinesis, or equivalent).
  • Strong experience designing time-series feature pipelines and operating large-scale observability systems.
  • Experience building and maintaining feature stores and ensuring offline/online feature parity.
  • Hands-on experience deploying ML models to production, including versioning, monitoring, rollback, and drift detection.
  • Experience designing scalable microservices deployed in Kubernetes-based environments.
  • Strong proficiency in Python and at least one systems language (Go, Rust, or C).
  • Experience working with distributed compute or training systems (e.g., NCCL, PyTorch Distributed, Spark, Ray, Slurm).
  • Familiarity with GPU telemetry systems such as NVML or DCGM and hardware-level monitoring concepts.
  • Demonstrated experience scaling systems from Proof-of-Concept to production-grade fleet-level deployments.

Preferred:

  • Experience working on GPU fleet management, hyperscale infrastructure, or AI training clusters.
  • Experience building anomaly detection or failure prediction systems for hardware or distributed systems.
  • Experience implementing distributed straggler detection or collective-level performance analysis systems.
  • Experience developing agentic or LLM-powered reasoning systems for diagnostics or operational intelligence.
  • Background in reliability engineering or SRE practices.

Wondering if you're a good fit? We believe in investing in our people and value candidates who can bring their own diversified experiences to our teams, even if you aren't a 100% skill or experience match. Here are a few qualities we've found compatible with our team. If some of this describes you, we'd love to talk.

  • You love building systems that turn raw infrastructure telemetry into actionable intelligence.
  • You're curious about distributed systems failure modes, GPU performance pathologies, and reliability engineering at scale.
  • Youre excited by the idea of moving from anomaly detection to prediction to autonomous root cause reasoning.
  • You enjoy designing platforms that protect uptime, revenue, and customer trust through proactive systems thinking.

Why CoreWeave?

At CoreWeave, we work hard, have fun, and move fast! We're in an exciting stage of hyper-growth that you will not want to miss out on. We're not afraid of a little chaos, and we're constantly learning. Our team cares deeply about how we build our product and how we work together, which is represented through our core values:

  • Be Curious at Your Core
  • Act Like an Owner
  • Empower Employees
  • Deliver Best-in-Class Client Experiences
  • Achieve More Together

We support and encourage an entrepreneurial outlook and independent thinking. We foster an environment that encourages collaboration and enables the development of innovative solutions to complex problems. As we get set for takeoff, the organization's growth opportunities are constantly expanding. You will be surrounded by some of the best talent in the industry, who will want to learn from you too. Come join us!

To fulfil our obligation to protect client data, successful applicants offered employment with CoreWeave will be required to complete a basic criminal record check conducted in compliance with GDPR. Employment offers are conditional upon receiving satisfactory check results.

What We Offer

In addition to a competitive salary, we offer a variety of benefits to support your needs, including:

  • Family-level Medical Insurance
  • Family-level Dental Insurance
  • Generous Pension Contribution
  • Life Assurance at 4x Salary
  • Critical Illness Cover
  • Employee Assistance Programme
  • Tuition Reimbursement
  • Work culture focused on innovative disruption

Benefits may vary by location.

Our Workplace

While we prioritize a hybrid work environment, remote work may be considered for candidates located more than 30 miles from an office, based on role requirements for specialized skill sets. New hires will be invited to attend onboarding at one of our hubs within their first month. Teams also gather quarterly to support collaboration.

CoreWeave is an equal opportunity employer committed to fostering an inclusive and supportive workplace. All qualified applicants and candidates will receive consideration for employment without regard to race, color, religion, sex, disability, age, sexual orientation, gender identity, national origin, veteran status, or genetic information.

CoreWeave does not accept speculative CVs. Any unsolicited CVs received will be treated as the property of CoreWeave, and your Terms & Conditions associated with the use of CVs will be considered null and void.

Any unsolicited CVs sent by your company to us, that is to say, in any situation where we have not directly engaged your company in writing to supply candidates for a specific vacancy, will be considered by us to be a free gift, leaving us liable for no fees whatsoever should we choose to contact the candidate directly and engage the candidate's services, and will in no way establish any prior claim by your company to representation of that candidate should the candidate's details also be submitted by any other party.

Export Control Compliance

This position requires access to export controlled information. To conform to U.S. Government export regulations applicable to that information, applicant must either be (A) a U.S. person, defined as a (i) U.S. citizen or national, (ii) U.S. lawful permanent resident (green card holder), (iii) refugee under 8 U.S.C. 1157, or (iv) asylee under 8 U.S.C. 1158; (B) eligible to access the export controlled information without a required export authorization; or (C) eligible and reasonably likely to obtain the required export authorization from the applicable U.S. government agency. CoreWeave may, for legitimate business reasons, decline to pursue any export licensing process.

Updated privacy notice - UK and EU Job Applications

When you apply to a job on this site, the personal data contained in your application will be collected by CoreWeave UK Ltd. (Controller), which is located at

Phosphor (6th Floor) 133 Park Street London SE1 9EA

and can be contacted by emailing . Controller's data protection officer can be contacted at . Your personal data will be processed for the purposes of managing Controller's recruitment-related activities, which include setting up and conducting interviews and tests for applicants, evaluating and assessing the results thereto, and as is otherwise needed in the recruitment and hiring processes. Such processing is legally permissible under Art. 6(1)(f) of (i) Regulation (EU) 2016/679 (General Data Protection Regulation, GDPR) and (ii) the GDPR as it forms part of the laws of the UK (UK GDPR), as necessary for the purposes of the legitimate interests pursued by the Controller, which are the solicitation, evaluation, and selection of applicants for employment. Your personal data will be shared with Greenhouse Software Inc., a cloud services provider located in the United States of America and engaged by Controller to help manage its recruitment and hiring process on Controller's behalf. With respect to transfers originating from the UK or the European Economic Area (EEA) to a country outside the UK or the EEA, we implement the appropriate transfer mechanism(s) and other appropriate solutions to address cross-border transfers as required by applicable law. You may request a copy of the suitable mechanisms we have in place by contacting us at

Your personal data will be retained by Controller as long as Controller determines it is necessary to evaluate your application for employment. Where permitted by applicable law, we may also retain your personal data for a limited period after the recruitment process ends in order to consider you for future job opportunities, respond to legal claims, or comply with record-keeping obligations. Under the GDPR and the UK GDPR, you have the right to request access to your personal data, to request that your personal data be rectified or erased, and to request that processing of your personal data be restricted. You also have the right to data portability. In addition, you may lodge a complaint with the relevant supervisory authority: (i) a list of Europe's data protection authorities can be found here; and (ii) for the UK, this is the Information Commissioner's Office.

For additional information please see our .


Required Experience:

Senior IC

Key Skills

  • Apache Hive
  • S3
  • Hadoop
  • Redshift
  • Spark
  • AWS
  • Apache Pig
  • NoSQL
  • Big Data
  • Data Warehouse
  • Kafka
  • Scala