DCIM Analyst, Infrastructure Operations

Cloudflare

Job Location:

San Francisco, CA - USA

Monthly Salary: Not Disclosed
Posted on: 11 hours ago
Vacancies: 1 Vacancy

Job Summary

About Us

At Cloudflare, we are on a mission to help build a better Internet. Today the company runs one of the world's largest networks that powers millions of websites and other Internet properties for customers ranging from individual bloggers to SMBs to Fortune 500 companies. Cloudflare protects and accelerates any Internet application online without adding hardware, installing software, or changing a line of code. Internet properties powered by Cloudflare all have web traffic routed through its intelligent global network, which gets smarter with every request. As a result, they see significant improvement in performance and a decrease in spam and other attacks. Cloudflare was named to Entrepreneur Magazine's Top Company Cultures list and ranked among the World's Most Innovative Companies by Fast Company.

We realize people do not fit into neat boxes. We are looking for curious and empathetic individuals who are committed to developing themselves and learning new skills, and we are ready to help you do that. We cannot complete our mission without building a diverse and inclusive team. We hire the best people based on an evaluation of their potential and support them throughout their time at Cloudflare. Come join us!

Available Locations:

Atlanta (US), Austin (US), Denver (US), Seattle (US), Toronto (Canada), London (UK), Lisbon (Portugal).

About the Role:

We are seeking a DCIM Analyst to be the data scientist for the Physical Layer, responsible for the integrity, forecasting, and visualization of all infrastructure datasets: Space, Power, Cooling, Cabling/Ports, and Asset Inventory. This technical role is part of the Infrastructure Operations organization, which is responsible for building, scaling, and running one of the world's largest and most important cloud networks. Cloudflare's global network spans more than 330 cities and is a key strategic asset that supports all of our customers and products.

The DCIM Analyst is the architect of our physical intelligence. You own the complete analytical scope of the Nlyte platform, spanning Space, Power, Cooling, Connectivity, and Asset Lifecycle. You will move beyond simple monitoring to build an infrastructure health engine, serving as the mandatory Validator for all global changes. You will transform fragmented data into a unified capacity strategy, ensuring our edge network scales efficiently while safeguarding against resource exhaustion and physical risk.

We operate in a fast-paced environment where you will be expected to drive both project delivery and operational excellence through continuous improvement, standardization, and optimization. This isn't just about day-to-day operations; it's about building a scalable, performant, secure, and resilient infrastructure that plays a critical role in us building a better Internet.

Key Responsibilities:

  • Serve as the required approval step in the Change Management workflow. You must validate every proposed Move, Add, and Change (MAC) against real-time capacity constraints before the Administrator can issue a work order.
  • Enforce a Zero-Overprovisioning policy by blocking requests that breach redundancy thresholds for Space, Power, Cooling, or Network Port availability.
  • Develop forward-looking capacity models to forecast resource exhaustion. Run What-If scenarios to determine the optimal placement of new high-density hardware (e.g., AI/GPU clusters) to avoid creating hot spots or stranded capacity.
  • Advise the DCIM Manager and Capacity Team on when and where to purchase additional colocation space or power based on consumption trends.
  • Design and own the data ingestion strategy for the Nlyte Real-Time Monitoring module. Ensure continuous polling of thousands of sensors across IT devices and facility equipment (CRACs, UPS, PDUs).
  • Manage the normalization of raw telemetry data from diverse protocols into a clean, actionable Time-Series Database.
  • Analyze the integrity of the Asset Management database. Identify ghost servers (powered on but not in inventory) and track asset aging to predict decommissioning waves.
  • Reconcile data discrepancies between Discovered network data and Managed inventory data, flagging errors for the Administrator to fix.
  • Transform raw data into executive-level dashboards. Calculate and report on critical efficiency metrics, including Power Usage Effectiveness (PUE) and carbon impact.
  • Define and tune global alerting thresholds to ensure operations teams are alerted to genuine risks without suffering from alert fatigue.
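
As a rough illustration of two of the tasks above, the following Python sketch flags ghost servers as the set difference between discovered and managed asset IDs, and computes PUE using its standard definition (total facility power divided by IT equipment power). The function names, asset IDs, and power figures are hypothetical and are not taken from Nlyte or any Cloudflare system.

```python
def find_ghost_servers(discovered: set[str], managed: set[str]) -> set[str]:
    """Return asset IDs seen on the network but missing from the managed inventory."""
    return discovered - managed


def power_usage_effectiveness(total_facility_kw: float, it_equipment_kw: float) -> float:
    """PUE = total facility power / IT equipment power (both in the same unit)."""
    if it_equipment_kw <= 0:
        raise ValueError("IT equipment power must be positive")
    return total_facility_kw / it_equipment_kw


# Hypothetical sample data, for illustration only.
discovered = {"srv-001", "srv-002", "srv-003"}
managed = {"srv-001", "srv-003"}

ghosts = find_ghost_servers(discovered, managed)
pue = power_usage_effectiveness(total_facility_kw=1500.0, it_equipment_kw=1000.0)
```

A real reconciliation would likely need to normalize identifiers (serial numbers, MAC addresses, hostnames) before comparing the two sets.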

Qualifications:

  • Expert DCIM Analytics: 4 years of experience administering the analytics module of a major DCIM platform (Nlyte, Sunbird, or similar). Must demonstrate the ability to build custom reports, not just use default dashboards.
  • Multi-Constraint Modeling: Proven experience modeling capacity across four distinct constraints: Space (Rack Units/Footprint), Power (kW draw vs. Circuit limits), Cooling (BTU/h and Airflow), and Connectivity (Port density and Cabling availability).
  • Data Normalization: Experience managing data ingestion from varied hardware sources using standard protocols and normalizing that data for historical analysis.
  • BI Visualization: Proficiency in SQL and data visualization tools (e.g., Tableau, Grafana, PowerBI) to create the Single Source of Truth reporting for Finance and Strategy stakeholders.
  • Domain Knowledge
    • Deep understanding of the physical environment. You must understand why a rack is overheating, not just report that it is hot.
    • Power Distribution Architectures: Knowledge of data center power chains.
    • Structured Cabling Standards: Familiarity with fiber/copper standards to accurately model port capacity and connectivity meshes.
    • Change Management Logic: Experience defining the business logic for Automated Capacity Validation, that is, writing the rules that determine if a ticket is automatically approved or rejected based on data.
    • Root Cause Analysis: Experience using historical time-series data to perform forensic analysis after an incident (e.g., correlating a power drop with a specific server failure).
  • Principled: You have the confidence to act as a neutral arbiter. If the data shows a deployment is unsafe, you will withhold validation approval regardless of pressure from deployment teams.
  • Curious: You proactively hunt for inefficiencies that others miss, treating the infrastructure as a puzzle to be optimized.
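
To make the Automated Capacity Validation idea concrete, here is a minimal Python sketch of a rule that approves or rejects a change request against the four constraints listed above. All class names, fields, and numbers are hypothetical. It assumes the headroom figures already have redundancy reserves subtracted, and it expresses cooling in kW for simplicity rather than BTU/h.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RackCapacity:
    """Remaining headroom for one rack (hypothetical fields, reserves already excluded)."""
    free_rack_units: int
    power_headroom_kw: float
    cooling_headroom_kw: float
    free_ports: int


@dataclass(frozen=True)
class MacRequest:
    """Resources a proposed Move, Add, or Change would consume."""
    rack_units: int
    power_kw: float
    cooling_kw: float
    ports: int


def validate_mac(request: MacRequest, capacity: RackCapacity) -> list[str]:
    """Return the names of breached constraints; an empty list means approve."""
    breaches = []
    if request.rack_units > capacity.free_rack_units:
        breaches.append("space")
    if request.power_kw > capacity.power_headroom_kw:
        breaches.append("power")
    if request.cooling_kw > capacity.cooling_headroom_kw:
        breaches.append("cooling")
    if request.ports > capacity.free_ports:
        breaches.append("connectivity")
    return breaches


# Hypothetical example values.
capacity = RackCapacity(free_rack_units=10, power_headroom_kw=5.0,
                        cooling_headroom_kw=5.0, free_ports=8)
approved = validate_mac(MacRequest(4, 3.0, 3.0, 4), capacity)
rejected = validate_mac(MacRequest(4, 6.0, 3.0, 4), capacity)
```

Returning the list of breached constraints, rather than a bare boolean, lets a rejected ticket state which limit it would exceed.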

What Makes Cloudflare Special

We're not just a highly ambitious, large-scale technology company. We're a highly ambitious, large-scale technology company with a soul. Fundamental to our mission to help build a better Internet is protecting the free and open Internet.

Project Galileo: Since 2014, we've equipped more than 2,400 journalism and civil society organizations in 111 countries with powerful tools to defend themselves against attacks that would otherwise censor their work, technology already used by Cloudflare's enterprise customers, at no cost.

Athenian Project: In 2017, we created the Athenian Project to ensure that state and local governments have the highest level of protection and reliability for free, so that their constituents have access to election information and voter registration. Since the project began, we've provided services to more than 425 local government election websites in 33 states.

1.1.1.1: We released 1.1.1.1 to help fix the foundation of the Internet by building a faster, more secure, and privacy-centric public DNS resolver. This is available publicly for everyone to use; it is the first consumer-focused service Cloudflare has ever released. Here's the deal: we don't store client IP addresses, never, ever. We will continue to abide by our privacy commitment and ensure that no user data is sold to advertisers or used to target consumers.

Sound like something you'd like to be a part of? We'd love to hear from you!

This position may require access to information protected under U.S. export control laws, including the U.S. Export Administration Regulations. Please note that any offer of employment may be conditioned on your authorization to receive software or technology controlled under these U.S. export laws without sponsorship for an export license.

Cloudflare is proud to be an equal opportunity employer. We are committed to providing equal employment opportunity for all people and place great value in both diversity and inclusiveness. All qualified applicants will be considered for employment without regard to their, or any other person's, perceived or actual race, color, religion, sex, gender, gender identity, gender expression, sexual orientation, national origin, ancestry, citizenship, age, physical or mental disability, medical condition, family care status, or any other basis protected by law. We are an AA/Veterans/Disabled Employer.

Cloudflare provides reasonable accommodations to qualified individuals with disabilities. Please tell us if you require a reasonable accommodation to apply for a job. Examples of reasonable accommodations include, but are not limited to, changing the application process, providing documents in an alternate format, using a sign language interpreter, or using specialized equipment. If you require a reasonable accommodation to apply for a job, please contact us via e-mail at or via mail at 101 Townsend St., San Francisco, CA 94107.


Required Experience:

IC

Key Skills

  • Installation Maintenance Repair
  • Quality Assurance
  • Active Directory
  • End user
  • Access Points
  • Deskside Support
  • Infrastructure Development
  • Project Management
  • Quality Control
  • Troubleshoot
  • User Accounts
  • Desktop
  • Setup
  • hardware
  • Technical Support

About Company

Make employees, applications and networks faster and more secure everywhere, while reducing complexity and cost.
