AI Ops Engineer
Job Summary
Job Description: AI Ops Engineer (DevOps/MLOps/AIOps)
Who are we
Fulcrum Digital is an agile, next-generation digital accelerating company providing digital transformation and technology services from ideation to implementation. These services apply across a variety of industries, including banking & financial services, insurance, retail, higher education, food, healthcare, and manufacturing.
Role summary
We are seeking an AI Ops Engineer to run and improve the operational reliability of our AI/ML and GenAI platforms. You will own CI/CD enablement, creation and maintenance of Infrastructure-as-Code, monitoring/alerting, incident triage, release readiness, operational automation, and cloud cost governance. This role is hands-on and technical and operates within enterprise security controls (segregation of duties). You will drive outcomes end-to-end, partnering with C&F Service Desk/Cloud/Network teams when elevated access is required.
Key responsibilities
- CI/CD & deployments: build/maintain pipelines and deployment processes aligned to Cyber/compliance expectations.
- IaC & implementation packages: create/maintain Infrastructure-as-Code (e.g. Terraform/Bicep) and provide complete technical specs for approved execution.
- Operate & support production: monitor dashboards/logs, triage incidents, perform RCAs, maintain runbooks, and drive issue closure.
- Alert hygiene: reduce alert noise, tune rules/thresholds, and ensure alerts are actionable (severity, ownership, playbooks).
- Enterprise integrations & access issues: troubleshoot and drive fixes for items such as:
  - Power Platform Jira connector issues (incl. allowlisting/service tags)
  - HTTPS connector compliance needs
  - Citrix VDI / Netskope / proxy access blocks (e.g. VS Code external tools)
- Cloud cost management (automation-first): monitor spend, implement tagging/controls, and automate recurring cost reporting (avoid manual, spreadsheet-heavy processes).
Working model (important)
- Due to segregation of duties and centralized governance, you may not have persistent elevated/admin access.
- Some changes (RBAC, network/security rules, proxy allowlists, certain provisioning) must be executed by C&F Service Desk/Cloud/Network/Security teams.
- Ownership in this role means: you diagnose, propose the fix, provide IaC/specs and validation steps, coordinate execution, and verify results.
Requirements
Required qualifications
- 3 years in DevOps/SRE/Cloud Ops/MLOps/AIOps (or equivalent).
- Experience with at least one major cloud (Azure and/or AWS) in enterprise environments with restricted permissions.
- Hands-on experience with CI/CD, IaC, and observability/monitoring tools.
- Strong troubleshooting skills using logs/metrics/traces; scripting/automation with Python/PowerShell/Bash.
- Strong communication and responsiveness (timely acknowledgements, proactive follow-through).
Preferred
- Experience supporting AI/ML platforms or API-based services.
- Containers/Kubernetes and/or serverless exposure.
- FinOps experience (cost allocation, anomaly detection, optimization).
Required Skills:
DevOps, Cloud Ops, Azure Cloud, hands-on experience in CI/CD pipelines, AI deployment