Title: Hybrid AI-Ops Engineer
Pay Rate: $60.00/hr DOE
Employment Type: Contract
Est. End Date: January 2027
Schedule: Full-time, Monday to Friday, 9:00 AM to 6:00 PM
Location: Stanford, CA 94305 (Hybrid, 2 days onsite)
Job Code: JPC-1464463
Tekberry is looking for a highly qualified and motivated AI-Ops Engineer to work hybrid with our client, a top-tier university engineering program. As a W2 employee, you will have access to health benefits.
Responsibilities:
AI-Driven Operations & Automation
Implement AIOps solutions using ML to automate performance monitoring, workload scheduling, and infrastructure operations.
Build anomaly detection systems to identify system issues before they impact users.
Develop automated root cause analysis using ML-driven event correlation.
Create predictive maintenance workflows based on historical patterns and telemetry data.
Design and execute automated remediation scripts for incident response.
Observability & Intelligent Monitoring
Build observability platforms that aggregate logs, metrics, and events into unified dashboards.
Implement intelligent alerting using NLP/ML to reduce noise and prioritize actionable insights.
Deploy APM tools integrated with AI-powered analytics.
Ensure full visibility across cloud infrastructure, applications, and ML workloads.
Cloud Infrastructure & DevOps
Design and maintain scalable AWS infrastructure using CloudFormation, Terraform, or CDK.
Build and manage containerized workloads (Docker, ECS, Fargate, EKS).
Create CI/CD pipelines incorporating AI-driven deployment and quality checks.
Automate cloud operations to optimize cost, scalability, and reliability.
Ensure all cloud architecture meets Stanford's compliance requirements (FERPA, GDPR).
Collaboration & Continuous Improvement
Partner with engineers and cross-functional teams to deliver AIOps capabilities.
Use Git-based workflows and participate in code reviews.
Document runbooks, automation workflows, and operational procedures.
Continuously evaluate emerging AIOps tools and methodologies.
Contribute to building a culture focused on predictive and automated operations.
Required
Bachelor's degree in Computer Science, DevOps, Cloud Engineering, or related field (Master's preferred).
3 years in DevOps, SRE, or Cloud Engineering roles.
2 years hands-on experience with AWS (EC2, Lambda, ECS/Fargate, S3, IAM, VPC).
Strong Python programming skills.
Experience implementing monitoring and observability solutions at scale.
Familiarity with ML/AI concepts applied to automation.
Technical Skills
Languages: Python required; Bash, Go, or TypeScript preferred.
Monitoring Tools: CloudWatch, X-Ray, Prometheus, Grafana, Datadog, Splunk.
Infrastructure as Code: CloudFormation, Terraform, CDK.
Containers & Orchestration: Docker, ECS/Fargate, Kubernetes (EKS).
AWS Services: Lambda, EC2, S3, API Gateway, EventBridge, CloudWatch, IAM, CodePipeline, SageMaker.
CI/CD: GitHub Actions, CodePipeline, Jenkins, GitLab CI.
Data & Analytics: Log aggregation, metrics analysis, event correlation.
Desired Attributes
Strong understanding of AIOps principles and automation-first operations.
Passion for eliminating manual work through AI-driven solutions.
Excellent debugging and root cause analysis skills.
Adaptable, collaborative, and eager to learn, with strong communication skills.
Thrives in fast-paced environments with evolving technology stacks.
We need hard-working, reliable employees. Tekberry offers a $100 payment for referrals!
Tekberry Inc. is an equal opportunity employer. Qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, protected veteran status, disability, or any other protected categories under all applicable laws.
Tekberry Inc. is a Certified Minority Business Enterprise (MBE) and Certified Disadvantaged Business Enterprise (DBE).
By submitting your resume, you are explicitly consenting to receive communications from our organization via text message. Rest assured, all our texts are sent by real people, and we look forward to a conversation with you about this job!
Check out all our jobs at #INDHP