We are looking for an experienced Senior DevOps Engineer to join our team and play a key role in building and developing modern, scalable solutions based on GenAI technologies. If you are passionate about automation, cloud (mainly Azure), and infrastructure as code, and want to co-develop advanced AI/ML platforms in a production environment, this role could be for you.
Main tasks:
- Design, implement, and manage scalable, secure, and highly available cloud infrastructure for GenAI-based platforms (Azure-focused).
- Develop, maintain, and optimize CI/CD pipelines for rapid, reliable deployment of AI/ML and conversational AI solutions.
- Automate provisioning, monitoring, scaling, and failover for cloud-native microservices (Kubernetes, Docker, Helm, Terraform).
- Maintain 24/7 operations practices, including monitoring, logging, alerting, and disaster recovery, for production AI workloads.
- Collaborate closely with AI/ML engineers, backend developers, and security/compliance teams to ensure end-to-end delivery of GenAI products.
- Implement best practices for infrastructure as code (IaC), cost management, and cloud resource optimization.
- Integrate and manage data storage solutions (Azure Postgres Flex, data lakes, warehousing) and secure data pipelines (ETL/ELT).
- Support the integration of APIs, API gateways, and service mesh components for scalable, multi-channel (chat/voice) deployments.
- Contribute to privacy, GDPR, and compliance standards, enabling secure, auditable handling of sensitive data.
- Mentor junior DevOps/Cloud Engineers and promote DevOps culture in the GenAI team.
Qualifications:
- Degree in Computer Science, Information Technology, or a related field.
- 5 years of hands-on experience in DevOps or Cloud Engineering, including Azure (preferred), AWS, or similar environments.
- Strong knowledge of Kubernetes, Docker, Helm, and Terraform (infrastructure as code).
- Proven experience with CI/CD tools (Azure DevOps, Jenkins, GitLab) and automation scripting (Python, Bash, etc.).
- Experience managing secure, high-availability, and large-scale distributed systems in production (24/7 ops).
- Deep understanding of cloud networking, API gateways, load balancing, and monitoring/logging stacks (Prometheus, Grafana, ELK).
- Familiarity with data storage solutions (Azure PostgreSQL, data lake/warehousing) and secure ETL/ELT pipelines.
- Solid experience with cloud security, compliance (GDPR), RBAC, and auditability.
- Strong troubleshooting, incident management, and performance tuning skills.
Additional Information:
- Hybrid work: 3 days a week from one of our offices in Warsaw, Lublin, or Poznań.
- Participation in 24/7 on-call duties (likely one week per month).
Remote Work:
No
Employment Type:
Full-time