DevOps Production Support Engineer

Inetum

Job Location: Lisbon - Portugal

Monthly Salary: Not Disclosed
Posted on: 2 hours ago
Vacancies: 1

Job Summary

For this project, you will be responsible for ensuring the stability, availability, and performance of applications within the team's scope. You will play a key role in incident management, ensuring timely resolution by collaborating with internal teams (Development and Infrastructure) and external stakeholders (Service Providers) while driving sustainable long-term solutions.

Key Responsibilities

Application Stability & Availability

  • Monitor, maintain, and support in-scope applications to ensure high availability, reliability, and performance.
  • Actively participate in incident management activities, including:
    • Situation Rooms for P1/P2 incidents
    • Root Cause Analysis (RCA)
    • Identification of incident trends and contribution to permanent solutions
  • Ensure compliance with ITIL governance within IT Production, including SLA management.
  • Execute change requests and deployments in accordance with ITIL and DevOps processes and tools.
  • Proactively identify and resolve technical issues to ensure smooth business operations.
  • Participate in on-call rotations and provide 24/7 support for critical applications when required.

Technical Support & Cross-Team Collaboration

  • Serve as a primary point of contact for Development teams, supporting troubleshooting activities and coordinating fixes.
  • Work closely with Scrum and Agile teams to design, deploy, and continuously improve systems.
  • Implement upgrades, patches, and new functionalities while ensuring minimal impact on end users.

Platform Monitoring & Observability

  • Implement, configure, and optimize monitoring solutions within the production environment (e.g., Dynatrace).
  • Collaborate with Development teams and Centers of Expertise to define effective monitoring and observability practices.
  • Promote observability awareness to enable early detection and proactive resolution of potential issues.
  • Use distributed tracing, logging, and metrics tools (e.g., Jaeger, Grafana, Prometheus, ELK).

Documentation & Knowledge Sharing

  • Create, maintain, and update technical documentation, including processes, configurations, and troubleshooting guides.
  • Share best practices and technical knowledge with global support teams to improve service quality and operational efficiency.

 


Qualifications

API, Application Servers & Kubernetes

  • Strong experience with Java application servers, particularly Red Hat JBoss EAP.
  • Solid Java knowledge, including:
    • Heap and thread dump analysis
    • Performance tuning and optimization
  • Experience with OpenShift and Kubernetes-based platforms, including cloud-native environments.
  • API Gateway integration and support.
  • Strong knowledge of the RHEL (Red Hat Enterprise Linux) operating system.

Monitoring, Automation & DevOps

  • Hands-on experience with observability and monitoring tools such as:
    • Dynatrace
    • Jaeger
    • Grafana
    • Prometheus
    • ELK Stack
  • Experience setting up and optimizing CI/CD pipelines using tools such as:
    • GitLab
    • ArgoCD
    • Jenkins
    • Sonatype Nexus
  • Experience with infrastructure and automation tools such as Ansible and/or Terraform.

Soft Skills

  • Strong problem-solving and critical-thinking abilities.
  • Excellent collaboration and teamwork skills.
  • Clear and effective communication skills.
  • Resilience and adaptability in fast-paced environments.
  • Ability to manage stress and perform effectively during critical incidents.
  • Strong sense of accountability, ownership, and autonomy.
  • Effective time management and prioritization skills.
  • Goal-oriented mindset with strong attention to detail.

Language Skills

  • Fluent in Portuguese and English.

Remote Work: No

Employment Type: Full-time

About Company

Inetum is a European leader in digital services. Inetum’s team of 28,000 consultants and specialists strive every day to make a digital impact for businesses, public sector entities and society. Inetum’s solutions aim at contributing to its clients’ performance and innovation as well ...
