Software Engineer E5 (Kubernetes)

Whatfix

Job Location:

San Jose, CA - USA

Monthly Salary: Not Disclosed
Posted on: 30+ days ago
Vacancies: 1 Vacancy

Job Summary

Who are we

Whatfix is an AI platform advancing the userization of enterprise applications, empowering companies to maximize the ROI of their digital investments. Technology needs adoption, and it's no different for AI. As AI reshapes roles, workflows, and human-machine interactions, it also introduces new layers of complexity and user friction. This is where Whatfix plays a pivotal role, with a decade-old DNA of empowering people to succeed with technology, not replacing them. We call this philosophy Userization: the belief that technology must adapt to the user, not the other way around.

At the heart of the userization philosophy is ScreenSense, our proprietary AI engine, which continuously interprets both the context of what users are doing in an application or an AI tool and the intent behind their actions. By combining these signals, Whatfix delivers real-time guidance, nudges, knowledge, and automation directly in the flow of work.

This intelligence powers our entire product suite.

  • Digital Adoption helps users get productive faster.
  • Product Analytics uncovers friction and closes adoption gaps.
  • Mirror allows employees to train in safe simulated environments.

These are embedded with Whatfix AI Agents, which supercharge creation, insights, and user guidance.

Our upcoming AI-first products are already creating a buzz in the market.

  • Seek is an AI-native assistant that not only knows your business context but can also act across applications to get work done on your behalf.
  • Whatfix Mirror 2.0 is the world's only System plus Role simulation with a complete assessment to lead the Gen AI simulation category.

Together, these products reflect Whatfix's commitment to building enterprise-ready AI teammates that maximize productivity and ROI. They give users a unified, intelligent way to find answers across systems, apps, and knowledge silos, and help anyone looking to deliver fast and contextual answers.

Whatfix is bridging the gap between rapid technological change and human enablement, ensuring AI is not only embedded but also usable, trusted, and outcome-driven for every employee.

At Whatfix, we're not just making software easier; we're making AI work for people.

The company has seven offices across the US, India, UK, Germany, Singapore, and Australia, and a presence across 40 countries.

Customers: 700 enterprise customers, including 80 Fortune 500 companies such as Shell, Schneider Electric, and UPS Supply Chain Solutions.

Investors: A total of $270 million USD has been raised to date, most recently a Series E round of $125 million USD led by Warburg Pincus with participation from existing investor SoftBank Vision Fund 2. Other investors include Cisco Investments, Eight Roads Ventures (a division of Fidelity Investments), Dragoneer, Peak XV Partners, and Stellaris Venture Partners.

Whatfix's leadership is consistently recognized by top industry analysts and business rankings:

  • Won the 2025 AI Breakthrough Award for the Overall AI-based Analytics Solution of the Year.
  • Only DAP to be recognized as a Leader across various DAP reports for the past 5 years by leading analyst firms like Gartner, Forrester, IDC, and Everest Group.
  • With over 45% YoY sustainable annual recurring revenue (ARR) growth, Whatfix is among the Top 50 Indian Software Companies as per G2 Best Software Awards.
  • Named a Gartner Customers' Choice for DAP for the second year in a row (2024 and 2025), the only vendor in the market to earn this distinction consecutively.
  • We also boast a star rating of 4.6 on G2 Crowd and 4.5 on Gartner Peer Insights, and a super-high CSAT of 99.8%.
  • Stevie Award winner (Bronze) in the category Customer Service Department of the Year, Computer Software - 100 or More Employees.
  • Winner of the ISG Paragon Innovation Award in partnership with Sophos (customer) for the EMEA region and finalist in the Transformation Award category.
  • RemoteTech Breakthrough Awards winner for Software Asset Management Solution of the Year.

These recognitions are matched by business performance:

  • Highest-ranking DAP on the 2023 Deloitte Technology Fast 500 North America for the fifth consecutive year.
  • Listed on the Financial Times & Statista's High-Growth Companies Asia-Pacific 2025 list.
  • Won Silver for the Stevie's Employer of the Year 2023 in the Computer Software category and also recognized as a Great Place to Work 2022-2023.
  • Only DAP to be among the top 35% of companies worldwide in sustainability excellence, with an EcoVadis Bronze Medal.

Position Overview: We are looking for a highly skilled and experienced Software Engineer (E5) to join our Site Reliability Engineering team who can take end-to-end ownership of large business-critical features. You'll design, build, ship, and operate reliable, scalable services; break complex work into actionable tasks for yourself and other engineers; set the technical bar through thoughtful design and rigorous reviews; and mentor teammates while partnering with product, platform, and customer-facing groups to keep our systems fast, observable, and always on.

Candidates must be authorized to work in the United States on a full-time basis without employer sponsorship either now or in the future.

Responsibilities:

Scope & Impact:

This role is critical to enhancing the reliability, availability, and overall resilience of Whatfix's software products. The role will own these non-functional areas and build automated mechanisms to target gaps in them. These automated mechanisms should be scalable to the extent that other engineering teams can build their own pipelines to ensure reliability for the services they own. The role should be able to build a framework that democratizes the approach to enhancing the observability, recoverability, and self-healing capabilities of the products in the Whatfix ecosystem. This should also give other engineering teams visibility into the performance of their microservices.

Ownership:

  • Designs and ships scalable platform code that bakes in reliability, fault tolerance, and self-healing for all Whatfix products.
  • Owns, designs, and develops frameworks (eliminating or significantly reducing manual effort, e.g. through self-healing and auto-scaling systems and platformization), processes, and architecture that enhance the availability and reliability of the system.
  • Serves as a first responder for critical software issues within the team's domain.
  • Prioritizes and takes ownership of unowned or complex tasks that enable the team to move faster.
  • Ensures that customer issues are not just fixed but that effective long-term solutions are implemented to prevent recurrence.

Technical Execution:

  • Own task breakdown from stories/features, ensuring each task is feasible within five days.
  • Detail out design documents for the features being worked on.
  • Implement well-tested and documented code based on engineering standards and best practices.
  • Own and support the features owned by the team to ensure high availability and compliance.
  • Review designs and code written by peers as well as other teams from the perspectives of testability, maintainability, reliability, security, and cost.
  • Work with other teams to enhance developer experience by improving developer tools; suggest and implement AI workflows in the areas of observability, availability, and reliability.
  • Demonstrate expertise in one or more technical areas and contribute to the overall technical direction of the team.

Skillset:

Observability and Alertability of Infrastructure:

The candidate should have proven experience in:

  • Increasing the observability of software systems
  • Managing infrastructure in an automated manner (utilizing automated pipelines for CI/CD and frameworks for IaC)
  • Identifying gaps in monitoring and observability and fixing such gaps in a sustainable, scalable, and automated manner
  • Defining SLAs for systems, continuously tracking them, and working to improve them
  • Resilience engineering practices: driving post-incident blameless RCAs and converting findings into code, tests, and platform improvements

Collaboration & Guidance:

The candidate should have experience in:

  • Working with other teams to help enhance the observability and recoverability (such as through self-healing) of those teams' features
  • Conducting training sessions or workshops on observability and reliability practices
  • Providing guidance on best practices for monitoring, alerting, and logging

Required Technical Skills and Qualifications:

The candidate should have experience in the following technologies:

  • Strong experience in Java.
  • Working experience in Kubernetes, Helm, and ArgoCD.
  • Ability to work with Java and Python based applications and identify gaps that could result in failures.
  • Familiarity with CI/CD pipelines and infrastructure as code (IaC) practices.

Preferred Skills:

  • Familiarity with log aggregation tools (e.g. ELK Stack).
  • Knowledge of Chaos Engineering principles.

Soft Skills:

  • Strong problem-solving and troubleshooting abilities.
  • Excellent communication and collaboration skills.
  • Ability to mentor and guide cross-functional teams.

Perks / Benefits

  • Uncapped incentives
  • Equity plan
  • Mac shop; work with the newest technologies
  • Unlimited PTO policy
  • Paid maternity/paternity leave
  • Monthly cell phone stipend
  • Paid daily UberEats lunches
  • Medical, Dental, and Vision coverage (Whatfix pays 80% of the premium for individuals and their families; for the HSA, Whatfix contributes $1000 for individuals and $2000 for a family)
  • Team and company outings
  • Learning and Development benefits

At Whatfix, we value collaboration, innovation, and human connection. We believe that working together in the office five days a week fosters open communication, strengthens our community, and drives innovation, helping us achieve our goals more effectively.

To facilitate global collaboration, our US teams start and end early, while our India teams start and end late, so teams do not have any evening meetings. Relocation and Sponsorship offered.

We strive to live and breathe our Cultural Principles and encourage employees to demonstrate some of these core values - Customer First; Empathy; Transparency; Fail Fast and scale Fast; No Hierarchies for Communication; Deep Dive and innovate; Trust is the foundation; and Do it as you own it.

Whatfix is an Equal Opportunity Employer and an E-Verify participant. All activities must comply with our Equal Opportunity Laws, ADA, and other regulations as appropriate.

We are an equal opportunity employer and value diverse people because of, and not in spite of, their differences. We do not discriminate on the basis of race, religion, color, national origin, ethnicity, gender, sexual orientation, age, marital status, veteran status, or disability status.

Compensation will be determined by factors such as level, job-related knowledge, skills, and experience.

Due to our company's global nature and our hiring committee's span of different time zones, the interviews for this role will be recorded for those not in attendance to review.

Key Skills

  • Spring
  • .NET
  • C/C++
  • Go
  • React
  • OOP
  • C#
  • Data Structures
  • JavaScript
  • Software Development
  • Java
  • Distributed Systems