Our Company
Changing the world through digital experiences is what Adobe's all about. We give everyone, from emerging artists to global brands, everything they need to design and deliver exceptional digital experiences! We're passionate about empowering people to create beautiful and powerful images, videos, and apps, and transform how companies interact with customers across every screen.
We're on a mission to hire the very best and are committed to creating exceptional employee experiences where everyone is respected and has access to equal opportunity. We realize that new ideas can come from everywhere in the organization, and we know the next big idea could be yours!
Job Description
System Architecture & Technical Strategy
- Define and drive the long-term reliability and scalability strategy for the Adobe Pass platform, aligning with product and business goals.
- Architect large-scale, distributed, and multi-region systems designed for resiliency, observability, and self-healing.
- Anticipate systemic risks and design proactive mitigation strategies, ensuring zero single points of failure across critical services.
- Partner with software architecture and infrastructure teams to evolve the platform toward greater reliability, efficiency, and cost optimization.
Automation, Observability & Reliability Engineering
- Build and champion advanced automation frameworks that enable zero-touch operations across deployment, recovery, and scaling workflows.
- Introduce AI/ML-based predictive monitoring and anomaly detection systems to anticipate failures before they impact users.
- Lead organization-wide reliability initiatives such as chaos engineering, error budgets, and SLO adoption, driving measurable reliability improvements.
- Continuously refine observability architecture (metrics, traces, logs) to ensure comprehensive, actionable insights into production health.
Incident Response & Operational Excellence
- Serve as a technical authority during high-impact incidents, guiding cross-functional teams through real-time mitigation and long-term prevention.
- Establish and enforce best-in-class incident management frameworks, improving MTTR and MTBF and reducing incident recurrence rates.
- Lead blameless postmortems and translate findings into actionable reliability roadmaps.
- Drive reliability reviews and operational readiness assessments for all major product launches.
Performance, Scalability & Cost Efficiency
- Lead large-scale performance tuning and capacity engineering efforts, ensuring optimal resource utilization and cost efficiency across environments.
- Identify architectural bottlenecks, drive performance benchmarking, and influence platform evolution for better scalability and elasticity.
- Partner with FinOps and CloudOps to optimize spend while maintaining reliability SLAs and SLOs.
Cross-Team Leadership & Mentorship
- Mentor and coach SREs and software engineers, cultivating deep reliability-first thinking across teams.
- Serve as an inspiring leader in reliability engineering, encouraging reliable methods, fostering an automation-first culture, and guiding technical standards across multiple teams.
- Collaborate with engineering leaders, PMs, and operations to align priorities, set strategic goals, and deliver on high-impact reliability initiatives.
- Lead technical deep dives and design reviews, ensuring all systems are built to scale securely and reliably.
Qualifications
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
- 10 years of experience in site reliability, production engineering, or large-scale distributed system operations.
- Proven track record of designing and managing highly available, globally distributed systems in cloud-native environments (AWS, Azure, GCP).
- Expert-level proficiency in one or more programming/scripting languages (Python, Go, Java, Bash) for automation and tooling.
- Deep understanding of Kubernetes, microservices, and service mesh architectures.
- Advanced experience with Infrastructure as Code (Terraform, CloudFormation) and CI/CD automation frameworks.
- Mastery in observability and monitoring stacks (Prometheus, Grafana, Datadog, OpenTelemetry).
- Strong expertise in networking, storage, and distributed databases (both SQL and NoSQL).
- Demonstrated ability to influence architectural decisions and drive reliability strategy across organizations.
- Outstanding communication, leadership, and collaborator management skills.
Preferred Qualifications
- Experience designing reliability frameworks or SRE platforms at scale (error budgets, chaos engineering, reliability reviews).
- Prior experience in high-traffic or latency-sensitive systems (media streaming, advertising, or real-time platforms).
- Familiarity with distributed data ecosystems (Kafka, Spark, Hadoop) and large-scale data ingestion pipelines.
- Hands-on experience with security, compliance, and governance in production environments (SOC2, GDPR, ISO27001).
- Cloud or Kubernetes certifications (AWS Solutions Architect Professional, CKA/CKAD, GCP Professional Cloud Architect).
- Published contributions or conference talks on reliability, automation, or distributed systems.
Adobe is proud to be an Equal Employment Opportunity employer. We do not discriminate based on gender, race or color, ethnicity or national origin, age, disability, religion, sexual orientation, gender identity or expression, veteran status, or any other applicable characteristics protected by law.
Adobe aims to make its website accessible to any and all users. If you have a disability or special need that requires accommodation to navigate our website or complete the application process, contact Adobe by email or phone.
Required Experience:
Senior IC