At EY, we're all in to shape your future with confidence.
We'll help you succeed in a globally connected powerhouse of diverse teams and take your career wherever you want it to go.
Join EY and help to build a better working world.
L2 Support Engineer - Gen AI Projects (GCP)
Location: Offshore
Role Type: Support & Operations
Experience: 2–4 years
Role Summary
The L2 Support Engineer will monitor, troubleshoot, and support multiple Gen AI applications built on Google Cloud Platform (GCP). The role focuses on following runbooks, identifying where issues are occurring, providing first-level fixes, and handing off deep technical problems to the L3 engineering team.
Key Responsibilities
- Monitor Gen AI pipelines, dashboards, alerts, and system health across various applications.
- Follow runbooks to investigate failures and identify exactly where the process or workflow is breaking.
- Perform basic troubleshooting and resolve issues related to access, permissions, environment setup, or configuration.
- Validate errors using logs, job history, and the GCP console (Cloud Functions, Cloud Run, Pub/Sub, Firestore, GCS).
- Raise detailed tickets to L3 with clear findings, logs, and reproduction steps.
- Support daily operations such as checking stuck workflows, failed requests, or incomplete data ingestions.
- Perform minor fixes such as restarting jobs, updating config values, clearing queues, or re-triggering pipelines.
- Handle user access requests, IAM role updates, and environment-related incidents within approved guidelines.
- Ensure SLAs are met for incident acknowledgment, analysis, and escalation.
- Maintain documentation, update runbooks, and contribute to process improvements.
- Coordinate closely with L3 engineers, cloud teams, and application owners as needed.
Required Skills
- Basic understanding of cloud environments (preferably GCP).
- Ability to read logs and identify errors across functions, pipelines, and APIs.
- Familiarity with JSON, REST APIs, and debugging simple configuration issues.
- Good understanding of Python basics or the ability to read/debug simple scripts (not mandatory).
- Strong problem-solving skills with an ability to follow structured troubleshooting steps.
- Good communication skills for reporting issues, writing summaries, and escalating to L3.
Nice-to-Have
- Exposure to Gen AI/LLM applications, Document AI, or RAG systems.
- Experience with ticketing systems (Jira, ServiceNow).
- Basic knowledge of Pub/Sub, Cloud Run, Cloud Functions, or Firestore.
L3 Gen AI Engineer - Support & Engineering (GCP)
Location: Offshore
Role Type: L3 Technical Support / Engineering
Experience: 5–10 years (with 2 years in AI/LLM/RAG-specific systems)
Role Summary
The L3 Gen AI Engineer will provide deep technical support, debugging, and enhancements for enterprise Gen AI applications running on Google Cloud Platform (GCP). This role handles escalations from L2, resolves complex failures across AI pipelines, improves system performance, and implements fixes across models, orchestration, backend services, and integrations.
Key Responsibilities
- Diagnose and resolve complex issues in Gen AI workflows (LLM pipelines, RAG retrieval, document processing, audio-to-text summarization, etc.).
- Debug model-related problems such as prompt failures, accuracy drops, hallucinations, versioning issues, and latency.
- Troubleshoot backend services including Cloud Run, Cloud Functions, Pub/Sub, Firestore, Composer DAGs, and API integrations.
- Fix failures in document ingestion, Document AI extraction, vector indexing, and data transformation pipelines.
- Review logs, traces, and metrics to identify root causes and apply permanent fixes.
- Implement hotfixes, patch releases, new prompt versions, and workflow improvements.
- Collaborate with UI, middleware, and ML teams to resolve cross-component issues.
- Optimize performance, throughput, token usage, and cloud costs across AI workloads.
- Improve observability by enhancing logging, monitoring, and alerting strategies.
- Manage model lifecycle tasks: deploying new LLM endpoints, tuning prompts, and updating embeddings where needed.
- Ensure best practices for security, PII handling, role-based access, and environment integrity.
- Provide guidance to L2 support by updating runbooks, troubleshooting guides, and automation scripts.
Required Skills
- Strong hands-on experience in building or supporting Gen AI/LLM-based systems (OpenAI, Vertex AI, Claude, etc.).
- Deep understanding of RAG workflows, vector databases, embeddings, and retrieval pipelines.
- Strong programming skills in Python (must-have).
- Experience with LLM orchestration frameworks (LangChain, LiteLLM, custom orchestrators).
- Good knowledge of GCP cloud services:
- Cloud Run, Cloud Functions
- Pub/Sub, Firestore, GCS
- Vertex AI / Document AI
- Logging, Monitoring, IAM
- Ability to debug distributed workflows, async jobs, and event-driven architectures.
- Experience with CI/CD pipelines and Git-based workflows.
- Strong troubleshooting and root-cause analysis capabilities.
Nice-to-Have
- Experience with multimodal pipelines (Speech-to-Text, audio processing, call summarization).
- Experience deploying or fine-tuning custom models.
- Knowledge of TypeScript or middleware frameworks for end-to-end debugging.
- Familiarity with enterprise security, networking, and role-based access controls.
EY | Building a better working world
EY is building a better working world by creating new value for clients, people, society, and the planet, while building trust in capital markets.
Enabled by data, AI, and advanced technology, EY teams help clients shape the future with confidence and develop answers for the most pressing issues of today and tomorrow.
EY teams work across a full spectrum of services in assurance, consulting, tax, strategy, and transactions. Fueled by sector insights, a globally connected multi-disciplinary network, and diverse ecosystem partners, EY teams can provide services in more than 150 countries and territories.