Industry/Sector
Not Applicable
Specialism
Data Analytics & AI
Management Level
Manager
Job Description & Summary
At PwC, our people in data and analytics focus on leveraging data to drive insights and make informed business decisions. They utilise advanced analytics techniques to help clients optimise their operations and achieve their strategic goals.
Experience: 8 years
PwC US - Acceleration Center is looking for an experienced and visionary GenAI Data Engineer to join our team as a Manager. This leadership role involves overseeing the development and maintenance of data pipelines, the implementation of machine learning models, and the optimization of data infrastructure for our GenAI projects. The ideal candidate will have an extensive background in data engineering with a deep focus on GenAI technologies and a solid understanding of data processing, event-driven architectures, containerization, and cloud computing.
Responsibilities:
Lead the design, development, and maintenance of robust data pipelines and ETL processes for GenAI projects.
Manage and guide a team of data scientists and data engineers in implementing complex data and machine learning systems.
Strategize and optimize data infrastructure and storage solutions to ensure efficient, scalable, and reliable data processing across projects.
Implement and optimize real-time data streaming solutions using platforms such as Kafka, Spark Streaming, or similar.
Oversee the deployment of containerization technologies like Kubernetes and Docker to enhance scalability and operational efficiency.
Direct the development and governance of data lakes, ensuring effective management of large volumes of structured and unstructured data.
Lead the integration of LLM frameworks (such as LangChain and Semantic Kernel) to advance language processing and analytical capabilities.
Collaborate with cross-functional teams to architect and implement solution frameworks that align with GenAI project goals.
Develop and deploy solutions on multiple cloud platforms (Azure, AWS, GCP, Databricks), leveraging cloud-native services and containerization (Kubernetes, Docker).
Monitor, diagnose, and resolve issues within data pipelines and systems to maintain continuous and smooth operations.
Stay current with GenAI and data engineering trends; recommend and implement innovative solutions.
Implement CI/CD pipelines and version control (Git) for efficient development and deployment.
Translate complex business requirements into effective technical solutions, driving project success and technological innovation.
Document and standardize data engineering processes, methodologies, and best practices across teams.
Ensure professional development and certification in solution architecture for team members, maintaining industry best practices.
Requirements:
8 years of relevant technical/technology experience with a strong emphasis on GenAI projects.
Proficiency in Python (minimum 3 years) and SQL (must have), with hands-on experience in Scala, Java, and shell scripting.
Experience with Spark and/or Hadoop for distributed data processing.
Solid understanding of designing and architecting scalable Python applications, particularly for GenAI use cases, with a strong understanding of various components and systems architecture patterns to build cohesive, decoupled, and scalable applications.
Familiarity with Python web frameworks (Flask, FastAPI) for building web applications around AI models.
Demonstrated ability to design applications with modularity, reusability, and security best practices in mind (session management, vulnerability prevention, etc.).
Familiarity with cloud-native development patterns and tools (e.g., REST APIs, microservices, serverless functions).
Experience deploying and managing containerized applications on Azure/AWS/GCP/Databricks (Azure Kubernetes Service, Azure Container Instances, or similar).
Strong proficiency in Git for effective code collaboration and management.
Proficiency in SQL and database management systems.
Excellent collaboration and communication skills.
Nice to Have Skills:
Experience in setting up data pipelines for model training and real-time inference.
Exposure to LLM frameworks and tools for interacting with large language models.
Experience developing and deploying machine learning applications in production environments.
Understanding of data privacy and compliance regulations.
Practical knowledge of ML/DL frameworks such as TensorFlow, PyTorch, and scikit-learn.
Proficient in object-oriented programming with languages such as Java, C, or C#.
Educational Background:
BE / MCA / Degree / MBA / Any degree
Preferred Qualifications:
Relevant certifications in Databricks, cloud, or data engineering.
Finance background, or experience working in the finance or banking domain.
Travel Requirements
0%
Job Posting End Date
Required Experience:
Manager
At PwC, our purpose is to build trust in society and solve important problems. We’re a network of firms in 155 countries with over 284,000 people who are committed to delivering quality in assurance, advisory and tax services. Find out more and tell us what matters to you by vis ...