Cloud Data Engineer – Python, Spark, Scala, AWS & AI Integration

Synechron

Job Location:

Bengaluru - India

Monthly Salary: Not Disclosed
Posted on: Yesterday
Vacancies: 1 Vacancy

Job Summary

Synechron is seeking a highly skilled Data Engineer to design, develop, and maintain data pipelines and analytical solutions within enterprise data platforms. This role requires hands-on expertise with cloud-native data processing, leveraging big data frameworks and integrating AI/ML capabilities to support business intelligence and data-driven decision-making. As a strategic contributor, you will lead initiatives to deliver scalable, reliable, and secure data solutions aligned with industry regulations, fostering operational efficiency and advanced analytics.


Software Requirements

Required:

  • Hands-on experience with Python, PySpark, and Scala for building data pipelines and processing massive datasets (4 years)

  • Proficiency in big data platforms: Spark, Hadoop, or similar (batch and streaming processing)

  • Experience with cloud-based data solutions on AWS, including EMR, S3, Glue, CloudFormation, and CDK

  • Deep understanding of SQL and of databases such as PostgreSQL and SQL Server (relational) and DynamoDB (NoSQL)

  • Familiarity with ETL frameworks and data management best practices

  • Knowledge of data lineage, data quality, and metadata management


Preferred:

  • Experience integrating AI/ML models and GenAI frameworks such as LangChain and Hugging Face

  • Exposure to NoSQL databases and advanced data storage solutions

  • Knowledge of containerization and orchestration: Docker, Kubernetes


Overall Responsibilities

  • Design, develop, and support scalable data pipelines and data processing architectures using PySpark, Scala, and cloud-native tools

  • Build and optimize batch and streaming data workflows supporting enterprise analytics and reporting

  • Translate business requirements into robust, high-performance data solutions, ensuring data integrity and security

  • Tune, debug, and optimize data processing jobs to meet strict SLAs

  • Implement data quality, compliance, and governance standards across enterprise data assets

  • Collaborate with data analysts, data scientists, and business teams to develop data models and insights

  • Lead efforts to automate data workflows, orchestrate processing pipelines, and support cloud migrations

  • Provide production support and perform root cause analysis for data pipeline issues

  • Stay updated on emerging data technologies, AI/ML integrations, and industry best practices

  • Document processes, data lineage, and architecture to support compliance and operational transparency


Technical Skills (By Category)

Programming Languages (Essential):

  • Python, Scala, PySpark (4 years)

  • SQL for data querying, validation, and optimization


Preferred:

  • Java or additional scripting languages for automation


Frameworks & Libraries:

  • Spark, Hadoop, Hive, and related big data tools

  • AI/ML frameworks such as LangChain, Hugging Face, or similar (preferred)

  • Data validation and lineage tools


Databases & Storage:

  • Relational: PostgreSQL, SQL Server, Oracle

  • NoSQL: DynamoDB, Cassandra


Cloud Technologies:

  • AWS: EMR, S3, Glue, Lambda, CloudFormation, CDK, Redshift (desired)


Data Management & Governance:

  • Metadata management, data lineage, data quality frameworks

  • Enterprise data governance standards and compliance requirements


Experience Requirements

  • 4 years of hands-on experience designing, developing, and supporting enterprise data pipelines

  • Proven expertise working with big data frameworks such as Spark, Hadoop, Hive, and Kafka

  • Practical experience with cloud-native data solutions on AWS or similar platforms

  • Exposure to applying AI or GenAI models within data pipelines is highly valued

  • Experience working in regulated industries like banking, finance, or healthcare is advantageous


Day-to-Day Activities

  • Develop, test, and optimize data pipelines handling large-scale datasets

  • Coordinate with data scientists, analytics teams, and product owners to refine data models

  • Troubleshoot and resolve performance bottlenecks or data quality issues

  • Automate data workflows and orchestrate processes using cloud-native tools and frameworks

  • Support cloud infrastructure provisioning and migration strategies

  • Monitor data pipeline health, perform root cause analysis, and implement improvements

  • Document data processes, lineage, and governance policies for compliance and operational transparency

  • Stay updated on emerging trends in AI/ML, data engineering, and cloud-native solutions


Qualifications

  • Bachelor's or Master's degree in Computer Science, Data Engineering, or a related field

  • 4 years of experience with cloud-based data engineering and big data platforms

  • Proven track record delivering scalable, secure, and regulatory-compliant data pipelines

  • Certifications such as AWS Certified Data Analytics or equivalent are preferred


Professional Competencies

  • Strong analytical and troubleshooting skills for complex data environments

  • Excellent collaboration and stakeholder management skills

  • Leadership qualities for guiding junior team members and technical decision-making

  • Adaptability to new tools, protocols, and industry changes

  • Results-oriented with a focus on data quality, security, and operational efficiency

  • A continuous learning mindset for emerging technologies in data science and cloud data engineering

SYNECHRON'S DIVERSITY & INCLUSION STATEMENT

Diversity & Inclusion are fundamental to our culture, and Synechron is proud to be an equal opportunity workplace and an affirmative action employer. Our Diversity, Equity, and Inclusion (DEI) initiative, "Same Difference", is committed to fostering an inclusive culture promoting equality, diversity, and an environment that is respectful to all. We strongly believe that a diverse workforce helps build stronger, successful businesses as a global company. We encourage applicants from across diverse backgrounds, race, ethnicities, religion, age, marital status, gender, sexual orientations, or disabilities to apply. We empower our global workforce by offering flexible workplace arrangements, mentoring, internal mobility, learning and development programs, and more.


All employment decisions at Synechron are based on business needs, job requirements, and individual qualifications, without regard to the applicant's gender, gender identity, sexual orientation, race, ethnicity, disabled or veteran status, or any other characteristic protected by law.

Required Experience:

IC

Key Skills

  • APIs
  • Jenkins
  • REST
  • Python
  • SOAP
  • Systems Engineering
  • Service-Oriented Architecture
  • Java
  • XML
  • JSON
  • Scripting
  • SFTP

About Company

At Synechron, we believe in the power of digital to transform businesses for the better. Our global consulting firm combines creativity and innovative technology to deliver industry-leading digital solutions. Progressive technologies and strategies ...
