Data Engineer (Mid), Hyderabad, 6+ yrs
Posted on: 8 days ago
Vacancies: 1
Job Summary
Data Engineer (Mid) - 4777
IT Development Team

Overview
- Location: Onsite, Hyderabad
- Employment Type: Full-time
- Experience: 6 years
- Compensation: INR-

Interview Process
- L1: Technical interview with Engineer (Virtual)
- L2: SME interview (Virtual)
- L3: Final round with Manager (Onsite)

Job Title
Data Engineer (Mid)

Experience Required
- 6 years of hands-on experience building enterprise-scale applications, data platforms, or distributed systems.
- Strong experience in Python (5 years).
- Big Data (6 years).
- Hands-on expertise with AWS services, particularly EMR (mandatory and most critical skill), EC2, and Lambda.
- Candidate should have experience with EMR performance and cost optimization.

Overview
Seeking a Data Engineer / Backend Developer to design, build, and operate cloud-native, API-driven Big Data systems. The role focuses on AWS EMR and Spark-based batch processing, with strong emphasis on Python development, workflow orchestration (Airflow), and production reliability.

Key Responsibilities
- Design and develop API-driven systems to govern, manage, and monitor large-scale batch Big Data applications.
- Build scalable backend services and data engineering solutions supporting data processing and operational workflows.
- Develop and maintain data transformation processes using Spark, SQL, Hive, Python, Scala, and related technologies.
- Build cloud-native data solutions using AWS services including EMR, S3, EC2, Lambda, DynamoDB, and API Gateway.
- Design and enhance Apache Airflow workflows, including complex DAGs, scheduling, dependency management, monitoring, and failure handling.
- Optimize AWS EMR performance and cost efficiency across large-scale data workloads.
- Participate in requirements gathering, technical research, and solution design.
- Contribute to architecture reviews, code reviews, performance tuning, and operational readiness.
- Support testing, validation, and integration across applications and data pipelines.
- Collaborate with product owners, data engineers, backend engineers, QA, and DevOps teams in an Agile/Scrum environment.
- Troubleshoot production issues and drive automation to improve system reliability.
- All other duties as assigned.

Required Qualifications
- 6 years of hands-on experience building enterprise-scale applications, data platforms, or distributed systems.
- 6 years of experience developing and operating Big Data platforms in the cloud, preferably using AWS EMR and the Hadoop ecosystem.
- Strong hands-on experience with AWS, Spark, Python and/or Scala, Airflow, SQL, and Hive.
- Advanced Python or Scala development skills, with experience building production-grade data pipelines.
- Strong experience with Apache Airflow orchestration for complex production workflows.
- Solid understanding of data engineering concepts, including batch processing, data quality, performance optimisation, and reliability.
- Experience with Git, Jenkins, and CI/CD workflows.
- Strong analytical, problem-solving, and communication skills.
- Ability to work effectively in cross-functional Agile/Scrum teams.

Technical Skills
- Required: AWS, AWS EMR, Apache Spark, Python, Scala, Apache Airflow, SQL, Apache Hive, Hadoop ecosystem, CI/CD, Git, Jenkins
- Nice-to-have: LLMs, Generative AI, Agentic AI, API design, Microservices, Event-driven architectures, Serverless architectures, Infrastructure as code, Automated testing, Production deployment, Data governance, Metadata management, Data lineage, Data observability
- Tools/Platforms: AWS EMR, Amazon S3, Amazon EC2, AWS Lambda, Amazon DynamoDB, Amazon API Gateway, Apache Spark, Apache Airflow, Apache Hive, Hadoop ecosystem, Git, Jenkins

Preferred Qualifications
- Experience with LLMs, Generative AI, Agentic AI, or AI-assisted engineering workflows.
- Experience with API design, microservices, event-driven, or serverless architectures.
- Experience with infrastructure as code, automated testing, and production deployment.
- Exposure to data governance, metadata management, lineage, or data observability platforms.

Role Logistics
- Location Type: Onsite
- Employment Type: Full-Time