Job Overview:
Must-Haves:
- 3 years of experience in Data Engineering
- Strong expertise in building ETL pipelines to manage and process data effectively
- Hands-on experience with AWS services (EC2, Athena, Lambda, Step Functions); this is CRITICAL for the role (see the sketch after this list)
- Proficiency in MySQL to manage and query databases
- Experience with Docker for containerization
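For context on the AWS line above, here is a minimal sketch of the kind of Lambda-plus-Athena glue the role implies, assuming boto3 is available in the Lambda runtime; the database name, SQL, and S3 output location are hypothetical placeholders, not part of the posting.

```python
# Minimal sketch: a Lambda handler that starts an Athena query via boto3.
# The database name, SQL, and S3 output path are hypothetical placeholders.
import boto3

athena = boto3.client("athena")

def handler(event, context):
    # start_query_execution is asynchronous: it returns a QueryExecutionId
    # that a Step Functions state machine (or a poller) can check later.
    response = athena.start_query_execution(
        QueryString="SELECT COUNT(*) FROM events",
        QueryExecutionContext={"Database": "analytics"},
        ResultConfiguration={"OutputLocation": "s3://example-bucket/athena-results/"},
    )
    return {"query_execution_id": response["QueryExecutionId"]}
```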
Good-to-Haves:
- Experience with Airflow for workflow orchestration
- Familiarity with PySpark for big data processing
- Strong Python skills, including libraries such as SQLAlchemy, DuckDB, PyArrow, Pandas, and NumPy (see the sketch after this list)
- Experience with DLT (Data Load Tool)
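To ground the Python-library line above, a minimal sketch that aggregates a Pandas DataFrame with DuckDB's in-process SQL interface; the column names and values are hypothetical, invented purely for illustration.

```python
# Minimal sketch: aggregate a Pandas DataFrame with DuckDB's SQL interface.
# Column names and values are hypothetical, invented for illustration.
import duckdb
import pandas as pd

# Stand-in for a small extracted batch.
df = pd.DataFrame({"user_id": [1, 2, 2], "amount": [9.5, 3.0, 7.25]})

# DuckDB resolves the identifier "df" to the local DataFrame (replacement scan).
# The same relation could also be materialized as a PyArrow table via .arrow().
totals = duckdb.query(
    "SELECT user_id, SUM(amount) AS total FROM df GROUP BY user_id"
).df()
print(totals)
```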