Role: Data Engineer (Python / PySpark)
Kansas City, MO
Core Technical Skills
- Python:
  - Data processing and transformation using Pandas and NumPy
  - Writing modular, reusable code for ETL workflows
  - Automation and scripting for data operations
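The Pandas/NumPy bullets above could be illustrated with a small, modular transform step; this is a sketch only, and the `clean_orders` function, its column names, and the sample data are hypothetical:

```python
import numpy as np
import pandas as pd

def clean_orders(df: pd.DataFrame) -> pd.DataFrame:
    """Normalize a hypothetical orders extract: fix types, handle bad values, derive a column."""
    out = df.copy()
    out["order_date"] = pd.to_datetime(out["order_date"])
    # Coerce unparseable amounts to NaN, then default them to 0.0
    out["amount"] = pd.to_numeric(out["amount"], errors="coerce").fillna(0.0)
    # NumPy for vectorized math on the cleaned column
    out["amount_log"] = np.log1p(out["amount"])
    return out

raw = pd.DataFrame({
    "order_date": ["2024-01-01", "2024-01-02"],
    "amount": ["10.5", "bad"],
})
clean = clean_orders(raw)
```

Keeping each step as a pure function of a DataFrame makes the pipeline easy to compose and to unit test.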
- PySpark:
  - Building distributed data pipelines
  - Spark SQL, DataFrame APIs, and RDDs
  - Performance tuning (partitioning, caching, shuffle optimization)
- SQL:
  - Complex queries: joins, aggregations, and window functions
  - Query optimization for large datasets
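A window function such as a per-group running total can be sketched with Python's built-in `sqlite3` (SQLite supports window functions from version 3.25); the table and columns are illustrative:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (region TEXT, month TEXT, amount REAL);
INSERT INTO sales VALUES
  ('east', '2024-01', 100), ('east', '2024-02', 150),
  ('west', '2024-01', 80);
""")

# Running total per region, ordered by month
rows = conn.execute("""
SELECT region, month, amount,
       SUM(amount) OVER (PARTITION BY region ORDER BY month) AS running_total
FROM sales
ORDER BY region, month
""").fetchall()
```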
- Data Modeling & ETL:
  - Designing schemas for analytics and operational systems
  - Implementing ETL/ELT pipelines with orchestration tools (Airflow, Databricks Jobs)
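Orchestrators such as Airflow fundamentally run tasks in dependency order; the idea can be sketched with only the standard library (task names here are hypothetical, not an Airflow API):

```python
from graphlib import TopologicalSorter

# Map each task to the set of tasks it depends on
tasks = {
    "load": {"transform"},
    "transform": {"extract"},
    "extract": set(),
}

# An orchestrator resolves this graph so upstream tasks run first
run_order = list(TopologicalSorter(tasks).static_order())
```

In Airflow the same dependencies would be declared between operators (e.g. `extract >> transform >> load`), with the scheduler handling retries, backfills, and parallelism.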
- Big Data & Cloud Platforms:
  - Experience with AWS, Azure, or GCP
  - Familiarity with data lakes and Delta Lake patterns
- File Formats & Storage:
  - Parquet, ORC, and Avro for efficient storage
  - Understanding of partitioning strategies
- Testing & CI/CD:
  - Unit and integration testing for data pipelines
  - Git-based workflows and automated deployments
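Unit tests for pipeline steps typically run the transform against a small in-memory frame; a pytest-style sketch, where the `dedupe_latest` step and its columns are hypothetical:

```python
import pandas as pd

def dedupe_latest(df: pd.DataFrame) -> pd.DataFrame:
    """Keep the most recent row per id (a hypothetical pipeline step)."""
    return (df.sort_values("updated_at")
              .drop_duplicates("id", keep="last")
              .sort_values("id")
              .reset_index(drop=True))

def test_dedupe_latest():
    df = pd.DataFrame({
        "id": [1, 1, 2],
        "updated_at": ["2024-01-01", "2024-01-02", "2024-01-01"],
    })
    out = dedupe_latest(df)
    assert out["id"].tolist() == [1, 2]
    assert out["updated_at"].tolist() == ["2024-01-02", "2024-01-01"]

test_dedupe_latest()  # pytest would discover and run this automatically
```

In CI, such tests run on every push via the Git-based workflow, gating the automated deployment.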