Data Engineer (Databricks, PySpark, Azure)
Job Title:
Data Engineer (Databricks, PySpark & Azure)
Location:
Job Summary
We are seeking skilled Data Engineers with strong expertise in Databricks, PySpark, and cloud-based data platforms. The ideal candidate will design and build scalable ETL pipelines, work with large datasets, and contribute to modern data platform development in a fast-paced Agile environment.
Primary Skills
- Databricks
- Python (PySpark, Pandas)
- Java (ETL development)
- SQL
- Elasticsearch
- CI/CD (Jenkins, Git)
- Azure (preferred) / GCP
Key Responsibilities
- Design, develop, and maintain scalable ETL pipelines and data workflows
- Build and optimize data models for large-scale data processing
- Develop and maintain data applications with complex integrations
- Write, optimize, and execute SQL scripts for data processing and analysis
- Work with large datasets across distributed data platforms
- Collaborate with cross-functional teams on data architecture and solutions
- Implement CI/CD pipelines and DevOps best practices
- Develop and integrate APIs and web services
- Ensure high performance, reliability, and scalability of data systems
- Participate in Agile development processes and follow TDD practices
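As a purely illustrative sketch of the extract-transform-load work described above (not part of the role's actual codebase), the following uses only the Python standard library with made-up field names; in practice the pipelines here would run on Databricks with PySpark:

```python
import csv
import io
import sqlite3

# Hypothetical CSV source standing in for an upstream extract.
raw = io.StringIO("id,amount\n1,10.5\n2,\n3,7.25\n")

# Extract: read rows from the CSV source.
rows = list(csv.DictReader(raw))

# Transform: drop rows with missing amounts and cast types.
clean = [(int(r["id"]), float(r["amount"])) for r in rows if r["amount"]]

# Load: write the cleaned rows into a target table and verify.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trades (id INTEGER PRIMARY KEY, amount REAL)")
conn.executemany("INSERT INTO trades VALUES (?, ?)", clean)
total = conn.execute("SELECT SUM(amount) FROM trades").fetchone()[0]
print(total)  # 17.75
```

The same extract/transform/load shape carries over to PySpark, where the CSV reader, DataFrame filters, and table writes replace the stdlib pieces.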
Required Qualifications
- Minimum 4 years of experience in data engineering or related roles
- Strong programming skills in Python (PySpark, Pandas) or Java
- Hands-on experience designing ETL pipelines and data models
- Experience building and maintaining large-scale data applications
- Strong SQL skills
- Experience with data technologies and databases such as:
  - PostgreSQL
  - MS SQL Server
  - Oracle
  - Apache Spark
  - Apache Kafka
  - Elasticsearch
- Experience with data platforms:
  - Databricks or Snowflake
- Experience with cloud platforms:
  - Azure (preferred)
  - GCP
- Experience with workflow orchestration tools:
  - Apache Airflow or Azure Data Factory
- Experience with:
  - Web Services & APIs
  - CI/CD tools (Jenkins, Git)
- Experience working in Agile environments
- Understanding of Test-Driven Development (TDD)
- Bachelor's degree in Computer Science, Engineering, or a related field
Preferred Qualifications (Nice to Have)
- Understanding of networking protocols and security principles
- Knowledge of Capital Markets domain
- Experience with:
  - Docker
  - Kubernetes
- Experience with real-time, high-availability, and low-latency systems
- Experience developing multi-threaded applications
Required Skills:
Experience (Years): 4-6
Essential Skills:
- Work with project teams throughout the organization to design, implement, and manage CDN infrastructure using Akamai, ensuring high availability, performance, and scalability for customer-facing applications and business processes
- Handle multiple priorities and assignments with excellence and precision
- Be part of a 24/7/365 organization (some after-hours support is expected as part of the normal on-call rotation)
- Directly support line-of-business development teams; provide guidance on implementation and changes for customer-facing applications
- Develop and maintain security protocols and measures to protect CDN infrastructure from cyber threats
- Monitor and analyze network performance, identifying and resolving issues to optimize content delivery for critical applications
- Collaborate with cross-functional teams to integrate Akamai CDN solutions with existing systems and applications
- Collaborate with information security teams to implement DDoS protection strategies and other security measures in the CDN
- Provide technical support and guidance to clients and internal teams regarding CDN and security best practices
- Work closely with vendor and professional services teams on delivery-related activities and strategy
Qualifications:
- Bachelor's degree in Computer Science, Information Technology, or a related field, or equivalent work experience
- Strong understanding of network protocols (HTTP/HTTPS, DNS, TCP/IP)
- Proven experience as a CDN Engineer or in a similar role, with in-depth knowledge of Content Delivery Network technologies, including caching, load balancing, and content optimization
- Excellent problem-solving skills and attention to detail
- Strong communication and teamwork abilities
- Experience supporting 24/7/365 customer-facing applications at enterprise scale
- Awareness of and experience with cybersecurity tools and practices such as firewalls, intrusion detection/prevention systems, and encryption
- Proficiency in scripting and automation (e.g., Python, Bash) is a plus
- Relevant certifications (e.g., CISSP, CEH) are a plus but not required