Python Developer with PySpark Experience

Employer Active

1 Vacancy
Job Location

Pasadena, CA - USA

Monthly Salary

Not Disclosed

Job Description

Job Title: Python Developer with PySpark Experience (Direct Client Requirement) (Locals Only)

Tax Work Location: Pasadena, CA (Hybrid, 3 Days Onsite Per Week) (Locals Only)

Job Type: C2C (Locals Only) (In-Person / F2F Interview)

Job Summary:

We are looking for a skilled Python Developer with strong PySpark experience to join our data engineering team. You will be responsible for designing, developing, and optimizing large-scale data processing pipelines using Python and Apache Spark. The ideal candidate has a strong understanding of distributed computing principles and data wrangling techniques, and is comfortable working in cloud-based environments.

Key Responsibilities:

  • Design and develop scalable data pipelines using PySpark and Python.
  • Build ETL/ELT workflows to ingest, transform, and load structured and unstructured data.
  • Optimize PySpark jobs for performance and scalability.
  • Collaborate with data scientists, analysts, and product teams to understand data needs.
  • Integrate data from various sources, including relational databases, APIs, and cloud storage.
  • Implement data quality checks, validation, and monitoring systems.
  • Deploy and manage jobs on big data platforms like Hadoop, Databricks, or EMR.
  • Write clean, maintainable, and well-documented code following best practices.
  • Participate in code reviews and provide constructive feedback.
  • Ensure adherence to data security and governance standards.

Required Qualifications:

  • Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
  • 3 years of experience in Python development.
  • 2 years of hands-on experience with PySpark and Spark-based data processing.
  • Strong understanding of data structures, algorithms, and distributed systems.
  • Proficiency with SQL and experience working with relational databases.
  • Experience with data pipeline orchestration tools like Airflow, Oozie, or Luigi.
  • Familiarity with cloud platforms (AWS, Azure, or GCP) and services like S3, EMR, Databricks, or Glue.
  • Strong debugging, performance tuning, and optimization skills.

Preferred Qualifications:

  • Experience with CI/CD pipelines and containerization tools (Docker, Kubernetes).
  • Knowledge of data warehousing concepts and tools like Snowflake, Redshift, or BigQuery.
  • Understanding of Delta Lake, Hive, HDFS, or Kafka.
  • Experience working in Agile environments using tools like JIRA, Confluence, or Git.

Please send your resume to:

Employment Type

Full Time
