Our team works closely with ML applied research teams, infrastructure and client teams, as well as with other groups and functions across Apple (legal, privacy) and externally. This position focuses on designing and implementing flexible data pipelines and data tools based on advanced computer vision technology, NLP, and humans in the loop. Responsibilities may include:
* Design consistent and robust data models
* Design and implement data pipelines to process data at scale (up to the petabyte scale!)
* Find creative ways to automate data flows or build self-service tooling that enables PMs to iterate faster
* Productionize synthetic data workflows
* Preprocess, transform, and clean data in multiple domains (tabular, image, video, text, etc.) and at scale
* Interact with ML models to optimize human-in-the-loop workflows
* Support the day-to-day operations of the data team

- Bachelor's, Master's, or PhD in Computer Science, Mathematics, Physics, or a related field; or equivalent practical experience.
- Excellent programming skills in Python with strong CS foundations (data structures, low-level parallelization)
- Experience in Machine Learning (e.g., familiarity with model training) in either NLP or Computer Vision
- You are able to design, prototype, and put into production robust data components that scale
- Experience working with data orchestration frameworks such as Airflow and other data-related environments: (No)SQL, Docker, Kubernetes, Spark, Databricks
- You are resilient in a fast-paced environment, comfortable with ambiguity, and able to juggle different projects with short-term deliveries. You have excellent written and verbal communication skills.
- Experience designing and implementing agentic workflows