Data Domain Architect Lead
Job Summary
Join us for an exciting opportunity to leverage your advanced data annotation skills in the financial industry and contribute to cutting-edge machine learning models.
As a Data Domain Architect Lead within the Consumer & Community Banking team, you will lead data labeling initiatives that produce reliable, controlled, and actionable datasets for model training and evaluation. You will set product direction, manage delivery, and partner with technology, operations, and data science teams to improve data quality, scalability, and stakeholder outcomes.
Job Responsibilities
Translate business requirements and ML objectives into implementable requirements, schema, guidelines, and quality metrics; define success measures and key results for each labeling effort; and actively manage scope, risks, dependencies, and stakeholder communications
Own the annotation operating model, including workflow design, task routing, queue management, and delivery governance
Scale labeling capacity across multiple lines of business while maintaining consistency, quality, throughput, and clear documentation
Own data cleaning and preparation processes to resolve noise, duplicates, inconsistencies, and labeling defects
Establish metrics, annotation reliability standards, and a measurable quality framework, including calibration routines, gold datasets, reviews, and feedback loops
Leverage prompt engineering to improve task instructions, enable pre-labeling, and support synthetic data generation for LLM-related datasets
Develop LLM-as-judge approaches and agentic workflows to automate quality evaluation at scale, flag low-confidence items, and surface disagreements with human oversight
Drive annotation innovation by implementing automation across the labeling lifecycle, including ingestion, validation checks, dataset packaging, and audit-ready lineage artifacts
Lead benchmarking and executive-ready reporting on delivery performance, quality outcomes, and continuous improvement
Collaborate proactively with machine learning engineers and scientists to define evaluation requirements, labeling expectations, and target data volumes as models and use cases evolve in the new agentic/LLM initiatives, keeping data deliverables unblocked and on track
Keep the team growing: stay current on AI data trends, publications, and tools, and nurture the team's AI and tech capability through training, coaching, and growth opportunities
Required Qualifications, Capabilities, and Skills
Master's or PhD degree in Computational Linguistics, Linguistics, Computer Science, Data Science, or a related field
5 years of experience delivering data products or machine learning-enabled products across the full product lifecycle
Hands-on experience developing annotation metrics, performing annotation, and conducting annotation reviews
Experience running text data labeling programs end-to-end, including guideline and taxonomy design and annotation platform operations
Hands-on experience in Python for automation, data analysis, and cleaning and validating structured and unstructured datasets, plus experience using Git for version control
Hands-on prompt engineering experience for LLM labeling workflows (for example, pre-labeling, synthetic data generation, and instruction clarity)
Working knowledge of LLM-as-judge methods, including rubric design and integrating automated signals into human-in-the-loop review
Hands-on experience in designing labeling quality measurement (for example, gold datasets, calibration, sampling, and inter-annotator agreement targets)
Hands-on experience in benchmarking data quality and evaluation outcomes and translating results into product and process improvements
Strong stakeholder management, written and verbal communication, and disciplined execution under deadlines
Experience leading cross-functional delivery across technology, operations, and vendor partners
Preferred Qualifications, Capabilities, and Skills
- Experience managing globally distributed annotation teams and third-party vendors
- Familiarity with metadata management, data cataloging, and dataset lineage practices
- Experience applying machine learning to data quality monitoring and anomaly detection
- Track record of influencing senior stakeholders and aligning priorities through measurable OKRs
- Experience working with privacy, data governance, or model risk controls related to training data
Required Experience:
Staff IC
About Company
JPMorganChase, one of the oldest financial institutions, offers innovative financial solutions to millions of consumers, small businesses and many of the world’s most prominent corporate, institutional and government clients under the J.P. Morgan and Chase brands. Our history spans ov ...