Applied AI/ML Lead
Job Summary
As an Applied AI/ML Lead within the Commercial & Investment Bank, on the Healthcare Provider team, you will lead the design, development, and production deployment of AI/ML solutions focused on image classification, text categorization, and data extraction from scanned TIF documents. You will architect and implement computer vision pipelines leveraging CRNN architectures for document type identification, page-level categorization, and visual feature extraction.
Job responsibilities
- Lead the design, development, and production deployment of AI/ML solutions focused on image classification, text categorization, and data extraction from scanned TIF documents. Evaluate and explore additional models and architectures to continuously improve classification accuracy, extraction quality, and processing efficiency.
- Drive the development and fine-tuning of models for document understanding, text categorization, named entity recognition, and semantic understanding. Combine visual layout information, textual content, and spatial relationships to extract structured data from complex scanned documents while enabling automated categorization and metadata tagging of OCR-extracted text.
- Lead the integration and optimization of OCR technology and generative AI capabilities into the document processing pipeline, ensuring high-accuracy text extraction from scanned TIF images across diverse document types, layouts, fonts, and quality levels. Leverage Amazon Bedrock to explore foundation model capabilities for intelligent document understanding, classification, and document summarization, and for augmenting traditional extraction pipelines.
- Architect and implement scalable ML training and inference pipelines using AWS SageMaker, managing model training, hyperparameter tuning, distributed training for large vision models, and real-time/batch inference endpoint deployment. Collaborate with software engineering teams to integrate trained models into Java/Python-based microservices deployed on AWS EKS, ensuring low-latency, high-throughput inference for production document processing workloads.
- Establish robust MLOps practices and annotation workflows, including model versioning, automated retraining triggers, A/B testing of model variants, drift detection on document distributions, and comprehensive performance monitoring dashboards. Design and manage labeling strategies for training data, ensuring high-quality ground truth datasets for image classification, text categorization, and document extraction tasks.
- Build and manage a team of ML engineers and applied scientists, fostering a culture of experimentation, rapid prototyping, and rigorous evaluation of model performance against business KPIs.
Required qualifications, capabilities, and skills
- Bachelor's degree, MS, or PhD in a quantitative discipline, e.g. Computer Science, Mathematics, Operations Research, or Data Science.
- 7 years of experience in applied ML/AI roles, with at least 2 years leading teams or large-scale ML initiatives.
- Advanced proficiency in Python and enterprise languages, with deep experience in PyTorch, TensorFlow, Hugging Face Transformers, OpenCV, and Pillow for model development and image processing. Proficiency in Java and/or Groovy for integrating ML capabilities into backend services and enterprise application ecosystems. Familiarity with Oracle databases for feature extraction, training data retrieval, and integration with ML workflows.
- Deep expertise in computer vision and NLP models, with hands-on experience implementing and fine-tuning CRNN-based architectures for image classification and feature extraction. Strong experience with multimodal document understanding combining text, layout, and image features. Proficiency in transformer-based NLP models for text categorization, sequence labeling, named entity recognition, and semantic analysis of OCR-extracted content.
- Practical experience with OCR technologies and image preprocessing for text extraction from scanned documents, with an understanding of OCR accuracy optimization, preprocessing techniques, and post-processing correction. Experience with image preprocessing for scanned documents in TIF format, including multi-page handling, resolution normalization, deskewing, binarization, and noise removal.
- Deep hands-on experience with AWS SageMaker and Amazon Bedrock, including end-to-end ML workflows such as training jobs, processing pipelines, model registry, distributed training, and real-time/batch inference endpoints. Practical experience leveraging foundation models, prompt engineering, and building generative AI-augmented document processing solutions. Experience deploying and scaling ML models as containerized microservices on AWS EKS using Docker and Kubernetes, with expertise in optimizing GPU-based inference workloads.
- Strong knowledge of MLOps tools and practices, including MLflow, SageMaker Pipelines, or equivalent platforms for experiment tracking, pipeline automation, and model lifecycle management. Excellent leadership and communication skills, with the ability to present complex technical concepts to senior leadership and non-technical audiences.
Preferred qualifications, capabilities, and skills
- Domain expertise in the healthcare industry.
- Experience in applied ML/AI roles in document processing, computer vision, or NLP domains.
About Company
JPMorganChase, one of the oldest financial institutions, offers innovative financial solutions to millions of consumers, small businesses and many of the world’s most prominent corporate, institutional and government clients under the J.P. Morgan and Chase brands. Our history spans ov ...