Data Science AI Manager
Job Summary
Short Description: |
Ford's Global Data Insight & Analytics team is looking for hands-on professionals experienced in NLP/LLM/GenAI who can apply a range of NLP and prompt-engineering techniques, from traditional statistical/ML NLP to DL-based sequence models and transformers, in their day-to-day work. Experience in the automotive domain is preferred.
Description: |
You'll be working alongside leading technical experts from around the world on a variety of products involving sequence/token classification, QA/chatbots, translation, semantic search, and summarization, among others.
Responsibilities: |
- Design NLP/LLM/GenAI applications/products following robust coding practices.
- Explore SoTA models/techniques so that they can be applied to automotive industry use cases.
- Conduct ML experiments to train models and run inference; if need be, build models that meet memory and latency restrictions.
- Deploy REST APIs or a minimalistic UI for NLP applications using Docker and Kubernetes.
- Showcase NLP/LLM/GenAI applications to users in the best way possible through web frameworks (Dash, Plotly, Streamlit, etc.).
- Converge multiple bots into super apps using multimodal LLMs.
- Develop agentic workflows using AutoGen, Agent Builder, and LangGraph.
- Build modular AI/ML products that can be consumed at scale.
Qualifications: |
Education: Master's degree in Computer Science, Engineering, Maths, or Science.
Completion of modern NLP/LLM courses or participation in open competitions is also welcome.
Technical Requirements:
Soft Skills:
- Strong communication skills and excellent teamwork via Git/Slack/email/calls with multiple team members across geographies.
GenAI Skills:
- Experience with LLMs such as PaLM, GPT-4, and Mistral (including open-source models).
- Experience working through the complete lifecycle of GenAI model development, from training and testing to deployment and performance monitoring.
- Experience developing and maintaining multimodal AI pipelines (text, image, audio, etc.).
- Have implemented real-world chatbots or conversational agents at scale, handling different data sources.
- Experience developing image generation/translation tools using latent diffusion models such as Stable Diffusion or InstructPix2Pix.
- Expertise in handling large-scale structured and unstructured data.
- Experience efficiently handling large-scale generative AI datasets and outputs.
ML/DL Skills:
- High familiarity with DL theory/practices in NLP applications.
- Comfortable coding with Hugging Face, LangChain, Chainlit, TensorFlow and/or PyTorch, scikit-learn, NumPy, and Pandas.
- Comfortable using two or more open-source NLP modules such as spaCy, TorchText, farm-haystack, and others.
NLP Skills:
- Knowledge of fundamental text data processing (such as use of regex, token/word analysis, spelling correction/noise reduction in text, segmenting noisy or unfamiliar sentences/phrases at the right places, deriving insights from clustering, etc.).
- Have implemented real-world BERT or other fine-tuned transformer models (sequence classification, NER, or QA), from data preparation and model creation through inference and deployment.
Python Project Management Skills:
- Familiarity with Docker tools and pipenv/conda/poetry environments.
- Comfortable following Python project management best practices (use of logging, pytest, relative module imports, Sphinx docs, etc.).
- Familiarity with GitHub (clone, fetch, pull/push, raising issues and PRs, etc.).
Cloud Skills and Computing:
- Use of GCP services such as BigQuery, Cloud Functions, Cloud Run, Cloud Build, and Vertex AI.
- Good working knowledge of other open-source packages for benchmarking and deriving summaries.
- Experience using GPUs/CPUs on cloud and on-prem infrastructure.
- Ability to leverage cloud platforms for data engineering, big data, and ML needs.
Deployment Skills:
- Use of Docker (experience with experimental Docker features, docker-compose, etc.).
- Familiarity with orchestration tools such as Airflow and Kubeflow.
- Experience with CI/CD and infrastructure-as-code tools such as Terraform.
- Kubernetes or any other containerization tool, with experience in Helm, Argo Workflows, etc.
- Ability to develop APIs with compliant, ethical, secure, and safe AI tools.
UI:
- Good UI skills to visualize and build better applications using Gradio, Dash, Streamlit, React, Django, etc.
- Deeper understanding of JavaScript, CSS, Angular, HTML, etc. is a plus.
Miscellaneous Skills:
Data Engineering:
- Skills in distributed computing (specifically parallelism and scalability in data processing, modeling, and inferencing through Spark, Dask, RAPIDS AI, or RAPIDS cuDF).
- Ability to build Python-based APIs (e.g., using FastAPI, Flask, or Django).
- Experience with Elasticsearch, Apache Solr, and vector databases is a plus.
Required Experience:
Manager
About Company
Ford® is Built for America. Discover the latest lineup in new Ford vehicles! Explore hybrid & electric vehicle options, see photos, build & price, search inventory, view pricing & incentives & see the latest technology & news happening at Ford.