Note: PhD preferred; at least a Master's degree required.
Role: Sr Data Scientist
Duration: Long-term
Location: Reston, VA
Job Description
Minimum Qualifications:
- Work or educational background in one or more of the following areas: machine learning, computational linguistics, deep learning, artificial intelligence, data science and/or data analytics, generative AI, symbolic AI, causal AI, operations research, computer science, mathematics, business analytics, or knowledge management.
- Demonstrated experience programming with R/Python, Linux, and Spark in an AWS cloud environment, or knowledge and algorithmic design experience in Python (3 years)
- Proficient with Amazon AWS SageMaker, Jupyter Notebook, and Python scikit-learn, and with deep learning and machine learning tools such as TensorFlow
- Experience with image processing models such as COCO, CLIP, ResNet, or comparable models
- Demonstrated experience with machine learning techniques, including natural language processing and large language models (GPT-4, o1, o3, OpenAI APIs, Llama, Claude, etc.).
- Experience developing AI agents and proficiency in agentic programming
- Proficient in natural language processing (NLP) and natural language generation (NLG), including prior projects in any of the following categories: topic modeling of text, sentiment analysis of text, part-of-speech tagging, Named Entity Recognition (NER), bag of words, text extraction
- Experience building and working with any of these components: vector databases, BERT, RoBERTa (or comparable tools), spaCy, LLM and GenAI tools. Experience with LoRA, LangChain, RAG, LLM fine-tuning and PEFT, and knowledge graphs.
- Strong skills in developing GraphRAG, Chain of Thought (CoT), Tree of Thought (ToT), reinforcement learning, and AI development architectures with Human-in-the-Loop (HITL)
- Demonstrated experience with SQL and any relational database technologies such as Oracle, PostgreSQL, MySQL, RDS, Redshift, Hadoop, EMR, Hive, etc.
- Demonstrated experience processing structured and unstructured data sources, data cleansing, data normalization, and preparation for analysis
- Demonstrated experience with code repositories and build/deployment pipelines, specifically Jenkins and/or Git/GitHub/GitLab.
- Demonstrated experience using Tableau, Kibana, QuickSight, or other similar data visualization tools.
- Very comfortable working with ambiguity (e.g., imperfect data, loosely defined concepts, ideas, or goals)
Qualifications & Requirements
- Education: MS in Computer Science, Statistics, Math, Engineering, or related field; PhD preferred
- 3 years of relevant experience in building large-scale machine learning or deep learning models and/or systems
- 1 year of experience specifically with deep learning (e.g., CNN, RNN, LSTM)
- 1 year of experience building NLP and NLG tools.
- Experience with a wide range of LLMs (Llama, Claude, OpenAI, Cohere, etc.); LoRA, LangChain, RAG, LLM fine-tuning, and PEFT are preferred.
- Demonstrated skills with Jupyter Notebook, AWS SageMaker, Domino Data Lab, or comparable environments
- Passion for solving complex data problems and generating cross-functional solutions in a fast-paced environment
- Knowledge of Python and SQL, object-oriented programming, and service-oriented architectures
- Strong scripting skills with shell script and SQL
- Strong coding skills and experience with Python (including SciPy, NumPy, and/or PySpark) and/or Scala.
- Knowledge and implementation experience with NLP techniques (topic modeling, bag of words, text classification, TF-IDF, sentiment analysis) and NLP technologies such as Python NLTK, spaCy, or comparable technologies
- Knowledge and implementation experience with statistical and machine learning models (regression, classification, clustering, graph models, etc.)
Preferred Qualifications
- Hands-on experience building models with deep learning frameworks like TensorFlow, Keras, Caffe, PyTorch, Theano, H2O, or similar
- Experience with LLM agents and agentic programming
- Experience with search architecture (for instance: Solr, Elasticsearch, AWS OpenSearch)
- Experience with building and querying ontologies using technologies such as Zeno, OWL, RDF, SPARQL, or comparable is preferred
- Knowledge of and experience with microservices, service mesh, API development, and test automation are preferred
- Demonstrated experience using Docker, Kubernetes, and/or other similar container frameworks is preferred
- Strongly prefer a PhD in math, computer science, statistics, or a comparable field, with experience in data science, AI development, deep learning, and advanced analytics
Additional Job Qualifications:
- Ability to translate business ideas into analytics models that have major business impact.
- Demonstrated experience working with multiple stakeholders.
- Demonstrated communication skills, e.g., explaining complex technical issues to more junior data scientists in graphical, verbal, or written formats.
- Demonstrated experience developing tested, reusable, and reproducible work.
- Transparently documenting code and methodologies.
- Ability to work in Agile, Lean, and rapid development processes
Cloud BC Labs Inc is a digital transformation organization aimed at creating seamless solutions for clients to effectively manage their business operations. The company specializes in Business and Management Consulting, AI/ML, Data Analytics & Visualization, Cloud Data Warehouse Migration, Snowflake Implementation, Informatica Implementation & Upgrade, Staffing Services, and Data Management Solutions.