AI/ML Data Scientist Intern

Job Location:

Suffolk, VA - USA

Monthly Salary: Not Disclosed
Posted on: 4 hours ago
Vacancies: 1 Vacancy

Job Summary

We are looking for a curious and driven AI/ML Data Scientist Intern to join our team in Suffolk, Virginia. This internship offers a hands-on opportunity for students or early-career professionals with a foundation in Computer Science to gain real-world experience in artificial intelligence, machine learning, and data science. You will work alongside experienced engineers and data professionals to build, fine-tune, and deploy machine learning models, construct retrieval-augmented generation pipelines, and curate high-quality datasets that support organizational objectives.


What You'll Do

  • Assist in the development and fine-tuning of large language models using techniques such as LoRA to optimize model performance for specific use cases.
  • Support the design and implementation of retrieval-augmented generation (RAG) pipelines to enhance AI-driven applications with relevant contextual data.
  • Curate, clean, and prepare datasets for training and evaluation, ensuring data quality and relevance across projects.
  • Work with embedding models to convert text and documents into vector representations for search and retrieval systems.
  • Develop and refine chunking strategies for processing large documents into manageable, semantically meaningful segments.
  • Extract structured information from unstructured documents using automated document extraction techniques.
  • Build and experiment with agentic AI workflows that enable autonomous task execution and decision-making.
  • Contribute to front-end interfaces and internal tools using HTML, JavaScript, and React to support data visualization and model interaction.
  • Document processes, experiments, and findings for internal knowledge sharing and reproducibility.

Requirements

To be considered for this position, candidates must demonstrate foundational knowledge in the following areas:

  • Linux Foundations: Basic understanding of Linux operating systems, including file system navigation, user management, permissions, and command-line operations.
  • Python Basics: Foundational proficiency in Python programming, including the ability to write scripts, work with libraries, manipulate data structures, and debug code.
  • Agentic AI: Familiarity with the concepts and architecture behind agentic AI systems, including how autonomous agents plan, reason, and execute multi-step tasks.
  • Hugging Face: Experience navigating the Hugging Face ecosystem, including the ability to load pre-trained models, tokenizers, and datasets from the Hugging Face Hub.
  • Dataset Curation: Understanding of how to source, clean, label, and organize datasets for machine learning training and evaluation purposes.
  • LoRA Fine-Tuning: Knowledge of Low-Rank Adaptation (LoRA) techniques for efficiently fine-tuning large language models with reduced computational overhead.
  • RAG Pipelines: Understanding of retrieval-augmented generation architecture, including how to connect language models with external knowledge sources to improve response accuracy.
  • Document Extraction: Familiarity with techniques and tools for extracting structured data from unstructured documents such as PDFs, scanned images, and web pages.
  • Chunking Strategies: Knowledge of methods for splitting large documents into smaller, semantically coherent segments optimized for embedding and retrieval.
  • Embedding Models: Understanding of how text embedding models work and how they are used to represent documents as vectors for similarity search and retrieval applications.
  • Basic Networking: Understanding of core networking concepts, including IP addresses, subnetting, the OSI model, and the functional differences between Layer 2 and Layer 3 protocols.
  • Azure Virtual Desktop Concepts: Familiarity with Azure Virtual Desktop components, including Host Pools, Workspaces, and Application Groups.
  • HTML, JavaScript, React: Foundational knowledge of front-end web technologies, including the ability to read and understand HTML structure, JavaScript logic, and React component architecture.

Nice to Have

The following skills are not required but would strengthen your candidacy:

  • Vector Databases: Experience working with vector database platforms such as Pinecone, Weaviate, or ChromaDB for storing and querying high-dimensional embeddings.
  • LangChain or LlamaIndex: Familiarity with orchestration frameworks used to build applications powered by large language models.
  • Prompt Engineering: Knowledge of techniques for crafting effective prompts to guide large language model behavior and improve output quality.
  • MLOps and Model Deployment: Experience with tools and workflows for packaging, deploying, and monitoring machine learning models in production environments.
  • Docker & Containerization: Basic understanding of container concepts and experience running applications in Docker or Kubernetes environments.
  • Transformer Architectures: Understanding of the transformer model architecture, including self-attention mechanisms and how they power modern language models.
  • Data Annotation and Labeling: Experience with data annotation workflows and labeling tools used to prepare supervised learning datasets.
  • Evaluation Metrics for Generative AI: Knowledge of how to assess the quality of generative AI outputs using metrics such as BLEU, ROUGE, perplexity, or human evaluation frameworks.
  • Cloud Platforms for ML Workloads: Exposure to cloud-based machine learning services on AWS, GCP, or Azure for training, hosting, and scaling models.
  • Version Control Systems (Git): Familiarity with Git workflows for managing code, collaborating with teams, and tracking project history.
  • Microsoft Entra ID: Familiarity with Microsoft's identity and access management platform for managing user authentication and permissions.
  • API Calls: Experience making and testing API calls using tools such as Postman, cURL, or similar utilities.
  • Azure Services: Broader exposure to Azure services beyond the fundamentals, such as Azure Storage, Azure Networking, or Azure Active Directory.
  • / .NET API Experience building or consuming APIs using or framework.
  • Azure Serverless Functions: Familiarity with event-driven serverless computing in Azure for running lightweight backend processes.
  • Visio or Other Drawing Application: Ability to create data flow diagrams, system architecture visuals, or workflow documentation using Microsoft Visio or comparable tools such as Lucidchart.

About us: We are Command Post Technologies Inc. (CPT). CPT is a Service-Disabled Veteran-Owned Small Business (SDVOSB) providing engineering services in the areas of Cyber Security, Software Development, Test & Evaluation, and Strategic Planning. CPT employees appreciate working in a caring environment that promotes a healthy work-life balance. As individuals, we come together as a team, supporting a culture rooted in our core principles of integrity and determination. In all of CPT's collaboration efforts, our team prioritizes communication, accountability, and being resourceful in order to maximize efficiency and results.


What's In It for You

  • Leadership training
  • Career professional development
  • Work/Life balance
  • Rewards and recognition

Command Post Technologies Inc. (CPT) is a Service-Disabled Veteran-Owned Small Business (SDVOSB) founded in 2008 and headquartered in Suffolk, VA, with personnel in various states, including Virginia, Maryland, Florida, and Texas. With 2/3 of our staff being former military, CPT firmly believes in employing veterans. Command Post Technologies Inc. is a unique provider of innovative solutions that enhance our corporate clients' productivity and empower our government clients with the ability to protect against all enemies: foreign and domestic. CPT adapts its successful military experiential approach to the needs of leaders in a global business environment and provides an elite leadership curriculum that results in a world-class leadership-altering event.


Command Post Technologies Inc. (CPT) is an Equal Employment Opportunity and Affirmative Action employer. We consider applicants without regard to race, color, religion, age, national origin, ancestry, ethnicity, gender, gender identity, gender expression, sex, sexual orientation, marital status, veteran status, disability, genetic information, citizenship status, or membership in any other group protected by federal, state, or local law. We take Affirmative Action to ensure equal opportunities for employees and potential employees without regard to race, color, religion, age, national origin, ancestry, ethnicity, gender, gender identity, gender expression, sex, sexual orientation, marital status, veteran status, disability, genetic information, citizenship status, or membership in any other group protected by federal, state, or local law.


We abide by the Pay Transparency Nondiscrimination Provision and will refrain from discharging or otherwise discriminating against employees or applicants who inquire about, discuss, or disclose their compensation or the compensation of other employees or applicants. An exception exists where the employee or applicant makes the disclosure based on information obtained while performing his or her essential job functions.


Required Experience:

Intern