JD for Data Scientist (Client: Daimler Truck)
Job Summary:
We are looking for a results-oriented Data Scientist with expertise in Statistics, Economics, Machine Learning, Deep Learning, Computer Vision, and Generative AI. The ideal candidate will have a proven track record of building and deploying predictive models, conducting statistical analysis, and applying cutting-edge AI techniques to solve real-world business challenges.
Good to Have Palantir Foundry Experience:
Experience using Foundry Code Workbooks for developing and deploying machine learning models.
Leveraging Foundry Ontology to access and interpret structured enterprise data for modeling.
Familiarity with Foundry's AI/ML integration capabilities, including support for Python, Spark, and external ML libraries.
Building interactive dashboards and analytical apps using Foundry's visualization tools.
Collaborating within Foundry's shared workspace for reproducible and auditable data science workflows.
Experience deploying models and integrating them into Foundry Operational Workflows for real-time decision support.
Key Responsibilities:
Develop models for regression, classification, clustering, and time series forecasting.
Perform hypothesis testing and statistical validation to support data-driven decisions.
Build and optimize deep learning models (ANN, CNN, RNN such as LSTM, and BERT).
Implement computer vision solutions using YOLOv3, SSD, U-Net, R-CNN, etc.
Apply Generative AI and LLMs (GPT-4, LLaMA2, Bard) for NLP and content generation.
Create interactive dashboards and applications using Streamlit or Flask.
Deploy models using AWS SageMaker, Azure, Docker, Kubernetes, and Jenkins.
Collaborate with cross-functional teams to integrate models into production.
Handle large datasets using SQL/NoSQL and PySpark.
Stay updated with the latest AI/ML research and contribute to innovation.
Technical Skills:
Programming & Frameworks:
Languages: Python
Libraries/Frameworks: TensorFlow, PyTorch, Keras, Flask, Transformers, LangChain, PySpark, Caffe
Visualization & Data Tools: Pandas, NumPy, Seaborn, Matplotlib, Scikit-learn, SciPy, NLTK, Streamlit, OpenCV, Scikit-Image, Dlib, MXNet, Fasta
ML & Statistical Techniques:
Regression (Linear/Logistic), Decision Trees, Random Forest, KNN, Naïve Bayes
Clustering (KMeans, Hierarchical), Time Series Forecasting
Hypothesis Testing, Statistical Inference
Deep Learning & Computer Vision:
ANN, CNN, RNN (LSTM), BERT, VGGs, YOLOv3, SSD, HOGs, DCGAN, U-Net, R-CNN, NEAT, Inpainting
Gen-AI / LLMs:
HuggingFace, GPT-4, Bard, LLaMA2, Pinecone, PaLM, GenAI Studio, OpenAI fine-tuning
Deployment & DevOps:
AWS (SageMaker), Azure, Docker, Kubernetes, Jenkins, Git, GitHub, API integration
Databases & Tools:
MySQL, NoSQL
Jupyter Notebook, Google Colab, Visual Studio, Power BI