Research experience is not mandatory now, as we have to share a pool of Data Scientists. Candidates should have mid-level experience.
Skill
- Both roles require a strong Data Science/ML foundation (model development, experimentation, diagnosing issues).
- Both use Python ML frameworks (PyTorch, TensorFlow, scikit-learn).
- Both involve curiosity for applied research, strong communication skills, and the ability to explain DS concepts clearly.
- Both are in multi-disciplinary teams working in fast-moving, collaborative environments.
- Strong in ML model development
- Python programming
- Bonus SWE skills (very beneficial but candidates without will still be considered):
- Backend development (FastAPI/Flask)
- Cloud (AWS), CI/CD pipelines
Orientation - Embedded in a product team focused on shipping features & solving user issues.
Candidate Fit - Strong DS/ML with hands-on SWE integration; fit for those who enjoy hands-on building, solving user issues, and working closely with engineers.
The Privacy team within GovTech's Data Engineering Practice was set up to raise the government's capabilities in data privacy and drive adoption of Privacy-Enhancing Technologies. Our team has developed whole-of-government tools like Cloak for anonymisation, which is used by 100 agencies today and has anonymised over 30 million documents to power data analytics and Generative AI use cases, as well as Mirage for Synthetic Data Generation.
Synthetic Data Generation (SDG) is an upcoming privacy technology that allows agencies to generate alternative forms of sensitive data that can then be safely utilised and shared. We are seeking a Data Scientist for Mirage, a whole-of-government product for SDG. Mirage is currently offered as a web interface, with API integration on the roadmap. We launched in August 2024 and are currently in a growth phase with active feature development.
Job Scope
This role sits at the intersection of data science, applied machine learning, and software engineering. You will be involved in:
Model Development
- Design and conduct experiments to evaluate emerging SDG models (e.g. DDPM, ARF, Gaussian Copula).
- Investigate failure cases (e.g. when models fail with certain data types, sizes, or cardinalities).
- Tune hyperparameters, refine architectures, and propose new modeling strategies.
Feature & Product Development
- Collaborate with software engineers to build product features that require ML/DS input (e.g. imputation methods, handling of constraints, preprocessing pipelines).
- Recommend and develop suitable approaches for features like single-/multi-column constraints, imputation strategies, and privacy metrics.
Diagnostics & Debugging
- Work directly with users and the engineering team to diagnose user issues such as training failures, poor outputs, or integration challenges.
- Provide actionable fixes and communicate technical insights in a user-friendly way.
Documentation & Knowledge Sharing
- Write user-facing documentation pages. This could include explaining model choice, hyperparameters, and utility/privacy metrics in a user-friendly manner.
- Translate complex technical Data Science concepts into clear, approachable explanations.
Collaboration
- Work closely with the SWE team (FastAPI, AWS) to integrate the generation engine into production-ready systems.
- Participate in Agile rituals, code reviews, and design discussions.
Requirements
- Bachelor's degree or higher in Computer Science, Data Science, Business Analytics, or a related field, with at least 2-3 years of relevant professional experience.
Core Data Science & ML Skillset
- Strong foundation in machine learning with hands-on experience in model development and experimentation.
- Strong programming proficiency in Python and experience with ML frameworks (e.g. PyTorch, TensorFlow, scikit-learn).
- Ability to analyze model behavior, diagnose training issues, and design experiments to improve performance.
Applied Research & Experimentation
- Ability to read, synthesize, and translate emerging research into practical prototypes.
Software Engineering
- Working knowledge of backend development (REST APIs, FastAPI, Flask, or similar).
- Comfortable working with cloud environments (AWS preferred).
- Ability to debug and fix software-level issues when they affect ML workflows.
- Familiarity with Git, CI/CD, and collaborative coding best practices.
Nice-to-Haves
- Experience with privacy-enhancing technologies, anonymisation, synthetic data generation, or differential privacy.
- Familiarity with frontend integration workflows.
- Experience working in multi-disciplinary product teams.
Mindset & Collaboration
- Curiosity and willingness to learn new domains (esp. data privacy).
- Strong communication skills to explain technical concepts to both engineers and non-technical stakeholders.
- Inclination to work in a collaborative, fast-moving Agile environment.
Ajai Agarwal
09 9291
Registration ID R1768374
Irtish Consulting Pte Ltd EA Licence 20S0377 Reg. No.E
Required Skills:
Data Scientist (with Software Engineering Competency)