Job Title: Data Scientist
Term: 12 months
Location: Nairobi, Kenya
Reports to: Technical Program Manager
Level: Senior (4-7 years of relevant experience)
Role Purpose
We are looking for a Data Scientist to play a critical role in driving the data intelligence layer of the implementation programme. Embedded within a cross-functional tech squad, the role is responsible for delivering propensity model deployment, customer segmentation, PAN-based analytics, digital lift measurement, and insight dashboards that support data-driven acquisition, activation, and usage campaigns. The Data Scientist will work closely with Backend Engineers and the API Integration Engineer to operationalize data pipelines, and will partner with the marketing and product teams to translate analytical outputs into actionable campaign targeting and measurement.
Key Responsibilities
- Define and implement a PAN (Primary Account Number) extraction and pseudonymization approach that supports targeted campaign analytics while adhering to data governance, PCI-DSS, and applicable data privacy regulations; document the data handling approach clearly.
- Design, validate, and deploy propensity models to identify high-potential customers for digital payment acquisition, activation, and usage campaigns, including Visa card adoption, Visa Direct usage, and tokenization uptake.
- Build customer segmentation frameworks that combine transactional, behavioural, and demographic signals to produce actionable cohorts for marketing and campaign teams.
- Develop and maintain a digital lift measurement framework, defining control/treatment group methodology, attribution logic, and statistical significance thresholds for evaluating campaign impact.
- Design and deliver analytics dashboards and reporting packs that give stakeholders clear, actionable visibility of campaign performance, model output, and digital adoption metrics.
- Collaborate with Backend Engineers to design and validate data pipelines that reliably feed analytical models with fresh, clean, and correctly structured data.
- Partner with the Frontend Engineer to align on the analytics event taxonomy and validate that app-level instrumentation is firing correctly and producing usable data.
- Support the Diaspora consumer proposition workstream with relevant analytical inputs, including diaspora remittance patterns, activation rates, and channel preference analysis.
- Conduct data quality assessments of source datasets; define data quality rules and escalate data issues to the engineering team for remediation.
- Document all models, methodologies, feature engineering approaches, and validation results in reproducible, peer-reviewable notebooks and technical reports.
- Deliver structured knowledge transfer to the internal data and analytics team.
- Maintain awareness of and compliance with all applicable data governance policies; escalate any data handling concerns to the Scrum Master and relevant stakeholders.
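To illustrate the pseudonymization responsibility above, here is a minimal sketch of a keyed-hash approach, assuming HMAC-SHA256 with a secret retrieved from a key vault; the key handling, function name, and sample PAN are all hypothetical, and the actual design would need sign-off from the data governance and PCI-DSS compliance teams:

```python
import hmac
import hashlib

# Hypothetical secret: in practice this would be fetched from a managed key
# vault, never hard-coded, and rotated per the agreed governance policy.
SECRET_KEY = b"replace-with-vault-managed-key"

def pseudonymize_pan(pan: str) -> str:
    """Return a deterministic, non-reversible token for a PAN.

    A keyed HMAC (rather than a plain hash) prevents brute-force reversal of
    the limited PAN keyspace; determinism preserves joinability across
    datasets so campaign cohorts can still be linked.
    """
    digest = hmac.new(SECRET_KEY, pan.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()

# The same PAN always maps to the same token, so analytics datasets can be
# joined without the raw PAN ever entering the analytics environment.
token = pseudonymize_pan("4111111111111111")  # a standard test card number
```

Deterministic tokens trade some privacy for joinability; if linkage across datasets is not required, a salted or per-dataset key would be stronger.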
Measurable Outcomes & Deliverables
First 30 Days
- Data landscape assessment completed: key data sources, access status, quality issues, and governance considerations documented.
- PAN handling and analytics data governance approach reviewed with data governance team; agreed approach documented.
- Propensity model scope and feature set defined; initial exploratory data analysis (EDA) completed.
- Digital lift measurement framework design (v1) produced and reviewed with client marketing/product stakeholders.
- Analytics event tracking requirements shared with Frontend Engineer; event taxonomy v1 agreed.
Days 31-60
- Propensity model (v1) trained and validated, with output reviewed with stakeholders; model card produced documenting performance, limitations, and intended use.
- First customer segmentation cohort produced and delivered to campaign team; cohort definition and selection logic documented.
- Data pipeline (v1) for model feature ingestion operational in development / staging environment; data freshness and quality validated.
- Digital lift measurement baseline established for at least one active campaign or initiative.
- Analytics dashboard (v1) live showing key digital adoption and campaign KPIs.
Days 61-90
- Propensity model deployed to production / scoring environment; scoring pipeline operational with defined refresh cadence.
- At least one end-to-end campaign cycle measured using the digital lift framework; results reported to stakeholders with statistical confidence intervals.
- PAN-based analytics approach operationalized (within agreed governance framework); targeted campaign extract produced and delivered to campaign execution team.
- Diaspora consumer analytics input delivered: activation rate analysis, channel preference insights, and prioritization recommendations.
- Model and pipeline documentation completed; client data team onboarded to operate and retrain model.
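The lift measurement deliverable above calls for results with statistical confidence intervals. One common way to do this, sketched here with purely hypothetical campaign numbers, is the normal-approximation confidence interval for the difference of two independent proportions (treatment vs. holdout control):

```python
from math import sqrt

def digital_lift(treated_conv: int, treated_n: int,
                 control_conv: int, control_n: int, z: float = 1.96):
    """Absolute lift (treatment minus control conversion rate) with a
    normal-approximation confidence interval (z = 1.96 gives ~95%)."""
    p_t = treated_conv / treated_n
    p_c = control_conv / control_n
    lift = p_t - p_c
    # Standard error of the difference of two independent proportions.
    se = sqrt(p_t * (1 - p_t) / treated_n + p_c * (1 - p_c) / control_n)
    return lift, (lift - z * se, lift + z * se)

# Hypothetical numbers: 1,200 of 10,000 treated customers activated,
# vs. 1,000 of 10,000 in the control group.
lift, (lo, hi) = digital_lift(1200, 10_000, 1000, 10_000)
# An interval that excludes zero indicates a statistically significant lift.
```

The actual framework might instead use a regression-adjusted or Bayesian estimate; this sketch only shows the shape of the reporting output.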
Ongoing KPIs
- Propensity model consistently meets agreed performance and stability thresholds at each refresh cycle.
- Propensity-scored customers align well with the intended behavioural cohorts when validated post-campaign.
- Dashboard availability and data accuracy: 99% dashboard uptime; zero material data errors in executive-level reporting packs.
- Data governance compliance: zero data handling incidents escalated to privacy/compliance teams during engagement.
- Knowledge transfer: Internal data team able to independently run scoring pipeline and refresh model by end of engagement.
Stakeholders & Ways of Working
Agile Ceremonies: All sprint ceremonies; leads data science story refinement; participates in daily stand-ups.
Reporting Cadence:
- Sprint-level: analytics and modelling progress at sprint review.
- Monthly: campaign performance and digital lift summary to client marketing and senior stakeholders.
- Ad-hoc: data quality or governance escalations to TPM and internal data governance team.
Cross-Functional Touchpoints:
- Backend Engineers (data pipeline design and delivery).
- Frontend Engineer (analytics instrumentation validation).
- Marketing and campaign teams (cohort delivery, campaign measurement).
- Data governance / privacy team (data handling approvals).
Required Skills & Experience
- 6 years of data science experience, with at least 4 years in payments, fintech, financial services, or telecoms.
- Proven experience deploying propensity or classification models in a production or near-production environment; familiarity with the full ML lifecycle (EDA, feature engineering, training, validation, deployment, monitoring).
- Strong experience with customer segmentation methodologies and campaign analytics.
- Demonstrated understanding of PAN-based analytics approaches and the associated data governance, PCI-DSS, and privacy requirements; ability to design compliant analytical frameworks.
- Proficiency in Python (pandas, scikit-learn, XGBoost/LightGBM, statsmodels) and/or R for analytical modelling.
- Experience designing and interpreting A/B tests and causal inference frameworks for digital lift measurement.
- Ability to build and maintain data pipelines using SQL, dbt, Airflow, or equivalent tools.
- Experience producing clear, stakeholder-ready insight reports and dashboards (Tableau, Power BI, Looker, or equivalent).
- Strong data quality assessment skills; experience defining and enforcing data quality rules.
- Excellent communication skills; ability to explain complex analytical outputs to non-technical stakeholders.
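The data quality skills listed above center on defining and enforcing explicit rules against source data. A minimal, dependency-free sketch of what such declarative rules might look like; the field names, allowed values, and rules here are illustrative only, not the programme's actual schema:

```python
# Each rule maps a name to a predicate over a source record (a dict).
RULES = {
    "msisdn_present": lambda row: bool(row.get("msisdn")),
    "amount_positive": lambda row: isinstance(row.get("amount"), (int, float))
                                   and row["amount"] > 0,
    "channel_known": lambda row: row.get("channel") in {"app", "ussd", "web"},
}

def assess(rows):
    """Return per-rule failure counts for a batch of source records."""
    failures = {name: 0 for name in RULES}
    for row in rows:
        for name, rule in RULES.items():
            if not rule(row):
                failures[name] += 1
    return failures

sample = [
    {"msisdn": "2547XXXXXXXX", "amount": 150.0, "channel": "app"},
    {"msisdn": "", "amount": -10, "channel": "pos"},  # fails all three rules
]
report = assess(sample)
# Batches breaching agreed thresholds would be escalated to engineering.
```

In practice these checks would likely live in dbt tests or a pipeline validation step rather than ad-hoc Python, but the rule-as-predicate structure is the same.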
Preferred / Nice-to-Have Skills
- Experience with mobile money or digital payment customer analytics (M-Pesa or comparable platforms).
- Familiarity with diaspora remittance analytics or cross-border payment customer behaviour.
- Knowledge of differential privacy or anonymisation techniques applicable to payment data.
- Experience with MLOps tooling for model deployment and monitoring (MLflow / Vertex AI / SageMaker / equivalent).
- Experience in emerging-markets data contexts (data sparsity, network effects, airtime credit proxies, etc.).
Tools & Technologies
- Languages: Python (pandas, scikit-learn, XGBoost, LightGBM, statsmodels), SQL
- Data pipelines: dbt, Apache Airflow, or equivalent
- Dashboarding: Tableau, Power BI, or equivalent
- Notebooks: Jupyter or equivalent
- Version control: Git (GitHub / GitLab)
- Cloud: Azure or equivalent
- Collaboration: Confluence / SharePoint
- Issue tracking: Jira / Azure DevOps