GG 1442 Senior Data Engineer
Posted on:
9 days ago
Vacancies:
1 Vacancy
Job Summary
SENIOR DATA ENGINEER
Enterprise Systems Data & Intelligence Platform
ABOUT THE ROLE
You design and build data platforms end-to-end. Given a business problem, you own the
solution architecture, build the pipelines, model the data, and deliver the serving layer
independently, to production quality. You are equally comfortable at a whiteboard and in a
notebook. You don't wait for a detailed brief; you write one.
This is a hands-on engineering role working within a small data team. You will own the data platform from ingestion through to the intelligence layer: the structured, governed, AI-ready data products that power reporting, self-serve analytics, and operational decision-making.
You operate across a hybrid cloud environment: migrating an existing Microsoft SQL Server data warehouse to Microsoft Fabric, integrating new data sources from REST APIs, MySQL, and PostgreSQL (AWS), and ensuring data is set up for semantic model builds and intelligence outputs in Power BI. You are a technical authority on data solution design in this space.
CORE RESPONSIBILITIES
| Area | What You Own |
| Solution Design & Architecture |
- Translate ambiguous business problems into clearly scoped data engineering solutions: source mappings, transformation logic, medallion layer design, serving layer specification, and data lineage documentation.
- Produce solution design artefacts independently: architecture diagrams, data flow documentation, modelling approach, and feasibility and security assessments. No detailed brief required.
- Assess build-vs-buy and platform capability trade-offs across the Microsoft Fabric, Azure, and AWS stack.
- Challenge requirements when the data approach is wrong.
- Propose alternatives before committing to implementation.
- Define and enforce data engineering standards: naming conventions, medallion layer contracts, pipeline testing requirements, and documentation expectations across the platform.
- Assess the cost of proposed data solutions, including commercial cost versus capacity usage limits.
| Pipeline Engineering & Ingestion |
- Design and build ingestion pipelines within Microsoft Fabric, utilising data pipelines, Spark notebooks, and Real-Time Intelligence (RTI) to orchestrate data into our enterprise data lake.
- Build and maintain medallion architecture pipelines (bronze, silver, gold, platinum), including Delta table management, schema evolution, Spark and Delta table optimisation techniques, and various data ingestion patterns.
- Implement CDC and batch ingestion strategies against various sources using AWS DMS, Debezium, Fabric-native mirroring, CDF, and high-water-mark strategies, depending on source system characteristics and latency requirements.
- Build and provide support within additional upstream and downstream systems, including SQL Server, MySQL, AWS RDS, and PostgreSQL on AWS.
- Implement robust orchestration: dependency chaining, failure handling, retry logic, alerting on pipeline degradation, watermark-based incremental loads, and logging frameworks.
- Adopt and extend Fabric Pipelines (FDF), Dataflows Gen2, Notebooks, and Real-Time Intelligence for streaming source integration.
- Enable multi-cloud data delivery: S3 staging, cross-account AWS-Azure connectivity, and OneLake shortcut/mirroring strategies.
| Data Modelling |
- Design gold/platinum layer data for modelling: fact table grain definition, conformed dimensions, SCD Type 1/2 implementation in Delta Lake, and bridge tables for many-to-many relationships.
- Apply modelling patterns appropriate to the company domains: transactional, periodic snapshot, and accumulating snapshot fact tables across ordering, store operations, and loyalty data.
- Consider models that support both analytical consumption (Power BI Direct Lake) and operational data product use cases.
- Document modelling decisions: not just what was built but why, with explicit trade-offs recorded.
| Intelligence Layer & Semantic Models |
- Design and own enterprise Power BI semantic models as governed, reusable data products, not per-report datasets.
- Deliver DAX at depth: calculation groups, complex measure patterns, context transition, role-playing dimensions, and row-level security.
- Configure Direct Lake datasets with awareness of their constraints: column limits, aggregation behaviour, composite model restrictions, and refresh mechanics.
- Structure the gold serving layer for AI consumption: schema-stable, well-documented Delta tables suitable for Fabric Copilot grounding, RAG pipelines, and natural language query interfaces.
- Design the serving layer for analyst self-service: field naming standards, measure documentation, and a logical model structure that allows analysts to answer business questions without engineering tickets.
- Maintain XMLA endpoint tooling (Tabular Editor, ALM Toolkit) for semantic model version control and deployment.
| Data Quality & Governance |
- Define and implement data quality controls across the pipeline: profiling on ingest, null/cardinality/distribution checks, and alerting before downstream consumption is affected.
- Apply governance frameworks: data lineage documentation, sensitive data classification, PII handling controls, and audit traceability across the lakehouse.
- Implement testing approaches (dbt tests, Great Expectations, or Fabric-native DQ rules) to validate accuracy, completeness, and pipeline behaviour against expected outcomes.
- Ensure all solutions align with enterprise security models, access control patterns, and change management workflows.
| Platform Support & Reliability |
- Proactively monitor pipeline health and data model performance; identify and resolve issues before they surface as business incidents.
- Integrate into incident and change management workflows; maintain runbooks for critical data pipelines.
- Ensure notebooks, semantic models, and pipelines remain secure, traceable, versioned, and performant as the platform scales.
ESSENTIAL REQUIREMENTS
| Category | Requirement |
| Experience |
- 6 years of hands-on data engineering in production environments: pipelines, modelling, and serving layer.
- Proven ability to design and deliver end-to-end data solutions from a business problem statement without requiring a detailed technical brief.
- Track record of owning data platform components in hybrid cloud environments.
| Microsoft Fabric Platform |
- Operational experience (not just familiarity) with: Lakehouse, Dataflows Gen2, Fabric Pipelines, Direct Lake datasets, OneLake shortcuts and mirroring, and Real-Time Intelligence.
- Understanding of Direct Lake mode constraints: column limits, aggregation behaviour, composite model restrictions.
- Delta table lifecycle management: schema evolution, time travel, and optimisation techniques (PARTITION, ZORDER, VACUUM).
- Up to date with Fabric feature developments and monthly releases.
| SQL Server & DW |
- T-SQL proficiency: window functions, CTEs, execution plan analysis.
- Dimensional modelling: star/snowflake schema design, SCD Type 1/2/4 implementation.
- DW migration patterns: SQL Server to cloud lakehouse, schema mapping, type coercion, constraint translation.
| AWS Source Systems |
- CDC and batch extraction from MySQL and PostgreSQL on AWS RDS into Azure-based platforms.
- AWS DMS or Debezium for ongoing replication pipelines.
- Cross-cloud connectivity: IAM, private endpoints, S3 staging patterns for Azure ingest.
| Power BI & DAX |
- Understanding of semantic model engineering: DAX, calculation groups, RLS, incremental refresh configuration.
- Knowledge of XMLA endpoint tooling: Tabular Editor, ALM Toolkit for model deployment and version control.
- Serving layer design for Direct Lake: schema standards that support reliable Power BI consumption.
| Data Modelling |
- Medallion architecture design with clear layer contracts.
- Fact table grain definition and SCD handling in Delta Lake.
- Conformed dimension design across multi-source domains.
| Disposition |
- Owns a problem statement to production without requiring hand-holding at each step.
- Challenges requirements when the data approach is wrong; proposes and defends alternatives.
- Chooses reliable, maintainable solutions over clever ones. Ships working data products.
- Comfortable operating with autonomy in a cross-functional environment.
PREFERRED / STRONG ADVANTAGE
- dbt or equivalent transformation framework with version-controlled, tested transformation logic
- Great Expectations, dbt tests, or Fabric-native DQ rules for automated data quality
- Fabric Data Agents / Ops Agents / Copilot or LLM grounding patterns: structuring serving layer data for AI-assisted analytics
- Event streaming experience: Kafka, Azure Event Hubs, or Fabric Real-Time Intelligence for low-latency pipelines
- Experience in quick-service restaurant, retail, or high-frequency transactional domains
- AI-augmented development tooling: Copilot in Fabric, GitHub Copilot, or equivalent in day-to-day engineering workflow
WHAT YOU'LL BE BUILDING
You will join a team during an exciting period of transformation as we re-platform from an established SQL Server data warehouse to a new Microsoft Fabric lakehouse.
Your immediate focus will be:
- Completing and hardening the medallion architecture in Fabric, from bronze through to governed gold and platinum layer outputs
- Building the enterprise data foundations for our insight intelligence layer
- Designing a serving layer that supports both current Power BI consumption and future AI/Copilot integration
- Establishing data quality controls and pipeline reliability standards that the platform can scale on
- Owning the technical data standards (naming conventions, layer contracts, testing requirements) that all future platform work aligns to
About Softobiz Technologies
Softobiz Technologies is a technology and product services company headquartered in India, operating Global Capability Centers (GCCs) for leading international clients across healthcare, fintech, and enterprise software. Our GCC model enables world-class talent in India to work directly within the product and engineering teams of our global partners, contributing meaningfully to product strategy, growth, and operations.
Innovation begins with like-minded people aiming to transform the world together. At Softobiz, we invite you to become a part of an organization that has been helping clients transform their business by fusing insights, creativity, and technology. With a team of 400 technology enthusiasts, we have been trusted by leading enterprises around the globe for over 18 years.
At Softobiz, we foster a culture of equality, learning, collaboration, and creative freedom, empowering our employees to grow and excel in their careers. Our technical craftsmen are pioneers in the latest technologies like AI, machine learning, and product development.
Why Should You Join Softobiz?
- Work with technical craftsmen who are pioneers in the latest technologies.
- Access training sessions and skill-enhancement courses for personal and professional growth.
- Be rewarded for exceptional performance and celebrate success through engaging parties.
- Experience a culture that embraces diversity and creates an inclusive environment for all employees.
Softobiz is an equal opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all employees. All qualified applicants will be afforded equal employment opportunities without discrimination based on race, creed, color, national origin, sex, age, disability, or marital status.
Required Experience:
Senior IC
About Company
Softobiz prepares businesses for transformative success by embracing change and engineering innovative digital products.