Position : Sr. Data Engineer
Location : Bellevue HQ or Overland Park
Work Required
- Lead the architecture, design, and implementation of scalable, modular, and reusable data flow pipelines using Cribl, Apache NiFi, Vector, and other open-source platforms, ensuring consistent ingestion strategies across a complex, multi-source telemetry environment.
- Develop platform-agnostic ingestion frameworks and template-driven architectures to enable reusable ingestion patterns supporting a variety of input types (e.g., syslog, Kafka, HTTP, Event Hubs, Blob Storage) and output destinations (e.g., Snowflake, Splunk, ADX, Log Analytics, Anvilogic).
- Spearhead the creation and adoption of a schema normalization strategy leveraging the Open Cybersecurity Schema Framework (OCSF), including field mapping, transformation templates, and schema validation logic, designed to be portable across ingestion platforms.
- Design and implement custom data transformations and enrichments using scripting languages such as Groovy, Python, or JavaScript while enforcing robust governance and security controls (SSL/TLS, client authentication, input validation, logging).
- Ensure full end-to-end traceability and lineage of data across the ingestion, transformation, and storage lifecycle, including metadata tagging, correlation IDs, and change tracking for forensic and audit readiness.
- Collaborate with observability and platform teams to integrate pipeline-level health monitoring, transformation failure logging, and anomaly detection mechanisms.
- Oversee and validate data integration efforts, ensuring high-fidelity delivery into downstream analytics platforms and data stores with minimal data loss, duplication, or transformation drift.
- Lead technical working sessions to evaluate and recommend best-fit technologies, tools, and practices for managing structured and unstructured security telemetry data at scale.
- Implement data transformation logic, including filtering, enrichment, dynamic routing, and format conversions (e.g., JSON, CSV, XML, Logfmt), to prepare data from more than 100 sources for downstream analytics platforms.
- Contribute to and maintain a centralized documentation repository, including ingestion patterns, transformation libraries, naming standards, schema definitions, data governance procedures, and platform-specific integration details.
- Coordinate with security, analytics, and platform teams to understand use cases and ensure pipeline logic supports threat detection, compliance, and data analytics requirements.
Overview
We are seeking eight Senior Data Engineers to lead efforts in orchestrating and transforming complex security telemetry data flows. These individuals will be responsible for high-level architecture, governance, and ensuring the secure and reliable movement of data between systems, particularly for legacy and non-standard log sources. In scope are 100 data sources, both existing and new, that are specific to cybersecurity workloads. These tasks will be performed on one or more data ingestion pipelines (Cribl, Vector, NiFi).