HEROIC Cybersecurity is seeking a Senior Data Infrastructure Engineer with deep expertise in DataStax Enterprise (DSE) and Apache Cassandra to help architect, scale, and maintain the data infrastructure that powers our cybersecurity intelligence platforms.
You will be responsible for designing and managing fully automated big data pipelines that ingest, process, and serve hundreds of billions of breached and leaked records sourced from the surface, deep, and dark web. You'll work with DSE Cassandra, Solr, and Spark, helping us move toward a 99% automated pipeline for data ingestion, enrichment, deduplication, and indexing, all built for scale, speed, and reliability.
This position is critical in ensuring our systems are fast, reliable, and resilient as we ingest thousands of unique datasets daily from global threat intelligence sources.
What you will do:
- Design, deploy, and maintain high-performance Cassandra clusters using DataStax Enterprise (DSE)
- Architect and optimize automated data pipelines to ingest, clean, enrich, and store billions of records daily
- Configure and manage DSE Solr and Spark to support search and distributed processing at scale
- Automate dataset ingestion workflows from unstructured surface, deep, and dark web sources
- Handle cluster management, replication strategy, capacity planning, and performance tuning
- Ensure data integrity, availability, and security across all distributed systems
- Write and manage ETL processes, scripts, and APIs to support data flow automation
- Monitor systems for bottlenecks, optimize queries and indexes, and resolve production issues
- Research and integrate third-party data tools or AI-based enhancements (e.g., smart data parsing, deduplication, ML-based classification)
- Collaborate with engineering, data science, and product teams to support HEROIC's AI-powered cybersecurity platform
Requirements
- Minimum 5 years of experience with Cassandra / DataStax Enterprise in production environments
- Hands-on experience with DSE Cassandra, Solr, Apache Spark, CQL, and data modeling at scale
- Strong understanding of NoSQL architecture, sharding, replication, and high availability
- Advanced knowledge of Linux/Unix shell scripting and automation tools (e.g., Ansible, Terraform)
- Proficient in at least one programming language: Python, Java, or Scala
- Experience building large-scale automated data ingestion systems or ETL workflows
- Solid grasp of AI-enhanced data processing, including smart cleaning, deduplication, and classification
- Excellent written and spoken English communication skills
- Prior experience with cybersecurity or dark web data (preferred but not required)
Benefits
About Us: HEROIC Cybersecurity is building the future of cybersecurity. Unlike traditional cybersecurity solutions, HEROIC takes a predictive and proactive approach to intelligently secure our users before an attack or threat occurs. Our work environment is fast-paced, challenging, and exciting. At HEROIC, you'll work with a team of passionate, engaged individuals dedicated to intelligently securing the technology of people all over the world.
Position Keywords: DataStax Enterprise (DSE), Apache Cassandra, Apache Spark, Apache Solr, AWS, Jira, NoSQL, CQL (Cassandra Query Language), Data Modeling, Data Replication, ETL Pipelines, Data Deduplication, Data Lake, Linux/Unix Administration, Bash, Docker, Kubernetes, CI/CD, Python, Java, Distributed Systems, Cluster Management, Performance Tuning, High Availability, Disaster Recovery, AI-based Automation, Artificial Intelligence, Big Data, Dark Web Data
- Minimum 8 years of full-stack PHP development experience, with 3 years in Laravel
- Deep expertise in PHP, Laravel, MySQL, JavaScript (or similar), Git, and RESTful APIs
- Experience with server and database management (Linux, Apache/Nginx, MySQL/PostgreSQL, Cassandra)
- Strong familiarity with AI-enhanced coding tools and modern DevOps workflows (CI/CD, GitHub Actions)
- Experience in a security-focused or SaaS product environment is a strong plus
- Excellent English communication skills (written and verbal)
- Comfortable working independently during U.S. hours, 9:00am-6:00pm (Pacific Time), and owning mission-critical systems