FINRA/Big Data Engineer
Location: Onsite, Rockville, MD or Tysons Corner, VA
Role: CTH, 6 months
| Ideal/Required | Skills | Actual candidate skills/credentials (describe actual experience) |
| Required | Must be within commutable distance of Rockville, MD or Tysons Corner, VA; what city do you live in? | |
| 5 yrs | Designing and implementing big data and distributed systems with large-scale (petabyte-scale) datasets, including data processing environments, troubleshooting performance or scalability challenges, and resource optimization. | |
| Required | Strong Apache Spark skills, including its architecture (executors, stages, DAGs, tasks). | |
| Required | Strong big data frameworks: Hadoop, Hive, or Trino. | |
| Required | Strong Python, Scala, or Java, with a focus on scalable and modular code. | |
| Required | Hands-on with AWS tools, including S3, EMR, Glue, Lambda, and Athena. | |
| Required | Experience designing/maintaining production ETL and data processing systems. | |
| Required | Strong at writing advanced SQL queries, such as window functions, complex joins, and aggregations. | |
| Required | Understands CI/CD pipelines and automated testing in data engineering environments. | |
| Ideal | Has worked within Financial Services or regulated industries. | |
| Ideal | Kubernetes, EKS, or serverless architectures. | |
| Ideal | Data lake and modern data platform architectures. | |
| Ideal | Helps drive adoption of modern data and AI engineering practices. | |
| Ideal | AI-assisted dev tools such as GitHub Copilot, ChatGPT, Claude, or similar, with prompt engineering, AI workflow design, and AI-driven productivity. | |
| Ideal | Understands data governance, compliance, and security best practices. | |
| Ideal | AWS certifications. | |
| Required | Bachelor's degree in Computer Science, Information Systems, or a related discipline, or equivalent practical experience. Master's degree ideal. | |
| Required | Works in fast-paced, dynamic environments and manages competing priorities. | |
| Required | Clear English; strong written and verbal communication and teamwork skills. | |
| Required | Strong self-starter attitude, team-oriented, with good communication skills. | |
We are seeking a highly skilled and experienced Big Data Engineer to design, build, and optimize large-scale data platforms and distributed processing systems at our FinTech customer. This role is critical in enabling data-driven decision-making across the organization by delivering scalable, reliable, and high-performance data solutions.
The ideal candidate has deep expertise in distributed computing, cloud platforms, and modern big data technologies such as Apache Spark, Hadoop, Hive, and Trino. This individual will work closely with data scientists, analysts, product teams, and engineering stakeholders to architect and implement robust data pipelines and enterprise-grade data platforms. The role also requires strong software engineering practices, AI-assisted development proficiency, and the ability to optimize systems handling petabyte-scale data.
This position offers the opportunity to work in a modern, cloud-first environment while driving innovation in big data, AI-enabled engineering, and scalable data architecture.
Responsibilities
- Design, develop, and maintain large-scale data pipelines using modern big data technologies such as Spark, Hadoop, Hive, and Trino.
- Build scalable and reliable solutions for data ingestion, transformation, storage, and analytics.
- Architect distributed data platforms capable of processing massive (petabyte-scale) datasets.
- Optimize and enhance existing data pipelines for performance, scalability, cost efficiency, and reliability.
- Implement automated testing frameworks and continuous validation for data quality and pipeline accuracy.
- Develop unit, integration, and end-to-end test strategies for data platforms.
- Collaborate with cross-functional teams to translate business requirements into scalable data solutions.
- Support data scientists and analytics teams by delivering high-quality, production-ready datasets.
- Monitor, troubleshoot, and resolve data pipeline issues in production environments.
- Investigate and resolve challenges such as data skew, resource constraints, job failures, and large-scale system bottlenecks.
- Apply Spark tuning techniques, including partitioning, caching, broadcast joins, and performance optimization.
- Ensure strong software engineering practices, including version control, code quality, and CI/CD automation.
- Stay current with emerging big data, cloud, and AI technologies to continuously improve data architecture.
- Drive AI-enabled development practices, including prompt engineering, AI-assisted coding, and workflow optimization.
- Partner with stakeholders to ensure regulatory, governance, and financial data integrity requirements are met.
Qualifications
Required:
- Bachelor's degree in Computer Science, Information Systems, or a related discipline, or equivalent practical experience.
- 5 years of experience designing and implementing big data and distributed systems.
- Strong expertise in Apache Spark and its architecture (executors, stages, DAGs, tasks).
- Hands-on experience with big data technologies such as Hadoop, Hive, and Trino.
- Strong proficiency in Python, Scala, or Java, with a focus on scalable and modular code.
- Extensive experience writing advanced SQL queries, including window functions, complex joins, and aggregations.
- Experience working with large-scale datasets and troubleshooting performance or scalability challenges.
- Hands-on experience with cloud platforms such as AWS, including S3, EMR, Glue, Lambda, and Athena.
- Experience designing and maintaining production ETL and data processing systems.
- Strong understanding of distributed system performance tuning and resource optimization.
- Experience implementing CI/CD pipelines and automated testing in data engineering environments.
- Strong understanding of Agile methodologies such as Scrum and Kanban.
- Excellent communication and collaboration skills.
- Ability to work in fast-paced, dynamic environments and manage competing priorities.
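As a concrete example of the advanced SQL the role calls for, the query below uses a window function (a per-account running total) against a throwaway in-memory SQLite table; the table and column names are invented for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trades (account TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO trades VALUES (?, ?)",
    [("A", 100), ("A", 50), ("B", 200), ("A", 25)],
)

# Window function: running total of amounts per account, ordered by
# insertion order (rowid). PARTITION BY restarts the sum per account.
rows = conn.execute(
    """
    SELECT account,
           amount,
           SUM(amount) OVER (
               PARTITION BY account ORDER BY rowid
           ) AS running_total
    FROM trades
    ORDER BY rowid
    """
).fetchall()

for row in rows:
    print(row)
# ('A', 100, 100)
# ('A', 50, 150)
# ('B', 200, 200)
# ('A', 25, 175)
```

The same `SUM(...) OVER (PARTITION BY ... ORDER BY ...)` pattern carries over directly to Hive, Trino, and Spark SQL.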
Desirable:
- Master's degree in Computer Science, Data Engineering, or a related field.
- Experience in Financial Services or regulated industries.
- Exposure to petabyte-scale data processing environments.
- Experience with Kubernetes, EKS, or serverless architectures.
- Experience with data lake and modern data platform architectures.
- AWS certifications.
- Experience with AI-assisted development tools such as GitHub Copilot, ChatGPT, Claude, or similar.
- Experience in prompt engineering, AI workflow design, and AI-driven productivity.
- Strong understanding of data governance, compliance, and security best practices.
- Experience driving organizational adoption of modern data and AI engineering practices.