Overview:
TekWissen is a global workforce management provider headquartered in Ann Arbor, Michigan, that offers strategic talent solutions to our clients worldwide. This client is an American multinational semiconductor company based in Santa Clara, California, that develops computer processors and related technologies for business and consumer markets. It is a global company that specializes in manufacturing semiconductor devices used in computer processing. The company also produces flash memories, graphics processors, motherboard chipsets, and a variety of components used in consumer electronics goods.
Job Title: Software Development Engineer Release
Work Location: San Jose, CA
Duration: 3 Months
Work Type: Temporary Assignment
Job Type: Remote
Job Description:
THE ROLE:
- We are seeking a skilled and motivated Software Development Engineer to join our Training at Scale team.
- In this role, you will develop tools and automation to support large-scale model training on the latest GPUs.
- You'll work closely with engineers across teams to optimize training workloads, manage CI/CD pipelines, and ensure reliable, high-performance releases.
- This is a hands-on engineering position with a strong focus on distributed systems, performance, and automation at scale.
THE PERSON:
- The ideal candidate brings deep experience in open-source software (OSS) release cycles and container-based packaging (e.g., Docker), along with strong debugging skills, particularly around model training workloads.
- You thrive in fast-paced environments and are passionate about automation, system reliability, and continuous improvement.
KEY RESPONSIBILITIES:
- Manage and maintain nightly builds for multiple training frameworks
- Collaborate on integrating new training workloads and expanding test coverage
- Ensure the stability and releasability of the main branch at all times
- Update and maintain build processes to support biweekly release and performance goals
- Handle and deliver ad-hoc development test builds as requested
- Track build performance and reliability metrics over time
Must Have Skills:
- Release engineering & CI/CD at scale
- Containerization & reproducible builds (expert Docker workflows: multi-stage builds, caching, multi-arch)
- Build & test automation for distributed ML workloads
- Strong debugging and scripting skills
PREFERRED EXPERIENCE:
- Experience with open-source software contributions and release management
- Strong hands-on experience with Docker and container-based workflows
- Excellent problem-solving skills and attention to detail
- Ability to work independently and a willingness to learn new technologies quickly
ACADEMIC CREDENTIALS:
- Bachelor's degree in Computer Science, Engineering, or a related technical field
TekWissen Group is an equal opportunity employer supporting workforce diversity.