Staff Software Engineer, ML Infrastructure

SimpliSafe


Job Location:

Boston, MA - USA

Annual Salary: $146,600 - $215,100
Posted on: 3 days ago
Vacancies: 1 Vacancy

Job Summary

About SimpliSafe

We're a high-tech home security company that's passionate about protecting the life you've built and our mission of keeping Every Home Secure. And we've created a culture here that cares just as deeply about the career you're building. Ours is a no-ego culture of collaboration and innovation, where those seeking their next challenge can find big opportunities and make a huge impact on the lives of all those we protect. We don't just want you to work here. We want you to grow and thrive here.

We're embracing a hybrid work model that enables our teams to split their time between office and home. Hybrid for us means we expect our teams to come together in our state-of-the-art office on two core days, typically Tuesday, Wednesday, or Thursday, working together in person and choosing where they work for the remainder of the week. We all benefit from flexibility and get to use the best of both worlds to get our work done.

Why are we hiring?

Well, we're growing and thriving. So we need smart, talented, and humble people who share our values to join us as we disrupt the home security space and relentlessly pursue our mission of keeping Every Home Secure.

About the Role

We're looking for a Staff Software Engineer to join our Cloud ML team, the team that owns both the cloud-side ML infrastructure and the applied ML research that powers SimpliSafe's intelligent home security products. This is a senior individual contributor role for a distributed systems expert who wants to apply that craft to one of the most demanding problem domains in the company.

You'll partner closely with other Staff and Principal engineers to drive architecture, mentor across the team, and set the technical direction for our ML platform. The work spans two of our most demanding workloads: real-time computer vision inference that processes video from cameras and doorbells across our customer base, and LLM/GenAI infrastructure that will power our future generation of intelligent applications. Both are fundamentally distributed systems problems: high-throughput, low-latency, multi-tenant, GPU-aware, and unforgiving of regressions.

This role is for someone who has built and operated large-scale distributed services in production (high-QPS APIs, real-time platforms, low-latency serving systems) and is excited to bring that depth to ML infrastructure. Prior ML experience is a plus, not a prerequisite. If you've shipped systems that serve a lot of traffic, scale gracefully, and stay up at 3am, we want to talk to you.

What You'll Do

Set technical direction for ML infrastructure

  • Drive architecture decisions for our Kubernetes-based ML platform, anchored on Ray for inference alongside KServe, Triton, and vLLM, across real-time and batch workloads.
  • Lead deep technical reviews on system design, capacity planning, and reliability for the highest-stakes ML systems at SimpliSafe.
  • Identify and remove the systemic bottlenecks in our ML deployment infrastructure, whether that's serving reliability, deployment friction, observability gaps, scaling, or cost.

Build and operate real-time CV inference at scale

  • Own the design and evolution of cloud-side inference systems that process live video and events from SimpliSafe devices in real time.
  • Drive throughput, latency, and cost improvements (batching strategies, GPU utilization, autoscaling, multi-model serving) for production CV models.
  • Build the feedback loops between cloud inference, edge devices, and the data flywheel that improves model quality over time.

Stand up LLM/GenAI serving infrastructure

  • Help shape how SimpliSafe serves LLMs in production: model serving patterns, KV-cache and batching strategies, evaluation pipelines, guardrails, and cost controls.
  • Partner with applied ML engineers to take new GenAI-powered product features from prototype to scaled deployment.

Raise the engineering bar across Cloud ML

  • Mentor engineers across the team through design reviews, code reviews, pairing, and written guidance, providing a meaningful uplift for everyone you work with.
  • Establish and evangelize best practices for model lifecycle management (registry, deployment, monitoring, rollback, drift) and on-call.
  • Write the documentation, runbooks, and architectural decision records that make the platform legible and durable.

Own reliability and operational excellence

  • Lead incident response and postmortems for critical ML systems; turn lessons learned into platform-level improvements.
  • Define SLOs, observability standards, and on-call practices for ML services in production.

Qualifications

  • 8 years of software engineering experience with a clear track record of building and operating large-scale distributed systems in production.
  • Deep expertise in high-throughput, low-latency services (ad serving, recommendations, real-time APIs, online platforms, or similar), including the operational reality of running them at scale.
  • Strong production experience on Kubernetes and AWS (EKS, S3, IAM, networking) and with Kafka, containerized deployments, CI/CD, and infrastructure-as-code.
  • Demonstrated experience with the building blocks of high-scale systems: load balancing, autoscaling, batching, caching, multi-tenancy, queuing, and capacity planning.
  • Proficiency in Python is required; experience with a systems language (Go, C, Rust) for performance-sensitive components is a plus.
  • Staff-level technical leadership: ability to drive ambiguous, cross-cutting initiatives, align senior stakeholders, and elevate the engineers around you without formal authority.
  • Strong written and verbal communication: you can make complex technical tradeoffs legible to ML scientists, product, and other infra teams.
  • ML exposure is preferred: having deployed or operated production ML systems, worked closely with ML teams, or built ML-adjacent infrastructure. Exceptional distributed systems engineers without direct ML experience are encouraged to apply; we'll help you ramp.

Bonus Points

  • Hands-on experience with Ray, KServe, Triton, vLLM, or other ML serving stacks.
  • Hands-on experience with LLM serving in production (vLLM, TGI, TensorRT-LLM, SGLang): KV cache management, continuous batching, speculative decoding, quantization for serving.
  • Experience building real-time video or streaming pipelines (Kafka, Kinesis, Flink, or similar) at scale.
  • Experience operating GPU-based inference systems: GPU-aware scheduling, multi-model serving, accelerator utilization optimization.
  • Familiarity with ML fundamentals: how models are trained, evaluated, versioned, deployed, monitored, and rolled back in production.
  • Experience with model lifecycle tooling (MLflow, Weights & Biases, model registries, drift detection, shadow deployments).
  • Open source contributions to distributed systems or ML infrastructure projects.
  • Experience operating in environments with strong security and compliance requirements.

Why This Role

The Cloud ML team owns the full surface area (infrastructure and applied research), which means your work as a Staff infra engineer directly shapes what's possible for the science. You'll have unusual leverage: the platform you build determines how fast SimpliSafe can ship intelligent features, and the features we ship directly impact whether someone's home is safer tonight than it was yesterday.

What Values You'll Share

  • Customer Obsessed - Building deep empathy for our customers, putting them at the core of our work, and developing strong, long-term relationships with them.
  • Aim High - Always challenging ourselves and others to raise the bar.
  • No Ego - Maintaining a "no job too small" attitude and an open, inclusive, and humble style.
  • One Team - Taking a highly collaborative approach to achieving success.
  • Lift As We Climb - Investing in developing others and helping others around us succeed.
  • Lean & Nimble - Working with agility and efficiency to experiment in an often ambiguous environment.

What We Offer

  • A mission- and values-driven culture and a safe, inclusive environment where you can build, grow, and thrive.
  • A comprehensive total rewards package that supports your wellness and provides security for SimpliSafers and their families.
  • Free SimpliSafe system and professional monitoring for your home.
  • Employee Resource Groups (ERGs) that bring people together, give opportunities to network, mentor, and develop, and advocate for change.

The target annual base pay range for this role is $146,600 to $215,100.

This target annual base pay range represents our good-faith estimate of what we expect to pay for this role. We use a market-based compensation approach to set our target annual base pay ranges and make adjustments annually. We carefully tailor individual compensation packages, including base pay, taking into consideration employees' job-related skills, experience, qualifications, work location, and other relevant business factors.

Beyond base pay, we offer a Total Rewards package that may include participation in our annual bonus program, equity, and other forms of compensation, in addition to a full range of medical, retirement, and lifestyle benefits. More details can be found here.

We're committed to fair and equitable pay practices as well as pay transparency. We regularly review our programs to ensure they remain competitive and aligned with our values.

We wholeheartedly embrace and actively seek applications from all individuals, no matter how they identify. We are committed to cultivating a diverse and inclusive workplace, and we believe our work is enriched when we incorporate a multitude of perspectives, backgrounds, and experiences. We want everyone who works here to thrive and contribute to not only our mission of keeping every home secure but also to making our workplace safe and supportive for others. If a reasonable accommodation may be needed to fully participate in the job application or interview process, to perform the essential functions of a position, or to receive other benefits and privileges of employment, please contact.


Required Experience:

Staff IC

About Company

Shop award-winning home security systems from SimpliSafe. Professional monitoring, protection from break-ins and hazards, and no contracts.