Solutions Architect Platform Infrastructure Remote

Job Location

San Francisco, CA - USA

Yearly Salary

$131,000 - $181,000

Vacancy

1 Vacancy

Job Description

At Weights & Biases, our mission is to build the best tools for AI developers. We founded our company on the insight that while there were excellent tools for developers to build better code, there were no similarly great tools to help ML practitioners build better models. Starting with our first experiment tracking product, we have since expanded our solution into a comprehensive AI developer platform for organizations focused on building their own deep learning models and generative AI applications.

Weights & Biases is a Series C company with $250M in funding and over 200 employees. We proudly serve over 1,000 customers and more than 30 foundation model builders, including OpenAI, NVIDIA, Microsoft, and Toyota.

The Solutions Architect role at Weights & Biases is a unique hybrid, blending the technical expertise of a Site Reliability Engineer (SRE) with the communication and advisory skills of a Solutions Architect. In this role, you will focus on all aspects of the Weights & Biases Platform, managing customer deployments across various cloud infrastructures and on-prem environments to ensure scalability, reliability, and operational excellence.

You will work closely with customers to debug issues, provide best practices, and help them unlock the full potential of Weights & Biases. Additionally, you will produce technical content such as blog posts, documentation updates, and internal enablement material to support the Field Engineering team. This role requires deep collaboration with Support, Product, and Engineering teams to drive product improvements based on customer insights.

Responsibilities:

    • Deployment & Operations:
        • Work with customer operations teams to provision Weights & Biases services in Dedicated Cloud, Private Cloud, and on-prem environments.
        • Manage complex infrastructure implementations, partnering with highly skilled customer engineers.
        • Monitor and ensure the reliability, performance, and scalability of customer deployments using SRE best practices.
    • Debugging & Troubleshooting:
        • Diagnose and resolve issues in customer environments, documenting resolutions to accelerate future problem-solving.
        • Provide hands-on support for containerized and distributed systems using Docker, Kubernetes, and related technologies.
    • Customer Engagement:
        • Lead technical discussions with customers, acting as a trusted advisor for infrastructure reliability and operational excellence.
        • Deliver training sessions, product demos, and workshops to help customers maximize the value of Weights & Biases.
        • Collaborate with customers to uncover desired outcomes and recommend solutions tailored to their needs.
    • Enablement & Collaboration:
        • Partner with AI Solution Engineers to streamline post-sales processes, including onboarding, adoption, and training.
        • Collaborate with Sales Engineering to ensure a seamless transition from POC to onboarding.
        • Provide insights to the Product team based on customer feedback to influence the product roadmap.

Requirements:

    • Based in the Pacific Standard Time (PST) time zone.
    • A proven track record of systematically diagnosing and resolving infrastructure issues.
    • Prior experience in a customer-facing technical role.
    • Expertise with Docker, Kubernetes, Helm charts, networking, and cloud-managed services (e.g., MySQL, object stores).
    • Strong fundamentals in Infrastructure as Code (IaC), preferably Terraform.
    • Proficiency with at least one cloud platform (AWS, GCP, Azure); experience with multiple platforms is a plus.
    • Strong Linux/Unix command line experience.
    • Basic proficiency in Python and familiarity with ML workflows or tools.
    • Exceptional communication skills, both written and verbal, with the ability to simplify complex topics for diverse audiences.
    • Proven ability to prioritize and manage multiple competing tasks in a dynamic environment.

Strong Plus:

    • Deep proficiency in Kubernetes design patterns, including Operators.
    • Familiarity with data engineering and MLOps tooling.
    • Experience as an educator or facilitator for technical training sessions, workshops, or demos.
    • SaaS, web service, or distributed systems operations experience.

Our Benefits:

    • Flexible time off
    • Medical, Dental, and Vision for employees and Family Coverage
    • Remote-first culture with in-office flexibility in San Francisco
    • Home office budget with a new high-powered laptop
    • Truly competitive salary and equity
    • 12 weeks of Parental leave (U.S. specific)
    • 401(k) (U.S. specific)
    • Supplemental benefits may be available depending on your location
    • Explore benefits by country

Employment Type

Full-Time
