Description
The Site Reliability Engineer (SRE) SQL is a critical technical leader responsible for ensuring the availability, performance, and reliability of Tyler Technologies' SQL Server infrastructure. This role combines deep database expertise with operational excellence to support high-performance systems across complex client environments. The ideal candidate is confident, authoritative, and takes ownership in diagnosing and resolving issues while influencing database design, infrastructure improvements, and long-term stability strategies.
Hybrid Work Policy: The candidate is required to be onsite at least three times per week at the Plano, TX office.
Responsibilities
- Serve as the lead resource for high-severity SQL Server incidents, driving triage, diagnostics, and resolution in real time.
- Own performance tuning, indexing strategies, and architecture-level improvements to optimize database systems at scale.
- Proactively monitor database performance, system health, and workload trends to identify and resolve issues before they impact customers.
- Collaborate with product and development teams to refine schema design, improve query performance, and enhance overall data architecture.
- Develop and maintain database standards, observability dashboards, and automation for patching, backups, and alerting.
- Design and execute comprehensive backup and disaster recovery strategies for critical systems.
- Contribute to continuous improvement initiatives, including cloud modernization, infrastructure as code, and capacity planning.
- Author technical documentation, including runbooks, architecture designs, internal KBs, and lessons learned.
- Provide mentorship and technical leadership to engineers across support and infrastructure teams.
- Advocate for architectural and operational improvements across teams, using data and insight to influence decisions.
Role Complexity
To be successful, this individual must:
- Demonstrate expert-level understanding of SQL Server internals, optimization techniques, and operational best practices.
- Lead high-stakes conversations and incident calls with clarity, confidence, and control.
- Understand system-level performance (I/O, memory, CPU) and how it affects SQL operations.
- Communicate technical issues and solutions effectively to both technical and business stakeholders.
- Analyze recurring incidents to identify trends and permanently resolve root causes.
- Operate autonomously with a sense of ownership and urgency.
- Balance short-term firefighting with long-term architectural planning and automation.
Qualifications
- Bachelor's degree in Computer Science, Information Systems, or a related field, or 5 years of equivalent experience.
- Proven experience in a production-facing SQL Server environment, preferably in a SaaS or multi-tenant context.
- Hands-on expertise in:
  - Query tuning, indexing, and execution plan analysis.
  - High availability, replication, disaster recovery, and backup strategies.
  - Scripting and automation using PowerShell, T-SQL, or similar.
  - Monitoring and observability tools for SQL Server and infrastructure health.
- Strong familiarity with virtualization, storage performance, and cloud platforms (e.g., AWS, Azure).
- Demonstrated ability to lead incident response and influence cross-functional technical decisions.
- Exceptional written and verbal communication skills, especially under pressure.
- Previous experience mentoring peers or junior engineers is a plus.