Job description

Head of Support & Service Reliability Engineering

We are seeking a Head of Support & Service Reliability to lead and evolve our global support function into a proactive, platform-integrated reliability capability.

This role offers an exciting, dynamic opportunity for an outcome-focused individual. Sycurio is at a critical inflection point as we transition from a single-tenant architecture to a multi-tenant SaaS platform, which requires a fundamental shift from reactive ticket handling to systemic reliability, observability, and customer experience management at scale.

You will own the end-to-end operational integrity of the platform, ensuring availability, performance, and customer trust, while partnering closely with Engineering, Product, and customer-facing teams. You will also be a key contributor to our GRR goal of 90%+.

Sycurio employs a strategic managed service provider that supplies the people, tooling, and day-to-day execution across all support tiers. The Head of Support sets the standards, governs vendor performance, and ensures that every aspect of the support experience, from incident response to customer satisfaction, meets enterprise-grade expectations.

Key Responsibilities:

Service Reliability & Platform Stability
Own platform availability, performance, and reliability across all tenants
Reduce incident frequency, severity, and blast radius
Establish and drive Service Reliability Engineering (SRE) principles
Ensure scalability and operational readiness of a multi-tenant platform

Incident Management & Response
Implement and lead a structured incident management framework (P1–P4)
Act as executive owner of major incidents (P1/P2)
Drive improvements in Mean Time to Detect (MTTD) and Mean Time to Resolve (MTTR)
Ensure clear, consistent internal and external communication during incidents

Observability & Monitoring
Define and implement a comprehensive observability strategy, including technical telemetry (infrastructure, application, APIs), business telemetry (transactions, payment success rates, usage), and end-to-end customer journey visibility
Ensure issues are detected proactively rather than reported by customers
Partner with Product and Engineering to embed telemetry into the platform

Support Operations (L1–L3)
Lead global support teams ensuring high-quality, SLA-driven case management
Define and enforce support processes, tooling, and performance standards
Improve key metrics: first response time, resolution time, reopen rate, and escalation quality

Platform Operations & Change Management
Oversee operational aspects of the platform, including release management and deployment safety (ensuring all releases are observable, reversible, and low-risk), change control processes, and environment consistency across staging and production
Own the visibility and continuous improvement of delivery and recovery performance using the DORA metrics, in partnership with Engineering

Issue Management & Root Cause Discipline
Establish rigorous Root Cause Analysis (RCA) standards
Identify and eliminate systemic issues (not just symptom fixes)
Track and reduce recurring incidents
Feed insights into Product and Engineering roadmaps

Customer Experience & Commercial Alignment
Align support with Customer Success and Sales
Ensure coordinated communication during incidents
Protect customer relationships during critical events
Introduce tenant-aware impact assessment (ARR, strategic accounts, regulatory exposure)
Support enterprise-grade expectations for transparency and reliability

Cross-functional Leadership
Act as the bridge between Engineering, Product, and Customer Delivery / Success
Embed supportability and operational readiness into pre-sales (Stage 4/5 governance), product development, and deployment processes

Managed Service Governance
Chair regular operational reviews and quarterly business reviews with the managed service leadership team
Own the managed service scorecard — defining KPIs, reviewing performance data, and driving accountability for misses
Manage contract compliance, SLA adherence, and commercial exposure from managed service underperformance
Lead continuous improvement programs jointly with the managed service provider, including tooling upgrades, process redesigns, and training investments
Maintain an escalation path for systemic or persistent managed service failure, up to and including remediation planning

Key qualifications, skills, experience:

Required

10+ years in Support, Platform Operations, or SRE leadership roles
Proven experience in multi-tenant SaaS and legacy environments
Strong understanding of distributed systems, incident management at scale, and observability frameworks
Track record of building and scaling high-performing operational teams
Experience in outsourced or hybrid operational models
Experience working cross-functionally with Engineering and Product

Preferred

Background in payments, security, or compliance-driven environments (e.g., PCI)
Experience with API-first platforms and telephony/payment flows
Familiarity with observability tools (e.g., Grafana)

Head of Support & Service Reliability Engineering Job at Guildford, England, United Kingdom, GU1 1QA (Full Time)

Sycurio
