We are seeking a Head of Support & Service Reliability to lead and evolve our global support function into a proactive, platform-integrated reliability capability.
This role is an exciting and dynamic opportunity for an outcome-focused individual. Sycurio is at a critical inflection point as we transition from a single-tenant architecture to a multi-tenant SaaS platform, which requires a fundamental shift from reactive ticket handling to systemic reliability, observability, and customer experience management at scale.
You will own the end-to-end operational integrity of the platform, ensuring availability, performance, and customer trust, while partnering closely with Engineering, Product, and customer-facing teams. You will also be a key contributor to our GRR goal of 90%+.
Sycurio employs a strategic managed service provider that supplies the people, tooling, and day-to-day execution across all support tiers. The Head of Support sets the standards, governs vendor performance, and ensures that every aspect of the support experience, from incident response to customer satisfaction, meets enterprise-grade expectations.
Service Reliability & Platform Stability
Own platform availability, performance, and reliability across all tenants
Reduce incident frequency, severity, and blast radius
Establish and drive Service Reliability Engineering (SRE) principles
Ensure scalability and operational readiness of a multi-tenant platform
Incident Management & Response
Implement and lead a structured incident management framework (P1–P4)
Act as executive owner of major incidents (P1/P2)
Drive improvements in:
Mean Time to Detect (MTTD)
Mean Time to Resolve (MTTR)
Ensure clear, consistent internal and external communication during incidents
Observability & Monitoring
Define and implement a comprehensive observability strategy, including:
Technical telemetry (infrastructure, application, APIs)
Business telemetry (transactions, payment success rates, usage)
End-to-end customer journey visibility
Ensure issues are detected proactively rather than reported by customers
Partner with Product and Engineering to embed telemetry into the platform
Support Operations (L1–L3)
Lead global support teams ensuring high-quality, SLA-driven case management
Define and enforce support processes, tooling, and performance standards
Improve key metrics:
First response time
Resolution time
Reopen rate
Escalation quality
Platform Operations & Change Management
Oversee operational aspects of the platform, including:
Release management and deployment safety, ensuring all releases are observable, reversible, and low-risk
Change control processes
Environment consistency across staging and production
Own the visibility and continuous improvement of delivery and recovery performance using the DORA metrics, in partnership with Engineering
Issue Management & Root Cause Discipline
Establish rigorous Root Cause Analysis (RCA) standards
Identify and eliminate systemic issues (not just symptom fixes)
Track and reduce recurring incidents
Feed insights into Product and Engineering roadmaps
Customer Experience & Commercial Alignment
Align support with Customer Success and Sales
Ensure coordinated communication during incidents
Protect customer relationships during critical events
Introduce tenant-aware impact assessment (ARR, strategic accounts, regulatory exposure)
Support enterprise-grade expectations for transparency and reliability
Cross-functional Leadership
Act as the bridge between:
Engineering
Product
Customer Delivery / Success
Embed supportability and operational readiness into:
Pre-sales (Stage 4/5 governance)
Product development
Deployment processes
Managed Service Governance
Chair regular operational reviews and quarterly business reviews with the managed service leadership team
Own the managed service scorecard — defining KPIs, reviewing performance data, and driving accountability for misses
Manage contract compliance, SLA adherence, and commercial exposure from managed service underperformance
Lead continuous improvement programs jointly with the managed service provider, including tooling upgrades, process redesigns, and training investments
Maintain an escalation path for systemic or persistent managed service failure, up to and including remediation planning
Required
10+ years in Support, Platform Operations, or SRE leadership roles
Proven experience in multi-tenant SaaS and legacy environments
Strong understanding of:
Distributed systems
Incident management at scale
Observability frameworks
Track record of building and scaling high-performing operational teams
Experience in outsourced or hybrid operational models
Experience working cross-functionally with Engineering and Product
Preferred
Background in payments, security, or compliance-driven environments (e.g., PCI)
Experience with API-first platforms and telephony/payment flows
Familiarity with observability tools (e.g., Grafana)