StackPulse Debuts Automated Kubernetes Troubleshooting & Remediation Tools

SRE Playbooks and automated diagnostics help enterprises deliver reliable production-grade Kubernetes applications

PORTLAND, Ore.--(BUSINESS WIRE)--StackPulse today announced a Kubernetes-centric “operations center” initiative as part of its Reliability platform and will showcase it at next week’s KubeCon + CloudNativeCon Europe 2021 conference. With these additions, StackPulse gives organizations running Kubernetes a set of capabilities that augment their existing incident response practices, helping Site Reliability Engineers (SREs) understand and investigate issues faster and deploy well-tested outage mitigation strategies that help prevent customer-facing downtime.

The 15-month-old company, which exited stealth mode in January with $28 million in funding, automates tasks associated with outage response so that SRE and DevOps teams can recover applications more quickly, reducing lost revenue and avoiding degraded customer experiences.

Since Kubernetes is the de facto standard for running containerized applications, StackPulse wanted to create a set of code-based tools engineers could use to operationalize incident response for production Kubernetes-based applications. When an error is detected in a Kubernetes environment, StackPulse automatically executes diagnostic steps to gather information from the clusters and assists engineers in performing root-cause analysis. This automation helps them quickly identify how to mitigate and resolve an issue. Additionally, StackPulse has released more than a dozen playbooks built by SRE experts that remediate common Kubernetes problems. Using the StackPulse platform to automate these playbooks significantly reduces the time to resolution, helping teams restore services faster and meet SLOs.

“If you're serious about cloud-native, you're using Kubernetes, but it requires learning new concepts, and tuning applications alongside infrastructure for best performance,” said Leonid Belkind, CTO and co-founder of StackPulse. “While developer teams push to adopt K8s due to the benefits in velocity it brings, it can be hard for Ops teams or on-call developers to know how to respond to alerts, or fix issues in production. This leads to costly incidents and outages. What we’re releasing today is a set of automated tools for diagnostics, mitigation, and remediation that help any Kubernetes environment operate with the best practices of planet-scale Kubernetes shops.”

All the Kubernetes tools and automated diagnostics are available to teams in the same platform as StackPulse's incident response functionality so teams can communicate during outages, centralize event data, and take action to remediate. From detecting issues by correlating signals from multiple sources to enriching alerts sent to on-call teams with root cause and remediation information, StackPulse drastically decreases the customer impact of production issues, helping stop outages in their tracks.

Additional Resources

Visit the StackPulse Automated Kubernetes Operations and Troubleshooting page featuring playbooks, tools and automated incident diagnostics
Access the Kubernetes Playbooks or the StackPulse Playbook Library in full on GitHub
Get started with a StackPulse trial
StackPulse is proud to sponsor KubeCon + CloudNativeCon Europe 2021

About StackPulse

StackPulse keeps software services reliable, from alert to resolution. Intuitive incident response and powerful playbook orchestration help you automate your incident process to mitigate and remediate faster.

StackPulse is backed by GGV Capital and Bessemer Venture Partners, and was founded in 2020 by a team with a shared history of working together across numerous DevOps, Cloud Security and IT startups. Currently, the company has offices in Portland, Ore.; Tel Aviv, Israel; and New York.

Contacts

Josh Thorngren
StackPulse
+1 971 645-7736

Bill Hankes
Waters Agency
+1 206 883-7658
