PagerDuty Advance - A Discovery into AI

Summary

With AI transforming industries, we saw an opportunity to explore how it could enhance our product and improve the incident response experience for our users. This research focused on understanding how users interact with PagerDuty, their incident management workflows, and the challenges they face. By uncovering these insights, we aimed to identify opportunities where AI could streamline processes and reduce the effort required to resolve incidents.

Tools

  • Salesforce

  • Outreach.io

  • Zoom

Type & Timeline

  • Discovery

  • 3 months

Background

Through past research and conversations with customers, we knew that most of the work and communication during incidents happens in chat tools such as Slack or Microsoft Teams rather than in our web app. We decided to start by exploring AI solutions in this area. We aimed to learn more about how our users communicate during and after incidents, what pain points they face, and how we could expand our toolset and capabilities to improve their experience.

Goals

This project aimed to:

  • Understand incident communication workflows:

    • What communication channels are used?

    • What language and tone are common?

    • How do stakeholders and end-users interact?

    • How are post-incident reviews conducted?

    • What aspects of communication can be automated?

  • Identify opportunities for AI in incident management:

    • Focus on the needs of admins and managers

    • Explore AI’s role in streamlining workflows and decision-making

  • Investigate how users currently integrate AI into their processes

Methods

Since we wanted a broad overview of how our users work in the product or our integrations, we decided interviews would be the most appropriate research method. We created a script that addressed our goals while leaving room to ask follow-up questions and probe further into participants' answers. I ran the interviews as the researcher, and the team's designer and product manager joined to provide support and take notes.

Participants

For this discovery, we decided to speak only with users in admin or manager roles, since they oversee incident response and have the most insight into how new tools and solutions are implemented. In the end, we ran five interviews with PagerDuty users in these roles. Participants were recruited through our Customer Success team or our design partners program. The design partners program is an initiative started by our UX Research team that allows users to opt in to take part in research when relevant projects come up. We reached out to these customers using Salesforce and Outreach.io, and all interviews were conducted over Zoom video calls.

Results

Analysis

Once the interviews were completed, we compiled our notes and performed a thematic analysis as a team, using affinity mapping to collaborate on ideas and identify the major themes that emerged from the data.

Key Findings

Some of the key findings regarding incident response and AI included:

  • Differences between internal and external communication

  • Various methods, processes, and forms of contact for internal versus external communication

  • Post-incident review/postmortems

    • Time-consuming and inconsistent

  • Too much data coming in from various sources

  • More work is being done in chat tools or integrations rather than in the product

  • Difficulties with catching up on an incident and actions that have already been taken

Outcomes

With the insights uncovered from this discovery, as well as other sources of data, the PagerDuty Advance team created an AI chatbot that lets users take a variety of actions and aims to reduce manual work. One major feature is a “catch me up” option, which provides a summary of what has occurred during an incident so far and allows those joining late to quickly understand the context. Another major feature is “wrap me up”, for use after an incident has been resolved. This prompt creates a post-incident review of what occurred during the incident and what steps were taken to reach resolution, which is then integrated with Jeli, our dedicated postmortem tool. Additionally, we included the option for the AI tool to create a post-incident review automatically, saving our users time and manual effort. We also added the ability to use AI to send status updates and to get more information about an incident, including root cause analysis and related incidents.