PoC: Incident Evacuation Drills Using Generative AI with Slack + AWS Chatbot + Bedrock
SRG (Service Reliability Group) mainly provides cross-functional infrastructure support for our media services: improving existing services, launching new ones, and contributing to OSS.
This article introduces a PoC for an incident evacuation drill, built with a Slack workflow, AWS Chatbot, and Amazon Bedrock, that our team created at a hackathon event.
SRG's one-day hackathon event "SRG TechFest"
Once a quarter, our team holds a hackathon event in which all team members participate.
Each team picks a topic related to the event's theme and spends the day working on it.
The overall theme of this event was "Utilizing generative AI from an SRE perspective."
My team worked on two things:
- Incident evacuation drills using generative AI
- Using a local LLM for Slack search
This article covers the former, "Incident evacuation drills using generative AI."
Incident evacuation drills using generative AI
Reason for choosing incident evacuation drills as the subject
Incident evacuation drills involve simulating an actual incident and checking escalation and response flows.
Running these drills improves our ability to respond to actual incidents and also helps develop junior members.
However, preparing a drill can be difficult, and it is not always clear where to start.
We therefore thought that by using generative AI to simulate an incident and practicing how to respond to it, we could apply what we learn to actual incidents.
Architecture
Triggering a Slack workflow sends a request to AWS Chatbot, which communicates with Amazon Bedrock and returns a reply.

Since the September 2024 update, AWS Chatbot can interact with Amazon Bedrock from Slack with no coding required.
This PoC was also built without writing any code.
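For comparison, here is a minimal sketch of what the same request and reply might look like if made from code instead of through AWS Chatbot. It assumes the integration talks to an Amazon Bedrock agent and uses the `invoke_agent` call of the boto3 `bedrock-agent-runtime` client. The function name `ask_agent` and all IDs are hypothetical, and none of this was part of the PoC.

```python
from typing import Any


def ask_agent(client: Any, agent_id: str, agent_alias_id: str,
              session_id: str, text: str) -> str:
    """Send one message to a Bedrock agent and return the full reply text.

    `client` is assumed to be a boto3 "bedrock-agent-runtime" client
    (or any object exposing a compatible invoke_agent method).
    """
    response = client.invoke_agent(
        agentId=agent_id,
        agentAliasId=agent_alias_id,
        sessionId=session_id,  # reuse the same ID to keep conversation context
        inputText=text,
    )
    parts = []
    # The reply arrives as a stream of events; reply text is carried in "chunk" events.
    for event in response["completion"]:
        chunk = event.get("chunk")
        if chunk is not None:
            parts.append(chunk["bytes"].decode("utf-8"))
    return "".join(parts)
```

With real credentials, the client would typically be created with `boto3.client("bedrock-agent-runtime")`. Passing the client in as a parameter keeps the function easy to exercise with a stub.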
What we built
Click a button in a Slack channel to launch the workflow.

When the workflow starts, select the service name and system configuration, then submit.

A message is posted, and the system configuration information is sent to AWS Chatbot.
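The message posted by the workflow can be thought of as a template filled in with the two form values. The sketch below is illustrative only: the `@aws ask <connector>` mention format, the connector name `incident-drill`, and the prompt wording are assumptions, not the exact text used in the PoC.

```python
def build_drill_request(service_name: str, system_configuration: str,
                        connector_name: str = "incident-drill") -> str:
    """Compose the chat message that asks the chatbot to start a drill.

    The "@aws ask <connector>" prefix and the default connector name are
    assumptions made for illustration.
    """
    if not service_name.strip() or not system_configuration.strip():
        raise ValueError("service name and system configuration are required")
    # Collapse newlines and repeated spaces so the request stays on one chat line.
    config = " ".join(system_configuration.split())
    return (
        f"@aws ask {connector_name} "
        f"Service: {service_name.strip()}. "
        f"System configuration: {config}. "
        "Simulate an incident for this system and ask how I would respond."
    )


print(build_drill_request("media-api", "ALB -> ECS\n-> Aurora"))
```

Validating the form values before posting avoids sending the model an empty scenario, which would most likely produce a generic question.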

After a few seconds, the chatbot posts a question as a reply in the thread.

Press the button for submitting a response, enter your response plan as text, and send it.

Occasionally, the chatbot asks follow-up questions.

Click the Submit Answer button and enter your answer.

Finally, you will receive a score, evaluation, and commentary on your actions so far.
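The exchange above is essentially a turn-based loop: a scenario request, one or more participant answers, and a final request for evaluation. A minimal sketch of that loop follows, assuming the model is reachable through some callable `ask_model` (hypothetical) that takes the conversation so far and returns a reply. The prompt wording is also an assumption.

```python
from typing import Callable, List, Tuple

Turn = Tuple[str, str]  # (speaker, text)


def run_drill(ask_model: Callable[[str], str], service_name: str,
              system_configuration: str, answers: List[str]) -> List[Turn]:
    """Run one drill and return the full transcript as (speaker, text) turns."""
    history: List[Turn] = []

    def send(text: str) -> str:
        history.append(("participant", text))
        # Send the whole conversation each time so the model keeps context.
        prompt = "\n".join(f"{speaker}: {body}" for speaker, body in history)
        reply = ask_model(prompt)
        history.append(("model", reply))
        return reply

    send(
        f"Start an incident drill for service '{service_name}'. "
        f"System configuration: {system_configuration}. "
        "Describe a simulated incident and ask how I would respond."
    )
    for answer in answers:
        send(answer)  # the reply may be a follow-up question
    send("The drill is over. Give a score, an evaluation, and commentary.")
    return history
```

In the PoC, the Slack thread plays the role of `history`, and the participant's button presses decide when each answer is sent.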

Thoughts
Good points
- Although accuracy is low, it may be useful for thinking through the initial steps of troubleshooting.
- The "What should I have done?" advice could help train junior members.
Weak points and areas for improvement
- To determine whether the escalation flow will work, you need to actually involve people in the training.
- Scoring is lenient: high scores are awarded even when the investigation described is vague.
- It would be good to define a next step after this incident evacuation drill workflow.
- Example: If the escalation flow is not yet established, point participants to the information needed to establish it.
Conclusion
It's still pretty rough, but I think we are starting to see where this could lead.
The event was a good opportunity and it was fun to get to work on areas that are difficult to tackle in my regular work.
In the future, I would like to have people outside our team try it out and get feedback so that we can make improvements.
SRG is looking for people to work with us. If you are interested, please contact us here.