Picture the scene: it’s a nice calm day on the service desk. The clients are happy, the users are satisfied and the agents are calm and settled. The sun is shining and the birds are singing in the trees.
Five minutes later, all hell breaks loose and the phone lines are going crazy. The board showing incoming calls is the reddest red you have ever seen, and the agents are stressed and unhappy. The users are unhappy because they wait for hours in the call queue, only to be told that there is a major outage on the network. The clients are unhappy because they see the SLAs dropping, which means management is unhappy too. Even the sun has gone away and the birds have stopped singing.
Every experienced service desk agent knows that the scenario above is exactly what happens during a major network outage. We can have the best systems in place to prevent outages and major incidents, but this is IT, and things often happen that are beyond our control. What we need to do is respond to the network outage appropriately: with speed, accuracy and calmness. If we get this right, then as soon as the major incident is over everyone can go back to being happy and calm, and the board goes green once again.
5 ways to help
1. Quick reaction times – message on the phone system / website
A service desk should have a dedicated individual who monitors what is going on behind the scenes. He or she should be able to see which tickets are coming into the desk’s queue and which incidents people are calling about. Good filters on the ticketing system make this easy. If the individual notices a sudden rush of tickets for the same problem, he or she can place a message on the phone system or website informing callers of the issue and the expected time of resolution. Practice this often to improve both quality and speed.
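As a rough illustration of the kind of filter described above, the sketch below counts recent tickets per category and flags any category that crosses a threshold. It assumes tickets can be exported as (opened_at, category) pairs; the function name find_ticket_spikes, the 10-minute window and the threshold of 5 are all hypothetical choices for the example, not features of any particular ticketing system.

```python
from collections import Counter
from datetime import datetime, timedelta


def find_ticket_spikes(tickets, now, window=timedelta(minutes=10), threshold=5):
    """Return {category: count} for categories with at least `threshold`
    tickets opened within the last `window` before `now`.

    `tickets` is an iterable of (opened_at, category) pairs (assumed format).
    """
    cutoff = now - window
    recent = Counter(
        category for opened_at, category in tickets if cutoff <= opened_at <= now
    )
    return {category: count for category, count in recent.items() if count >= threshold}


# Example data: six VPN tickets in the last few minutes, one printer ticket,
# and one old VPN ticket that falls outside the window.
now = datetime(2024, 1, 1, 9, 30)
tickets = [(now - timedelta(minutes=m), "vpn") for m in range(6)]
tickets.append((now - timedelta(minutes=3), "printer"))
tickets.append((now - timedelta(hours=2), "vpn"))

print(find_ticket_spikes(tickets, now))  # {'vpn': 6}
```

A flagged category is a prompt for a person to investigate and, if it really is an outage, to put the message on the phone system; the right window and threshold depend on the normal call volume of the desk.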
2. Incident Management Team – constant communication
Depending on the size of your operation, your Incident Management team might work independently or within your service desk. Either way, the team should be trained to discover what the problem is and get straight onto the phone with the people who can fix it. Whatever the size of your team or operation, you will certainly have SLAs that have to be met when a major incident occurs. The Incident Management team is responsible for them, so it needs clear, reliable lines of communication with the service desk. Someone from this team should be available throughout your hours of coverage. It is not good enough to say that the Incident Management team will help you after they finish their lunch.
3. Have your team leaders prepared – earn their positions
This is where team leaders should be earning their extra money. Once the incident happens and the board starts going red, the team leaders should be communicating with the agents and explaining what the issue is, whether there is a workaround, what the estimated outage time is and what the master ticket number is. Team leaders should be well prepared and tested to handle these situations, and they should not be panicking or worrying, which might get everyone else worked up.
4. Triage – assess
In the army, triage of casualties means assessing which injuries are the worst and handling them accordingly. In the same way, triage of incidents means that your agents handle calls in the most effective order. For instance, suppose you have 10 calls in the queue. Most of them are about the outage and will take 30 seconds each, but one is an hour-long installation. Ask that user if you can call them back in a short while, and see to the shorter calls first. This way, the shorter calls will not wait an hour just to be told that the system is down, and everyone knows what is going on.
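The same idea can be sketched in a few lines of code. This is only an illustration under stated assumptions: the Call record, the estimated_minutes field and the 15-minute callback threshold are hypothetical, and in practice the estimate comes from an agent’s judgment rather than from data.

```python
from dataclasses import dataclass


@dataclass
class Call:
    caller: str
    reason: str
    estimated_minutes: float  # an agent's rough estimate (assumed field)


def triage(queue, callback_threshold_minutes=15):
    """Split a queue into calls to answer now (shortest first) and
    calls to offer a callback because they are expected to run long."""
    answer_now = sorted(
        (c for c in queue if c.estimated_minutes <= callback_threshold_minutes),
        key=lambda c: c.estimated_minutes,  # stable sort keeps arrival order on ties
    )
    call_back = [c for c in queue if c.estimated_minutes > callback_threshold_minutes]
    return answer_now, call_back


# The example from the text: one hour-long installation ahead of nine outage calls.
queue = [Call("user0", "installation", 60)]
queue += [Call(f"user{i}", "outage", 0.5) for i in range(1, 10)]

answer_now, call_back = triage(queue)
print(len(answer_now), [c.caller for c in call_back])  # 9 ['user0']
```

With a single agent, the nine outage calls are cleared in about four and a half minutes in total, instead of each waiting behind the hour-long installation.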
5. Reflect on what you did right and wrong
The system goes back online and the board goes green. Incident over? Not for the managers and team leads. Review the major incident to find out what went wrong and what worked well. If something can be done better, train and test the agents and the appropriate teams so they handle it better next time. Also remember that simplicity at the IT service desk is another important factor in customer service.