Friday, March 4, 2016

My Brain Hurts

Customer Rep: "We urgently need your help to fix this problem we're having. Stuff is on fire!"

Me: "Ok, cool. I can give you some time this week and next."

Customer Rep: "Alright. As soon as you can, please come on site and help us."

Me: Arrive on-site. Listen to customer's problem report. "How long has this issue been going on?"

Customer Rep: "The customer alerted us to the problem last month. We've been trying to fix it since then, but things are only getting worse as time goes by."

Me: (brain showing first signs of imploding) "...aaaand when did your customer first notice this critical problem?"

Customer Rep: "Maybe a few weeks or a month before they reported it to us?"

Me: "...and this is a critical problem?"

Customer Rep: "Yes: the customer can't do any of the things they need to do and the stuff that was working the last time you were here no longer works."

Me: "Did anyone open a support case with the vendor?"

Customer Rep: "We wanted to wait for you to help us. However, we do have some vendor consultants coming in, next week."

Me: "Aaaalright... Just so we can get the most out of those consultants, I need you to open a case so we can get appropriate notes put in and appropriate backline experts looking at the problem so that the guys coming onsite don't have their time wasted."

Time passes

Me: "Can you give me the number of the case you opened? I want to put it in my notes."

Customer Rep: "No. We didn't open one. We figured we'd open one next week when the rep got on site if he said he needed one."

Me: "...Open the case, now, please. It takes the vendor time to identify appropriate resources and we need to ensure those resources are lined up ahead of time. Give me the case ID when you're done."

More time passes. I have time to begin preliminary investigations prior to vendor's arrival. I make an initial set of recommendations to help address some of the issues reported.

Customer replies (through customer rep), "We don't want to make those kinds of changes."

More time passes, and the vendor comes on-site. By this time, I've done a lot more digging and have identified a set of more concrete symptoms and potential causes and cures. I consult with the vendor's on-site and backline engineers. Some of the observed issues can be addressed via a very recently released emergency patch.

Me: "Vendor has supplied a patch that will address some of your customer's problems. We'll need to take the service offline, for a bit, to apply the patch."

Customer Rep: "Ok, cool. Lemme go ask the customer for a downtime to apply the patch."

Me: "Aaaalright. I coulda sworn you told me that this thing was useless to your customer in its current state?"

Customer Rep: "Well, he's able to do some stuff on it, it's just taking a really long time to do those things and we had to turn off a whole bunch of other things. I know that the customer will want his active tasks to complete before we take the whole shebang offline."

Me: "You do remember me telling you that I saw your customer's scheduled tasks kick back off, every day, at X o'clock, right? You really need to get the patch applied and verified well before that time ...and there's less than two hours till X o'clock."

Customer Rep: "Customer said we can't do anything until his task finishes."

Me: "Remember me telling you about the new task kickoffs? Looking back in their task-tracker's history, those currently running tasks are going to overlap the soon-to-be-started tasks. And the soon-to-be-started tasks will overlap with the next day's tasks: there's not going to be a window during which nothing is happening. We need to take an outage to apply these patches if you ever want to start getting this software back to a working state."

My brain hasn't completely imploded, yet, but I feel this expression crawling across my face every time I have to deal with customers like this one.


First: if something's been going on for days, weeks or months, it's not an emergency.
Second: if fixing your problem is less important than continuing to limp along in your "broken" state while you wait for always-overlapping tasks to finish, then your problem really isn't an emergency.
Third: if someone suggests adjustments that can improve performance without causing a downtime, yet you refuse to make those adjustments, your problem really isn't an emergency.