Archive for Cognitive Bias
Trust your instincts, but don’t explain them
Have you had the experience of looking at a situation and knowing – just knowing – that something was wrong? Perhaps it was a user interface design or a software diagram or project schedule. I’m not talking about a dispassionate concern but an emotional response – you recoiled inside and just knew. Then you had to explain to the person or team that presented the situation why this was the case. Most likely you could draw on your experience and come up with a few very convincing explanations for your gut reaction, but you probably walked away unsettled. It still wasn’t right, but you just couldn’t put your finger on it.
Most professionals develop an instinctive ability to size up situations within their core expertise. For example, a seasoned product manager who has worked up through the ranks of developers can often look at a schedule and get a quick feel for whether it can or can’t be met. Most of the time this intuition expresses itself as a strong gut reaction that you can’t explain. Malcolm Gladwell’s excellent book Blink discusses this phenomenon; it is well worth reading so that you know what to do the next time you have this reaction.
The basic challenge is this: Even if your instinctive reaction is correct, it isn’t particularly useful until you can explain why. It’s very tempting to try to verbalize the rationalization for your reaction, but don’t. The problem is that your rational mind has no idea why you had an emotional reaction. The emotional reaction is your clue – your intellect and emotions aren’t really connected that well. They need to arrive at their conclusions independently: For you to intellectually justify your reaction you need time to perform a dispassionate intellectual process. Your instinctive emotional reaction didn’t need that time, but it can’t explain itself.
The Path Through The Woods
So if your emotional reaction is as reliable as your intellectual evaluation, but you can’t articulate it right away, what should you do? The first thing is to develop a code phrase for your team that indicates that this is your blink reaction, and not something else. This lets everyone weigh it correctly: It isn’t because you just don’t want to do it or you like another option better, it’s that there’s something instinctively wrong with it. Your team should give it the same weight as if you had just expressed a cogent, reasoned argument for there being a problem. Second, the reaction can’t be challenged – at least not directly. It’s an emotional response, so any challenge will drive the participants straight from collaboration into conflict. In our shop, we just say “I have a blink about this…” That cues everyone in.
At the same time, it’s important to recognize that your blink can and will be wrong, sometimes by a great deal. It’s no more accurate than a rational analysis of the same circumstance, and that means if the circumstance is something notoriously hard to predict like an election, a project schedule, or a roll of dice then it’s not going to do better than spending some quality time in analysis. This means that when you have a blink reaction, your team should continue with the assumption that the reaction is dead on, but casually seek out the proof.
Casually Find The Evidence
Once you’ve articulated your instinctive blink reaction, have the team take the case that it’s true and then as discussions continue identify facts and data that support the reaction. Eventually, one of a few things will happen:
- Killer supporting evidence will emerge: In the process of going through subsequent analysis, you’ll find the evidence that suddenly makes the reason behind your reaction clear to the team. It’s now an intellectually reasoned response; you just got there faster instinctively.
- You’ll realize your reaction was wrong: As time passes, your mind will keep attempting to align facts with your reaction, seeking to prove or disprove it. Eventually you, or your team, will see what it was that your mind originally caught on and understand that it doesn’t apply in this case: The reaction was wrong, and now you can proceed. This was still an important exercise because you now have validation of your course of action (“we can ignore this previous best practice because it no longer applies because…”).
- You’ll look at the problem tomorrow and get over it: Perhaps what you felt was just the normal human fear of change. So be it. Now that you aren’t feeling threatened any more, your rational mind can reassert itself, look at the item more objectively, and be open to new possibilities.
Each of these is a powerful collaboration result because it lets the team and the individual practice and demonstrate the characteristics of a trusting, supporting environment. The team shows respect for the individual participant and leverages the experience of everyone, not just those with great oratorical skills. The individual gets to articulate something they feel very deeply without being embarrassed or having to justify and then later defend something that they just can’t. Everyone is in on it – and when the instinct resolves into intellect later, everyone will have better insight into the experience and thinking of the person who voiced it. This insight develops trust between the individuals that will live beyond the team.
Finally, when the team doesn’t push for an immediate defense and then challenge it, it demonstrates and reinforces that this is a safe environment to voice ideas and opinions, and it builds credibility for when a misstep happens.
Gaining Speed
Over time, as your team works together, people may be able to help each other articulate their blink reactions into concrete issues to be resolved, based on understanding the key emotional drivers each participant brings to the table. In our company, we have some folks who tend to have reactions over usability, others over ultimate performance (we’ve dubbed one “the keeper of the nanosecond”), and others over code simplicity. Knowing these points helps us not misread emotional reactions to ideas and helps each other understand what we really need to do to create the best outcome.
With practice, you’ll find your team can get to great, collaborative conclusions faster and generate the buy-in from the participants with little or no effort.
What Gets You to Blink?
Looking back, when did you have a blink reaction to something? What did you do with it? Share your story by posting a comment or dropping me a line.
Three Levels of Developer Zen
One of the first things I listen for when interviewing developers is how they relate to problems with authoring code, namely how they approach a defect. Depending on how much experience they have and how much they’ve thought about it, I’ve found developers go through three levels of Zen when considering problems:
It’s Not the Compiler
Similar to the classic discussion that Select isn’t Broken, every developer when first starting out runs into something they are just sure is the compiler’s fault. The code is right, the compiler just doesn’t agree. It may be tempting to even use #pragma or the equivalent in the language to get rid of the warning or error.
In the end, it isn’t the compiler, it’s your code. Try explaining carefully and thoroughly to another developer why the code is right and the compiler is wrong, and most of the time in the middle of your explanation it’ll hit you what the real problem is, and that the compiler was right (if perhaps obtuse in how it is reporting the problem). This is an excellent trick to breaking through cognitive bias: Explain to someone else exactly why the world is flat, and in so doing your mind will often make the creative or fresh jump to what the real problem is.
It’s In Your Code
It’s one thing to trust the compiler – compilers have generally matured over decades, and even compilers for newer languages like C# and Java have very low defect rates because even if the syntax is new, the basic lexical parsing is very close to C++ and the long-standing family of engineering behind it.
If the compiler is the most trustworthy code, what’s the least trustworthy? The code you’re writing right now. Everything else is more tested and more reviewed. In general, the probability you’re going to run into a defect in an operating system library or framework is relatively low, low enough compared to the probability of defects in your own code as to be statistically insignificant.
Even in areas of high churn, like evolving web standards and CSS for Internet Explorer, if the output isn’t right the defect is virtually guaranteed to be in your code.
It Doesn’t Matter Where It Is – You Have To Fix It
The ultimate evolution of this thinking is practical: It doesn’t matter where the defect is, you have to fix it – and you have to fix it in your code (or at least code you control). In the end, you’re writing software to deliver features for someone, and you’re only successful if it works. They really aren’t going to care whether you can’t make headway because of a problem in your code, an underlying library, or the operating system: They all produce the same effective result of the feature not working. So in the end, it isn’t a question of who’s right but rather what you’re going to do to resolve it.
There is some danger in this last point. There is a fine line between pragmatically realizing you have to make it work one way or another and just hacking away at a problem you don’t understand under the assumption that you’ve found an underlying defect you need to work around. Before you assume this is the case, don’t just prove to your own satisfaction that you’ve found an honest defect in another library; seek some independent evidence of it. A knowledgebase article, a defect report from the library owner, or at least corroboration from your peers is a good place to start. Developers who charge straight to attempting to work around a perceived defect in a library without verification are really back at the first level.
Listen To Where You Are
When I’m talking with developers, I can usually find out where they are by listening for key phrases, such as “it works on my box.” If you hear that and it’s offered in anything other than a sheepish or self-deprecating tone, this developer probably has a way to go. Listen to your own language – what do you say when you run into something? If you find yourself getting into a cognitive bias death spiral, that’s the time to get up, take a break, and come back to it. Repeating my favorite trick: Go find a talented friend and explain to them just how right your thinking is. In my experience, that exercise invariably has me figure out where I’m off track even if the other party can’t follow what I’m saying.
Editor’s Note: Sometimes you let an article spend too long cooking before you publish it. Jeff Atwood came at this from a very similar angle on Coding Horror while this was in review. Check out his article as well.
Aviate, Navigate, Communicate
If you’re involved in IT operations or even in business long enough, you’re going to experience some emergencies. During these emergencies, you’re going to have to balance several conflicting things that will demand your attention simultaneously:
- Cause of the problem: What is really happening? What device is at the root of the problem? (network switch died because an admin configured a loop in the fabric and misconfigured the port)
- Scope of the problem: Just how bad is it? Problems usually show up in one place (users can’t access Exchange) but those symptoms often represent a larger problem (network switch died)
- Communicate with users: First, people will be coming in the door to report the problem (do you know that Exchange is down?) and will be expecting updates on what’s going on and when it’ll be resolved (I really need to tell my friend about a party tonight, when will email be back up?)
Even in a shop with healthy staffing, this can be a lot to handle at once, particularly because your impulse is going to be to move between the root cause and communication. The first because it’s the real high value item: fix the problem. The last because whenever someone walks in, you’ll want to tell them what’s going on. The higher up the chain of command they are, the better you’ll want it to sound.
Whenever I’m wondering how to look at an IT Operations problem from a different perspective to gain insight, aviation is the first place I go. Think about the modern air transport system in the United States not from your usual perspective (a passenger on a plane) but from the standpoint of the people that live within it and operate it. For example, the life of a flight deck crew isn’t that different than system support in the sense that you have long periods of routine punctuated by periods of high stress activity. A classic rule taught to pilots when they’re first being trained is Aviate, Navigate, and Communicate – in that order.
- First, fly the plane. (Be in the middle of the air, not the bottom)
- Figure out where you are. (Over the White House)
- Then communicate. (Sorry Tower, would you like us to land?)
To make things easier on commercial planes, you have a pilot and co-pilot that divide these responsibilities by having clear designation of one being the Pilot Flying and the other (called the Pilot Not Flying or Pilot Monitoring) responsible for navigation and communication. This is practiced carefully during training with different parts of each emergency checklist assigned to either the Pilot Flying or Pilot Monitoring.
Now apply this back to a system problem:
- Create Clear Roles: Have your team know who is going to take on the role of Admin Flying and Admin Monitoring. This shouldn’t always be the same person – it may be based simply on rotation (who is “up”) or who gets the trouble ticket or whatever works within your shop. Each team member should declare their role at the start of a situation so everyone knows who is doing what.
- Perform in Order: If you have an Admin Monitoring, it’s their role to intercept external communication while the Admin Flying is working on the problem.
- Make a Checklist: When there is an emergency isn’t the time to be winging it. During quiet moments, talk as a team about what you would do in a hypothetical situation and work to distill out a basic checklist of things you’re going to run through. Focus on having it be the shortest list that verifies the largest set of items. When a problem shows up, use the checklist.
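As a rough illustration of what a written-down checklist might look like once it moves out of people's heads, here is a minimal Python sketch. The names `StepResult` and `run_checklist` are hypothetical, and the assumption that each step is a function returning a pass/fail flag plus a short detail string is mine, not a prescription. The one design choice it tries to show is that every step runs even when an earlier one fails, so the team gets the full picture before forming a theory.

```python
from typing import Callable, List, NamedTuple, Tuple


class StepResult(NamedTuple):
    """Outcome of one checklist step (hypothetical structure)."""
    description: str
    passed: bool
    detail: str


# A step is a description plus a function returning (passed, detail).
Step = Tuple[str, Callable[[], Tuple[bool, str]]]


def run_checklist(steps: List[Step]) -> List[StepResult]:
    """Run every step in order and record the result of each one.

    A failing or crashing step does not stop the run: the point is to
    gather all the facts first, then diagnose.
    """
    results = []
    for description, check in steps:
        try:
            passed, detail = check()
        except Exception as exc:  # a broken check is itself a finding
            passed, detail = False, f"check raised {exc!r}"
        results.append(StepResult(description, passed, detail))
    return results
```

Each real step (ping the switch, check the mail queue, and so on) would be supplied by your team; the runner only fixes the order and records what was found, which also gives the after action review something concrete to look at.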
Problem Checklists
There are a few great advantages to using a checklist for problems:
- Reduce Solution Focus: When diagnosing a problem, the general process is to propose a theory and then test it to either prove or disprove it. This creates cycles where you form theories you have to believe in, and then your job is to prove yourself wrong. It turns out that people naturally lean towards information that proves them right and away from information that’s inconsistent with their diagnosis. Checklists for diagnostics can ensure that a significant breadth of information is available at the start of this process so that the best theories can be created quickly.
- Creates a Pace: It’s easy to get caught up in an emergency and start working at a pace that really isn’t necessary, but degrades your accuracy and effectiveness. Checklists stop the emotional cycle that reinforces the early stages of emergencies and instead create a steadily paced environment of gathering and verifying facts.
- Establish a Baseline for Improvement: One of the most important parts of any emergency, and the least frequently used effectively, is an after action review. After you’re back up and everyone has calmed down, you want to learn as much as you can from what happened. The existence of a checklist creates a baseline for systematic (as opposed to random or by chance) improvement to your team’s ability to handle future problems. This is true even if the checklist wasn’t used; the fact it wasn’t used is itself an indictment of either the checklist itself or the team’s training.
While initially it may feel corny or even overly dramatic or bureaucratic to create checklists, there is real evidence to back up using them in environments where the downside cost (crash and death) is very steep, and if pressed to admit it most engineers will confess they have a mental checklist they use for standard problems.
Plans are Useless, Planning is Priceless.
Just by creating the checklists (even if they were never used) your team can get a lot of value:
- Cooperative learning: This is a great tool for the team to learn from each other. Each admin will share their best tips and tricks from their mental checklist and be surprised that they don’t line up. Where they don’t, the discussion on which approach is better and why is gold. It’s hard to get the same result with a contrived exercise, so use this opportunity to build the checklist and maintain it as a team.
- Clarifies Automation: Creating the checklist will naturally precipitate ideas for how to automatically check, and possibly resolve, steps in the checklist itself. For example, if a step in the checklist is to verify Internet connectivity, how are you going to accomplish that? Instead of having an ad-hoc mechanism, can an automated mechanism be put in place so that you can quickly check that data point without variation?
- Encourages Collaboration: If the team collaborates to create the checklist, when a problem occurs they will be more likely to collaborate on resolving the problem because they already have had the experience of working together as a team. This will tend to replace individual ego with group esprit de corps.
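To make the connectivity example above concrete, here is one possible way to turn "verify Internet connectivity" into a repeatable check in Python. This is a sketch under assumptions: it treats "connectivity" as "a TCP connection to a known host and port succeeds within a timeout", and the function names `can_connect` and `check_connectivity` are hypothetical. Which hosts and ports actually prove connectivity in your environment is a decision for your team.

```python
import socket
from typing import Dict, Iterable, Tuple


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) opens within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def check_connectivity(
    targets: Iterable[Tuple[str, int]], timeout: float = 2.0
) -> Dict[str, bool]:
    """Run the same check against every target, keyed by "host:port"."""
    return {
        f"{host}:{port}": can_connect(host, port, timeout)
        for host, port in targets
    }
```

Calling `check_connectivity` with a short list of targets your team has agreed on (for example an internal gateway and one outside site) gives everyone the same answer to the same question, instead of each admin pinging whatever comes to mind.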
An Exercise Left to the Interested Student
A friend of mine also pointed out the principle that if you have a checklist that always ends in the same action, why not automate the action in response to the checklist? In other words, if you can automate the detection steps that lead up to the action, then find a way to automate the resolution. You will often find you get here in inches: You progressively improve your monitoring so that you can find problems faster. Once this is reliable, you start just hooking up alarms to the monitoring so you don’t wait for a call from a real user or a higher level system. Once that’s working well enough, you get tired of performing the resolution manually so you write a script that takes a few arguments to perform the resolution. Now, just connect them together.
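The last step, connecting detection to resolution, might look something like the following Python sketch. Everything here is illustrative: `AutomatedResponse` and its fields are hypothetical names, and the `detect`, `resolve`, and `alert` callables stand in for whatever monitoring check, resolution script, and notification mechanism your shop already has. The sketch adds one assumption of its own, which is that the check is rerun after the resolution so a fix that didn't work gets escalated to a human rather than silently retried.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class AutomatedResponse:
    """Ties an automated detection step to its usual resolution."""
    name: str
    detect: Callable[[], bool]    # returns True while the problem is present
    resolve: Callable[[], None]   # the script you used to run by hand
    alert: Callable[[str], None]  # how humans hear about what happened

    def run_once(self) -> str:
        """Check once; if the problem is present, resolve and re-check."""
        if not self.detect():
            return "ok"
        self.alert(f"{self.name}: problem detected, running resolution")
        self.resolve()
        if self.detect():
            # The usual fix did not work, so hand it back to a person.
            self.alert(f"{self.name}: resolution did not clear the problem")
            return "escalated"
        self.alert(f"{self.name}: resolved automatically")
        return "resolved"
```

A scheduler or monitoring loop would call `run_once` periodically. Keeping the alert in place even when the resolution succeeds means the team still sees how often the automation fires, which is useful input for the after action review.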
Move Forward One Step Today
The best part about this is that you can get there in small steps that even the busiest team can fit into their schedule with a confidence that they will pay back in time saved in the future. With practice, it will become second nature and make it easier for your team to accommodate new processes and service requirements with ease. In the end, isn’t that what you need to ensure your team is viewed as a vital part of your organization?