Team Bus Factors: How to Reduce Them and How to Prevent Them
There was a discussion on HN asking about good tips for team leaders. So I thought I'd chime in a bit more deeply.
The following is a new chapter from the upcoming second edition of my book for software team leaders.
They’re all around you, and they are huge risks to your projects. Bus factors.
A bus factor can be defined as “How many people need to get hit by a bus for the project or team to stop functioning?” A bus factor of 1 is the riskiest.
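To make the definition concrete, here is a minimal sketch in Python, assuming you can write down who is able to handle each critical area of the project. The area names, the people, and the functions `bus_factor` and `single_points_of_failure` are all hypothetical and made up for illustration. The sketch also assumes that the project stops as soon as any one area has nobody left who knows it.

```python
# Hypothetical map of critical areas to the people who can handle them.
# All names here are invented for illustration.
knowledge = {
    "build and release": {"dana"},
    "code review": {"lee", "sam"},
    "search server": {"dana", "lee", "sam"},
}


def bus_factor(knowledge):
    """Fewest people who must leave before some area has nobody left.

    Assumes the project stops as soon as any single area is uncovered.
    """
    return min(len(people) for people in knowledge.values())


def single_points_of_failure(knowledge):
    """Return the areas that only one person knows, mapped to that person."""
    return {
        area: next(iter(people))
        for area, people in knowledge.items()
        if len(people) == 1
    }


print(bus_factor(knowledge))
print(single_points_of_failure(knowledge))
```

With this made-up data the bus factor is 1, because only one person knows the build and release process. Even a rough table like this, kept by hand, is often enough to show where the risk sits.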
If you’ve worked in software for any length of time and I asked you “Do you know a person in your project who, if they disappeared tomorrow, would leave the project or team stuck?”, it is very likely that you could give me multiple names.
These people (roles, really) are what I’ll call “bus factors” from now on. I had a friend who joked that “every successful software company is hiding one ‘Yuri’ in the basement that does all the important stuff.”
Joking aside, “Yuri” represents talent, a specialist, acquired from overseas perhaps, that is hard to reproduce. Many companies indeed do have multiple “Yuri”s.
And they are a huge risk.
Let’s break down some of the reasons they are a risk:
- Having a single point of failure
- Having a bottleneck that slows things to a crawl
- Reducing morale/Avoiding morale boosting/Job insecurity
- Avoiding team growth
· Having a single point of failure
I once consulted at a large insurance provider. One of the projects, with about 200 developers, had just one build consultant who had been working there for several years, but was still an outside consultant. His job was to take care of the build and release. A release took a week. There was one release every month.
Everyone stayed out of each other’s way, and he minded his own business as well.
One day he got tired of working at the same place on the same project and transferred to a different consulting gig somewhere else. The project (the company’s main source of revenue) was unable to release software at all. No bug fixes to production, no new features seeing the light of day. Nothing.
It meant a huge loss of money, and possibly of new customers. Everyone was running around like headless chickens, trying to fix this. Eventually they called the build consultant back and paid him three times his usual hourly rate to sit there for a couple of weeks, with a mic and screen recording software attached to his machine, while he did two releases (a fake one and a real one). They paid dearly for getting rid of the bus factor.
Lesson: The fewer people you have who know an important part of your business, the more you will pay when they move on.
· Having a bottleneck that slows things to a crawl
One of the large projects I consulted for had a huge problem with code quality and slow release cycles. When I paired with developers, several things became very clear.
One developer I paired with was working on a long function that was unreadable and unmaintainable. He was just adding spaghetti code onto more spaghetti code.
When I asked him why he wouldn’t refactor his new code a little bit or just extract some stuff to an external method, his reply was “That’d be crazy. If I do that, I’m going to have to ask for a new code review for that code section, and those things take days until they're done by an architect.”
There were only two architects who were allowed to do code reviews, and they were busy writing new code most of the time, so they had little time to review and approve code changes. Getting code into even a regular dev branch would take days, which slowed the place down to a crawl.
When reviews take that long, it is harder to keep up with fixes, make changes quickly, or even show demos and get feedback fast enough within a sprint cycle.
If you’ve read about the “Theory of Constraints”, then a bus factor is a type of constraint, and thus everything will move only as fast as the slowest constraint allows.
· Job insecurity/Reducing morale/Avoiding morale boosting
There are also negative consequences for the bus factor person themselves.
I once consulted at a large software/hardware firm. The build process was managed by three people, and they were very defensive about teaching other people how to use it (more on job security in a moment).
One of them was adamantly against showing other people how the build worked, because he felt that this was his job. He was the specialist with 20 years of training on the job.
It can be said that being the only expert at something is a good way to have job security, but it is very much a double-edged sword.
An organization will usually seek to reduce the bus factor risk. A person who is a bus factor and resists reducing it will usually have to take a forceful stand against the organization, effectively holding the project or team hostage since no one else can fill the role.
The organization is then very likely to grab the first “match” that comes along to replace or augment that person, in order to reduce the bus factor. The more forceful he was originally, the less chance he has of staying on when someone else with the same or a better skill set comes along.
It’s the same reason people hate buying from monopolies and can’t wait to jump ship when an adequate competitor comes along, even if it is just on principle.
· Preventing team growth
I’ve seen many people quit their jobs because they weren’t learning anything new in them. They got bored of doing the same work, of not being challenged, and they moved on to more challenging pastures.
Bus factors are knowledge silos by definition. Knowledge silos that are not broken can lead to the “not my job” mentality. (Do yourself a favor and google “Not My Job Award” to see extreme versions of that mentality when it gets perpetuated by the organization itself.)
In that mentality, even people who want to learn and be challenged are going to have a hard time finding new challenges, since they are expected to stay in their own little cube. That, in turn, makes for a team of specialists, or, to put it more bluntly, a team of bus factors.
One of the places I was consulting for once had an issue: the search server the product was using was failing, and since it had been set up by an external company, nobody on the team knew how to fix it. The people from the external consultancy were on a company vacation or something of that sort, and it took three days to fix.
Here’s how it got fixed: A developer called the expert on the phone. The expert told the developer exactly what to type into the terminal. The developer blindly typed it into the terminal, waited, then said “ok it seems to work now” and hung up.
What a wasted opportunity to learn something that might really help solve this problem next time. Not even a “why did that command fix it?” or “Why did it fail? What can we do next time on our own to prevent this?”. Nothing.
This is what that mentality begets: lots of lost time, and people putting cubicles inside their minds, preventing learning.
Removing Bus Factors
There are several ways that I’ve used in the past to remove existing bus factors. Some of them are slower, but more effective in the long run. Others are faster but are a bit less effective in the long run.
- Pairing
Ask the bus factor to pair up for at least 30 minutes a day with one other person on the team. During that time, the less experienced person does most of the hands-on work, with the bus factor sitting next to them, coaching and explaining what to do next.
Do this until the less experienced person knows how to accomplish one task without the bus factor’s help. Then either move to a new task, or have the newly trained person stop pairing with the bus factor and pair with another member of the team to teach them the same task.
The best way to learn is to teach, and by teaching, the new person will have learned the new task in a much deeper way.
- Put them in charge of teaching
Have the bus factor be in charge of a project that requires multiple people to accomplish tasks relating to the bus factor’s area of knowledge. Then make sure part of that project is the bus factor teaching the others how to accomplish those tasks.
- Keep them from working in their area of knowledge
Ask the bus factor not to work hands-on in their area of knowledge for one day a week, while others try to take over. This might feel scary, but it is a great way to get others up and running fast.
- Apprentices
Assign a full time apprentice to that expert and make sure they pair as much as possible.
Avoiding Future Bus Factors
You can’t always prevent bus factors from sprouting, but you can minimize the likelihood of it happening by:
- Pairing
The more pairing your team does, the less likely it is that only one person knows how something works.
- 1-1 code reviews
This is almost as good as pairing: each code check-in gets reviewed personally, either in person or via a remote call (Skype etc.) with at least audio. This provides learning, since it is meant to be a conversation.
- Rotation (support, scrum master, build...)
Set a daily, weekly or bi-weekly rotation on tasks that are bus factors, and on new tasks to prevent them from becoming bus factors.
- Pushing people out of their comfort zone instead of asking the veterans to do it
If a new task comes up, select the person on your team with the least skill in that area to accomplish it (assuming it is not high risk). If it is high risk, save that thought for a low-risk task (don’t tell me everything is high risk!).
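The rotation idea above can be sketched as a simple round-robin schedule. This is only an illustration under stated assumptions: the function `rotation_schedule`, the team names, and the duty list are hypothetical, and it assumes there are at least as many people as duties so that nobody holds two duties in the same period.

```python
def rotation_schedule(people, duties, periods):
    """Assign each duty to a different person every period, round-robin.

    Assumes there are at least as many people as duties, so nobody
    holds two duties in the same period.
    """
    if len(people) < len(duties):
        raise ValueError("need at least as many people as duties")
    schedule = []
    for period in range(periods):
        # Shift the starting person by one each period; each duty takes
        # the next person in line after the previous duty's holder.
        assignment = {
            duty: people[(period + offset) % len(people)]
            for offset, duty in enumerate(duties)
        }
        schedule.append(assignment)
    return schedule


# Hypothetical team and duties; one period could be a day, a week, or two weeks.
team = ["ana", "ben", "chen", "dee"]
duties = ["support", "scrum master", "build"]

for number, assignment in enumerate(rotation_schedule(team, duties, len(team)), start=1):
    print(number, assignment)
```

Running the rotation for as many periods as there are people means everyone has held every duty once, which is the point: no duty stays with a single person long enough to become a silo.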
Next, we’ll discuss survival mode, which you can easily fall into if you don’t take care of your bus factors.