
A case study: how Microsoft could not properly manage a team until it adopted Kanban

Updated: Oct 15, 2021

This article is a continuation of my previous blog post.


David Anderson


David Anderson was one of the first coaches to teach Kanban to IT professionals. He has implemented Agile methods at large companies such as Sprint, Motorola, and Microsoft.


David is the author of several books. His Kanban "Blue Book" is seen by Kanban trainers as a "must have" book for all professionals getting familiar with the Kanban method.


In his Kanban "Blue Book" he presents a famous case study, the first implementation of the Kanban method at Microsoft in 2004/2005, known as "The Microsoft XIT Sustaining Engineering Story".



The Microsoft XIT Sustaining Engineering Story


The story presented by David Anderson describes the situation in 2003 of the XIT business unit team at Microsoft, "a team that was known for having the worst customer satisfaction record across all of Microsoft’s IT organization." It is also the story of Dragos Dumitriu, a Romanian-immigrant program manager who, soon after joining Microsoft, decided to take on the challenge of leading the team.


Dragos and his team (the "Sustaining Engineering" team) were responsible for software maintenance for the XIT business unit. The team consisted of 3 developers, 3 testers and a local functional manager, located in Hyderabad, India. A few years earlier, Microsoft had made a strategic decision to outsource the IT functions that were not seen as essential to its business.


The team had to perform the following work:

  1. minor upgrades known as change requests.

  2. defect fixes for about 80 IT applications used at Microsoft.

The work came from 4 different product managers at Microsoft who led other departments.


The team's performance was very poor: the average lead time on change requests was five months, and for any one item the lead time from commitment to delivery could be anywhere from 6 weeks to more than 1 year. This was the reason why the program manager position had been vacant for several months.


When each request for a change or defect fix arrived from a product manager, Dragos would send it to India for an estimate. The policy was that estimates had to be made and returned to the business owners within 48 hours. Once a month, Dragos would meet with the product managers, and they would reprioritize the backlog and create a project plan from the requests.


The service level agreement, to return the estimates within 48 hours, was viewed as having higher priority than the planned work. In many instances, estimates were produced for potential work that was later discarded.


Team performance in numbers


The requests for estimates ranged from 18 to 25 a month, and each of them took a developer a whole day of work. The planned and committed work ranged from 9 to 13 items per month, with an average implementation time of 5 days.

Only 6 planned work items were delivered each month.


The delivery capability was calculated at an average lead time of 5.5 months, growing by 0.5 months every month, while 0% of the items were delivered by their committed dates.


Because only 6 items were delivered each month and the committed backlog consisted of at least 80 items, the entire backlog had to be reprioritized and re-planned each month, with a single request being re-planned 4 to 5 times.
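
To put these figures side by side, here is a small back-of-the-envelope calculation in Python. It is only a sketch: the 20 working days per month is an assumption (the same round figure used in the thought experiment later in this article), and it takes the article's figure of one developer-day per estimate at face value.

```python
# Figures quoted in this article
DEVELOPERS = 3
ESTIMATES_PER_MONTH = (18, 25)    # low and high end of the range
DAYS_PER_ESTIMATE = 1             # developer-days spent on one estimate
COMMITTED_PER_MONTH = (9, 13)     # new planned items, low and high
DELIVERED_PER_MONTH = 6

# Assumption, not from the case study: 20 working days in a month
WORKING_DAYS_PER_MONTH = 20


def estimation_share(estimates):
    """Fraction of total developer capacity consumed by estimating."""
    capacity_days = DEVELOPERS * WORKING_DAYS_PER_MONTH
    return estimates * DAYS_PER_ESTIMATE / capacity_days


def backlog_growth(committed):
    """Net number of items added to the backlog in one month."""
    return committed - DELIVERED_PER_MONTH


for estimates in ESTIMATES_PER_MONTH:
    print(f"{estimates} estimates use {estimation_share(estimates):.0%} of developer time")

for committed in COMMITTED_PER_MONTH:
    print(f"{committed} committed items grow the backlog by {backlog_growth(committed)}")
```

Under these assumptions, estimating alone takes roughly 30% to 42% of the developers' time, and the backlog grows by 3 to 7 items every month no matter how it is reprioritized.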


The department could not keep its delivery promises, and this was a major cause of customer dissatisfaction.


Implemented solution


Dragos proposed two management changes:

  1. No estimates

  2. Limit the work in progress and pull work from the input buffer as work was getting completed.

He chose to limit WIP (work in progress) in development to one request per developer and to use a similar rule for testers.


He inserted a small buffer between development and test in order to smooth the flow of work and set it arbitrarily to 5, with the intention of adjusting it later in the process.

In exchange for accepting these changes to the process, Dragos committed to delivering within a 25-day service level agreement.
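
The pull rules described above can be sketched as a toy model in Python. This is my own illustration, not something taken from the case study: the class, its method names and the "CR-n" request names are all hypothetical. The limits are the ones quoted in the text (one request per developer and per tester for a team of 3 and 3, and a buffer of 5).

```python
from collections import deque


class Board:
    """Toy pull system: input queue -> development -> buffer -> test -> done."""

    def __init__(self, dev_limit=3, buffer_limit=5, test_limit=3):
        self.input_queue = deque()
        self.dev = []
        self.buffer = deque()
        self.test = []
        self.done = []
        self.dev_limit = dev_limit
        self.buffer_limit = buffer_limit
        self.test_limit = test_limit

    def pull(self):
        """Fill free slots, downstream first, without exceeding any WIP limit."""
        while self.buffer and len(self.test) < self.test_limit:
            self.test.append(self.buffer.popleft())
        while self.input_queue and len(self.dev) < self.dev_limit:
            self.dev.append(self.input_queue.popleft())

    def finish_dev(self, item):
        """Move a developed item to the buffer; refuse if the buffer is full."""
        if len(self.buffer) >= self.buffer_limit:
            return False  # the developer is blocked until test catches up
        self.dev.remove(item)
        self.buffer.append(item)
        self.pull()
        return True

    def finish_test(self, item):
        """Mark a tested item as done and pull the next one in."""
        self.test.remove(item)
        self.done.append(item)
        self.pull()


# Hypothetical usage: ten change requests waiting in the input queue
board = Board()
board.input_queue.extend(f"CR-{i}" for i in range(1, 11))
board.pull()                # the 3 developers pull CR-1, CR-2 and CR-3
board.finish_dev("CR-1")    # CR-1 flows on to test, CR-4 is pulled in
print(board.dev, list(board.buffer), board.test)
```

The point of the sketch is that new work enters development only when a slot frees up, so the amount of work in progress can never exceed the limits, however large the input queue grows.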


On top of these changes, Dragos convinced the managers to use an average cost, calculated from historical data, when working out the ROI for each request.


Results

  • The delivery rate of change requests increased by 230%.

  • The lead times fell from an average of 5.5 months to a mere 12 days.

  • On-time performance rose to 98%.

  • Dragos was rewarded with the division’s process improvement award for the 2nd half of 2005.

Further improvements


Dragos flew to Hyderabad and spent 2 weeks observing the process. He suspected that the testers had a lot of capacity available for work. Based on his observation, he reduced the test team from 3 to 2 members and hired one more developer. The result was a linear increase in productivity from 45 to 56 requests completed.


At the end of the 2005 fiscal year, the General Manager noticed the productivity improvement of the XIT sustaining engineering team and gave Dragos permission to hire more people.


Dragos hired one more developer and one more tester, to balance the work between the 2 departments.


Final note of the story

  • The backlog was eliminated entirely on November 22, 2005.

  • The lead time was reduced to an average of 14 days against an 11-day engineering time.

  • The on-time performance on the 25-day delivery time target was 98 percent.

  • Request throughput increased more than threefold, while lead times dropped by more than 90 percent.
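
These summary figures can be cross-checked against the numbers quoted earlier in this article. The short Python sketch below does that; the 30 days per month used to convert the original 5.5-month lead time into days is my assumption, not a figure from the case study.

```python
# Assumption, not from the case study: 30 calendar days in a month
DAYS_PER_MONTH = 30


def percent_drop(before, after):
    """Relative decrease from `before` to `after`, in percent."""
    return (before - after) / before * 100


def increase_to_factor(percent_increase):
    """Convert 'increased by X%' into 'X times the original'."""
    return 1 + percent_increase / 100


lead_before_days = 5.5 * DAYS_PER_MONTH   # original average lead time
lead_after_days = 14                      # final average lead time

drop = percent_drop(lead_before_days, lead_after_days)
factor = increase_to_factor(230)          # "increased by 230%" from the results

print(f"Lead time drop: {drop:.1f}%")
print(f"Throughput factor: {factor:.1f}x")
```

Under that assumption, a lead time falling from 5.5 months to 14 days is a drop of a little over 91 percent, and a 230% increase in delivery rate is the same thing as 3.3 times the original rate, so the two sets of results agree with each other.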

My notes and observations


It is worth mentioning that the story is more complex and contains additional details worth reading and understanding, which I have left out of this article in order to reduce the case study to the basics.


While reading this case study I was surprised to see how easily things can go wrong even in a company as reputable as Microsoft, one of the major players in the business of building and selling software.


A number of changes had already been tried:

  • the work had been switched to a new offshore team,

  • all of the personnel had changed,

  • the management had changed,

  • the service was now provided by a vendor under a master services agreement,

and still the situation did not get any better.


A number of management professionals were involved in this situation, and none of them could find a way to deliver a single item on time.


Let's try to imagine a working plan for this team that does not use a Kanban system and relies only on our experience and common sense, and see whether we could have done a better job and delivered at least some items on time.


(I have to say that, while reading the case study for the first time, I stopped reading and did this thought experiment just for fun.)


The case study showed that 3 developers and 3 testers were able to process the estimates for 18 to 25 change requests in a month and deliver only 6 of the 9 to 13 planned work items that were added monthly. Because these items were not delivered on time, the backlog kept growing and the developers were always working on items that had passed their delivery milestone.


If all the developers both estimate and fix items, we may see a loss in productivity due to the lack of specialization. We can change this by assigning one of the developers and one of the testers to do only the estimating work. If one developer and one tester complete an estimate a day, they can complete a total of 20 estimates a month (using 20 as the number of working days in a month). This is still less than the maximum number of estimates that can be requested in a month: 25. So a second developer may be asked to spend one week of his time on estimates, leaving the remaining 3 weeks of the month for planned work.


One developer working full-time on implementing the planned work items could complete 4 items in a month (5 days per item), while the part-time estimator could complete an additional 3 items. (We are assuming that the testers need the same amount of time to complete their tests for these items.)


Based on this calculation, the team can deliver 7 planned work items in a month instead of 6, in addition to providing all the estimates.
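
The arithmetic of this thought experiment can be written down as a few lines of Python. This is just my calculation restated as code, with the same simplifying assumptions: 20 working days in a month, one day per estimate, 5 days per planned item, and testers keeping pace with developers.

```python
WORKING_DAYS = 20      # working days in a month, as assumed above
DAYS_PER_ESTIMATE = 1  # one estimate takes a developer one day
DAYS_PER_ITEM = 5      # average implementation time of a planned item
MAX_ESTIMATES = 25     # worst-case number of estimate requests in a month

# Developer A estimates full-time.
estimates_a = WORKING_DAYS // DAYS_PER_ESTIMATE

# Developer B covers the remaining estimates, then does planned work.
estimates_b = MAX_ESTIMATES - estimates_a
planned_days_b = WORKING_DAYS - estimates_b * DAYS_PER_ESTIMATE
items_b = planned_days_b // DAYS_PER_ITEM

# Developer C works on planned items full-time.
items_c = WORKING_DAYS // DAYS_PER_ITEM

total_estimates = estimates_a + estimates_b
total_items = items_b + items_c

print(f"Estimates covered: {total_estimates}")
print(f"Planned items delivered: {total_items}")
```

Changing `MAX_ESTIMATES` to 18, the low end of the range, frees two more days for developer B, which is still not enough for an extra 5-day item; the plan stays at 7 items per month.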


Analyzing these numbers, we can see that the main activity stopping the team from delivering the planned work items on time is the estimating. So the next question we can ask is: what can give in the estimating process?


Can this activity be capped at a limited number of estimates per month, so that only the backlog of estimates grows while all the planned work items are completed on time? If so, then it is just a matter of setting the right priorities.


Can we use an average estimate, based on historical data and prepared ahead of time, and have the business calculate its ROI and budgets from this average? If that were accepted, the whole team's capacity could be used for delivering the planned work items.


Answering these questions requires a discussion with the product managers who provided the work, and the agreed solutions could shape the development process by becoming new policies for running the business.


Just by running this simple thought experiment, without even setting up a Kanban process flow, the following can be easily understood:

  • the issue was not a failure to deliver planned work items each month, but a failure to deliver them on time.

  • some compromise needs to be reached:

either prioritize the planned work items first, or

prioritize the change request estimates first, or

concentrate on a single stream of work, if that is agreed.


You simply cannot have it both ways.

(Of course, at any given time you can crash the project and add additional resources, if the G.M. approves it.)


That was my thought process before I went on to read the case study to the end.


Takeaways


You always have to understand the process bottlenecks before proposing changes.


When eliminating the blockers, it is better to take into consideration all the policies and processes internal to the company and to propose alternative ways to address them. The new policies or processes should not confront the old ones directly, especially when the end result could be perceived as an attack on people's position and status within the organization.


If the process you have added and performed in parallel with the existing ones brings clear value within the organization, then almost always this new process will replace the old one.


Even though the case study was presented as solving the problem by implementing a Kanban process, in reality it has to do with understanding the process blockers and with the ability to negotiate and implement a solution. As I stated at the beginning of my previous blog post, people will always present a situation in a way that advances their ideas (David Anderson wrote 2 chapters of his book about this case): you just have to be careful to understand the crux of the problem and take everything you are told with a grain of salt.


As I have seen so many times in a lifetime of work, it is always the quality of the people that makes a place run smoothly; well-rounded processes help, but they are not miracle solutions to the problems.

