# Estimation Dilemma

During the Egyptian revolution in 2011, thousands of people gathered in Tahrir Square in Cairo for mass demonstrations against the regime. The size of the crowd was overwhelming enough, but the variation in the crowd estimates was even more striking. The government news channels claimed only 8,000 people were in Tahrir Square, while others in support of the revolution reported 1 million, 125 times that figure!

Figure 1: Egyptian Revolution Tahrir Square

Politics aside, the fact is that crowd estimates can be quite variable. Estimates of the size of the crowd at the royal wedding of Prince William and Kate Middleton, for example, ranged from 500,000 to one million. At Obama’s inauguration ceremony, unofficial government estimates put the crowd at 1.8 million. Others gave estimates closer to one million.

Figure 2:William and Kate’s wedding day crowds

Estimates can be very deceptive, and this plays out in our day-to-day working life. I recently discovered that one of the news channels used a metric-based method to estimate how many people were in Tahrir Square. They calculated the area of the venue and the maximum number of people that could gather per square metre, and discovered that the maximum capacity of Tahrir Square was actually only 500,000 people.

The same estimation issues occur frequently in IT projects. It is not uncommon for different people delivering the same project to produce very different estimates for it, based on factors such as:

• Technical experience
• Risk assessment skills
• Clarity of the requirements from client
• Assumptions

# Cone of Uncertainty

Researchers have compiled the last 60 years’ worth of software projects to measure the accuracy of project estimates and plotted the results on a chart, as shown in Figure 3. The same concept can be applied to agile projects, as shown in Figure 4.

Figure 3: Waterfall cone of uncertainty

Figure 4: Agile cone of uncertainty

The chart shows that as time progresses, the estimate gets better. If the initial estimate of the project is one year, that estimate is only accurate to within a factor of four: the project could take as long as four years or as little as three months. As the project progresses, the estimate becomes more accurate because the development team gains a clearer understanding of the product and the business domain and becomes more familiar with the tools and technology used in the project. However, our estimate will never be 100% accurate, even during the last stages of the project.
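The range above can be worked out with a few lines of Python. This is only a sketch: the function name `estimate_range` and the default factor of 4 are illustrative assumptions taken from the "plus or minus four" figure quoted above, not a standard formula.

```python
def estimate_range(nominal, factor=4.0):
    """Return (best_case, worst_case) for a nominal estimate.

    `factor` is the assumed width of the cone of uncertainty at the
    start of the project (4x in the example above).
    """
    if nominal <= 0 or factor < 1:
        raise ValueError("nominal must be positive and factor at least 1")
    return nominal / factor, nominal * factor


# A one-year (12 month) initial estimate
best, worst = estimate_range(12)
print(f"between {best:g} and {worst:g} months")
```

Passing a smaller factor models a later stage of the project, where the cone has narrowed.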

The uncertainty around exactly how long the project will take to develop and deploy into production causes a range of issues for the business, such as:

• Difficulty in predicting the project cost
• Allocating resources
• Planning training
• Planning marketing events

# How to Improve our Estimation

## Split Big Project into Smaller Projects

Studies show that agile projects have a much better success rate, almost three times that of waterfall projects. However, adopting an agile approach doesn’t guarantee a high success rate. As you can see in Figure 5, the success rate of large projects using an agile approach is only 18%, compared to a 58% success rate for smaller projects.

Figure 5: The resolution of all software projects from FY2011–2015 within the new CHAOS database, segmented by the agile process and waterfall method. The total number of software projects is over 10,000.

Large projects usually have a very wide cone of uncertainty, especially at the initial stage. This makes a large project more likely to be challenged or even to fail, and to be delivered at a much higher cost and over a longer time than initially estimated.

One way of narrowing the uncertainty cone, therefore, is slicing larger projects into smaller ones, where each project is independent and more easily deployed to production, providing quicker business value to the organisation.

## One Story Point Does Not Equal One Day

Many people are confused about the significance of story points in agile projects. Unfortunately, many development teams believe that 1 story point is equal to 1 day. This often means that project managers also assume 1 story point equals 1 day, which widens the cone of uncertainty instead of narrowing it.

The purpose of story points is to decouple effort from time by measuring effort in a unit other than time. Story points help to measure the team velocity later on, but they do not measure how long (in time) it will take to finish a specific feature or user story.

## The Power of Relativity

Humans naturally don’t make great estimators. We tend to be either optimists or pessimists, and very rarely realists. For example, suppose we were asked to estimate how long it would take us to walk up all of the buildings shown in Figure 6 using the stairs.

Figure 6: How long will it take us to walk up all of the buildings?

Consider that we’ve never climbed these buildings before, aren’t sure how physically fit we are, and don’t know what types of obstacles we might need to negotiate in the stairwells. Unfortunately, our estimate will never be accurate regardless of which approach we take. It will be subject to plus or minus 4x based on our cone of uncertainty.

On the other hand, let’s assume the first building takes 100 story points to climb. What, then, is the estimate for climbing the rest of the buildings?

We can estimate the effort of climbing the other buildings as follows:

| Building | Estimate |
|---|---|
| Building 1 | 100 points |
| Building 2 | 300 points |
| Building 3 | 250 points |
| Building 4 | 750 points |

Note that all these estimates are relative, not absolute. So, if the first building took 1 hour to climb, including rest time, how long will it take to climb the other buildings?
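One way to answer that question is to scale every relative estimate by the measured time of the reference building. The sketch below is illustrative only; the function name `project_hours` is hypothetical, and it assumes effort converts to time at the same rate for every building.

```python
# Relative estimates from the example above, in story points
points = {
    "Building 1": 100,
    "Building 2": 300,
    "Building 3": 250,
    "Building 4": 750,
}


def project_hours(points, reference, reference_hours):
    """Convert relative estimates to hours using one measured reference item."""
    reference_points = points[reference]
    return {name: p * reference_hours / reference_points
            for name, p in points.items()}


# Building 1 was measured at 1 hour, including rest time
for name, hours in project_hours(points, "Building 1", 1.0).items():
    print(f"{name}: {hours:g} hours")
```

The relative numbers never change; only the single measured reference turns them into time, which is why one real measurement is enough to re-forecast the whole set.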

## Using the Fibonacci Sequence Will Make You Better at Estimating User Stories

The Fibonacci sequence (1, 2, 3, 5, 8, 13, 21, 34, 55, …) grows roughly exponentially, with the ratio between consecutive numbers approaching the golden ratio, and it’s important that the team sticks to this sequence during estimation sessions. The shorter the time span, the more certainty there is. Longer tasks are more complex and carry a higher risk, so their estimates are less precise. The widening gaps in the sequence help the team encapsulate the risk of large and complex tasks during estimation, versus small tasks which carry less risk.
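As a small illustration, a raw effort guess can be rounded up to the next value on a Fibonacci-style scale (1, 2, 3, 5, 8, …). Rounding up rather than to the nearest value is an assumption made here to err on the side of risk, and both function names are hypothetical.

```python
def fibonacci_scale(limit):
    """Build the story point scale 1, 2, 3, 5, 8, ... up to `limit`."""
    scale = [1, 2]
    while scale[-1] + scale[-2] <= limit:
        scale.append(scale[-1] + scale[-2])
    return scale


def snap_to_scale(raw_estimate, scale):
    """Round a raw estimate up to the next value on the scale."""
    for value in scale:
        if value >= raw_estimate:
            return value
    raise ValueError("estimate is larger than the scale; consider splitting the story")


scale = fibonacci_scale(55)
print(scale)
print(snap_to_scale(6, scale))   # a guess of 6 lands on 8
print(snap_to_scale(14, scale))  # a guess of 14 lands on 21
```

The gap between a guess and its snapped value grows with the size of the story, which is how the scale absorbs the extra uncertainty of larger items.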

# Stop Estimating and Use Metrics

Let’s look at a practical example of how to use an agile metrics approach to produce better estimates, narrow the cone of uncertainty and increase the rate of successful projects.

## Data to Collect

During each sprint we need to collect the following data to facilitate more accurate agile estimation.

### Product backlog size

The backlog size changes during each sprint as the team completes work items and as items, including bugs, are added or removed. The product backlog size can be calculated as follows:

Backlog size = (Initial backlog size + item added to backlog + bugs) – (items removed from backlog + completed items)
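The formula translates directly into code. The sketch below uses a hypothetical function name and checks it against the first two sprints of the sprint metrics example later in this article.

```python
def backlog_size(initial, added, bugs, removed, completed):
    """Backlog size after a sprint, following the formula above."""
    return (initial + added + bugs) - (removed + completed)


# Sprint 1 of the example: 100 points, 1 bug added, 7 points completed
after_sprint_1 = backlog_size(initial=100, added=0, bugs=1, removed=0, completed=7)
# Sprint 2: 2 items and 2 bugs added, 8 points completed
after_sprint_2 = backlog_size(initial=after_sprint_1, added=2, bugs=2, removed=0, completed=8)
print(after_sprint_1, after_sprint_2)
```

Each sprint's result becomes the initial size for the next sprint, so the calculation can be chained across the whole project.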

### Committed Sprint Backlog

At the beginning of each iteration, the team commits to completing selected items from the product backlog. It’s important to record the size of the sprint backlog that the team has committed to.

### Completed backlog items

The total size of the backlog items that are actually completed during the sprint. This is an indication of how the team performed during the iteration.

### Cumulative team velocity

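Measure the team velocity during each iteration to record its fluctuations as the project moves along. In the sprint metrics example below, the team velocity is the running average of the completed backlog items:

Team velocity = total completed backlog items in all sprints so far / number of sprints so far

The velocity from the first sprint will not be accurate, but as the project progresses the velocity becomes more accurate, typically after the first 3 or 4 sprints.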
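A minimal sketch of the velocity calculation follows. It assumes the velocity is the running average of completed items per sprint, which is the reading that reproduces the team velocity row in the sprint metrics example later in this article; the function name is hypothetical.

```python
from itertools import accumulate


def cumulative_velocity(completed_per_sprint):
    """Running average of completed story points after each sprint."""
    running_totals = accumulate(completed_per_sprint)
    return [total / sprint for sprint, total in enumerate(running_totals, start=1)]


# Completed backlog items for sprints 1 to 4 of the example
print(cumulative_velocity([7, 8, 9, 9]))
```

Averaging over all sprints so far smooths out a single unusually good or bad sprint, at the cost of reacting slowly when the team's pace genuinely changes.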

### Item removed from backlog

As with items added to the backlog, the business could request that a feature be removed from the backlog or replaced with other features.

### Bugs added to backlog

Regardless of the quality control you have in your project, there will always be bugs, and the bug rate usually fluctuates during the project.

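### Number of sprints to complete the project

This is the number of sprints required to finish all backlog items and complete the project. In the sprint metrics example below, it is the backlog size remaining after the sprint divided by the team velocity:

Sprints to complete = remaining backlog size / team velocity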
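As a sketch, the forecast can be computed by dividing the remaining backlog by the team velocity. This formula is an assumption, but it reproduces the first two sprints of the sprint metrics example later in this article (94 / 7 ≈ 13.4 and 90 / 7.5 = 12). The function name is hypothetical.

```python
def sprints_to_complete(remaining_backlog, velocity):
    """Forecast how many more sprints are needed at the current velocity."""
    if velocity <= 0:
        raise ValueError("velocity must be positive to produce a forecast")
    return remaining_backlog / velocity


print(round(sprints_to_complete(94, 7), 1))    # after sprint 1
print(round(sprints_to_complete(90, 7.5), 1))  # after sprint 2
```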

### Allocated Percentage for New Bugs & Rework

As the project progresses, the bug rate and effort of rework or redevelopment usually increases. The percentage will be different for each project based on many factors, for example:

• Existence of coding quality controls like unit and integration testing
• Length of project
• Quality of the solution architecture design of the project

Based on statistics, the percentage of effort spent developing new user stories versus the effort spent on rework and bug fixing could look something like Figure 7.

Figure 7: Allocation % of bugs and rework through the project

### Weighted sprints to complete

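The weighted number of sprints to complete is based on the allocation percentage for new bugs and rework. In the sprint metrics example below, the estimated number of sprints to complete is divided by the allocation value:

Weighted sprints to complete = sprints to complete / allocation %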
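A minimal sketch of the weighting, assuming the allocation value is the fraction of capacity left for new work (1.0 when there is no rework, 0.95 when 5% goes to bugs and rework) and that the forecast is divided by it. That reading matches the figures in the sprint metrics example later in this article; the function name is hypothetical.

```python
def weighted_sprints_to_complete(sprints, allocation):
    """Inflate the sprint forecast to leave room for new bugs and rework."""
    if not 0 < allocation <= 1:
        raise ValueError("allocation must be a fraction in the range (0, 1]")
    return sprints / allocation


print(round(weighted_sprints_to_complete(13.4, 1.0), 1))   # sprint 1: no rework yet
print(round(weighted_sprints_to_complete(12.0, 0.95), 1))  # sprint 2: 5% for rework
```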

## Sprint Metrics Example

At the beginning of the project, let’s assume the backlog items add up to 100 story points. These items are based on the initial project requirements and on an analysis of the work required to build the project foundation. Our metrics could then look something like the following:

| Metric | Sprint 1 | Sprint 2 | Sprint 3 | Sprint 4 |
|---|---|---|---|---|
| Product backlog size | 100 | 94 | 90 | 81 |
| Committed sprint backlog | 10 | 8 | 8 | 9 |
| Completed backlog items | 7 | 8 | 9 | 9 |
| Team velocity | 7 | 7.5 | 8 | 8.25 |
| Items added to backlog | 0 | 2 | 1 | 0 |
| Items removed from backlog | 0 | 0 | 1 | 0 |
| Bugs added to backlog | 1 | 2 | 3 | 2 |
| Estimated number of sprints to complete the project | 13.4 | 12 | 10.5 | 8.9 |
| Allocation % for new bugs and rework | 1 | 0.95 | 0.90 | 0.90 |
| Weighted sprints to complete | 13.4 | 12.6 | 11.6 | 9.9 |

As shown in the previous table, note the following:

• Team velocity usually improves after the 3rd or 4th sprint. The team becomes more familiar with the technology used and gains a better understanding of the business domain as the project progresses.
• The team velocity could be affected if the team size keeps changing every sprint. So, it’s important to have a stable team to get accurate results.
• There are no hard rules for allocating the percentage of new bugs and rework; it will differ from team to team and from project to project.

# Every Burndown Chart Has a Story to Tell

The burndown chart is a great tool for helping the team track progress. Because it shows progress on a daily basis, it helps the scrum master predict whether the team will be able to achieve its target. It also indicates how well the team is updating its tasks, which ensures the accuracy of the data collected during each sprint.

## Ideal Chart

The graph below shows the ideal scenario, where the actual and ideal lines are in sync and the team was able to finish all committed tasks by the end of the sprint.

Figure 8: Ideal Chart

## Early Finish

The actual curve finishes below the ideal curve. If this happens repeatedly, you might need to consider raising the planned team velocity and committing to more work.

Figure 9: Early Finish

## Unable to Finish All Committed Work

The actual curve finishes above the ideal curve. If this happens repeatedly, you might need to consider lowering the planned team velocity and committing to less work.

Figure 10: Unable to Finish All Committed Work
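The three scenarios so far can be sketched in code. The example assumes the ideal line burns down the committed points evenly across the sprint days, and it classifies a sprint only by when the remaining work reaches zero. The function names and sample data are hypothetical.

```python
def ideal_burndown(committed_points, sprint_days):
    """Remaining work on the ideal line at the end of day 0..sprint_days."""
    return [committed_points * (sprint_days - day) / sprint_days
            for day in range(sprint_days + 1)]


def sprint_outcome(actual_remaining):
    """Classify a sprint from its daily remaining-work figures."""
    if actual_remaining[-1] > 0:
        return "unfinished"
    # First day on which no work was left
    finished_on = next(day for day, left in enumerate(actual_remaining) if left <= 0)
    if finished_on < len(actual_remaining) - 1:
        return "early finish"
    return "on target"


print(ideal_burndown(10, 5))
print(sprint_outcome([10, 7, 4, 0, 0, 0]))  # done two days early
print(sprint_outcome([10, 9, 8, 6, 5, 3]))  # 3 points left at the end
```

A single early or unfinished sprint says little; as noted above, it is the repeated pattern that suggests adjusting how much work the team commits to.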

## Large peaks or valleys

Large peaks in the actual curve could mean many things, but the most common scenario is that the team is not updating its tasks regularly.

Figure 11: Large Peaks or Valleys

## More Unexpected Work Discovered

A jump in the actual burndown curve could be caused by additional work or urgent bugs that are discovered during the sprint and must be resolved within it. If this happens repeatedly, you might need to consider reviewing your quality control and increasing the allocation percentage for new bugs and rework.

Figure 12: More Unexpected Work Discovered

## Stretched Curve

The team stretched itself towards the end of the sprint to meet its commitment.

Figure 13: Stretched Curve

## Inconsistent Curve

This could mean the team is not working consistently, for example because of excessive meetings, or is not updating its tasks regularly.

Figure 14: Inconsistent Curve

# Final Thoughts

The main idea of agile metrics is to narrow the cone of uncertainty around the estimates and to give the business more accurate estimates as early as possible. Importantly, metrics are just one part of building a team’s culture. They give quantitative insight into the team’s performance and provide measurable goals for the team. This helps the team produce unbiased, more realistic estimates based on its actual effort.

# Definition of System

“A system isn’t just any old collection of things. A system is an interconnected set of elements that is coherently organized in a way that achieves something.” Therefore, a system mainly consists of three kinds of things: elements, interconnections, and a function or purpose (Wright and Meadows, 2012).

Reference:

Wright, D. and Meadows, D. (2012). Thinking in Systems. Hoboken: Taylor and Francis.

# 1 What Is Antifragility?

Antifragility was first elaborated by N. N. Taleb in 2012. Antifragile does not simply mean that a system is robust or resilient; robustness is not the true opposite of fragility. Rather, an antifragile system increases its capability and becomes stronger when subjected to volatility, shocks, stressors and errors, whether they come from the system itself or from other systems. In other words, what does not kill you makes you stronger, or antifragile.

# 2 Human Body System

If you visit a car factory you will see many different teams each performing a different task. One team will be responsible for the car’s design, another will assemble the car’s engine and another will paint the car’s frame.

Even though each team performs a different task, they all share a common goal: to make cars.

The same can be said about our bodies. Different systems or organs within the human body are responsible for different jobs, yet they all share a common goal: to keep us alive.

## 2.1 Body Cell

Each system within the human body is made up of trillions of cells. Each cell plays its own simple role and interacts with other individual cells to complete more complex tasks. There is no ‘super cell’ that does everything.

## 2.2 Cell lifespan cycle and Regeneration

Each cell type has a different lifespan cycle. Some cells live longer than others (Figure 1). Furthermore, each cell has the ability to regenerate by creating a new cell as the old one dies. However, this does not mean that the whole organ dies or fails when old cells die. The organ continues to function normally without impacting the human body.

## 2.3 Strain

The human body responds to strain by adding muscle mass in the relevant areas. If you tend to use your forearm muscles all the time, they will bulk up. Similarly, your muscles lose mass when the strain level is reduced, for example when you neglect to exercise.

## 2.4 Feedback Loops

There are many feedback loops in biological systems. For instance, we have internal controllers for maintaining our body temperature at 37°C, which is the optimal internal state at which our bodies operate best.

As you can see in Figure 2, when the body exceeds a certain temperature it takes action to regulate it. The rise in temperature is detected by nerve cells that give feedback to the part of the brain that regulates body temperature, and the brain sends out a signal for the body to cool itself down by sweating.

# 3 Microservices Architecture

Microservices, or microservices architecture, is defined by Richardson (2017) as “An architectural style that structures an application as a collection of loosely coupled services, which implement business capabilities”.

Microservices share many characteristics similar to those of the human body cells. Each service performs a different task, but together, all services share the same target: to keep the system alive and produce maximum business value. There are different patterns that can be used to implement microservices depending on an organisation’s requirements and capabilities.

## 3.1 Lightweight and doing one thing

Let’s imagine that you are building a booking system similar to Expedia.com or cheapflights.com.au that takes orders from customers to book flights, hotels and rental cars. The application consists of several components, including a booking website interface, a mobile application and multiple backend services that orchestrate the booking order, as shown in Figure 3.

The system will mainly consist of 4 different layers:

• User Interface (UI)

The UI is the interface to the system and contains no server-side business logic. It could be a website, a native mobile application or a reporting dashboard for internal operations.

• Communication Layer

The communication layer will enable the UI to communicate with different services through a single gateway without the need to directly access any of the services.

• Service Layer

The service layer will contain all microservices which can be utilised by the organisation. In our example, we only have five services; however, in the real world it could be hundreds of services to handle each aspect of the system.

• Integration Layer

No service should have a direct link to other services or share data storage with them. All communication between services should take place within the integration layer to ensure that services stay decoupled.

Therefore, each of the system’s services does one thing and handles a particular task.
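The role of the communication layer can be sketched as a tiny in-process gateway. This is only an illustration of the idea that the UI talks to one entry point rather than to individual services. The class `Gateway`, the two service functions and the request fields are hypothetical, and a real gateway would route network calls rather than Python functions.

```python
from typing import Callable, Dict

Handler = Callable[[dict], dict]


class Gateway:
    """Single entry point: the UI calls the gateway, never a service directly."""

    def __init__(self) -> None:
        self._routes: Dict[str, Handler] = {}

    def register(self, name: str, handler: Handler) -> None:
        self._routes[name] = handler

    def call(self, name: str, request: dict) -> dict:
        if name not in self._routes:
            raise LookupError(f"no service registered as {name!r}")
        return self._routes[name](request)


# Each service does one thing and knows nothing about the others
def flight_service(request: dict) -> dict:
    return {"service": "flights", "booked": request["flight"]}


def hotel_service(request: dict) -> dict:
    return {"service": "hotels", "booked": request["hotel"]}


gateway = Gateway()
gateway.register("flights", flight_service)
gateway.register("hotels", hotel_service)
print(gateway.call("flights", {"flight": "CAI-SYD"}))
```

Because the UI only knows the gateway, a service can be replaced or moved by changing its registration, without touching the UI.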

## 3.2 Autonomy

Each service needs to be changed independently without affecting other services and should work as a separate entity. Therefore, each service can be modified and integrated with our system without the need to modify other services.

In addition, each service needs to expose one or more communication interfaces, such as an API (Application Programming Interface), and should collaborate with other services through these interfaces only. Each service can therefore control what it keeps open for other services to use and what it hides. If a service shares too much of its functionality with other services, its consumers become more tightly coupled to it, and changes to the service force changes on them, which reduces service autonomy.

## 3.3 Easy to replace

A microservice should be as easy to replace as possible. Replacing a microservice can be sensible when its technology becomes outdated or when its code is of such poor quality that it cannot be developed further. The replaceability of microservices is an advantage over monolithic applications, which can hardly be replaced.

# 4 Antifragile Characteristics of Microservices

For years, many enterprises have focused on building robust systems, investing heavily to avoid failure by handling all known scenarios.

At the same time, business architecture keeps changing over time and enterprise IT systems need to adapt to these changes. The IT system needs to respond to business changes either by changing business rules, extending the IT system or integrating with other systems.

The more changes are made to an IT system, the more complex it becomes, as it needs to handle more predicted failure scenarios when integrated with other systems. This approach often leads to a more complicated system that is hard for developers to understand. In addition, the maintenance cost of the system becomes very high, the delivery cycles for changes or extensions grow longer, and it becomes very difficult to integrate the existing system with other systems.

All these problems lead to a more fragile system that cannot adapt to unpredictable changes. The result is a volatile system that leads to either total system failure or full costly replacement with a brand new enterprise system built from scratch.

Microservices are a natural design approach for creating antifragile systems, because the approach assumes that all parts of a system will fail.

## 4.1 Built for failure

The system needs to handle different types of failure and assume that all of its parts will fail. Most distributed systems have a higher chance of failure because of network issues, application issues or even hardware faults. Any service can therefore be temporarily unavailable to its consumers, and the system architecture needs to minimize the impact of a partial outage of any service.
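One simple way to limit the impact of an unavailable service is to retry a few times and then degrade gracefully instead of failing the whole request. The sketch below is an assumption about how this might look, not a prescribed pattern from this article; the function names and the fallback message are hypothetical.

```python
def call_with_fallback(operation, fallback, attempts=3):
    """Try a remote operation a few times, then fall back to a degraded answer."""
    for _ in range(attempts):
        try:
            return operation()
        except ConnectionError:
            continue  # treat as a transient outage and try again
    return fallback()


def unavailable_price_service():
    # Stand-in for a call to a service that is currently down
    raise ConnectionError("pricing service is down")


price = call_with_fallback(
    unavailable_price_service,
    fallback=lambda: "price unavailable, please retry later",
)
print(price)
```

The caller stays up while the dependency is down, which keeps a partial outage partial. A production system would usually add delays between attempts and stop calling a service that keeps failing.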

## 4.2 Response to stressors

Each service needs to respond to stressors by scaling up or down based on the stress level, similar to your body’s cells. The more you stress your muscle cells by going to the gym, the more your body responds by generating more cells to build more muscle. Similarly, you will start to lose muscle once you stop training and remove the stress on your muscles.

Also, each service needs to handle stress independently without the need to scale the whole system up or down. We never expect to increase our whole body muscle mass by shoulder weight lifting exercises alone.
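Scaling each service independently can be sketched as a per-service calculation of how many instances the current load requires. The capacity figures, service names and function name below are made-up assumptions for illustration only.

```python
import math


def desired_instances(requests_per_second, capacity_per_instance, minimum=1):
    """Number of instances one service needs for its own current load."""
    needed = math.ceil(requests_per_second / capacity_per_instance)
    return max(minimum, needed)


# Hypothetical load per service; only the stressed service scales up
load = {"flights": 950, "hotels": 120, "cars": 0}
for service, rps in load.items():
    print(service, desired_instances(rps, capacity_per_instance=100))
```

Only the flights service grows under load, just as training one muscle group does not bulk up the whole body.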

## 4.3 Feedback loop

Many types of feedback should be gathered from each service, both to help identify abnormal behaviour and to support the continuous improvement of each service and of the system as a whole. For example:

• Service logging
• Health monitoring
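Health monitoring can be sketched as a loop that asks every service whether it is healthy and reports the ones that are not. The check functions and service names here are hypothetical stand-ins for real health endpoints.

```python
def unhealthy_services(health_checks):
    """Return the names of services whose health check fails or raises."""
    failing = []
    for name, check in health_checks.items():
        try:
            healthy = check()
        except Exception:
            healthy = False  # a check that crashes counts as unhealthy
        if not healthy:
            failing.append(name)
    return failing


def broken_check():
    raise TimeoutError("no response from service")


checks = {
    "flights": lambda: True,
    "hotels": lambda: False,
    "cars": broken_check,
}
print(unhealthy_services(checks))
```

The result is the feedback signal: like the nerve cells reporting body temperature, it tells the rest of the system where to act.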

## 4.4 Randomness

Systems are generally designed to handle known risks and errors. As a result, a system breaks easily when unpredicted shocks hit it. A well-built system needs to handle unexpected service unavailability. For example, by throwing random errors or disconnections at the system, it is possible to measure their impact. Netflix, for instance, has implemented Chaos Monkey, a resiliency tool that helps applications tolerate random instance failures. It randomly terminates virtual machine instances and services that run within the production environment. Exposing engineers to failures more frequently incentivizes them to build resilient services (Netflix, 2017).
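The idea of injecting random failures can be illustrated with a toy experiment. This is not Chaos Monkey's API or implementation; it is a hypothetical sketch in which a randomly chosen instance is removed from an in-memory pool so that the rest of the system can be observed coping without it.

```python
import random


def terminate_random_instance(instances, rng):
    """Remove one randomly chosen instance from the pool and return its name."""
    if not instances:
        raise ValueError("no running instances to terminate")
    victim = rng.choice(sorted(instances))  # sorted for a stable choice order
    instances.discard(victim)
    return victim


running = {"booking-1", "booking-2", "booking-3"}
rng = random.Random(42)  # seeded so an experiment can be repeated
victim = terminate_random_instance(running, rng)
print(f"terminated {victim}; still running: {sorted(running)}")
```

The interesting part of such an experiment is not the termination itself but what is measured afterwards, for example whether requests are still served by the remaining instances.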

# 5 Impact of Microservices on Enterprise Economy

There are many characteristics of Microservices that have direct impact on enterprise revenue and growth.

The four main characteristics of Microservices are:

• Flexibility to scale
• Flexibility to change and extend
• Long term solution complexity
• Short term design complexity

## 5.1 Flexible Architecture & Delivery Performance

The flexibility of a microservices architecture has a direct impact on the solution delivery speed and the cost of each project. A monolithic application, for instance, starts out as a well-structured application. However, over time, more and more dependencies between the individual modules creep in. This leads to the application becoming overly difficult to maintain and update. In addition, the testing strategy for monolithic applications is complicated and slow, as the entire system needs to be tested every time the business asks for a change to one of its features.

With microservices, on the other hand, the system features are distributed across different services, and each service can be divided into further microservices. Furthermore, each team is responsible for an individual service. It is therefore more flexible and cheaper to modify, deploy and test each service individually without affecting the other services in the system.

## 5.2 Agile Business Architecture & Knowledge Retention

The increasing pace of business innovation continues to push organizations to evaluate business models that will meet stakeholder expectations. Also, IT systems will need to quickly respond to these changes. The response speed of IT development teams to adapt to the new business change is essential to the business’s agility process. Furthermore, it increases the confidence level that exists between the business and IT development team.

The organisation’s employee retention improves when the confidence and satisfaction levels increase amongst all parties: the business, stakeholders and IT development teams. High employee retention in turn improves the delivery speed of new system features and reduces the cost of training new employees.

## 5.3 System Performance and Scalability

With Microservices, everything is more granular, including scalability and management of spikes in demand. Demand may surge in one or more services of the application. Therefore, Microservices architecture enables the system to scale only certain impacted services, rather than the entire application and its underlying infrastructure. Also, the application users will not encounter performance issues during busy times as the application will be able to scale up or down based on demand.

## 5.4 Short Term Design Complexity of Microservices

Microservices do not come free of charge. Although the complexity of an individual microservice may be reduced, the sophistication required to orchestrate a large, distributed system is considerable. Consequently, a microservices architecture shifts complexity from code design and implementation to system operations. Moreover, the cost of the third-party tools, licences and custom implementations needed to manage system operations is high compared with the tools required for monolithic applications.

## 5.5 Long Term Solution Complexity of Microservices

Adopting the microservices approach helps the organisation in the long run by providing simple IT solutions that are easy to integrate and extend and cheap to maintain. The characteristics of such a solution can be summarised as follows:

• Each microservice can be deployed independently of other services.
• A new developer only needs to understand the functionality of individual services to start developing new functionality, without extensive training on the whole enterprise system.
• The development of a service is confined to one team, and this team can work independently of other teams.
• Improved fault isolation, as the faulty service alone can be isolated or rolled back without affecting the rest of the services.
• No long-term commitment to any technology stack, and the ability to adopt new technologies to increase the business value of an IT system.

# 6 The Overview Effect

Over four decades ago, during the Apollo 8 mission in 1968, a group of astronauts saw our planet Earth from space for the first time. They described a cognitive shift in awareness after seeing our planet “hanging in the void”, and this state of mind is called “The Overview Effect”.

The view of earth hanging in the void, shielded and nourished by a thin atmospheric layer, changed their perspective about life and they realized that our planet is very tiny. Astronaut Gene Cernan stated that “There has to be somebody bigger than you, and bigger than me, and I mean this in a spiritual sense, not a religious sense and You do not see the barriers of colour and religion and politics that divide this world” (White, 1987).

## 6.1 Agile and the Big Picture

### 6.1.1 Agile Methodology

Agile software development methodologies were introduced into the IT business over 15 years ago. However, it is in the last 7 years that we have seen agile methodologies spread through many IT businesses around the world.

The main reason for the success of agile is the short feedback cycle between the delivery team and stakeholders. Agile improves collaboration between all interested parties, delivers to the market quickly, gathers feedback and incorporates that feedback into the next delivery cycle.

### 6.1.2 Losing the Big Picture

While many organisations are adopting agile methodologies, it is unfortunate that they are losing focus on the big picture at the enterprise level of the whole system and are focusing more on project delivery.

The reason for losing the big picture is that agile, and scrum in particular, focuses on iterations or sprints, that is, on the short term, as depicted below.

Most projects in the enterprise world are considered successful if the team is able to deliver a set of features that solve particular business problems within a set budget and time. However, during the project design and development cycle, many adjustments and alterations can be made to the project in response to unforeseeable problems or business feedback during iteration demos. These design changes and compromises make it possible to deliver the project on time and within budget. However, the big picture of the whole system, and the integration of the system with current and future systems, is left out.

## 6.2 Collaborative Enterprise Architecture and the Big Picture

Over the last few decades, IT systems have had a huge impact on how enterprises do business. As the scale and complexity of a system grows, the need for proper planning in designing, building, and operating it becomes more and more important in order to understand and visualise the big picture of the IT system. Enterprise architecture is responsible for controlling the complexity of IT systems as well as identifying the enterprise business needs from IT systems.

The enterprise architecture ensures that the IT system of the organisation is stable, adaptable, agile, efficient and antifragile.

The basic definition of architecture is “Architecture is the illustration of the structure and behaviour of a system and its fundamental parts, plus a set of principles that guide the system’s long-term evolution” (Langade, 2012).

Yesterday’s solutions are the main reason for most of today’s IT system issues, limitation and complexity. Also, to solve today’s problems there has to be collaboration amongst all stakeholders. In the same way that mistakes are made collectively and not individually, solutions need to be found collectively and not individually.

System stakeholders are the people who deal with the creation, evolution, and operation of the system. For example:

• The owner who pays for it
• The strategist who conceptualizes it
• The planner who plans its creation
• The designer who designs it
• The implementer who builds it
• The subcontractor who provides constituent parts for it
• The field support staff who maintains or operates it in the field

# Conclusion

An antifragile microservices architecture offers a plethora of advantages that reduce the cost of building and maintaining enterprise systems in the long term. However, a microservices architecture is not a silver bullet for IT system problems: it requires more planning, including devising a first coarse-grained domain architecture that leads to the first microservices. The way to microservices is therefore evolutionary. It is not necessary to adopt microservices for the whole system from the beginning. Instead, a stepwise migration is the usual way.

# References

BIONUMBERS.ORG. 2017. How quickly do different cells in the body replace themselves? [Online]. Available: http://book.bionumbers.org/how-quickly-do-different-cells-in-the-body-replace-themselves/ [Accessed 26/09/2017].

HP 2015. Agile is the new normal. In: ENTERPRISE, H. P. (ed.). Hewlett Packard Enterprise.

LANGADE, S. B. U. B. S. 2012. Collaborative Enterprise Architecture, US, Morgan Kaufmann.

NETFLIX. 2017. Chaos Monkey [Online]. Available: https://github.com/Netflix/chaosmonkey [Accessed 22/10/2017].

RICHARDSON, C. 2017. What are microservices? [Online]. [Accessed 22/10/2017].

TAL, L. 2015. Agile Software Development with HP Agile Manager, US, Apress.

THOMAS, S. 2015. Did HeartOfAgile emerge from PDCA? [Online]. Available: http://itsadeliverything.com/did-heartofagile-emerge-from-pdca [Accessed 11/10/2017].

WHITE, F. 1987. The Overview Effect: Space Exploration and Human Evolution, New York, Houghton and Mifflin Co.

# IT Projects Failure Rates

Large Projects => Higher Failure Rates.

It is time to break down your large projects into small and micro projects, then glue these sub-projects (services) together.

# Transforming IT From Strategic Liability To Strategic Asset

You cannot make good strategic decisions with bad systems and data.