IT projects have always carried high levels of risk. The explosion in "Internet" projects, or projects in which an IT solution is integrated with the Internet, has further increased this risk because the Internet channel not only alters customer behavior and expectations for service but also dramatically increases the exposure of the organization.
In this article, which has been excerpted from the new IBM Press book On-line, On-time, On-budget: Titanic Lessons for the e-business Executive, I'll provide some insights into this risk and explain how to calculate the real costs of Internet projects, specifically the hidden costs. This means looking beyond the end of the project into the online operation, something that most projects fail to consider carefully. I'll summarize the key steps to take in the seven stages of an Internet project so that you can increase the probability of success and deliver a better project and, ultimately, online operation.
The Challenge of Internet Projects
Much has been written about IT projects and the notoriously high project failure rates. The landmark Standish Group CHAOS reports from 1994, 1996, and 1998 helped bring this to the attention of organizations by revealing that only 25% of all IT projects finish on time, on budget, and with all the features and functions originally specified. (Source: "Chaos, a Recipe for Success," Standish Group, 1998.)
More recently, the Internet has further exacerbated the difficulties of these projects in the following ways, categorized by the ten E's:
1) Encompassing: No longer are new systems the domain of one line of business; the whole organization is affected. The Internet drives the integration of systems, the interdependency of data and functions, and a single view of the customer. All the lines of business need to pay attention to the Internet project. It is impossible to be successful and stay in demarcated silos. There is a need to look at the whole value chain.
2) Expectation: The Internet changes the expectations and behavior of customers and partners as they acclimatize to a 24-by-7 online operation and online access through intranets, extranets, the Internet, or portals. Fickle and savvy online customers readily "click" over to competitive service providers when organizations fail to deliver the basics through the Internet. The potential loss of customer loyalty and revenue is enormous.
3) Exposure: As an organization moves its operations online to the Internet, it becomes exposed. Most people are very aware of hackers and cyber attacks. But a less understood consequence is the exposure of the inner workings of the back-end business operations to potentially millions of customers, partners, and suppliers around the world. Outages have become horrendously expensive and highly visible (a large Internet e-tailer like Amazon loses $1,500 per minute; an airline reservation system like Sabre, $36,000 per minute). One disaster can hit the newspapers and undermine customer/consumer confidence overnight. Continuous business service availability is a major competitive advantage and a service differentiator.
4) Erratic and varying conditions: The Internet is continually changing, and this impacts the online operation. Organizations that fail to provide adequate, highly responsive, stable business services that are capable of withstanding the onslaught of weekly changes required will lose demanding customers to a competitor. The Internet project requires a flexible change environment that can rapidly propagate these ongoing changes. Web site downtime cost UK businesses more than 500 million pounds in 2001 and will cost them even more in 2002. (Source: "Web Site Downtime Costs," Yankee Group Report commissioned for Worldport, March 20, 2002.)
5) Evolution: The Internet online operation has to rapidly evolve with the demands of a changing competitive environment. The site will date quickly without an evolutionary strategy in place.
6) End point: The Internet project never really ends as it goes online into operation. The lack of a clear end point presents a challenge, as organizations need to regularly readjust their investment models.
7) Effort: The Internet project requires significant effort--from requirements gathering to construction to testing to implementation. Improving the availability of services after the project is in production is expensive, time-consuming, and painful. 43% of U.S. businesses never reopen after a disaster. Another 29% close within two years. (Source: "Can Your Website Afford a Day Off?" Hurwitz Group, Inc., February 2001). Organizations need to ensure that their Internet projects effectively deliver the online requirements and meet customer demands in the first place.
8) Expenditure: The Internet project requires careful investments to match risk mitigation strategies against the exposure. Investments need to be well-targeted and focused at protecting the most vulnerable areas of the online operation. A fundamental oversight is to invest in just the technology--hardware and software--at the expense of processes and organization. Nearly 50% of operations groups do not have the core processes in place to effectively support e-business initiatives at the time of implementation. (Source: "E-Business Efforts Neglect IT Operation Involvement," META Group, May 2001).
9) Environmental complexity: IT infrastructures are vast and complex, and they are growing beyond what humans can possibly manage. The interacting applications, data interdependencies, and common services make pinpointing failures more difficult. According to Gartner research, people or process failures directly cause 80% of mission-critical application service downtime. The complexity of today's IT infrastructures and applications makes managing these systems to very high levels of availability enormously difficult. (Source: "Making Smart Investments to Reduce Unplanned Downtime," Gartner Research, June 2001). The Internet project needs to consider this in the early stages.
10) Expensive to fix: The Internet project needs to get it right the first time, as rectifying problems later, once in operation, can be very expensive. Most Internet projects don't look beyond the implementation. However, the repercussions of the project and the impact on online operations remain with an organization for many years. With one financial institution, serious problems were not unearthed until "year-end processing," a full nine months after implementation.
Internet Projects Take Project Risk to a New Level
The E's that require the most attention in an Internet project, because they typically carry the greatest risk and uncertainty, are these:
Expectation: The evolution of the Internet and its emergence as a predominant channel has shifted the expectations and behavior of customers and partners. Customers are quick to jump to a competitor if the service is not right. With many choices available, "don't waste my time" is becoming a predominant need. Untangling yourself from any organization that provides a service is very easy with the Internet channel, as new services are available at the click of a mouse.
Exposure: An organization's back-end business operations are typically complex and understood by few. In fact, with most of the focus on the customer and the related customer operations, back-end operations are afterthoughts that do not share the same kind of funding. One of the biggest mistakes organizations make when they start to move their operations online is to expose their inner workings to potentially millions of customers, partners, and suppliers around the world. For example, the online window can expose information sourced from a relatively forgotten server and put it under the focus of the Internet microscope, all unbeknownst to the organization. This sets up dependencies and expectations in scenarios the architects never envisaged. So if a high level of availability is not built in, the hidden costs are racked up when an outage does occur.
How Do You Calculate the Real Costs of Internet Projects?
Before you commit to an Internet project, you need to make a "go/no-go" decision on whether your "online operation" is viable--i.e., the proposed Internet solution has enough value to pay for and support itself and is not a "risk" to the business. Most projects go through a quick cost-benefit analysis to highlight this viability and draw on a three-year payback. A minority go through a more detailed business case to forecast a return on investment (ROI) and calculate the risk. Very few look beyond the project to post-implementation (the online operation). This is a problem because, with the Internet channel, there are substantial "repercussive effects" that need to be carefully considered and factored in. For example, customers tend to be less patient and forgiving with the channel and may make a rapid switch to a competitor. You must give the online operation thorough attention and consider the possibility of the solution being unavailable.
In many projects, there tends to be a very clear distinction between what was created in the project and what is running in production, an "us and them" syndrome. Yet the two are directly related. If you cut corners in the project or don't perform due diligence, this will have a repercussive effect that may not surface right away. In creating a business case for your project, you need to look into the future of the online operation for at least a year and calculate a profitability analysis. So to justify an online operation, the first step is to start with a simple formula:
Revenue > fixed costs + variable costs + solution investment
In putting operations online, on the Internet, you are faced with the challenge of providing a 24-hour, seven-day, year-round operation to your customers. However, it won't always be 100% available. What kind of impact is this going to have, and how much unavailability can your organization tolerate? To get an accurate picture of ROI for your project, you need to factor unavailability and its "real" cost into the above formula:
Revenue > fixed costs + variable costs + solution investment + total unavailability costs
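As a minimal sketch, the viability test above can be expressed as a single comparison. The function name and the revenue, fixed-cost, variable-cost, and investment figures below are hypothetical, chosen only for illustration; the $2.4 million unavailability figure is the worst-case yearly total from the worked example later in this article.

```python
def is_viable(revenue, fixed_costs, variable_costs, solution_investment,
              total_unavailability_costs=0.0):
    """Return True when revenue exceeds the sum of all costs.

    Mirrors: Revenue > fixed costs + variable costs
                       + solution investment + total unavailability costs
    """
    total_costs = (fixed_costs + variable_costs + solution_investment
                   + total_unavailability_costs)
    return revenue > total_costs


# Hypothetical annual figures (assumptions, not taken from the article).
figures = dict(revenue=10_000_000, fixed_costs=4_000_000,
               variable_costs=2_500_000, solution_investment=1_500_000)

# Without unavailability costs the operation looks viable.
print(is_viable(**figures))

# With a worst-case unavailability cost factored in, it may not be.
print(is_viable(**figures, total_unavailability_costs=2_400_000))
```

With these assumed figures, the first check passes and the second fails, which is the kind of reversal the extended formula is meant to expose before the project is approved.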
But how do you measure total unavailability cost and make it meaningful? Every minute your online operation is unavailable has an impact on your customers and your organization. In that minute, you are not generating revenue or saving costs, and you can put a value against that minute.
To complete the calculation, you need to measure the number of times this happens for a period (e.g., a year) and the number of outage minutes assuming a 24-by-7 clock. A User Outage Minute (UOM) provides a meaningful measure and baseline to organizations. A UOM is based on the number of minutes one user is affected in an outage. So in the formula below, the UOMs are based on the total number of outages, the duration time of an outage, and the number of users impacted.
Total unavailability costs = Unavailability cost * UOMs
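One way to picture the UOM measure is as a sum over an outage log, sketched below. The `Outage` record, the function names, the sample log, and the $0.5 cost per UOM are all assumptions made for illustration; the only rule taken from the text is that UOMs combine the number of outages, their duration, and the number of users impacted.

```python
from dataclasses import dataclass


@dataclass
class Outage:
    duration_minutes: float  # how long the outage lasted
    users_affected: int      # how many users were impacted


def user_outage_minutes(outages):
    """Sum (duration x users affected) over every outage in the period."""
    return sum(o.duration_minutes * o.users_affected for o in outages)


def total_unavailability_costs(cost_per_uom, uoms):
    """Total unavailability costs = Unavailability cost * UOMs."""
    return cost_per_uom * uoms


# Hypothetical log for one month: two 10-minute outages, 5,000 users each.
log = [Outage(10, 5_000), Outage(10, 5_000)]
uoms = user_outage_minutes(log)

print(uoms)                                   # 100000
print(total_unavailability_costs(0.5, uoms))  # 50000.0
```

Keeping outages as individual records rather than one aggregate number makes it easier to separate peak-period minutes from normal-period minutes later on.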
For each UOM, you need to calculate:
Unavailability cost = (average revenue per minute - absence effect value)
However, with online operations, revenue is not evenly generated. More revenue is generated in "peak periods." Knowing the revenue per minute for those peak period minutes is very significant because they are a lot more valuable. For example, with an online stock trading operation, the "end of day trading period" is the most valuable. The "absence effect value" is what it would cost your company if that minute of operation disappeared:
Absence effect value = (average revenue per peak minute + repercussion value)
The "repercussion value" is the ripple effect of the outage minute. For example, for "revenue-generating" online operations, this includes the impact of lost transactions, cost of adjustments and settlements, penalties paid for missing service-level guarantees, loss of customers and goodwill, loss of shareholder confidence, damage to image, brand-name erosion, lawsuits, and losses due to unfortunate timing, like during a peak sale period.
But not all online operations are the same. For example, a "cost-reducing" online operation includes the automation of paper handling, back-end functions, or workflow processes. These have a different set of impacts, such as lost employee productivity, adjustments and settlements, additional support and maintenance expenses, penalties paid for missing service-level guarantees, loss of confidence in service, and losses due to unfortunate timing, like month-end processing.
Working through an example better highlights how the revenue-generating formula works:
Assume for a given month a worst-case scenario in which 5,000 users experience 20 minutes of outage each, or 100,000 UOMs. Half (50%) of these occur in a normal period.
Total unavailability costs = Unavailability cost * UOMs
Or total unavailability costs = Unavailability cost * 50,000
For each UOM, assume the average revenue per minute, generated by one user, is $0.5 and the absence effect value is $0:
Unavailability cost = (average revenue per minute - absence effect value)
Or unavailability cost = ($0.5 - $0)
Therefore, total unavailability costs = $0.5 * 50,000 = $25,000
However, the other 50% of the UOMs, or 50,000, fall in a peak period.
Total unavailability costs = Unavailability cost * 50,000 UOMs
For each peak-period UOM, assume the average revenue per minute is still $0.5:
Unavailability cost = ($0.5 - absence effect value)
Absence effect value = (average revenue per peak minute + repercussion value)
But assume the average revenue per peak minute is $2 and repercussion value is $2.
Absence effect value = ($2+$2) = $4
Unavailability cost = ($0.5 - $4) = -$3.5, a cost of $3.5 per UOM in magnitude
and Total unavailability costs = $3.5 * 50,000 = $175,000
So in a single month for normal and peak periods:
Total unavailability costs = $175,000 + $25,000 = $200,000
Benchmarks exist for unavailability costs across industries; e.g., a financial institution solution is $1,000 per minute, or a Telco solution is $2,000 per minute. (Source: Standish Group Research, 1998.)
And over a year:
Total unavailability costs = $200,000 * 12 = $2.4 million
This overall number is a worst-case scenario that is then fed back into the business case.
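The worked example can be reproduced with a short script, sketched here under one stated assumption: the per-UOM cost is taken as a magnitude (an absolute value), because the example reports $3.5 for ($0.5 - $4). The function and variable names are illustrative only; the input figures are the ones used in the example.

```python
def absence_effect_value(revenue_per_peak_minute, repercussion_value):
    """Absence effect value = average revenue per peak minute + repercussion value."""
    return revenue_per_peak_minute + repercussion_value


def unavailability_cost(avg_revenue_per_minute, absence_effect):
    """Per-UOM cost = average revenue per minute - absence effect value.

    Assumption: the magnitude is used, matching the example's $3.5 result.
    """
    return abs(avg_revenue_per_minute - absence_effect)


# Worst-case month from the example: 5,000 users, 20 outage minutes each.
uoms = 5_000 * 20             # 100,000 UOMs
normal_uoms = uoms // 2       # half fall in a normal period
peak_uoms = uoms - normal_uoms

# Normal period: $0.5 revenue per minute, absence effect value of $0.
normal_total = unavailability_cost(0.5, 0.0) * normal_uoms

# Peak period: $2 revenue per peak minute plus a $2 repercussion value.
peak_effect = absence_effect_value(2.0, 2.0)
peak_total = unavailability_cost(0.5, peak_effect) * peak_uoms

monthly_total = normal_total + peak_total
yearly_total = monthly_total * 12

print(f"normal: ${normal_total:,.0f}")   # normal: $25,000
print(f"peak:   ${peak_total:,.0f}")     # peak:   $175,000
print(f"month:  ${monthly_total:,.0f}")  # month:  $200,000
print(f"year:   ${yearly_total:,.0f}")   # year:   $2,400,000
```

Changing the peak share, the repercussion value, or the number of users affected shows quickly how sensitive the worst-case figure is to each input.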
What Else Do You Need to Know?
You have now established the relationship between Internet projects and the worst-case costs and impacts on the online operation. To help your organization deliver better Internet projects--and, ultimately, online operations--you now need to investigate:
- What exactly can go wrong in a complex online operation?
- How can it be prevented from going wrong?
- What kind of investment is required to prevent it?
Answer these questions at the outset of the project. Then, at every stage, identify where the project risk is, what decisions are critical, which activities require tight control, how scope creep can arise, and what business representation is required and where.
Stage 1: Requirements--How Does the Internet Project Align to the Business?
In the first stage of the project, you should question how the Internet project aligns to the business; articulates the business problem or opportunity; and specifies the solution, its business drivers, and its overall value to the organization. Include the business risks of the Internet, and factor this into the business case (as discussed above) that underpins the expected levels of service. This sets up a go/no-go decision whether to proceed with the project and online operation.
Stage 2: Which Parts of the Infrastructure Need Investments to Make the Business Successful?
In the second stage, ensure that the business view is etched into the functional requirements and design and that the online operation is architected with the appropriate levels of availability to protect it, according to nonfunctional requirements supported by the business case. To do this, the project team must identify critical areas and components in the architecture. This helps create the setting for important granular decisions in the next stage.
Stage 3: What Are the Best Safety Features to Incorporate to Protect the Business?
In the third stage, ensure that the critical areas and components (the ones that would cause the greatest problems if they were unavailable) are protected adequately by selecting from a comprehensive list of availability techniques (software, hardware, and process). Be sure to look at the advantages and disadvantages of high availability, the best circumstances for each technique, and, of course, the costs. This also requires reviewing the challenges involved in integrating the online operation with the back-end systems in the environment, completing functional unit testing, and preparing for nonfunctional testing in the next stage. As the construction nears completion, you need to ensure that esthetic factors do not compromise the nonfunctional requirements.
Stage 4: What Sort of Tests Need to be Planned to Ensure the Business Is Protected?
In the fourth stage, ensure that there is an approved plan for testing the online operation. This requires getting ready to test for the characteristics that are important, planning the level of dynamic testing required, selecting the right kind of tests, and preparing the test environment. Focus on nonfunctional requirements first. Plan to test for one of the following: a new isolated solution, a new solution that is integrated to an existing solution, or a new solution that is replacing an existing solution. You need to understand the risks associated with the online operation, the potential impact to the environment and existing business services, and the operational readiness of the organization for the implementation. The "Berlin Wall" approach to change management, which forces all changes through one or two tightly policed checkpoints, is not feasible in today's highly dynamic business climate.
Stage 5: How Is the Plan Followed to Ensure That Everything Is Tested?
In the fifth stage, ensure that the testing is done according to plan to determine the robustness of the online operation. This requires integrating the online operation into a test environment and, through extensive nonfunctional (and some functional) testing, determining its overall integrity and availability as well as its potential impact to the surrounding service delivery environment. Once all the tests are passed, prepare for "going live" (delivering a fully working and tested solution into the live service delivery environment). The testing stage is a critical part of the project life cycle, as typically this is where any warning signs of a potential pending failure will start to become visible. Allocate adequate time to this activity, and do not compromise. At this point, the business service metrics and measurements have already been set up, and the service-level objectives and agreements have been established and agreed to by all parties. The team's business representation reviews results and signs off on the tests.
Stage 6: Is the Online Operation for the Business Ready to Run?
In the sixth stage, ensure that the organization and processes have been set up to successfully run and deliver the online operation for the business and the service delivery environment. The project does not end as soon as the service is operational; it continues until a proven level of stability is attained. You need to know the impact of the implementation on business services and the risk of remaining live with it. You need to know how to create a support infrastructure, maintain a smooth and stable running operation, and prevent faults from disrupting service or, failing that, recover quickly enough to minimize per-minute costs. A rapid and accurate problem management process oriented around a "speed of recovery clock" will get the operation back online as quickly as possible. Be sure to include strategies for early warning systems and automation, eventually leading to self-monitoring, self-healing, and self-balancing systems. These will not only monitor, manage, repair, and maintain the online operation but also improve the required levels of service and availability.
Stage 7: What Sort of Contingency Is Needed?
In the seventh stage, ensure that business continuity is in place to allow recovery of online operations in times of disaster. This requires a "Why-What-How" approach: why disaster recovery is critical, what disaster recovery entails, and how to determine whether you are in a disaster. You need to know what the current business continuity plan is, how the plan will address the incoming implementation, and what the risks are in the plan. You must consider business continuity planning and issues such as application selection, recovery windows, and cost justification. Review alternatives--from hot to cold to online sites--and some of the techniques available through extended mirroring and remote replication.
Conclusion
Successful Internet projects do not happen by accident. To be successful, you need to start with a solid and perceptive business case that visualizes all possibilities. Once the real cost is established, you can then best determine where to make investments by drawing on all the stages and the above points to question and challenge the project.
Mark Kozak-Holland is a Senior Management Consultant with IBM Global Services. He has been working with mission-critical solutions since 1985, specifically with the availability of business services to the end-user. Mark can be contacted via email at
About On-line, On-time, On-budget: Titanic Lessons for the e-business Executive
This book is intended for readers who want to know how to put operations online and deliver Internet projects but do not want to be overwhelmed by perceived complexities or IT jargon. The book explains in layman's terms how to get involved and deliver an Internet project successfully. It relies on the use of a highly vivid and well-known example of a project gone wrong, the luxury liner Titanic. It is designed to help business managers understand the phases of a project life cycle systematically and draws the relevant analogies to the design, the construction, the testing, and, ultimately, the loss of Titanic in order to clarify the basic business issues in designing, building, and implementing a new online system from beginning to end.
MC Press Online