Prerequisite #1: Quantify Performance Requirements
The most important prerequisite is to have defined service-level agreements (SLAs) that key stakeholders have systematically examined after a deep problem analysis. A well-defined SLA has three key properties. First, it is specific: it is an exact value that quality assurance personnel can test against before moving the application to production.
Second, it is flexible in the context of its distributed variance. The use case must adhere to the specific SLA value for a defined percentage of requests, but it is permitted a measured variance.
Third, it must also be realistic. You can ensure this by requiring the sign-off of all key stakeholders, both business and technical, before the SLA is adopted.
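For instance, an SLA of the form "the checkout request must complete within 2 seconds for 95 percent of requests" is both specific and flexible. As a minimal sketch, the following shows how QA might verify such an SLA against measured response times; the class name, the 2-second/95-percent values, and the percentile calculation are illustrative assumptions, not part of any particular tool:

import java.util.Arrays;

public class SlaCheck {
    // Returns true if the given percentile of the measured response
    // times falls at or below the SLA threshold.
    static boolean meetsSla(long[] responseTimesMillis, double percentile, long slaMillis) {
        long[] sorted = responseTimesMillis.clone();
        Arrays.sort(sorted);
        // Index of the sample at the requested percentile (e.g., 95th)
        int index = (int) Math.ceil(percentile / 100.0 * sorted.length) - 1;
        return sorted[Math.max(index, 0)] <= slaMillis;
    }

    public static void main(String[] args) {
        long[] samples = { 1200, 1900, 800, 2100, 950, 1700 };
        // SLA: 95th percentile under 2000 ms; prints "SLA met: false"
        // because the slowest sample (2100 ms) falls at the 95th percentile
        System.out.println("SLA met: " + meetsSla(samples, 95.0, 2000));
    }
}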
Prerequisite #2: Know Your Users
The most difficult part of writing user-representative load-testing scripts is discovering how your users actually use your applications. One useful first step is to look at your access logs. I would not recommend doing this by hand, though, because the task is impractical for even a medium-sized Web application. There are plenty of commercial and free tools that will analyze your access logs for you.

Regardless of the software you choose, it is important to perform the analysis and then use that information as the starting point for building your test scripts. However, access logs are somewhat limited in what they report and may give you only part of the picture. You will need a deeper understanding of the application itself, and there are tools that can capture a more holistic view of actual end-user activity on your application.
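As a minimal sketch of that first step, the following tallies the most frequently requested URLs from a common-log-format access log; the file path is a hypothetical placeholder, and real log analyzers do far more (sessions, think times, navigation paths):

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;

public class AccessLogTally {
    public static void main(String[] args) throws IOException {
        Map<String, Integer> hits = new HashMap<>();
        // Hypothetical path; each common-log-format line contains a
        // quoted request such as "GET /checkout HTTP/1.1"
        for (String line : Files.readAllLines(Paths.get("access.log"))) {
            String[] parts = line.split("\"");
            if (parts.length < 2) continue;
            String[] request = parts[1].split(" ");
            if (request.length < 2) continue;
            hits.merge(request[1], 1, Integer::sum);   // count the path
        }
        // Print the ten most requested paths as a seed for script balance
        hits.entrySet().stream()
            .sorted((a, b) -> b.getValue() - a.getValue())
            .limit(10)
            .forEach(e -> System.out.println(e.getValue() + "  " + e.getKey()));
    }
}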
For example, one of the biggest mistakes I saw a client make was assuming that users logged out of the system when they finished their tasks. In my experience, around 20 percent of users don't log out, for whatever reason. This assumption forced the client to reboot application servers every few days because too many lingering sessions accumulated, a scenario that had never been tested.
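You cannot force users to log out, but you can bound the damage. One minimal mitigation, using the standard Servlet API, is to cap session lifetime; the 30-minute value below is an assumption to illustrate the call, and should be tuned to your users' observed think times:

import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpSession;

public class SessionPolicy {
    // Call when establishing the user's session, e.g., at login
    static void applyTimeout(HttpServletRequest request) {
        HttpSession session = request.getSession();
        // Invalidate idle sessions after 30 minutes so abandoned logins
        // don't accumulate until the server must be restarted
        session.setMaxInactiveInterval(30 * 60);   // seconds
    }
}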
Performance Testing Phases
Performance testing must be conducted at the following specific points in the development lifecycle:
· Unit Test
· Application Integration Test
· Application Integration Load Test
· Production Staging Test
· Production Staging Load Test
· Capacity Assessment
Most of us break the development effort into iterations. Each iteration specifies a set of use cases that must be implemented. Typically, the first iteration implements the framework of the application and ensures that the communication pathways between components are functional. Subsequent iterations add functionality to the application and build upon the framework established during the first iteration.
Because iterations are defined by the use cases (or sections of use cases) that they implement, each iteration naturally offers specific criteria for performance testing. The use cases define additional test steps and test variations to the SLAs that quality assurance personnel should test against. Therefore, all of the following performance test phases should be applied to each iteration, for each iteration's use cases.
Unit Test
Performance unit testing must be performed by all developers against their own components before submitting them for integration. Traditional unit tests exercise functionality but neglect performance.

Performance unit testing means that the component is analyzed during its unit test by a memory, code, and coverage profiler. The memory profiler shows the memory impact of the use case and the specific objects the use case leaves in memory. The developer needs to review those objects to verify that they are supposed to remain in memory after the use case terminates.
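A profiler does the real work here, but even a plain unit test can flag gross memory regressions early. The sketch below (the use case, the 512 KB threshold, and the class names are all hypothetical) is a cheap early-warning check, not a substitute for a memory profiler; note that System.gc() is only a hint, so results are approximate:

public class MemoryUnitTest {
    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        System.gc();   // try to settle the heap before measuring
        long before = rt.totalMemory() - rt.freeMemory();

        runUseCase();   // hypothetical: exercise the component under test

        System.gc();
        long after = rt.totalMemory() - rt.freeMemory();
        long retainedKb = (after - before) / 1024;
        System.out.println("Retained after use case: " + retainedKb + " KB");
        // Threshold is an assumption; calibrate per component
        if (retainedKb > 512) {
            throw new AssertionError("Use case retains too much memory");
        }
    }

    static void runUseCase() {
        // ... invoke the component's use case here ...
    }
}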
One common memory issue is "object cycling": objects created and discarded so rapidly that they place excessive demand on the JVM. Each discarded object can be reclaimed only by a garbage collection, so object cycling dramatically increases the frequency of garbage collections.
For example, consider the following:
for( int i=0; i<object.size(); i++ ) {
  for( int j=0; j<object2.size(); j++ ) {
    // A new threshold object is fetched on every inner-loop iteration
    Integer threshold = system.getThreshold();
    if( object.getThing() - object2.getOtherThing() > threshold.intValue() ) {
      // Do something
    }
  }
}
The outer loop iterates over all of the items in object, and the inner loop iterates over object2's items. If object contains 1,000 items and object2 contains 1,000 items, the code inside the inner loop executes one million times (1,000 * 1,000). A new threshold value is obtained on every one of those iterations, and each becomes eligible for garbage collection as soon as its reference goes out of scope.
The code could be rewritten to remove this condition:
// The threshold is now fetched once, before either loop begins
Integer threshold = system.getThreshold();
for( int i=0; i<object.size(); i++ ) {
  for( int j=0; j<object2.size(); j++ ) {
    if( object.getThing() - object2.getOtherThing() > threshold.intValue() ) {
      // Do something
    }
  }
}
Now, the threshold variable is allocated once for all one million iterations. The impact of the threshold variable goes from being significant to being negligible.
Application Integration Test
After components have passed their unit tests and are deemed ready to add to the application, the next step is to integrate them into a single application. After functional integration testing is complete and the application satisfies the functional aspects of the use cases, the next step is to run performance tests against the integrated whole.

This test is not a load test, but rather a small-scale set of virtual users. The virtual users perform the functionality we defined earlier: simulating end users through balanced and representative service requests. The purpose of this test is not to break the application, but to identify issues such as contention, excessive loitering objects, object cycling, and poor algorithms, which can surface in any application when it is first exposed to multiple users. This test is also the first to hold the use case to its SLAs; if the application cannot satisfy its use cases under a light load, there is no point in subjecting it to a load test.
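A minimal sketch of such a small-scale virtual-user test follows; the URL, user count, iteration count, and 2-second SLA are assumptions for illustration, and a real test would replay a balanced mix of requests with think times:

import java.net.HttpURLConnection;
import java.net.URL;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class SmallScaleTest {
    public static void main(String[] args) throws Exception {
        final int virtualUsers = 5;      // light load, deliberately not a load test
        final long slaMillis = 2000;     // assumed SLA for illustration
        ExecutorService pool = Executors.newFixedThreadPool(virtualUsers);
        for (int u = 0; u < virtualUsers; u++) {
            pool.submit(() -> {
                for (int i = 0; i < 20; i++) {
                    long start = System.currentTimeMillis();
                    try {
                        HttpURLConnection conn = (HttpURLConnection)
                            new URL("http://localhost:8080/app/checkout").openConnection();
                        conn.getResponseCode();   // issue the request
                        conn.disconnect();
                    } catch (Exception e) {
                        System.err.println("Request failed: " + e);
                    }
                    long elapsed = System.currentTimeMillis() - start;
                    if (elapsed > slaMillis) {
                        System.out.println("SLA miss: " + elapsed + " ms");
                    }
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.MINUTES);
    }
}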
Application Integration Load Test
This test is a full load test at the user load the application is projected to support in production. It should be performed in two stages:
1. Executed with minimal monitoring
2. Executed with detailed monitoring
In the first test, the goal is to see if the code holds up to its SLAs while under real levels of load. With minimal monitoring enabled, akin to production, we give the application every chance to succeed.
In the second test, we enable detailed monitoring, either for the entire application or in a staged approach (with filters to capture only a subset of service requests) so that we can identify performance bottlenecks. If we identify and fix them at this stage, then they do not have the opportunity to grow larger in subsequent iterations.
This phase is our first opportunity to tune the performance of the application—quite a change from the traditional approach of waiting to tune until the application is finished.
Production Staging Test
Due to the expense of hardware and software licenses, most of us must deploy to a shared environment. This means that while our integration load tests helped us tune our applications, we still need a real-world testing environment that mimics a production deployment.

Just as with the application integration test, this is not a load test, but rather a test to identify resources that applications may compete for. The load is minimal and is defined in the performance test plan. If contention issues arise, deep analysis is required to identify the problem; this is the very reason the test is automated and performed by adding one component at a time to a known-working test bed.
Production Staging Load Test
When your application has successfully integrated into the shared environment, it is time to turn up the user load to reflect production traffic. If your application holds up through this test and meets its SLAs, you can have confidence that you are headed in the right direction.

If it fails here, you need to enable deeper monitoring, filter on your application's service requests, and identify the new bottlenecks. You may also need to retune the new environment to account for the existing applications and load; this may mean resizing shared resources such as the heap, thread pools, and JDBC connection pools.
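For example, heap resizing is controlled by standard JVM startup flags. The sizes and launch command below are placeholders to show the mechanism, not recommendations; size the heap against the utilization you actually measure:

# Values are illustrative; the launch class is hypothetical
java -Xms1024m -Xmx1024m com.example.ServerMain

Setting -Xms equal to -Xmx fixes the heap size, which keeps your application's memory footprint in the shared environment predictable.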
Capacity Assessment
When you've finally made it to this stage, you have a very competent application iteration on your hands. This final stage of performance testing captures the capacity of your application. In this stage, you generate a load test against the entire environment, combining the expected usage of your application with the observed production behavior of the existing environment. In other words, you start with the environment load test and then scale the usage up in the same proportions, testing against all SLAs the entire time.

The capacity assessment gives you the total picture of your application (and environment) so you can assess new architectural considerations.
Formal Capacity Assessment
Rather than reactively buying additional hardware, I advocate a proactive approach to capacity planning that systematically determines the actual capacity of your environment. Often, additional hardware is not the right solution to capacity issues (although it's an effective one if you have the funds). Formal capacity assessment will help you make more educated decisions; it's more than a load test. Capacity assessment requires the following components:
· Balanced Representative Service Requests—You need to know your users, specifically what they do and in what percentage (balance) they do it.
· Hard SLAs—Exact, stakeholder-approved response-time values for each service request, as described earlier.
· Expected Load—This is the number of simultaneous users (and their typical behaviors) that your application needs to support.
· Graduated Load Generator—A load-generation application that ramps up to your expected load in a reasonable amount of time and then continues increasing the load in small, measured steps (a minimal sketch appears after this list).
· SLA Evaluation—This functionality can be built into your load generator or be accomplished through synthetic transactions, but the focus is on monitoring the response time of your service requests against their respective SLAs.
· Resource Utilization Monitor—Capture the performance of application server and operating system resource utilizations to determine saturation points and identify which resources give out first.
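As promised above, here is a minimal sketch of a graduated load generator; the user counts, ramp pacing, and step interval are assumptions for illustration, and real tools add think times, balanced request mixes, and coordinated measurement:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class GraduatedLoadGenerator {
    public static void main(String[] args) throws Exception {
        final int expectedUsers = 100;   // assumed expected production load
        final int stepSize = 10;         // users added per step
        final long stepMillis = 60_000;  // sampling interval per step
        final int maxUsers = 200;        // stop point for the assessment

        ExecutorService pool = Executors.newCachedThreadPool();
        int active = 0;
        // Phase 1: ramp to the expected load in a reasonable time
        while (active < expectedUsers) {
            pool.submit(GraduatedLoadGenerator::virtualUser);
            active++;
            Thread.sleep(500);   // ramp pacing; an assumption
        }
        // Phase 2: crawl upward in measured steps, sampling at each one
        while (active < maxUsers) {
            for (int i = 0; i < stepSize; i++) {
                pool.submit(GraduatedLoadGenerator::virtualUser);
                active++;
            }
            System.out.println("Step: " + active + " users; record response times now");
            Thread.sleep(stepMillis);
        }
        pool.shutdownNow();
    }

    static void virtualUser() {
        // Replay balanced, representative service requests in a loop,
        // recording each response time for SLA evaluation (omitted here)
        while (!Thread.currentThread().isInterrupted()) {
            try { Thread.sleep(1000); } catch (InterruptedException e) { return; }
        }
    }
}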
With all of these in hand, it is time to start loading your application. Configure your load generator to generate your expected usage in a reasonable amount of time. While you are increasing load to the expected usage, capture the response time of your service requests and evaluate them against their SLAs.
Once you reach your expected user load, it is time to determine the size of the steps you want to monitor. The step size is the measured increase in user load between sampling intervals; it defines the granularity, and therefore the accuracy, of your capacity assessment. Pick a time interval at which to increase the load by one step, and record the response times of your service requests at each interval.
Continue this pattern for each service request until the response time of each exceeds its SLA.
For each service request, compile your information and note the capacity of your application at the lowest common denominator: the service request that first consistently misses its SLA.
While this is going on, you need to monitor the utilization of your application server and operating system resources. You need to know the utilizations of your thread pools, heap, JDBC connection pools, other back-end resource connection pools (e.g. JCA and JMS), and caches, as well as CPU, physical memory, disk I/O, and network activity.
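The application server pools are usually exposed through the vendor's console or JMX, but at minimum the JVM's own MXBeans can report heap, thread, and system-load figures. A minimal sketch using only the standard java.lang.management API follows; it samples the JVM it runs in, and connecting to a remote server over JMX (omitted here) works the same way:

import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;
import java.lang.management.OperatingSystemMXBean;
import java.lang.management.ThreadMXBean;

public class ResourceMonitor {
    public static void main(String[] args) throws InterruptedException {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        while (true) {
            MemoryUsage heap = memory.getHeapMemoryUsage();
            // getSystemLoadAverage() returns -1 on platforms that
            // don't support it (e.g., some Windows versions)
            System.out.printf("heap %d/%d MB, threads %d, load avg %.2f%n",
                heap.getUsed() / (1024 * 1024),
                heap.getMax() / (1024 * 1024),
                threads.getThreadCount(),
                os.getSystemLoadAverage());
            Thread.sleep(5000);   // sampling interval; an assumption
        }
    }
}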
A formal capacity assessment report includes the following:
· The current/expected user load against the application
· The capacity of the application under balanced and representative service requests
· The performance of key service requests under current load
· The degradation patterns of each service request
· The system saturation point
· Recommendations
After you have gathered this data and identified the key points (e.g., where SLAs were met, where they were first missed, and where the system saturated), rate the environment against the following categories:
· Extremely Under-utilized System—The system can support more than 50 percent additional load.
· Under-utilized System—The system is under current/expected load, all service requests are meeting their SLAs, and the system can easily support more than 25 percent additional load.
· Nearing Capacity—The application is meeting its SLAs, but its capacity is less than 25 percent above current load.
· Over-utilized System—The application is not meeting its SLAs.
· Extremely Over-utilized System—The system is saturated at the current/expected load.

Performing a capacity assessment on your environment should be required for all of your application deployments and should occur at the end of significant application iterations. Without a capacity assessment, you are running blind, hoping that your applications won't fall over with the next promotion or during the holiday season.
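The categories above reduce to a simple decision rule over your measured results. A sketch, with the percentage thresholds taken directly from the list and the SLA/saturation flags and headroom figure being inputs you measure:

public class CapacityCategory {
    static String classify(boolean meetsSlas, boolean saturatedAtCurrentLoad,
                           double headroomPercent) {
        if (saturatedAtCurrentLoad)   return "Extremely Over-utilized";
        if (!meetsSlas)               return "Over-utilized";
        if (headroomPercent > 50)     return "Extremely Under-utilized";
        if (headroomPercent > 25)     return "Under-utilized";
        return "Nearing Capacity";
    }

    public static void main(String[] args) {
        // Example: SLAs met, not saturated, 30 percent headroom measured
        System.out.println(classify(true, false, 30));   // Under-utilized
    }
}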
I hope this methodology shows that ensuring successful enterprise application performance takes more than tools. It takes an intelligent application of strategy, testing, and analysis to the stages of development. Urge your management to permit this exercise. I can assure you that the resulting calm nights of sleep will more than make up for it.