Have you ever had an out-of-control job bring your system to its knees? It starts out as a normal job--an ordinary query or maybe an SQL, ODBC, or server communication job. Then it suddenly turns into a storage-devouring menace. These troublesome jobs all have one thing in common: excessive temporary storage consumption. And all your users suffer from the resulting poor performance.
Every job uses temporary storage. Programs and internal system objects that support routing steps within a job require it. But there isn't a good way to easily identify which job is causing the problem.
You can try to find a problem job by using the WRKJOB command to view temporary storage information for each job, one job at a time. Unfortunately, this approach is hit-or-miss and takes more time than you have in the midst of a performance crisis.
You can use the WRKSYSSTS command to check the temporary storage used for the entire system. This won't bring you closer to solving the problem, but it can help you avoid exceeding your total disk capacity, which would be a complete disaster.
By using the WRKCLS command to find your class objects, you can try to prevent runaway jobs from spoiling system performance by specifying a job temporary storage threshold at the system class object level. However, if you use this threshold, you risk having critical jobs end if they exceed the storage amount. And that's a risk you may not be able to afford.
Or you can use Robot/SPACE 2.0 from Help/Systems. It monitors the amount of temporary storage used by each job, allows you to set flexible job temporary storage thresholds, notifies you when a threshold has been exceeded, and can hold jobs that exceed your thresholds--automatically (Figure 1).
Figure 1: You can easily control any job's temporary storage.
Robot/SPACE monitors and graphically displays the amount of total unprotected storage used. If a job begins to consume massive amounts of temporary storage, it will drive up your unprotected storage used, and you'll see a noticeable spike on the graph. At this point, you can display the current size of all your jobs and hold or end the out-of-control job.
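The idea of listing every job's current size and picking out the runaway can be sketched in a few lines of Python. This is only an illustration of the ranking step, not how Robot/SPACE works internally: the `JobSnapshot` type, the `largest_consumers` function, and the sample job names and sizes are all hypothetical and made up for the example.

```python
from dataclasses import dataclass


@dataclass
class JobSnapshot:
    """One job's temporary storage at a point in time (hypothetical structure)."""
    name: str
    user: str
    temp_storage_mb: int


def largest_consumers(jobs, top=3):
    """Return the `top` jobs using the most temporary storage, largest first."""
    return sorted(jobs, key=lambda job: job.temp_storage_mb, reverse=True)[:top]


# Made-up sample data; a real tool would read these values from the system.
jobs = [
    JobSnapshot("QZDASOINIT", "APPUSER", 48_000),
    JobSnapshot("NIGHTLY", "OPER", 350),
    JobSnapshot("QPADEV0001", "TOM", 12),
    JobSnapshot("RPTQUERY", "FINANCE", 2_100),
]

for job in largest_consumers(jobs, top=2):
    print(f"{job.name:<12}{job.user:<10}{job.temp_storage_mb:>8} MB")
```

Sorting the whole list once is what replaces the job-by-job search: the candidate for a hold or end request is simply the first entry.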
Robot/SPACE allows you to set a default job temporary storage threshold that, when reached, sends a message to a message queue. You're aware of the situation before system performance is affected. In addition, you can define exceptions for any jobs or subsystems that require thresholds higher or lower than the default.
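A default threshold with per-job and per-subsystem exceptions amounts to a simple lookup. The Python sketch below shows one way such a rule could be resolved. It is an assumption-laden illustration, not the product's actual logic: the names `DEFAULT_THRESHOLD_MB`, `JOB_EXCEPTIONS`, `SUBSYSTEM_EXCEPTIONS`, `threshold_for`, and `check_job` are hypothetical, the sizes are invented, and the choice to let a job exception take precedence over a subsystem exception is assumed rather than taken from Robot/SPACE.

```python
# Hypothetical settings; all values are in megabytes and invented for illustration.
DEFAULT_THRESHOLD_MB = 1_000
JOB_EXCEPTIONS = {"NIGHTLY": 8_000}       # this job is allowed more than the default
SUBSYSTEM_EXCEPTIONS = {"QINTER": 500}    # jobs in this subsystem are allowed less


def threshold_for(job_name, subsystem):
    """Resolve the threshold for a job.

    Assumed precedence: job exception, then subsystem exception, then default.
    """
    if job_name in JOB_EXCEPTIONS:
        return JOB_EXCEPTIONS[job_name]
    if subsystem in SUBSYSTEM_EXCEPTIONS:
        return SUBSYSTEM_EXCEPTIONS[subsystem]
    return DEFAULT_THRESHOLD_MB


def check_job(job_name, subsystem, temp_storage_mb):
    """Return a warning message if the job has reached its threshold, else None."""
    limit = threshold_for(job_name, subsystem)
    if temp_storage_mb >= limit:
        return (f"Job {job_name} in {subsystem} is using {temp_storage_mb} MB "
                f"of temporary storage (threshold {limit} MB)")
    return None


# Made-up readings: the first two stay under their limits, the third does not.
for reading in [("NIGHTLY", "QBATCH", 5_000),
                ("RPTQUERY", "QBATCH", 900),
                ("QPADEV0001", "QINTER", 650)]:
    message = check_job(*reading)
    if message:
        print(message)
```

In this sketch the returned message stands in for the notification sent to a message queue; a lower subsystem limit catches the interactive job well before it would have reached the default.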
When a job temporary storage threshold is reached, Robot/SPACE can hold the job automatically. It can also use Robot/ALERT, Help/Systems' event notification software, to send text, email, or pager messages to any device. Robot/SPACE is designed to eliminate disk storage crises by keeping you informed and letting you choose exactly how you want to handle potential problems.
A job that reaches its threshold also creates a system event. You can display the Job Temporary Storage Threshold History (Figure 2) to see which jobs have exceeded their thresholds and when. And you can see the job's peak size and who submitted it. This information helps you identify potential problems in your operations schedule.
Figure 2: The Job Temporary Storage Threshold History helps you identify performance problems.
Robot/SPACE also features a graphical interface that allows one-click access to its disk management tools, and it lets you set storage thresholds for your monitored auxiliary storage pools (ASPs) and independent auxiliary storage pools (iASPs). New in Robot/SPACE 2.0 is the Critical Storage Investigator (CSI), a problem-solving tool that helps you locate the source of disk storage problems quickly.
Don't let performance problems hold you hostage. Take control.
Check out Help/Systems' offerings in the MC Showcase Buyer's Guide.
Tom Huntington is Vice President of Technical Services for Help/Systems, Inc.
He can be reached at 952.563.1606 or
MC Press Online