
Beyond Disaster Recovery: High Availability's Other Hats


Protection is irrefutably the central purpose of technology that facilitates computing system resiliency, and absolutely nothing offers more comfort in times of desperation than having a mirrored system sitting in a bunker with a live screen ready to take on production operations.

Over the course of two decades, HA has made a significant imprint on IT. Its role, however, has been rather narrowly defined in the minds of most people. Interestingly, over time, HA technology has been queued up and used for "outside the box" tasks, not just business continuation. This flexibility is largely made possible by the harmonic convergence of numerous technologies that HA impinges upon, including hardware, software, and network bandwidth.

In some cases, the architects of commercially available HA packages have created new solutions to specifically handle some of these off-mainstream tertiary tasks, but all of these spin-off systems use the tenets of HA at their core.

Evolving business requirements have pushed the need for uninterrupted system availability to levels that previously seemed unapproachable, given the complexity of computing technologies. System users—who now include customers as well as upstream and downstream supply chain partners—are notoriously impatient and unforgiving. And, in the age of SOA, where computers interact directly with each other, Web services need to be online 100 percent of the time.

Given this fact, one needs to consider all of the circumstances that can render a system inaccessible. Statistics show that users are more often inconvenienced by planned system downtime than by hardware, software, or network failures. Other uses for HA resources have evolved to address these factors.

Software Upgrades

Despite experience and sound planning, the process of upgrading to a new software release level can be an inexact exercise. Unexpected problems can hurt short-term productivity and profitability. While the likelihood of encountering a catastrophic problem when rolling your system up to a new release level is relatively slim, all of the small nagging problems can add up. In some situations, when core business applications don't work and returning to a previous software level is not possible because some data structures have been changed, users can languish for hours.

HA technology facilitates a smooth transition here by allowing a software upgrade to be evaluated in a "dry run" on the mirrored, backup environment before it is rolled out to the production environment. Any problems or conflicts can then be resolved without the risk of downtime or data loss in a live business environment. In fact, many businesses that use HA find that the ability to evaluate upgrades and test all manner of changes and modifications before rolling them out to the production environment is one of the key benefits of this technology.

Test Environments

Testing new software is a central function of software development that regrettably gets neglected. New code is rarely defect-free, and all too frequently, it is the user who finds problems. Time-to-market pressures coupled with the difficulty of containing the cost of new development are two factors that are placing greater pressure on developers to get new software off of the test bench and into production.

In days past, third-party software vendors issued new releases of their code on a relatively infrequent basis. After 10 years, some products had only advanced to version 3.x. Now, vendors release upgrades much more frequently because of market pressure and interoperability issues. Since very few IT shops run major third-party applications that are entirely unmodified, most of us have work to do on our own modifications before we can load a third-party upgrade. Once initial programming is complete, these modifications need to be integrated with the vendor's package and tested.

System i HA tools can be used to supply excellent data for testing. First, test data must contain a representative sample of production data. A test database is typically much smaller than your production database but must be identical with respect to tables, indexes, and constraints. When test data accurately represents production data, you can be certain that a report will run the same way in both environments.

One of the more tedious and time-consuming aspects of maintaining the viability of test data is ensuring its freshness. Software testing changes the data content. As structural and data changes are made, test data loses its usefulness.

When HA technology is used to supply data for testing, data is always fresh, abundant, and relevant. It eliminates the need to continually extract test data, use stale data for testing, or, heaven forbid, test against a live production database.

This type of environment incorporates the use of one System i machine to serve as the primary production machine and another partitioned system to serve as the backup and test machine. A third, smaller system can be used as a dedicated development and test box if such resources are available.

Once the hardware, application software, and communications environment are stable, HA can be used to replicate for business resiliency between the production machine and the backup, and, for the purposes of testing, between the production machine and the development/test system. To do this, your HA solution must support "one-to-many" replication.

To minimize impact on the network, you can replicate to the development/test machine only data that is necessary to test the applications under development. A smaller volume of data speeds up queries and allows testers to move through their scripts more quickly.

Once an adequate amount and variety of data resides on the development/test machine, you can suspend replication and commence with your manual or automated QA/QC protocol.
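
The one-to-many arrangement described above can be pictured with a small model. This is only a sketch, not any HA product's interface: the `Target` class, the `replicate` and `resume` functions, and the table names are all hypothetical, and the assumption that changes held during a suspension are applied when replication resumes is ours. It shows one stream of production changes feeding a backup target that receives everything and a test target that receives only selected tables and can be suspended without affecting the backup.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple

# A change is modeled as (table name, description). Purely illustrative.
Change = Tuple[str, str]


@dataclass
class Target:
    """A hypothetical replication target (backup or development/test)."""
    name: str
    include: Optional[Set[str]] = None  # None means replicate every table
    suspended: bool = False
    applied: List[Change] = field(default_factory=list)
    pending: List[Change] = field(default_factory=list)

    def wants(self, change: Change) -> bool:
        # Selective replication: only tables in the include set are sent.
        return self.include is None or change[0] in self.include


def replicate(change: Change, targets: List[Target]) -> None:
    """Send one production change to every target that wants it."""
    for target in targets:
        if not target.wants(change):
            continue
        if target.suspended:
            target.pending.append(change)  # held while testing is under way
        else:
            target.applied.append(change)


def resume(target: Target) -> None:
    """Assumed behavior: apply held changes, then resume normal replication."""
    target.applied.extend(target.pending)
    target.pending.clear()
    target.suspended = False


backup = Target("backup")
test = Target("dev/test", include={"ORDERS"})
targets = [backup, test]

replicate(("ORDERS", "insert order 1"), targets)
replicate(("PAYROLL", "update salary"), targets)

test.suspended = True  # testers start their QA scripts
replicate(("ORDERS", "insert order 2"), targets)
```

In this model the backup keeps receiving every change while the test target is suspended, which is the property that keeps testing from adding downtime exposure.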

Normally, you'll find your test data to be in bad shape once testing is completed. In ordinary cases, these data files would be deleted and replaced with fresh ones, but in instances where high availability is used, a resync can restore the integrity and freshness of the test data sets. If the HA product involved has self-healing capabilities, a resync is not necessary because this type of technology will compare production data files against those on the test machine and repair them automatically.

In either case, data files on the test machine will resync with the last transactions committed on the production machine, and more importantly, there is absolutely no downtime exposure because the replication to the backup machine is never hindered.
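
To make the compare-and-repair idea concrete, here is a minimal sketch under stated assumptions. It is not how any particular HA product implements self-healing: tables are modeled as plain dictionaries keyed by primary key, and `row_digest` and `repair` are hypothetical names. The sketch compares each test row against its production counterpart by checksum and copies, replaces, or deletes rows until the test copy matches production.

```python
import hashlib
from typing import Dict

Row = Dict[str, object]
Table = Dict[object, Row]  # primary key -> row


def row_digest(row: Row) -> str:
    """Checksum of a row's content, independent of column order."""
    canonical = repr(sorted(row.items()))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def repair(production: Table, test: Table) -> Dict[str, int]:
    """Bring the test table in line with production; return what was changed."""
    stats = {"inserted": 0, "updated": 0, "deleted": 0}
    for key, row in production.items():
        if key not in test:
            test[key] = dict(row)
            stats["inserted"] += 1
        elif row_digest(test[key]) != row_digest(row):
            test[key] = dict(row)
            stats["updated"] += 1
    # Rows created by test scripts do not exist in production; drop them.
    for key in list(test):
        if key not in production:
            del test[key]
            stats["deleted"] += 1
    return stats


production = {1: {"name": "Acme", "balance": 100}, 2: {"name": "Zenith", "balance": 50}}
test_copy = {1: {"name": "Acme", "balance": -999}, 3: {"name": "Junk", "balance": 0}}
result = repair(production, test_copy)
```

Only the production side is read, so a repair pass of this kind would not interfere with replication to the backup machine.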

While HA systems can greatly simplify the extraction and maintenance of test data, they cannot make data safe for testing. Test data can contain sensitive information like Social Security numbers or credit card and bank account numbers. It can also directly reference a specific person's health status. On a daily basis, there are dozens of accounts in the news of how this type of information was mishandled or stolen. Sensitive data that is used for testing must be scrambled to protect the identities of people and companies while still maintaining referential integrity. Many tools on the market today can scramble account numbers while maintaining the data's viability in a test environment.
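
One common way to scramble identifiers without breaking the links between tables is a keyed, deterministic substitution: the same input always produces the same scrambled output, so a customer number in one file still matches the same customer number in another. The sketch below illustrates that idea only and is not a description of any commercial tool. The function names, the key, and the sample rows are hypothetical. It replaces each digit using an HMAC of the original value and leaves separators such as dashes in place. Two different inputs could, in principle, scramble to the same value, so a real tool would need to deal with uniqueness constraints.

```python
import hashlib
import hmac
from typing import Dict, Iterable, List

SECRET_KEY = b"test-environment-only"  # hypothetical key; keep it off the test box


def scramble_digits(value: str, key: bytes = SECRET_KEY) -> str:
    """Deterministically replace every digit in value, keeping its format."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    pool = str(int(digest, 16)).zfill(78)  # a 256-bit number has at most 78 digits
    out = []
    used = 0
    for ch in value:
        if ch.isdigit():
            out.append(pool[used % len(pool)])
            used += 1
        else:
            out.append(ch)  # keep dashes, spaces, and other separators
    return "".join(out)


def scramble_rows(rows: Iterable[Dict[str, str]], columns: List[str]) -> List[Dict[str, str]]:
    """Return copies of rows with the named columns scrambled."""
    result = []
    for row in rows:
        copy = dict(row)
        for column in columns:
            copy[column] = scramble_digits(copy[column])
        result.append(copy)
    return result


customers = [{"acct": "4000123412341234", "name": "A. Customer"}]
orders = [{"acct": "4000123412341234", "total": "10.00"}]
safe_customers = scramble_rows(customers, ["acct"])
safe_orders = scramble_rows(orders, ["acct"])
```

Because the substitution depends only on the key and the original value, `safe_customers` and `safe_orders` still join on `acct` after scrambling.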

Upgrade, Migrate, and Consolidate

Despite rational man's tendency to lean toward denial, every System i shop must periodically upgrade its hardware. By any measure, this type of system maintenance takes time. Typically, a full system swap takes up to 24 hours even when everything goes as planned, and things rarely do, mainly because the specific knowledge needed to plan and execute a complicated migration project—and foretell its pitfalls—rarely exists.

HA can be used to substantially shrink the window of time needed for system upgrades and migrations and lessen the potential for disruption by synchronizing parallel systems during hardware upgrades. Users can work on one system while IT administrators perform tests, audits, and day-end processes on the new machine to verify system integrity prior to going live.

For this to work, you must first have a mirrored image of the production environment on another system. This second system can be the backup machine for your HA environment, or it can be one supplied on a temporary basis by a hardware or HA vendor. Production system users are switched from the primary production box to the backup system prior to taking the primary system offline. Obviously, replication must be suspended until the new machine comes online.

Once the primary production hardware swap has taken place and the application environment has been reestablished and tested, the database can be updated by using the HA tool to resync all changes to the new system. In this scenario users experience downtime only when they are switched from one machine to the other.
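
The order of operations matters in this kind of migration, so it can help to write the sequence down as an explicit runbook. The sketch below is one hypothetical way to do that. The step names are our paraphrase of the procedure described above, the final switch to the new machine is implied by the text rather than spelled out, and the `Cutover` class is illustrative only. It refuses to run steps out of order and marks which steps interrupt users.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Step:
    name: str
    user_downtime: bool  # True if users are interrupted during this step


# Hypothetical runbook paraphrasing the migration sequence in the text.
RUNBOOK = [
    Step("switch_users_to_backup", True),
    Step("suspend_replication", False),
    Step("swap_primary_hardware", False),
    Step("rebuild_and_test_applications", False),
    Step("resync_changes_to_new_system", False),
    Step("switch_users_to_new_primary", True),
]


class Cutover:
    """Tracks progress through the runbook and rejects out-of-order steps."""

    def __init__(self, steps: List[Step]) -> None:
        self.steps = list(steps)
        self.done = 0

    def finished(self) -> bool:
        return self.done == len(self.steps)

    def complete(self, name: str) -> Step:
        if self.finished():
            raise RuntimeError("runbook already finished")
        expected = self.steps[self.done]
        if name != expected.name:
            raise RuntimeError(f"expected step {expected.name!r}, got {name!r}")
        self.done += 1
        return expected


def downtime_steps(steps: List[Step]) -> List[str]:
    """Names of the steps during which users are interrupted."""
    return [step.name for step in steps if step.user_downtime]
```

Calling `downtime_steps(RUNBOOK)` returns only the two switchover steps, which mirrors the point that users are interrupted only when they move from one machine to the other.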

It's worth noting that all of these processes can take place during the day while people are banging away at their desktops and customer requests are being satisfied. When maintenance tasks are done during the day instead of at night, a full cadre of technicians is available to help troubleshoot problems, and revenue-producing operations keep proceeding at full pace during the upgrade.

Here is an example of how one large hospital in California recently used HA for a hardware upgrade. Florencio Alcocer, Senior System Engineer at Adventist Health in Roseville, California, calculated the amount of downtime users would face as they migrated from an AS/400 Model 730 to a System i Model 570 and realized that users could be offline for up to 20 hours. Because Adventist Health receives patients 24x7 and cannot withstand 20 hours of system downtime, Alcocer used high availability in the manner stated above to facilitate this migration. "We cut our downtime to a fraction of what it would have otherwise been and eliminated some staff scheduling problems," Alcocer says.

Server Load Balancing

Server Load Balancing (SLB) is a way to share work between several computers and computer resources. The goal of SLB is to reach optimal resource utilization and accelerate response time. Since load balancing increases the use of interconnected resources, it places an additional burden on the network.

High availability tools can be used to balance a heavy system load between two or more System i servers to accommodate SLB. In fact, HA and SLB go hand in hand because, by definition, they both offer redundant resources to satisfy varying computing requirements.

If an HA software solution is used to accommodate SLB, no dynamic network-based switching device is needed to move user loads from one server to another. Instead, user loads are statically balanced, meaning that the backup machine is used to handle specific tasks in a static manner. For example, you can assign interactive processes to the primary production machine while end-of-day batch processing, reports, or development tasks can be tasked to the backup machine. To accomplish this, replication must be bi-directional.
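
Static balancing of this kind amounts to a fixed assignment table rather than a switching device that makes decisions at run time. As a minimal illustration, with hypothetical server names, job types, and function names, the assignment described above could be written as follows. The sketch covers only the routing decision; it does not model the bi-directional replication that keeps the two machines consistent.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical static assignment of workload types to servers.
ASSIGNMENTS = {
    "interactive": "PRIMARY",
    "batch": "BACKUP",
    "report": "BACKUP",
    "development": "BACKUP",
}
DEFAULT_SERVER = "PRIMARY"  # assumption: unknown work stays on production


def server_for(job_type: str) -> str:
    """Look up the server a job type is statically assigned to."""
    return ASSIGNMENTS.get(job_type, DEFAULT_SERVER)


def plan(jobs: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group (job name, job type) pairs by the server that should run them."""
    by_server: Dict[str, List[str]] = defaultdict(list)
    for name, job_type in jobs:
        by_server[server_for(job_type)].append(name)
    return dict(by_server)


schedule = plan([
    ("order entry", "interactive"),
    ("day-end close", "batch"),
    ("sales report", "report"),
])
```

Nothing in the table changes in response to load, which is what distinguishes this approach from dynamic, network-based balancing.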

Boost Business Productivity, Profitability

A high availability solution can also uncover and exploit unused value in your business processes and throughout your organization. Because a high availability solution minimizes or completely eliminates planned downtime, it immediately boosts productivity, raises efficiency, and makes an organization more resilient and responsive.

Many organizations have unused opportunities to maximize uptime across the full range of the business and IT infrastructure, including front and back offices, go-to-market processes, partner and channel operations, product development, and information exchange and collaboration.

Start by identifying opportunities where currently unused time can be leveraged for higher productivity, revenue growth, profitability, competitive advantage, and information sharing. Ask IT staff and line managers to identify where planned downtime can interrupt or prevent operations and other business processes from achieving their goals.

It's also likely that planned downtime may be adding hidden costs to productivity and the bottom line across a variety of functions including go-to-market strategies, supply chain management, ERP applications, logistics, collaboration, after-market sales and service, channels, customer analysis, service-level agreements, and off-shoring and outsourcing operations.

More Than You Thought It Was

High availability solutions should be viewed as part of the vital infrastructure that supports all your operations as well as providing a safety net against unexpected IT downtime. In other words, rather than just helping you get out of trouble when things go wrong, a high availability solution can streamline and accelerate your technical, operational, and business processes in the best of times.

Bill Hammond directs Vision Solutions' worldwide product marketing efforts and is responsible for marketing strategy, product branding and messaging, and marketplace and competitive intelligence. He has over 15 years of experience in product marketing, product management, and product development roles in the software industry.


