Business Continuity Plan Refresh
Four feet of snow in a week might be awesome if you run a ski resort, but it causes havoc if you run a college or university campus. That is just the quandary campus leaders in the mid-Atlantic were dealing with in December 2009.
"We couldn't open campus," says Joy Hughes, CIO and vice president for information technology at George Mason University (Va.). "You couldn't drive around."
In addition to learning it's hard to shift that much snow out of the way, Hughes says they realized Blackboard was still running and faculty members familiar with it were still in touch with their students. "That has led us to ask, 'What is the next step? How can we ensure every faculty member can use Blackboard?' " she recalls.
Ensuring learning can continue in the aftermath of a snowstorm or other natural disaster is a new page in GMU's business continuity plan. Hughes believes it was the existence of the plan that allowed the campus to weather the storm in the first place because back-up servers and generators were already in place.
GMU leaders began working on their business continuity plan in 2006 after following what happened to schools in and around New Orleans in the aftermath of Hurricanes Katrina and Rita in 2005. "They've done a great job educating the rest of higher ed," says Hughes, who has attended workshops on business continuity planning presented by leaders from some of those schools.
"When the state of Florida got hit with nine hurricanes, that was the only education people needed," says Bryan Mehaffey, vice president of technology at Ave Maria University (Fla.), who started working on a business continuity plan seven years ago.
"A lot of people forget Katrina passed over Florida before it hit New Orleans," Mehaffey points out. Although the loss of life was nowhere near what Louisiana experienced, Florida suffered $1 billion to $2 billion in damage.
"Recent natural and manmade disasters have underscored the need for these plans," agrees Martin Capurro, director of product management for Qwest Communications, a network service provider. That realization has led many campus leaders to discussions of creating a business continuity plan.
Unfortunately, many schools are still just discussing the need for a plan and not actually making one, says Tony Hernandez, managing director at LECG, a consulting firm serving higher ed. "The mindset has been, 'We haven't needed this before; we can do without it.' " He has seen leaders put their campus at risk by thinking, "We know what we'll do; it's just not documented."
Yet campus leaders might not be able to "do" what they expect they would do during an emergency.
"You need a plan resilient to the key people not being there," says Hughes. During a tabletop scenario in which the GMU data center was flooded, someone responded, "I'll call Bob," but Bob was unavailable and the group was stumped on the next step to take. A large group of people has since been trained, because there is no way of knowing who will be available during an emergency. Everyone on the emergency management team has two back-ups with whom they rotate the leadership position every six months.
"A good business continuity plan is person independent," says Hernandez. "A knowledgeable person should be able to pick it up and use it."
Another common issue is that people think they are protected because they are backing up their data. "They are writing to tape and that is their plan," says Jeff White, an enterprise storage specialist with CDW-G. "That can work, but it's not the best way to do it."
After determining the mission-critical systems and servers, campus leaders must determine how much tolerance there is for downtime. "Recovery Point Objective is how much data can I lose, while Recovery Time Objective is how quickly do I need it back up and running," White explains.
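The two objectives White describes can be checked with simple arithmetic on timestamps. The sketch below is illustrative only: the function names, the backup and outage times, and the four-hour and 48-hour targets are all hypothetical and are not drawn from any campus plan mentioned here.

```python
from datetime import datetime, timedelta


def meets_rpo(last_backup: datetime, failure: datetime, rpo: timedelta) -> bool:
    """Recovery Point Objective: is the window of lost data within tolerance?"""
    return failure - last_backup <= rpo


def meets_rto(failure: datetime, restored: datetime, rto: timedelta) -> bool:
    """Recovery Time Objective: was service back up within tolerance?"""
    return restored - failure <= rto


# Hypothetical outage: a nightly backup at 1 a.m., a failure at 9 a.m.,
# and service restored at 9 a.m. the following day.
last_backup = datetime(2024, 1, 15, 1, 0)
failure = datetime(2024, 1, 15, 9, 0)
restored = datetime(2024, 1, 16, 9, 0)

data_loss = failure - last_backup   # work written since the last backup
downtime = restored - failure       # time the system was unavailable

# Hypothetical targets for a single system.
rpo_ok = meets_rpo(last_backup, failure, rpo=timedelta(hours=4))
rto_ok = meets_rto(failure, restored, rto=timedelta(hours=48))

print(f"Data lost: {data_loss}, within RPO: {rpo_ok}")
print(f"Downtime:  {downtime}, within RTO: {rto_ok}")
```

In this made-up case the nightly backup leaves eight hours of data exposed, which misses a four-hour recovery point target even though the system comes back well inside a 48-hour recovery time target. That gap is the kind of thing a backup schedule alone does not reveal.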
Experts agree that higher ed leaders appear to have a good handle on what their mission-critical systems are, including course management systems so classes can continue, websites for communicating updates, and student records. Access to each of those can be staggered. "Student records have to survive no matter what," says Capurro. But the need for immediate access to student records isn't as high as the need to have class schedules and data up and running when restoring a sense of normalcy.
The world has become so digital that an extended downtime is no longer acceptable, says Fadi Albatal, vice president of product marketing for FalconStor, a provider of digital data backup and storage. "Even in a disaster, people expect services. Our infrastructure today is very resilient." From his perspective, online classes and e-mail are priority systems.
A problem Hernandez sees in determining priority systems is that the task is frequently delegated to the technology department alone. Although the IT department can handle it, they might have different priorities. "Anybody who has an interest in making sure their function can work tomorrow should show up for these discussions," he says.
While advanced technologies are making network continuity more affordable, it still isn't cheap when done properly. That can cause tension among different departments if enough people aren't involved in the early stages.
"The business office will always ask for it tomorrow. The response is, 'We can do that but it will cost X,' " relates Hernandez. "Then everyone takes a breath. They push back to IT and say, 'What can you do?' As you push recovery time back by weeks, it costs less."
A growing higher ed trend to contain costs is partnering with other colleges and universities, both in and out of state, to provide backup services.
Clouding the issue is that people have started confusing business continuity with disaster recovery. According to Hernandez, business continuity is focused on functioning tomorrow, while disaster recovery is about returning to the way things were yesterday. "You have to evaluate the [business] process independent of the technology," he says. "Ask, 'Can we operate for a week with paper forms?'"
Paper might come into play to keep communications running during a crisis. "In a university situation where we are concerned about people's dignity, we don't have a 'runner,' " says Hughes, referring to the common messenger jobs of the past. "But you might need [one] in a disaster situation [if cell phones stop working]."
Like most campuses, GMU maintains an emergency alert version of its webpage. Prior to last year's snowstorm, the server was on campus and six people had the ability to activate the page. Leaders quickly realized the arrangement was vulnerable and the school now has a redundant server and more authorized users.
Identifying a backup command center off the main campus at which campus leaders can congregate is also an important part of a business continuity plan. Hughes' team can fall back to their Prince William Campus, which has served as a staging ground for presidential inaugurations.
In addition to people, data can be sent off site. "Large universities can easily transfer services over to another campus that hasn't been affected," says Albatal. "They can still provide services in adverse conditions."
Lacking a second campus at Ave Maria, Mehaffey has collaborated with a local school district to co-locate equipment to provide backup and communication continuity. When selecting communications vendors, he took into consideration what area businesses and local government were using in order to create stronger partnerships.
"I use Cisco Call Manager because I can take a subscriber, move them to a new location, and still have communication," Mehaffey explains. The technology allows him to relocate to an emergency command center but still use the same phone number and voicemail, making it easier to stay in touch during a crisis.
Voice over IP is revolutionizing emergency communications, says Capurro, adding that CIOs are working with risk management to better understand the new capabilities it provides.
While the technology is new, the dangers are probably old. A risk analysis is an important part of creating or revising a business continuity plan. Hurricane Katrina and the tragedy at Virginia Tech might have made campus leaders realize they need a business continuity plan, but those events aren't necessarily the crises a plan should cover because they are "statistically unlikely," says Hernandez. "You have to understand the context of your college and plan for it, not just react to the news."
Geography comes into play. For example, a wildfire might be a concern for institutions in parts of California, but not those in Pennsylvania.
"You have to plan based on conditions," says Capurro, who has maps that help schools determine where to place a data center as well as what challenges their campus might face. "If you are in a hurricane zone or a flood plain, that really begins to dictate what type of plan you have."
Once developed, a plan should not sit around collecting dust. "The first step is building the plan. The next is implementing it. The third is you have to test it," says White. "Best practices say you should test your plan every quarter so when you do have to activate it you don't have any surprises."
As any CIO knows, testing the data recovery systems can be very disruptive, but that is starting to change. "There are applications now that virtually run a test so the main site doesn't have to go down," says White. Although it is less disruptive, having that functionality can make the system more expensive. "But you still have to test the plan, whatever equipment you have," he says.
In a digital data storage environment, like that offered by FalconStor, backups happen on a continual basis, making testing and recovery much less disruptive, Albatal says. Providers such as Qwest, meanwhile, have monitoring systems in place that should automatically switch traffic to a secondary network when the primary network goes down.
Because of the disruptive nature of network testing, it's important to be alert for other opportunities. Hughes used the move to a new data center in June as a way to test the GMU network and measure how long it took to bring a server back up.
A continuity plan covers more than just what's on campus. "Communication links drop. Spark plugs expire. Cables get cut," says Mehaffey. "We test generators every week and chillers every month." In addition to regular testing, Ave Maria leaders revise their business continuity plan every year as part of their annual audit to adjust for changes in technology and personnel. "It's not a tedious process," Mehaffey says. "If anything, we get to improve it every year."
But the technology isn't the only thing that has to be tested. "You have to get everyone in the room for a tabletop exercise," says Hernandez. "You don't want the first time you break the glass and look at it to be when there is an actual event."
At GMU, every two weeks, the president's executive council meeting is dedicated to emergency management, including testing their assumptions around various scenarios.
"There aren't enough schools doing realistic scenarios," says Hughes. "They won't realize their inaccurate assumptions." Tabletop exercises allow people to discuss actions they should take during a crisis in a stress-free environment. "I've occasionally talked to people who think in terms of Monday through Friday. They haven't thought about what to do on a Saturday," Hughes cautions.