RFC Operational PC Backup Project Report
Remote Backup Test for Loss of Facility
ABRFC - 11 March 2003
Summary
On 11 March 2003, ABRFC conducted a successful operational
backup test that simulated conditions for a total loss of the RFC
facility. The test was conducted by ABRFC personnel at a hotel in
Tulsa, OK, using a portable laptop system. This test proves the
utility of a practical, low-cost backup strategy whereby an RFC
provides its own operational backup in its home metropolitan
area for any failure scenario using any facility with a dedicated
Internet connection and voice phone line available for 24x7 NWS
use. The laptop computer system ran the river forecast model 25
times faster than AWIPS (as measured in average CPU time). Similar
performance improvements were noted across the board for other applications.
From a cold start, the system was made ready for use by a forecaster
in one hour and ten minutes. Cold start is defined as the system
having no model files, no observed data, no files of estimated radar
precipitation, no QPF files and not connected to the Internet. Ready
for use is defined as a completed run of the entire river forecast
system ready for interactive use by a forecaster and data ingest/dissemination
running. Data ingest software was improved significantly from a
previous ABRFC test of a desktop backup system in May 2002. The
changes in data ingest software brought the receipt of observed
height data (river stage, lake elevation, etc.) to an acceptable
level of 98.6% versus the May 2002 level of 67.2%. The test was
totally successful as the laptop computer system was able to host
operations and provide full-featured ABRFC forecast and guidance
products to customers in a timely and transparent manner. The system
ran totally independent of AWIPS. The hydromet situation was fairly
benign during the test with only light precipitation noted in the
ABRFC area of responsibility. Therefore, only representative routine
daily river forecast, flash flood guidance and hydromet discussion
products were issued using the backup system.
System Configuration
Computations were performed on a single Dell laptop
computer with 2.0 GHz CPU, 512 MB memory, Red Hat LINUX Version
7.2 and Informix for LINUX Version 7.31 connected through a four
port router on a 100 Mb leg of a hotel LAN that was connected to
the Internet with “near” T1 bandwidth. Three additional
forecaster “seats” were available by connecting to the
inexpensive router. ABRFC only tested one additional “seat”
using an old 233 MHz, 32 MB memory laptop running Red Hat LINUX
Version 7.0 and “connecting” to the Dell laptop via
SSH login. Therefore only two persons were using the system at one
time during this rather benign weather situation. The IHFS database
is version 5.22 and NWSRFS is Release 22. Data retrieval is via
Internet using programs developed by ABRFC. DPAs are retrieved via
FTP from the NWSTG central product server (tgftp.nws.noaa.gov).
Text products, such as HADS, COOP, and other SHEF products, are
obtained from the SRH data server (www.srh.noaa.gov/data/...) by
opening a raw socket connection, sending an HTTP GET request, and
waiting for a response (i.e. a non-GUI Web client). Two new software
programs were implemented for this test to improve overall availability
of DCP data. These two programs run periodically and check the Informix
database for missing data. If the programs detect missing data,
they then attempt to access those data from the HADS server (dipper.nws.noaa.gov)
or the USGS server (waterdata.usgs.gov/{state}/nwis/current...).
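The retrieval pattern described above for text products (open a raw socket, send an HTTP GET request, wait for the response) might look roughly like the following. This is a minimal sketch, not the ABRFC program: the function names are hypothetical, the use of HTTP/1.0 with "Connection: close" is an assumption made so that the end of the response is simply the server closing the socket, and the host and path are left to the caller because the report does not give the full path.

```python
import socket
from typing import Tuple


def build_get_request(host: str, path: str) -> bytes:
    """Compose a minimal HTTP/1.0 GET request for the given host and path."""
    return (
        f"GET {path} HTTP/1.0\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")


def split_response(raw: bytes) -> Tuple[int, bytes]:
    """Split a raw HTTP response into (status code, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line = head.split(b"\r\n", 1)[0]   # e.g. b"HTTP/1.0 200 OK"
    status = int(status_line.split()[1])
    return status, body


def fetch(host: str, path: str, port: int = 80,
          timeout: float = 30.0) -> Tuple[int, bytes]:
    """Fetch one document over a plain TCP socket (a non-GUI Web client)."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_get_request(host, path))
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:          # server closed the connection: response done
                break
            chunks.append(chunk)
    return split_response(b"".join(chunks))
```

A caller would check the returned status code before handing the body to the SHEF decoding step, and would treat a timeout or refused connection as "data not yet received" rather than as a fatal error.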
Access to Mesonet data is direct-to-server via FTP. An ABRFC developed
program converts raw DPA (radar digital precipitation array) binary
files from UNIX to LINUX (big endian/little endian problem) and
provides the files to the process_dpa program where the standard
nationally supplied radar precip processing software takes over.
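The big endian/little endian issue mentioned above can be illustrated with a short sketch. It is not the ABRFC conversion program. It assumes, purely for illustration, that a file is a flat sequence of fixed-width binary words; a real DPA file may mix fields of different widths, in which case the converter has to follow the actual record layout. The function names and the 4-byte default are hypothetical.

```python
import struct


def swap_words(data: bytes, word_size: int = 4) -> bytes:
    """Reverse the byte order of every fixed-width word in data."""
    if word_size <= 0:
        raise ValueError("word size must be positive")
    if len(data) % word_size != 0:
        raise ValueError("data length is not a multiple of the word size")
    return b"".join(
        data[i:i + word_size][::-1] for i in range(0, len(data), word_size)
    )


def convert_file(src_path: str, dst_path: str, word_size: int = 4) -> None:
    """Write a byte-swapped copy of src_path to dst_path."""
    with open(src_path, "rb") as src:
        swapped = swap_words(src.read(), word_size)
    with open(dst_path, "wb") as dst:
        dst.write(swapped)


# A value written big-endian (as on HP-UX) reads correctly as
# little-endian (as on an x86 LINUX PC) after the swap.
big_endian = struct.pack(">i", 1234)
little_endian = swap_words(big_endian)
value = struct.unpack("<i", little_endian)[0]
```

An alternative design is to leave the files untouched and have the reading program decode each field with an explicit byte order (for example, struct format strings beginning with ">"), which avoids keeping two copies of every file.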
The standard ShefDecoder, OFS_DE and BatchPost routines are utilized.
The P2 radar precip estimation software was ported and utilized.
The standalone MPE software was tested on the laptop and functioned.
MPE was not used operationally during the test, as is the policy
with ABRFC AWIPS operations. The ABRFC versions of xnav, xdat, xsets
and fcst_prog were used for data display, quality control and product
composition. D2D is not utilized. Radar reflectivity and satellite
images are displayed by forecasters via Internet from their favorite
web locations. QPF was processed using NMAP for LINUX. NMAP performed
flawlessly. The national flood outlook product was not produced
on the backup system since ArcView is unavailable for LINUX. Product
dissemination was accomplished using ABRFC developed scripts which
drop products on the SRH server where they are picked up by a backup
office’s LDAD. Products to be disseminated are then ingested
through the LDAD into AWIPS via the standard handleOUP/distributeProduct
AWIPS software used to pull other information into AWIPS.
System Performance Summary
System Maintenance Requirements
In order for
the backup hardware system to be available during a total loss of
facility scenario, it must be stored offsite in a secure manner.
For economic and simplicity reasons, ABRFC chose to keep the backup
system in a “powered down”, off-line state. Therefore,
in order to cold start the system, one must have access to current/recent
NWSRFS fs5files, Informix observed data, QPF and radar precipitation
files. ABRFC provides access to these data by uploading appropriate
files and data periodically to a Southern Region HQ server. Providing
the latter three types of data is straightforward: one simply
moves an appropriate period of data periodically from the RFC operational
system to the server and one is then updated and ready to go. However,
the fs5files are a different problem. The fs5files must be LINUX
files. Therefore, the RFC must either be running NWSRFS on one of
the AWIPS lx PCs or a stand-alone PC (such as is done at ABRFC)
in order to have the files available in LINUX format. Due to inconsistencies
in NWSRFS files, no one has been able to write a program that successfully
converts all fs5files from HP-UX to LINUX. One other alternative
(which ABRFC has rejected for a number of reasons) is to cold start NWSRFS
by defining the entire system from scratch (i.e., via segment defs,
station defs, etc.).
It is extremely important that the backup system have its software
updated periodically. For example, new releases of NWSRFS must be
implemented, updates to local applications must be ported, and so on.
The backup system
must also be tested periodically. ABRFC is implementing a quarterly
backup test schedule for the next year or two. This frequency was
chosen due to the newness of the system and in order for all staff
members to have the opportunity to gain experience and confidence
using the backup system.
Concerns Identified During The Test
There were no
serious problems identified during this test. All concerns can be
corrected with proper checklists and new procedures. One concern
surfaced at the start of the test and emphasized the importance
of updating software throughout the system. When one goes into backup
mode with this system, one must have another office monitor the
Southern Region server using their LDAD in order to act on any product
put there for dissemination to AWIPS. When the test began, ABRFC
notified WGRFC that we were going into backup mode and they should
turn on their LDAD program. WGRFC informed us they could not because
they were performing an AWIPS upgrade that day. Therefore ABRFC
used our own LDAD to serve the function for AWIPS dissemination.
However, we later discovered that our LDAD program was an old version
that had bugs and thus did not provide for dissemination of all
the products sent to the server. A similar problem occurred when
two programs that access hourly precipitation data were not run
during the test because the system software was not updated.
One other concern is the low receipt rate (84.3%) of DPA products.
The problem was the same as in the May 2002 test, i.e., the
“tgftp.nws.noaa.gov” server was sometimes too busy and refused
our connection. ABRFC was unsuccessful in obtaining a special login
to the system from NWSHQ. To address this problem, we have asked
SRHQ to capture DPA products from the SBN and to store them on
their server.
Future Improvements
The current system does not provide for creation of: 1) ABRFC web graphic products
(such as radar precipitation estimates), 2) the Southern Region
River Flood Outlook (RFO) text and graphic products and 3) the national
significant Flood Outlook product (FOP). The capability exists for
graphics creation of products as indicated in item 1 above. If deemed
necessary in the future, software can be ported to create these
graphics directly on the backup laptop because they are not currently
created with ArcView software. The RFO text product (cccESGxxx)
can be ported as well because the GUI and text product creator do
not use ArcView. The FOP cannot be ported directly to the LINUX
laptop because the GUI uses ArcView. An MS-Windows machine would
need to be integrated into the system in order to produce
ArcView-derived graphics. Another category of products that has not been
tested is AHPS ESP-ADP images.
Whatever improvements may be possible, one must consider that the
basic requirements for backup are being met through production of
river forecasts, flash flood guidance and other support products
such as HMD/HCMs. The point is that this is a backup system at this
time and not a replacement system for AWIPS.
Conclusions
This test proves the utility of a practical, low-cost backup strategy
whereby an RFC provides its own operational backup in its home metropolitan
area for any failure scenario using any facility with a dedicated
Internet connection and voice phone line available for 24x7 NWS
use.