Section 9 -- Discussion of Other Options Considered
Each of the three figures in this section illustrates an approach considered for the IHFS infrastructure.
These approaches were generated to allow analysis and discussion on the strengths and weaknesses of each
general approach and they have the level of detail needed to facilitate that purpose only. Figure 10
illustrates the approach that most resembles the chosen architecture.
For the first approach, IHFS applications were described as being one of the following basic types:
· Interactive control and/or processing
· Processing
· Interactive display/editing
· Interactive display
Applications were described as having the following characteristics. They are built
using libraries of independent functions (for example, operations in NWSRFS) and custom main routines.
Data sets are passed between functions by using data access services while control parameters are passed
to functions and status returned. Status was defined to be the information needed at the calling level to
make control decisions.
An application has an execution environment associated with it that defines what run it is in, the input data
sets to use in terms of owners, the owner of output data sets, application versions, and other similar items.
IHFS services are aware of an application's run environment when providing
services to the application. An application can exchange messages with other applications associated with
its run. These messages are status and control parameters. An application can
start another application by supplying the same execution environment and desired control parameters as
the parent application. An application's behavior is affected by its run environment parameters.
IHFS data access services were defined to:
· Isolate applications from physical location and structure of data
· Provide defined logical data structure(s) to the application for both the storage and retrieval of data
· Provide both temporary and persistent data storage
· Filter requested data based on the requesting application's run environment (e.g., owner)
· Annotate stored data with information from the requesting application's run environment
· Manage any buffering of data needed for I/O efficiency within an application while ensuring that
data values are synchronized across all applications in a run
· Manage data archival (and retrieval) and manage the deletion of expired data
In this approach, the main feature of the data access services is that they present a homogeneous interface to
the IHFS applications layer. The applications layer needs to know only the what, where, and when of data to
access data.
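The homogeneous interface described above can be sketched as follows. This is a minimal illustration only, assuming an in-memory store; the names RunEnvironment, DataAccessService, store, and retrieve, and all field names, are hypothetical and do not come from the IHFS design. The sketch shows an application supplying only the what, where, and when of the data, while the service annotates stored data with the run environment and filters retrieved data by owner.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class RunEnvironment:
    """Hypothetical run environment; all field names are illustrative."""
    run_id: str
    input_owners: frozenset  # owners whose data sets this run may read
    output_owner: str        # owner recorded on data sets this run writes


@dataclass
class DataAccessService:
    """Toy in-memory stand-in for the data access services."""
    _records: list = field(default_factory=list)

    def store(self, env, what, where, when, value):
        # Annotate the stored datum with the requesting run environment.
        self._records.append({
            "what": what, "where": where, "when": when, "value": value,
            "owner": env.output_owner, "run_id": env.run_id,
        })

    def retrieve(self, env, what, where, when):
        # The caller gives only what/where/when; owner filtering is implicit.
        return [
            r["value"] for r in self._records
            if (r["what"], r["where"], r["when"]) == (what, where, when)
            and r["owner"] in env.input_owners
        ]


service = DataAccessService()
forecast = RunEnvironment("forecast-01", frozenset({"ops"}), "ops")
calibration = RunEnvironment("calib-01", frozenset({"calib"}), "calib")

# Two runs store a value under the same what/where/when.
service.store(forecast, "stage", "gauge-A", "t0", 3.2)
service.store(calibration, "stage", "gauge-A", "t0", 3.9)

# Each run sees only the data owned by its allowed input owners.
ops_values = service.retrieve(forecast, "stage", "gauge-A", "t0")
calib_values = service.retrieve(calibration, "stage", "gauge-A", "t0")
```

The same request returns different data in different runs because the filtering is driven by the run environment rather than by the application.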
IHFS application communication and control services were defined to:
· Provide the mechanisms that allow IHFS applications to exchange messages and status
· Provide the mechanisms for starting IHFS applications either by other applications or when an
event occurs (events can be when a certain time has arrived, data has arrived, data state changes,
etc.)
· Provide mechanisms for tracking software configuration
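The event-driven start mechanism listed above might look like the following sketch. It is an assumption-laden illustration, not the IHFS design: ControlService, on_event, raise_event, and the event and application names are all hypothetical, and starting an application is reduced to recording the request.

```python
from collections import defaultdict


class ControlService:
    """Toy stand-in for the communication and control services."""

    def __init__(self):
        self._triggers = defaultdict(list)  # event name -> application names
        self.started = []                   # (application, environment, parameters)

    def start(self, app, env, params=None):
        # A real service would launch a process; this sketch only records it.
        self.started.append((app, env, dict(params or {})))

    def on_event(self, event, app):
        # Register an application to be started when the event occurs.
        self._triggers[event].append(app)

    def raise_event(self, event, env, params=None):
        # Start every registered application with the same run environment.
        for app in self._triggers[event]:
            self.start(app, env, params)


control = ControlService()
control.on_event("data_arrived", "quality_check")
control.on_event("data_arrived", "aggregate")
control.raise_event("data_arrived", env={"run_id": "run-1"},
                    params={"source": "AWIPS"})
started_apps = [entry[0] for entry in control.started]
```

An application starting another application directly would call start itself, passing its own run environment and the desired control parameters.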
The concept of an IHFS run was defined. An IHFS run is a collection of one or more IHFS applications
that cooperate to perform an IHFS function such as a forecast run or a model calibration run. The
applications in a run produce outputs that are consistent between the applications. The applications may be
run serially, in parallel, at completely different times, or in some combination. To perform the required
functions of the IHFS, a series of runs are performed:
· Runs that get data from AWIPS and store the data in an IHFS logical data structure annotating the
data with rudimentary quality information.
· Interactive runs that allow forecasters to perform quality assurance functions, including viewing,
editing, and annotating the data.
· Runs to aggregate data into larger data sets, including any data smoothing, averaging, and
estimation of missing data points.
· Runs to calibrate models.
· Runs to explore different forecast methods or procedures.
· Runs to execute models to generate guidance.
· Runs to assist forecasters in the use of guidance products and current observations, etc.
On a typical day, the same series of runs will be executed at the RFC to make data sets from input data for
input into model runs, and model runs will be executed to prepare guidance. Guidance will be made
available to WFOs. Likewise at the WFOs, runs will be executed to prepare input data for evaluation and
to evaluate guidance and the effect of current observations to help forecasters prepare user products.
The second approach changed the architecture of the data access services as shown in Figure 11. The
assumptions for the IHFS application layer remain the same as for the first approach (structure 1). Structure 2 assumptions for
the data access services are based on structure 1, but the interface presented to the IHFS applications layer is
no longer homogeneous. The application layer now needs to be aware of the category of storage used as well
as the what, where, and when of the data to access it. Examples of categories may be Data Base Management
System, NetCDF, or SHEF. The operations allowed and the interface will be different for the different
categories. The location of the data and the physical formats of the data are still hidden from the application.
The assumptions for IHFS Application Communication and Control services and IHFS runs remain the same
as for structure 1.
The third approach changed the architecture of the data access services as shown in Figure 12. A new
application, data ingest - output, was added to the application types. The data access services assumptions
vary from those of structure 1 because applications of this type would be allowed to read and write AWIPS data directly.
These special applications use data access services to load the data into an IHFS data store and retrieve it to
write AWIPS format data, such as guidance and other IHFS outputs. The data access services isolate
applications from the physical location and the structure of the data, providing defined logical data structure(s)
to the application for both the storage and retrieval of data. Except for data ingest - output applications that
function as part of the data access layer, other applications cannot access data that is not ingested into the
IHFS or the output data files.
All other assumptions from structure 1 remain unchanged.
The conclusion was that the structure of the data access services should be based on structure 1, which is a
considerably large implementation. It was determined that structure 2 required user applications to have too much
knowledge of how data are stored. Structure 2, however, might be useful in implementing structure 1 at a lower level. Structure 3
was determined to be a good place to start implementation. Implementing only the relational database to a
uniform data access interface should have good COTS support, and the existing ingest programs could form
the basis of most direct applications.
The question of how applications should be structured was considered separately. The IHFS should support
hydrologic functions across a wide range of application structures; however, the initial IHFS applications
structure supporting current functions can be addressed. Two possibilities were considered: (1) to keep the
same NWSRFS application structure with input decoders, model setup programs, and the preprocessor and
model in the same program with use of an optional GUI, and (2) to keep the same logical structure, but spread
the computation across several processes executed in an IHFS job. A different process for each segment allows
parallel execution where possible, as discussed below.
The NWSRFS Science is organized by segments, where a segment is defined as an ordered set of operations.
There may be dependencies between segments such that one segment may require the outputs from one or
more other segments. Currently, there is no looping or iteration between segments, except in the interactive
case where the forecaster is unsatisfied with some result, changes parameters, and reruns a series of segments.
(One could imagine cases where iteration between segments to reach convergence might be useful, and the
possibility of such behavior should be considered. In fact, such iteration has been implemented within some
segment operations.) The only scientific constraints on the ordering of the execution of segments are the
output-to-input relationships of the segments. In general, a segment can be thought of as a relatively small,
unique geographical area (though by definition it may not be so). This organization is appealing because it
matches an intuitive understanding about how water travels down a watershed. A model structure that matches
the structure of the process modeled makes it easier to understand how the model relates to actual events.
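Because the only ordering constraints are the output-to-input relationships between segments, a valid execution order (and the opportunities for parallel execution) can be derived by a topological sort. The following is a minimal sketch under stated assumptions: the segment names are invented, and the standard-library graphlib module (Python 3.9 or later) is used here for illustration rather than anything in NWSRFS.

```python
from graphlib import TopologicalSorter

# Hypothetical segments: each maps to the segments whose outputs it requires.
segment_inputs = {
    "headwater_a": set(),
    "headwater_b": set(),
    "confluence": {"headwater_a", "headwater_b"},
    "outlet": {"confluence"},
}


def execution_waves(inputs):
    """Group segments into waves; segments in one wave do not depend on each other."""
    sorter = TopologicalSorter(inputs)
    sorter.prepare()  # raises graphlib.CycleError if segments depend on each other cyclically
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # every segment whose inputs are complete
        waves.append(ready)
        sorter.done(*ready)
    return waves


waves = execution_waves(segment_inputs)
```

Segments in the same wave could be assigned to separate processes and run in parallel. Iteration between segments, if it were ever allowed, would appear as a cycle and would need different handling.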
9.1 Data Structure Storage and Synchronization: RFC and WFO
There are several options for how and where data are physically stored. Some
location options are as follows:
· Separately maintained, but same schema.
· One database with a failure and/or backup plan in case of site failure.
· Distributed database with a failure and/or backup plan in case of site(s) failure.
· Separate databases, same schema, with a synchronization scheme such as replication.
Some physical storage options are as follows:
· DBMS for all data accessed by IHFS applications. Native file formats for raw input data and IHFS
output data. IHFS output data stored twice: once in DBMS and once after they are extracted and
formatted in native file format. Input data are ingested and placed in the DBMS before they are
available to IHFS applications.
· Native file formats for raw input data and IHFS external data outputs. DBMS for all intermediate
data sets produced or used by IHFS.
· Native file formats for raw input data and IHFS external data outputs. A mix of DBMS and native
file formats for data sets produced or used by IHFS.
· Native file formats for all data.
9.2 D2D Architectural Issues
The D2D display framework has been adopted by AWIPS as its operator interface. Figure 13 shows the high-level
structure of D2D. When the D2D software is used to produce a display, it
calls routines to implement the display as a series of subroutines. The D2D can initiate and monitor
independent applications; however, these applications do not directly display information on the D2D screens.
Some examples of functions provided by D2D are a common display format and user interface, different map
projections and zoom levels displayed with D2D making the translations, and the looping of a display area
through a series of images.
Some examples of D2D limitations and constraints are that D2D is limited to UNIX hosts (and perhaps to HP-UX);
the D2D display layout is fixed, with 1 large and 4 small display areas; interactive overlays are
limited to a single color; most current depictables expect data in netCDF format; the current set of depictables,
extensions, and menus are not geared to hydrologic forecasting; and there is limited communication to
applications external to D2D.
Maintaining the use of D2D in the architecture implies that the display and interactive functions will need to
be isolated from processing functions and data retrieval functions. This separation would allow changing the
display technology in the future with little impact on the processing and data retrieval functions. However, the
porting of the display-dependent functions, once coded, could be a significant effort. In D2D, the display-
dependent functions are coded in the depictables and extensions.
This discussion raises the question of where to place certain types of display-related processing such as map
projection translations. These functions can be placed in the display drivers, applications, or in a separate
layer between the two. These decisions are made for us if we use D2D or define a system that can easily be
adapted to D2D at a later time. In D2D, these types of functions are coded in depictables. Depictables can be
thought of as display drivers for a particular type of data.
Other technologies that should be considered are Web browsers using Java for display-dependent processing
and Xwindows/Motif using a window/display tool such as Tcl/Tk.
Display technologies for multiplatform environments, display software reuse, integration with AWIPS
operational display flows, flexibility, performance, and display software maintenance are all issues that need
to be considered in deciding how to structure the display and/or operator interface portion of the architecture.
9.3 Informix Architectural Issues
Informix is an RDBMS, supplied with AWIPS, with some object extensions available. AWIPS uses Informix
to store and retrieve information in textual formats. The current river forecasting software uses Informix to
hold various data for use at the WFOs by the WHFS. The RFCs have made use of Informix to hold decoded
input data before they are reformatted and placed into internal NWSRFS data files. Many of these data are the same
at the WFO and RFC, thereby forming a common set of information.
One of the identified problems in enhancing and maintaining the river forecasting system is that several
separate programs use the same information from Informix and any change to the schemas used affects several
programs.
The architectural constraint is that an information and/or data storage management function is needed in
IHFS, and, at some level, this function must be able to use Informix to manage information. Because some
information is contained in files (such as netCDF files) and not in Informix, and because schema changes affect the
information and/or data storage functions, applications must be isolated from changes in the physical and
logical storage structures, data location, and functions used to store/retrieve data/information. Furthermore,
IHFS should not be tied to Informix, but should be able to adjust to a replacement of Informix with other
COTS RDBMS software or with other non-RDBMS technologies.
9.4 Application Data Access Services Function Interface
· Needs to support the Fortran language and other implementation languages (for example, C++).
· Desirable to have COTS support for interface.
· Desirable to have a data definition language for both sides of the interface.
· The logical and physical organization of the data does not have to be the same; therefore, the data
access services need a description of the logical and physical data organizations and rules for
translating between them. It is better to have this information encoded in a language or table
rather than in code.
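As one possible illustration of encoding the translation rules in a table rather than in code, the sketch below keeps the logical-to-physical description in a single data structure that a generic routine interprets. Every name in it (STORAGE_MAP, to_physical, the data set names, table names, and file names) is hypothetical.

```python
# Hypothetical mapping table: every name below is invented for illustration.
STORAGE_MAP = {
    "river_stage": {
        "category": "rdbms",
        "location": "stage_obs",  # table name
        "fields": {"where": "station_id", "when": "obs_time"},
    },
    "gridded_precip": {
        "category": "netcdf",
        "location": "precip.nc",  # file name
        "fields": {"where": "grid_cell", "when": "time"},
    },
}


def to_physical(what, where, when):
    """Translate a logical what/where/when request into a physical description."""
    entry = STORAGE_MAP[what]
    logical = {"where": where, "when": when}
    # Rename each logical selector to the physical field named in the table.
    selectors = {entry["fields"][name]: value for name, value in logical.items()}
    return {
        "category": entry["category"],
        "location": entry["location"],
        "selectors": selectors,
    }


request = to_physical("river_stage", where="gauge-A", when="t0")
```

With this arrangement, a change in physical organization becomes an edit to the table, and the applications and the translation routine are left unchanged.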
9.4.1 Data Retrieval
· Applications must be able to identify the data to be retrieved.
An application is executed within a run environment. Run environment variables affect which data
are identified by adding additional qualifiers such as allowed data owners.
· Data retrieval must be efficient.
9.4.2 Data Storage
· Applications must be able to uniquely identify data to be stored.
· How long data are kept by the IHFS varies.
Some data are persistent, some last for a defined time period, some last only for the execution
of an application, some last until a run is completed, and some last until they are explicitly
deleted or overwritten. The data access services must be able to manage these varying degrees
of persistence.
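The varying degrees of persistence described above could be represented as an explicit lifetime attached to each stored item. The sketch below is illustrative only; Lifetime, StoredItem, is_expired, and purge are hypothetical names, and time is reduced to a plain number.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional


class Lifetime(Enum):
    PERSISTENT = auto()     # kept indefinitely
    TIMED = auto()          # kept for a defined time period
    APPLICATION = auto()    # kept only while the creating application executes
    RUN = auto()            # kept until the run is completed
    UNTIL_DELETED = auto()  # kept until explicitly deleted or overwritten


@dataclass
class StoredItem:
    key: str
    lifetime: Lifetime
    expires_at: Optional[float] = None  # used by TIMED items only
    app_id: Optional[str] = None        # used by APPLICATION items only
    run_id: Optional[str] = None        # used by RUN items only


def is_expired(item, now, finished_apps, finished_runs):
    if item.lifetime is Lifetime.TIMED:
        return item.expires_at is not None and now >= item.expires_at
    if item.lifetime is Lifetime.APPLICATION:
        return item.app_id in finished_apps
    if item.lifetime is Lifetime.RUN:
        return item.run_id in finished_runs
    return False  # PERSISTENT and UNTIL_DELETED are never purged automatically


def purge(items, now, finished_apps=frozenset(), finished_runs=frozenset()):
    """Return the items that should still be kept."""
    return [i for i in items if not is_expired(i, now, finished_apps, finished_runs)]


items = [
    StoredItem("climatology", Lifetime.PERSISTENT),
    StoredItem("guidance", Lifetime.TIMED, expires_at=100.0),
    StoredItem("scratch", Lifetime.APPLICATION, app_id="app-1"),
    StoredItem("run_state", Lifetime.RUN, run_id="run-1"),
    StoredItem("edits", Lifetime.UNTIL_DELETED),
]
kept = purge(items, now=150.0, finished_apps={"app-1"}, finished_runs=set())
kept_keys = [item.key for item in kept]
```

Here the timed item and the application-scoped item are dropped, while the run-scoped item survives because its run has not yet completed.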
9.4.3 Logical Data Storage Organization
Applications must have knowledge of how to identify data; therefore, a logical data storage organization
must be available to them. The logical data organization should be based on hydrologic concepts and needs,
not on the physical storage of the data. Getting logical data storage right is critical to the ability to maintain
and enhance the system in the future.
To uniquely identify data, there must be a set of attributes that uniquely identifies the desired data. The
logical data storage organization defines those attributes, what sets of attributes can be used to request
data, and how data are grouped when a set of attributes does not uniquely identify data.
In general, a single datum is not what is wanted, but a collection of related data. How these collections of
related data can be arranged is also part of the logical data storage organization. One such arrangement
might be an ordered set of data-records, with each data-record containing the same information for differing
time periods.
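A minimal sketch of attribute-based identification follows. It assumes, purely for illustration, that location, data type, and owner form the identifying attributes; the real attribute set would come from the data model. A full set of attributes yields one ordered set of data-records, and a partial set yields several groups.

```python
from collections import defaultdict

# Hypothetical identifying attributes, chosen only for this example.
KEY_ATTRIBUTES = ("location", "data_type", "owner")

records = [
    {"location": "gauge-A", "data_type": "stage", "owner": "ops", "period": 2, "value": 3.4},
    {"location": "gauge-A", "data_type": "stage", "owner": "ops", "period": 1, "value": 3.2},
    {"location": "gauge-A", "data_type": "stage", "owner": "calib", "period": 1, "value": 3.9},
]


def request_data(records, **attributes):
    """Return matching records grouped by full key, each group ordered by time period."""
    groups = defaultdict(list)
    for record in records:
        if all(record[name] == value for name, value in attributes.items()):
            key = tuple(record[name] for name in KEY_ATTRIBUTES)
            groups[key].append(record)
    return {
        key: sorted(group, key=lambda r: r["period"])
        for key, group in groups.items()
    }


# A full set of attributes identifies exactly one ordered set of data-records.
unique = request_data(records, location="gauge-A", data_type="stage", owner="ops")
# A partial set of attributes returns one group per distinct full key.
partial = request_data(records, location="gauge-A", data_type="stage")
```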
The basic issue is: how should an application be able to view hydrologic and related data? Another
consideration is which views of the data can be easily optimized for I/O. A well-planned data model is
needed that must consider likely patterns of data access both in the current modeling approach and for
possible future approaches.