Techniques for troubleshooting various aspects of AvnFPS are addressed in other sections of this manual. The purpose of this section is to gather some of the more likely fault scenarios and troubleshooting techniques into a single location.
A number of server processes keep AvnFPS running. A series of status lights on the TAF Monitor GUI
reports the status of most of these processes. Keep-alive messages are sent among the servers every 30
seconds to set the colors of these status lights. Section 1 of the System Administration Manual,
"System Overview," contains a detailed description of each server. Server hosts may vary between AWIPS
releases to support load-balancing efforts. For AWIPS OB8.3, the servers reside on a single host,
px2f.
If all the server status lights are red, check for:
- A failed name/event server (avnserver)
- A network failure
The command ps -ejHf | grep avnpython lists server processes and any
component threads. In this case, indentation in the rightmost column indicates hierarchy. A
sample listing follows:
px2-wfo:user:ps -ejHf | grep avnpython
fxa 18202 1 18201 18201 0 Apr28 ? 00:00:00 avnpython /awips/adapt/avnfps/OB8.3/py/avninit.py px2f
fxa 18203 18202 18201 18201 0 Apr28 ? 00:06:46   avnpython /awips/adapt/avnfps/OB8.3/py/avnserver.py -d -n px2f
fxa 18224 18202 18201 18201 0 Apr28 ? 00:02:10   avnpython /awips/adapt/avnfps/OB8.3/py/avndrs.py -d -n px2f
fxa 18239 18202 18201 18201 0 Apr28 ? 00:17:02   avnpython /awips/adapt/avnfps/OB8.3/py/avndis.py -d -n px2f
fxa 18250 18202 18201 18201 0 Apr28 ? 00:00:55   avnpython /awips/adapt/avnfps/OB8.3/py/avnxs.py -d -n px2f
Notes:
The command given above shows the actual processes.
To see the threads spawned by those processes,
use the -efL flags with the ps command:
ps -efL | grep avnpython
Some comments follow:
| PID | Comments | "Kill"-able? |
|---|---|---|
| 18202 | avninit daemon | Yes |
| 18203 | avnserver process | Yes |
| 18224 | Data Request Server (avndrs) process | Yes |
| 18239 | Data Ingest Server (avndis) process | Yes |
| 18250 | Transmit Server (avnxs) process | Yes |
Threads of a process will not accept signals from a kill command. You must identify the process that owns the thread and kill that top-level process. Once a parent process terminates, all of its threads will terminate as well.
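As an illustration of reading the listing above, the following minimal Python sketch maps each server name to its top-level PID, which is the PID to pass to kill. It assumes the standard ps -ejf column order (UID, PID, PPID, PGID, SID, C, STIME, TTY, TIME, CMD) and is written for a current Python 3 interpreter, not for the avnpython interpreter shipped with AvnFPS. The function name parse_ps_lines and the constant SAMPLE_PS are illustrative only and are not part of AvnFPS.

```python
import posixpath

# Sample output copied from the ps -ejHf listing in this section.
SAMPLE_PS = """\
fxa 18202 1 18201 18201 0 Apr28 ? 00:00:00 avnpython /awips/adapt/avnfps/OB8.3/py/avninit.py px2f
fxa 18203 18202 18201 18201 0 Apr28 ? 00:06:46   avnpython /awips/adapt/avnfps/OB8.3/py/avnserver.py -d -n px2f
fxa 18224 18202 18201 18201 0 Apr28 ? 00:02:10   avnpython /awips/adapt/avnfps/OB8.3/py/avndrs.py -d -n px2f
fxa 18239 18202 18201 18201 0 Apr28 ? 00:17:02   avnpython /awips/adapt/avnfps/OB8.3/py/avndis.py -d -n px2f
fxa 18250 18202 18201 18201 0 Apr28 ? 00:00:55   avnpython /awips/adapt/avnfps/OB8.3/py/avnxs.py -d -n px2f
"""


def parse_ps_lines(text):
    """Return {server_name: pid} for the avnpython lines of ps -ejHf output."""
    servers = {}
    for line in text.splitlines():
        # Assumed columns: UID PID PPID PGID SID C STIME TTY TIME CMD
        fields = line.split(None, 9)
        if len(fields) < 10:
            continue
        args = fields[9].split()
        # Skip anything that is not an avnpython process (e.g. the grep itself).
        if len(args) < 2 or args[0] != "avnpython":
            continue
        script = posixpath.basename(args[1])       # e.g. "avndis.py"
        name = posixpath.splitext(script)[0]       # e.g. "avndis"
        servers[name] = int(fields[1])
    return servers


if __name__ == "__main__":
    for name, pid in sorted(parse_ps_lines(SAMPLE_PS).items()):
        print(name, pid)
```

On a live system, the text could come from the ps command itself instead of the sample string. The sketch only reports PIDs and leaves the decision to kill a process to the administrator.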
The command netstat -a | grep 9090 can be used to diagnose
connections among the various servers. The name/event server uses port 9090 to
disseminate information about the various servers that are up and running.
Here's an example:
Checking netstat on px2 …
px2-wfo:user:netstat -a | grep 9090
tcp 0 0 px2f-wfo:9090 px2f-wfo:55812 ESTABLISHED
tcp 0 0 px2f-wfo:9090 px2f-wfo:55819 ESTABLISHED
tcp 0 0 px2f-wfo:55819 px2f-wfo:9090 ESTABLISHED
tcp 0 0 px2f-wfo:55812 px2f-wfo:9090 ESTABLISHED
tcp 0 0 px2f-wfo:9090 lx3-wfo:40638 ESTABLISHED
Notes:
The columns are Protocol, Receive Queue, Send Queue, Local Address, Foreign Address, and State.
Five connections to and from the name/event server can be seen in
the px2 listing.
The first four established connections show connections between the servers
on px2 and the name/event server.
There is an additional connection to a process on
lx3. Most likely, this is
an instance of the AvnWatch GUI.
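The column layout described in the notes above can be read mechanically. The following Python sketch, offered only as an illustration, lists the foreign addresses of established connections whose local end is the name/event server port. It assumes the six-column netstat layout given above and that the port is displayed numerically. The names peers_of_port and SAMPLE_NETSTAT are hypothetical.

```python
# Sample output copied from the netstat listing in this section.
SAMPLE_NETSTAT = """\
tcp 0 0 px2f-wfo:9090 px2f-wfo:55812 ESTABLISHED
tcp 0 0 px2f-wfo:9090 px2f-wfo:55819 ESTABLISHED
tcp 0 0 px2f-wfo:55819 px2f-wfo:9090 ESTABLISHED
tcp 0 0 px2f-wfo:55812 px2f-wfo:9090 ESTABLISHED
tcp 0 0 px2f-wfo:9090 lx3-wfo:40638 ESTABLISHED
"""


def peers_of_port(text, port=9090):
    """Return foreign addresses of ESTABLISHED connections whose local port is `port`."""
    peers = []
    for line in text.splitlines():
        fields = line.split()
        # Assumed columns: Protocol, Recv-Q, Send-Q, Local Address, Foreign Address, State
        if len(fields) != 6 or fields[0] != "tcp":
            continue
        local, foreign, state = fields[3], fields[4], fields[5]
        local_port = local.rsplit(":", 1)[-1]
        if state == "ESTABLISHED" and local_port == str(port):
            peers.append(foreign)
    return peers


if __name__ == "__main__":
    for peer in peers_of_port(SAMPLE_NETSTAT):
        print(peer)
```

For the sample listing, the peers are the two local server connections on px2f-wfo and the connection from lx3-wfo.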
Starting the servers for AvnFPS is a little complicated because of the
inter-relationships among them. The utility avninit was developed to
handle these complexities. avninit is a persistent process that
runs on px2 or, in case of failover,
px1 and attempts to restart servers as
needed. If you fix a file or network problem that is preventing a server from
starting, avninit will attempt to restart the downed
server. However, avninit will only attempt 10 restarts of a failed
server in a one-hour span of time.
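To make the restart limit concrete, here is a minimal Python sketch of one way such a limit could work: a sliding one-hour window that allows at most 10 restarts. This is not the avninit source code; whether avninit uses a sliding or a fixed window is an assumption, and the class name RestartLimiter is hypothetical.

```python
from collections import deque


class RestartLimiter:
    """Allow at most max_restarts restarts within any window_seconds span.

    Illustrative only: the sliding-window behavior is an assumption,
    not a description of how avninit is implemented.
    """

    def __init__(self, max_restarts=10, window_seconds=3600.0):
        self.max_restarts = max_restarts
        self.window_seconds = window_seconds
        self._times = deque()  # timestamps of restarts still inside the window

    def allow(self, now):
        """Return True and record the restart if one is permitted at time `now`."""
        # Forget restarts that have aged out of the window.
        while self._times and now - self._times[0] >= self.window_seconds:
            self._times.popleft()
        if len(self._times) >= self.max_restarts:
            return False
        self._times.append(now)
        return True


if __name__ == "__main__":
    limiter = RestartLimiter()
    # One failure per minute: the first ten restarts are allowed, the eleventh is not.
    for minute in range(11):
        print(minute, limiter.allow(minute * 60.0))
```

Under this model, a server that keeps failing stops being restarted after the tenth attempt and becomes eligible again once the oldest attempt is more than an hour old.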
| Important |
|---|
| All AvnFPS servers must run as user fxa; do not try to start them as root. The standard server startup scripts check the user ID before launching the servers. |
The utility avnkill can be used to stop all AvnFPS servers that are running on a host, including avninit. avnkill is persistent: it sends interrupt signals first, then kill signals. It also tries to clean up ill-behaved child processes that refuse to die when the parent dies. Once all servers have stopped running, restart avninit using the remoteServers.sh start command.
To "bounce" a Data Ingest, Data Request, or Transmission Server, identify the top-level process associated with the application and use the kill command to stop it. Within a few seconds, avninit should launch a new instance.
See Section 6: “Logging” of the System Administration Manual for complete information on logs. The following information is distilled from that section.
Log files for AvnFPS OB8.3 server processes are stored in the directory tree
/data/logs/adapt/avnfps, local to the host computer.
The names of the log files are formed from the name of the application and the current day
of the week (e.g., avnmenu_Thu and avndis_Fri).
All applications use collective logging, which means that log file entries from different
instances of an application can be found, interleaved, in a single log file. The following
screen capture shows sample directory listings:
GUI logs on a workstation …
lx2-wfo:user:pwd
/data/logs/adapt/avnfps
lx2-wfo:user:ls
avnclimate_Fri  avnmenu_Mon  avnmenu_Thu  avnqcstats_Tue  avnsetup_Thu  avnwatch_Mon  avnwatch_Thu
avnclimate_Wed  avnmenu_Sat  avnmenu_Tue  avnsetup_Mon    avnsetup_Tue  avnwatch_Sat  avnwatch_Tue
avnmenu_Fri     avnmenu_Sun  avnmenu_Wed  avnsetup_Sun    avnwatch_Fri  avnwatch_Sun  avnwatch_Wed
Server logs on px1/px2 …
px2-wfo:user:pwd
/data/logs/adapt/avnfps
px2-wfo:user:ls
avndrs_Fri  avndrs_Sun  avndrs_Wed  avninit_Sun  avninit_Wed  avnserver_Sat  avnserver_Tue
avndrs_Mon  avndrs_Thu  avninit_Fri  avninit_Thu  avnserver_Fri  avnserver_Sun  avnserver_Wed
avndrs_Sat  avndrs_Tue  avninit_Sat  avninit_Tue  avnserver_Mon  avnserver_Thu
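The naming rule described above (application name plus day of the week) can be expressed as a small Python helper that returns the log file to inspect for a given application and date. This is a sketch only: the function name log_path is hypothetical, and whether AvnFPS picks the day of the week from local time or UTC is not stated here, so the caller supplies the date explicitly.

```python
import datetime
import posixpath

LOG_DIR = "/data/logs/adapt/avnfps"

# Fixed English abbreviations, matching names such as avnmenu_Thu and avndis_Fri.
DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def log_path(app, when=None):
    """Return the expected log file path for application `app` on date `when`.

    If `when` is omitted, today's date on the local clock is used (an assumption).
    """
    if when is None:
        when = datetime.date.today()
    # date.weekday() returns 0 for Monday through 6 for Sunday.
    return posixpath.join(LOG_DIR, "%s_%s" % (app, DAYS[when.weekday()]))


if __name__ == "__main__":
    print(log_path("avndis"))
    print(log_path("avnmenu", datetime.date(2008, 5, 1)))
```

Because the file name carries only the day of the week, a log file presumably covers a single day before its name is reused a week later; check file modification times when reading older entries.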
It is possible for certain unexpected errors to go undetected in log files. One way to test against this possibility is to launch a GUI process from the command line and watch the standard error and standard output streams. Here is a command that will launch the AvnFPS startup menu from the command line:
lx2-wfo:ncfuser:19:/awips/adapt/avnfps/bin/avnstart.sh avnmenu
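When the output needs to be saved for later review instead of watched on the terminal, the same idea can be scripted. The Python sketch below runs a command and returns its exit status, standard output, and standard error as separate strings. The function name run_and_capture is hypothetical, and DEMO_COMMAND is a harmless stand-in used so the example can run anywhere; on a workstation the command would be the avnstart.sh invocation shown above.

```python
import subprocess
import sys

# Stand-in command that writes one line to standard error.
# On an AWIPS workstation this would instead be, for example:
#   ["/awips/adapt/avnfps/bin/avnstart.sh", "avnmenu"]
DEMO_COMMAND = [
    sys.executable,
    "-c",
    "import sys; sys.stderr.write('unexpected error\\n')",
]


def run_and_capture(command, timeout=None):
    """Run `command` and return (exit_status, stdout_text, stderr_text)."""
    completed = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # decode output as text
        timeout=timeout,
    )
    return completed.returncode, completed.stdout, completed.stderr


if __name__ == "__main__":
    status, out, err = run_and_capture(DEMO_COMMAND)
    print("exit status:", status)
    print("stdout:", out)
    print("stderr:", err)
```

Note that a GUI process does not return until the GUI is closed, so the captured text becomes available only after the session ends.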
Data Ingest Servers (instances of avndis) monitor events in AWIPS and decode various data as they become available. For some data types, the term decode means little more than copying a file. Once avndis has put the data into the AvnFPS directory tree, Data Request Servers (instances of avndrs) serve the data to the various GUIs.
Log files for avndrs show data being delivered to GUIs as well as data access problems. See if GUIs are connecting to avndrs successfully and receiving data.
Log files for avndis show data arriving and being decoded. Errors are usually easy to spot in these logs.
| Data type | How the data arrive |
|---|---|
| Text (TAFs, METARs) | Database triggers deliver products to /awips/adapt/avnfps/data/text. |
| Lightning Observations | Directory /data/fxa/point/binLightning/netcdf is monitored for new and updated files. |
| Lightning Probability | Directory /data/fxa/img/SBN/netCDF/LATLON/3hr/LTG is monitored for new and updated files. |
| Low-Level Wind Shear | The following directories can be monitored for new files: |
| IFPS Grids | An entry in /etc/cron.d/px2apps issues the job that exports the data from GFESuite ifpServer to /awips/adapt/avnfps/data/grids where the files are monitored by gamin. |
Downloading and installing climatological data is described in Section 2.4: “Climatological Data” of the System Administration Manual.
The URL for the new climatology data files is http://www.mdl.nws.noaa.gov/~avnfps/data/hdf5. Byte counts and checksums are posted for all files.
Once the climatology files are placed in the /data/adapt/avnfps/climate directory and
etc/ids.cfg is updated, the new
climatological applications, WindRose,
CVMonthly and CVTrends, will be able
to access them.
Verify checksums with the md5sum command when files arrive on
the destination host. (The command is md5sum
filename.) If a Windows™
host is used to transfer data, ensure that the OS makes no attempt to
interpret the data. Under certain circumstances,
Windows™ may try to perform end-of-line conversions
on the files. This will be readily detected by a changed checksum value.
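The checksum check can also be scripted. The following Python sketch computes an MD5 digest in the same hexadecimal form that md5sum prints, and shows why an end-of-line conversion is detectable: converting line feeds to carriage return plus line feed changes the bytes and therefore the digest. The function names and the sample byte strings are illustrative only; real climatology files are binary HDF5 files, not text.

```python
import hashlib
import io


def md5_of_stream(stream, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a binary stream, read in chunks."""
    digest = hashlib.md5()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def md5_of_file(path):
    """Return the hexadecimal MD5 digest of the file at `path`."""
    # Binary mode matters: text mode could itself alter line endings.
    with open(path, "rb") as handle:
        return md5_of_stream(handle)


# Illustrative data: the same content before and after an end-of-line conversion.
ORIGINAL = b"record one\nrecord two\n"
CONVERTED = ORIGINAL.replace(b"\n", b"\r\n")

if __name__ == "__main__":
    print(md5_of_stream(io.BytesIO(ORIGINAL)))
    print(md5_of_stream(io.BytesIO(CONVERTED)))  # differs from the line above
```

Compare the value returned by md5_of_file for each downloaded file against the checksum posted with the data.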
Product transmission is covered in depth in Section 1.4: “Transmission Server” of the System Administration Manual. Here are the recommended steps to follow:
Check the log file for avnxs. If a forecast prepared by AvnFPS
was written to the pending queue
/awips/adapt/avnfps/OB8.3/xmit/pending
and the transmission server attempted to access the file, there should be a
corresponding entry, starting with the word SUCCESS or
FAILNNN, where
NNN is the code returned by the system call to
handleOUP.pl.
If the avnxs log indicates a handleOUP.pl failure,
proceed to investigate the handleOUP.pl log file,
/data/logs/fxa/yyyymmdd/handleOUP.log.
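The first step above, scanning the avnxs log for SUCCESS and FAILNNN entries, can be sketched in Python as follows. The exact layout of an avnxs log line is not given in this section, so the sketch only searches each line for the status word. The function name transmission_status is hypothetical, and any log lines passed to it in testing are made-up placeholders, not real avnxs output.

```python
import re

# Matches the word SUCCESS, or FAIL followed by a numeric return code (FAILNNN).
STATUS_RE = re.compile(r"\b(SUCCESS|FAIL(\d+))\b")


def transmission_status(line):
    """Classify one log line.

    Returns ("SUCCESS", None), ("FAIL", code) with the handleOUP.pl return
    code as an integer, or None if the line carries no status word.
    """
    match = STATUS_RE.search(line)
    if match is None:
        return None
    if match.group(1) == "SUCCESS":
        return ("SUCCESS", None)
    return ("FAIL", int(match.group(2)))


def failures(lines):
    """Return the lines that report a handleOUP.pl failure."""
    results = []
    for line in lines:
        status = transmission_status(line)
        if status is not None and status[0] == "FAIL":
            results.append(line)
    return results
```

Any line returned by failures is a cue to continue with the handleOUP.pl log for the same date.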