McIDAS-XCD Administrator's Guide
Version 2024.1
Chapter 4
Troubleshooting
This chapter lists problems that may occur with McIDAS-XCD.
Under each symptom or error message, possible solutions are given. The problems
discussed here are listed below.
-XCD is not receiving real-time data - users report
no real-time data, or the ingestor status display is red
- Verify with your LDM Administrator
that the LDM is ingesting data. Verify that startxcd.k is
running under the user oper. If it is not,
see Starting
and Stopping McIDAS-XCD for information on starting -XCD. If
you are filing GRIB information into a real-time MySQL/MariaDB database, verify
with your MySQL/MariaDB Administrator that mysql is
running. Contact your System Administrator to add all three of these
(the LDM, startxcd.k, and mysql) to your machine's start-up process.
- The file system may be full. You can use the McIDAS-XCD
commands DELWXT, QRTMDG,
and xcdscour to delete older text, point, grid,
and GRIB files and to clean the real-time GRIB database. Do not delete any
files for the current day. See Configuring Data
Removal for information on configuring cron and
the McIDAS-X Scheduler to remove McIDAS-XCD data files.
- If the file system is not full, check the LDM configuration
file pqact.conf to
make sure it is calling ingebin.k and ingetext.k.
If this is OK, check the LDM configuration file ldmd.conf to
see that it is making the proper requests.
- Check the /home/oper/xcd.log file for any useful information that may help to resolve the problem.
- Check for an obstruction in the antenna and verify that all receiving hardware is working properly.
- Contact your source provider to see if they are having a problem with the broadcast.
- If running MySQL/MariaDB 8.0 or later, check whether binary logging for the database
is disabled. As noted in the gribadmin command documentation, we recommend disabling it because it
is not useful here and it can consume large amounts of disk space (possibly an entire file system) by logging millions of
INSERT into file_list entries per day.
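Two of the checks above (a full file system, and a pqact.conf that no longer calls the ingestors) lend themselves to a quick script. The following is a minimal sketch, assuming a POSIX shell; the function names fs_use_pct and missing_ingestors are hypothetical and are not part of McIDAS-XCD or the LDM, and the location of pqact.conf must be supplied for your own installation.

```shell
# Hypothetical helpers; not part of McIDAS-XCD or the LDM.

# Print the percentage of space used on the file system holding directory $1.
fs_use_pct() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Print each ingestor name (ingebin.k, ingetext.k) that file $1 does not
# mention on any non-comment line. No output means both are referenced.
missing_ingestors() {
    for name in ingebin.k ingetext.k; do
        grep -v '^[[:space:]]*#' "$1" | grep -q -F "$name" || echo "$name"
    done
}
```

After defining the functions in your shell, run fs_use_pct with a data directory such as ~oper/mcidas/data, and missing_ingestors with the path to your pqact.conf. The binary logging setting can usually be inspected from the mysql client with SHOW VARIABLES LIKE 'log_bin'; the credentials that requires depend on your installation.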
-XCD is not creating McIDAS Grid files
The DMGRID GRIB decoder can't file grids if it can't find RTMODELS.CFG,
which contains information about real-time grid file locations.
The file RTMODELS.CFG should reside
in ~oper/mcidas/data when McIDAS-XCD is installed
correctly. Either the decoder can't reach the file or it is missing. If it's
missing, recreate the file or copy a new version of the default file from
~oper/mcidas/xcdversion/data/RTMODELS.CFG
(where version is the version number, e.g., 2024.1).
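The recovery described above can be sketched as a small shell function. This is only an illustration, assuming a POSIX shell; the name restore_if_missing is hypothetical, and you must substitute the real version number into the default file's path yourself.

```shell
# Hypothetical helper; not part of McIDAS-XCD.
# Copy default file $2 to $1 only when $1 is missing; never overwrite.
restore_if_missing() {
    target=$1
    default=$2
    if [ -r "$target" ]; then
        echo "present: $target"
    elif [ -f "$target" ]; then
        # The file exists but this user cannot read it.
        echo "unreadable: $target"
        return 1
    elif [ -f "$default" ]; then
        cp "$default" "$target" && echo "restored: $target"
    else
        echo "no default found: $default"
        return 1
    fi
}
```

As user oper, pass ~oper/mcidas/data/RTMODELS.CFG as the first argument and the default file from the xcdversion directory as the second. An "unreadable" result points to the case where the decoder can't reach the file rather than the file being missing.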
-XCD is not writing GRIB files
From a McIDAS-X
session running under the oper account, run the
REDIRECT LIST command to check for any file redirections.
Look for the *.gr* and
*.bu* directory
entries and verify that these directories exist on your workstation. If
they do not exist, create them. Once this is done, -XCD will automatically
start writing GRIB1 and GRIB2 files to these directories, which in turn allows
data to be written to the database. Users should then be able to access
the GRIB data.
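Once you have read the directory names from the REDIRECT LIST output, a short helper can report which of them are absent. This is a minimal sketch, assuming a POSIX shell; the name missing_dirs is hypothetical, and the directories are passed by hand because the sketch does not attempt to parse REDIRECT LIST output.

```shell
# Hypothetical helper; not part of McIDAS-X or McIDAS-XCD.
# Print each argument that is not an existing directory.
missing_dirs() {
    for dir in "$@"; do
        [ -d "$dir" ] || echo "$dir"
    done
}
```

Any path the function prints can then be created with mkdir -p as user oper.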
-XCD is not writing to the real-time GRIB MySQL/MariaDB database
From a Unix shell prompt under the oper account,
run the gribadmin command to check the most recent
record filed in the mcrtgrib database.
- If MySQL/MariaDB is installed properly and the mcrtgrib database
was created, the most recent record filed in the database will be listed.
The modtime is the last line of the record. If the GRIB file information
is being filed correctly, this should be today's date and a very recent
time.
- If no record is returned, or if the modtime is
not recent, this means that the mcrtgrib database
exists, but no data is being written to it. Verify with your -X and -XCD
Administrator that both packages were compiled with the -mysql flag.
- If an error message containing "Can't connect
to local MySQL server" is displayed, this means that MySQL/MariaDB is installed
on your machine, but that it was not started. Contact your MySQL/MariaDB Administrator
to restart MySQL/MariaDB.
- If an error message from gribadmin containing "mysql: not found" is
displayed, this means that the gribadmin command
cannot find the mysql executable. Confirm
that mysql is in the PATH of the user oper.
- If an error message from gribadmin containing "Access denied for user ''@'localhost' to database 'mcrtgrib'" is
displayed, this means that MySQL/MariaDB is running, but the mcrtgrib database
was never created. See Configuring the GRIB Decoders for
more information on creating the mcrtgrib database.
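The cases above amount to a lookup from message text to likely cause, which can be sketched as a shell function. This is an illustration only, assuming a POSIX shell; the name diagnose_gribadmin is hypothetical, and it matches only the message fragments quoted in this section, so other output falls through to the generic cases.

```shell
# Hypothetical helper: read captured gribadmin output on stdin and print
# the likely cause, using only the messages quoted in this section.
diagnose_gribadmin() {
    output=$(cat)
    case $output in
        *"Can't connect to local MySQL server"*)
            echo "MySQL/MariaDB is installed but not started" ;;
        *"mysql: not found"*)
            echo "mysql executable is not in PATH" ;;
        *"Access denied for user"*"to database 'mcrtgrib'"*)
            echo "mcrtgrib database was never created" ;;
        "")
            echo "no record returned: check the -mysql compile flag" ;;
        *)
            echo "record returned: confirm the modtime is recent" ;;
    esac
}
```

Pipe the combined standard output and standard error of the gribadmin invocation you normally use into the function. For the "not in PATH" case, running command -v mysql as user oper confirms whether the executable can be found.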
-XCD data is garbled or missing
When more than one ingestor is trying to read the same circuit,
text data may be missing or text output garbled. If you are decoding grids,
grids may be missing.
Only one ingetext.k process should
be running for each text circuit, and only one ingebin.k process
should be running for each binary circuit (e.g., one ingebin.k process
running for GRIB data while another is running for NEXRAD data). Check the
number of ingetext.k and ingebin.k processes
and -XCD data monitors running. At a Unix shell prompt, as user oper,
Type: ps -ef | grep DM
Type: ps -ef | grep inge
If there are multiple processes running for
one of the circuits or one of the data monitors, stop and restart
-XCD and the LDM. Some data may be lost while the processes are stopped.
- Stop -XCD and kill any remaining data monitors (DM*),
if necessary. At a Unix shell prompt, as user oper,
Type: xcdadmin stop (you will need to wait for the data monitors to exit cleanly)
Type: ps -ef | grep DM
Type: kill -9 process_id (if necessary)
- Stop the LDM and kill any remaining processes,
if necessary. At a Unix shell prompt, as user oper,
Type: ldmadmin stop (stops the INGEBIN and INGETEXT processes)
Type: ps -ef | grep inge
Type: kill -9 process_id (if necessary)
- Restart the LDM and -XCD by running
the two commands below, in order.
Type: ldmadmin start
Type: xcdadmin start
If this process doesn't work, check for an obstruction in
the receiving antenna.
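Spotting duplicates by eye in ps output is error-prone, so the check can be sketched as a filter. This is a minimal sketch, assuming a POSIX shell; the name duplicate_xcd_procs is hypothetical. It also assumes that data monitor names start with DM followed by an uppercase letter, and that two ingestors on the same circuit were started with identical command lines, which may not hold on your system.

```shell
# Hypothetical helper: read one command line per row on stdin and print
# every ingestor or data monitor command line that occurs more than once.
# Assumes duplicates share an identical command line (see lead-in).
duplicate_xcd_procs() {
    grep -E 'inge(text|bin)\.k|DM[A-Z]' | sort | uniq -d
}
```

As user oper, after defining the function,
Type: ps -eo args= | duplicate_xcd_procs
No output suggests that no ingestor or data monitor is running twice with the same arguments. Several ingebin.k processes with different arguments are expected when more than one binary circuit is in use.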
xcdadmin script is not working
- Verify that you are logged in as user oper.
- Run the command which xcdadmin to verify the location of the xcdadmin script.
- Verify that the oper account has a .profile file.
- Open the xcdadmin file and verify that its xcd_admin_profile variable is set to the oper account's .profile file.
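The last check above can be done without opening an editor. The following is a sketch only, assuming a POSIX shell; the name xcd_profile_setting is hypothetical, and it assumes the variable is set with a plain one-line shell assignment of the form xcd_admin_profile=value, which you should confirm against your copy of the script.

```shell
# Hypothetical helper: print the value assigned to xcd_admin_profile in
# script $1, assuming a plain one-line shell assignment (name=value).
# Surrounding quotes are stripped; variables in the value are not expanded.
xcd_profile_setting() {
    sed -n 's/^[[:space:]]*xcd_admin_profile=//p' "$1" | tr -d "\"'" | head -n 1
}
```

Pass the path reported by which xcdadmin as the argument, then confirm that the printed path is the oper account's .profile file and that the file exists.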
-XCD GRIB filing results in duplicates or data is missing
The volume of data coming over NOAAPORT/LDM has increased steadily over the years, and the increase is even greater if the -XCD site receives CONDUIT data. Typical average throughput for the NGRID and CONDUIT feeds, which provide most of the GRIB data, is about 40 GB per hour. At peak times, the throughput may exceed 100 GB per hour. If the LDM product queue (file ldm.pq) is too small, products in the queue may be overwritten quickly during peak times. The LDM's checksum-based duplicate detection might then determine that a GRIB message was not already filed, resulting in duplicate records in the -XCD database(s). Duplicate entries may therefore appear in the output of a McIDAS-X GRDLIST command for any GRIB dataset.
The steps to prevent the duplicate problem are as follows:
- As user oper or ldm, open the /proc/meminfo file and check the MemTotal and MemAvailable fields. Because ldm.pq is a memory-mapped file, its size cannot exceed the MemAvailable value. For example, if MemAvailable is about 50 GB, your system should be OK with a 40-45 GB ldm.pq file, provided you account for other memory usage on the system.
- As user oper or ldm, open the /home/ldm/etc/registry.xml file.
- Look for the section like this:
<queue>
<path>/data/ldm/ldm.pq</path>
<size>35G</size>
<slots>1000000</slots>
</queue>
- Set the size to the value determined in the first step. We recommend 1000000 slots, so change the slots value if it has not already been changed.
- As user oper or ldm, run the six commands below, in order, to recreate the ldm.pq file with the updated size.
Type: xcdadmin stop
Type: ldmadmin stop
Type: ldmadmin delqueue
Type: ldmadmin mkqueue
Type: ldmadmin start
Type: xcdadmin start
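The sizing arithmetic in the steps above can be sketched as follows. This is a minimal sketch, assuming a POSIX shell on Linux; the function names are hypothetical, and the 80 percent headroom factor is an assumption derived from the 40-45 GB of 50 GB example, not an LDM or McIDAS-XCD rule.

```shell
# Hypothetical helpers; not part of the LDM or McIDAS-XCD.

# Print MemAvailable from meminfo-style file $1 in whole GB
# (the kernel reports kB; 1048576 kB are treated as 1 GB here).
mem_available_gb() {
    awk '$1 == "MemAvailable:" { print int($2 / 1048576) }' "$1"
}

# Print a queue size in GB that leaves headroom below $1 GB available.
# The 80 percent factor is an assumption based on the example above.
suggest_queue_gb() {
    echo $(( $1 * 80 / 100 ))
}

# Print the <size> value found inside the <queue> section of file $1.
queue_size() {
    sed -n '/<queue>/,/<\/queue>/ s:.*<size>\(.*\)</size>.*:\1:p' "$1"
}
```

For example, suggest_queue_gb "$(mem_available_gb /proc/meminfo)" prints a candidate size, and queue_size /home/ldm/etc/registry.xml prints the size currently configured, so the two can be compared before the queue is recreated. Lower the result if other software on the system needs a large share of memory.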
directory ***NEXRAD_PATH*** exists on the workstation
The ***NEXRAD_PATH*** directory is created if the NEXRAD data monitor is enabled and the DIRFILE= value is not set in the file ~oper/mcidas/data/NEXRID.CFG. See Chapter 2's Configuring the NEXRAD Decoder section for instructions to set the DIRFILE= value and other NEXRAD options.
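Whether DIRFILE= has a value can be checked from the shell. This is only a sketch, assuming a POSIX shell and assuming that NEXRID.CFG holds the setting on a single line of the form DIRFILE=value; the name dirfile_is_set is hypothetical.

```shell
# Hypothetical helper: succeed if file $1 has a line that starts with
# DIRFILE= followed by a non-blank value (assumed one-line KEY=value format).
dirfile_is_set() {
    grep -q '^[[:space:]]*DIRFILE=[^[:space:]]' "$1"
}
```

For example, dirfile_is_set ~oper/mcidas/data/NEXRID.CFG || echo "DIRFILE is not set" reports the condition that leads to the ***NEXRAD_PATH*** directory being created.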