The WatchDog is responsible for bootstrapping the MPLS VPN Solution and starting the necessary set of server processes. In addition, the WatchDog monitors the health and performance of each server to ensure it is functioning properly. In the event of a software error that causes a server to fail, the WatchDog automatically restarts the errant server.
This chapter provides the description, syntax, and arguments (listed alphabetically) for the following WatchDog commands: startwd, stopwd, wdclient, wdgui, and wdperf.
This section provides the description and syntax for the startwd command.
The startwd command starts the MPLS VPN Solution WatchDog process. You need to run it manually only when installing new software, when changing the csm.properties file, or when restarting after issuing a stopwd command. The startwd command is run automatically when the machine is rebooted.
Note The Orbix daemon must be running for the startwd command to operate correctly. If the Orbix daemon is not running, you will receive a message indicating that.
startwd
Note The startwd command has no arguments.
Note The location of startwd is: <MPLS VPN Directory>/bin.
This section provides the description and syntax for the stopwd command.
The stopwd command stops the MPLS VPN Solution WatchDog process. Normally this is necessary only when installing new versions of MPLS VPN Solution or changing the csm.properties file. The csm.properties file is reread when the WatchDog is stopped and restarted.
stopwd [-y]
where: -y suppresses the confirmation prompt before shutdown. If -y is not specified, you are prompted with the following message: "Are you absolutely sure you want to stop the watchdog and all of its servers? Other users may be using this system as well. No activity (for example: collections, performance monitoring, provisioning) will occur until the system is restarted." You are then prompted to reply yes or no.
Note The location of stopwd is: <MPLS VPN Directory>/bin.
This section provides the description, syntax, and options (listed alphabetically) for the wdclient subcommands. These subcommands are diagnostic tools. This section also describes the column format of the output of each of the subcommands.
Note The location of wdclient is: <MPLS VPN Directory>/bin.
The following are the wdclient subcommands: group, groups, log, logs, restart, start, status, and stop.
This section provides the description and syntax for the wdclient group subcommand.
The wdclient group subcommand lists the servers in the specified server group. Server groups provide a convenient way to start or stop a group of servers with a single command.
wdclient [-host <hostname>] group <group_name>
where:
-host <hostname> is an optional parameter. <hostname> is the name of the remote host on which the WatchDog is running.
<group_name> is the name of a server group chosen from the list displayed by the wdclient groups command.
This section provides the description and syntax for the wdclient groups subcommand.
The wdclient groups subcommand lists all the active server groups.
wdclient [-host <hostname>] groups
where:
-host <hostname> is an optional parameter. <hostname> is the name of the remote host on which the WatchDog is running.
This section provides the description and syntax for the wdclient log subcommand.
The wdclient log subcommand displays the specified number of lines of the specified log.
wdclient [-host <hostname>] [-poll <seconds>] log <log_name> [<lines>]
where:
-host <hostname> is an optional parameter. <hostname> is the name of the remote host on which the WatchDog is running.
-poll <seconds> is an optional parameter. If <seconds> is a number other than zero, new data is displayed every <seconds> seconds as it becomes available. Specifying zero for <seconds> is the same as not specifying -poll.
<log_name> is the name of a log displayed by the wdclient logs command.
<lines> is an optional parameter. It is the number of lines (1 to 100) at the end of the log to be displayed. The default number of lines is 100.
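The syntax above can be assembled programmatically. The following is a minimal sketch, assuming you want to build the argument list in Python before running it; the function name, the host name `nms1`, and the log name `aggregator` are hypothetical and are not part of the product.

```python
from typing import List, Optional


def build_wdclient_log_args(
    log_name: str,
    lines: Optional[int] = None,
    host: Optional[str] = None,
    poll_seconds: Optional[int] = None,
) -> List[str]:
    """Assemble an argument list that follows the documented wdclient log syntax."""
    # The documented range for <lines> is 1 to 100.
    if lines is not None and not 1 <= lines <= 100:
        raise ValueError("lines must be between 1 and 100")

    args = ["wdclient"]
    if host is not None:
        args += ["-host", host]
    # Zero is documented as equivalent to omitting -poll, so it is dropped.
    if poll_seconds:
        args += ["-poll", str(poll_seconds)]
    args += ["log", log_name]
    # When <lines> is omitted, wdclient applies its own default of 100.
    if lines is not None:
        args.append(str(lines))
    return args


# Hypothetical host and log names, for illustration only.
print(build_wdclient_log_args("aggregator", lines=20, host="nms1", poll_seconds=5))
```

Returning a list rather than a single string keeps each argument separate, so the result can be passed to a process launcher without shell quoting.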
Note The complete history (log) file of all WatchDog servers is in the watchdog subdirectory of the temporary directory as configured in the csm.properties file. This temporary directory is specified during system configuration. If the WatchDog is stopped and restarted, this log file is renamed from server.<log_name> to server.<log_name>.<time_stamp>, where <log_name> is the same as specified in the wdclient log subcommand and <time_stamp> is a time indicator of when this file was created. Then new logs are collected in server.<log_name>. If the WatchDog is not stopped and restarted within a 24-hour period, the log file is automatically renamed with a <time_stamp> and a new file is started. Also, any log file a week old is automatically deleted.
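The renaming and retention rules in this note can be expressed as a short sketch. This is an illustration only, not the WatchDog implementation: the <time_stamp> format shown is hypothetical because the actual format is not given here, and measuring age from the file creation time is an assumption.

```python
from datetime import datetime, timedelta

ROTATE_AFTER = timedelta(hours=24)  # renamed if no WatchDog restart within 24 hours
DELETE_AFTER = timedelta(days=7)    # log files a week old are deleted


def rotated_name(log_name: str, created: datetime) -> str:
    """Return the name a log file receives when it is renamed."""
    # Hypothetical <time_stamp> format; the real format is not documented here.
    return f"server.{log_name}.{created:%Y%m%d%H%M%S}"


def should_rotate(created: datetime, now: datetime, watchdog_restarted: bool) -> bool:
    """A log is renamed on a WatchDog restart or once 24 hours have passed."""
    return watchdog_restarted or now - created >= ROTATE_AFTER


def should_delete(created: datetime, now: datetime) -> bool:
    """A log is deleted once it is a week old (age measured from creation, by assumption)."""
    return now - created >= DELETE_AFTER
```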
This section provides the description and syntax for the wdclient logs subcommand.
The wdclient logs subcommand lists the names of all the logs.
wdclient [-host <hostname>] logs
where: -host <hostname> is an optional parameter. <hostname> is the name of the remote host on which the WatchDog is running.
This section provides the description and syntax for the wdclient restart subcommand.
The wdclient restart subcommand restarts one or more servers. Any dependent servers will also be restarted.
Note It is not necessary to restart servers in a properly functioning system. The wdclient restart command should only be run under the direction of Cisco Support. Restarting an individual server will not read in changes in the csm.properties file. For changes in the csm.properties file to be effective, stop the WatchDog and restart it.
wdclient [-host <hostname>] restart {<server_name> | group <group_name> | all}
where:
-host <hostname> is an optional parameter. <hostname> is the name of the remote host on which the WatchDog is running.
You must choose one of the following arguments:
<server_name> is the name of a server chosen from the list displayed by the wdclient status command. See Table 2-1, "Servers and Their Functions," for server descriptions.
group <group_name> is the term group followed by the name of a server group chosen from the list displayed by the wdclient groups command.
all is all servers.
This section provides the description and syntax for the wdclient start subcommand.
The wdclient start subcommand starts one or more servers. Other servers that depend on the specified server(s) may also start.
Note It is not necessary to stop and start servers in a properly functioning system. The wdclient start command should only be run under the direction of Cisco Support.
wdclient [-host <hostname>] start {<server_name> | group <group_name> | all}
where:
-host <hostname> is an optional parameter. <hostname> is the name of the remote host on which the WatchDog is running.
You must choose one of the following three arguments:
<server_name> is the name of a server chosen from the list displayed by the wdclient status command. See Table 2-1, "Servers and Their Functions," for server descriptions.
group <group_name> is the name of a server group chosen from the list displayed by the wdclient groups command.
all is all servers.
This section provides the description, syntax, and information produced for the wdclient status subcommand.
The wdclient status subcommand lists all the servers and their states. See Table 2-2, "Valid States," for the list of all the states.
wdclient [-host <hostname>] [-poll <seconds>] status
where:
-host <hostname> is an optional parameter. <hostname> is the name of the remote host on which the WatchDog is running.
-poll <seconds> is an optional parameter. If <seconds> is a number other than zero, new status data is displayed every <seconds> seconds as it becomes available. Specifying zero for <seconds> is the same as not specifying -poll.
The Name column provides the name of each of the servers. Table 2-1 provides a list of the servers and a description of the function that each server provides.
| Server | Function |
|---|---|
| | Provides a CORBA front end for the baseline management libraries. |
| | Provides a CORBA front end for provisioning and for downloading service requests. |
| | Downloads router configuration files through the Cisco IP Manager product. |
| | Provides a CORBA front end for SA Agent and Accounting APIs. |
| | Handles delivery of asynchronous events between servers. |
| | Provides topology layout recomputation services for web topology, which is used when selecting certain topology views. |
| | Launches and manages ReportServer processes that generate and provide access to dynamic web reports. |
| | Provides a CORBA front end to the MPLS VPN Solution task repository. |
| | Back end that generates Verify Reports. |
| | Provisioning API CORBA server. |
| aggregator | Aggregates collected datasets periodically. |
| | Web server. |
| | Handles locking for the internal database. |
| | Makes the output of tasks available to you in a browsable format. |
| | Monitors events to provide user-level information. |
| | Gets requests from other data collectors and forwards the requests to the device. Gets the response and sends it to appropriate collectors. |
| | Underlying communication process necessary for ReportServerFactory and LayoutServer to communicate with web clients. |
| | Enables you to schedule tasks immediately or later in time, once or several times. |
| | Catches configuration change traps from routers. |
| | Tracks performance for the machine itself. The data is collected and stored in the internal database only. The data is useful for diagnostics. |
The State column provides the current state of the server. Table 2-2 provides a description of each of the states in normal progression order.
| State | Description |
|---|---|
| | This server has been asked to start, but is waiting for servers it depends on to start. Once all the servers it depends on have started, this server will transition to the state of starting. |
| | This server is currently starting. Once a successful heartbeat occurs, this server will transition to the state of started. |
| | This server is currently started and running. |
| | This server is supposed to be stopped, but it is waiting for servers it depends on to be stopped first. |
| | This server is in the process of stopping in a gentle fashion by notifying the server that it is to stop. |
| | This server is in the process of being killed because either it did not have a way to stop gently or because the gentle stop took too long. |
| | This server is stopped. The WatchDog will either start it again or disable it if it has been frequently dying. |
| | This server is disabled because one or more servers it depends on are disabled. If all servers it depends on are started, this server will automatically start. |
| | This server is disabled and must be manually restarted. |
| | This server is delaying before restarting. There is a short delay after a server stops and before it is restarted again. |
The Gen column provides the generation of the server. Each time the server is started, the generation is incremented by 1.
The Exec Time column provides the date and time the server was last started.
The Pid column provides the UNIX process identifier for each server.
The Success column provides the number of successful heartbeats since the server was last started. Heartbeats are used to verify that servers are functioning correctly.
The Missed column provides the number of missed heartbeats since the server was last started.
A few missed heartbeats could simply indicate the system was busy. However, more than a couple of missed heartbeats per day could indicate a problem. See the logs to diagnose the reason. If a server misses three heartbeats in a row, the server is automatically restarted.
Note Three missed heartbeats in a row is the default for restarting the server. The default number can be reset in the csm.properties file. After three failed attempts to restart in a row, the server is disabled.
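The heartbeat rules described above can be modeled in a few lines. The following is a minimal sketch, not the WatchDog's actual code: the class and method names are hypothetical, and treating the failed-restart limit as a configurable value is an assumption (the text only states that the missed-heartbeat default can be reset).

```python
from dataclasses import dataclass


@dataclass
class HeartbeatTracker:
    """Hypothetical model of the Success and Missed counters and the restart rules."""

    max_missed_in_a_row: int = 3           # documented default
    max_failed_restarts_in_a_row: int = 3  # documented value; configurability is assumed
    success: int = 0
    missed: int = 0
    missed_in_a_row: int = 0
    failed_restarts_in_a_row: int = 0
    disabled: bool = False

    def record_heartbeat(self, ok: bool) -> bool:
        """Record one heartbeat and return True when a restart is due."""
        if ok:
            self.success += 1
            self.missed_in_a_row = 0  # a success breaks the run of misses
            return False
        self.missed += 1
        self.missed_in_a_row += 1
        return self.missed_in_a_row >= self.max_missed_in_a_row

    def record_restart(self, ok: bool) -> None:
        """Record the outcome of a restart attempt."""
        if ok:
            # Success and Missed count heartbeats since the server was last started.
            self.success = self.missed = self.missed_in_a_row = 0
            self.failed_restarts_in_a_row = 0
        else:
            self.failed_restarts_in_a_row += 1
            if self.failed_restarts_in_a_row >= self.max_failed_restarts_in_a_row:
                self.disabled = True
```

Note that occasional misses separated by successful heartbeats never trigger a restart in this model; only consecutive misses do.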
This section provides the description and syntax for the wdclient stop subcommand.
The wdclient stop subcommand stops one or more servers. Other servers that depend on the specified servers will also stop.
Note It is not necessary to stop servers in a properly functioning system. The wdclient stop command should only be run under the direction of Cisco Support.
wdclient [-host <hostname>] stop {<server_name> | group <group_name> | all}
where:
-host <hostname> is an optional parameter. <hostname> is the name of the remote host on which the WatchDog is running.
You must choose one of the following arguments:
<server_name> is the name of a server chosen from the list displayed by the wdclient status command. See Table 2-1, "Servers and Their Functions," for server descriptions.
group <group_name> is the name of a server group chosen from the list displayed by the wdclient groups command.
all is all servers.
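The start, stop, and restart subcommands share the same target syntax: exactly one of a server name, a group, or all. A minimal sketch of how a script might enforce that choice before invoking wdclient follows; the function name, the host name `nms1`, and the group name `collection` are hypothetical.

```python
from typing import List, Optional

ACTIONS = ("start", "stop", "restart")


def build_wdclient_control_args(
    action: str,
    server_name: Optional[str] = None,
    group_name: Optional[str] = None,
    all_servers: bool = False,
    host: Optional[str] = None,
) -> List[str]:
    """Assemble arguments for wdclient start, stop, or restart."""
    if action not in ACTIONS:
        raise ValueError(f"action must be one of {ACTIONS}")
    # The syntax requires exactly one of <server_name>, group <group_name>, or all.
    chosen = [server_name is not None, group_name is not None, all_servers]
    if sum(chosen) != 1:
        raise ValueError("choose exactly one of server_name, group_name, or all_servers")

    args = ["wdclient"]
    if host is not None:
        args += ["-host", host]
    args.append(action)
    if server_name is not None:
        args.append(server_name)
    elif group_name is not None:
        args += ["group", group_name]
    else:
        args.append("all")
    return args
```

As the notes for these subcommands state, run the resulting commands only under the direction of Cisco Support.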
This section provides the description and syntax for the wdgui command. This graphical interface to the WatchDog is a diagnostic tool. This section also describes the column format of the output when you click each of the tabs.
The wdgui command activates the WatchDog user interface. See Figure 2-1, "VPN Solutions Center WatchDog."
The top of the screen provides a list of the names of servers. You can drag and drop the columns of information to rearrange them. The columns of information about the servers are described in the following sections: Name, State, Generation, Exec Time, Pid, Success, and Missed.
The bottom of the screen provides a tab for each of the servers. Click the tab of the server that you want to track to display up to the 250 most recent lines of detailed log information.
wdgui [&]
Note The wdgui command has no arguments. To run it as a background process, use the optional &.

The Name column provides the name of each of the servers. Table 2-3 provides a list of the servers and a description of the function that each server provides.
Note To sort alphabetically, click the column header Name. Uppercase sorts prior to lowercase.
| Server | Function |
|---|---|
| | Provides a CORBA front end for the baseline management libraries. |
| | Provides a CORBA front end for provisioning and for downloading service requests. |
| | Downloads router configuration via the Cisco IP Manager product. |
| | Provides a CORBA front end for SA Agent and Accounting APIs. |
| | Handles delivery of asynchronous events between servers. |
| | Provides topology layout recomputation services for web topology, which is used when selecting certain topology views. |
| | Launches and manages ReportServer processes that generate and serve dynamic web reports. |
| | Provides a CORBA front end to the MPLS VPN Solution task repository. |
| | Back end that generates Verify Reports. |
| | Provisioning API CORBA server. |
| aggregator | Aggregates collected datasets periodically. |
| | Web server. |
| | Handles locking for the internal database. |
| | Makes the output of tasks available to the user in a browsable format. |
| | Monitors events to provide useful information. |
| | Gets requests from other data collectors and forwards the requests to the device. Gets the response and sends it to appropriate collectors. |
| | Underlying communication process necessary for ReportServerFactory and LayoutServer to communicate with web clients. |
| | Enables the user to schedule tasks immediately or later in time, once or several times. |
| | Catches configuration change traps from routers. |
| | Tracks performance for the machine itself. The data is collected and stored in the internal database only. The data is useful for diagnostics. |
The State column provides the current state. Table 2-4 provides a description of each of the states in normal progression order.
| State | Description |
|---|---|
| | This server has been asked to start, but is waiting for servers it depends on to start. Once all the servers it depends on have started, this server will transition to the state of starting. |
| | This server is currently starting. Once a successful heartbeat occurs, this server will transition to the state of started. |
| | This server is currently started and running. |
| | This server is supposed to be stopped, but it is waiting for servers it depends on to be stopped first. |
| | This server is in the process of stopping in a gentle fashion by notifying the server that it is to stop. |
| | This server is in the process of being killed because either it did not have a way to stop gently or because the gentle stop took too long. |
| | This server is stopped. The WatchDog will either start it again or disable it if it has been frequently dying. |
| | This server is disabled because one or more servers it depends on are disabled. If all servers it depends on are started, this server will automatically start. |
| | This server is disabled and must be manually restarted. |
| | This server is delaying before restarting. There is a short delay after a server stops and before it is restarted again. |
The Generation column provides the generation of the server. Each time the server is started, the generation is incremented by 1.
The Exec Time column provides the date and time that the server was last started.
Note To sort from the earliest to the latest date and time, click the column header Exec Time.
The Pid column provides the UNIX process identifier for each server.
The Success column provides the number of successful heartbeats since the server was last started. Heartbeats are used to verify that servers are functioning correctly.
Note To sort from the least number of successful heartbeats to the greatest number of successful heartbeats, click the column header Success.
The Missed column provides the number of missed heartbeats since the server was last started.
A few missed heartbeats could indicate that the system was busy. However, more than a couple of missed heartbeats per day could indicate a problem. See the logs to diagnose the reason. If a server misses three heartbeats in a row, the server is automatically restarted.
This section provides the description, syntax, and report information for the wdperf command. This section also describes the reports that are generated by executing this command and the common information in these reports:
This graphical interface to the WatchDog provides information about system performance and resource utilization.
The wdperf command is a monitoring tool for MPLS VPN Solution that provides reports indicating the % CPU utilization, the % Memory usage, and the amount of virtual memory used by each of the system's servers and user-defined tasks. The reported values are based on performance data gathered by the WatchDog.
wdperf [%cpu|%mem|vmem] [&]
or
wdperf {%cpu|%mem|vmem} [<date>|start] [&]
where:
%cpu is a parameter that causes the Average % CPU Utilization per Hour report to be displayed. This is the default option.
%mem is a parameter that causes the Average % Memory Utilization per Hour report to be displayed.
vmem is a parameter that causes the Average Virtual Memory Utilization per Hour report to be displayed.
<date> is an optional parameter that specifies the date for which performance data will be displayed. The default date is the current date. The format of the date is either mm/dd/yy or mm/dd/yyyy, where mm is the month, dd is the day, and yy or yyyy is the year.
start is an optional parameter that causes the earliest available performance data to be displayed (that is, the repository creation date).
& is an optional parameter that causes wdperf to be run as a background process.
Note The location of wdperf is: <MPLS VPN Directory>/bin.
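A script that wraps wdperf may want to validate the <date> argument before launching the tool. The sketch below accepts the two documented formats and the start keyword; the function name and the `EARLIEST` sentinel are hypothetical. The century chosen for two-digit years is Python's convention and may not match how wdperf interprets mm/dd/yy.

```python
from datetime import date, datetime
from typing import Optional, Union

EARLIEST = "start"  # hypothetical sentinel for "earliest available performance data"


def parse_wdperf_date(arg: Optional[str], today: date) -> Union[date, str]:
    """Interpret the optional <date>|start argument of wdperf."""
    if arg is None:
        return today  # the documented default is the current date
    if arg == "start":
        return EARLIEST
    # Try mm/dd/yy first, then mm/dd/yyyy; a four-digit year fails the first pattern.
    for fmt in ("%m/%d/%y", "%m/%d/%Y"):
        try:
            return datetime.strptime(arg, fmt).date()
        except ValueError:
            continue
    raise ValueError(f"date must be mm/dd/yy or mm/dd/yyyy, got {arg!r}")
```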
For a description of the reports created by the wdperf command, first see explanations of the generic report fields in the "Status Row" and "Filter Information" sections in Chapter 12, "Reports Overview." Additionally, each report has the following information:
The columns of information are as follows:
The information in this pane is:
pid = <####>
where: <####> is the process identifier of the server or task (process) that you highlight in the Results Pane.
start time = localized date, time, and time zone when the server or task (process) that you highlight in the Results Pane started.
Note If the highlighted server or task restarts, multiple lines will be displayed, one line for each time the server or task starts.
From left to right, the bottom task bar includes the following items:
Note If you want to view a report for a specific date, you may want to re-enter the wdperf command with the desired date. This may be preferable to using the <= and => buttons, which only display adjacent days one day at a time.
These reports display the percentage of the CPU that is being occupied by each of the WatchDog's processes. Values less than 20% are displayed in green, those between 20% and 50% are displayed in yellow, and those greater than 50% are displayed in red.
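The color coding can be summarized as a small function. This is an illustrative sketch with a hypothetical function name; the text does not say how values of exactly 20% or 50% are colored, so treating both as yellow is an assumption.

```python
def utilization_color(percent: float) -> str:
    """Map a utilization percentage to the report color described above."""
    if percent < 20:
        return "green"
    # Assumption: the boundary values 20 and 50 fall in the yellow band.
    if percent <= 50:
        return "yellow"
    return "red"
```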
The Average % CPU Utilization per Hour report for the current date is displayed by default, that is, when you do not specify another Metric on the command line (see the "Syntax" section) and you keep the default Aggregate selection on the bottom task bar.
See a sample of the % CPU Utilization report, as shown in Figure 2-2, "% CPU Utilization Report." Note that in this sample report, startwd was executed between 5:00 p.m. (17:00) and 6:00 p.m. (18:00) on January 27, 2000 and performance data collection started at that time.
From this report, you can use the controls in the bottom task bar to navigate to reports displaying other metrics, aggregates, and display periods.

These reports display the percentage of the machine's physical memory that is being used by each of the WatchDog's processes. Values less than 20% are displayed in green, those between 20% and 50% are displayed in yellow, and those greater than 50% are displayed in red.
The Average % Memory Utilization per Hour report for the current date is displayed if you specify %mem on the command line, keep the other command-line defaults (see the "Syntax" section), and keep the default Aggregate selection on the bottom task bar.
See a sample of the % Memory Utilization report, as shown in Figure 2-3, "% Memory Utilization Report." Note that in this sample report, startwd was executed between 5:00 p.m. (17:00) and 6:00 p.m. (18:00) on January 27, 2000 and performance data collection started at that time.
From this report, you can use the controls in the bottom task bar to navigate to reports displaying other metrics, aggregates, and display periods.

These reports display the amount of virtual memory (in kilobytes) allocated to each of the WatchDog's processes. Values are displayed in various color shades to highlight memory usage trends.
The Average Virtual Memory Utilization per Hour report for the current date is displayed if you specify vmem on the command line, keep the other command-line defaults (see the "Syntax" section), and keep the default Aggregate selection on the bottom task bar.
See a sample of the Virtual Memory Utilization report, as shown in Figure 2-4, "Virtual Memory Utilization Report." Note that in this sample report, startwd was executed between 5:00 p.m. (17:00) and 6:00 p.m. (18:00) on January 27, 2000 and performance data collection started at that time.
From this report, you can use the controls in the bottom task bar to navigate to reports displaying other metrics, aggregates, and display periods.

![]()
![]()
![]()
![]()
![]()
![]()
![]()
Posted: Thu Apr 20 16:19:44 PDT 2000
Copyright © 1989-2000 Cisco Systems Inc.