Maintenance Overview

This chapter contains an overview of maintenance and troubleshooting for the Cisco-provided components of your solution. The chapter first describes the general concepts of maintenance and troubleshooting, then presents maintenance strategies, and finally presents system troubleshooting strategies. Maintenance and troubleshooting for the individual components of the system are described in the following chapters.

This chapter describes hardware maintenance; for information on installing or upgrading software, refer to the Cisco Media Gateway Controller Software Release 7 Installation and Configuration Guide. This chapter includes the following sections:

Maintenance and Troubleshooting

As described in the various Cisco telephony solution overviews, the Cisco-provided devices that make up the system include the Cisco Media Gateway Controllers (MGCs), Cisco Signaling Link Terminals (SLTs), LAN switches, and Cisco Media Gateways (MGWs). Both maintenance and troubleshooting of the system begin with looking at the individual devices, then looking at the larger system-level picture.

Maintenance usually consists of the following tasks for each device:

Troubleshooting usually consists of determining the nature of a problem and then isolating the problem to a particular device or component of a device. This is frequently accomplished by using the "checking equipment status" maintenance task. When a problem has been isolated and identified, troubleshooting also consists of fixing the problem, usually by replacing the device or some component of the device.

In other words, although maintenance and troubleshooting are described separately, they are linked, and the maintenance and troubleshooting chapters in this guide frequently refer to each other.

Maintenance Strategy Overview

The basic maintenance strategy for the various components of the solution consists of periodically checking equipment status and removing and replacing solution components as necessary.

Determining the current status of the various components consists of three basic activities: reading LEDs, issuing status queries, and using a GUI NMS.

Reading LEDs

Most Cisco components include light-emitting diode (LED) indicators on the front or rear panels and, in some cases, on both panels. These LEDs indicate the status of the equipment. The specific meaning of each LED on each component is described in the corresponding maintenance sections.

Issuing Status Queries

You can query the status of the system using various commands. The commands that can be used to determine the status of the devices in your system are described in the corresponding maintenance sections.

Using a GUI NMS

You can use a network management system (NMS) with a graphical user interface (GUI), such as CiscoWorks 2000 or Cisco WAN Manager, to determine the operational status of system devices. This is described in detail in the corresponding maintenance sections.

Removing Devices

Procedures for removing defective devices from the system with as little impact on the system as possible are described in the corresponding maintenance section.

Replacing Devices

Reinstating a new or repaired device into the system, again with as little impact on the system as possible, is described in the corresponding maintenance section.

Replacing Components

Swapping out components of a device is a maintenance task used for replacing defective components and for upgrading hardware. The maintenance chapter describing each device includes sections describing how to replace the field-replaceable components of that device.

Troubleshooting Strategy Overview

The Cisco telephony solutions support connections to external switches and to internal components, such as media gateway controllers, signal processors, and trunking gateways.

This is a complex environment involving numerous connections, links, and signaling protocols. When connectivity and performance problems occur, they can be difficult to resolve.

The goal of this chapter is to provide you with a general troubleshooting strategy, as well as information about the tools available for isolating and resolving connectivity and performance problems.

Symptoms, Problems, and Solutions

Problems in a system are characterized by certain symptoms. These symptoms can be general (such as a Cisco SLT being unable to access the SS7 network) or more specific (such as routes missing from the routing table). You can determine the cause of a symptom by using specific troubleshooting tools and techniques. After identifying the cause, you can correct the problem by implementing a solution consisting of a series of actions.

General Problem-Solving Model

A systematic approach works best for troubleshooting. Define the specific symptoms, identify all potential problems that could be causing the symptoms, then systematically eliminate each potential problem (from the most likely to the least likely) until the symptoms are no longer present.

Figure 4-1 illustrates the process flow for the general problem-solving model. This process is not a rigid outline for troubleshooting. It is a guide you can use to successfully troubleshoot a problem.


Figure 4-1: General Problem-Solving Model


The following steps detail the problem-solving process outlined in Figure 4-1:


Step 1   When analyzing a problem, make a clear problem statement. Define the problem in terms of a set of symptoms and the potential causes behind those symptoms.

Identify the general symptoms and then determine what kinds of problems could cause these symptoms. For example, the symptom might be that the EQPT FAIL alarm has become active. Possible causes might be physical problems, a bad interface card, or the failure of some supporting entity (for example, layer 1 framing).

Step 2   Gather the facts you need to help isolate symptoms and their possible causes.

Ask questions of affected users, network administrators, managers, and other key people. Collect information from sources such as network management systems, protocol analyzer traces, output from router diagnostic commands, or software release notes.

Step 3   Consider possible causes based on the facts you gathered. You can also use these facts to eliminate potential causes from your list.

For example, depending on the data, you might be able to eliminate hardware as a cause, allowing you to focus on software. At every opportunity, try to narrow the number of potential causes so that you can create an efficient plan of action.

Step 4   Create an action plan based on the remaining potential causes. Begin with the most likely cause, and devise a plan in which only one variable is manipulated.

This approach allows you to reproduce the solution to a specific problem. If you alter more than one variable simultaneously, identifying the change that eliminated the symptom becomes more difficult.

Step 5   Perform each step of the action plan carefully, while testing to see if the symptom disappears.

Step 6   Whenever you change a variable, gather the results. Generally, you should use the same method of gathering facts that you used in Step 2.

Analyze the results to determine if the problem has been resolved. If it has, then the process is complete.

Step 7   If the problem has not been resolved, you must create an action plan based on the next most likely problem in your list. Return to Step 4 and continue the process until the problem is solved.

Make sure to undo any "fixes" you made in implementing your action plan. Remember that you want to change only one variable at a time.


Note   If you exhaust all the common causes and actions (either those outlined in this chapter or those that you have identified for your environment), your last recourse is to contact your Cisco technical support representative.
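The loop described in Steps 4 through 7 can be sketched in a few lines of Python. This is only an illustration of the model, not part of any Cisco tool: the Cause class, its likelihood field, and the troubleshoot function are hypothetical names introduced here, and the fixes are assumed to be reversible callables that each change exactly one variable.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Cause:
    """One potential cause from Step 3, with a single-variable action plan."""
    name: str
    likelihood: float              # hypothetical ranking value, higher = more likely
    apply_fix: Callable[[], None]  # changes exactly one variable (Steps 4 and 5)
    undo_fix: Callable[[], None]   # reverts that change (Step 7)


def troubleshoot(causes: List[Cause],
                 symptom_present: Callable[[], bool]) -> Optional[str]:
    """Try causes from most to least likely; return the one that cleared the symptom."""
    for cause in sorted(causes, key=lambda c: c.likelihood, reverse=True):
        cause.apply_fix()              # Step 5: perform the action plan
        if not symptom_present():      # Step 6: gather and analyze the results
            return cause.name          # problem resolved, process complete
        cause.undo_fix()               # Step 7: undo the "fix" before the next plan
    return None                        # causes exhausted: contact technical support
```

Because each unsuccessful fix is undone before the next one is tried, only one variable differs from the starting state at any time, which is what makes the eventual solution reproducible.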


System Troubleshooting Tools

This section presents information about the wide variety of tools available to assist you in troubleshooting the system.

Alarms

The media gateway controller software generates alarms to indicate problems with processes, routes, linksets, signaling links, and bearer channels. Refer to the Cisco Media Gateway Controller Software Release 7 Reference Guide for detailed information on system alarms.

Call Traces

Call traces capture call-processing activity by following the call from a specified destination through the media gateway controller engine to see where it fails using the following information:

The results of call traces are signal flow diagrams that you can use for troubleshooting. Call traces are typically used to capture system activity as part of a procedure to clear an alarm. For more information on using call traces, refer to "Troubleshooting the Cisco MGC."

System Logs

The Cisco MGC software generates log files of various operational measurements (OMs) and alarm records. You can use these logs to obtain statistical information about the calls processed by the system and network events like delays or service-affecting conditions. The controller generates the following types of logs:

Refer to "Interpreting Report Files" for more information.

MML Queries

Man-Machine Language (MML) is the command line interface for configuring and managing the Cisco MGC. As a troubleshooting tool, you can use it to retrieve information about system components, and to perform logging and tracing. Refer to the Cisco Media Gateway Controller Software Release 7 Reference Guide for more information.

Cisco Network Management Tools

Cisco offers several network management products that provide design, monitoring, and troubleshooting tools to help you manage your internetwork.

The following Cisco network management tools are useful for troubleshooting internetworking problems: CiscoWorks 2000, Cisco WAN Manager, and the Cisco Media Gateway Controller Node Manager.

CiscoWorks 2000

CiscoWorks 2000 is a series of SNMP-based internetwork management software applications. CiscoWorks applications are integrated on several popular network management platforms and build on industry-standard platforms to provide applications for monitoring device status, maintaining configurations, and troubleshooting problems.

Some of the applications included in CiscoWorks 2000 that are useful for troubleshooting are

The CiscoView application can be used for management of the Cisco SLTs and the LAN switches. Refer to the CiscoWorks documentation for more information.

Cisco WAN Manager

Cisco WAN Manager is part of the Cisco Service Management System of provisioning and management tools for service provider and large enterprise networks. Working at the network and element management level, WAN Manager provides fault-management capabilities handled through the Event Browser, CiscoView, and Configuration Save and Restore features.

You can use Cisco WAN Manager to perform searching, sorting, and filtering operations and to tie events to extensible actions. For instance, Cisco WAN Manager can page someone upon receiving a certain type of SNMP trap. It supports alarm hierarchies to report the root cause of problems to operators and higher-level systems.

Configuration Save and Restore saves a snapshot of the entire network configuration. For disaster recovery, operators can selectively restore configurations from a single node up to the entire network. This restoration ability significantly reduces recovery time in the event of a catastrophic failure.

The Cisco WAN Manager Trivial File Transfer Protocol (TFTP) statistics collection facility offers extensive usage and error collection.

A wide range of statistics are available at the port and virtual channel level to support operations and maintenance including

The Cisco WAN Manager application can be used for management of the Cisco SLTs and the LAN switches.

Cisco Media Gateway Controller Node Manager

The Cisco Media Gateway Controller Node Manager (CMNM) is a Cisco Element Management Framework (CEMF)-based Element Management System (EMS) that manages the MGC node. The MGC node consists of one or more Cisco MGCs, one or more LAN switches, and the Cisco SLTs.

Network management can be broken down into five discrete areas: Fault, Configuration, Accounting, Performance, and Security (FCAPS). The CMNM provides fault and performance management of the MGC host, as well as flow-through provisioning of the Cisco MGC and its subcomponents. CMNM also provides fault and performance management of the Cisco SLT and LAN switch. CMNM uses the Cisco MGC Configuration Tool to configure the Cisco MGC host and uses CiscoView to configure the Cisco SLT and the LAN switch.

Security and some accounting features are provided directly by the CEMF platform. The CMNM does not provide any security or accounting features beyond those natively supported by CEMF. The CMNM is designed to be used on a standalone basis with a customer Operations Support System (OSS) or a Cisco-based NMS such as the Voice Network Manager (VNM).

For more information on CMNM, refer to the Cisco MGC Node Manager Administration, Installation, and User Guide.

Router Diagnostic Commands

Cisco routers provide the following integrated diagnostic commands to assist you in monitoring and troubleshooting systems: show, debug, ping, and trace.

Show Commands

The show commands are powerful monitoring and troubleshooting tools. You can use the show commands to perform a variety of functions:

Some of the most commonly used show commands include

For details on using and interpreting the output of specific show commands, refer to the relevant Cisco IOS command references.

Using debug Commands

The debug privileged EXEC commands can provide a wealth of information about the traffic being seen (or not seen) on an interface, error messages generated by nodes on the network, protocol-specific diagnostic packets, and other useful troubleshooting data.


Caution   Exercise care when using debug commands. These commands are processor-intensive and can cause serious network problems (degraded performance or loss of connectivity) if they are enabled on an already heavily loaded router. When you finish using a debug command, remember to disable it with its specific no debug command, or use the no debug all command to turn off all debugging.

Use debug commands to isolate problems, not to monitor normal network operation. Because the high overhead of debug commands can disrupt operation, you should use debug commands only when you are looking for specific types of traffic or problems and have narrowed your problems to a likely subset of causes.

Output formats vary with each debug command. Some generate a single line of output per packet, and others generate multiple lines of output per packet. Some generate large amounts of output, and others generate only occasional output. Some generate lines of text, and others generate information in field format.

To minimize the negative impact of using debug commands, follow this procedure:


Step 1   Use the no logging console global configuration command on your router. This command disables all logging to the console terminal.

Step 2   Telnet to a router port and enter the enable EXEC command.

Step 3   Use the terminal monitor command to copy debug command output and system error messages to your current terminal display.

This permits you to view debug command output remotely, without being connected through the console port. Following this procedure minimizes the load created by using debug commands because the console port no longer has to generate character-by-character processor interrupts.
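As a rough illustration, the procedure above can be written down as an ordered list of command lines for a Telnet session. The helper name debug_session_commands is hypothetical, and the configure terminal and end lines are an assumption about how you enter and leave global configuration mode to issue no logging console. In a single Telnet session, the enable command has to come first, so the order differs slightly from the step numbering. The debug ip icmp command in the usage line is only an example.

```python
from typing import List


def debug_session_commands(debug_command: str) -> List[str]:
    """Return the ordered CLI lines for a low-impact debug session over Telnet."""
    if not debug_command.startswith("debug "):
        raise ValueError("expected a command that starts with 'debug '")
    return [
        "enable",               # Step 2: enter privileged EXEC mode
        "configure terminal",   # assumption: enter global configuration mode
        "no logging console",   # Step 1: stop logging to the console terminal
        "end",                  # assumption: return to privileged EXEC mode
        "terminal monitor",     # Step 3: copy debug output to this session
        debug_command,          # the one debug command under investigation
        "no " + debug_command,  # disable it as soon as you are finished
    ]


for line in debug_session_commands("debug ip icmp"):
    print(line)
```

Ending the list with the matching no debug command reflects the caution above: debugging should never be left enabled after the investigation is over.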


This guide refers to specific debug commands that are useful when troubleshooting specific problems. If you intend to keep the output of the debug command, spool the output to a file. The procedure for setting up such a debug output file, along with complete details regarding the function and output of debug commands, is provided in Chapter 10, "Debug Command Reference," in the Troubleshooting Internetworking Systems manual.

In many situations, third-party diagnostic tools can be more useful and less intrusive than the use of debug commands. For more information, see "Third-Party Troubleshooting Tools," later in this chapter.

Using the Ping Command

To check host accessibility and network connectivity, use the ping command in either user EXEC or privileged EXEC mode.

For IP, the ping command sends ICMP Echo messages. If a station receives an ICMP Echo message, it sends an ICMP Echo Reply message back to the source. The extended command mode of the ping command permits you to specify the supported IP header options. This allows the router to perform a more extensive range of test options.
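To make the Echo and Echo Reply exchange concrete, the following Python sketch builds an ICMP Echo message and checks whether a reply matches it. It follows the standard ICMP layout (type 8 for Echo, type 0 for Echo Reply, 16-bit Internet checksum) and is not taken from Cisco IOS. The function names are hypothetical, and nothing is sent on the network.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement sum of 16-bit words (the Internet checksum)."""
    if len(data) % 2:
        data += b"\x00"             # pad to a whole number of 16-bit words
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:              # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_echo_request(identifier: int, sequence: int, payload: bytes = b"") -> bytes:
    """Build an ICMP Echo message: type 8, code 0, checksum, identifier, sequence."""
    header = struct.pack("!BBHHH", 8, 0, 0, identifier, sequence)
    checksum = internet_checksum(header + payload)
    return struct.pack("!BBHHH", 8, 0, checksum, identifier, sequence) + payload


def is_echo_reply_for(request: bytes, reply: bytes) -> bool:
    """An Echo Reply has type 0 and echoes the identifier, sequence, and payload."""
    r_type, r_code, _, r_id, r_seq = struct.unpack("!BBHHH", reply[:8])
    _, _, _, q_id, q_seq = struct.unpack("!BBHHH", request[:8])
    return (r_type, r_code, r_id, r_seq, reply[8:]) == (0, 0, q_id, q_seq, request[8:])
```

A receiver can validate a message by running internet_checksum over the whole packet, checksum field included; a correct packet yields zero.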

It is a good idea to use the ping command when the network is functioning properly under normal conditions so that you have something to compare against when troubleshooting.

For detailed information on using the ping and extended ping commands, refer to the Cisco IOS Configuration Fundamentals Command Reference.

Using the Trace Command

The trace user EXEC command discovers the routes a router's packets follow when traveling to their destinations. The trace privileged EXEC command permits the supported IP header options to be specified, allowing the router to perform a more extensive range of test options. The trace command uses the error message generated by routers when a datagram exceeds its time-to-live (TTL) value. First, probe datagrams are sent with a TTL value of one. This causes the first router to discard the probe datagrams and send back "time exceeded" error messages. The trace command then sends several probes, and displays the round-trip time for each. After every third probe, the TTL is increased by one.

Each outgoing packet can result in one of two error messages. A "time exceeded" error message indicates that an intermediate router has seen and discarded the probe. A "port unreachable" error message indicates that the destination node has received the probe and discarded it, because it could not deliver the packet to an application. If the timer goes off before a response comes in, the trace command prints an asterisk (*).

The trace command terminates when the destination responds, when the maximum TTL is exceeded, or when the user interrupts the trace with the escape sequence. It is a good idea to use the trace command when the network is functioning properly under normal conditions so that you have something to compare against when troubleshooting.
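The TTL logic described above can be sketched as a small simulation. This is a minimal model under stated assumptions, not the IOS implementation: the path is a known list of hypothetical hop names ending with the destination, every probe is answered (so no asterisks are produced), and round-trip times are not measured.

```python
from typing import List, Tuple


def simulate_trace(path: List[str], max_ttl: int = 30,
                   probes_per_ttl: int = 3) -> List[Tuple[int, str, str]]:
    """Simulate the TTL logic of trace over a known path.

    `path` lists the routers in order, ending with the destination host.
    Returns one (ttl, responder, message) tuple per probe.
    """
    if not path:
        raise ValueError("path must contain at least the destination")
    results = []
    for ttl in range(1, max_ttl + 1):
        reached_destination = ttl >= len(path)
        for _ in range(probes_per_ttl):
            if reached_destination:
                # The destination received the probe but could not deliver it
                # to an application.
                results.append((ttl, path[-1], "port unreachable"))
            else:
                # The TTL expired at an intermediate router, which discards
                # the probe and reports the error.
                results.append((ttl, path[ttl - 1], "time exceeded"))
        if reached_destination:
            break   # trace terminates when the destination responds
    return results


for ttl, responder, message in simulate_trace(["router-a", "router-b", "host"]):
    print(ttl, responder, message)
```

If max_ttl is smaller than the number of hops, the loop ends without a "port unreachable" entry, which corresponds to the trace terminating because the maximum TTL was exceeded.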

For detailed information on using the trace and extended trace commands, refer to the Cisco IOS Configuration Fundamentals Command Reference.

Third-Party Troubleshooting Tools

In many situations, third-party diagnostic tools can be more useful than commands that are integrated into the router. For example, enabling a processor-intensive debug command can further overload an environment that is already experiencing excessively high traffic levels. Attaching a network analyzer to the suspect network is less intrusive and is more likely to yield useful information without interrupting the operation of the router.

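Typical third-party tools for troubleshooting internetworks include volt-ohm meters, digital multimeters, and cable testers; breakout boxes, fox boxes, and BERTs/BLERTs; network monitors and analyzers; and TDRs and OTDRs.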

Volt-Ohm Meters, Digital Multimeters, and Cable Testers

Volt-ohm meters and digital multimeters are at the lower end of the spectrum of cable testing tools. These devices can measure basic parameters such as AC and DC voltage, current, resistance, capacitance, and cable continuity. They are used primarily to check physical connectivity.

Cable testers (scanners) also enable you to check physical connectivity. Cable testers are available for shielded twisted-pair (STP), unshielded twisted-pair (UTP), 10BaseT, and coaxial and twinax cables.

A given cable tester might be able to perform any of the following functions:

Similar testing equipment is available for fiber-optic cable. Due to the relatively high cost of fiber-optic cable and its installation, fiber-optic cable should be tested both before installation (on-the-reel testing) and after installation. Continuity testing of fiber-optic cable requires either a visible light source or a reflectometer. Light sources capable of providing light at the three predominant wavelengths, 850 nanometers (nm), 1300 nm, and 1550 nm, are used with power meters that can measure the same wavelengths and test attenuation and return loss in the fiber-optic cable.

Breakout Boxes, Fox Boxes, and BERTs/BLERTs

Breakout boxes, fox boxes, and bit or block error rate testers (BERTs or BLERTs) are digital interface testing tools used to measure the digital signals present at PCs, CSU/DSUs, and other peripheral interfaces. These devices can monitor data line conditions, analyze and trap data, and diagnose problems common to communications systems. Traffic from data terminal equipment (DTE) through data communications equipment (DCE) can be examined to help isolate problems, identify bit patterns, and ensure that the proper cabling has been installed. These devices cannot test media signals such as Ethernet, Token Ring, or FDDI.

Network Monitors and Analyzers

Network monitors continuously track packets crossing a network, providing an accurate picture of network activity at any moment, or a historical record of network activity over a period of time. They do not decode the contents of frames. Monitors are useful for baselining, in which the activity on a network is sampled over a period of time to establish a normal performance profile, or baseline.

Monitors collect information such as packet sizes, the number of packets, error packets, overall usage of a connection, the number of hosts and their MAC addresses, and details about communications between hosts and other devices. This data can be used to create profiles of LAN traffic as well as to assist in locating traffic overloads, planning for network expansion, detecting intruders, establishing baseline performance, and distributing traffic more efficiently.
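Baselining can be illustrated with a short Python sketch that summarizes samples taken under normal conditions and flags later readings that fall far outside them. The function names and the three-standard-deviation threshold are assumptions chosen for illustration; a real monitor may use a different statistic entirely.

```python
from statistics import mean, stdev
from typing import List, Tuple


def build_baseline(samples: List[float]) -> Tuple[float, float]:
    """Return (mean, standard deviation) of samples taken under normal conditions."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    return mean(samples), stdev(samples)


def deviates_from_baseline(value: float, baseline: Tuple[float, float],
                           tolerance: float = 3.0) -> bool:
    """Flag a reading more than `tolerance` standard deviations from the mean."""
    average, spread = baseline
    return abs(value - average) > tolerance * spread


# Hypothetical link utilization samples, in percent, collected over time.
normal_utilization = [30.0, 32.0, 31.0, 29.0, 33.0]
baseline = build_baseline(normal_utilization)
print(deviates_from_baseline(90.0, baseline))
print(deviates_from_baseline(32.0, baseline))
```

The same comparison applies to any of the counters a monitor collects, such as packet counts or error packets, as long as the baseline was sampled while the network was behaving normally.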

A network analyzer (also called a protocol analyzer) decodes the various protocol layers in a recorded frame and presents them as readable abbreviations or summaries, detailing which layer is involved (physical, data link, and so forth) and what function each byte or byte content serves.

Most network analyzers can perform many of the following functions:

TDRs and OTDRs

At the top end of the cable testing spectrum are time domain reflectometers (TDRs). These devices can quickly locate open and short circuits, crimps, kinks, sharp bends, impedance mismatches, and other defects in metallic cables.

A TDR works by "bouncing" a signal off the end of the cable. Opens, shorts, and other problems reflect the signal back at different amplitudes, depending on the problem. A TDR measures how much time it takes for the signal to reflect and calculates the distance to a fault in the cable. TDRs can also be used to measure the length of a cable or calculate the propagation rate based on a configured cable length.
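The distance calculation can be sketched as follows. The example assumes the usual time-of-flight relationship: the signal travels to the fault and back, so the one-way distance is half of the propagation speed multiplied by the round-trip time. The function names and the velocity factor (the fraction of the speed of light at which a signal travels in the cable) are illustrative; use the value specified for your cable.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def distance_to_fault(round_trip_s: float, velocity_factor: float) -> float:
    """Distance in meters to the reflection point.

    The pulse travels to the fault and back, so the one-way distance is half
    of (propagation speed x round-trip time).
    """
    if not 0.0 < velocity_factor <= 1.0:
        raise ValueError("velocity_factor must be greater than 0 and at most 1")
    return velocity_factor * SPEED_OF_LIGHT_M_PER_S * round_trip_s / 2.0


def velocity_factor_from_length(known_length_m: float, round_trip_s: float) -> float:
    """Propagation rate as a fraction of c, from a cable of known length."""
    if round_trip_s <= 0.0:
        raise ValueError("round_trip_s must be positive")
    return 2.0 * known_length_m / (SPEED_OF_LIGHT_M_PER_S * round_trip_s)


# Hypothetical reading: a reflection arrives 1 microsecond after the pulse
# on a cable assumed to have a velocity factor of 0.66.
print(round(distance_to_fault(1e-6, 0.66), 2), "meters")
```

The second function is the inverse calculation mentioned above: given a configured cable length and the measured reflection time from the far end, it recovers the propagation rate.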

Fiber-optic measurement is performed by an optical time domain reflectometer (OTDR). OTDRs can accurately measure the length of the fiber, locate cable breaks, measure the fiber attenuation, and measure splice or connector losses. An OTDR can be used to take the "signature" of a particular installation, noting attenuation and splice losses. This baseline measurement can then be compared with future signatures when a problem in the system is suspected.


Posted: Mon Aug 28 10:31:47 PDT 2000
Copyright 1989-2000©Cisco Systems Inc.