InfoDoc ID   Synopsis   Date
41540   Console Logging Options to capture Fatal Reset output for Sun systems   2 Dec 2002

Status Issued

Description

Use the following procedures to log console output and help diagnose Fatal Resets on Sun systems.

Purpose of Console Logging

The purpose of console logging is to capture console messages that improve the quality and timeliness of problem diagnosis. By default, Fatal Reset details and the POST output that follows a Fatal Reset are directed to serial port A (ttya).

In many system interruptions, the serial data is the only output available. In some failure modes the Solaris [TM] Operating Environment has already terminated, and no software is running on the system that can log messages to traditional file system locations. Capturing failure data via serial console logging therefore provides additional diagnostic information and reduces the number of "unexplained system reboots".

Fatal errors that cause a Fatal Reset bring a system down extremely fast. Components other than the failing item often detect the error as well, and the speed of the crash often leaves these "error artifacts" in component registers. The PROM can subsequently misinterpret those artifacts, indicate the wrong component as the cause of the reset, and offline a good component as a result. Serial console logging allows analysis of the extended POST that runs on reboot, which helps ensure that the actual defective FRU is replaced rather than a good component that was incorrectly reported as failing.

The following sections outline possible console logging options. Note that there may be other software and hardware vendors with equivalent products. The functionality of these other products will likely be similar to what is discussed below.

Console Logging Options - Data Logging Terminal Servers

Traditional terminal servers do not have console logging capability. One replacement is a console server device from Lightwave, Inc. The Lightwave console server is equivalent to a traditional network-based terminal server, except that it has added memory that is used as a "wrap around" message buffer.

As console messages are output from a Sun system, they are stored in this memory. As the memory fills up, the oldest messages are overwritten. One can connect to this console server via the network, and then display the contents of the memory buffer for a specific system in order to retrieve the stored console messages.
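
The wrap-around behaviour described above can be illustrated with a small shell sketch. It is not how the Lightwave device is implemented; it only mimics the idea on a monitoring host by keeping the newest lines of a log file. The function name append_wrapped, the file name and the line limit are hypothetical, and a POSIX shell with "tail -n" is assumed.

```shell
# Hypothetical helper: append one console line to a file and keep only
# the newest MAX lines, so the oldest messages are discarded first.
append_wrapped() {
    wrap_file=$1
    wrap_max=$2
    wrap_line=$3
    printf '%s\n' "$wrap_line" >> "$wrap_file"
    # Rewrite the file with just its last MAX lines.
    tail -n "$wrap_max" "$wrap_file" > "$wrap_file.tmp" &&
        mv "$wrap_file.tmp" "$wrap_file"
}

# Example (illustrative file name and limit):
#   append_wrapped /var/tmp/console.buf 1000 "message text"
```

Rewriting the file on every line is slow for busy consoles; the point is only to show which messages survive once the buffer is full.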

More information on the Lightwave console server can be found at: http://www.lightwavecom.com/server_management/index.html

Console Logging Options - Centralized Console Control

A centralized console control solution is available from Aurora Technologies.

This solution allows a single Sun workstation to serve as a console access and logging point. Multi-port serial hardware is installed in the Sun workstation, and the system consoles to be controlled and monitored are connected to these ports. The workstation can both grant console access and log all console activity to its local disk for review at any time.

More information can be found at: http://www.auratek.com/consolemanagement/ct/

Console Logging Options - Tip line to ttya

This may be one of the least expensive console logging options, but it can create challenges when attempting to monitor multiple systems. The system that is performing the monitoring function must be up and operational, or logging of the other systems' consoles is lost.

To enable this console logging mode, take a null modem cable (see below) and connect one end to the ttya port of the system to be monitored, then connect the other end of the cable to any serial port on any other local workstation.

Once the cable is connected, a user on the monitoring system can issue the tip command and be connected to the other system's console. Note that the user must enable some form of logging before issuing the tip command, for example the log-to-file option of an xterm session.

Using TIP

Redirect the system console of the 'problem' system to another system.

The basic steps:

Connect a null modem cable between serial port A of the 'problem' machine and one of the serial ports of the healthy machine. Which port (a or b) to use on the healthy machine depends on the hardwire entry in the /etc/remote file on the healthy system.

Here is the hardwire entry in /etc/remote that uses port b on the healthy machine:

hardwire:\
	:dv=/dev/term/b:br#9600:el=^C^S^Q^U^D:ie=%$:oe=^D:

A null modem cable in its most basic form is an RS-232 serial cable with the following minimal pin connections (DB-25 pin numbering: 2 is transmit data, 3 is receive data, 7 is signal ground):

2 ------ 3

3 ------ 2

7 ------ 7

A standard serial cable with a null modem adapter from an electronics store will work too.

There should already be an entry for hardwire in /etc/remote, because it is part of the default OS installation. If it is missing, you can copy it from another Solaris system.
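
Before cabling, it can help to confirm which serial device the hardwire entry points at. The following is a minimal sketch, assuming a POSIX shell with sed and grep; the function name hardwire_device is hypothetical. Entries in /etc/remote may be continued across lines with a trailing backslash, so the sketch joins continued lines first.

```shell
# Print the serial device (dv=) named by the "hardwire" entry of a
# remote(4)-style file. Defaults to /etc/remote.
hardwire_device() {
    remote_file=${1:-/etc/remote}
    # Join lines that end in a backslash with the line that follows,
    # then keep the hardwire entry and extract its dv= capability.
    sed -e :a -e '/\\$/N; s/\\\n[[:space:]]*//; ta' "$remote_file" |
        grep '^hardwire[:|]' |
        sed -n 's/.*:dv=\([^:]*\):.*/\1/p'
}

# Example: hardwire_device
```

For the entry shown earlier this prints /dev/term/b. No output means no hardwire entry with a dv= field was found.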

Now open a command-tool on the healthy system. Sometimes tip behaves better with a shell-tool, but you lose scrolling.

Type in: tip hardwire

You should see a connected message in this command-tool window.

NOTE: you will get the connected message regardless of the presence of the serial cable. Connected just means your tip session is talking to the serial port, not to another system.
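
One way to capture the tip session to a file, as an alternative to the xterm log-to-file option, is to run tip inside script(1), which records everything shown on the terminal. This is a sketch only: the function name start_console_log and the log file path are hypothetical, and the monitoring system must stay up for logging to continue.

```shell
# Start a logged sub-shell on the monitoring system; tip is then run
# by hand inside it. /var/tmp/console.log is an illustrative default.
start_console_log() {
    console_log=${1:-/var/tmp/console.log}
    echo "Console output will be appended to $console_log"
    echo "Run 'tip hardwire' at the new prompt."
    echo "Type ~. to leave tip, then 'exit' to stop logging."
    # -a appends to an existing log instead of overwriting it.
    script -a "$console_log"
}

# Example: start_console_log /var/tmp/problem-host.log
```

The log will contain terminal control characters as well as text, so it is best viewed with a pager or cleaned up before being sent on.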

Serial console logging using non-Sun system

Serial console logging can also be done using a laptop or other PC-type system running a terminal emulator program. The cabling requirements are the same as for a tip session (see "Using TIP"). Serial parameters are 9600 8n1, i.e. 9600 baud, 8 data bits, no parity bit and 1 stop bit. Set the terminal program to emulate a VT100 or similar terminal. Logging to disk is configured within the emulator program, where it is usually called either session logging or session capture. For systems running Windows, a program named Tera Term is available that works with fewer problems than HyperTerminal.
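
If the non-Sun system is a laptop running Linux rather than Windows, the same 9600 8n1 capture can be sketched in the shell without a terminal emulator. The assumptions here are a POSIX shell, GNU stty (for the -F option) and a serial device named /dev/ttyS0; the function name stamp_lines is hypothetical. The function adds a UTC timestamp to each line so that reset events can be placed in time.

```shell
# Prefix every line read from standard input with a UTC timestamp.
stamp_lines() {
    while IFS= read -r console_line || [ -n "$console_line" ]; do
        printf '%s %s\n' "$(date -u '+%Y-%m-%dT%H:%M:%SZ')" "$console_line"
    done
}

# Assumed usage on Linux (device name is an example):
#   9600 baud, 8 data bits (cs8), no parity (-parenb), 1 stop bit (-cstopb)
#   stty -F /dev/ttyS0 9600 cs8 -parenb -cstopb
#   stamp_lines < /dev/ttyS0 >> console.log
```

This only records output; it does not provide an interactive console the way tip or a terminal emulator does.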

Recommended NVRAM settings:

Bring the system to the OBP level from the command line using the "shutdown" or "init 0" command. Either one runs all RC shutdown scripts, syncs the file systems and then drops the system to the OK prompt. DO NOT use a Stop+A key press. The following commands can be executed from the OK prompt or from the command line using the "eeprom <variable=parameter>" command.

at OK prompt                            # eeprom                         Description
setenv diag-level max                   diag-level=max                   system will run extended POST
printenv boot-device                    boot-device                      determine what your boot device is
setenv diag-device <your boot-device>   diag-device=<your boot-device>   prevent a net boot attempt with diags on
setenv error-reset-recovery sync        error-reset-recovery=sync        force sync reboot if system drops to OK
setenv diag-switch? true                diag-switch?=true
reset-all                               reboot or init 6                 system has to reboot for changes to take effect
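
The eeprom column above can be turned into commands with a short shell sketch. It only prints the commands so they can be reviewed before being run as root; the function name recommended_eeprom_cmds is hypothetical and the boot device "disk" in the example is a placeholder for the value reported by "eeprom boot-device". Each argument is quoted because some shells treat the "?" in diag-switch? as a wildcard.

```shell
# Print (do not run) the eeprom commands for the recommended settings.
# $1 is the boot device to use for diag-device.
recommended_eeprom_cmds() {
    diag_bootdev=$1
    for nvram_setting in \
        "diag-level=max" \
        "diag-device=$diag_bootdev" \
        "error-reset-recovery=sync" \
        "diag-switch?=true"
    do
        # Single quotes in the output protect "?" from the shell.
        printf "eeprom '%s'\n" "$nvram_setting"
    done
}

# Example: recommended_eeprom_cmds disk
```

As the table notes, the system still has to be rebooted before the new settings take effect.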

Keywords: tip, console, logging, capture, null, modem

INTERNAL SUMMARY:

Once we have the information from the customer, use the following documentation to help determine the issue:

http://infoserver.central.sun.com/data/sshandbook/Systems/E6500/docs.html See PN:910-4188 for help in diagnosing the issue

Also see http://cpre-emea.uk.sun.com/cgi-bin/fatal.tcl for a web based decoder for all other systems

brian.pryor@sun.com

doug.munson@sun.com

SUBMITTER: Brian Pryor

APPLIES TO: Hardware/Ultra Enterprise/Servers, Hardware/Ultra Enterprise/Servers/Enterprise 6500, Hardware/Ultra Enterprise/Servers/Enterprise 6000, Hardware/Ultra Enterprise/Servers/Enterprise 5500, Hardware/Ultra Enterprise/Servers/Enterprise 5000, Hardware/Ultra Enterprise/Servers/Enterprise 4500, Hardware/Ultra Enterprise/Servers/Enterprise 4000, Hardware/Ultra Enterprise/Servers/Enterprise 3500, Hardware/Ultra Enterprise/Servers/Enterprise 3000, Hardware/Ultra Enterprise/Servers/Enterprise 450, Hardware/Ultra Enterprise/Servers/Enterprise 250, Hardware/Ultra Enterprise/Servers/Enterprise 150, Hardware/Ultra Enterprise/Servers/Enterprise 10S, Hardware/Ultra Enterprise/Servers/Enterprise 5S, Hardware/Ultra Enterprise/Servers/Enterprise 2, Hardware/Ultra Enterprise/Servers/Enterprise 1, Hardware/Ultra Workstations, Hardware/Ultra Workstations/Ultra 80, Hardware/Ultra Workstations/Ultra 60, Hardware/Ultra Workstations/Ultra 30, Hardware/Ultra Workstations/Ultra 10, Hardware/Ultra Workstations/Ultra 5, Hardware/Ultra Workstations/Ultra 2, Hardware/Ultra Workstations/Ultra 1, Hardware/Netra/Netra t 1125, AFO Vertical Team Docs/Hardware, Hardware/Ultra Workstations/Sun Blade 1000

ATTACHMENTS:


Copyright (c) 1997-2003 Sun Microsystems, Inc.