RSM2000 Troubleshooting Guide
-------------------------------------------------------------------------------
Scenario One: RSM2000 patches fail to install

Background: System running 2.5.1, generic patch cluster installed.

The patches that failed to install:

  > 104563-02 Sun RSM Array 2000 1.0: ICON Chip Failure Messages
  > 104564-01 Sun RSM Array 2000 1.0: Rdriver deadlock detection code

Both patches failed with:

  # ./installpatch ../104563-02
  Checking installed packages and patches...
  None of the packages included in patch 104563-02 are installed on this system.
  Installpatch is terminating.

  # ./installpatch ../104564-01
  Checking installed packages and patches...
  None of the packages included in patch 104564-01 are installed on this system.
  Installpatch is terminating.

Resolution: The problem turned out to be the SUNWosar package at
VERSION=1.0,REV=06.02; this version was on the early-access CD. Pulled the FCS
version off of cornmeal.eng, whose SUNWosar package shows VERSION=1.0,REV=06.00.
With the FCS package in place, the RSM2000 patches all install successfully!
-------------------------------------------------------------------------------
Scenario Two: Lose a controller (maybe)

Background: In the console window, you notice a sudden rash of "disk not
responding to selection" messages scrolling by.

Go to the RM6 window and click on the "Status" icon. Do a health check. It shows
a "Data Path Failure" on the RSM2000 unit.

Go to the RM6 window and click on the "Recovery" icon. At the bottom of the
window you will see a one-liner telling you to start a parity check/repair and
perform manual recovery.

Click on "Start Parity Check/Repair". A horizontal bar will slowly turn blue as
it checks the LUNs on that RAID Module.

Note: The two RSM2000 controllers will still appear to be operational (power
light on and the second green LED in heartbeat mode). During the parity
check/repair you will see some messages in the console about disks going
"offline".

Once you start the parity check/repair, you cannot start anything else from that
window until you hit the "Cancel" button or the process has completed. After
about thirty minutes of the process running, I got very impatient as a user and
hit the "Cancel" button; this process is very SLOW. A popup window asks for an
"OK" to cancel the process. Click the "OK" button.

Click on the "Recovery Guru". It shows the status of the RAID Module and, at the
bottom, tells you how to obtain step-by-step instructions to recover from a
"data path failure". Click the "Fix" button. A window pops up with a summary of
what could have caused the data path failure.

Note: Make sure YOU read EVERYTHING in this window before going on!

Click "OK". Another window pops up; a caution message comes first (READ IT),
then it tells you to perform a series of steps:

  Step 1: Check the cables and terminators.
  Step 2: Click "OK" after performing Step 1.

I checked the terminators and they appeared to be fine. I replaced both cables
and clicked "OK". Another window popped up telling me that I had recovered from
the data path failure. It then gives a notice that file systems or logical units
may not be accessible and that you might have to reboot. Click "OK".

The Recovery "Module Information" window shows that the RAID Module has been
fixed.
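If you want to double-check the recovery from a shell as well as from the GUI,
the RM6 command-line utilities can be used. A minimal sketch, assuming the
healthck flag for "all RAID Modules" (check the usage message on your system):

  lad             (list the array devices and LUNs the host can see)
  healthck -a     (run a health check against all RAID Modules)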
Click on "Status" from the main RM6 window, click on "health check", now it shows the RAID module as being in optimal state now. ------------------------------------------------------------------------------- Scenario Three: Power interruption to RSM2000(one of the sequencers dies) Background: Alarms on all RSM trays go off, second light on Controller box goes off(amber) Click on "status", do a "health check", it shows that there has been a drive tray failure and module component failure. Note: RSM2000 controller will soon start blasting messages to the console. Click on the "show details", tells user to go thru "Recovery Guru"..... Click on "Recovery Guru", gives procedure to follow along with a caution about do not operate drive trays with a fan module failure for more then 5 minutes. Resolution: Problem was due to one of the sequencers being tripped off. Flipped it back on and the summary information window shows that the RAID module as being fixed. Alarms on RSM trays shut off, as well as the amber light on the RSM2000 controller box being shut off... Goto the "status", do a "health check", shows that RAID module is optimal now. ------------------------------------------------------------------------------- Scenario Four: lose your host that RSM2000 is attached to.... Background: System crashes.... RSM2000 will keep on trucking, no fault LED's, or anything. If you looked at the unit, you would think that life is good.... I would strongly suggest that you use something life SYMON to monitor the host that the RSM2000 is attached too, so it will allow you to keep an eye on the host that the RSM2000 is attched to. Bring the system back up, right before the login prompts appears. The array monitor daemon gets loaded, all the LED's on the drives in the RSM trays will flicker for a brief moment. (Of course this assumes that you did not have to reinstall any software) Fire up RM6 Click "status" icon click on "health check" Shows that RAID module is in optimal state!!!!!! ------------------------------------------------------------------------------- Scenario Five: Lose complete power to the RSM2000 unit Background: Lose access to the RSM2000 After a few moments, you will see "disk not responding" in console as well as "offline" messages..... click on "status" icon (will be slow in responding) Shows "data path failure" on RAID Module Click on "show summary" It tells you to go thru the "Recovery Guru" click on "Recovery Guru" (slow in responding) click "fix" Note: READ EVERYTHING about "data path failure" click "ok" Check connections as suggested.... Click "ok"... window will popup telling you that it is checking data paths(this will take about five minutes) Note: You will be still getting warnings about "disk not responding" in console... Window then will popup concerning "Data Path Recovery" basically asking if you want too remove the RAID Module from system, if you click yes, then it will do so.....If No, it will give you the steps to replace controllers.... click "no" Window will then popup concerning "Controller Failure Replacement" telling you to umount all file systems and stop all I/O to the RSM2000. click "ok" Another "Controller Failure Replacement" window will popup. READ THIS CAREFULLY, gives you a caution, then steps to replace the controllers. There is a notice at the bottom of the window telling you about the new serial numbers on the new controller boards, and that the RAID Module numbers may change.... 
Well, we opened the cabinet to the RSM2000 and discovered that there was NO
power to the unit; we ended up replacing the sequencers.

Power up the RSM2000, and soon afterwards "disk okay" messages appear in the
console window. Since we did NOT have to replace the controllers, we clicked
"Cancel". A window pops up telling you that you have exited the Recovery Guru
and that your controllers are still marked failed. Click "OK".

Click "Status", then "Health Check"; the summary information shows that life is
good on the RSM2000.
-------------------------------------------------------------------------------
Scenario Six: Writing data to the RSM2000, lose the host

Background: Doing a remote copy (rdist) to the RSM2000 system.

When you are writing data to the RSM2000, the cache light goes on and the lights
on the RSM drives begin to flicker on and off.

The system crashes to the "ok" prompt; the lights on the drives continue to be
lit for a few more moments, then stop flickering. The cache light stays on until
the array completes all outstanding writes to the RSM drives, then goes off.

Bring the host back up, go to "Status", do a "Health Check", and it shows that
the RAID Module is in optimal mode.
-------------------------------------------------------------------------------
Scenario Seven: Writing data to the RSM2000 and losing the RSM2000 to a complete
power failure

Background: Again doing a remote copy of data to the RSM2000.

Lose power to the RSM2000 and the remote copy hangs on the host. You'll soon see
a bunch of "rdaemon" messages blast across the console of the RSM2000 host. Stop
the remote copy.

Got power restored to the RSM2000 and powered it back up. After the drives spun
up and came online, the cache light lit up and the array completed all the
writes that were stored in its cache before it lost power.

Note: You will also see some "disk okay" messages come across the console of the
RSM2000 host as the array comes back online.
-------------------------------------------------------------------------------
Scenario Eight: Writing data to the RSM2000, lose a drive in a LUN, and during
the failover to the hot spare you lose a second drive in that same LUN

You have completely lost that LUN! In addition, you will see LOTS of messages
blow across the console and screen of your RSM2000 host, such as "out of inodes"
and "sense key" errors.

Replace the two failed drives from that LUN and click "Refresh" from the
OpenWindows menu. A few moments after this was done, there were LOTS of "sense
key" error messages that blew across the screen for about 10 seconds and then
stopped. Did a "Health Check"; it shows life is good. The cache light is still
on at this point.

Went to "Configuration" and looked at "Module Information"; it still shows the
failed LUN as dead.

Power-cycled the RSM2000; it came back up with the cache light still on, and
every few moments there is activity on the LUN that did not fail. Clicked
"Status" and, in the console window, got error messages from rm6stat saying
"SysDevOpen Failed (I/O error)".

At this point, the RSM2000 host was halted and rebooted. Fired up RM6, went into
"Configuration", and it still shows that LUN as dead. Deleted the failed LUN
(the drives go to the unassigned state), then created the LUN again with the
drives I had just freed. The cache light went OFF at this point.
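After the failed LUN has been deleted and re-created, it can be reassuring to
confirm from a shell that the controllers and Solaris both see it again. A
minimal sketch using the RM6 lad utility and format (no options are assumed
beyond what is shown):

  lad                   (the re-created LUN should be listed under one controller)
  format < /dev/null    (the LUN should also show up as a normal disk to Solaris)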
Note: Before the failed LUN was deleted, we brought up format and found that the
label associated with that LUN had been blown away; that would explain the
"SysDevOpen" error messages we got earlier.

What does all this mean? If you lose more than one drive in a LUN, that LUN is
DEAD. To bring that LUN back, you need to do the following:

  1. Replace the failed drives.
  2. Delete the failed LUN from the "Configuration" window.
  3. Re-add that LUN again.
  4. After the drives are formatted again, restore any data.

Note: The format takes a while, so if you have a large set of drives in a LUN,
you'll be waiting for a while. After the format is complete, the Configuration
window shows that the LUN is in an optimal state.
-------------------------------------------------------------------------------
Scenario Nine: After setting up swap on the RSM2000 and rebooting, you get a
message during the reboot stating that "overlapping of swap is not allowed"

The problem is that the rdac driver needs to be loaded before the swap that
lives on the RSM2000 is added. In the /etc/rcS.d directory, edit
S40standardmounts.sh and comment out the line "/sbin/swapadd -1". Then go into
the S46rdac file and put that line at the VERY END of the file. This causes the
rdac drivers to be loaded first, after which the swap that is on the RSM2000
gets added.

In the /etc/vfstab file, you need to use the /dev/dsk/cXtXdXsX naming
convention. I have filed a Sev 1 bug against using /dev/rRAID_Module/0s0 in
/etc/vfstab: if you use that naming convention you will get the error message
above on reboot, even though the system shows that it is swapping on the
RSM2000. BugID 4039017; Symbios is currently working on the problem.
-------------------------------------------------------------------------------
Scenario Ten: I cannot newfs anything >2 GB on s2 on an RSM2000

There have been reports of folks not being able to newfs anything greater than
2 GB on s2 on an RSM2000. When they have tried, they have seen:

  seek error : wtfs invalid argument

There are a couple of things that could be going on: a possible bug, a disk
label that went out to lunch, or something reconfigured on the RSM2000.

Bring up format: can you see the drive? Also, did you just create a new RAID
group/LUN? If so, you need to do a boot -r. During the reboot you will see a
couple of messages concerning the "reconfiguration" of rdnexus.conf and a second
.conf file; these are configuration files for the RSM2000. After the system
comes up, bring up format and you will be able to see the drive. Format the
drive with everything on s2, then newfs slice 2, and you are all set and ready
to go.

It is also possible that your label went out to lunch; doing a boot -r has
cleaned this up. I asked the engineer in the field if he saw the
"reconfiguration" messages. He cannot remember; if there were "reconfiguration"
messages, then some mods were done to the RSM2000 that he was not aware of.

Another workaround is to move the entire slice to something other than 2 on the
RSM2000 disk and then do your newfs. I will file a bug so we can get this looked
at, just to be on the safe side (bugid 4053588).
-------------------------------------------------------------------------------
Scenario 11

A user has set up a RAID 1+0 with 6 drives and a second 1+0 with 20 drives, plus
two hot spares. When format started on the LUNs, it ran for the last two days,
and it appears that the RSM2000 and/or RM6 is hung out to dry.
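Before concluding that the array or RM6 is hung, it is worth checking from a
shell whether the formatting process is still doing anything. A minimal sketch;
the process name to grep for is an assumption (on your system it may be rm6 or
one of the RM6 CLI utilities), and PID stands for the process ID found with ps:

  ps -ef | grep -i rm6     (find the candidate process and note its PID)
  truss -p PID             (no system-call activity for a long stretch suggests a real hang)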
Resolution: You can use ps to find the process, which must be hung since it
should not take that long, then use the kill command to terminate it. I've done
this without any problems when I just wanted to stop the configuration and redo
it. Then check for hardware problems and make sure you have all the current
patches. The current patch list is at:

  http://storageweb.eng/techmark_site/arrays/tmkt_rsmpatch
-------------------------------------------------------------------------------
Scenario 12

Similar to Scenario 11: if the RAID Manager comes to an abnormal termination
(i.e. a crash), the LUN can remain hidden from the other RAID Manager utilities.
If the user notices a hung or missing LUN that he or she knows is configured,
the user should remove the lunlocks file (rm /etc/raid/locks/lunlocks), then
exit and re-enter the RAID Manager application that could not see the LUN.
-------------------------------------------------------------------------------
Scenario 13

Deleting LUNs on an RSM2000 using RM6 and getting the following messages:

  svr8# WARNING: /sbus@1f,0/QLGC,isp@1,10000/sd@4,1 (sd130): offline
  WARNING: /sbus@1f,0/QLGC,isp@3,10000/sd@5,1 (sd187): offline
  WARNING: /sbus@1f,0/QLGC,isp@3,10000/sd@5,1 (sd187): offline
  WARNING: /sbus@1f,0/QLGC,isp@1,10000/sd@4,1 (sd130): offline
  WARNING: /sbus@1f,0/QLGC,isp@3,10000/sd@5,1 (sd187): offline
  WARNING: /sbus@1f,0/QLGC,isp@1,10000/sd@4,1 (sd130): offline
  WARNING: /sbus@1f,0/QLGC,isp@3,10000/sd@5,1 (sd187): offline

Solution: These are NORMAL messages. Sonoma LUNs look like real disks to the
system, so deleting one is like unplugging (offlining) an actual disk.
-------------------------------------------------------------------------------
Scenario 14

Every time I do a boot -r, my RSM2000 device paths get renamed. Is there a way
to stop this?

Solution: This is a known issue with RM6 6.0 and has been fixed in 6.1. You
might have to do an rm -r of the /dev/dsk and /dev/rdsk entries for those Sonoma
controller numbers. Then do an rm -r /dev/osa and a boot -r to get things
straightened out.
-------------------------------------------------------------------------------
Scenario 15

During a failover, I go into the RM6 GUI and turn the "reconstruction rate" up
to max, and I can see in the graph that it is working. I come back the next
morning and the GUI is showing that it is still running, and the LEDs on the
disks involved are still flashing. What's going on here?

Solution: The short answer is to quit RM6 after a period of time when you think
the failover has completed. Then restart RM6; it will tell you that the resync
is done, and the LEDs on the disks involved in the failover will stop flashing.
The longer answer is that a P1 bug (4060003) has been filed against this.
-------------------------------------------------------------------------------
Scenario 16

Using a fully loaded RSM2000 on a UE6000 with SEVM 2.4 and Oracle, after a
system reboot the file system ownership changed from oracle to root. How do you
set the ownership to "oracle" permanently?

Solution: srdb 11430. Use the following Volume Manager command:

  # vxedit set user=oracle group=oracle mode=600 volume_name
-------------------------------------------------------------------------------
Scenario 17

We used vxdiskadm to give Veritas control of a LUN, and we see this on the
console when Veritas 2.4 initializes the disk (LUN):
  NOTICE: vxvm:vxio: Disk c3t5d0s2: Unexpected status on close: 19
  NOTICE: vxvm:vxio: Disk c3t5d1s2: Unexpected status on close: 19
  NOTICE: vxvm:vxio: Disk c3t5d2s2: Unexpected status on close: 19
  NOTICE: vxvm:vxio: Disk c3t5d3s2: Unexpected status on close: 19
  NOTICE: vxvm:vxio: Disk c3t5d4s2: Unexpected status on close: 19
  NOTICE: vxvm:vxio: Disk c3t5d5s2: Unexpected status on close: 19
  NOTICE: vxvm:vxio: Disk c5t5d0s2: Unexpected status on close: 19
  NOTICE: vxvm:vxio: Disk c5t5d1s2: Unexpected status on close: 19
  NOTICE: vxvm:vxio: Disk c5t5d2s2: Unexpected status on close: 19
  NOTICE: vxvm:vxio: Disk c5t5d3s2: Unexpected status on close: 19
  NOTICE: vxvm:vxio: Disk c5t5d4s2: Unexpected status on close: 19
  NOTICE: vxvm:vxio: Disk c5t5d5s2: Unexpected status on close: 19

Solution: The message you are getting is OK to ignore for VM 2.4 with the
RSM2000.
-------------------------------------------------------------------------------
Scenario 18

I am using format on a 160 GB LUN and format isn't showing the entire LUN.

Solution: The short answer is bugid 4036085. This has been fixed in 2.6 as well
as 2.5.1 8/97. Or you can use VxFS if you are using an earlier version of
Solaris.
-------------------------------------------------------------------------------
Scenario 19

Node name issues:

1) In Phase I Sonoma, the node names can become corrupted. This can cause all
sorts of problems and is, essentially, the root of most if not all evil with
regard to software-related problems with the product. Fortunately there is an
easy fix for even the worst manifestations of this problem: a complete removal
of the /dev/osa directory followed by a reconfiguration boot will rebuild the
RSM2000 node names correctly. For a given hardware configuration, you will get
the same answer every time.

You should also remove the 'rdriver' nodes from /dev/rdsk and /dev/dsk. The
script chk.rmodules.sh can be used to check the node names and also their
binding in VxVM. It creates a file in the current directory called
rdsk.rdriver.names; this is a list of the rdriver names that need to be removed
from /dev/rdsk and /dev/dsk.

Examples:

  rm -r /dev/osa; touch /reconfigure; init 6

  cat rdsk.rdriver.names
  /dev/rdsk/c0t4d1s6
  /dev/rdsk/c0t4d4s6
  /dev/rdsk/c0t4d5s6
  /dev/rdsk/c0t4d6s6
  /dev/rdsk/c2t5d0s6
  /dev/rdsk/c2t5d2s6
  /dev/rdsk/c2t5d3s6

  (csh syntax: remove the rdriver nodes from /dev/rdsk, then from /dev/dsk)
  foreach d (`cat rdsk.rdriver.names`)
    rm $d
  end

  foreach d (`cat rdsk.rdriver.names | sed -e 's/rdsk/dsk/g'`)
    rm $d
  end
-------------------------------------------------------------------------------
Scenario 20

Hung process: Sonoma uses a file to serialize access to shared resources, i.e.
for locking. If rm6 or one of the CLI utilities is hung, chances are it is a
leftover lock condition. Check for and remove the file /etc/osa/lunlocks. Be
careful *not* to delete the symlink /etc/raid/locks; this is a pointer to a
directory that contains more lock files. It is far less common, but it is
possible to have a hang condition because of an entry in this directory. If a
command is hung, use truss(1M) to observe the command; if it is looping on a
reference to a file in this directory, that file is a likely culprit. Be sure no
legitimate use of the array is in progress before removing the file.

If a process is hung while accessing node names under /dev/osa/dev/rdsk, then
rebuild the LUN names as described in Scenario 19.
Examples:

  truss -f lad
  truss -f -p PID
  rm /etc/osa/lunlocks
-------------------------------------------------------------------------------
Scenario 21

Unable to perform an operation on a LUN: Both controllers in an RSM2000 have a
relationship to every LUN; each is either the primary or the secondary path. The
primary controller has ownership of, and exclusive write access to, the LUN's
metadata. If there is a problem performing an operation, you can access the LUN
through the other channel and/or change the LUN's ownership. This will likely
shed more light on the nature of the problem and generate additional diagnostic
messages, or it may resolve it.

Example:

  lad
  c0t4d1s0 1T63350903 LUNS: 1 4 5 6
  c2t5d0s0 1T63350944 LUNS: 0 2 3

  raidutil -c c0t4d1s0 -D 0
  LUNs found on c0t4d1s0.
    LUN 1    RAID 5    16000 MB
    LUN 4    RAID 5    16000 MB
    LUN 5    RAID 5    16000 MB
    LUN 6    RAID 5    16000 MB
  Deleting LUN 0.
  Press Control C to abort.

  Deleting LUN Failed:
   ** Check Condition **
  SENSE Data:
  7000050000000098 0000000094010000 0000000000000000 0000000000000000
  0000000000000000 0000000000000000 0000000000000000 0000000000000000
  0000000000000000 0000000000000000
  Sense Key: 05  ASC: 94  ASCQ: 01
  raidutil program failed.

This terrifying message is saying that the LUN belongs to the other controller
(note in the lad output that LUN 0 is owned by c2t5d0s0, not c0t4d1s0). I intend
to file RFEs on the error message handling.
-------------------------------------------------------------------------------
Scenario 22

If you have multiple Sonomas, how do you make the connection between which
controllers go with which RAID Module?

In Phase Two there is a file called "mnf", located in /etc/osa. This is an ASCII
file; inside, it lists the RAID Module names along with the controllers that
belong to each.
-------------------------------------------------------------------------------
Scenario 23

VxVM/RSM2000 LUN errors:

1) The condition:

  mopti# vxdisk define c2t4d6s2
  vxvm:vxdisk: ERROR: Device c2t4d6s2: define failed: Disk is not usable

2) The LUN is labeled, but VxVM does not like the layout. Take a vtoc from a
good LUN of the same kind:

  prtvtoc /dev/rdsk/c13t4d6s2 > good.r1.vtoc
  fmthard -s good.r1.vtoc /dev/rdsk/c2t4d6s2
  fmthard: Partition 4 specified as 16752640 sectors starting at 4096
           does not fit. The full disk contains 16375808 sectors.

(Adjust the vtoc to the correct size reported above.)

  mopti# fmthard -s fix.r1.vtoc /dev/rdsk/c2t4d6s2
  fmthard: New volume table of contents now in place.
  mopti# vxdisk init c2t4d6s2
  mopti# vxdisk list | grep c2t4d6s2
  c2t4d6s2     sliced    -         -        online
-------------------------------------------------------------------------------
Scenario 24

I created a 6-disk RAID 1+0 LUN that is 9 GB; the disks are 4.2 GB. I removed
one of the drives and the LUN started reconstructing. I have no hot spares
configured. What is it reconstructing the failed disk to, and why is it doing
this?

Solution: This is the "awe and mystery" of hot-relocation. Essentially, if
vxrelocd detects errors on a volume/drive, it scans your disks to see if there
is sufficient UNallocated space somewhere and relocates the data there
automagically. This has been met with mixed reactions, so some people disable
vxrelocd and turn on vxsparecheck (the old hot-spare paradigm).
-------------------------------------------------------------------------------
Scenario 25

Has anyone run the new RM6.1 with plain Solaris 2.5.1 (NOT 2.5.1 4/97 or 8/97)?
I installed all the patches in the matrix, and upon booting I got this error:

  Spill 5 normal

and the system came back to the boot PROM. Any way around this?
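One thing worth confirming before digging into recovery is whether all of the
matrix patches actually installed. A minimal sketch; the patch ID shown is only
a placeholder, substitute the IDs from the matrix you are working from:

  showrev -p | grep 104563     (list installed patches and look for the one in question)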
Solution: What I did to recover the system was to boot from CD-ROM, mount the
root file system as /mnt, and then use pkgrm with "-R /mnt" to tell it that the
root file system is under /mnt instead of /. I removed all five "osa" packages
and the system was back to normal.
-------------------------------------------------------------------------------
Scenario 26

An existing StorEdge A3000 is running normally; the user then edits the rmparams
file to increase the number of LUNs to something greater than 8. Everything
appears to be normal until the user soon discovers that RM6 can no longer see
any controllers or drives. However, he can still see his LUNs.

Background: As of this date we still do not have >8 LUN support, so when he
created a >8 LUN configuration, this information was written out to what a lot
of people call the "private region" (the proper name is DACstore). The procedure
below describes how to recover from this scenario.

WARNING: These steps will completely destroy any previously existing
configuration. They should only be done as a last resort, and only if the
reset-configuration option via the GUI and CLI does not work!

1. To ensure that the problem wasn't caused by a problem in the rmparams file,
   reset the rmparams file to its original default state (this file should have
   been saved shortly after the initial install of RM6, but it is available from
   the CD-ROM if it wasn't).
2. Remove all power from the host and the RSM2K.
3. Disconnect all disks from the RSM2K save one. The goal here is to force RM6
   to rebuild a default configuration by denying it access to the previous (and
   corrupted) one. This is done by removing the 3 disks that contain the
   existing configuration information. Since there is no certainty as to where
   these disks are located, I advocate removing them all except the 1 disk the
   RAID controllers need to build a new configuration.
4. Disconnect the battery backup for about 20-30 seconds (this ensures that
   NVRAM is drained on the RAID controllers).
5. Power on the RSM2K with the single disk.
6. Power on and boot the host.
7. After the host is up, start RM6 and verify that the configuration is
   accessible and reset. If you still have problems, try another disk in step 3.
8. Re-install all remaining disks into the RSM2K. You should see the number of
   disks increase via the RM6 GUI Configuration application.
9. To ensure that everything works, use the reset-configuration option from the
   GUI or CLI.
-------------------------------------------------------------------------------
Scenario 27

IHAC with a newly installed RSM2000 that started getting these errors 30 hours
after the installation. Healthck reports no errors. His LUNs are concatenated;
his configuration consists of 8 LUNs, seven with 4 disks each and one with 6
disks.

  Feb 19 06:39:03 ssht45 raid: Parity event Host=ssht45 Ctrl=1T71525352 Dev=c4t5d2s0
  Feb 19 06:39:03 ssht45 raid: Start Blk=032986E1 End Blk=032987E9 # Blks=0000000A LUN=02
  Feb 19 07:58:46 ssht45 raid: Parity event Host=ssht45 Ctrl=1T73942619 Dev=c5t4d3s0
  Feb 19 07:58:46 ssht45 raid: Start Blk=021BAF61 End Blk=021BAFE9 # Blks=0000000A LUN=03
  Feb 19 09:21:50 ssht45 raid: Parity event Host=ssht45 Ctrl=1T71525352 Dev=c4t5d4s0

Solution:

[1] Replace whichever controller is yellow-lighted; in our case it was the top
    one.
[2] Replace the cache and processor memory. DO NOT transfer the memory from the
    controller you are replacing!
    It is impossible to tell which memory SIMM is bad, and it could be either.
    There are NO DIAGNOSTICS for this problem.
[3] Make certain that the controller firmware is at least 2.4.4.1.
[4] Upgrade to RM6.1 on the server.
-------------------------------------------------------------------------------
Scenario 28

My customer has an SC2000 with 3 RSM2000s, one of which is in rootdg. Since he
discovered this is unsupported, we are trying to move that RSM2000 out of
rootdg. The RSM2000 is configured as 7 RAID 5 LUNs. Is there a way to transfer
these disks from rootdg to a newly created disk group WITHOUT a data backup and
restore procedure?

Solution: srdb 12177
-------------------------------------------------------------------------------
Scenario 29

I am getting an error on Module 10, LUN 1. How can I tell what disk is involved?

/var/adm/messages:

  The errored I/O is being routed to the Resolution daemon
  The Array Resolution Daemon is retrying an I/O on Module 10, LUN 1 at sector 129

So what cXtXdXsX is Module 10, LUN 1?

Look in the following file first, /kernel/drv/rdriver.conf:

  name="rdriver" module=10 lun=1 target=5 parent="/pseudo/rdnexus@5"
          dev_a=0x801040 dev_b=0x801628;

The /devices path for that entry is:

  /devices/pseudo/rdnexus@5/rdriver@5,1:c,raw

Find this device under rdsk:

  ls -l /dev/rdsk/* | grep 'rdnexus@5/rdriver@5,1:c,r'

It was /dev/rdsk/c5t5d1s2 that is associated with Module 10, LUN 1.
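If you find yourself doing this lookup often, the two steps can be wrapped in a
tiny Bourne shell sketch. This is a hypothetical helper, not part of RM6; the
script name and the grep patterns are assumptions based on the rdriver.conf
entry and /dev/rdsk link format shown above:

  #!/bin/sh
  # whichdisk.sh (hypothetical): map "Module N, LUN M" to a cXtXdXsX name
  # usage: whichdisk.sh <module> <lun>
  MODULE=$1
  LUN=$2
  # show the rdriver.conf entry for this module/LUN (an entry may span more than one line)
  grep "module=${MODULE} " /kernel/drv/rdriver.conf | grep "lun=${LUN} "
  # list the /dev/rdsk links whose minor name ends in ",<lun>:c,raw"; cross-check the
  # rdnexus instance against the entry printed above, since another module may use the
  # same LUN number
  ls -l /dev/rdsk/* | grep "rdriver@[0-9]*,${LUN}:c,r"
--------------------------------------------------------------------------------------------------------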