[rancid] Re: HP ProCurves losing management interfaces

Jody Botham jody at ask4.com
Tue Apr 14 08:55:56 UTC 2009


Michael Kania wrote:
> All,
> 
> I've been having a problem with a subset of switches periodically losing 
> their management interfaces. We have 3 data centers set up within the 
> United States and only 1 is having this problem. The problem data center 
> is unique in that it is our largest (roughly 110 HP ProCurve 2810s) and 
> CPU usage on each switch averages 35-45%. The servers behind each switch 
> remain connected while the management interface is down. Pinging, 
> snmpget and ssh all fail. The downed management interface on the switch 
> eventually recovers and logs don't show any sign of failure.
> 
> The rancid logs show a timeout when trying to contact that switch and 
> then 3 failures to ssh. I've found that when rancid polls the switch, CPU 
> usage spikes dramatically, and my assumption was that the severe spikes 
> in CPU utilization cause the management interface to fall over. To 
> mitigate this, I've turned down the number of retries and the 
> polling interval, but the problem still remains. Is anyone familiar with 
> this issue?
> 
> I'm using rancid version 2.3.2~a9 on Debian etch.
> 
> Thanks,
> Mike Kania

What's the exact model of 2810 and what firmware are you running?  We've 
had similar issues with ProCurve kit (different model of switch but I 
think the 2810 may have the same ASIC) recently and worked with them to 
resolve the bug in their firmware.  You can mail me off list if you need to.
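
If it helps to pin down exactly when the management interface drops relative
to the rancid runs, something like the following could be left running against
one affected switch. This is only a minimal sketch, not anything rancid itself
provides: the function names (probe, find_outages, monitor), the choice of TCP
port 22 as the reachability check, and the 10 second interval are all
assumptions made for illustration.

```python
import socket
import time
from typing import Iterable, List, Tuple


def probe(host: str, port: int = 22, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


def find_outages(samples: Iterable[Tuple[float, bool]]) -> List[Tuple[float, float]]:
    """Collapse (timestamp, reachable) samples into (down_from, down_until) windows.

    A window still open at the end of the samples is closed at the last
    timestamp seen.
    """
    outages: List[Tuple[float, float]] = []
    down_since = None
    last_ts = None
    for ts, up in samples:
        if not up and down_since is None:
            down_since = ts                    # outage starts
        elif up and down_since is not None:
            outages.append((down_since, ts))   # outage ends
            down_since = None
        last_ts = ts
    if down_since is not None:
        outages.append((down_since, last_ts))
    return outages


def monitor(host: str, interval: float = 10.0, count: int = 360) -> List[Tuple[float, float]]:
    """Probe host every `interval` seconds, `count` times, and return outage windows."""
    samples = []
    for _ in range(count):
        samples.append((time.time(), probe(host)))
        time.sleep(interval)
    return find_outages(samples)
```

Calling monitor() with the hostname of one affected switch returns a list of
(start, end) epoch timestamps that can be compared against the times in the
rancid logs. A single TCP connect per interval is light compared with a full
config collection, but it still touches the management plane, so the interval
is worth keeping generous.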

Thanks,

Jody
