[rancid] Improving Rancid's processing speed when having 1k+ devices

Piegorsch, Weylin William weylin at bu.edu
Mon Jul 29 21:01:56 UTC 2019


> topologically close servers can help, but I would just run more processes instead.

Agreed in 99% of cases.  There are rare niche scenarios, though, where having geographically co-located servers can help: slow WAN connections ("dial-up"), high-latency or high-packet-loss connections (satellite), unreliable WAN links (ship at sea), and so forth.

weylin
 
On 7/29/19, 2:06 PM, "john heasley" <heas at shrubbery.net> wrote:

    Fri, Jul 26, 2019 at 02:34:49AM -0700, Florin Vlad Olariu:
    > On 25 July 2019 at 18:16:48, Scott Granados
    > (scott.granados at gmail.com) wrote:
    > 
    > > I would also recommend running multiple rancid servers, maybe scattered geographically, so it's not a single machine pulling all the weight. Break the workloads up among them.
    > 
    > Great advice which didn't cross my mind. Might have to resort to this
    > if I want ~ 1m poll times.
    
    topologically close servers can help, but I would just run more processes
    instead.  less mgmt overhead.
    
    > > - make sure that the rancid user is not process rlimited to less than ~605
    > > processes; or PAR_COUNT * 2 + 5 or so.
    > 
    > My `ulimit -u` gives "4096". I don't think this is a factor?
    
    unlikely.  make sure it's not one of the other limits (ulimit -n, -d).
    you'd see processes being killed in the logs
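
A rough way to check the limits discussed above is sketched below in Python. The "PAR_COUNT * 2 + 5" rule of thumb comes from the thread (605 corresponds to a PAR_COUNT of 300 under that formula); the helper names required_nproc and check_limits are hypothetical, and the script assumes a Linux or BSD system where Python's resource module exposes RLIMIT_NPROC. It would need to run as the rancid user to see that user's limits.

```python
import resource


def required_nproc(par_count, headroom=5):
    """Rule of thumb from the thread: PAR_COUNT * 2 + 5 or so."""
    return par_count * 2 + headroom


def check_limits(par_count):
    """Return (nproc_ok, soft_limits) for the current user.

    soft_limits maps a label to the soft limit reported by getrlimit().
    Note: getrlimit() reports the data segment size in bytes, while
    the shell's `ulimit -d` reports it in kilobytes.
    """
    limits = {
        "nproc (-u)": resource.RLIMIT_NPROC,
        "nofile (-n)": resource.RLIMIT_NOFILE,
        "data (-d)": resource.RLIMIT_DATA,
    }
    soft_limits = {}
    for label, res in limits.items():
        soft, _hard = resource.getrlimit(res)
        soft_limits[label] = soft

    nproc = soft_limits["nproc (-u)"]
    # RLIM_INFINITY means "unlimited", which always satisfies the rule.
    nproc_ok = (nproc == resource.RLIM_INFINITY
                or nproc >= required_nproc(par_count))
    return nproc_ok, soft_limits


if __name__ == "__main__":
    par_count = 300  # assumption: substitute the PAR_COUNT from rancid.conf
    ok, soft_limits = check_limits(par_count)
    print("need at least", required_nproc(par_count), "processes")
    for label, value in soft_limits.items():
        shown = "unlimited" if value == resource.RLIM_INFINITY else value
        print(label, "=", shown)
    print("process limit sufficient:", ok)
```

This only reads the limits; it does not detect processes that were already killed, so the rancid logs remain the place to confirm that.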
    
    ...
    
    Are your configs very large?  I have one group of 252 devices that are
    scattered around the globe, totaling 1.2G of on-disk rancid output, which
    takes about 28m to collect with 16 processes.
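
For comparing against another installation, the figures quoted above can be turned into rough averages. The sketch below is only back-of-the-envelope arithmetic: it assumes "1.2G" means 1.2 * 10^9 bytes and that all 16 processes stay busy for the full 28 minutes, neither of which is stated in the thread, and the function name collection_stats is hypothetical.

```python
def collection_stats(devices, total_bytes, wall_seconds, processes):
    """Derive rough per-device averages from one collection run."""
    return {
        # Average on-disk output per device.
        "avg_bytes_per_device": total_bytes / devices,
        # Output collected per second across all processes.
        "aggregate_bytes_per_second": total_bytes / wall_seconds,
        # Devices completed per minute of wall-clock time.
        "devices_per_minute": devices / (wall_seconds / 60),
        # Process-seconds spent per device, assuming every process
        # is busy for the whole run (an assumption, not a measurement).
        "avg_seconds_per_device": wall_seconds * processes / devices,
    }


if __name__ == "__main__":
    stats = collection_stats(
        devices=252,
        total_bytes=1.2e9,      # assumption: "1.2G" read as 1.2 * 10^9 bytes
        wall_seconds=28 * 60,
        processes=16,
    )
    for name, value in stats.items():
        print(f"{name}: {value:,.1f}")
```

Under these assumptions each device averages roughly 4.8 MB of output and occupies a process for a little under two minutes, so plugging in a different device count or process count gives a first guess at how collection time might scale.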
    
    
    


