[rancid] Any idea on why ssh would resolve hostnames differently from an interactive shell?

Janet Plato techgrrl2003 at yahoo.com
Mon Nov 26 02:07:02 UTC 2007


Hello,

  I hope this is an appropriate place to ask questions; if not, I
apologize and would appreciate a pointer to the correct location.  The
essence of my question is this: when clogin spawns ssh directly, the ssh
process fails to resolve the hostname.  Viewed with strace -f, it sends
a rather cryptic 17-byte request that, as far as I can tell, never
appears on the wire.  When clogin spawns bash -c ssh instead, ssh opens
a socket on port 53, sends a normal volley of requests (one for each
domain in resolv.conf), gets an answer, and connects successfully.

  I can also have clogin spawn a bash shell and interact, in which case
everything works great.

  I am running a config management system more or less based on rancid,
using the clogin expect script.  I have tweaked the script a bit to
deal with a variety of Cisco and non-Cisco devices, and to handle
commands that take longer to execute, such as archive
download-software.  I would be glad to provide copies of the source to
anyone interested; the changes are minimal but possibly of interest to
someone.

  I am now extending the supported connection types to include ssh v2,
and I am having some trouble.  When I have expect "spawn ssh -x
user@device", ssh fails to resolve the hostname and returns EOF, which
makes clogin exit.  When I "spawn bash" and interact, or "spawn
/bin/bash -c 'ssh device'", it works just fine.  I have used strace to
see what differs, but I am having trouble understanding the output.  It
appears that my DNS resolution method changes when I spawn ssh versus
when I spawn bash -c ssh.

  I assume most folks are familiar with the foreach device loop in
clogin, and contained within it the case statement where, for each
device, you try to determine the connection method and spawn the
relevant program.  I am copying just the relevant bit of the case
statement, with some comments:

# Log into the router.
proc login { router user userpswd passwd enapasswd cmethod cyphertype } {
    global spawn_id in_proc do_command do_script platform sshver
    global prompt u_prompt p_prompt e_prompt
    set in_proc 1
    set uprompt_seen 0

# debug 1
# exp_internal 1

    # try each of the connection methods in $cmethod until one is successful
    set progs [llength $cmethod]
    foreach prog [lrange $cmethod 0 end] {
        if [string match "telnet*" $prog] {
... code for telnet deleted ...
        } elseif ![string compare $prog "bash-ssh"] {
            # if bash spawns ssh, I can log in fine
            if [ catch {spawn /bin/bash -c "ssh $sshver -c $cyphertype -x $user@$router"} reason ] {
                send_user "\nError: ssh failed: $reason\n"
                exit 1
            }
        } elseif ![string compare $prog "ssh"] {
            # this fails, ssh returns EOF when it cannot determine the host name
            # spawn ssh  -c 3des -x net@fa-cssc-b280c-3-ban-pri
            # ssh: : Name or service not known
            # sniffing the wire shows no query, strace shows a cryptic 17-byte
            # send followed by a DNS failure
            if [ catch {spawn ssh $sshver -c $cyphertype -x $user@$router} reason ] {
                send_user "\nError: ssh failed: $reason\n"
                exit 1
            }
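
One difference between the two branches above may matter.  Expect's
spawn hands each Tcl word to the program as its own argument, so an
empty $sshver would still reach ssh as an empty-string argument, whereas
bash re-splits the command line and drops it.  Whether $sshver really is
empty here is an assumption (the two spaces in the "spawn ssh  -c 3des"
comment hint at it).  The sketch below is Python rather than Expect, and
uses a child process that only prints its arguments in place of ssh; the
names argv_direct and argv_via_shell and the net/device values are
illustrative only.

```python
import subprocess
import sys

# Child program that just prints the arguments it received (a stand-in for ssh).
SHOW_ARGV = "import sys; print(sys.argv[1:])"


def argv_direct(sshver, user, router):
    """Exec the child directly, one list element per argument (no shell)."""
    cmd = [sys.executable, "-c", SHOW_ARGV, sshver, "-x", f"{user}@{router}"]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout.strip()


def argv_via_shell(sshver, user, router):
    """Build one command string and let /bin/sh split it into words."""
    line = f'"{sys.executable}" -c "{SHOW_ARGV}" {sshver} -x {user}@{router}'
    result = subprocess.run(["/bin/sh", "-c", line],
                            capture_output=True, text=True, check=True)
    return result.stdout.strip()


if __name__ == "__main__":
    # Assumption: sshver is empty, as it might be when no version flag is set.
    print("direct:", argv_direct("", "net", "device"))
    print("shell: ", argv_via_shell("", "net", "device"))
```

In the direct case the child sees a leading empty argument, while the
shell case drops it.  If ssh received such an empty argument it would
take it as the hostname, which would be consistent with the "ssh: : Name
or service not known" message.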


Below is the strace -f output from a failed clogin attempt.  Note the
send, which has to be the DNS request, since the process has just opened
a socket on port 53.  I do not understand how it could be, though; it
does not look like a valid packet or fragment.

  [pid 27642] send(4, "\205\r\1\0\0\1\0\0\0\0\0\0\0\0\1\0\1", 17, 0) = 17

and the following recvfrom

  [pid 27642] recvfrom(4, "\205\r\201\200\0\1\0\0\0\1\0\0\0\0\1\0\1\0\0\6\0\1\0\0"..., 1024, 0, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("128.104.254.254")}, [16]) = 92
  [pid 27642] close(4)                    = 0
  [pid 27642] write(2, "ssh: : Name or service not known"..., 34) = 34
  [pid 27642] exit_group(255) 

  The strace manpage says the \### escapes are supposed to be in a
format a C programmer would understand, but I do not understand them.
Are they a mix of octal escapes and the \t, \r, \n we all know and love?
In some cases I have seen \Dg, which kind of throws the
octal-plus-normal-escape theory out the window.  Knowing what strace is
telling me would be a fine start for me.
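
As a rough aid for reading these strings: strace prints non-printable
bytes with C-style escapes, that is, named escapes such as \r and \n
plus octal escapes of one to three digits (\205 is byte 0x85, \0 is a
zero byte).  The sketch below is Python, not part of clogin, and
decode_strace is an illustrative name; it turns the text between the
quotes of a strace string back into raw bytes, assuming only those
escape forms (plus \x hex pairs) occur.

```python
import re

# Named single-character escapes and the byte values they stand for.
_NAMED = {"n": 10, "t": 9, "r": 13, "v": 11, "f": 12, "\\": 92, '"': 34}

# A backslash followed by 1-3 octal digits, an \xHH hex pair, or any one char.
_ESCAPE = re.compile(r"\\([0-7]{1,3}|x[0-9a-fA-F]{1,2}|.)", re.S)


def decode_strace(text):
    """Convert the inside of a quoted strace string into bytes."""
    out = bytearray()
    pos = 0
    for match in _ESCAPE.finditer(text):
        # Copy any literal characters that precede this escape.
        out += text[pos:match.start()].encode("latin-1")
        token = match.group(1)
        if token[0] in "01234567":
            out.append(int(token, 8) & 0xFF)      # octal escape
        elif token[0] == "x" and len(token) > 1:
            out.append(int(token[1:], 16))        # hex escape
        else:
            out.append(_NAMED.get(token, ord(token)))  # named or literal
        pos = match.end()
    out += text[pos:].encode("latin-1")
    return bytes(out)


if __name__ == "__main__":
    failed_send = r"\205\r\1\0\0\1\0\0\0\0\0\0\0\0\1\0\1"
    data = decode_strace(failed_send)
    print(len(data), data.hex())
```

Applied to the send shown above, this yields 17 bytes, matching the
length strace reports for the call.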

Strace from a failed attempt
------------------------------------------------------------
[pid 27642] socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 4
[pid 27642] connect(4, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("128.104.254.254")}, 28) = 0
[pid 27642] fcntl64(4, F_GETFL)         = 0x2 (flags O_RDWR)
[pid 27642] fcntl64(4, F_SETFL, O_RDWR|O_NONBLOCK) = 0
[pid 27642] gettimeofday({1195955591, 14548}, NULL) = 0
[pid 27642] poll([{fd=4, events=POLLOUT, revents=POLLOUT}], 1, 0) = 1
[pid 27642] send(4, "\205\r\1\0\0\1\0\0\0\0\0\0\0\0\1\0\1", 17, 0) = 17
[pid 27642] poll([{fd=4, events=POLLIN, revents=POLLIN}], 1, 5000) = 1
[pid 27642] ioctl(4, FIONREAD, [92])    = 0
[pid 27642] recvfrom(4, "\205\r\201\200\0\1\0\0\0\1\0\0\0\0\1\0\1\0\0\6\0\1\0\0"..., 1024, 0, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("128.104.254.254")}, [16]) = 92
[pid 27642] close(4)                    = 0
[pid 27642] write(2, "ssh: : Name or service not known"..., 34) = 34
[pid 27642] exit_group(255) 
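
Read as a DNS message in the standard wire format (a 12-byte header,
then the question name as length-prefixed labels), the 17-byte send
above can be unpacked with a few lines.  This is a hedged Python sketch
with illustrative names (parse_dns_query, FAILED_SEND); it handles only
a single uncompressed question, which is all a plain query like this one
should contain.

```python
import struct

# The 17-byte payload from the failed send(), written out as hex:
# 12-byte header, root label, query type, query class.
FAILED_SEND = bytes.fromhex("850d01000001000000000000" "00" "0001" "0001")


def parse_dns_query(packet):
    """Parse the header and first question of an uncompressed DNS query."""
    ident, flags, qdcount, ancount, nscount, arcount = struct.unpack(
        "!6H", packet[:12])
    labels = []
    pos = 12
    while True:
        length = packet[pos]
        pos += 1
        if length == 0:  # a zero-length label terminates the name
            break
        labels.append(packet[pos:pos + length].decode("ascii"))
        pos += length
    qtype, qclass = struct.unpack("!2H", packet[pos:pos + 4])
    return {
        "id": ident,
        "recursion_desired": bool(flags & 0x0100),
        "questions": qdcount,
        "name": ".".join(labels),
        "qtype": qtype,
        "qclass": qclass,
    }


if __name__ == "__main__":
    print(parse_dns_query(FAILED_SEND))
```

With these bytes the question name comes out empty, so the packet looks
like a well-formed A-record query for an empty hostname rather than a
corrupt fragment, which would match the blank name in "ssh: : Name or
service not known".  The successful send cannot be decoded the same way
from this trace, because strace cut the string off after 32 bytes;
running strace with a larger -s value would show the full payload.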

Here is the strace -f from a successful attempt
--------------------------------------------------------------
[pid 17019] socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 4
[pid 17019] connect(4, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("128.104.254.254")}, 28) = 0
[pid 17019] fcntl64(4, F_GETFL)         = 0x2 (flags O_RDWR)
[pid 17019] fcntl64(4, F_SETFL, O_RDWR|O_NONBLOCK) = 0
[pid 17019] gettimeofday({1195953633, 817845}, NULL) = 0
[pid 17019] poll([{fd=4, events=POLLOUT, revents=POLLOUT}], 1, 0) = 1
[pid 17019] send(4, "\233\233\1\0\0\1\0\0\0\0\0\0\10fa-janet\3net\4wisc\3e"..., 39, 0) = 39
[pid 17019] poll([{fd=4, events=POLLIN, revents=POLLIN}], 1, 5000) = 1
[pid 17019] ioctl(4, FIONREAD, [184])   = 0
[pid 17019] recvfrom(4, "\233\233\201\200\0\1\0\1\0\3\0\4\10fa-janet\3net\4wisc"..., 1024, 0, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("128.104.254.254")}, [16]) = 184
[pid 17019] close(4)                    = 0
[pid 17019] socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 4
[pid 17019] connect(4, {sa_family=AF_INET, sin_port=htons(22), sin_addr=inet_addr("128.104.137.36")}, 16) = 0

Anyway, if anyone can shed light on the following questions I would be
quite grateful.

 - Why would ssh resolve hostnames differently when spawned by expect
versus when invoked by bash (which was itself spawned from expect)?

 - What are the \### escapes in the strace output telling me, especially
in the 17-byte send that appears to be a DNS request doomed to fail?

I hope everyone had a great Thanksgiving, gobble, gobble,

Janet Plato

