Monday, December 15, 2008

drb.rb:852:in `initialize': getaddrinfo: nodename nor servname provided, or not known (SocketError) (aka DRb TCPServer.open(0) failure on OS X)


Update (2009/01/01):
Problem TT moved to redmine.
Update (2009/05/13): TT resolved.

I am currently busy building a priority queue server in Ruby and I have chosen to use DRb as my communications platform.

While experimenting with the simple examples from the Net (see here and here) I was consistently getting the same error from inside drb.rb (/opt/local/lib/ruby/1.8/drb/drb.rb:852):

/opt/local/lib/ruby/1.8/drb/drb.rb:852:in `initialize': getaddrinfo: nodename nor servname provided, or not known (SocketError)
from /opt/local/lib/ruby/1.8/drb/drb.rb:852:in `open'
from /opt/local/lib/ruby/1.8/drb/drb.rb:852:in `open_server_inaddr_any'
from /opt/local/lib/ruby/1.8/drb/drb.rb:864:in `open_server'
from /opt/local/lib/ruby/1.8/drb/drb.rb:759:in `open_server'
from /opt/local/lib/ruby/1.8/drb/drb.rb:757:in `each'
from /opt/local/lib/ruby/1.8/drb/drb.rb:757:in `open_server'
from /opt/local/lib/ruby/1.8/drb/drb.rb:1346:in `initialize'
from /opt/local/lib/ruby/1.8/drb/drb.rb:1634:in `new'
from /opt/local/lib/ruby/1.8/drb/drb.rb:1634:in `start_service'
from ./queue-provider.rb:32

What's up?
After poking drb.rb::self.open_server_inaddr_any(host, port) with a stick a few times, two issues came to light:

  1. Multiple network address families are not catered for properly in the code.

  2. TCPServer.open(port) where port == 0 fails under OS X but not Linux

Multiple Address Families
The code in question looks like this:

def self.open_server_inaddr_any(host, port)
  infos = Socket::getaddrinfo(host, nil,
                              Socket::AF_UNSPEC,
                              Socket::SOCK_STREAM, 0,
                              Socket::AI_PASSIVE)
  family = infos.collect { |af, *_| af }.uniq
  case family
  when ['AF_INET']
    return TCPServer.open('0.0.0.0', port)
  when ['AF_INET6']
    return TCPServer.open('::', port)
  else
    return TCPServer.open(port)
  end
end

From that we can see that the code expects exactly one network address family, which is a little naive. Socket::getaddrinfo() on my MacBook Pro has the following to say (where host == 'localhost'):

$ irb
irb(main):006:0> require "socket"
=> true
irb(main):007:0> host = 'localhost'
=> "localhost"
irb(main):008:0> Socket::getaddrinfo(host, nil,
irb(main):009:1* Socket::AF_UNSPEC,
irb(main):010:1* Socket::SOCK_STREAM, 0,
irb(main):011:1* Socket::AI_PASSIVE)
=> [["AF_INET6", 0, "localhost", "::1", 30, 1, 6], ["AF_INET6", 0, "localhost", "fe80::1%lo0", 30, 1, 6], ["AF_INET", 0, "localhost", "127.0.0.1", 2, 1, 6]]

With this as input, family becomes ['AF_INET6', 'AF_INET'], which matches neither ['AF_INET'] nor ['AF_INET6'], so we fall through to return TCPServer.open(port). Each when clause only matches an array with exactly one element.
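The fall-through can be reproduced without touching the network. The following is a minimal sketch that hardcodes the getaddrinfo result from the irb session above; branch_for is a hypothetical helper that only mirrors the shape of the case statement in drb.rb and returns a symbol instead of opening a socket.

```ruby
# Result of Socket::getaddrinfo for 'localhost', copied from the irb session above.
infos = [
  ["AF_INET6", 0, "localhost", "::1", 30, 1, 6],
  ["AF_INET6", 0, "localhost", "fe80::1%lo0", 30, 1, 6],
  ["AF_INET", 0, "localhost", "127.0.0.1", 2, 1, 6]
]

# Same expression drb.rb uses to collect the address families.
family = infos.collect { |af, *_| af }.uniq

# branch_for is a hypothetical helper (not part of drb.rb). It mirrors the
# case statement and reports which branch would be taken.
def branch_for(family)
  case family
  when ['AF_INET']  then :ipv4_any   # only matches a one-element array
  when ['AF_INET6'] then :ipv6_any   # only matches a one-element array
  else                   :port_only  # TCPServer.open(port)
  end
end

p family
puts branch_for(family)
```

Because Array#=== is plain equality, a two-element family list can never equal either one-element array, so the else branch is the only one reachable on a dual-stack host.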

TCPServer.open(0) OS X Weirdness
I have used DRb on both Linux and Windblowns in the past without a hitch, so I was rather surprised to run into something like this, which is a showstopper on OS X. I thought I'd see if I was having the same problems on Linux to have something to compare with:

$ irb
irb(main):001:0> require "socket"
=> true
irb(main):002:0> port = 0
=> 0
irb(main):003:0> TCPServer.open(port)
=> #<TCPServer:0x...>

Works a treat! Let's try that on OS X:

$ irb
irb(main):001:0> require "socket"
=> true
irb(main):002:0> port = 0
=> 0
irb(main):003:0> TCPServer.open(port)
SocketError: getaddrinfo: nodename nor servname provided, or not known
from (irb):3:in `initialize'
from (irb):3:in `open'
from (irb):3
from :0

CRASH! BOOM! BANG!
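For context, port 0 is a request for the operating system to pick any free port, which is what DRb relies on when no port is given. The following is a minimal sketch of that behaviour. It passes an explicit loopback host, which is an assumption made for this example, so it does not reproduce the hostless TCPServer.open(0) call that fails above.

```ruby
require "socket"

# Bind with port 0 so the OS assigns a free (ephemeral) port.
# The explicit "127.0.0.1" host is an assumption made for this sketch.
server = TCPServer.open("127.0.0.1", 0)

# addr returns [family, port, hostname, numeric_address];
# index 1 is the port the OS actually assigned.
assigned_port = server.addr[1]
puts "listening on port #{assigned_port}"

server.close
```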

DRb Quilt
The first issue is rather trivial to fix:

def self.open_server_inaddr_any(host, port)
  infos = Socket::getaddrinfo(host, nil,
                              Socket::AF_UNSPEC,
                              Socket::SOCK_STREAM,
                              0,
                              Socket::AI_PASSIVE)
  families = Hash[*infos.collect { |af, *_| af }.uniq.zip([]).flatten]
  return TCPServer.open('0.0.0.0', port) if families.has_key?('AF_INET')
  return TCPServer.open('::', port) if families.has_key?('AF_INET6')
  return TCPServer.open(port)
end

The code now rather assumes we're dealing with an array of one or more network address families and tries the IPv4 and IPv6 families first and then falls through to TCPServer.open(port).
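The selection order can be checked in isolation. This is a sketch rather than the submitted patch: bind_address_for is a hypothetical helper, and it uses Array#include? where the patch builds a Hash, purely to keep the example short. It returns the wildcard address that would be bound, or nil for the port-only fallback.

```ruby
# bind_address_for is a hypothetical helper, not part of drb.rb. It applies
# the same preference order as the patched method: IPv4 first, then IPv6,
# then the port-only fallback (represented here by nil).
def bind_address_for(infos)
  families = infos.collect { |af, *_| af }.uniq
  return '0.0.0.0' if families.include?('AF_INET')
  return '::'      if families.include?('AF_INET6')
  nil
end

# Sample inputs shaped like Socket::getaddrinfo results.
dual_stack = [["AF_INET6", 0, "localhost", "::1", 30, 1, 6],
              ["AF_INET", 0, "localhost", "127.0.0.1", 2, 1, 6]]
ipv6_only  = [["AF_INET6", 0, "localhost", "::1", 30, 1, 6]]

p bind_address_for(dual_stack)
p bind_address_for(ipv6_only)
p bind_address_for([])
```

On a dual-stack host this prefers the IPv4 wildcard address, so the port-only call is only reached when neither family is reported.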

I have opened a TT on RubyForge for this that contains a patch from me to fix the first issue.

What is required to fix the second issue? Dunno just yet, I'll keep looking and see if anything interesting pops up in the TT.




1 comment:

dies-el said...

Hi Charl,

Thanks for writing this up, I came across the very same issue while trying to use spec_server with rspec, though i probably would have chalked it up as a newbie mistake rather than something up with the foundations!

For your own reference, heres some of my system info:
- ruby 1.8.7 (2008-08-11 patchlevel 72) [i686-darwin9] (installed via ports)
- OSX 10.5.6

Anyway, thanks for the patch, everything appears to be moving again at least :)
