[rescue] Solaris 9 NFS client with Linux server
abuse at cabal.org.uk
Wed Aug 8 07:39:11 CDT 2007
On Wed, Aug 08, 2007 at 07:45:59AM -0400, Steve Sandau wrote:
> Now I just unmounted it, tried some other options (version 2 is what makes
> it work) and seem to have introduced some slowness just by remounting it.
> When it is slow, or when it pauses, ethereal shows a packet labeled as
> "fragmented IP protocol", "RPC retransmission", "RPC duplicate" packets
> and a single ICMP packet labeled "Time to live exceeded (fragment
> reassembly time exceeded)".
> Maybe someone else can make some sense of that, but I'd think there were
> network problems or the NIC in the server was a problem if scp from this
> box and NFS from other boxes didn't work fine.
It sure smells like a network configuration problem to me. This kind of
failure is usually due to some incompetent berk configuring a packet filter
and blocking unusual but important packets and options because they don't
know what they do and so assume they're bad.
Your MTU will probably be 1500 bytes, but NFS over UDP can use larger
datagrams, typically up to 8192 bytes. Since that doesn't fit, the datagram
gets fragmented for transmission and is reassembled on the other end. This
means the packet (well, several packets now) has the fragmentation fields in
its IP header set (the more-fragments flag and a fragment offset) so that
the far end knows how to reassemble it again. These fields can trigger the
dumb rules set up by the aforementioned incompetent berk.
On modern systems, TCP packets are normally never fragmented (they have the
don't-fragment bit set as part of Path MTU Discovery), so this particular
problem wouldn't bite TCP traffic and wouldn't get noticed there. TCP is
pretty tolerant of misconfigured firewalls, so scp and the like will work
just fine.
So... to fix it properly you want to start looking for "firewalls",
particularly dumb stateless packet inspectors that an idiot might have been
near, and then stop them filtering ICMP and fragmented packets.
If you can't do that, you could also lower the NFS packet size: on Linux
you'd mount with options rsize=1024,wsize=1024, but I can't remember if
Solaris differs. Performance will suffer somewhat, but at least it'll work.
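To put rough numbers on the fragmentation argument, here is a small sketch of the arithmetic. It assumes IPv4 with a 20-byte header and no IP options, an 8-byte UDP header, and it ignores the RPC/NFS header overhead on top of the read/write size; the function name is made up for this illustration.

```python
import math

IP_HEADER = 20   # bytes, IPv4 header without options (assumption)
UDP_HEADER = 8   # bytes

def fragment_count(payload, mtu=1500):
    """Number of IP fragments needed to carry one UDP datagram."""
    # The UDP header is part of the data that IP has to carry.
    datagram = payload + UDP_HEADER
    # Every fragment except the last carries a multiple of 8 bytes of data.
    per_fragment = (mtu - IP_HEADER) // 8 * 8
    return math.ceil(datagram / per_fragment)

print(fragment_count(8192))  # 8K transfer size: 6 fragments
print(fragment_count(1024))  # rsize/wsize of 1024: 1 packet, no fragmentation
```

Under those assumptions an 8192-byte transfer needs six fragments on a 1500-byte MTU, while a 1024-byte transfer fits in a single unfragmented packet, which is why the smaller sizes sidestep a filter that drops fragments.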