<div dir="ltr"><div><div>Well, we figured it out. </div>It was pretty silly, actually. It looks like on this machine, at this location, and with the network/routing/route service left enabled, it was picking up a *second* default route, so some of the packets (seemingly ACKs and other TCP activity -- somewhat important!) were ending up at this other router, which belongs to a peer organization, and were not making it all the way to the remote side under certain circumstances. Once that second default route was removed, everything was fixed. It never affected ping, and my existing ssh session was working fine. I have no idea why this suddenly started causing a problem! </div>I'm glad it turned out to be something simple. </div><div class="gmail_extra"> <div class="gmail_quote">On Thu, Dec 18, 2014 at 1:21 PM, Dan McDonald <<a href="mailto:danmcd@omniti.com" target="_blank">danmcd@omniti.com</a>> wrote:<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
> On Dec 18, 2014, at 11:26 AM, Doug Hughes <<a href="mailto:doug@will.to">doug@will.to</a>> wrote:
>
> Here's the simplest test... I start up ttcp -r on the server, it binds to port 5001, listening. I run snoop. Then I try to connect to 5001 from another machine. I see the packets in snoop, but the accept call on the OmniOS machine never returns. Something seems wonky in network land. Has anybody seen this? The machine has been up for weeks without any problems.
>
> OmniOS v11 r151012
> Copyright 2014 OmniTI Computer Consulting, Inc. All rights reserved.
> Use is subject to license terms.
>
> Regular/plain Intel chipset:
> e1000g0:
> root@xyr-r:/root# dladm show-link e1000g0
> LINK        CLASS     MTU    STATE    BRIDGE     OVER
> e1000g0     phys      1500   up       --         --
> root@xyr-r:/root# dladm show-phys e1000g0
> LINK        MEDIA     STATE  SPEED    DUPLEX     DEVICE
> e1000g0     Ethernet  up     1000     full       e1000g0
>
> e1000 prtdiag excerpt:
> name='device-name' type=string items=1
> value='82574L Gigabit Network Connection'

I can't recall if this chipset has problems or not. I want to say it might, BUT I'm not sure, so I won't point fingers.

> name='subsystem-name' type=string items=1
> value='unknown subsystem'
> Device Minor Nodes:
> dev=(112,1)
> dev_path=/pci@0,0/pci8086,1d14@1c,2/pci122e,10d3@0:e1000g0
> Ideas?

If you've the disk space, please utter "savecore -L" while your machine is in this state. It might be nice to have the system state while things are failing.

Do you see any complaints from e1000g in /var/adm/messages? It's like the NIC or the driver stopped receiving packets.

One thing you could do is unplumb and replumb the interface. That may make the kernel reset the driver.

ifconfig e1000g0 unplumb
ifconfig e1000g0 plumb <addr/prefix> up

If that doesn't work, you may also need to modunload the driver before replumbing.

ifconfig e1000g0 unplumb
modinfo | grep e1000g
modunload -i <number from modinfo line>
ifconfig e1000g0 plumb ....

If modunload complains, you will need to unplumb the v6 interface ("ifconfig e1000g0 inet6 unplumb") or maybe disable some other services temporarily.

Dan
</blockquote></div></div>