[OmniOS-discuss] ixgbe: breaking aggr on 10GbE X540-T2

Stephan Budach stephan.budach at JVM.DE
Wed May 11 11:36:10 UTC 2016


On 09.05.16 at 20:43, Dale Ghent wrote:
>> On May 9, 2016, at 2:04 PM, Stephan Budach <stephan.budach at JVM.DE> wrote:
>>
>> On 09.05.16 at 16:33, Dale Ghent wrote:
>>>> On May 9, 2016, at 8:24 AM, Stephan Budach <stephan.budach at JVM.DE> wrote:
>>>>
>>>> Hi,
>>>>
>>>> I am seeing strange behaviour where OmniOS omnios-r151018-ae3141d breaks the LACP aggr link on different boxes when Intel X540-T2s are involved. It starts with a couple of link downs/ups on one port, and finally the link on that port negotiates to 1GbE instead of 10GbE, which then breaks the LACP channel on my Cisco Nexus for this connection.
>>>>
>>>> I have tried swapping and interchanging cables, and thus switch ports, but to no avail.
>>>>
>>>> Anyone else noticed this and even better… knows a solution to this?
>>> Was this an issue noticed only with r151018 and not with previous versions, or have you only tried this with 018?
>>>
>>> By your description, I presume that the two ixgbe physical links will stay at 10Gb and not bounce down to 1Gb if not LACP'd together?
>>>
>>> /dale
>> I have noticed that on prior versions of OmniOS as well, but we only recently started deploying 10GbE LACP bonds, when we introduced our Nexus gear to our network. I will have to check whether both links stay at 10GbE when not configured as an LACP bond. Let me check that tomorrow and report back. As we're heading for a stretched DC, we are mainly configuring 2-way LACP bonds over our Nexus gear, so we don't actually have any single 10GbE connections, as they all have to be connected to both DCs. This is achieved by using VPCs on our Nexus switches.
> Provide as much detail as you can: whether you're using hw flow control, whether both links act this way at the same time or independently, and so on. Problems like this often boil down to a very small and seemingly insignificant detail.
>
> I currently have ixgbe on the operating table for adding X550 support, so I can take a look at this; however I don't have your type of switches available to me so LACP-specific testing is something I can't do for you.
>
> /dale
I checked the ixgbe.conf files on each host and they all are still at 
the standard setting, which includes flow_control = 3;
So they all have flow control enabled. As for the Nexus config, all of 
those ports are still on standard ethernet ports and modifications have 
only been made globally to the switch.
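
For anyone wanting to run the same check across several hosts, here is a minimal Python sketch that pulls the flow_control value out of ixgbe.conf-style text. The function name, the sample text, and the mode descriptions are my own assumptions for illustration (the descriptions follow my reading of the comments in the stock ixgbe.conf), so please verify them against the file on your own system.

```python
import re

# Assumed meanings of the flow_control values; check the comments in your
# own ixgbe.conf before relying on this table.
FLOW_CONTROL_MODES = {
    0: "disabled",
    1: "receive only",
    2: "transmit only",
    3: "receive and transmit",
}


def read_flow_control(conf_text):
    """Return the integer flow_control value found in conf_text, or None."""
    for line in conf_text.splitlines():
        # Drop everything after a '#' so commented-out settings are ignored.
        line = line.split("#", 1)[0].strip()
        match = re.match(r"flow_control\s*=\s*(\d+)\s*;", line)
        if match:
            return int(match.group(1))
    return None


# Hypothetical excerpt standing in for a real ixgbe.conf.
sample = """
# flow_control
#   Ethernet flow control
flow_control = 3;
"""

value = read_flow_control(sample)
print(value, FLOW_CONTROL_MODES.get(value, "unknown"))
```

On a real host the text would come from reading the driver's ixgbe.conf instead of the sample string.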
I will now have to pull the one port on one of the hosts from the aggr 
and configure it as a standalone port. Then we will see whether it still 
gets the disconnects/reconnects and finally negotiates to 1GbE 
instead of 10GbE. This only ever seems to happen to the same port; I have 
never seen the other ports of the affected aggrs act up. I also think 
I noticed that it was always the "same" physical port, that is, the 
first port on the card (ixgbe0), but that might of course be a coincidence.
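
To keep an eye on the standalone port during the test, something like the following Python sketch could flag a link that is down or has fallen below 10GbE. It assumes colon-separated text in the shape I would expect from `dladm show-phys -p -o link,state,speed` (link name, state, speed in Mb/s); the function name and the sample lines are made up for illustration and are not captured from one of the affected hosts.

```python
def find_degraded_links(dladm_output, expected_mbps=10000):
    """Return (link, state, speed) tuples for links that are down or too slow.

    dladm_output is assumed to hold one 'link:state:speed' record per line.
    """
    degraded = []
    for line in dladm_output.splitlines():
        line = line.strip()
        if not line:
            continue
        link, state, speed = line.split(":")
        speed = int(speed)
        if state != "up" or speed < expected_mbps:
            degraded.append((link, state, speed))
    return degraded


# Hypothetical sample: ixgbe0 has renegotiated to 1GbE, ixgbe1 is fine.
sample = "ixgbe0:up:1000\nixgbe1:up:10000\n"

for link, state, speed in find_degraded_links(sample):
    print(f"{link}: state={state}, speed={speed} Mb/s")
```

Run periodically against the real command output, this would show when the port first drops to 1GbE and whether it does so outside the aggr as well.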

Thanks,
Stephan

