README Notes
Broadcom bnx2x VMware ESX Driver

Broadcom Corporation
5300 California Avenue,
Irvine, CA 92617

Copyright (c) 2007-2009 Broadcom Corporation
All rights reserved


Table of Contents
=================

  Introduction
  Limitations
  Driver Dependencies
  Driver Settings
  Driver Parameters
  Driver Defaults
  Unloading and Removing Driver
  Driver Messages


Introduction
============

This file describes the bnx2x VMware ESX driver for the Broadcom NetXtreme II
BCM57710/BCM57711/BCM57711E 10/100/1000/2500/10000 Mbps PCIE Ethernet Network
Controllers.


Driver Settings
===============

The bnx2x driver settings can be queried and changed using ethtool. The
latest ethtool can be downloaded from http://sourceforge.net/projects/gkernel
if it is not already installed. The following are some common examples of how
to use ethtool. See the ethtool man page for more information. ethtool
settings do not persist across a reboot or module reload.

Some ethtool examples:

1. Show current speed, duplex, and link status:

   ethtool vmnic0

2. Change speed, duplex, autoneg:

   Example: 100Mbps half duplex, no autonegotiation:

   ethtool -s vmnic0 speed 100 duplex half autoneg off

   Example: Autonegotiation with full advertisement:

   ethtool -s vmnic0 autoneg on

   Example: Autonegotiation with 100Mbps full duplex advertisement only:

   ethtool -s vmnic0 speed 100 duplex full autoneg on

3. Show flow control settings:

   ethtool -a vmnic0

4. Change flow control settings:

   Example: Turn off flow control:

   ethtool -A vmnic0 autoneg off rx off tx off

   Example: Turn flow control autonegotiation on with tx and rx advertisement:

   ethtool -A vmnic0 autoneg on rx on tx on

   Note that this is only valid if speed is set to autonegotiation.

5. Show offload settings:

   ethtool -k vmnic0

6. Change offload settings:

   Example: Turn off TSO (TCP segmentation offload):

   ethtool -K vmnic0 tso off

7. Get statistics:

   ethtool -S vmnic0

8. Perform self-test:

   ethtool -t vmnic0

   Note that the interface (vmnic0) must be up to run all tests.

9. See the ethtool man page for more options.


Driver Parameters
=================

Several optional parameters can be supplied as command line arguments to the
vmkload_mod command. These parameters can also be set via the esxcfg-module
command. See the man page for more information.

The optional parameter "int_mode" is used to force an interrupt mode other
than MSI-X. By default, the driver will try to enable MSI-X if it is
supported by the kernel. If MSI-X is not attainable, the driver will try to
enable MSI if it is supported by the kernel. If MSI is not attainable, the
driver will use legacy INTx mode.

Set the "int_mode" parameter to 1 as shown below to force legacy INTx mode on
all NetXtreme II NICs in the system.

   vmkload_mod bnx2x int_mode=1

Set the "int_mode" parameter to 2 as shown below to force MSI mode on all
NetXtreme II NICs in the system.

   vmkload_mod bnx2x int_mode=2

The optional parameter "disable_tpa" can be used to disable the Transparent
Packet Aggregation (TPA) feature. By default, the driver aggregates TCP
packets; this parameter lets a user turn that advanced feature off.

Set the "disable_tpa" parameter to 1 as shown below to disable the TPA
feature on all NetXtreme II NICs in the system.

   vmkload_mod bnx2x.ko disable_tpa=1

Use ethtool (if available) to disable TPA (LRO) for a specific NetXtreme II
NIC.

The optional parameter "dropless_fc" can be used to enable a complementary
flow control mechanism on 57711 or 57711E. The default flow control mechanism
sends pause frames when the on-chip buffer (BRB) reaches a certain level of
occupancy. This is a performance-targeted flow control mechanism. On 57711 or
57711E, one can enable another flow control mechanism that sends pause frames
when one of the host buffers (when in RSS mode) is exhausted. This is a
"zero packet drop" targeted flow control mechanism.

Set the "dropless_fc" parameter to 1 as shown below to enable the dropless
flow control mechanism on all 57711 or 57711E NetXtreme II NICs in the
system.

   vmkload_mod bnx2x dropless_fc=1

Some additional optional parameters can be supplied as command line arguments
to the vmkload_mod command. These parameters are intended mainly for
debugging and should be used only by an expert user.

The debug optional parameter "poll" can be used for timer-based polling. Set
the "poll" parameter to the timer polling interval on all NetXtreme II NICs
in the system.

The debug optional parameter "mrrs" can be used to override the MRRS (Maximum
Read Request Size) value of the HW. Set the "mrrs" parameter to the desired
value (0..3) on all NetXtreme II NICs in the system.

The debug optional parameter "debug" can be used to set the default msglevel
on all NetXtreme II NICs in the system. Use "ethtool -s" to set the msglevel
for a specific NetXtreme II NIC.


Driver Defaults
===============

Speed :                    Autonegotiation with all speeds advertised

Flow control :             Autonegotiation with rx and tx advertised

MTU :                      1500 (range 46 - 9000)

Rx Ring size :             4078 (range 0 - 4078)

Tx Ring size :             4078 (range (MAX_SKB_FRAGS+4) - 4078)

                           MAX_SKB_FRAGS varies on different kernels and
                           different architectures. On a 2.6 kernel for
                           x86, MAX_SKB_FRAGS is 18.

Coalesce rx usecs :        25 (range 0 - 3000)

Coalesce tx usecs :        50 (range 0 - 12288)

MSI-X :                    Enabled (if supported by 2.6 kernel)

TSO :                      Enabled

WoL :                      Disabled


Unloading and Removing Driver
=============================

To unload the driver, do the following:

   vmkload_mod -u bnx2x


Driver Messages
===============

The following are the most common sample messages that may be logged in the
file /var/log/messages or /var/log/vmkernel.

Driver signon:
-------------

Broadcom NetXtreme II 5771x 10Gigabit Ethernet Driver bnx2x 0.40.15 ($DateTime: 2007/11/22 05:32:40 $)

NIC detected:
------------

vmnic0: Broadcom NetXtreme II BCM57710 XGb (A1) PCI-E x8 2.5GHz found at mem e8800000, IRQ 16, node addr 001018360012

MSI-X enabled successfully:
--------------------------

bnx2x: vmnic0: using MSI-X

Link up and speed indication:
----------------------------

bnx2x: vmnic0 NIC Link is Up, 10000 Mbps full duplex, receive & transmit flow control ON

Link down indication:
--------------------

bnx2x: vmnic0 NIC Link is Down

Memory Limitation on VMware ESX 4.0:
-----------------------------------

The log file may contain messages that look like the following:

Dec 2 18:24:20 ESX4 vmkernel: 0:00:00:32.342 cpu2:4142)WARNING: Heap: 1435: Heap bnx2x already at its maximumSize. Cannot expand.
Dec 2 18:24:20 ESX4 vmkernel: 0:00:00:32.342 cpu2:4142)WARNING: Heap: 1645: Heap_Align(bnx2x, 4096/4096 bytes, 4096 align) failed. caller: 0x41800187d654
Dec 2 18:24:20 ESX4 vmkernel: 0:00:00:32.342 cpu2:4142)WARNING: vmklinux26: alloc_pages: Out of memory

These messages mean that the ESX host is severely strained. To relieve this,
disable NetQueue. This can be done by loading the bnx2x vmkernel module
manually via the command:

   vmkload_mod bnx2x multi_mode=0

or, to persist the setting across reboots, via the command:

   esxcfg-module -s multi_mode=0 bnx2x

(and then reboot the machine for the setting to take effect)

MultiQueue/NetQueue:
--------------------

The optional parameter "num_queues" may be used to set the number of Rx and
Tx queues when "multi_mode" is set to 1 and the interrupt mode is MSI-X. If
the interrupt mode is anything other than MSI-X (see the "int_mode"
parameter), the number of Rx and Tx queues is set to 1 and the value of this
parameter is ignored.

For the bnx2x-1.52.12.v40.3 VMware ESX 4.0 Async Driver, only 1 queue is
enabled.

To use more than 1 queue, users may force the number of NetQueues by setting
"num_queues" to the desired count via the following command:

   esxcfg-module -s "multi_mode=1 num_queues=" bnx2x

Otherwise, users can allow the bnx2x driver to choose the number of NetQueues
via the following command:

   esxcfg-module -s "multi_mode=1 num_queues=0" bnx2x

For best results, the number of NetQueues should match the number of CPUs on
the machine.

Once this has been set, the ESX host must be rebooted for the settings to
take effect.
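
The two esxcfg-module invocations above differ only in the value given to
"num_queues". As a minimal sketch, the helper below prints the command for a
given queue count instead of running it, so the option string can be checked
before it is applied to a host. The function name netqueue_cmd is
hypothetical and is not part of the driver package, and the count of 4 is
only an illustrative CPU count. A count of 0 leaves the choice to the driver.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with the bnx2x driver): prints the
# esxcfg-module command for a requested NetQueue count. It does not run
# esxcfg-module itself.
netqueue_cmd() {
    queues="$1"
    # Reject empty or non-numeric input; 0 means "let the driver choose".
    case "$queues" in
        ''|*[!0-9]*)
            echo "usage: netqueue_cmd <non-negative integer>" >&2
            return 1
            ;;
    esac
    printf 'esxcfg-module -s "multi_mode=1 num_queues=%s" bnx2x\n' "$queues"
}

netqueue_cmd 0    # driver chooses the number of NetQueues
netqueue_cmd 4    # illustrative value for a host assumed to have 4 CPUs
```

Printing rather than executing keeps the sketch safe to try on any machine.
After running the printed command on an ESX host, a reboot is still required
for the setting to take effect.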