
Network Virtualization and Resource Control (Crossbow) FAQ

Table of Contents

Overview

IP Instances

Virtual NICs (VNICs)

Flow Management

Miscellaneous

Overview

Crossbow is a set of technologies that provides network virtualization and resource control, and greatly improves the performance and network utilization needed to achieve true OS virtualization, utility computing, and server consolidation.

Crossbow consists of multiple components: IP Instances, virtual NICs (VNICs), and flow management.

Crossbow is designed to add network virtualization to Solaris without introducing a performance penalty. Some of the underlying work actually delivers better network performance: receive rings, hardware classification, and multiple MAC address support all contribute to better performance and enhance the virtualization provided.

Flow management can introduce some overhead. If the NIC or VNIC is not doing any fanout or bandwidth control, Crossbow can map a flow directly to a receive (RX) ring and use hardware to classify it; in that case there is no performance impact. In cases where the NIC or VNIC is already doing bandwidth control or traffic fanout across multiple CPUs, any flow configured on top has to go through an additional classification layer, and there is a small performance hit.

Crossbow is initially available as a BFU on top of OpenSolaris. The IP Instances portion of Crossbow was integrated into Solaris Nevada build 57 and is in Solaris 10 8/07. VNICs and flow management are not yet integrated into Nevada and may be delivered in a follow-on update. (This is as of March 2008.)

Currently (March 2008), you can install the Crossbow Snapshot onto Solaris Nevada build 81. Integrated ISOs with the Crossbow bits built in are available, as are BFU bits to apply to an existing Nevada build 81 installation.

A beta is in progress at this time and will run through April 2008.

IP Instances

IP Instances are separate views of the IP stack, so that visibility and control are limited to the entity (zone) to which the instance is assigned. By default, all of Solaris has one view of IP, and therefore central visibility and control. With zones, the ability to view and control is limited by privileges, and all zones' network traffic decisions are made by the kernel with a global view. When IP Instances are used, the view is limited to the information that applies to the instance, not to the full kernel. Routing decisions, for example, are made based only on the information in this instance and do not use any additional information that other instances on the same kernel may have. Similarly, control is delegated to the instance, so that a non-global zone can set network parameters such as routes, ndd(1m) values, and IP address(es). Snooping the interface(s) in the IP Instance is also possible. There is no visibility into any of the other IP Instances that may be sharing this Solaris instance and kernel.
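
For example, a privileged user in a non-global zone with an exclusive IP Instance can adjust these parameters directly. A minimal sketch, assuming the zone has exclusive use of a bge1 interface; the router address and tunable value shown are only illustrative:

non-global zone# route add default 192.168.1.1
non-global zone# ndd -set /dev/tcp tcp_conn_req_max_q 1024
non-global zone# snoop -d bge1

None of these changes affect the global zone or any other IP Instance on the system.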

Another feature of IP Instances is that traffic between zones must traverse the whole path down the stack to the underlying NIC. This is a result of the zone's IP not knowing where the destination address is, so the packet must be put on the wire. If the zone is using a VNIC, whether the traffic stays within the system or exits on a physical network interface depends on whether the destination is also using a VNIC sharing the same physical NIC. If a NIC is shared by VNICs, traffic between the VNICs is switched by the VNICs' virtual switch to the destination VNIC and does not leave the system.

IP Instances are in Solaris Nevada build 57 and later.

IP Instances are in Solaris 10 8/07 released on 4 September 2007.

Only NICs supported by the Generic LAN Driver version 3 (GLDv3) framework are supported with IP Instances. To determine whether a NIC is GLDv3, run the dladm(1m) command with the 'show-link' subcommand and look for links that are not of type 'legacy'.

There is one exception: the ce interfaces can also be used now. See Which NICs are known to work with IP Instances? for details, such as the Nevada build and Solaris 10 patches required.

This is how non-GLDv3 interfaces look:

# dladm show-link
eri0            type: legacy    mtu: 1500       device: eri0
qfe0            type: legacy    mtu: 1500       device: qfe0
qfe1            type: legacy    mtu: 1500       device: qfe1
qfe2            type: legacy    mtu: 1500       device: qfe2
qfe3            type: legacy    mtu: 1500       device: qfe3

And this is how GLDv3 interfaces look:

# dladm show-link
bge0            type: non-vlan  mtu: 1500       device: bge0
bge1            type: non-vlan  mtu: 1500       device: bge1
bge1001         type: vlan 1    mtu: 1500       device: bge1
bge2001         type: vlan 2    mtu: 1500       device: bge1
bge2            type: non-vlan  mtu: 1500       device: bge2
bge3            type: non-vlan  mtu: 1500       device: bge3
aggr1           type: non-vlan  mtu: 1500       aggregation: key 1
  • Which NICs are known to work with IP Instances?
    • afe (Nevada build 73 and later)
    • bge
    • ce (Nevada build 80 and later, Solaris 10 with IP Instance patches*)
    • dmfe (Nevada build 73 and later)
    • e1000g
    • eri (Nevada build 73 and later)
    • hme (Nevada build 73 and later)
    • iprb (Nevada build 73 and later)
    • ixgb
    • mxfe (Nevada build 73 and later)
    • nge
    • nxge
    • qfe (Nevada build 73 and later)
    • rge
    • rtls (Nevada build 73 and later)
    • xge
    • ath (Nevada only)
  • * NOTE: The ce NIC is not a GLDv3 device, but has been made to work with IP Instances. The Solaris 10 patches required are:

    • 118777-12 and 137042-01 (SPARC)
    • 118778-11 and 137043-01 (i386, x86, x64)
  • Which NICs don't work with IP Instances?
    • ce (Nevada build 79 and earlier, and Solaris 10)
    • dnet
    • elx
    • fjgi
    • ge
    • ipge
    • ixge
    • spwr
  • * NOTE: The e1000g driver replaces ipge in Solaris 10 11/06 and later for these NICs:

    • Sun PCI-Express Dual Gigabit Ethernet UTP X7280A-2
    • Sun PCI-Express Dual Gigabit Ethernet MMF X7281A-2

However, a shim is planned as part of Nemo Unification within Project Clearview that will allow those interfaces to be used with IP Instances. (The list is based on most of the NICs for which drivers are included in Solaris.)

There are two Change Requests to enable IP Instances with the ce driver. See What's Up ce-Doc? for some details. These fixes have been put into OpenSolaris and are available in Nevada build 80 and later, and for Solaris 10 with patches.

Yes.

The maximum number of IP Instances is the same as the maximum number of non-global zones, which currently is 8191 (8K – 1).

A non-global zone can have only one IP Instance. By default, a zone is in the global instance, sharing IP with the global zone and all other zones that do not have an exclusive IP Instance. When a zone is configured to have an exclusive IP Instance, its view of IP is isolated from the rest of the system.
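
A minimal sketch of configuring a zone with an exclusive IP Instance, using a hypothetical zone named myzone and assuming bge1 is not in use by the global zone:

global-zone# zonecfg -z myzone
zonecfg:myzone> set ip-type=exclusive
zonecfg:myzone> add net
zonecfg:myzone:net> set physical=bge1
zonecfg:myzone:net> end
zonecfg:myzone> verify
zonecfg:myzone> commit
zonecfg:myzone> exit

Note that no address is set in zonecfg; with an exclusive IP Instance, the interface is configured from inside the zone, as shown in the examples below.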

No.

Commands at the IP level such as ifconfig(1m) will only work with the interfaces in the IP Instance from which the command is run. In the global zone, they will only be able to see those interfaces not set as exclusive to a non-global zone.

The snoop(1m) command can still be used from the global zone even if an interface has been assigned to a non-global zone with an exclusive IP Instance. If snoop is run in the global zone and in the zone that has exclusive access to the interface, both will see the same data.
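
For example, assuming bge1 has been assigned to a zone with an exclusive IP Instance, both of the following capture the same packets (the interface name is illustrative):

global-zone# snoop -d bge1
non-global zone# snoop -d bge1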

The dladm(1m) command is used from the global zone to manage all devices, links, aggregations, VLANs, and VNICs.

Using the dladm(1m) command, a privileged user in the global zone can see and control the physical interfaces (NICs, link aggregations (aggr), VLANs, and VNICs).

All interfaces assigned to a non-global zone can be identified by running 'ifconfig -a plumb', followed by 'ifconfig -a'.

non-global zone# ifconfig -a
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
    inet 127.0.0.1 netmask ff000000
non-global zone# ifconfig -a plumb
non-global zone# ifconfig -a
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
    inet 127.0.0.1 netmask ff000000
bge1: flags=201000842<BROADCAST,RUNNING,MULTICAST,IPv4,CoS> mtu 1500 index 2
    inet 0.0.0.0 netmask 0
    ether 0:3:ba:e3:42:8c
non-global zone#

If you have, for example, an nge interface, one method is to create the file /etc/hostname.nge0 in the non-global zone.

non-global zone# echo "192.168.1.11/24" > /etc/hostname.nge0
non-global zone# init 6
...
non-global zone# ifconfig -a
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
nge0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
        inet 192.168.1.11 netmask ffffff00 broadcast 192.168.1.255
        ether 0:17:31:46:d8:eb
non-global zone#

Generally, you will set up the /etc/hosts file, /etc/defaultrouter if using static routes, /etc/netmasks, /etc/resolv.conf, and the like, as with any stand-alone system. With a shared IP Instance, much of this is managed by the administrator(s) in the global zone.
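
A minimal sketch of populating these files inside the non-global zone, using example addresses only:

non-global zone# echo "192.168.1.1" > /etc/defaultrouter
non-global zone# echo "192.168.1.0  255.255.255.0" >> /etc/netmasks
non-global zone# echo "192.168.1.11  myzone" >> /etc/hosts
non-global zone# echo "nameserver 192.168.1.5" >> /etc/resolv.conf

If you use DNS, also enable it in /etc/nsswitch.conf, as on a stand-alone system.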

After configuring and installing the zone, copy or create an /etc/sysidcfg file. For example,

global-zone# cat /myzones/dhcpzone/root/etc/sysidcfg
system_locale=C
terminal=xterm
network_interface=primary {
        dhcp
        protocol_ipv6=no
}
security_policy=NONE
name_service=NONE
nfs4_domain=dynamic
timezone=US/Eastern
root_password=""
global-zone# zlogin dhcpzone ifconfig -a
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
        inet 127.0.0.1 netmask ff000000
bge2: flags=201004843<UP,BROADCAST,RUNNING,MULTICAST,DHCP,IPv4,CoS> mtu 1500 index 2
        inet 10.1.14.161 netmask ffffffc0 broadcast 10.1.14.191
        ether 0:3:ba:e3:42:8d
global-zone#

A non-global zone can still be an NFS client (though not of the global zone on the same system), but it cannot be an NFS server. The inability of a non-global zone to be an NFS server is not related to networking, but rather to file system and virtual memory interaction.

You cannot load private kernel modules in a non-global zone, even if you have your own IP Instance. Also, IP Filter rulesets are controlled from the global zone at this time. A Linux branded zone does not work with IP Instances at this time.

Virtual NICs (VNICs)

A VNIC is a virtualized network interface that presents the same media access control (MAC) interface that a physical interface provides. Multiple VNICs can be configured on top of the same interface, allowing multiple consumers to share that interface. If the interface has hardware classification capabilities, when data arrives on the NIC the hardware can automatically direct the datagrams to receive buffers (rings) associated with a specific VNIC. It may be possible to selectively turn interrupts on and off per ring, allowing the host to control the rate of arrival of packets into the system. For hardware that does not have these capabilities, these features are provided in software.

VNICs are supported on Generic LAN Driver version 3 (GLDv3) interfaces. For a list, see Which NICs are known to work with IP Instances? You can also create a VNIC on top of an aggregation or a VLAN that is built using GLDv3 NICs.

The maximum number of VNICs per NIC is limited by the total number of VNICs per system, which at this time is 899 (VNIC IDs 1-899). However, for NICs with hardware classification capabilities, maximum performance is achieved when the number of VNICs does not exceed the number of hardware classifiers on the NIC.

VNIC IDs 900 to 999 are reserved for use by Xen.

Currently the maximum number of user-defined VNICs on a system is 899 (IDs 1-899). As is typically the case, each VNIC requires additional system resources such as CPU, so there is a practical maximum per system based on the type of system, the type of NICs, and the traffic patterns.

This limit may be increased with the delivery of Clearview.

The dladm(1m) command is used to create, modify, and delete VNICs.

Use dladm show-dev -m device to show the MAC addresses assigned to this NIC. (This is work-in-progress)

To use a factory provided MAC address, run dladm create-vnic -m factory when creating a VNIC.

The MAC address for a VNIC can be set when the VNIC is created with the dladm create-vnic -m MAC-address command in the global zone.
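
Putting the above together, here is a sketch of creating and removing a VNIC from the global zone. The exact create-vnic arguments shown (the -d option naming the underlying device and the numeric VNIC id) are an assumption based on the Crossbow snapshot syntax of the time and may differ in your build; check dladm(1m). The MAC address is just an example.

global-zone# dladm show-dev -m bge1
global-zone# dladm create-vnic -d bge1 -m factory 1
global-zone# dladm create-vnic -d bge1 -m 2:8:20:aa:bb:cc 2
global-zone# dladm show-link
global-zone# dladm delete-vnic 1

The first create-vnic uses a factory-provided MAC address for vnic1; the second assigns an administrator-chosen MAC address to vnic2. The new VNICs then appear in the dladm show-link output.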

In the future, the MAC address may also be modifiable using the ifconfig(1m) command in a non-global zone. Either operation must be done by a privileged administrator.

Yes, the MAC address must be a valid MAC address as per IEEE. It can not be a multicast or broadcast address.

This is the case today, but in the future we will allow the MAC address to be chosen randomly, or from the hardware if the underlying NIC provides more than one factory MAC address.

Yes. You can do most of the things that you can do with a physical NIC. Things you cannot do with a VNIC include: create a link aggregation, or set a frame size larger than that of the underlying link.


Yes, but no larger than the MTU allowed by the underlying NIC.

Flow Management

Flow management is the ability to manage networking bandwidth resources for a transport, service, or virtual machine. A service is specified as a combination of transport (e.g. TCP, UDP) and port, while a virtual machine is specified by its MAC address or an IP address.

Flows are managed with the flowadm(1m) command.

Flows are defined as a set of attributes based on Layer 2, 3, and 4 headers, which can be used to identify a protocol, service, or virtual machine instance, such as a zone or Xen domain.

Flows support the following parameters; a usage sketch follows the list:

  • Bandwidth--Sets the full duplex bandwidth, specified in kilobits, megabits, or gigabits per second, or as a percentage of link bandwidth.
  • Limit--Sets the maximum bandwidth usage that can not be exceeded.
  • Guarantee--Sets the entitled or minimum bandwidth. Bandwidth provided can exceed the guaranteed value.
  • Priority--Sets the relative priority of this flow compared to other flows, in a range from 0 to 100.
  • Hardware--Utilize NIC hardware classifier, if one is available.
  • CPU Binding--Assign one or more processors for the flow's kernel and interrupt processing. For dedicated use of a set of processors, use psrset(1m) to define a processor set.
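
A sketch of how these parameters might be expressed with flowadm(1m). The subcommands and the attribute and property names shown here (add-flow, transport, local_port, maxbw) are assumptions based on the Crossbow material of the time and may differ between snapshots; check flowadm(1m) on your build. The link name and values are only examples.

global-zone# flowadm add-flow -l bge1 -a transport=tcp,local_port=80 -p maxbw=100M httpflow
global-zone# flowadm show-flow
global-zone# flowadm remove-flow httpflow

This creates a flow named httpflow covering TCP traffic to local port 80 on bge1 and caps it at 100 Mbps of bandwidth.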

No, it is possible to create a flow without limits yet bind it to software or hardware resources.

Miscellaneous

It is difficult to determine a NIC's hardware capabilities. Please provide feedback on experiences with specific NICs and the information will be aggregated here. Thanks.

We are planning to provide an option to dladm that will display these hardware capabilities in a future version of Crossbow.

At this time you can not tell. There is work underway to add such a capability to the dladm(1m) command.

  • What do I do if I don't have enough NICs for all the zones?

    One option may be to use VLANs. See An example of using IP Instances with VLANs.
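
    A minimal sketch, assuming bge1 carries tagged VLAN 2 traffic (the VLAN link bge2001 in the dladm show-link output above is VLAN 2 on bge1) and using a hypothetical zone named vlanzone:

    global-zone# zonecfg -z vlanzone
    zonecfg:vlanzone> set ip-type=exclusive
    zonecfg:vlanzone> add net
    zonecfg:vlanzone:net> set physical=bge2001
    zonecfg:vlanzone:net> end
    zonecfg:vlanzone> commit
    zonecfg:vlanzone> exit

    The zone then configures an address on bge2001 from the inside, just as with a physical NIC.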


    Contributors: Crossbow engineering, Crossbow community at opensolaris.org
    Maintained by: Steffen Weiberle (steffen dot weiberle at sun dot com)