Upcoming Solaris Features: Crossbow - Part 1: Virtualisation

At the moment, ZFS, DTrace and Zones are the well-known features of Solaris, but in my opinion a fourth important feature will join them soon. Since build 105 it has been integrated into Solaris (many people will already know which feature I want to describe in this article). Its project name is Crossbow. It is the new TCP/IP stack of OpenSolaris and was developed from the ground up with virtualisation in mind. "Virtualisation in mind" does not only lead to the concept of virtual network interface cards: you can configure virtual switches as well, and, even more important, you can control the resource usage of a virtual client or manage the load by distributing certain TCP/IP traffic to dedicated CPUs. I have already given some talks about Crossbow at different events, so it is time to write an article about this topic. I will start with the virtualisation part of Crossbow.

The Virtualisation part

This part is heavily inspired by a blog entry of Ben Rockwood, but he omitted some parts in the course of his article; to make a full walk-through out of it, I extended it a little bit. Normally a network consists of switches, network cards, servers and routers. It is easy to replicate this in a single system: network cards can be simulated by VNICs, and switches are called etherstubs in Crossbow's namespace. Servers can of course be simulated by zones, and as routers are not much more than special-purpose servers, we can simulate them with a zone as well.

A simple network

Let us simulate a simple network at first. Just two servers and a router:

First we create two virtual switches, called etherstub0 and etherstub1:

# dladm create-etherstub etherstub0
# dladm create-etherstub etherstub1

Okay, now we create two virtual NICs bound to the virtual switch etherstub0. These virtual NICs are called vnic1 and vnic2.

# dladm create-vnic -l etherstub0 vnic1
# dladm create-vnic -l etherstub0 vnic2

Now we do the same with our second virtual switch:

# dladm create-vnic -l etherstub1 vnic3
# dladm create-vnic -l etherstub1 vnic4
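These dladm calls follow an obvious pattern, so for a larger testbed you might prefer to generate them from a small topology description. A minimal sketch in plain sh — the commands are printed for review rather than executed, and the helper name gen_dladm_cmds and the topology table are my own illustration, not part of Crossbow:

```shell
#!/bin/sh
# Dry-run generator: emit the dladm commands for a small topology.
# Each line of the argument reads: "<etherstub> <vnic> <vnic> ..."
gen_dladm_cmds() {
  echo "$1" | while read stub vnics; do
    echo "dladm create-etherstub $stub"
    for v in $vnics; do
      echo "dladm create-vnic -l $stub $v"
    done
  done
}

topology="etherstub0 vnic1 vnic2
etherstub1 vnic3 vnic4"

gen_dladm_cmds "$topology"
```

Piping the output through `sh` (after inspection) would then build the whole virtual wiring in one go.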

Okay, let's look at the configuration of our network. (You will also notice a vnic5 on top of the physical interface ni0 in the output: it was created the same way, with dladm create-vnic -l ni0 vnic5, and will serve as the router's uplink later.)

# dladm show-link
LINK        CLASS    MTU    STATE    OVER
ni0         phys     1500   unknown  --
etherstub0  etherstub 9000  unknown  --
etherstub1  etherstub 9000  unknown  --
vnic1       vnic     9000   up       etherstub0
vnic2       vnic     9000   up       etherstub0
vnic3       vnic     9000   up       etherstub1
vnic4       vnic     9000   up       etherstub1
vnic5       vnic     1500   up       ni0

Yes, that's all … but what can we do with it? We can, for example, simulate a complete network inside a single system. Let's build a testbed with two networks, a router with a firewall and NAT, and a server in each network. Obviously we will use zones for this.

A template zone

First we create a template zone. This zone is just used to speed up the creation of the other zones. To enable zone creation based on ZFS snapshots, we create a filesystem for our zones and mount it at a convenient place in the filesystem:

# zfs create rpool/zones
# zfs set compression=on rpool/zones
# zfs set mountpoint=/zones rpool/zones

Now we prepare a command file for the zone creation. It is pretty much the standard for a sparse-root zone. We don't configure any network interfaces, as we will never boot or use this zone; it is just a template, as the name already states. So first we create a file called template in a working directory. All the following steps assume that you are in this directory, as I won't use absolute paths.

create -b
set zonepath=/zones/template
set ip-type=exclusive
set autoboot=false
add inherit-pkg-dir
set dir=/lib
end
add inherit-pkg-dir
set dir=/platform
end
add inherit-pkg-dir
set dir=/sbin
end
add inherit-pkg-dir
set dir=/usr
end
add inherit-pkg-dir
set dir=/opt
end
commit
Now we create the zone. Depending on your test equipment, this will take some time.

# zonecfg -z template -f template
# zoneadm -z template install
A ZFS file system has been created for this zone.
Preparing to install zone <template>.
Creating list of files to copy from the global zone.
Copying <3488> files to the zone.
Initializing zone product registry.
Determining zone package initialization order.
Preparing to initialize <1507> packages on the zone.
Initialized <1507> packages on zone.
Zone <template> is initialized.
The file </zones/template/root/var/sadm/system/logs/install_log> contains a log of the zone installation.
#

Got a coffee? The next installations will be much faster. We will not boot this zone, as we don't need it for our testbed.

site.xml

While waiting for the zone installation to end, we can create a few other files. First you should create a file called site.xml. This file controls which services are online after the first boot. You can think of it as a sysidcfg for the Service Management Framework. The file is rather long, so I won't post it in the article directly. You can download my version of it here: http://www.c0t0d0s0.org/pages/sitexml.html
Zone configurations for the testbed

First we have to create the zone configurations. The files are very similar; the differences are the zonepath and the network configuration. The zone servera is located in /zones/serverA and uses the network interface vnic2. Its configuration file is called serverA:

create -b
set zonepath=/zones/serverA
set ip-type=exclusive
set autoboot=false
add inherit-pkg-dir
set dir=/lib
end
add inherit-pkg-dir
set dir=/platform
end
add inherit-pkg-dir
set dir=/sbin
end
add inherit-pkg-dir
set dir=/usr
end
add inherit-pkg-dir
set dir=/opt
end
add net
set physical=vnic2
end
commit


The zone serverb uses the directory /zones/serverB and is configured to bind to the interface vnic4. Obviously I have named the configuration file serverB:

create -b
set zonepath=/zones/serverB
set ip-type=exclusive
set autoboot=false
add inherit-pkg-dir
set dir=/lib
end
add inherit-pkg-dir
set dir=/platform
end
add inherit-pkg-dir
set dir=/sbin
end
add inherit-pkg-dir
set dir=/usr
end
add inherit-pkg-dir
set dir=/opt
end
add net
set physical=vnic4
end
commit


We have created both config files for the simulated servers. Now we do the same for our simulated router. The configuration of the router zone is a little bit longer, as we need more network interfaces. I opened a file called router and filled it with the following content:

create -b
set zonepath=/zones/router
set ip-type=exclusive
set autoboot=false
add inherit-pkg-dir
set dir=/lib
end
add inherit-pkg-dir
set dir=/platform
end
add inherit-pkg-dir
set dir=/sbin
end
add inherit-pkg-dir
set dir=/usr
end
add inherit-pkg-dir
set dir=/opt
end
add net
set physical=vnic5
end
add net
set physical=vnic1
end
add net
set physical=vnic3
end
commit
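The three configuration files above differ only in the zonepath and the net resources, so you could also generate them with a small shell helper instead of editing each one by hand. A sketch under that assumption — the function name gen_zonecfg is my own, and the emitted text simply mirrors the sparse-root templates shown above:

```shell
#!/bin/sh
# Generate a sparse-root zone config file; extra args are the VNICs.
# Usage: gen_zonecfg <file> <zonepath> <vnic> [<vnic> ...]
gen_zonecfg() {
  file=$1; zpath=$2; shift 2
  {
    echo "create -b"
    echo "set zonepath=$zpath"
    echo "set ip-type=exclusive"
    echo "set autoboot=false"
    # The standard inherit-pkg-dir set for a sparse-root zone.
    for d in /lib /platform /sbin /usr /opt; do
      printf 'add inherit-pkg-dir\nset dir=%s\nend\n' "$d"
    done
    # One net resource per VNIC.
    for nic in "$@"; do
      printf 'add net\nset physical=%s\nend\n' "$nic"
    done
    echo "commit"
  } > "$file"
}

gen_zonecfg serverA /zones/serverA vnic2
gen_zonecfg serverB /zones/serverB vnic4
gen_zonecfg router  /zones/router  vnic5 vnic1 vnic3
```

The generated files land in the working directory, matching the convention used throughout this walk-through.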



sysidcfg files

To speed up the installation we create sysidcfg files for our zones. Without these files the installation would "go interactive" and you would have to provide the configuration information via menus. When you place such a file at /etc/sysidcfg, the system is initialized with the information provided in the file.

I will start with the sysidcfg file of the router zone (router_sysidcfg):

system_locale=C
terminal=vt100
name_service=none
network_interface=vnic5 {primary hostname=router1 ip_address=10.211.55.10 netmask=255.255.255.0 protocol_ipv6=no default_route=10.211.55.1}
network_interface=vnic1 {hostname=router1-a ip_address=10.211.100.10 netmask=255.255.255.0 protocol_ipv6=no default_route=NONE}
network_interface=vnic3 {hostname=router1-b ip_address=10.211.101.10 netmask=255.255.255.0 protocol_ipv6=no default_route=NONE}
nfs4_domain=dynamic
root_password=cmuL.HSJtwJ.I
security_policy=none
timeserver=localhost
timezone=US/Central


After this, we create a second sysidcfg file for our first server zone. I store the following content in a file called servera_sysidcfg:

system_locale=C
terminal=vt100
name_service=none
network_interface=vnic2 {primary hostname=server1 ip_address=10.211.100.11 netmask=255.255.255.0 protocol_ipv6=no default_route=NONE}
nfs4_domain=dynamic
root_password=cmuL.HSJtwJ.I
security_policy=none
timeserver=localhost
timezone=US/Central


When you look closely at the network_interface line you will see that I didn't specify a default route. Please keep this in mind. As a last step I create serverb_sysidcfg, the config file for our second server zone:

system_locale=C
terminal=vt100
name_service=none
network_interface=vnic4 {primary hostname=server2 ip_address=10.211.101.11 netmask=255.255.255.0 protocol_ipv6=no default_route=NONE}
nfs4_domain=dynamic
root_password=cmuL.HSJtwJ.I
security_policy=none
timeserver=localhost
timezone=US/Central
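The sysidcfg files for single-interface zones differ only in the interface name, hostname and IP address, so they too lend themselves to a small generator. A minimal sketch — the helper name gen_sysidcfg is my own, and the values simply mirror the files above, including the shared root_password hash:

```shell
#!/bin/sh
# Generate a sysidcfg for a single-interface zone without a default route.
# Usage: gen_sysidcfg <file> <interface> <hostname> <ip_address>
gen_sysidcfg() {
  file=$1; nic=$2; host=$3; ip=$4
  cat > "$file" <<EOF
system_locale=C
terminal=vt100
name_service=none
network_interface=$nic {primary hostname=$host ip_address=$ip netmask=255.255.255.0 protocol_ipv6=no default_route=NONE}
nfs4_domain=dynamic
root_password=cmuL.HSJtwJ.I
security_policy=none
timeserver=localhost
timezone=US/Central
EOF
}

gen_sysidcfg servera_sysidcfg vnic2 server1 10.211.100.11
gen_sysidcfg serverb_sysidcfg vnic4 server2 10.211.101.11
```

Multi-homed zones like the router need an extra network_interface line per VNIC, so their files are still easier to write by hand.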



Firing up the zones

After creating all these configuration files, we use them to create the zones. The procedure is similar for each zone: first we do the configuration, then we clone the template zone. As we located the template zone on a ZFS filesystem, the cloning takes just a second. Before we boot a zone, we put the configuration files in place that we prepared while waiting for the installation of the template zone.

# zonecfg -z router -f router
# zoneadm -z router clone template
Cloning snapshot rpool/zones/template@SUNWzone3
Instead of copying, a ZFS clone has been created for this zone.
# cp router_sysidcfg /zones/router/root/etc/sysidcfg
# cp site.xml /zones/router/root/var/svc/profile
# zoneadm -z router boot

We repeat the steps for servera:

# zonecfg -z servera -f serverA
# zoneadm -z servera clone template
Cloning snapshot rpool/zones/template@SUNWzone3
Instead of copying, a ZFS clone has been created for this zone.
# cp servera_sysidcfg /zones/serverA/root/etc/sysidcfg
# cp site.xml /zones/serverA/root/var/svc/profile
# zoneadm -z servera boot


At last we repeat it for our zone serverb:

# zonecfg -z serverb -f serverB
# zoneadm -z serverb clone template
Cloning snapshot rpool/zones/template@SUNWzone3
Instead of copying, a ZFS clone has been created for this zone.
# cp serverb_sysidcfg /zones/serverB/root/etc/sysidcfg
# cp site.xml /zones/serverB/root/var/svc/profile
# zoneadm -z serverb boot

After completing the last step, we display the existing zones:

# zoneadm list -v
  ID NAME             STATUS     PATH                           BRAND    IP    
   0 global           running    /                              native   shared
  13 router           running    /zones/router                  native   excl  
  15 servera          running    /zones/serverA                 native   excl  
  19 serverb          running    /zones/serverB                 native   excl


All zones are up and running.
Playing around with our simulated network

First, a basic check: let's try to plumb one of the VNICs already used in a zone.

# ifconfig vnic2 plumb
vnic2 is used by non-globalzone: servera


Excellent, the system prohibits the plumbing. Before we can play with our mini-network, we have to activate forwarding and routing on our new router. Since Solaris 10 this is really easy, as there is a command for it:

# routeadm -e ipv4-forwarding
# routeadm -e ipv4-routing
# routeadm -u
# routeadm    
              Configuration   Current              Current
                     Option   Configuration        System State
---------------------------------------------------------------
               IPv4 routing   enabled              enabled
               IPv6 routing   disabled             disabled
            IPv4 forwarding   enabled              enabled
            IPv6 forwarding   disabled             disabled

           Routing services   "route:default ripng:default"

Routing daemons:

                      STATE   FMRI
                   disabled   svc:/network/routing/zebra:quagga
                   disabled   svc:/network/routing/rip:quagga
                   disabled   svc:/network/routing/ripng:default
                   disabled   svc:/network/routing/ripng:quagga
                   disabled   svc:/network/routing/ospf:quagga
                   disabled   svc:/network/routing/ospf6:quagga
                   disabled   svc:/network/routing/bgp:quagga
                   disabled   svc:/network/routing/isis:quagga
                   disabled   svc:/network/routing/rdisc:default
                     online   svc:/network/routing/route:default
                   disabled   svc:/network/routing/legacy-routing:ipv4
                   disabled   svc:/network/routing/legacy-routing:ipv6
                     online   svc:/network/routing/ndp:default


This test only scratches the surface of Solaris's routing capabilities, but that is material for more than one LKSF tutorial. Now let's look at the routing table of one of our servers:

# netstat -nr

Routing Table: IPv4
  Destination           Gateway           Flags  Ref     Use     Interface 
-------------------- -------------------- ----- ----- ---------- --------- 
default              10.211.100.10        UG        1          0 vnic2          
10.211.100.0         10.211.100.11        U         1          0 vnic2     
127.0.0.1            127.0.0.1            UH        1         49 lo0


Do you remember that I asked you to keep in mind that we didn't specify a default route in the sysidcfg? So why do we have a default router now? There is some automagic in the boot: when a system with a single interface comes up without a default route specified in /etc/defaultrouter and without being a DHCP client, it automatically starts the ICMP Router Discovery protocol as specified by RFC 1256 (http://tools.ietf.org/html/rfc1256). Using this protocol, the host adds all available routers in the subnet as default routers.
The rdisc protocol is implemented by the in.routed daemon, which actually implements two different protocols: the already mentioned rdisc protocol and the RIP protocol. The RIP part is automagically activated when a system has more than one network interface.

# ping 10.211.100.11
10.211.100.11 is alive
# traceroute 10.211.100.11
traceroute to 10.211.100.11 (10.211.100.11), 30 hops max, 40 byte packets
 1  10.211.101.10 (10.211.101.10)  0.285 ms  0.266 ms  0.204 ms
 2  10.211.100.11 (10.211.100.11)  0.307 ms  0.303 ms  0.294 ms
#

As you can see … we have built a network in a box.

Building a more complex network

(Figure: the extended network topology — /uploads/extendednetwork.png)

Let's extend our example a little bit. First we configure additional etherstubs and VNICs:

# dladm create-etherstub etherstub10
# dladm create-vnic -l etherstub1 routerb1
# dladm create-vnic -l etherstub10 routerb10
# dladm create-vnic -l etherstub10 serverc1
# dladm create-vnic -l etherstub1 routerc1
# dladm create-vnic -l etherstub10 routerc2
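The rule for link names — they must begin with letters and end with a number — can be sanity-checked before calling dladm. A small sketch of that rule as stated here, not of dladm's complete validation, with valid_linkname as my own helper:

```shell
#!/bin/sh
# Check a candidate link name: starts with a letter, then letters,
# digits or underscores, and ends in a digit.
valid_linkname() {
  echo "$1" | grep -Eq '^[a-zA-Z][a-zA-Z0-9_]*[0-9]$'
}

for name in routerb1 serverc1 vnic_bad example; do
  if valid_linkname "$name"; then
    echo "$name: ok"
  else
    echo "$name: invalid"
  fi
done
```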


As you see, you are not bound to a certain numbering scheme. You can name a VNIC as you want, as long as it begins with letters and ends with a number. Now we use an editor to create a configuration file for routerB:

create -b
set zonepath=/zones/routerB
set ip-type=exclusive
set autoboot=false
add inherit-pkg-dir
set dir=/lib
end
add inherit-pkg-dir
set dir=/platform
end
add inherit-pkg-dir
set dir=/sbin
end
add inherit-pkg-dir
set dir=/usr
end
add inherit-pkg-dir
set dir=/opt
end
add net
set physical=routerb1
end
add net
set physical=routerb10
end
commit

We don't have to configure a default router in this sysidcfg even though the system is a router itself: it boots up as a router and gets its routing table from the RIP protocol. This is the content of routerb_sysidcfg:

system_locale=C
terminal=vt100
name_service=none
network_interface=routerb1 {primary hostname=routerb ip_address=10.211.101.254 netmask=255.255.255.0 protocol_ipv6=no default_route=NONE}
network_interface=routerb10 {hostname=routerb-a ip_address=10.211.102.10 netmask=255.255.255.0 protocol_ipv6=no default_route=NONE}
nfs4_domain=dynamic
root_password=cmuL.HSJtwJ.I
security_policy=none
timeserver=localhost
timezone=US/Central

Okay, we can fire up the zone.

# zonecfg -z routerb -f routerb 
# zoneadm -z routerb clone template
Cloning snapshot rpool/zones/template@SUNWzone4
Instead of copying, a ZFS clone has been created for this zone.
# cp routerb_sysidcfg /zones/routerB/root/etc/sysidcfg
# cp site.xml /zones/routerB/root/var/svc/profile/
# zoneadm -z routerb boot

Okay, the next zone is the routerc zone. We bind it to the matching vnics in the zone configuration:

create -b
set zonepath=/zones/routerC
set ip-type=exclusive
set autoboot=false
add inherit-pkg-dir
set dir=/lib
end
add inherit-pkg-dir
set dir=/platform
end
add inherit-pkg-dir
set dir=/sbin
end
add inherit-pkg-dir
set dir=/usr
end
add inherit-pkg-dir
set dir=/opt
end
add net
set physical=routerc1
end
add net
set physical=routerc2
end
commit

The same rules as for routerb apply to routerc. We will rely on the routing protocols to provide a default route, so we can just insert NONE into the sysidcfg for the default route.

# cat routerc_sysidcfg    
system_locale=C
terminal=vt100
name_service=none
network_interface=routerc1 {primary hostname=routerc ip_address=10.211.102.254 netmask=255.255.255.0 protocol_ipv6=no default_route=NONE}
network_interface=routerc2 {hostname=routerc-a ip_address=10.211.100.254 netmask=255.255.255.0 protocol_ipv6=no default_route=NONE}
nfs4_domain=dynamic
root_password=cmuL.HSJtwJ.I
security_policy=none
timeserver=localhost
timezone=US/Central

Okay, I assume you already know the following steps. It's just the same, only with other files.

# zonecfg -z routerc -f routerC
# zoneadm -z routerc clone template
Cloning snapshot rpool/zones/template@SUNWzone4
Instead of copying, a ZFS clone has been created for this zone.
# cp routerc_sysidcfg /zones/routerC/root/etc/sysidcfg
# cp site.xml /zones/routerC/root/var/svc/profile/
# zoneadm -z routerc boot

Okay, this is the last zone configuration in my tutorial. It's the zone for serverc:

create -b
set zonepath=/zones/serverC
set ip-type=exclusive
set autoboot=false
add inherit-pkg-dir
set dir=/lib
end
add inherit-pkg-dir
set dir=/platform
end
add inherit-pkg-dir
set dir=/sbin
end
add inherit-pkg-dir
set dir=/usr
end
add inherit-pkg-dir
set dir=/opt
end
add net
set physical=serverc1
end
commit

Again … no default route … as this is a single-interface system, we leave it to the ICMP Router Discovery Protocol to find the routers. So create a file called serverc_sysidcfg:

system_locale=C
terminal=vt100
name_service=none
network_interface=serverc1 {primary hostname=server3 ip_address=10.211.102.11 netmask=255.255.255.0 protocol_ipv6=no default_route=NONE}
nfs4_domain=dynamic
root_password=cmuL.HSJtwJ.I
security_policy=none
timeserver=localhost
timezone=US/Central

Well … it's zone startup time again …

# zonecfg -z serverc -f serverC
# zoneadm -z serverc clone template
Cloning snapshot rpool/zones/template@SUNWzone4
Instead of copying, a ZFS clone has been created for this zone.
# cp serverc_sysidcfg /zones/serverC/root/etc/sysidcfg
# cp site.xml /zones/serverC/root/var/svc/profile/
# zoneadm -z serverc boot

Now we have to make routers out of our routing zones. We log in to both routing zones and activate forwarding and routing, first on routerb:

# routeadm -e ipv4-forwarding 
# routeadm -e ipv4-routing
# routeadm -u

Afterwards on routerc. The command sequence is identical.

# routeadm -e ipv4-forwarding 
# routeadm -e ipv4-routing
# routeadm -u

Now let's log in to the console of our server:

servera# netstat -nr
Routing Table: IPv4
  Destination           Gateway           Flags  Ref     Use     Interface 
-------------------- -------------------- ----- ----- ---------- --------- 
default              10.211.100.10        UG        1          0 vnic2     
default              10.211.100.254       UG        1          0 vnic2     
10.211.100.0         10.211.100.11        U         1          0 vnic2     
127.0.0.1            127.0.0.1            UH        1         49 lo0 

As you see, there are now two default routers in the routing table. The host receives router advertisements from two routers, so it adds both to the routing table. Now let's have a closer look at the routing table of the routerb system.

routerb# netstat -nr 
Routing Table: IPv4
  Destination           Gateway           Flags  Ref     Use     Interface 
-------------------- -------------------- ----- ----- ---------- --------- 
default              10.211.101.10        UG        1          0 routerb1  
10.211.100.0         10.211.102.254       UG        1          0 routerb10 
10.211.101.0         10.211.101.254       U         1          0 routerb1  
10.211.102.0         10.211.102.10        U         1          0 routerb10 
127.0.0.1            127.0.0.1            UH        1         23 lo0

This system has more than one interface, so in.routed starts up as a RIP-capable routing daemon. After a short moment in.routed has learned enough about the network and adds its routes to the kernel routing table; soon the routing tables of our routers are filled with the routing information provided by the routing protocols.

Conclusion

This part of the tutorial covers just a small part of Crossbow. In the next part I will talk about its capabilities for managing flows of network traffic. The scope of the virtualisation part is wider than just testing. Imagine the following situation: you want to consolidate several servers of a complex network, but you cannot or do not want to change their configuration files. As far as the network configuration is concerned, you can simply simulate the whole network in one machine. And as it is part of a single operating system kernel, it is a very efficient way to do it: you don't need virtual I/O servers or anything like that, it's the single underlying kernel of Solaris itself doing the job. Another interesting use case for Crossbow was introduced by Glenn Brunette in his concept of immutable service containers.