New features of Solaris: Alternate boot environments based on snapshots
One limitation of OpenSolaris 2008.05 is the missing Live Upgrade. But … well … you get something better: the whole concept of Live Upgrade was carried into the future by using the capabilities of ZFS.
Using snapshots for boot environments
One of the nice features of ZFS is that you get snapshots for free. The reason lies in the copy-on-write nature of ZFS: you can freeze the file system simply by not freeing the old blocks. As new data is written to new blocks, you don’t even have to copy the old blocks (in this sense the COW of ZFS is more like a ROW … redirect on write). ZFS boot enables the system to work with such snapshots, as you can boot from one of them. You can establish multiple boot environments just by snapshotting the boot file systems, cloning them and promoting the clones to real file systems. These are features inherent to ZFS.
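The snapshot, clone and promote steps described above can be sketched with plain zfs commands. This is only an illustration of the mechanism, not what beadm does internally: the names example-be and example-snap are made up, child file systems such as opt are ignored, and the GRUB menu is not touched. By default the sketch only prints the commands.

```shell
#!/bin/sh
# Sketch: the ZFS steps behind a cloned boot file system.
# DRY_RUN=1 (the default here) prints the commands instead of running them.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# clone_boot_fs SOURCE_DATASET NEW_DATASET SNAPSHOT_NAME
clone_boot_fs() {
    src=$1
    new=$2
    snap=$3
    run zfs snapshot "$src@$snap"       # freeze the current state
    run zfs clone "$src@$snap" "$new"   # writable copy sharing the old blocks
    run zfs promote "$new"              # reverse the clone/origin dependency
}

# Hypothetical names, for illustration only.
clone_boot_fs rpool/ROOT/opensolaris-1 rpool/ROOT/example-be example-snap
```

On a real system you would let beadm do this work, as it also takes care of the boot menu.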
A practical example
A warning first: Don’t try this example without a backup of your system, or use a test system or a test VM. We will fsck up the system during this example. Okay …
I’ve updated my system, so I already have two boot environments on it:
jmoekamp@glamdring:~# beadm list
BE                   Active Active on Mountpoint Space
Name                        reboot               Used
----                 ------ --------- ---------- -----
opensolaris-1        yes    yes       legacy     2.31G
opensolaris          no     no        -          62.72M
This mirrors the actual state of your ZFS pool. In the output of zfs list you will find file systems with corresponding names:
NAME USED AVAIL REFER MOUNTPOINT
rpool 2.39G 142G 56.5K /rpool
rpool@install 18.5K - 55K -
rpool/ROOT 2.37G 142G 18K /rpool/ROOT
rpool/ROOT@install 0 - 18K -
rpool/ROOT/opensolaris 62.7M 142G 2.23G legacy
rpool/ROOT/opensolaris-1 2.31G 142G 2.24G legacy
rpool/ROOT/opensolaris-1@install 4.66M - 2.22G -
rpool/ROOT/opensolaris-1@static:-:2008-04-29-17:59:13 5.49M - 2.23G -
rpool/ROOT/opensolaris-1/opt 3.60M 142G 3.60M /opt
rpool/ROOT/opensolaris-1/opt@install 0 - 3.60M -
rpool/ROOT/opensolaris-1/opt@static:-:2008-04-29-17:59:13 0 - 3.60M -
rpool/ROOT/opensolaris/opt 0 142G 3.60M /opt
rpool/export 18.9M 142G 19K /export
rpool/export@install 15K - 19K -
rpool/export/home 18.9M 142G 18.9M /export/home
rpool/export/home@install 18K - 21K -
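The relation between the two listings can be made explicit with a small filter: every file system directly below rpool/ROOT corresponds to one boot environment. The following is a minimal sketch under that assumption; the pool name rpool is taken from the listing above and hard-coded, and be_names is a made-up helper name.

```shell
#!/bin/sh
# be_names: read dataset names (one per line) on stdin and print the
# boot environment names found directly below rpool/ROOT.
# Snapshots (names containing @) and child file systems are skipped.
be_names() {
    awk -F/ '$1 == "rpool" && $2 == "ROOT" && NF == 3 && $3 !~ /@/ { print $3 }'
}

# Typical use on a live system (-H -o name prints bare dataset names):
#   zfs list -H -o name | be_names
```

The names it prints should match the first column of beadm list.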
After doing some configuration, you can create a boot environment called opensolaris-baseline. It’s really easy. You just have to create a new boot environment:
# beadm create -e opensolaris-1 opensolaris-baseline
But we will not work with this environment. We use it as a baseline, as a last resort when we destroy our running environment. To run the system we will create another boot environment:
# beadm create -e opensolaris-1 opensolaris-work
Now let’s look at the list of our boot environments.
jmoekamp@glamdring:~# beadm list
BE                   Active Active on Mountpoint Space
Name                        reboot               Used
----                 ------ --------- ---------- -----
opensolaris-baseline no     no        -          53.5K
opensolaris-1        yes    yes       legacy     2.31G
opensolaris          no     no        -          62.72M
opensolaris-work     no     no        -          53.5K
Okay, now we activate the opensolaris-work boot environment:
jmoekamp@glamdring:~# beadm activate opensolaris-work
Okay, let’s look at the list of boot environments again.
jmoekamp@glamdring:~# beadm list
BE                   Active Active on Mountpoint Space
Name                        reboot               Used
----                 ------ --------- ---------- -----
opensolaris-baseline no     no        -          53.5K
opensolaris-1        yes    no        legacy     24.5K
opensolaris          no     no        -          62.72M
opensolaris-work     no     yes       -          2.31G
jmoekamp@glamdring:~#
You will see that the opensolaris-1 boot environment is still active, but that opensolaris-work will be active after the next reboot. Okay, now reboot and look at the list again:
jmoekamp@glamdring:~# beadm list
BE                   Active Active on Mountpoint Space
Name                        reboot               Used
----                 ------ --------- ---------- -----
opensolaris-baseline no     no        -          53.5K
opensolaris-1        no     no        -          54.39M
opensolaris          no     no        -          62.72M
opensolaris-work     yes    yes       legacy     2.36G
Okay, you see … the boot environment opensolaris-work is now active, and it stays activated for the next reboot (until you activate another boot environment).
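The two Active columns are easy to mix up, so here is a small sketch that reads beadm list output and names the running boot environment and the one chosen for the next boot. It assumes the column layout shown in the listings above (name, active now, active on reboot); be_status is a made-up helper name.

```shell
#!/bin/sh
# be_status: read `beadm list` output on stdin and report the running
# boot environment and the one selected for the next boot.
be_status() {
    awk '
        $2 == "yes" { now = $1 }        # column 2: active right now
        $3 == "yes" { upcoming = $1 }   # column 3: active on reboot
        END { printf "running=%s next=%s\n", now, upcoming }
    '
}

# Typical use on a live system:
#   beadm list | be_status
```

Fed with the listing taken right after beadm activate, it would report opensolaris-1 as running and opensolaris-work as next.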
Now reboot the system once more. GRUB comes up and defaults to the opensolaris-work environment. Please remember at which position you find opensolaris-baseline in the boot menu. You will need this position in a few moments. After a few seconds, you can log into the system and work with it.
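If you would rather write the menu position down than memorize it, you can number the title lines of the GRUB menu file. This is a minimal sketch: grub_entries is a made-up helper name, and the path /rpool/boot/grub/menu.lst in the usage comment is an assumption about where the menu lives on a ZFS root, so check it on your system.

```shell
#!/bin/sh
# grub_entries [FILE]: print "index: title" for each entry of a GRUB
# menu.lst read from FILE or stdin. GRUB counts menu entries from 0.
grub_entries() {
    awk '$1 == "title" {
        sub(/^[ \t]*title[ \t]+/, "")   # keep only the entry name
        printf "%d: %s\n", n++, $0
    }' "$@"
}

# Typical use (path is an assumption, see above):
#   grub_entries /rpool/boot/grub/menu.lst
```

The first entry has index 0, so it is the topmost line in the boot menu.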
Okay … now let’s drop the atomic bomb of administrative mishaps on your system. Log into your system, assume the root role and do the following:
# cd /
# rm -rf *
You know what happens. Depending on how fast you are able to interrupt this run, you get anything from a slightly damaged system to a system fscked up beyond any recognition. Normally the system would send you to the tapes now. But remember: you have some alternate boot environments.
Reboot the system and wait for GRUB. You may get garbled output, so the GRUB menu can be hard to read; this is where the position you remembered helps. Choose opensolaris-baseline. The system will boot up quite normally.
You need a terminal window now. How you get one depends on the damage incurred. The boot environment snapshots don’t cover the home directories, so you may no longer have a home directory. I will assume this for the example: you can get a terminal window by clicking on “Options”, then “Change Session” and choosing “Failsafe Terminal” there.
Okay, log in via the graphical login manager; an xterm will appear. First we delete the defunct boot environment:
# beadm destroy opensolaris-work
Are you sure you want to destroy opensolaris-work? This action cannot be undone (y/[n]):
y
Okay, now we clone the opensolaris-baseline environment to form a new opensolaris-work environment.
# beadm create -e opensolaris-baseline opensolaris-work
We reactivate the opensolaris-work boot environment:
# beadm activate opensolaris-work
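The three recovery steps (destroy, clone from the baseline, activate) can be collected in one small function. This is a sketch using only the beadm subcommands shown in this article; rebuild_be is a made-up name, and by default the commands are only printed. When really executed, beadm destroy asks for confirmation as shown above.

```shell
#!/bin/sh
# DRY_RUN=1 (the default here) prints the commands instead of running them.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# rebuild_be BASELINE BROKEN: replace the BROKEN boot environment with a
# fresh clone of BASELINE and select it for the next boot.
rebuild_be() {
    baseline=$1
    broken=$2
    run beadm destroy "$broken" &&
    run beadm create -e "$baseline" "$broken" &&
    run beadm activate "$broken"
}

rebuild_be opensolaris-baseline opensolaris-work
```

Run it from the baseline environment, never from the one you are about to destroy.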
Now check whether you still have a home directory for your user:
# ls -l /export/home/jmoekamp
/export/home/jmoekamp: No such file or directory
If your home directory doesn’t exist any longer, create a new one:
# mkdir -p /export/home/jmoekamp
# chown jmoekamp:staff /export/home/jmoekamp
Now reboot the system:
# reboot
Wait a few moments while the system starts up. GRUB defaults to opensolaris-work and the system starts up normally, without any problems, in the condition it was in when you created the opensolaris-baseline boot environment.
# beadm list
BE                   Active Active on Mountpoint Space
Name                        reboot               Used
----                 ------ --------- ---------- -----
opensolaris-baseline no     no        -          3.18M
opensolaris-1        no     no        -          54.42M
opensolaris          no     no        -          62.72M
opensolaris-work     yes    yes       legacy     2.36G
Obviously you may have to recover your home directory with its data. It’s a best practice to take snapshots of such directories on a regular schedule, so you can simply roll back to a snapshot to restore the directory.
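A regular snapshot of the home directories could look like the following sketch, which you might call from cron. The snapshot prefix auto- and the function name snapshot_home are made up, and the dataset rpool/export/home is taken from the zfs listing earlier in this article. By default the command is only printed.

```shell
#!/bin/sh
# DRY_RUN=1 (the default here) prints the commands instead of running them.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# snapshot_home DATASET [STAMP]: take a snapshot named auto-<timestamp>.
# STAMP defaults to the current date and time.
snapshot_home() {
    dataset=$1
    stamp=${2:-$(date +%Y-%m-%d-%H:%M:%S)}
    run zfs snapshot "$dataset@auto-$stamp"
}

snapshot_home rpool/export/home
```

To get data back, zfs rollback reverts a file system to a snapshot, or you can copy single files out of the .zfs/snapshot directory of the file system.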
Conclusion
You see, this is a really neat feature: recovering from a disaster in a minute or two. Snapshotting opens a completely new way to recover from errors. Unlike with Live Upgrade you don’t need extra disks or extra partitions, and as ZFS snapshots are really fast, creating alternate boot environments on ZFS is extremely fast as well. At the moment this feature is available on OpenSolaris 2008.05 only, but with future updates it will find its way into Solaris as well.