Adventures in dual-booting: Windows 8, EFI, CentOS6

September 2014

This adventure is probably summed-up by my favourite Mark Twain quote - "It ain't what you don't know that gets you into trouble. It's what you know for sure that just ain't so."

I've been running Linux for 20 years, and done a lot of dual-boots. I know that's old-school now, but I run Linux 95% of the time yet don't want to lose a Windows system I've paid for. I've never tried removing it from a system and reinstalling the same licenced copy inside a virtual machine.

I bought a new laptop back in April this year, after trying to check online for Linux certification to match what was in the local stores. There are so many models and variants that it's almost impossible, but I found various "HP Pavilion 14" in ubuntu.com/certification and a couple of "HP EliteBook" in redhat.com/laptop.
So I bought an "HP Pavilion 14-n228ca TouchSmart Notebook" which came with Windows 8.1 installed.

So I start off doing what I've done on previous occasions - get into the BIOS*, change the boot order, boot a CentOS 6 installation CD as used on my desktop, go into rescue mode and look at the partitions. Normally I'd use fdisk, but that says it doesn't understand GPT and I should use parted. There are 5 partitions, so I use ntfsresize to shrink the main NTFS data partition, then delete the partition and recreate it shorter at the same start location. Then reboot the CD into install mode, create a Linux partition in the free space, and install CentOS, which adds a choice of "Other" in grub.conf to boot Windows.
To be more exact, I used a CentOS 6.2 installation CD (which I had probably burned in April 2012) to install packages from a CentOS mirror online, then did "yum update" to bring the install to the latest revision (6.5) - the same procedure I had used on my desktop system.
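
For anyone wanting to check up front whether a disk is GPT (the reason fdisk refused to cooperate), here is a minimal Python sketch. It is not something I ran at the time. It assumes 512-byte sectors and read access to the raw device (e.g. /dev/sda as root); the function name and the returned field names are made up for illustration. The header offsets are those of the GPT header layout in the UEFI specification.

```python
import struct

GPT_SIGNATURE = b"EFI PART"  # first 8 bytes of a GPT header


def read_gpt_header(path, sector_size=512):
    """Return a few GPT header fields, or None if the disk has no GPT header.

    Assumes the primary GPT header sits in LBA 1, i.e. one sector into the disk.
    """
    with open(path, "rb") as disk:
        disk.seek(sector_size)
        header = disk.read(92)  # minimum GPT header size
    if len(header) < 92 or header[:8] != GPT_SIGNATURE:
        return None
    # Offsets 40 and 48: first and last LBA usable for partitions.
    first_usable, last_usable = struct.unpack_from("<QQ", header, 40)
    # Offsets 80 and 84: number of partition entries and size of each entry.
    num_entries, entry_size = struct.unpack_from("<II", header, 80)
    return {
        "first_usable_lba": first_usable,
        "last_usable_lba": last_usable,
        "num_entries": num_entries,
        "entry_size": entry_size,
    }
```

Calling read_gpt_header("/dev/sda") would return None on an old MBR-only disk and a small dictionary on a GPT one. It only reads, so it is safe to try on a disk image first.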

Then I boot CentOS and finish the install - a couple of glitches; it needs a kernel parameter "iommu=soft" to get the USB mouse to work ("nommu_map_single overflow" messages, per bug 532582), and it needs a firmware file rt3290.bin for the RT3290 WiFi chip to work (submitted bug 1133288).

The boot sequence is a bit weird compared to what I'm used to - this is my first machine with UEFI. The BIOS has a UEFI boot order and also a legacy boot order (see photo), which has to be enabled. UEFI takes precedence. At boot, ESC gives a choice (photo):

 F1 System Information
 F2 System Diagnostics
 F9 Boot Device Options
 F10 BIOS Setup
 F11 System Recovery
With legacy boot enabled, F9 gives a boot menu (photo) with
  OS boot Manager
  Boot from EFI file
  Notebook hard drive
  Internal CD/DVD ROM Drive
"Notebook hard drive" now takes me to GRUB (the disk MBR).
"EFI file" takes me walkabout on a Windows file system with folders like "HP", "Boot", "Windows" and what looks like hundreds of locale files - maybe I can boot in Turkish. (No; most of the .efi files boot Windows, except for HP/systemdiags/systemdiags.efi which boots "HP PC hardware diagnostics".)
"OS boot Manager" takes me to an HP/Windows system recovery screen (photo) with various options - continue, troubleshoot, turn off.
"continue" goes to a splash screen like "attempting to repair" which fails (photo). "troubleshoot" has a command prompt option. That's running Windows cmd.exe in one of the other partitions, mounted as X:
In that, I find commands "chkdsk", "diskpart", "bootrec", "bcdedit" etc. To cut the even longer story short, I did something like:
X:\ diskpart
diskpart> select disk 0
diskpart> select partition 4 (the NTFS system one)
diskpart> set id=ebd0a0a2-b9e5-4433-87c0-68b6b72699c7
diskpart> exit
X:\ bcdedit /set {default} device partition=C:
X:\ bcdedit /set {default} osdevice partition=C:
X:\ bootrec /rebuildbcd
After doing that, the system partition appears as C:, passes chkdsk, and the system boots successfully into Windows.

Longer version:
The original Windows system+user partition was mountable under Linux, with all files apparently correct. In the Windows recovery system, the partition was not initially mounted (not assigned a drive letter), and so could not be examined. However, using the "diskpart" tool, it was possible to assign a drive letter to it (the next available, e.g. F:). Then, "chkdsk" could be run successfully, and the files listed normally. However, the drive letter assignment did not survive a reboot.
It seems that the newer Windows 8 (or GPT) disk configuration uses disk attributes not used - at least by default - by Linux. Per the Microsoft TechNet article "Set id", a special partition ID (a GPT partition type GUID) must be set to define the system partition as a "basic data partition".
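
To make that ID less mysterious: each GPT partition entry starts with a 16-byte type GUID, stored with its first three fields little-endian. The following Python sketch decodes such a value and looks it up in a short table of well-known type GUIDs. The table is deliberately incomplete, and the function name is just for illustration. Which type my recreated partition actually carried before the fix is not something I recorded.

```python
import uuid

# A few well-known GPT partition type GUIDs (not an exhaustive list).
KNOWN_TYPES = {
    uuid.UUID("c12a7328-f81f-11d2-ba4b-00a0c93ec93b"): "EFI System Partition",
    uuid.UUID("e3c9e316-0b5c-4db8-817d-f92df00215ae"): "Microsoft Reserved",
    uuid.UUID("ebd0a0a2-b9e5-4433-87c0-68b6b72699c7"): "Microsoft basic data",
    uuid.UUID("de94bba4-06d1-4d40-a16a-bfd50179d6ac"): "Windows Recovery Environment",
    uuid.UUID("0fc63daf-8483-4772-8e79-3d69d8477de4"): "Linux filesystem data",
}


def describe_type(raw16):
    """Decode the first 16 bytes of a GPT partition entry into (guid, name)."""
    # bytes_le handles the mixed-endian on-disk layout of the GUID.
    guid = uuid.UUID(bytes_le=bytes(raw16))
    return str(guid), KNOWN_TYPES.get(guid, "unknown")
```

For example, the on-disk bytes a2 a0 d0 eb e5 b9 33 44 87 c0 68 b6 b7 26 99 c7 decode to the ebd0a0a2-... value given to "set id" above.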

Once that was done, it was necessary to assign the "device" and "osdevice" attributes in bcdedit. Then, the drive letter assignment (to C:) becomes permanent, and the system can be booted into Windows normally.
That's not very satisfactory, though, because the system now always boots into Windows unless you hit F9 while booting.

Getting a boot choice
I had read in some forums about EasyBCD. Initially I assumed it was included in Windows 8, but then realized it was actually produced by neosmart.net. So after successfully installing Linux and recovering Windows, I tried to improve the boot sequence. I installed EasyBCD and ran it. It's a typical Windows graphical program (which, like most things more complex than Notepad, won't run from the command line in the recovery partition). See photo. The documentation includes an example of how to boot Fedora, assuming that you are doing a new Linux install, with GRUB2 installed into the first sector of the root partition. I don't want to do that - I already have CentOS running, with GRUB (0.97) in the MBR. I try "grub-install /dev/sda7" to install a copy in the partition. I'm not sure if that did what I wanted, but regardless, the EasyBCD install does not work. I get a boot menu with a Windows entry and a RedHat entry, but the RedHat one fails to boot. Possibly this photo. So I removed the RedHat entry and got a direct boot to Windows again.

While looking through the CentOS 6 installation files, I found mirror.centos.org/centos/6/os/x86_64/EFI/BOOT/BOOTX64.efi and BOOTX64.conf. I tried installing them in the EFI partition directory tree, and booting the EFI file from the boot manager "Boot from EFI file" selection, with a suitably modified BOOTX64.conf, but although I got a GRUB prompt, I could not boot a system.
I then found a page of the RHEL 6 installation guide which explains the EFI booting process. I set up /etc/fstab to mount the VFAT EFI partition /dev/sda2 on /boot/efi as described, but it was not obvious how to proceed.

Later, after more reading, I found mention of rEFInd, which incidentally has an excellent guide to EFI on Linux in general.
I downloaded rEFInd from sourceforge with git, and tried building it. rEFInd 0.8.3 requires gnu-efi-3.0u or later, while the CentOS6 version is gnu-efi-3.0g. So I downloaded gnu-efi from sourceforge and built an RPM version, using the 3.0g specfile as a template. Then I installed the resulting RPMs for gnu-efi and rEFInd.
Surprisingly, the on-install script for rEFInd actually runs the program, which writes to the EFI partition. This was a bit disconcerting; I would normally read the instructions and do a dry run first. Particularly as the program continues to run after a "fatal" error:

  Installing : refind-0.8.3-1.el6.x86_64                                    1/1 
Fatal: Couldn't open either sysfs or procfs directories for accessing EFI variables.
Try 'modprobe efivars' as root.
Installing rEFInd on Linux....
ESP was found at /boot/efi using vfat
Running in BIOS mode with a suspected Windows installation; moving boot loader
files so as to install to /boot/efi/EFI/Microsoft/Boot.
Installing driver for ext4 (ext4_x64.efi)
Copied rEFInd binary files

Copying sample configuration file as refind.conf; edit this file to configure
rEFInd.
After reading some more, I found that the EFI variables at /sys/firmware/efi/vars/ are only available if the Linux kernel was actually booted from EFI. The fatal error comes from "efibootmgr", which is included in CentOS 6, and is used in the rEFInd install script.
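
A quick way to tell which mode the running kernel was booted in, before running anything that needs the EFI variables, is to look for the /sys/firmware/efi directory, which only exists after an EFI boot. A minimal Python sketch follows; the function name and the sysfs_root parameter are mine, the latter only there so the check can be tried against a test directory.

```python
import os


def boot_mode(sysfs_root="/sys"):
    """Report whether the running kernel was started via UEFI or legacy BIOS.

    The kernel only creates <sysfs>/firmware/efi when it was booted from EFI.
    """
    efi_dir = os.path.join(sysfs_root, "firmware", "efi")
    return "UEFI" if os.path.isdir(efi_dir) else "BIOS/legacy"
```

If boot_mode() reports "BIOS/legacy", efibootmgr cannot read or change the firmware boot entries, which matches the error seen during the rEFInd install.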
I think at that point I used the "Boot from EFI file" option from the BIOS boot menu to boot /boot/efi/EFI/redhat/grub.efi, and then re-ran /usr/share/refind-0.8.3/install.sh, which ran with no error. But I'm not quite sure.

Following this, I tried to boot "OS boot Manager" (the default) from the boot menu, but got a blank screen. However, when I disabled legacy boot in the BIOS and tried again, I got the rEFInd menu. I was then able to edit /boot/efi/EFI/Microsoft/Boot/refind.conf and add a line "dont_scan_files /EFI/Boot/bootx64.efi" to suppress the not-very-useful bootx64 option.
This, finally, gave me a reasonable dual-boot system - one that presents a clear choice on the screen at boot time, then defaults to Linux with a timeout. See photo. Success!

More tests
Rod Smith advised me to do "mvrefind.sh /boot/efi/EFI/Microsoft/Boot /boot/efi/EFI/refind" to move refind to its normal location, in case it was later overwritten by Windows. After doing that, at F9 I get a menu listing rEFInd (see photo), and by default the system boots to rEFInd, which is what I want.
However, it seems that on this laptop, if I boot Windows with that configuration, it re-writes the EFI boot order (seen in "efibootmgr") to have Windows first. So, after some confusion which involved a second copy of rEFInd with missing logos, I reinstalled rEFInd under Linux booted in legacy mode, so that it replaces /boot/efi/EFI/Microsoft/Boot/bootmgfw.efi and Windows seems unable to change it.
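
To see at a glance whether the boot order has been changed, the text printed by "efibootmgr" (run as root, in an EFI-booted system) can be picked apart with a few lines of Python. This is only a sketch: it assumes the usual layout of a "BootOrder:" line followed by "BootNNNN" entry lines, and the function names are hypothetical.

```python
import re


def parse_efibootmgr(text):
    """Split efibootmgr output into (boot order list, {entry number: label})."""
    order = []
    entries = {}
    for line in text.splitlines():
        if line.startswith("BootOrder:"):
            numbers = line.split(":", 1)[1]
            order = [n.strip() for n in numbers.split(",") if n.strip()]
            continue
        # Entry lines look like "Boot0001* Some label"; "*" marks an active entry.
        match = re.match(r"Boot([0-9A-Fa-f]{4})\*?\s+(.*)", line)
        if match:
            # Verbose output appends a device path after a tab; keep the label only.
            entries[match.group(1)] = match.group(2).split("\t")[0].strip()
    return order, entries


def first_entry(text):
    """Return the label of the entry the firmware will try first, or None."""
    order, entries = parse_efibootmgr(text)
    if not order:
        return None
    return entries.get(order[0])
```

If first_entry() returns the Windows entry after a trip into Windows, the order has been rewritten; "efibootmgr -o" with a comma-separated list of entry numbers sets it back, at least until the next time.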

Secure Boot
There is information on secure boot at rodsbooks, but for now I am going to leave it off. The rEFInd and CentOS entries I now have will not boot in secure mode since I have no keys set up.

Unanswered Questions
I still have some unanswered questions, such as:
- what should I have done to create a Linux partition?
- how should I have installed Linux to get a UEFI boot by default?

Historically, as I recall, it was impossible to resize an NTFS partition from inside a running Windows system, or even using Windows at all, so many third-party partitioning disks were in fact built around Linux ntfsresize. But I think I read somewhere that since Windows 8 (or maybe 7) diskpart will work on a running system, and is the method of choice.
Rod Smith suggests using GParted to resize the partition in place, rather than deleting and re-creating it.

Regarding UEFI, had I burned a new CentOS 6 install CD (i.e. 6.5), would this have natively understood EFI and given me a choice to create a UEFI boot system instead of the traditional MBR and first-disk-sector? Or would I have had to switch to the newer CentOS 7, or Fedora?
I wasn't getting very good information from Google - the RedHat GRUB EFI documentation was only one page, for instance. As the old chestnut has it, the best documentation was in the last place I looked, rodsbooks.com, which I found not by a search for "EFI Linux" but via sourceforge from a mention of rEFInd in a mailing list post. uefi.org, which I have not explored in depth but seems to have definitive specifications and white papers, came from the RPM info block for efibootmgr.

However, I have no particular desire to repeat everything from scratch just to answer these questions. I prefer to keep the working system.

Later: A correspondent tells me about the SUSE article (end of the links). I have not checked everything, but it seems reasonable.

Links:
These are some pages I referred to on the way:

* BIOS - I refer to the firmware that executes at boot time as "the BIOS". UEFI.org etc. sometimes call that just "the firmware", and refer to BIOS legacy mode in the firmware as a method of booting older devices/bootloaders. Other sources refer to a CSM (compatibility support module).


Andrew Daviel, September 2014