Marc's Public Blog - Linux

2010/10/24 Ubuntu Maverick: Plymouth Is the Worst Thing That Happened To Linux

π 2010-10-24 01:01 in Linux, Public

So, linux upgrades can always be a bit painful: software gets upgraded, things change (not always for the better), and there is not always a lot of (or any) QA on upgrade paths, so things do break.
There is nothing new there, I've been doing this for 15 years, so I'm used to it.

Just to say that I'm not picking on plymouth: other random things broke during my recent upgrades, and I fixed them (including Xorg switching to kernel modesetting, and requiring i915.modeset=1). Usually it's just a matter of googling for error messages and applying the answers.
(plymouth is a new 'feature' that hides all the boot messages by default, replaces them with a splash screen, and tries to capture text output from the boot and log it, which it still does improperly as of today, including by dropping messages if there are too many).

A few releases back, the switch to upstart was a bit painful. I can't say I was super thrilled with it, especially when at the time documentation and debugging info were sparse. This has however been fixed (the documentation that is), and while there are still bugglets here and there (statd won't start properly on my laptop), at least my boot doesn't randomly hang on networking dependencies anymore ( https://bugs.launchpad.net/ubuntu/+source/network-manager/+bug/499361 ). But the main part is that I'm willing to be more patient and understanding with upstart because it is a clear win for linux: it is good progress on functionality.

And this brings us to the trainwreck otherwise known as plymouth.

So, you might ask, what is so wrong with plymouth? Well, how about this:

Plymouth changes some very core parts of the system, and was rolled out by overeager people way before it was finished. If Red Hat does that on Fedora, which is used for experimentation on mostly willing users, great. For Canonical to put this in Long Term Support Lucid in the half-baked state that it was in is very lame (6 months ago, I upgraded my mythtv to lucid to pick up new mythtv packages and their dependencies. Needless to say, it didn't go so well, and plymouth really sucked and made my life miserable then: http://marc.merlins.org/perso/linux/post_2010-04-25_Ubuntu-Lucid-and-Mythtv-0_23-Upgrade_-Please-Don_t-Become-Red-Hat.html ).

As a result, I entirely skipped lucid on my main laptop and waited for maverick with some dim hopes that it would get a bit better. Long story short: not significantly so.

Plymouth is still mostly undocumented as of today. man plymouth and man plymouthd still return nothing. Searching for 'plymouth ubuntu' or 'plymouth' on ubuntu.com does not return anything useful. What do you think I'd be wanting?

If you are going to be significantly changing all our systems, you'd better post a good rationale as to why the change is good and desirable. This was done for upstart and even mostly for network-manager (never mind the many problems that network-manager had for 18 months after being pushed to all before it was mostly useable and stable). The cynic will point out that the reason there is no rationale for the switch to plymouth is that any such rationale would be very dubious.
If you are going to be pushing a big change to all our systems (hell, it's only how they boot, how they fsck drives, and how you can get to single user mode, or not), you'd better have a page that explains how it works, how you debug it, and how you work around it if you have to. Why is it that even getting 'noplymouth' as a boot option is apparently such a highly guarded secret?
If you really need to put a multiplexer in the boot system, which is a big change, make baby steps: add it while keeping text mode booting by default for at least existing installs. Debug the hell out of it (it looks to me like booting as close to text mode as you can get with plymouth is a very little tested code path, unless the entire code is still as buggy as what I experienced in text mode).
Who thought that they were going to make friends with us by forcing on us that utterly useless "you don't need to see your boot, it might have useful debug info that might be useful to you, we can't have that"? Why is it that it is so fscking hard to turn off plymouth boot message stealing? Even after turning off the graphical crap, you still get a useless one in text mode with spinning colored text dots. Really ?!?!
noplymouth INIT_VERBOSE=yes at the lilo prompt seems to do it for me right now, but where on earth is that documented?
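Since none of this is documented, one way to at least confirm that the options typed at the boot prompt reached the kernel is to look at /proc/cmdline after boot. What follows is only a minimal sketch: the function names are made up for illustration, and it only checks that the two options above were passed, not that the init scripts or plymouth honor them.

```python
from pathlib import Path

# The two options quoted above. Whether they have any effect depends on the
# distribution's init scripts, which this sketch does not inspect.
WANTED = ("noplymouth", "INIT_VERBOSE=yes")


def missing_boot_options(cmdline, wanted=WANTED):
    """Return the options from `wanted` that are absent from a kernel command line."""
    present = set(cmdline.split())
    return [opt for opt in wanted if opt not in present]


def current_cmdline():
    """Read the command line the running kernel was booted with (Linux only)."""
    return Path("/proc/cmdline").read_text()
```

On a running system, print(missing_boot_options(current_cmdline())) lists whichever of the two options did not make it to the kernel; an empty list means both were passed.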

So, that's really my beef with canonical on this one: I care much much more about having a system I can upgrade mostly safely like I can in debian, than this graphical crap that is downright hurtful to my system. I really don't care about how windows-like you can make linux look (up to an un-debuggable and opaque boot), I'm really not interested.
Ubuntu/Canonical, stop the madness, please. Do impose some standards on your eager developers who think they came up with the latest 'this is so cool' thing to add to linux, especially when it affects essential parts of the system.
I think I'm also specifically bitter about plymouth in ubuntu because its presence could have been made optional in init scripts (Red Hat even had such support in their init scripts), but in ubuntu "it's obviously good enough for everybody, so eat it and shut up".

For more details on what went wrong this time with plymouth, if you are curious:

on top of being hard to turn off, plymouth still broke my custom 'ask password from the command line' code in my initrd for cryptpart (by disconnecting stdin from my shell script it seems, thank you very much for that). I just re-opened the bug I filed: https://bugs.launchpad.net/ubuntu/+source/plymouth/+bug/665789 which was closed as invalid (sure, it's ok to break cryptsetup and it's invalid to complain about it since plymouth is obviously necessary (it's not, I'm booting without it right now) and properly documented (all those people all over google searches asking how to turn it off or generally kill plymouth are probably just a statistical error); perfect, so the problem must be with me, not plymouth).

I had a problem with /usr, apparently because the system had been rebooting without unmounting partitions cleanly (some other bug probably). When I asked to be dropped to a prompt to debug, I got a shell where /usr was mounted and where I couldn't umount it (fuser -kvm did not help at that point).

some other plymouth bug stopped me from getting to real single user mode: I got both a single user console and a second process still asking me for my root password, both stealing keystrokes at the same time. End result was that I couldn't type anything.

at the next boot, while I was still trying to get a single user prompt, plymouth just answered fsck for me that it was ok to fsck -f -y /usr, and scrolled many pages of vital errors that I could not capture and that were lost (a few lines ended up in /var/log/boot.log, but not enough to be useful). In my book, plymouth actually caused data loss here, thank you (thankfully I have backups).

I could go on (it does go on), but that's long enough, you get the idea....

As sad as it is, people like Steve Langasek ( https://bugs.launchpad.net/ubuntu/+source/plymouth/+bug/665789 ) are now probably understandably defensive about plymouth, likely due to many complaints like mine, instead of doing the right thing: making it really optional, writing a rationale on ubuntu.com as to why it's there and why we should love it, and especially documenting the crap out of it. I'm now dubious as to whether ubuntu/canonical will get a clue about this, and wondering whether I should just switch back to Debian where I haven't seen much insanity like this...

Vmware linux kernel support downfall

I've had vmware workstation since version 2.0, and paid for each version since then: 2, 3, 4, 5, and 6. My latest version is 6.5.4 (under the 6.0 license).

I need to start by saying that I've always considered vmware an impressive product. Some of what they're doing is incredibly hard (fast VT without CPU support) and they're doing it very well.

Yet, I've always had a gripe with Vmware which finally got me to switch to Virtualbox just now: their total lack of support for new kernels as they come out.
While it would be nice, I'm not asking for their website to have up to date patches for the latest rc kernel that just came out 2 hours ago, but they have always lacked support for new stable kernels, even months after they came out (sometimes almost six months!).

It's not like they don't have kernel engineers there, why don't they write timely patches to vmmon, vmnet and friends a few days after each new stable kernel comes out? Ok, there is the support and testing issue, but then just post them unofficially on the web site as 'unsupported', which is better than my googling for the same thing off some random guy's web page.
Now you could say "just upgrade to workstation 7.x", but shouldn't I get kernel support for the existing workstation license I already paid for, for a reasonable number of years, like 3 to 5? I'm not asking to have vmware 2 working anymore... Forcing me to buy the next vmware every time just to get kernel support is wrong. The worst part is that even workstation 7 still does not support kernel 2.6.35 or 2.6.36, so even by paying for the next version, I'm still unsupported.

For "fun", I tried to get the latest vmware workstation from their web site, take its modules, patch them for 2.6.36 since they do not work as is, and then try to use them with vmware 6.5. Unfortunately, they're not backward compatible :(

So, I'm left with using the free limited vmware player when I had 5 paid for licenses for vmware 2, 3, 4, 5, and 6 (!). That's just wrong: I shouldn't have to use the free limited product when I have paid licenses for multiple older vmware versions with no kernel support...
And to add insult to injury when I installed the latest vmware player, it removed vmware workstation from my system without asking me, never mind if I wanted to reboot with an older kernel and use my older vmware workstation. Grr, that's not cool guys!

So, while I really liked vmware, the company and the product, this just drove me away and enticed me to try VirtualBox.
Sorry to tell you vmware: you lost a customer who could otherwise recommend your products for server room use, because in the 10 years or so you've been in business and I've used your products, you couldn't be bothered to support new linux kernels in a timely manner as they came out, and you clearly dropped support for your paid customers as soon as a new vmware came out, forcing people to upgrade and pay for a new version.

VirtualBox

VirtualBox comes in two flavours: the open source edition (OSE) and the closed source but free as in beer version. It came from Sun, which deserves to be commended for the contributions it has made to open source. Unfortunately, it is now owned by Oracle, which I respect and trust a whole lot less (especially since the Java lawsuit).

Because USB support is a requirement for me (I mostly use windows to sync to hardware that only has windows support, like garmin GPSes), I had to use the closed source edition from Oracle.

First impressions were good: Oracle provides multiple native packages for all popular distributions (which is better than the vmware bundle that decides to uninstall old versions of vmware products without asking me), and one apt-get command later, I had virtualbox running on my laptop.

The part that really impressed me was that VirtualBox was able to re-use my vmware disk image cut in 2GB chunks as is. I just pointed it to my vmware image, and it just booted windows. Now, things were a bit ugly until I got its tools installed (a bit like vmware tools), but once that was done, things worked pretty well. I read some notes about the fact that a WinXP image might not work as is in virtualbox due to ACPI and IOAPIC, but my W2K image from 10 years ago just worked as is.
A small problem was that it was incompatible with the AMD PCnet emulation that vmware used, so networking didn't quite work until I switched virtualbox to emulate an e1000 card and installed the e1000 driver in my vmware image. After that, things were good.

I was then even able to shut down the image and re-open it in vmware player 7. The only thing to be very careful about was that vmware player remembered that the image was a snapshot and tried to restore windows to its running state in vmware, with disk blocks that had since changed due to Virtualbox having run in the meantime. Worryingly enough, I did not find a way to tell vmware player to start the image from scratch as opposed to trying to restore the frozen snapshot. I had to restore my full vmware disk blocks from a backup and then delete the pointers vmware had so that it would not try to restore the image anymore and just boot it from scratch.

Anyway, I can now run my vmware image with either VirtualBox or Vmware Player, although I'm going to stick to Virtualbox from now on: I tested USB on it and it worked fine to talk to my garmin GPS, so that's all I need.

Little things I noticed:

virtualbox does not allow me to map a serial port on the fly (i.e. reserve a serial port and map it to /dev/ttyUSB0 after I plug in the serial converter in linux). I can do that with vmware.

Annoyingly, vmware and virtualbox use a different hostname and path to share host filesystems under. As a result, I have drive letter mappings that are different for vmware and virtualbox, and one half fails each time I boot windows. I'm not sure if I can hack virtualbox to use the same names as vmware.

Virtualbox otherwise looks as good as, if not better than, vmware player. Its snapshot/restore looks a bit faster on my system and, more importantly, its kernel modules are actually supported and working on recent kernels.

One needs to be *very* careful not to try and restore a snapshotted system back to life if the disk image was used to boot the system from scratch by the competing product. Severe disk corruption could otherwise occur. Virtualbox should maybe detect that and optionally invalidate the vmware freeze/snapshot after it mounts and modifies its disk image.

I am wondering if virtualbox is a bit slower on a vmware disk image instead of its native format. If someone knows, tell me.
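For the snapshot caution in the list above, a crude check can be done by hand before resuming anything: compare the modification time of the saved-state file with that of the disk image files. This is only a sketch under assumptions. The function name is made up, the caller has to supply the paths, and comparing mtimes is a heuristic, not something either product is known to do itself.

```python
from pathlib import Path


def disks_newer_than_state(state_file, disk_files):
    """Return the disk files that were modified after the saved-state file.

    A non-empty result suggests something wrote to the disk image after the
    VM was frozen, so resuming from that state would be unsafe.
    """
    state_mtime = Path(state_file).stat().st_mtime
    return [str(d) for d in disk_files if Path(d).stat().st_mtime > state_mtime]
```

With hypothetical file names, disks_newer_than_state("w2k.vmss", ["w2k-s001.vmdk", "w2k-s002.vmdk"]) would return the disk chunks touched since the freeze. If the list is not empty, boot the image from scratch or restore the disk from a backup instead of resuming.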

But all in all, while I'm sad to have had to ditch Vmware after about 10 years of being a user, I'm a happy Virtualbox user now. I just hope Oracle isn't going to screw it up.

1996/11/18-21:	Linux Pavillion Comdex Fall 1996 (photos only). I've been going since then to help at the linux pavillion.
1997/11/18-21:	Linux Pavillion Comdex Fall 1997 (photos only)
1998/05/28-30:	Linuxexpo 1998 (photos only)
1998/11/16-20:	Linux Pavillion Comdex Fall 1998 (full report)
1998/11/11:	Silicon Valley Tea Party (report with pictures)
1999/02/15:	Windows Refund Day (report with pictures)
1999/03/20:	SVLUG KTEH night (photos only)
1999/03/01-04:	LinuxWorld Expo Winter 99 (complete report with many pictures)
1999/03/31:	Mozilla Party one year anniversary (photos only)
1999/05/18-22:	Linuxexpo 1999 (complete report with many pictures)
1999/06/07:	June 99 Balug meeting with Linus
1999/08/09-12:	LinuxWorld Expo Summer 99 (complete report with many pictures)
1999/11/15-19:	Linux Business Show at Comdex Fall 1999 (full report with pictures)
2000/08/14-17:	LinuxWorld Expo Summer 2000 (complete report with many pictures)
2001/01/17-20:	Linux.conf.au/LCA 2001 (complete report with pictures)
2001/07/25-28:	OLS 2001 (photos only)
2001/08/25:	Linux 10th Anniversary (report with pictures)
2001/09/27-30:	LinuxWorld Expo Summer 2001 (report with pictures)
2001/11/05-10:	ALS 2001 (photos only)
2002/06/26-29:	OLS 2002 (photos only)
2003/01/20-25:	LCA 2003 (photos only)
2003/07/23-26:	OLS 2003 (photos only)
2004/01/12-17:	LCA 2004 (photos only)
2004/07/21-24:	OLS 2004 (photos only)
2005/04/18-23:	LCA 2005 (photos only)
2006/01/24-28:	LCA 2006 (photos only)
2007/01/17-21:	LCA 2007 (photos only)
