A very belated Fedora 29 DNF upgrade initially proceeded without too many issues. But I did run into a couple of interesting problems, one of them seemingly some 15 years in the making.

So begins the typical process:

dnf upgrade --refresh # nothing as of at least 2019-10-25
dnf install dnf-plugin-system-upgrade
dnf system-upgrade download --refresh --releasever=29

After removing a couple of old packages that caused problems with the transaction (jre-1.6.0_07 and libsocialweb-0.25.21), and getting a clean system-upgrade download:

dnf system-upgrade reboot

Unfortunately the system bombs to the dracut shell with:

Warning: /dev/Volume00/root does not exist

The logs revealed something interesting:

dracut:/# less /run/initramfs/rdsosreport.txt
* command -v lvm
* lvm pvdisplay
  Cannot access VG Volume00 with system ID localhost.localdomain1083061244 with unknown local system ID.
  Read-only locking type set.  Write locks are prohibited.
  Recovery of standalone physical volumes failed.
  Cannot process standalone physical volumes
* lvm vgdisplay
  Cannot access VG Volume00 with system ID localhost.localdomain1083061244 with unknown local system ID.
* lvm lvdisplay
Cannot access VG Volume00 with system ID localhost.localdomain1083061244 with unknown local system ID.
* command -v dmsetup
* dmsetup ls --tree
No devices found

When I boot the Fedora 28 kernel, everything works just fine, even on the last released kernel-5.0.16-100.fc28.i686. But on the new Fedora 29 kernel, all LVM operations are met with:

  WARNING: Found LVs active in VG Volume00 with foreign system ID localhost.localdomain1083061244.  Possible data corruption.

LVM apparently now cares about matching system IDs. Reading that suffix as a likely epoch dates that volume group to 2004-04-27T06:20:44, which is quite entertaining. This was almost certainly set by my Fedora Core 1 installation some 15-plus years ago.
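As a quick check of that reading, here is a minimal sketch that peels the trailing digits off the system ID and formats them as a timestamp. The function name sysid_epoch is my own, and the -d @N form assumes GNU date. Run in UTC it gives 10:20:44, so the 06:20:44 quoted above corresponds to a UTC-4 local zone.

```shell
# Hypothetical helper: render the numeric suffix of a system ID as a UTC time.
# Assumes GNU date, which accepts "@<seconds>" as an epoch timestamp.
sysid_epoch() {
    local sysid="$1"
    # Strip everything up to and including the last non-digit character,
    # leaving only the trailing run of digits.
    local digits="${sysid##*[!0-9]}"
    [ -n "$digits" ] || return 1
    date -u -d "@${digits}" '+%Y-%m-%dT%H:%M:%SZ'
}

sysid_epoch "localhost.localdomain1083061244"
# prints 2004-04-27T10:20:44Z
```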

After reading up on LVM system ID, it is apparent that LVM needs to be told to trust this system ID. Booting with a Fedora 29 rescue disk:

sh-4.4# lvs
   Cannot access VG Volume00 with system ID localhost.localdomain1083061244 with unknown local system ID.

sh-4.4# lvm systemid
  system ID:

So the rescue boot has no system ID at least, which is “normal.” First we try setting the system_id to match:

sh-4.4# vi /etc/lvm/lvmlocal.conf
global {
    system_id_source = "lvmlocal"
}
local {
    system_id = "localhost.localdomain1083061244"
}

sh-4.4# lvs
  WARNING: system ID may not begin with the string "localhost"
  WARNING: No system ID found from system_id_source lvmlocal.
  Cannot access VG Volume00 with system ID localhost.localdomain1083061244 with unknown local system ID.
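The first warning is the real reason this attempt fails: the ID is rejected before it is ever compared with the volume group. A tiny sketch of that one rule, based only on the warning text above, follows. The function name is hypothetical, and LVM applies further rules (allowed characters, length) that are not modeled here.

```shell
# Hypothetical pre-check mirroring the warning above:
# a local system ID may not be empty and may not begin with "localhost".
sysid_allowed_locally() {
    case "$1" in
        "")          return 1 ;;   # nothing to set
        localhost*)  return 1 ;;   # rejected, per the warning
        *)           return 0 ;;
    esac
}

for id in "localhost.localdomain1083061244" "asdf"; do
    if sysid_allowed_locally "$id"; then
        echo "$id: usable as local system_id"
    else
        echo "$id: rejected as local system_id"
    fi
done
```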

Trying it as an extra system ID:

sh-4.4# vi /etc/lvm/lvmlocal.conf
global {
    system_id_source = "lvmlocal"
}
local {
    system_id = "asdf"
    extra_system_ids = "localhost.localdomain1083061244"
}

sh-4.4# lvs
  LV              VG       Attr       LSize  Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  docker-poolmeta Volume00 -wi-a----- 52.00m
  home            Volume00 -wn-ao---- 16.00g
  root            Volume00 -wi-ao---- 35.48g
  swap            Volume00 -wi-ao----  2.00g

Success! Since I never want to deal with this again, I blank out the system ID on the volume group itself. This allows it to be read by any connected LVM, which will “never” be an issue for this install:

sh-4.4# vgchange --systemid "" Volume00
  WARNING: Empty system ID supplied.
  WARNING: Removing the system ID allows unsafe access from other hosts.
Remove system ID localhost.localdomain1083061244 from volume group Volume00? [y/n]: y
  Volume group "Volume00" successfully changed

The system now boots normally.
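For anyone wanting to check other old machines before upgrading, a rough sketch of a filter that lists only volume groups still carrying a system ID. It assumes vgs is asked for the vg_name and vg_systemid report fields with --noheadings, and that a blank system ID shows up as an empty second column. The filter name and the "Scratch" volume group in the sample input are made up for illustration.

```shell
# Hypothetical filter: keep only "<vg_name> <vg_systemid>" lines
# where the system ID column is not empty.
vgs_with_system_id() {
    awk 'NF >= 2 { printf "%s\t%s\n", $1, $2 }'
}

# Intended use on a real host (needs root and LVM, so not run here):
#   vgs --foreign --noheadings -o vg_name,vg_systemid | vgs_with_system_id

# Simulated input in the assumed layout:
printf '  Volume00 localhost.localdomain1083061244\n  Scratch \n' | vgs_with_system_id
```

An empty result would mean no volume group on the host has a system ID set.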

Now for an X problem. I switched to XFCE years ago when GNOME 3 debuted in Fedora in a less than usable state. And for about as long, I have had my system boot to runlevel 3 (or in systemd parlance, multi-user.target), giving me a non-graphical environment as default in case I need to work on something X-related. This requires me to bring up the desktop environment manually, either by switching to runlevel 5 (graphical.target) or launching XFCE directly, the latter being my typical practice. This now fails:

~]: startxfce4
/usr/bin/startxfce4: Starting X server

Unrecognized option: --
use: X [:<display>] [option]
...
(EE)
Fatal server error:
(EE) Unrecognized option: --
(EE)
(EE)
Please consult the Fedora Project support
         at http://wiki.x.org
 for help.
(EE)
xinit: giving up
xinit: unable to connect to X server: Connection refused
xinit: unexpected signal 2

This appears to be a Fedora-specific XFCE bug: https://bugzilla.redhat.com/show_bug.cgi?id=1718968

We can work around this by switching to runlevel 5. This will have us launch the desktop environment “properly” via GDM, rather than the buggy xfce script.

sudo systemctl isolate graphical.target
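Since I keep saying "runlevel" out of habit, here is the correspondence between the old SysV runlevels and systemd targets as a small lookup. The function name is my own. The mapping follows the runlevelN.target aliases that systemd ships.

```shell
# Hypothetical helper: translate a SysV runlevel number to its systemd target.
runlevel_to_target() {
    case "$1" in
        0)     echo poweroff.target ;;
        1)     echo rescue.target ;;
        2|3|4) echo multi-user.target ;;
        5)     echo graphical.target ;;
        6)     echo reboot.target ;;
        *)     echo "unknown runlevel: $1" >&2; return 1 ;;
    esac
}

runlevel_to_target 3   # prints multi-user.target
runlevel_to_target 5   # prints graphical.target
```

The boot default can be inspected with systemctl get-default and changed with systemctl set-default, for example to multi-user.target for a text-mode boot.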

But from GDM, logging in to any desktop environment hangs: gnome, gnome-classic, lxde, plasma, xfce. At this point I spent a fairly ridiculous amount of time troubleshooting a problem that is not much of a problem.

A test user can log in just fine, suggesting a user config issue. I began playing whack-a-mole with dot-files and dot-directories. Removing the usual suspects (.cache, .config, .local) did not help. In fact nothing helped, until I eliminated the .ssh directory, which I left for near last because really?

Keychain.

I have been using the keychain ssh agent manager for years, initializing out of .bashrc. Keychain prompts for the entry of passphrases for encrypted ssh keys, and it would appear that when loading a desktop environment from GDM, these prompts cause the hang. Not necessarily surprising. If those passphrases are entered prior, such as in runlevel 3 before loading GDM, all is fine (though if you wanted to load directly into runlevel 5 at boot, you are stuck). This is an issue in Fedora 28 as well (and probably earlier), though I never ran into it since loading directly into XFCE with startxfce4 works just fine regardless of prompts.
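One way to sidestep this might be to let keychain prompt only when there is a terminal to prompt on. The following is an untested sketch for .bashrc, not what I actually run: keychain_can_prompt is a made-up helper, id_rsa is a placeholder key name, and I am assuming that keychain's --noask option (which skips the passphrase prompt) is enough to keep a GDM login from blocking.

```shell
# Hypothetical helper: a passphrase prompt can only be answered
# when standard input is attached to a terminal.
keychain_can_prompt() {
    [ -t 0 ]
}

# Only touch keychain if it is installed.
if command -v keychain >/dev/null 2>&1; then
    if keychain_can_prompt; then
        # Interactive terminal: load the agent and prompt for the key.
        eval "$(keychain --eval --quiet id_rsa)"
    else
        # No terminal (e.g. a display manager login): reuse a running
        # agent if there is one, but never prompt for passphrases.
        eval "$(keychain --eval --quiet --noask id_rsa)"
    fi
fi
```

With this guard, keys would still need to be unlocked later from a terminal before ssh can use them.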

Ultimately I experienced the problem because in a rush to troubleshoot the problem, I skipped passphrase entry, which caused the problem I was rushing… to… troubleshoot…

On a final note, this appears to be the end of working 32-bit eclipse from RPM. Fedora 29 ships with mostly-broken eclipse-4.10.0 “2018-12.” The UI throws up “eclipse java.lang.NoSuchMethodError: gObjectClass_finalize” with almost every click, and it seems as though there is no longer any planned support for 32-bit from the project: https://bugs.eclipse.org/bugs/show_bug.cgi?id=536766.

Trying tarball packages, I had to back all the way down to eclipse-4.7.3a “oxygen” (the same that ships with Fedora 28), as both eclipse-4.9 “2018-09” and eclipse-4.8 “photon” throw that same error. Another nail in the coffin for 32-bit.