Restoring my Muse

Wednesday, March 1, 2017

[Lifted from old blog]

Attempting to restore Muse after salvaging some data from it.

How to repair a root BTRFS filesystem?

Attempting a btrfs rescue chunk-recover /dev/sdb3 tells me that the device is busy. Duh, it's my root disk. Can't umount /; it does nothing. So I have to reboot into Ubuntu and hope that an outdated btrfs-tools won't damage the disk any further.

It ran for several hours, then apparently completed successfully after repairing a few things (I was too lazy to copy the output here; maybe I should have).

When I rebooted, however, I was back at the emergency shell. This time I tried btrfs rescue from there. The version of btrfs rescue is clearly not the same: this one shows the number of blocks scanned, but not the total number of blocks or a percentage, which makes the information fairly useless for monitoring progress.

Well, it repaired things and found 105 additional nodes. Reboot. Still stuck at the emergency prompt. I guess it's about time to give up on that system and reinstall. I think that was the machine I had installed with Fedora 25 Server.
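For reference, the rescue attempts boil down to something like the sketch below, run from live media so the root filesystem stays unmounted. The device name is the one from my log; which subcommands are worth trying, and in which order, is a judgment call, so the destructive ones are left commented out:

```shell
dev=/dev/sdb3   # example device; adjust to the actual root partition

# These must run from live media, since a root filesystem cannot be
# unmounted while the system is booted from it:
# btrfs rescue super-recover "$dev"   # try the backup superblocks first
# btrfs rescue chunk-recover "$dev"   # slow: rescans the whole device
# btrfs check "$dev"                  # read-only by default; assess damage
#                                     # before ever reaching for --repair

echo "target device: $dev"
```

Note that btrfs check without options only reports problems; it won't touch the disk.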

How to schedule a GPU across VMs?

Started a discussion regarding this topic following a report on some progress with iGVT. There are a number of issues to consider:

  • What performance and scheduling information can the GPUs report? This is not necessarily the same thing for all vendors.
  • How do they report that information first to the kernel, then possibly to user space via sysfs or something like that? Ideally, we'd like some metrics about resource usage, as well as knobs controlling resource allocation in near real-time.
  • Is there any benefit in having some user-space scheduling agent that tweaks said knobs? Or is it good enough to leave it to the user?

There are more questions, of course. See also heterogeneous scheduling, which mostly focuses on big.LITTLE architectures like some ARM chips, but there is also some interest in GPUs, notably for compute workloads.

Build stuff

Learned a few tricks about how to build Fedora packages remotely. Created an account on fedoraproject.org.

Commands saved for later reference:

fedpkg clone mesa

fedpkg --dist f25 build --scratch --srpm

But until my account on fedoraproject.org is fully activated, I have to build with mock.
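The local equivalent looks roughly like this; the chroot name is the stock Fedora 25 mock config, and the SRPM filename is hypothetical:

```shell
srpm=mesa-17.0.0-1.fc25.src.rpm   # hypothetical SRPM name

# fedpkg srpm                         # produce the SRPM from the cloned package
# mock -r fedora-25-x86_64 --rebuild "$srpm"

echo "would rebuild $srpm in fedora-25-x86_64"
```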

Mesa investigation

Restarted the investigation in a different VM on Big that reproduces the same problem. I began building directly from the NFS server, but it's really too slow, so I'm copying the data locally.

Taking options straight from Fedora

Looking into the Mesa configuration options from Fedora. There are a few I did not investigate.

  • Added the --enable-selinux option, assuming it might be a permission issue. Still falling back to llvmpipe.
  • Added all the options from Fedora. Now this connects correctly, and exhibits the same issues as the installed package. Good, I have a reproducer.

The configuration that worked is:

./configure \
    --prefix=/usr \
    --enable-libglvnd \
    --enable-selinux \
    --enable-gallium-osmesa \
    --with-dri-driverdir=/usr/lib64/dri \
    --enable-gl \
    --disable-gles1 \
    --enable-gles2 \
    --disable-xvmc \
    --with-egl-platforms=drm,x11,surfaceless,wayland \
    --enable-shared-glapi \
    --enable-gbm \
    --enable-glx-tls \
    --enable-texture-float=yes \
    --enable-gallium-llvm \
    --enable-llvm-shared-libs \
    --enable-dri \
    --with-gallium-drivers=i915,nouveau,r300,svga,swrast,virgl \
    --with-dri-drivers=swrast,nouveau

Not sure which option exactly "makes it work".
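One way to narrow it down, assuming the rebuild cycle is scriptable: drop one Fedora option at a time from the working configure line and retest. A minimal sketch, with the option list abbreviated and the actual configure/make step left commented:

```shell
# Options copied from the working configure line (abbreviated here).
candidates="--enable-libglvnd --enable-selinux --enable-shared-glapi --enable-glx-tls"

for opt in $candidates; do
    # All candidates except the one under test.
    rest=$(printf '%s\n' $candidates | grep -vxF -- "$opt" | tr '\n' ' ')
    echo "testing without $opt"
    # ./configure --prefix=/usr $rest && make -j"$(nproc)"
    # ...then rerun the app in the guest and see whether virgl still works
done
```

With ~20 options this is a long cycle; a binary search over the option list (drop half at a time) would converge faster if the options are independent.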

Mesa 17.0.1-devel from master is fixed

Building again from master, c0e9e61c9a1. Now I have virgl activated, and it works in the guest too. So the problem has already been fixed.
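The rebuild step, sketched; the repository URL and the autogen.sh invocation are assumptions (Mesa used autotools at the time), so everything but the commit id is illustrative:

```shell
commit=c0e9e61c9a1   # the commit from master mentioned above

# Repository URL and build steps are assumptions:
# git clone git://anongit.freedesktop.org/mesa/mesa && cd mesa
# git checkout "$commit"
# ./autogen.sh --prefix=/usr --with-gallium-drivers=virgl,swrast
# make -j"$(nproc)"

echo "building Mesa at $commit"
```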

Tao3D at full acceleration

Tao3D works fine in this context, and my little test runs at 175 FPS. For some reason, the vsync option does not work (I would have expected it to cap at 60 FPS with that option). It's visually very smooth. And this time, it also works full screen!
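Mesa's DRI drivers read a vblank_mode environment variable, so if the vsync setting is being ignored at the driver level, forcing it should make the difference visible. A quick check, with glxgears as a stand-in for the Tao3D test (whether virgl in the guest honors it is an open question):

```shell
mode=3   # Mesa DRI convention: 0 disables vsync, 3 forces it on

# vblank_mode=$mode glxgears    # should cap near 60 FPS if vsync is honored
# vblank_mode=0 glxgears        # uncapped, for comparison

echo "vblank_mode=$mode"
```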