Help - I have an emergency

Hi all, if anyone has some knowledge of ZFS, please help. Below is a post I made in the Proxmox forum, as I have a Proxmox on ZFS that has a problem. It boots but then becomes unavailable and kernel panics. I desperately need help, as in 2.5 hours my coworkers appear and want to start working on the new systems I set up. And now my Proxmox VMs aren't available.

If anyone can help me prevent further damage, that would be great. I need to know what my next steps should be, ideally to be able to boot Proxmox and the VMs, or at least to save all my work from the last months by preserving / copying the VM disks and / or the Proxmox backups of the VMs. It would not be the end of the world to reinstall Proxmox if I can at least keep my VMs. Losing them would destroy a hell of a lot of work that I have done over the last months.

So thanks if anyone can guide my next steps:

@Elleni

Hi

Do you have a backup of these VM machines?
Not on Proxmox…

Andy

No, the whole weekend we migrated everything, and even if I had one it would not contain prod data, thus I need the content of this pool. I was in fact plugging in a disk to back everything up.

As the pool is showing online and is a mirror, I really hope I can recover at least the VM disks, but I don't know how…

If you installed Proxmox on separate disks from the data disks, you can just reinstall Proxmox and import the ZFS pool with “zpool import”. If you have data on the Proxmox disks, then you should keep the old disk as it is and add another one. Then install Proxmox on this separate disk, import the ZFS pool from the old disk under a new name with “zpool import rpool rpool-old”, and move the data to the new location.
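
A minimal sketch of that import step, assuming the old pool is still called rpool on disk and that /mnt/rpool-old is a free directory. The altroot path, the function name and the DRY_RUN switch are illustrative choices, not part of the advice above. By default it only prints the commands so they can be checked first:

```shell
#!/bin/sh
# Sketch: import the old boot pool under a new name on a rescue install.
# Assumed names: old pool "rpool", new name "rpool-old", altroot /mnt/rpool-old.

# Print every command; run it only when DRY_RUN=0 is set explicitly.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

import_old_pool() {
    old_name=$1
    new_name=$2
    altroot=$3
    # Without arguments, zpool import only lists pools found on the attached disks.
    run zpool import
    # -f: force, because the pool was last in use on another system
    # -N: import without mounting any dataset yet
    # -R: prefix all mountpoints with the altroot, so a dataset with
    #     mountpoint "/" cannot cover the running system's root
    run zpool import -f -N -R "$altroot" "$old_name" "$new_name"
    # Show what came in: filesystems and volumes (VM disks).
    run zfs list -r -t filesystem,volume -o name,used,mountpoint "$new_name"
}

# Usage after sourcing this file:
#   import_old_pool rpool rpool-old /mnt/rpool-old             # print only
#   DRY_RUN=0 import_old_pool rpool rpool-old /mnt/rpool-old   # really run
```

The altroot matters here because the old pool was a boot pool: its root dataset wants to mount at /, which is already taken on the rescue system.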

@carsten

Hi Carsten

I’m helping Elleni here with this (on the phone - speaking Swiss German is easier for him in this stressful moment…). Spending the weekend on two 16-hour stints to prepare for going live, and getting this at the end - all with new hardware! Very frustrating!

Besides which, I helped him three months ago, setting up his external “hosted” Proxmox, so we know each other also on phone… :slight_smile:

Unfortunately, this RAID1 mirror is the boot device AND VM Storage. And there are NO backups, as everything was migrated this weekend, including live data from the old system.

It seems that the USB disk used - intended for a one-off backup - had fried a new laptop's mainboard earlier, and maybe the Proxmox box (also new hardware) now has mainboard issues as well.

Elleni was able to organise a (smaller) SATA disk, and is now installing a separate Proxmox.
We intend to try what you suggested (had the same idea earlier…) and import it as a non-boot pool.

Thanks for any suggestions - ZFS is one of the things I don’t consider myself an expert on (yet)!
Still earning my spurs… :slight_smile:

Andy

OK. Anyway, ZFS is almost indestructible, so the data is not gone (whatever happened to the Proxmox). He should get a separate disk or a USB stick to install a new Proxmox on. Then import the old rpool with “zpool import rpool rpool-old”. You can transfer the VMs / LXCs with zfs send / zfs receive quite easily.
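
A rough sketch of such a transfer for a single VM disk, assuming a second ZFS pool exists to receive it. The dataset rpool-old/data/vm-100-disk-0, the target pool tank and the snapshot name rescue are made-up examples. It prints the commands unless DRY_RUN=0 is set:

```shell
#!/bin/sh
# Sketch: copy one VM disk off the old pool with zfs send / zfs receive.
# All dataset and pool names below are hypothetical examples.

send_vm_disk() {
    src=$1              # e.g. rpool-old/data/vm-100-disk-0
    dst=$2              # e.g. tank/vm-100-disk-0 (must not exist yet)
    snap="$src@rescue"  # zfs send works on snapshots, not on live datasets
    echo "+ zfs snapshot $snap"
    echo "+ zfs send $snap | zfs receive $dst"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        # Take the snapshot first, then stream it into the target pool.
        zfs snapshot "$snap" &&
            zfs send "$snap" | zfs receive "$dst"
    fi
}

# Usage after sourcing this file:
#   send_vm_disk rpool-old/data/vm-100-disk-0 tank/vm-100-disk-0
```

If there is no second pool, the stream can also be redirected into a file on another disk (zfs send … > file) and received later.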

He should NOT break the ZFS mirror: in case both hard disks have some problems, ZFS will try to retrieve a good copy of every block from either of the disks. As ZFS is fully checksummed, it always knows which block is OK and which is corrupted.
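
To see how the two mirror halves are actually doing, something along these lines could be used once the pool is imported. The pool name rpool-old and the SCRUB switch are assumptions for illustration, and the commands are only printed unless DRY_RUN=0 is set:

```shell
#!/bin/sh
# Sketch: look at the health of both mirror disks before copying data.

# Print every command; run it only when DRY_RUN=0 is set explicitly.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

check_mirror() {
    pool=$1
    # Per-disk state plus READ/WRITE/CKSUM error counters;
    # -v also lists files with errors that could not be repaired.
    run zpool status -v "$pool"
    # Optional and slow: read every block, verify its checksum and
    # repair it from the other disk where one copy is bad.
    if [ "${SCRUB:-0}" = "1" ]; then
        run zpool scrub "$pool"
    fi
}

# Usage after sourcing this file:
#   check_mirror rpool-old
#   SCRUB=1 DRY_RUN=0 check_mirror rpool-old
```

With hardware that may be failing, it is probably wiser to copy the data off first and scrub afterwards, since a scrub puts load on both disks.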

BTW: Is Elleni from Germany (where I am from) or from Switzerland?

Regards
Carsten

@carsten

OK, thanks for the tip.

I know that ZFS is practically indestructible and almost never loses data, unless there is really faulty hardware and no redundancy set up.
That’s also what I told him at 06:00 this morning! Not to worry about losing data. And both disks are mirrored, so they should be almost identical!

But boot AND data on the same disks… And the system not really booting up - and after mounting the pool, a kernel dump, and possibly now a faulty mainboard… Three months' work fine-tuning all the VMs possibly lost forever…

I think Elleni made a slight boo-boo: he imported the pool under the same name, making things worse. And no disk available in house (at the company where he’s employed…).

Luckily for him, there’s an IT company in house, and he knows the people there, so they helped him with a “loaned” disk. He’s right now installing a new Proxmox on that disk, on almost identical hardware (slightly less powerful CPU, the rest identical). Without ZFS as boot device. Our intention is to import the pool as soon as Proxmox is up and running.

Small issues - most due to overwork and not enough sleep - are also cropping up, like putting the wrong IP on the replacement box, so no network. But nothing insurmountable!

Elleni works in Zurich, and lives about a 30 km drive away… (Switzerland!)

My 2 cents
Andy

I would also suggest using ZFS as the boot device, because this guards the system against some faulty hardware, and it is the recommended and most widely used standard setup. The boot problems have nothing to do with ZFS.

Importing the pool with the same name is not possible, because of the name conflict and therefore cannot do any harm (it simply fails).

That’s what I would have thought, but after the import he had a kernel dump, not before that…

@carsten

Hi Carsten

Seems to be a real “Murphy” case (NICs, disks, and much more), but the replacement system is running and zpool import went through without errors…

But how do we now make the disks (VM data) available for actual use or for backup to another system?

Can Elleni contact you directly?

Andy

Thanks all. Status update:

The pool was imported with zpool import -f under the name rpool-old. Then under Datacenter / Storage I now have: rpool-old/data, local-zfs, pve-1 and root zfs. But for all of these there are only two content types available: Disk image and Container. What I am still missing are the other content types, so that I can restore my VM configs, vzdump backup files and everything. Thanks all for helping me save my last 6 months of work. GREATLY appreciated :blush:

Can anyone tell me where in my ZFS pool that location is and how I can mount it? My goal is to create a vzdump backup of all these VMs so I have a single file per VM that I can copy somewhere else, to restore it on a newly set up Proxmox server.

While creating an image on my newly installed Proxmox and backing it up, I realized that I am still missing one storage named local. The one that is normally mounted at /var/lib/vz, where the subfolder dump contains the dump files.

My goal would be to create a vzdump file of each VM in the pool, to just have a simple file that I can transfer somewhere as a backup, or is that not possible? What are my next steps to get the VMs including their configs out of my ZFS pool to an external storage? And btw. I have two 500 GB USB sticks. Would love to hear how to mount them correctly so I can put the backup of rpool-old on them.
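
For reference, a hedged sketch of how one stick could be mounted and used as a vzdump target. The partition /dev/sdX1, the mount point /mnt/usb1, the storage ID usb1 and VMID 100 are placeholders. It assumes the stick carries a filesystem that allows files larger than 4 GB (so not FAT32), and that the VM already exists in the Proxmox configuration, since vzdump works on VMIDs. Commands are only printed unless DRY_RUN=0 is set:

```shell
#!/bin/sh
# Sketch: mount a USB stick, register it as backup storage, dump one VM.
# Device, mount point, storage ID and VMID are hypothetical placeholders.

# Print every command; run it only when DRY_RUN=0 is set explicitly.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

backup_vm_to_stick() {
    part=$1      # partition on the stick, check with lsblk first
    mnt=$2       # where to mount it
    storage=$3   # Proxmox storage ID to create
    vmid=$4      # VM to back up
    run lsblk -o NAME,SIZE,FSTYPE,MOUNTPOINT
    run mkdir -p "$mnt"
    run mount "$part" "$mnt"
    # Register the mounted stick as a directory storage for backups.
    run pvesm add dir "$storage" --path "$mnt" --content backup
    # One archive per VM; stop mode shuts the guest down for a consistent copy.
    run vzdump "$vmid" --storage "$storage" --mode stop --compress gzip
}

# Usage after sourcing this file:
#   backup_vm_to_stick /dev/sdX1 /mnt/usb1 usb1 100
```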

Also… Before you start to edit anything else… backup, dude. It’s friendly advice. You never know… :wink:

If you want to keep the old disks as data disks, reimport the pool, for example as vm-data. Then make it available in Proxmox as a ZFS storage. With zfs list you should see the ZFS datasets which represent the VM data. Restoring the VM configs is a little more complicated, so I suggest you just recreate them manually. Then you can move the datasets into the correct ZFS hierarchy with “zfs rename”, so that you should be able to use them in the Proxmox configuration.
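
As a sketch, those steps could look roughly like this. It assumes the pool is currently imported as rpool-old, that the VM disks live under <pool>/data as in a default Proxmox ZFS install, and that no storage entry is still using the pool when it is exported. The disk names in the rename example are invented. Commands are only printed unless DRY_RUN=0 is set:

```shell
#!/bin/sh
# Sketch: reimport the old pool as "vm-data" and offer it to Proxmox.
# Pool, storage and dataset names are assumptions for illustration.

# Print every command; run it only when DRY_RUN=0 is set explicitly.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

reuse_pool_as_vm_storage() {
    old=$1   # current pool name, e.g. rpool-old
    new=$2   # new pool name, e.g. vm-data
    # A pool is renamed by exporting it and importing it under the new name.
    run zpool export "$old"
    run zpool import "$old" "$new"
    # VM disks show up as volumes named vm-<vmid>-disk-<n>.
    run zfs list -r "$new/data"
    # Register the dataset as a ZFS storage for VM and container disks.
    run pvesm add zfspool "$new" --pool "$new/data" --content images,rootdir
}

rename_vm_disk() {
    # Move or rename a dataset inside the pool, e.g. to match a new VMID.
    run zfs rename "$1" "$2"
}

# Usage after sourcing this file:
#   reuse_pool_as_vm_storage rpool-old vm-data
#   rename_vm_disk vm-data/data/vm-100-disk-0 vm-data/data/vm-200-disk-0
```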

Have a look at the Proxmox configuration files in /etc/pve/nodes. You can edit them by hand and the UI will catch up.
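
Instead of writing such a file by hand, the qm tool can create it. A minimal sketch, assuming VMID 100, a storage called vm-data and a disk named vm-100-disk-0 (all placeholders, as are the memory size and the bridge vmbr0). The resulting file ends up under /etc/pve/nodes/<node>/qemu-server/<vmid>.conf. Commands are only printed unless DRY_RUN=0 is set:

```shell
#!/bin/sh
# Sketch: recreate a VM config and attach an existing disk to it.
# VMID, storage ID, memory size and bridge name are hypothetical values.

# Print every command; run it only when DRY_RUN=0 is set explicitly.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

reattach_vm_disk() {
    vmid=$1
    storage=$2
    # Create an empty VM; this writes the <vmid>.conf file.
    run qm create "$vmid" --name "restored-$vmid" --memory 2048 --net0 virtio,bridge=vmbr0
    # Scan the storages for volumes belonging to this VMID;
    # they are added to the config as unused disks.
    run qm rescan --vmid "$vmid"
    # Attach the first disk as scsi0.
    run qm set "$vmid" --scsi0 "$storage:vm-$vmid-disk-0"
    # Print the resulting configuration for review.
    run qm config "$vmid"
}

# Usage after sourcing this file:
#   reattach_vm_disk 100 vm-data
```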

@pike

Hi Michael

That’s what I asked first!

The problem started in the night, after all the data had been migrated from the old server to the VMs and Elleni plugged in a (new) USB disk to take backups…

Apparently that USB disk fired a new notebook's mainboard too… :frowning:

Sh*t happens!

Andy

Fired or burned? I don’t know if the USB drive is the company’s boss… :wink:

fried / burned

In the newly installed temporary Proxmox (on ext4) I have attached one of the two mirror disks, and together with Andy we have imported the rpool as rpool-old. So I can see the VM disk images in data. Under Datacenter / Storage I added everything I could find from rpool-old, meaning: data, local-zfs, pve1 and root-zfs. The missing bit seems to be local. These are all the ones listed by zpool list.

The space on the SSD which holds the new install is limited to 250 GB, so I will need to see how to recreate them on the temporary Proxmox installation. I have two 500 GB USB sticks though, which I can mount in order to have more space.

I need to know step by step how to do this, VM by VM, so as not to do something wrong, but thanks for your post, which helps me get the idea :+1:

  1. Recreate a VM in the new NethServer: OK, but I am still trying to understand how moving the datasets with zfs rename would work. From my understanding, I create a new VM not in the ZFS pool but on the local ext4 storage, right, and then? The idea is to end up with the VM on my temp install so that I can create vzdump files that will be saved to a stick. That way, once I have backed up all VMs, I reinstall Proxmox with a mirrored RAID, copy the vzdump files to it, restore them, and I am done. I hope my understanding is correct. As this data is crucial I want to avoid trial and error, so I am being very careful before actually doing it.
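
The last part of that plan, restoring the archives on the reinstalled server, might look like the sketch below. It assumes the archives follow the usual vzdump naming (vzdump-qemu-<vmid>-<date>-<time>.vma, optionally compressed) and takes the VMID from the file name. The directory and the target storage local-zfs are examples, and containers (vzdump-lxc-… archives) are not handled. Commands are only printed unless DRY_RUN=0 is set:

```shell
#!/bin/sh
# Sketch: restore every QEMU vzdump archive found in one directory.
# Directory and target storage are hypothetical examples.

# Print every command; run it only when DRY_RUN=0 is set explicitly.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

restore_all() {
    dir=$1       # where the archives were copied to
    storage=$2   # storage that should hold the restored disks
    for archive in "$dir"/vzdump-qemu-*.vma*; do
        [ -e "$archive" ] || continue      # glob matched nothing
        name=${archive##*/}                # file name without directory
        case $name in
            *.vma|*.vma.gz|*.vma.lzo|*.vma.zst) ;;
            *) continue ;;                 # skip logs, notes and the like
        esac
        rest=${name#vzdump-qemu-}          # "<vmid>-<date>-<time>.vma..."
        vmid=${rest%%-*}                   # keep only the VMID
        run qmrestore "$archive" "$vmid" --storage "$storage"
    done
}

# Usage after sourcing this file:
#   restore_all /mnt/usb1/dump local-zfs
```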

Still trying to understand where in the pool /var/lib/vz and all the other PVE configurations are, and how to mount them so they are available on my temp install.

The filesystem root (/) is in the rpool/ROOT/pve-1 dataset on my (ZFS) Proxmox servers. From wherever you have rpool-old mounted, you should be able to change to ROOT/pve-1/var/lib/vz/. The contents of that directory should be whatever is on the local storage.
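
If the root dataset is not mounted anywhere yet, one way to get at it is a one-off read-only mount on a spare directory. A minimal sketch, assuming the dataset is rpool-old/ROOT/pve-1 and using /mnt/old-root as an arbitrary mount point. Commands are only printed unless DRY_RUN=0 is set:

```shell
#!/bin/sh
# Sketch: mount the old root dataset read-only on a spare directory.
# Dataset name and target directory are assumptions for illustration.

# Print every command; run it only when DRY_RUN=0 is set explicitly.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

mount_old_root() {
    dataset=$1   # e.g. rpool-old/ROOT/pve-1
    target=$2    # e.g. /mnt/old-root
    # Check where the dataset wants to mount and whether it already is mounted.
    run zfs get -o name,property,value mountpoint,canmount,mounted "$dataset"
    run mkdir -p "$target"
    # zfsutil is needed because the mountpoint property is not "legacy";
    # ro keeps the old root filesystem untouched.
    run mount -t zfs -o zfsutil,ro "$dataset" "$target"
    # The old "local" storage, including any vzdump files.
    run ls -lh "$target/var/lib/vz/dump"
}

# Usage after sourcing this file:
#   mount_old_root rpool-old/ROOT/pve-1 /mnt/old-root
```

This leaves the dataset's own mountpoint property (/) unchanged, so nothing is written to the old pool just to look inside it.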

I see, so instead of adding root-zfs to Datacenter/storage I need to mount it somewhere on my local filesystem? What is the correct mount command? mount <?> /mnt/mountpoint?

zfs list -r rpool-old shows that the mount point of rpool-old is /rpool-old and that of rpool-old/ROOT/pve-1 is /

I don't understand how to mount a pool like rpool-old to a local folder to access its contents, I am still confused :slight_smile:

After having a lengthy remote session I just wanted to ask if everything is repaired and in service again?
