
Re: Scheduler Wait (VM job unmanageable, restarting later).

Posted: 16.04.2016 16:50
by MLx
Still no news on this? I suspect this is a VirtualBox issue...

Interestingly, the tasks continue computing when the BOINC client is restarted, so slowly but surely, I'm chipping away at the WU. However, it seems that because there are no snapshots, there are no trickles, and thus the deadline does not get extended as it should.

Should I let these tasks finish, or do I just scrap them?

http://www.rnaworld.de/rnaworld/result. ... d=14951245
http://www.rnaworld.de/rnaworld/result. ... d=14951335

Re: Scheduler Wait (VM job unmanageable, restarting later).

Posted: 16.04.2016 17:39
by ChristianB
If vboxwrapper is not running we will not get the results. You can try to shut down BOINC and then manually "pause" the VM using the VirtualBox Manager. Make sure you really pause them. Then reboot and start BOINC again. This could make the VBox manageable by vboxwrapper again.
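
The pause step can also be done from a command line instead of the VirtualBox Manager window. This is a minimal sketch, assuming `VBoxManage` is on the PATH and is run as the same user that owns the BOINC VMs. The helper names `build_pause_command` and `pause_vm` are made up for illustration, and the VM name has to be looked up first (for example with `VBoxManage list runningvms`).

```python
import subprocess
from typing import List


def build_pause_command(vm_name: str) -> List[str]:
    """Return the VBoxManage invocation that pauses a running VM."""
    if not vm_name:
        raise ValueError("vm_name must not be empty")
    # "controlvm <name> pause" is the VBoxManage subcommand for pausing a VM.
    return ["VBoxManage", "controlvm", vm_name, "pause"]


def pause_vm(vm_name: str) -> None:
    """Pause the VM; raises CalledProcessError if VBoxManage reports failure."""
    subprocess.run(build_pause_command(vm_name), check=True)
```

Calling `pause_vm("<your VM name>")` after BOINC has been shut down would correspond to the manual pause described above. Whether this actually makes the VM manageable again is not guaranteed.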

Re: Scheduler Wait (VM job unmanageable, restarting later).

Posted: 16.04.2016 19:06
by MLx
I'm not sure how I can do that: when the error (posted here) happens, both VBoxSvc and vboxwrapper quit.
After restarting BOINC, the task starts computing again, and after about 5 seconds the progress bar continues where it left off. After about 30 minutes the error happens, and I suppose VBoxSvc suspends the machine and quits.

Should I pause it while vboxwrapper is running?

Re: Scheduler Wait (VM job unmanageable, restarting later).

Posted: 17.04.2016 08:36
by ChristianB
Then it seems that there is a problem with VirtualBox that doesn't clear itself with a restart of the VM. I was under the impression that the VM was running without vboxwrapper. If the VM fails repeatedly, you should abort the task.

Trickles are only sent after 4 hours of computing. I don't know if this counter is reset with every error you see.

VirtualBox itself is not as stable as we would like, but overall it's far better than the previous solution.

Re: Scheduler Wait (VM job unmanageable, restarting later).

Posted: 24.04.2016 16:26
by MLx
Upgraded VBox to 5.0.18, and the issue seems resolved - the WUs have run for ~80 minutes now without stopping.
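
For anyone wanting to check whether their installation is at least at this version, here is a rough sketch. It assumes that `VBoxManage --version` prints a string starting with the dotted version number (something like `5.0.18r106667`); that output format and the helper names `parse_vbox_version` and `meets_minimum` are assumptions for illustration, not something confirmed in this thread.

```python
import re
from typing import Tuple

# Version reported above as making the issue go away.
MINIMUM = (5, 0, 18)


def parse_vbox_version(text: str) -> Tuple[int, ...]:
    """Extract the leading major.minor.patch numbers from a version string."""
    match = re.match(r"\s*(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError("unrecognised version string: %r" % text)
    return tuple(int(part) for part in match.groups())


def meets_minimum(text: str) -> bool:
    """True if the version string is at least MINIMUM (tuple comparison)."""
    return parse_vbox_version(text) >= MINIMUM
```

The string passed in would be whatever `VBoxManage --version` prints on your machine; anything after the third number (such as a revision suffix) is ignored.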

Re: Scheduler Wait (VM job unmanageable, restarting later).

Posted: 24.04.2016 23:25
by Jacob Klein
:) I am directly responsible for getting them to fix a 5.x snapshot-creation bug, in 5.0.18.
Perhaps it fixed your issue! :) I'll surely take credit.
https://www.virtualbox.org/ticket/15206

Good luck!
- Jacob

Re: Scheduler Wait (VM job unmanageable, restarting later).

Posted: 25.04.2016 09:33
by Michael H.W. Weber
Nice. :D

Michael.

Re: Scheduler Wait (VM job unmanageable, restarting later).

Posted: 26.04.2016 15:49
by MLx
Hm, got the same state in BOINC manager again, but this time it seems to be caused by something else. These are the last lines in stderr.txt:

Code:

2016-04-26 15:58:14 (90834): Creating new snapshot for VM.
2016-04-26 15:58:18 (90834): Deleting stale snapshot.
2016-04-26 15:58:19 (90834): Checkpoint completed.
sh: line 1: 63854 Segmentation fault: 11  VBoxManage -q showvminfo "boinc_a1bd11c312417537" --machinereadable 2>&1
2016-04-26 16:26:33 (90834): ERROR: Vboxwrapper lost communication with VirtualBox, rescheduling task for a later time.
2016-04-26 16:26:33 (90834): Powering off VM.
2016-04-26 16:26:34 (90834): Successfully stopped VM.
Only 1 of 2 tasks failed this way so far (after >30 hours of computation). After a BOINC service restart, the failed task seems to continue where it left off (or at the last successful checkpoint).
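
To see how much work sits between the last checkpoint and such a failure, the timestamps in stderr.txt can be compared. This is only a sketch based on the line format visible in the excerpt above (timestamp, process ID in parentheses, message); the function names are made up for illustration, and other vboxwrapper versions may log differently.

```python
import re
from datetime import datetime
from typing import List, Optional, Tuple

# Matches lines like "2016-04-26 15:58:19 (90834): Checkpoint completed."
LINE_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \((\d+)\): (.*)$")


def parse_line(line: str) -> Optional[Tuple[datetime, str]]:
    """Return (timestamp, message), or None for lines without a timestamp."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
    return stamp, match.group(3)


def seconds_from_checkpoint_to_failure(lines: List[str]) -> Optional[float]:
    """Seconds between the last checkpoint and the first lost-communication error."""
    last_checkpoint = None
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue  # e.g. the shell's segfault report carries no timestamp
        stamp, message = parsed
        if message.startswith("Checkpoint completed"):
            last_checkpoint = stamp
        elif "lost communication with VirtualBox" in message and last_checkpoint is not None:
            return (stamp - last_checkpoint).total_seconds()
    return None
```

For the excerpt above this gives 1694 seconds, so roughly 28 minutes passed between the last completed checkpoint and the error.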

Re: Scheduler Wait (VM job unmanageable, restarting later).

Posted: 26.04.2016 18:38
by ChristianB
I've seen some segfaults in the logs in the past. They are strange, but they are outside of our reach and hard to reproduce.

Re: Scheduler Wait (VM job unmanageable, restarting later).

Posted: 29.07.2020 21:59
by joeybuddy96
Six years after the status "Postponed: VM job unmanageable, restarting later" was first posted in this forum, it's still around. I've seen it with QuChem and TACC, and if online search results are any indication, it also exists for Cosmology, LHC, and NanoHub. I think the way I was able to get the tasks to run in the past was to completely finish all non-VBox CPU WUs and set those projects to not accept new tasks. In my case, that means finishing about five dozen Rosetta WUs and then sitting around running at less than 15% CPU use (since I've got one RNA task) for however many real-world days it takes to finish the WU. I'm all for the goals of projects that use VBox, but I'd like to run the tasks concurrently with non-VBox ones and also use my PC's full resources. Any better ideas on how to force VBox tasks out of postponement?

Re: Scheduler Wait (VM job unmanageable, restarting later).

Posted: 30.07.2020 10:43
by Michael H.W. Weber
joeybuddy96 wrote:
29.07.2020 21:59
Any better ideas on how to force VBox tasks out of postponement?
I have seen this issue only very rarely, and unfortunately I do not know how to resolve it.
I would just opt out of the VM tasks altogether if they keep causing problems on your machines.

Michael.