Re: Scheduler Wait (VM job unmanageable, restarting later).

Posted: 16.04.2016 16:50
by MLx
Still no news on this? I suspect this is a VirtualBox issue...

Interestingly, the tasks continue computing when the BOINC client is restarted, so slowly but surely, I'm chipping away at the WU. However, it seems that because there are no snapshots, there are no trickles, and thus the deadline does not get extended as it should.

Should I let these tasks finish, or should I just scrap them?

http://www.rnaworld.de/rnaworld/result. ... d=14951245
http://www.rnaworld.de/rnaworld/result. ... d=14951335

Re: Scheduler Wait (VM job unmanageable, restarting later).

Posted: 16.04.2016 17:39
by ChristianB
If vboxwrapper is not running, we will not get the results. You can try to shut down BOINC and then manually "pause" the VMs using the VirtualBox Manager. Make sure they are really paused. Then reboot and start BOINC again. This could make the VMs manageable by vboxwrapper again.
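
The same pause step can probably also be done from the command line instead of the VirtualBox Manager GUI. The following is only a minimal sketch: it relies on the `VBoxManage list runningvms` and `VBoxManage controlvm <name> pause` subcommands, and it assumes that BOINC-created VMs have names starting with `boinc_`. The function names are hypothetical helpers, not part of BOINC or vboxwrapper.

```python
import re
import subprocess
from typing import List

# Assumption: VMs created by BOINC have names starting with "boinc_".
BOINC_PREFIX = "boinc_"


def parse_running_vms(listing: str) -> List[str]:
    """Extract VM names from `VBoxManage list runningvms` output.

    Each line is expected to look like: "vm name" {uuid}
    """
    pattern = r'^"(.+)" \{[0-9a-fA-F-]+\}[ \t]*$'
    return re.findall(pattern, listing, flags=re.MULTILINE)


def pause_command(vm_name: str) -> List[str]:
    """Build the VBoxManage command that pauses a single VM."""
    return ["VBoxManage", "controlvm", vm_name, "pause"]


def pause_boinc_vms() -> None:
    """Pause every running VM whose name carries the assumed BOINC prefix."""
    listing = subprocess.run(
        ["VBoxManage", "list", "runningvms"],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    for name in parse_running_vms(listing):
        if name.startswith(BOINC_PREFIX):
            subprocess.run(pause_command(name), check=True)
```

Calling `pause_boinc_vms()` after shutting down BOINC would pause any BOINC VM that is still running. It is worth checking the result in the VirtualBox Manager afterwards.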

Re: Scheduler Wait (VM job unmanageable, restarting later).

Posted: 16.04.2016 19:06
by MLx
I'm not sure how I can do that - when the error (posted here) happens, both VBoxSvc and vboxwrapper quit.
After restarting BOINC, the task starts computing again; after about 5 seconds the progress bar continues where it left off. After about 30 minutes, the error happens, and I suppose VBoxSvc suspends the machine and quits.

Should I pause it while vboxwrapper is running?

Re: Scheduler Wait (VM job unmanageable, restarting later).

Posted: 17.04.2016 08:36
by ChristianB
Then it seems that there is a problem with VirtualBox that doesn't clear itself with a restart of the VM. I was under the impression that the VM was still running without vboxwrapper. If the VM fails repeatedly, you should abort the task.

Trickles are only sent after 4 hours of computing. I don't know if this counter is reset with every error you see.

VirtualBox itself is not as stable as we would like, but overall it's far better than the previous solution.

Re: Scheduler Wait (VM job unmanageable, restarting later).

Posted: 24.04.2016 16:26
by MLx
Upgraded VBox to 5.0.18, and the issue seems resolved - the WUs have run for ~80 minutes now without stopping.

Re: Scheduler Wait (VM job unmanageable, restarting later).

Posted: 24.04.2016 23:25
by Jacob Klein
:) I am directly responsible for getting them to fix a 5.x snapshot-creation bug, in 5.0.18.
Perhaps it fixed your issue! :) I'll surely take credit.
https://www.virtualbox.org/ticket/15206

Good luck!
- Jacob

Re: Scheduler Wait (VM job unmanageable, restarting later).

Posted: 25.04.2016 09:33
by Michael H.W. Weber
Nice. :D

Michael.

Re: Scheduler Wait (VM job unmanageable, restarting later).

Posted: 26.04.2016 15:49
by MLx
Hm, got the same state in BOINC Manager again, but this time it seems to be caused by something else. These are the last lines in stderr.txt:

Code:

2016-04-26 15:58:14 (90834): Creating new snapshot for VM.
2016-04-26 15:58:18 (90834): Deleting stale snapshot.
2016-04-26 15:58:19 (90834): Checkpoint completed.
sh: line 1: 63854 Segmentation fault: 11  VBoxManage -q showvminfo "boinc_a1bd11c312417537" --machinereadable 2>&1
2016-04-26 16:26:33 (90834): ERROR: Vboxwrapper lost communication with VirtualBox, rescheduling task for a later time.
2016-04-26 16:26:33 (90834): Powering off VM.
2016-04-26 16:26:34 (90834): Successfully stopped VM.

Only 1 of 2 tasks has failed this way so far (after >30 hours of computation). After a BOINC service restart, the failed task seems to continue where it left off (or at the last successful checkpoint).
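
To get a rough idea of how much computation is at risk each time this happens, the gap between the last completed checkpoint and the "lost communication" error can be read from stderr.txt. The following is a minimal sketch that assumes the `YYYY-MM-DD HH:MM:SS (pid): message` line format seen in the excerpt above. The function names are hypothetical, and lines that do not match (such as the shell's segfault message) are skipped.

```python
import re
from datetime import datetime, timedelta
from typing import Iterable, Optional, Tuple

# Matches lines such as: 2016-04-26 15:58:19 (90834): Checkpoint completed.
LINE_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \((\d+)\): (.*)$")


def parse_line(line: str) -> Optional[Tuple[datetime, str]]:
    """Return (timestamp, message) for a vboxwrapper log line, else None."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
    return stamp, match.group(3)


def time_since_checkpoint(lines: Iterable[str]) -> Optional[timedelta]:
    """Time between the last checkpoint and the first communication error.

    Returns None if no error is found or no checkpoint precedes it.
    """
    last_checkpoint = None
    for raw in lines:
        parsed = parse_line(raw.rstrip("\n"))
        if parsed is None:
            continue  # e.g. the "Segmentation fault" line printed by sh
        stamp, message = parsed
        if message.startswith("Checkpoint completed"):
            last_checkpoint = stamp
        elif "lost communication with VirtualBox" in message:
            if last_checkpoint is None:
                return None
            return stamp - last_checkpoint
    return None
```

For the excerpt above, the last checkpoint is at 15:58:19 and the error at 16:26:33, so the function returns a gap of 28 minutes and 14 seconds.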

Re: Scheduler Wait (VM job unmanageable, restarting later).

Posted: 26.04.2016 18:38
by ChristianB
I've seen some segfaults in the logs in the past. They are strange, but outside our reach and hard to reproduce.