Scheduler Wait (VM job unmanageable, restarting later).

Everything about the project RNA World
MLx
Mikrocruncher
Posts: 23
Registered: 21.05.2011 17:49

Re: Scheduler Wait (VM job unmanageable, restarting later).

#37 Post by MLx » 16.04.2016 16:50

Still no news on this? I suspect this is a VirtualBox issue...

Interestingly, the tasks continue computing when the BOINC client is restarted, so slowly but surely, I'm chipping away at the WU. However, it seems that because there are no snapshots, there are no trickles, and thus the deadline does not get extended as it should.

Should I let these tasks finish, or should I just scrap them?

http://www.rnaworld.de/rnaworld/result. ... d=14951245
http://www.rnaworld.de/rnaworld/result. ... d=14951335

ChristianB
Admin
Posts: 1920
Registered: 23.02.2010 22:12

Re: Scheduler Wait (VM job unmanageable, restarting later).

#38 Post by ChristianB » 16.04.2016 17:39

If vboxwrapper is not running, we will not get the results. You can try to shut down BOINC and then manually "pause" the VM using the VirtualBox Manager. Make sure the VM is really paused. Then reboot and start BOINC again. This could make the VM manageable by vboxwrapper again.
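
The pause step can also be done from the command line instead of the VirtualBox Manager window. The following is only a minimal sketch, assuming `VBoxManage` is on the PATH, that BOINC has already been shut down, and that the VMs are visible to the user running the script. `VBoxManage list runningvms` and `VBoxManage controlvm <vm> pause` are standard VirtualBox commands; the helper names `parse_running_vms` and `pause_all_running` are made up for illustration.

```python
import re
import subprocess

# Each line of `VBoxManage list runningvms` looks like: "vm name" {uuid}
VM_LINE = re.compile(r'^"(?P<name>.*)" \{(?P<uuid>[0-9a-fA-F-]+)\}$')


def parse_running_vms(output):
    """Return (name, uuid) pairs parsed from `VBoxManage list runningvms` output."""
    vms = []
    for line in output.splitlines():
        match = VM_LINE.match(line.strip())
        if match:
            vms.append((match.group("name"), match.group("uuid")))
    return vms


def pause_all_running(vboxmanage="VBoxManage"):
    """Pause every running VM; run this only after BOINC has been shut down."""
    listing = subprocess.run(
        [vboxmanage, "list", "runningvms"],
        capture_output=True, text=True, check=True,
    ).stdout
    for name, uuid in parse_running_vms(listing):
        # Address the VM by UUID so names with spaces need no quoting.
        subprocess.run([vboxmanage, "controlvm", uuid, "pause"], check=True)
        print(f"paused {name}")
```

Calling `pause_all_running()` pauses whatever VMs are still running; re-running `VBoxManage list runningvms` afterwards is one way to double-check which VMs VirtualBox still lists before rebooting.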

MLx
Mikrocruncher
Posts: 23
Registered: 21.05.2011 17:49

Re: Scheduler Wait (VM job unmanageable, restarting later).

#39 Post by MLx » 16.04.2016 19:06

I'm not sure how I can do that - when the error (posted here) happens, both VBoxSvc and vboxwrapper quit.
After restarting BOINC, the task starts computing again, and after about 5 seconds the progress bar continues where it left off. After about 30 minutes, the error happens, and I suppose VBoxSvc suspends the machine and quits.

Should I pause it while vboxwrapper is running?

ChristianB
Admin
Posts: 1920
Registered: 23.02.2010 22:12

Re: Scheduler Wait (VM job unmanageable, restarting later).

#40 Post by ChristianB » 17.04.2016 08:36

Then it seems that there is a problem with VirtualBox that doesn't clear itself with a restart of the VM. I was under the impression that the VM was running without vboxwrapper. If the VM fails repeatedly, you should abort the task.

Trickles are only sent after 4 hours of computing. I don't know if this counter is reset with every error you see.
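
Why the trickles might never arrive can be sketched with a little arithmetic. The snippet below is only an illustration under stated assumptions: the 4-hour threshold comes from this post, the roughly 30-minute run length between errors is what MLx reported, and whether the counter resets on each error is exactly the open question, so both cases are modelled. The function name `trickles_sent` is hypothetical and not part of BOINC or vboxwrapper.

```python
TRICKLE_INTERVAL_H = 4.0  # from the post: a trickle is sent after 4 hours of computing


def trickles_sent(run_lengths_h, counter_resets_on_error):
    """Count trickles for a series of runs, each ended by an error and restart."""
    sent = 0
    counter = 0.0
    for run in run_lengths_h:
        counter += run
        while counter >= TRICKLE_INTERVAL_H:
            sent += 1
            counter -= TRICKLE_INTERVAL_H
        if counter_resets_on_error:
            counter = 0.0  # assumed behaviour, not confirmed by the project
    return sent


runs = [0.5] * 16  # sixteen 30-minute runs = 8 hours of computing in total
print(trickles_sent(runs, counter_resets_on_error=False))  # 2
print(trickles_sent(runs, counter_resets_on_error=True))   # 0
```

If the counter does reset, a task that fails every 30 minutes would never reach the threshold, which would match the missing deadline extensions described earlier in the thread.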

VirtualBox itself is not as stable as we would like, but overall it's far better than the previous solution.

MLx
Mikrocruncher
Posts: 23
Registered: 21.05.2011 17:49

Re: Scheduler Wait (VM job unmanageable, restarting later).

#41 Post by MLx » 24.04.2016 16:26

Upgraded VBox to 5.0.18, and the issue seems resolved - the WUs have run for ~80 minutes now without stopping.
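
To check whether a host already runs VirtualBox 5.0.18 or newer, the version string can be compared programmatically. This is a minimal sketch, assuming `VBoxManage --version` prints something like `5.0.18r106667`; the exact suffix varies by build, so the sample strings below are illustrative only. `parse_vbox_version` and `installed_vbox_version` are hypothetical helper names, not part of any VirtualBox tooling.

```python
import re
import subprocess

FIXED_IN = (5, 0, 18)  # release after which the issue went away, per the post above


def parse_vbox_version(text):
    """Extract (major, minor, patch) from a string such as '5.0.18r106667'."""
    match = re.match(r"\s*(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"unrecognised version string: {text!r}")
    return tuple(int(part) for part in match.groups())


def installed_vbox_version(vboxmanage="VBoxManage"):
    """Ask the local VBoxManage binary for its version."""
    result = subprocess.run(
        [vboxmanage, "--version"], capture_output=True, text=True, check=True
    )
    return parse_vbox_version(result.stdout)


# Tuples compare element by element, so this orders versions correctly.
print(parse_vbox_version("5.0.16r105871") >= FIXED_IN)  # False
print(parse_vbox_version("5.0.18r106667") >= FIXED_IN)  # True
```

On a machine with VirtualBox installed, `installed_vbox_version() >= FIXED_IN` gives the same comparison for the local installation.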

Jacob Klein
Brain-Bug
Posts: 564
Registered: 26.07.2013 15:41

Re: Scheduler Wait (VM job unmanageable, restarting later).

#42 Post by Jacob Klein » 24.04.2016 23:25

:) I am directly responsible for getting them to fix a 5.x snapshot-creation bug, in 5.0.18.
Perhaps it fixed your issue! :) I'll surely take credit.
https://www.virtualbox.org/ticket/15206

Good luck!
- Jacob

Michael H.W. Weber
Vereinsvorstand
Posts: 22414
Registered: 07.01.2002 01:00
Location: Marpurk

Re: Scheduler Wait (VM job unmanageable, restarting later).

#43 Post by Michael H.W. Weber » 25.04.2016 09:33

Nice. :D

Michael.
Promote, cooperate and construct instead of demand, compete and consume.


MLx
Mikrocruncher
Posts: 23
Registered: 21.05.2011 17:49

Re: Scheduler Wait (VM job unmanageable, restarting later).

#44 Post by MLx » 26.04.2016 15:49

Hm, got the same state in BOINC Manager again, but this time it seems to be caused by something else. These are the last lines in stderr.txt:

Code:

2016-04-26 15:58:14 (90834): Creating new snapshot for VM.
2016-04-26 15:58:18 (90834): Deleting stale snapshot.
2016-04-26 15:58:19 (90834): Checkpoint completed.
sh: line 1: 63854 Segmentation fault: 11  VBoxManage -q showvminfo "boinc_a1bd11c312417537" --machinereadable 2>&1
2016-04-26 16:26:33 (90834): ERROR: Vboxwrapper lost communication with VirtualBox, rescheduling task for a later time.
2016-04-26 16:26:33 (90834): Powering off VM.
2016-04-26 16:26:34 (90834): Successfully stopped VM.

Only 1 of 2 tasks has failed this way so far (after >30 hours of computation). After a BOINC service restart, the failed task seems to continue where it left off (or at the last successful checkpoint).
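
The log lines above can also be inspected mechanically, for example to see how much time passed between the last completed checkpoint and the lost-communication error. This is a rough sketch, assuming the `YYYY-MM-DD HH:MM:SS (pid): message` layout seen in this excerpt holds for all timestamped vboxwrapper lines; the function names are invented for illustration.

```python
import re
from datetime import datetime

LOG_LINE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \((\d+)\): (.*)$")

STDERR_EXCERPT = """\
2016-04-26 15:58:14 (90834): Creating new snapshot for VM.
2016-04-26 15:58:18 (90834): Deleting stale snapshot.
2016-04-26 15:58:19 (90834): Checkpoint completed.
sh: line 1: 63854 Segmentation fault: 11  VBoxManage -q showvminfo "boinc_a1bd11c312417537" --machinereadable 2>&1
2016-04-26 16:26:33 (90834): ERROR: Vboxwrapper lost communication with VirtualBox, rescheduling task for a later time.
2016-04-26 16:26:33 (90834): Powering off VM.
2016-04-26 16:26:34 (90834): Successfully stopped VM.
"""


def parse_entries(text):
    """Yield (timestamp, message) for timestamped lines; skip anything else."""
    for line in text.splitlines():
        match = LOG_LINE.match(line)
        if match:
            stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
            yield stamp, match.group(3)


def seconds_since_checkpoint(text):
    """Seconds from the last completed checkpoint to the first lost-communication error."""
    last_checkpoint = None
    for stamp, message in parse_entries(text):
        if message.startswith("Checkpoint completed"):
            last_checkpoint = stamp
        elif "lost communication with VirtualBox" in message and last_checkpoint:
            return int((stamp - last_checkpoint).total_seconds())
    return None


print(seconds_since_checkpoint(STDERR_EXCERPT))  # 1694
```

For this excerpt the gap is 1694 seconds, so about 28 minutes of computation would have to be redone if the task does resume from that checkpoint.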

ChristianB
Admin
Posts: 1920
Registered: 23.02.2010 22:12

Re: Scheduler Wait (VM job unmanageable, restarting later).

#45 Post by ChristianB » 26.04.2016 18:38

I've seen some segfaults in the logs in the past. They are strange, but they are outside of our reach and hard to reproduce.

joeybuddy96
Taschenrechner
Posts: 12
Registered: 29.07.2020 21:33

Re: Scheduler Wait (VM job unmanageable, restarting later).

#46 Post by joeybuddy96 » 29.07.2020 21:59

Six years after the status "Postponed: VM job unmanageable, restarting later" was first posted in this forum, it's still around. I've seen it with QuChem and TACC, and if online search results are any indication, it exists for Cosmology, LHC, and NanoHub. I think the way I was able to get the tasks to run in the past was to completely finish all non-VBox CPU WUs and set those projects to not accept new tasks. In my case, that means finishing about five dozen Rosetta WUs and then sitting around running at less than 15% CPU use (since I've got one RNA task) for however many real-world days it takes to finish the WU. I'm all for the goals of projects that use VBox, but I'd like to run the tasks concurrently with non-VBox ones and also use my PC's full resources. Any better ideas on how to force VBox tasks out of postponement?

Michael H.W. Weber
Vereinsvorstand
Posts: 22414
Registered: 07.01.2002 01:00
Location: Marpurk

Re: Scheduler Wait (VM job unmanageable, restarting later).

#47 Post by Michael H.W. Weber » 30.07.2020 10:43

joeybuddy96 wrote:
29.07.2020 21:59
Any better ideas on how to force VBox tasks out of postponement?

I have seen this issue happen very rarely and unfortunately I do not know how to resolve it.
I would just opt out of the VM tasks altogether if they permanently result in problems on your machines.

Michael.
