Scheduler Wait (VM job unmanageable, restarting later).

MLx
Mikrocruncher
Posts: 23
Registered: 21.05.2011 17:49

#37 Post by MLx » 16.04.2016 16:50

Still no news on this? I suspect this is a VirtualBox issue...

Interestingly, the tasks continue computing when the BOINC client is restarted, so slowly but surely, I'm chipping away at the WU. However, it seems that because there are no snapshots, there are no trickles, and thus the deadline does not get extended as it should.

Should I let these tasks finish, or should I just scrap them?

http://www.rnaworld.de/rnaworld/result. ... d=14951245
http://www.rnaworld.de/rnaworld/result. ... d=14951335

ChristianB
Association board member
Posts: 1915
Registered: 23.02.2010 22:12

#38 Post by ChristianB » 16.04.2016 17:39

If vboxwrapper is not running, we will not get the results. You can try to shut down BOINC and then manually "pause" the VMs using the VirtualBox Manager. Make sure they are really paused. Then reboot and start BOINC again. This could make the VMs manageable by vboxwrapper again.
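
For anyone who prefers the command line to the VirtualBox Manager GUI, the pause step could be sketched roughly as follows. This is only a sketch under assumptions: `VBoxManage` is on the PATH, BOINC has already been shut down, and the BOINC VMs have names starting with `boinc_` (that prefix is inferred from a VM name in this thread, not from any documentation). The helper names `parse_running_vms`, `pause_command` and `pause_boinc_vms` are made up for illustration.

```python
import re
import subprocess

# Each line of `VBoxManage list runningvms` looks like: "vm-name" {uuid}
_VM_LINE = re.compile(r'^"(?P<name>.+)" \{(?P<uuid>[0-9a-fA-F-]+)\}$')


def parse_running_vms(output):
    """Return the VM names found in `VBoxManage list runningvms` output."""
    names = []
    for line in output.splitlines():
        match = _VM_LINE.match(line.strip())
        if match:
            names.append(match.group("name"))
    return names


def pause_command(vm_name):
    """Build the command that pauses one VM (the CLI counterpart of "Pause" in the GUI)."""
    return ["VBoxManage", "controlvm", vm_name, "pause"]


def pause_boinc_vms(prefix="boinc_"):
    """Pause every running VM whose name starts with `prefix` (assumed naming scheme)."""
    listing = subprocess.run(
        ["VBoxManage", "list", "runningvms"],
        capture_output=True, text=True, check=True,
    ).stdout
    for name in parse_running_vms(listing):
        if name.startswith(prefix):
            subprocess.run(pause_command(name), check=True)
```

Calling `pause_boinc_vms()` would then replace the manual clicking; whether this actually makes the VM manageable again is untested here.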

#39 Post by MLx » 16.04.2016 19:06

I'm not sure how I can do that - when the error (posted here) happens, both VBoxSvc and vboxwrapper quit.
After restarting BOINC, the task starts computing again, and after about 5 seconds the progress bar continues where it left off. After about 30 minutes the error happens again, and I suppose VBoxSvc suspends the machine and quits.

Should I pause it while vboxwrapper is running?

#40 Post by ChristianB » 17.04.2016 08:36

Then it seems that there is a problem with VirtualBox that doesn't clear itself with a restart of the VM. I was under the impression that the VM was running without vboxwrapper. If the VM fails repeatedly, you should abort the task.

Trickles are only sent after 4 hours of computing. I don't know if this counter is reset with every error you see.

VirtualBox itself is not as stable as we would like, but overall it's far better than the previous solution.

#41 Post by MLx » 24.04.2016 16:26

Upgraded VBox to 5.0.18 and the issue seems resolved: the WUs have now run for ~80 minutes without stopping.
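
To check whether a host is already on that release, one could compare the string printed by `VBoxManage --version` against 5.0.18. The sketch below assumes the output starts with a dotted `major.minor.patch` number followed by a revision suffix; the sample strings in the usage note are illustrative, and the function names are made up for this example. That 5.0.18 is the relevant threshold is only what this thread suggests.

```python
import re

# Release reported in this thread as resolving the issue (not an official threshold).
FIXED_VERSION = (5, 0, 18)


def parse_vbox_version(text):
    """Extract (major, minor, patch) from a version string such as '5.0.18r106667'."""
    match = re.match(r"\s*(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"unrecognised version string: {text!r}")
    return tuple(int(part) for part in match.groups())


def is_at_least(text, minimum=FIXED_VERSION):
    """Return True if the parsed version is the same as or newer than `minimum`."""
    return parse_vbox_version(text) >= minimum
```

For example, `is_at_least("5.0.16r105871")` is false while `is_at_least("5.0.18r106667")` is true, because Python compares the tuples element by element.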

Jacob Klein
Oberfalter
Posts: 481
Registered: 26.07.2013 15:41

#42 Post by Jacob Klein » 24.04.2016 23:25

:) I am directly responsible for getting them to fix a 5.x snapshot-creation bug, in 5.0.18.
Perhaps it fixed your issue! :) I'll surely take credit.
https://www.virtualbox.org/ticket/15206

Good luck!
- Jacob

Michael H.W. Weber
Association board member
Posts: 20372
Registered: 07.01.2002 01:00
Location: Marpurk

#43 Post by Michael H.W. Weber » 25.04.2016 09:33

Nice. :D

Michael.
Support, cooperate, and construct instead of demanding, competing, and consuming.

#44 Post by MLx » 26.04.2016 15:49

Hm, got the same state in BOINC Manager again, but this time it seems to be caused by something else. These are the last lines in stderr.txt:

Code:

2016-04-26 15:58:14 (90834): Creating new snapshot for VM.
2016-04-26 15:58:18 (90834): Deleting stale snapshot.
2016-04-26 15:58:19 (90834): Checkpoint completed.
sh: line 1: 63854 Segmentation fault: 11  VBoxManage -q showvminfo "boinc_a1bd11c312417537" --machinereadable 2>&1
2016-04-26 16:26:33 (90834): ERROR: Vboxwrapper lost communication with VirtualBox, rescheduling task for a later time.
2016-04-26 16:26:33 (90834): Powering off VM.
2016-04-26 16:26:34 (90834): Successfully stopped VM.

Only 1 of 2 tasks has failed this way so far (after >30 hours of computation). After a BOINC service restart, the failed task seems to continue where it left off (or at the last successful checkpoint).
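
To pull the relevant events out of a long stderr.txt, a small scan like the one below might help. It is a sketch that assumes the line layout seen in the log above (timestamp, a number in parentheses, then the message) and matches on the literal message texts from that log; `summarize_stderr` is a made-up helper name, not part of BOINC or vboxwrapper.

```python
import re

# vboxwrapper lines in the log above look like: 2016-04-26 16:26:33 (90834): message
_WRAPPER_LINE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \((\d+)\): (.*)$"
)


def summarize_stderr(lines):
    """Collect segfault lines, lost-communication times and the last checkpoint time."""
    summary = {"segfaults": [], "lost_communication": [], "last_checkpoint": None}
    for raw in lines:
        line = raw.rstrip("\n")
        match = _WRAPPER_LINE.match(line)
        if match is None:
            # Lines without the timestamp prefix (here: shell output) are only
            # checked for a segmentation fault report.
            if "Segmentation fault" in line:
                summary["segfaults"].append(line)
            continue
        timestamp, _number, message = match.groups()
        if "Checkpoint completed" in message:
            summary["last_checkpoint"] = timestamp
        elif "lost communication with VirtualBox" in message:
            summary["lost_communication"].append(timestamp)
    return summary
```

Feeding it the lines of stderr.txt (for instance via `open("stderr.txt")`) gives a quick view of how much time passed between the last completed checkpoint and the failure, which is roughly the work a restart would have to redo.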

#45 Post by ChristianB » 26.04.2016 18:38

I've seen some segfaults in the logs in the past. They are strange, but outside of our reach and hard to reproduce.
