VBox failed; task invalid

Everything about the project RNA World
joeybuddy96
Fingerzähler
Posts: 2
Joined: 29.07.2020 21:33

#1 Post by joeybuddy96 » 08.08.2020 22:56

After one week of crunching a task that was estimated to take 1027 days to complete, VBox encountered an error and now the task is no longer in the BOINC Manager or VirtualBox.
http://www.rnaworld.de/rnaworld/result. ... d=14954760

Code:

VBoxManage.exe: error: Details: code E_FAIL (0x80004005), component SessionMachine, interface ISession
I'm not sure what caused the error. It's possible that I didn't shut down BOINC prior to restarting my PC; maybe an app installation or starting another app while it was running caused it.

I don't understand the point of having snapshots if it's not possible to go back to an earlier state and resume work from there. Given the nearly three-year run time of this task, not splitting it into smaller tasks distributed to multiple users seems to invite trouble, since a single failure loses all the work.

Michael H.W. Weber
Association board member
Posts: 20768
Joined: 07.01.2002 01:00
Location: Marpurk

Re: VBox failed; task invalid

#2 Post by Michael H.W. Weber » 09.08.2020 11:06

Hm, so far I have actually never heard of tasks just vanishing from the BOINC Manager - that's really strange. :roll:

Anyhow, we use VirtualBox to allow for the writing of checkpoints, which the original science app (developed by mathematicians for HPCs only) unfortunately does not support. At the 2009 Barcelona BOINC Workshop, we were actually the first DC team (one that operates several DC projects of its own) to suggest a VM approach for DC apps that can't checkpoint natively. The LHC@home team at CERN presented their first implementation of this at that same workshop (after we had presented the proposal, however). After years of practice and optimization based on user reports, it appears to me that the only error still occasionally occurring in RNA World with these VERY DEMANDING tasks (which you can disable in your settings in case computing a year for a single WU is too ambitious) is that VirtualBox does not finish writing the snapshot to disk before the system shuts down. For this reason, we recommend shutting down BOINC, waiting until your HD no longer appears to be writing anything relevant, and only then shutting down your machine.
In any case, after a failure, the task should not vanish from your system but "just" restart from scratch (which is annoying enough given the exorbitant runtimes).
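
The shutdown order recommended above (stop BOINC first, give the VM time to finish, only then power off) could be scripted roughly as follows. This is only a sketch under assumptions: it assumes `boinccmd` and `VBoxManage` are on the PATH, and it uses "no VM is listed as running" as a stand-in for "the disk has finished writing", which is an approximation and not a guarantee. The function names, the timeout and the poll interval are made up for illustration.

```python
import subprocess
import time


def parse_running_vms(output):
    """Return VM names from `VBoxManage list runningvms` output.

    Each non-empty line is expected to look like: "vm name" {uuid}
    """
    names = []
    for line in output.splitlines():
        line = line.strip()
        if line.startswith('"') and line.count('"') >= 2:
            names.append(line[1:line.index('"', 1)])
    return names


def list_running_vms():
    """Ask VirtualBox which VMs are still running."""
    result = subprocess.run(
        ["VBoxManage", "list", "runningvms"],
        capture_output=True, text=True, check=True,
    )
    return parse_running_vms(result.stdout)


def wait_until_no_vms(list_vms=list_running_vms, timeout=600.0, poll=5.0,
                      sleep=time.sleep, clock=time.monotonic):
    """Poll until no VM is running; True on success, False on timeout.

    timeout and poll are illustrative values, not recommendations.
    """
    deadline = clock() + timeout
    while True:
        if not list_vms():
            return True
        if clock() >= deadline:
            return False
        sleep(poll)


def stop_boinc_then_wait():
    """Ask the BOINC client to quit, then wait for its VMs to go away."""
    subprocess.run(["boinccmd", "--quit"], check=True)
    return wait_until_no_vms()
```

Calling `stop_boinc_then_wait()` before a reboot returns `True` once no VM is listed anymore; if it returns `False`, a VM was still running when the timeout expired and it is probably better not to shut down yet.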

A solution to the "premature shutdown issue" could be to keep at least TWO snapshots, so that if the bug described above strikes, the older snapshot could be used to resume work. But this is something that is not in our hands but in Oracle's, I fear.
Unfortunately, breaking up these monster tasks into smaller parts is not possible with our current approach. So, I would recommend just disabling the VM tasks in your RNA World settings for now.
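
To illustrate the two-snapshot idea in general terms, here is a small sketch of a checkpoint rotation that always keeps the previous copy as a fallback. It is not how VirtualBox or the RNA World wrapper handles snapshots; the file names, the JSON format and the SHA-256 check are assumptions chosen only to show the principle on ordinary files.

```python
import hashlib
import json
import os


def _digest(payload: bytes) -> str:
    return hashlib.sha256(payload).hexdigest()


def save_checkpoint(directory, state):
    """Write `state` as the newest checkpoint and keep the previous one."""
    payload = json.dumps(state, sort_keys=True)
    record = json.dumps({"sha256": _digest(payload.encode("utf-8")),
                         "payload": payload})
    newest = os.path.join(directory, "checkpoint.new")
    older = os.path.join(directory, "checkpoint.old")
    tmp = os.path.join(directory, "checkpoint.tmp")
    with open(tmp, "w", encoding="utf-8") as fh:
        fh.write(record)
        fh.flush()
        os.fsync(fh.fileno())  # push the data to disk before rotating
    if os.path.exists(newest):
        os.replace(newest, older)  # previous newest becomes the fallback
    os.replace(tmp, newest)


def _load(path):
    """Return the stored state, or None if the file is missing or damaged."""
    try:
        with open(path, encoding="utf-8") as fh:
            record = json.load(fh)
        payload = record["payload"]
        if _digest(payload.encode("utf-8")) != record["sha256"]:
            return None
        return json.loads(payload)
    except (OSError, ValueError, KeyError, TypeError, AttributeError):
        return None


def load_latest_checkpoint(directory):
    """Prefer the newest checkpoint; fall back to the older one."""
    for name in ("checkpoint.new", "checkpoint.old"):
        state = _load(os.path.join(directory, name))
        if state is not None:
            return state
    return None
```

If the newest file is truncated by a premature shutdown, the checksum or the JSON parse fails and `load_latest_checkpoint` falls back to the older copy, so at most one checkpoint interval of work is lost.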

Michael.

P.S.: Thanks for reporting the issue to us. :D
Support, cooperate and construct instead of demand, compete and consume.
