VboxHeadless running without vboxwrapper on Mac OS

Everything about the project RNA World
ChristianB
Admin
Posts: 1920
Joined: 23.02.2010 22:12

Re: Long running work unit

#13 Post by ChristianB » 28.06.2014 20:49

I don't really know. Can you try disabling T4T and letting RNA run on its own? Do you have other VMs running at the same time? Is the hard drive under permanent use because of something else? I'm running out of ideas to try.

JeromeC
XBOX360-Installer
Posts: 76
Joined: 23.10.2010 19:38
Location: Poissy/France

Re: Long running work unit

#14 Post by JeromeC » 29.06.2014 11:21

OK, I have suspended T4T to see how it goes, but the remaining estimate is 73 days of calculation! Why is it supposed to take so long? Even CPDN WUs look like baby WUs compared to this. Is that really normal? What is the benefit of such a long WU for the computation or the science? How many credits is it supposed to give at the end?

I really don't want to stop T4T for such a long time. I like my BOINC to be "multi-project", not monopolized by one project (even if it is "only" one WU out of 8), so if RNA cannot share the machine with T4T (and/or other VM projects) there is an issue, and I don't know what to do about it in the long term.


Edit: I forgot to mention that I currently have a Windows VM running in VirtualBox (since Friday evening), but it is not running all the time.

Re: Long running work unit

#15 Post by JeromeC » 30.06.2014 22:28

For the moment the VM has been running continuously, but I'd still like some answers to my questions above.

T4T used to run together with RNA with no problem "in the past" (i.e. before I started testing with Atlas), but now Atlas is stopped and I can't run T4T with RNA or RNA becomes "unmanageable", and I don't feel like letting such an enormous WU run without knowing if it's "worth it".

I'm currently doing another test: I stopped the Windows VM I had running in VB and restarted the T4T WU. So far both are running OK; I'll see how it goes tomorrow morning. But maybe there is also some kind of "2 VB VMs max" limit, and if a third VM is added (Atlas or any manual VM that I run under VB), that "limit" is reached and the problems start.

Re: Long running work unit

#16 Post by ChristianB » 30.06.2014 22:51

Yes, our long-running tasks are quite challenging. You don't have to finish the task; just run several combinations for 1 or 2 days and see when the "unmanageable" state happens. The problem here seems to be with VB itself. I know of a Windows host that has 8 VMs running without problems. Maybe it's a Mac VB issue.

Re: Long running work unit

#17 Post by JeromeC » 01.07.2014 22:44

Well, RNA did not get stuck again and its stderr is clean; now it's the T4T VM that got "stuck" during the night:
2014-07-01 02:20:23 (45192): Deleting stale snapshot.
2014-07-01 02:20:23 (45192): Error in delete stale snapshot for VM: -2147024809
Command:
VBoxManage -q snapshot "boinc_347e67cabc4bc4c6" delete "69980446-4870-487a-b308-c7aa245f3a6c"
Output:
VBoxManage: error: Code NS_ERROR_INVALID_ARG (0x80070057) - Invalid argument value (extended info not available)
VBoxManage: error: Context: "DeleteSnapshot(bstrSnapGuid.raw(), pProgress.asOutParam())" at line 421 of file VBoxManageSnapshot.cpp

2014-07-01 02:20:23 (45192): ERROR: Checkpoint maintenance failed, rescheduling task for a later time. (-2147024809)
2014-07-01 02:20:23 (45192): Powering off VM.
2014-07-01 02:20:24 (45192): Error in poweroff VM for VM: -2135228414
Command:
VBoxManage -q controlvm "boinc_347e67cabc4bc4c6" poweroff
Output:
VBoxManage: error: Invalid machine state: DeletingSnapshotOnline (must be Running, Paused or Stuck)
VBoxManage: error: Details: code VBOX_E_INVALID_VM_STATE (0x80bb0002), component Console, interface IConsole, callee nsISupports
VBoxManage: error: Context: "PowerDown(progress.asOutParam())" at line 222 of file VBoxManageControlVM.cpp

2014-07-01 02:20:24 (45192): VM did not power off when requested.
The VM is running in memory and BOINC says "unmanageable", yet I can see the console with recent activity inside it!
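
The negative decimal codes that vboxwrapper prints in the log above are the same values as the hex codes in the VBoxManage output, just read as signed 32-bit integers. A minimal Python sketch of the conversion (the function name `hresult_hex` is made up for illustration):

```python
def hresult_hex(code: int) -> str:
    """Return the unsigned 32-bit hex form of a signed result code."""
    # Masking with 0xFFFFFFFF reinterprets the negative value as unsigned 32-bit.
    return f"0x{code & 0xFFFFFFFF:08x}"


# The two codes from the stderr excerpt above.
print(hresult_hex(-2147024809))  # 0x80070057 (NS_ERROR_INVALID_ARG)
print(hresult_hex(-2135228414))  # 0x80bb0002 (VBOX_E_INVALID_VM_STATE)
```

This only confirms that both lines of each error report the same failure; it says nothing about why the snapshot delete failed.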

RNA console is also ok, work is being processed.

But I have 9 WUs busy on my i7, each one obviously getting less than one core to compute with...

Regarding your point about a machine with 8 VMs: is it running different VM/VB projects, like mine, or all the same one?

Re: Long running work unit

#18 Post by JeromeC » 02.07.2014 07:23

This is actually driving me crazy. This morning "all looks fine again" with my 2 VM projects, each happily running with its VB wrapper. However, I still have 9 apps running in memory: the guilty 9th is now CPDN (no VM!), which BOINC shows as "pending" but which is actually running!

Re: Long running work unit

#19 Post by ChristianB » 02.07.2014 11:58

My example was with RNA only. In theory it shouldn't matter which projects are running. Each vboxwrapper monitors a different VM, and VB should handle each VM the same way. I haven't heard from Rom lately, but I will ping him and the others with this thread.

Re: VboxHeadless running without vboxwrapper on Mac OS

#20 Post by JeromeC » 02.07.2014 13:37

Thanks a lot for your help, and sorry that I contaminated the other thread; it was a good idea to split!

But now it seems to happen with a "non-VM project" (CPDN), and this morning the two VM projects were both OK... I'm starting to wonder whether it has nothing to do with the wrapper version or any of the VB/VM projects, but is some issue on my Mac... I'll try to reboot tonight and see what happens!

Re: VboxHeadless running without vboxwrapper on Mac OS

#21 Post by ChristianB » 02.07.2014 13:39

What boinc version are you currently using?

Re: VboxHeadless running without vboxwrapper on Mac OS

#22 Post by JeromeC » 02.07.2014 14:02

I'm using 7.3.19 because of that big issue with T4T; it solved the problem.

Actually, on that occasion the BOINC devs helped quite a lot (Rom, Charlie...).

Re: VboxHeadless running without vboxwrapper on Mac OS

#23 Post by JeromeC » 03.07.2014 22:36

I only rebooted tonight. After restarting, all looked fine: T4T and RNA were running together, each with its wrapper. After 2 hours I found that T4T is unmanageable again and its wrapper is lost, but the VM is running and I can even access it via the console. RNA seems to be running fine, plus 7 other tasks running = 9 running tasks on the i7...

I stop and restart BOINC... until the next time...
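
One rough way to spot this state without opening each console is to compare how many VBoxHeadless processes are alive with how many vboxwrapper processes are alive. The Python sketch below is only an illustration under assumptions: it takes the text of `ps -axo comm` as input, assumes the wrapper executable's file name starts with `vboxwrapper`, and assumes every headless VM is expected to have one wrapper (a manually started headless VM would also be counted). The sample paths are hypothetical.

```python
def count_vm_processes(ps_output):
    """Count VBoxHeadless and vboxwrapper processes in `ps -axo comm` text."""
    headless = wrappers = 0
    for line in ps_output.splitlines():
        # Keep only the executable name, dropping any directory prefix.
        name = line.strip().rsplit("/", 1)[-1]
        if name == "VBoxHeadless":
            headless += 1
        elif name.startswith("vboxwrapper"):  # assumed naming convention
            wrappers += 1
    return headless, wrappers


# Hypothetical sample: two headless VMs but only one wrapper left.
SAMPLE_PS = """\
COMM
/Applications/VirtualBox.app/Contents/MacOS/VBoxHeadless
/Applications/VirtualBox.app/Contents/MacOS/VBoxHeadless
/hypothetical/boinc/projects/example/vboxwrapper_example
"""

headless, wrappers = count_vm_processes(SAMPLE_PS)
if headless > wrappers:
    print(f"{headless - wrappers} VBoxHeadless process(es) without a wrapper")
```

On a live machine the input could come from `subprocess.run(["ps", "-axo", "comm"], capture_output=True, text=True).stdout`.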

Re: VboxHeadless running without vboxwrapper on Mac OS

#24 Post by JeromeC » 05.07.2014 12:06

I had problems with the Mac yesterday (it would not start anymore), so I had to keep it off all day and restore the system disk from a recent clone (not the BOINC data, which is not on my system disk). This morning I restarted it: the RNA WU continued (no T4T WU left in the pipe), and at the end of the morning it downloaded a T4T WU. At the beginning all went well, then:
2014-07-05 12:27:26 (3226): Error in delete stale snapshot for VM: -2147024809
Command:
VBoxManage -q snapshot "boinc_ab0e47072b0b9d80" delete "0bc8bc9e-5a25-4c06-8138-159a26a2898b"
Output:
VBoxManage: error: Code NS_ERROR_INVALID_ARG (0x80070057) - Invalid argument value (extended info not available)
VBoxManage: error: Context: "DeleteSnapshot(bstrSnapGuid.raw(), pProgress.asOutParam())" at line 421 of file VBoxManageSnapshot.cpp

2014-07-05 12:27:26 (3226): ERROR: Checkpoint maintenance failed, rescheduling task for a later time. (-2147024809)
2014-07-05 12:27:26 (3226): Powering off VM.
2014-07-05 12:27:26 (3226): Error in poweroff VM for VM: -2135228414
Command:
VBoxManage -q controlvm "boinc_ab0e47072b0b9d80" poweroff
Output:
VBoxManage: error: Invalid machine state: DeletingSnapshotOnline (must be Running, Paused or Stuck)
VBoxManage: error: Details: code VBOX_E_INVALID_VM_STATE (0x80bb0002), component Console, interface IConsole, callee nsISupports
VBoxManage: error: Context: "PowerDown(progress.asOutParam())" at line 222 of file VBoxManageControlVM.cpp

2014-07-05 12:27:26 (3226): VM did not power off when requested.
Now RNA is unmanageable. Both WUs can be accessed via the console and seem to be running; only RNA has lost its wrapper.

What I have now done is reduce BOINC to 7 cores out of 8, so for the moment I actually have 8 WUs running with one core each :)
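
Since the same two errors show up in both excerpts, it can help to pull just the failing steps out of a long stderr file. The following Python sketch assumes the line format seen in the excerpts above ("timestamp (pid): Error in ACTION for VM: CODE"); the names `ERROR_LINE` and `find_errors` are made up for illustration, and other wrapper versions may format these lines differently.

```python
import re

# Matches stderr lines of the form seen above, for example:
#   2014-07-05 12:27:26 (3226): Error in poweroff VM for VM: -2135228414
ERROR_LINE = re.compile(
    r"^(?P<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"\((?P<pid>\d+)\): Error in (?P<action>.+?) for VM: (?P<code>-?\d+)$"
)


def find_errors(log_text):
    """Return (time, pid, action, code) for every 'Error in ...' line."""
    errors = []
    for line in log_text.splitlines():
        match = ERROR_LINE.match(line.strip())
        if match:
            errors.append(
                (match["time"], int(match["pid"]), match["action"], int(match["code"]))
            )
    return errors


# Lines copied from the excerpt above.
SAMPLE = """\
2014-07-05 12:27:26 (3226): Error in delete stale snapshot for VM: -2147024809
Command:
VBoxManage -q snapshot "boinc_ab0e47072b0b9d80" delete "0bc8bc9e-5a25-4c06-8138-159a26a2898b"
2014-07-05 12:27:26 (3226): Powering off VM.
2014-07-05 12:27:26 (3226): Error in poweroff VM for VM: -2135228414
"""

for entry in find_errors(SAMPLE):
    print(entry)
```

Run against both excerpts, this would list the snapshot delete failure followed by the poweroff failure each time, which matches the order in which the task goes "unmanageable".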

Wait & see
