Assistance needed - 2 Long-running VMs just failed - Clones
-
Jacob Klein
- Brain-Bug

- Posts: 564
- Joined: 26.07.2013 15:41
Re: Assistance needed - 2 Long-running VMs just failed - Clo
Christian,
I think I'm going to move my main rig permanently to Windows 10 Technical Preview, but unfortunately VirtualBox's hardened security currently does not work on Windows 10. I've reported it in the security-issues thread on their forums.
However, I still very much intend to complete the 2 tasks that I've been crunching for months, somehow! I'd prefer to use my main rig, but that means waiting for a VirtualBox fix, and I want to keep the tasks progressing in the meantime.
So, while I am waiting on Oracle for a fix for Windows 10, I think I'm going to see if I can resume processing them on a different computer in my household, a laptop. It has plenty of RAM, and a decent quad-core processor. I'll of course retain backups of everything on my main rig, but ... just wanted to make sure of some things.
a) Do you think this could work successfully to complete the tasks?
b) Is it acceptable to have multiple PCs do work, serially, to complete a single RNA world VM task?
c) Do you have any comments/advice?
Thanks,
Jacob
-
ChristianB
- Admin

- Posts: 1920
- Joined: 23.02.2010 22:12
Re: Assistance needed - 2 Long-running VMs just failed - Clo
Jacob Klein wrote:
a) Do you think this could work successfully to complete the tasks?
b) Is it acceptable to have multiple PCs do work, serially, to complete a single RNA world VM task?
c) Do you have any comments/advice?
a) Usually BOINC doesn't allow this, but I think it is possible if you grab the VM and snapshot from the slot directory. You can try to run it outside of BOINC on the second rig, then later reinsert the files into the same slot directory and let the vboxwrapper do its job. In theory this sounds good, but in practice it's difficult because I don't know how the vboxwrapper restarts the VM using the correct snapshot.
b) For us it is acceptable because we have these long runtimes, but I think this will only be an experiment, as it involves a lot of manual work.
c) Make sure you snapshot regularly on the second rig, and note the filenames and content of the slots dir (especially the vbox files). When moving the files back from the second rig to the first, make sure you rename the latest snapshot to the name that the first rig "knows" about.
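One way to act on the advice in c) about noting the filenames and content of the slot directory is to record a manifest before and after each move. The following is only a minimal Python sketch, not something taken from the thread: the function names `snapshot_manifest` and `diff_manifests` are hypothetical, and the directory passed in is assumed to be a copy of the BOINC slot directory.

```python
import hashlib
from pathlib import Path


def snapshot_manifest(slot_dir):
    """Map each file's relative path to (size in bytes, SHA-256 hex digest)."""
    root = Path(slot_dir)
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256()
            with path.open("rb") as handle:
                # Read in 1 MiB chunks so large disk images do not fill memory.
                for chunk in iter(lambda: handle.read(1 << 20), b""):
                    digest.update(chunk)
            key = path.relative_to(root).as_posix()
            manifest[key] = (path.stat().st_size, digest.hexdigest())
    return manifest


def diff_manifests(before, after):
    """Return (added, removed, changed) lists of relative paths."""
    added = sorted(set(after) - set(before))
    removed = sorted(set(before) - set(after))
    changed = sorted(p for p in set(before) & set(after) if before[p] != after[p])
    return added, removed, changed
```

Taking one manifest on the first rig before copying and another after the files come back makes it easy to see which snapshot and vbox files were renamed, added, or changed in between.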
-
Jacob Klein
- Brain-Bug

- Posts: 564
- Joined: 26.07.2013 15:41
Re: Assistance needed - 2 Long-running VMs just failed - Clo
a) If you recall, the 2 tasks I'm talking about are part of the "group of 4 VMs" that crashed several months ago, in the first post of the thread. So I've already been running them outside of BOINC, and it's not an issue. Also, I was able to transplant them to the laptop just fine.
b) Yes. As I said, they were already running outside of BOINC, but I just wanted to make sure that you'd still accept my result even if I put it on a second PC to do some of the processing.
c) I've been taking snapshots regularly (every 24 hours or so) on my main rig over the last several months, and plan to do the same on the laptop. What I usually do is: create a new snapshot (by default it increments the number in the name, so "Snapshot 27" for instance), then delete the old one ("Snapshot 26" for instance). Then, every couple of weeks, I make a clone of the [VM + 1 snapshot] to save as a backup ("RNA World Task 1 Clone 27" for instance), so I additionally have rolling backups.
So, I guess the short story is: will you still accept my results as useful work if I move them to another PC to do some of the work? It sounds like the answer is yes, right?
Thanks,
Jacob
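For anyone who wants to script the rotation described in c), here is a rough Python sketch that only builds the VBoxManage command lines for one rotation step. The thread does not say whether the snapshots were taken through the GUI or the command line, so the use of `VBoxManage snapshot ... take`, `VBoxManage snapshot ... delete`, and `VBoxManage clonevm ... --mode all` is an assumption, and `rotation_commands` and its parameters are hypothetical names.

```python
def rotation_commands(vm_name, new_number, clone_prefix=None):
    """Build VBoxManage argument lists for one snapshot rotation step.

    Nothing is executed here; the caller decides whether to run the commands.
    """
    new_snapshot = f"Snapshot {new_number}"
    old_snapshot = f"Snapshot {new_number - 1}"
    commands = [
        # Take the new snapshot first, so a restore point always exists.
        ["VBoxManage", "snapshot", vm_name, "take", new_snapshot],
        # Only then delete the previous snapshot.
        ["VBoxManage", "snapshot", vm_name, "delete", old_snapshot],
    ]
    if clone_prefix is not None:
        # Periodic backup: clone the VM together with its remaining snapshot.
        clone_name = f"{clone_prefix} Clone {new_number}"
        commands.append(
            ["VBoxManage", "clonevm", vm_name, "--mode", "all",
             "--name", clone_name, "--register"]
        )
    return commands


if __name__ == "__main__":
    for command in rotation_commands("RNA World Task 1", 27, "RNA World Task 1"):
        print(" ".join(command))
```

Each list can be passed to `subprocess.run(command, check=True)` once the printed commands look right. Whether the VM needs to be paused or powered off for the clone step is something to verify against the VirtualBox version in use.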
-
ChristianB
- Admin

- Posts: 1920
- Joined: 23.02.2010 22:12
Re: Assistance needed - 2 Long-running VMs just failed - Clo
Yes, I'll accept this. I forgot about those two and it makes things easier.
-
Jacob Klein
- Brain-Bug

- Posts: 564
- Joined: 26.07.2013 15:41
Re: Assistance needed - 2 Long-running VMs just failed - Clo
Excellent, thank you. The community is already putting pressure on Oracle to get VirtualBox working on Windows 10 Technical Preview, so I may be able to transplant them back to the main rig soon.
-
Jacob Klein
- Brain-Bug

- Posts: 564
- Joined: 26.07.2013 15:41
Re: Assistance needed - 2 Long-running VMs just failed - Clo
VirtualBox 4.3.17 r96426 (Test build #5) is working correctly on Windows 10 Technical Preview hosts. So I have transplanted the 2 big multi-month units back to my main rig, which is running Windows 10 as its primary OS. As I said, those 2 units ran on a laptop for a couple of days, and I'm under the impression that this shouldn't affect the final outcome of the work units.
Thanks,
Jacob
-
Jacob Klein
- Brain-Bug

- Posts: 564
- Joined: 26.07.2013 15:41
Re: Assistance needed - 2 Long-running VMs just failed - Clo
VirtualBox 4.3.18 r96516 will not launch properly on Windows 10 Build 9860. I will hold on to the units and crunch them when I'm able to again. Basically, I still plan to finish these 2 units, despite the current setback.
Thanks,
Jacob
-
Jacob Klein
- Brain-Bug

- Posts: 564
- Joined: 26.07.2013 15:41
Re: Assistance needed - 2 Long-running VMs just failed - Clo
Here is a status update of the 4 VMs that I am still running on Nitro (my beefy Win 8.1 laptop), outside of BOINC. The first 2 are the 2 RNA World tasks that had failed in the original post. The bottom 2 are from an RNA World task that failed after that, and I'm actually crunching 2 instances of it, because the version at the time of the crash had hard drive write errors within the VM, and I additionally had a backup from before those errors had happened.
So, I still intend to complete all 4 of these... slowly but surely.
It's neat to see times listed in weeks. These are some huge times!
Thanks,
Jacob
1)
Name: cmsvm2_GA-p[e20-30MB_Lin64f]_1_Drosophila-melanogaster-(fruit-fly)_AE014297.lin.EMBL_RF00028_Intron_gpI_1349111823_13748_9
http://www.rnaworld.de/rnaworld/workuni ... id=6330939
estimated runtime on reference system: 10w 5d 3h 19m 24s (6491964.4413781 s)
forecast 9655215 sec (~16 weeks)
Failed on: 2 Jul 2014, 11:09:24 UTC
Failed at: 8,345,739 sec (~13.8 weeks)
Current Progress.txt: 98.765%
Current runtime: 278232 mins (~27.6 weeks)
Note: Has a wingman with a completion time of: 18,268,430 sec (~30.2 weeks; probably valid)
2)
Name: cmsvm2_GA-p[e20-30MB_Lin64f]_1_Oryza-sativa-Japonica-Group_CM000142.lin.EMBL_RF00028_Intron_gpI_1349111823_57652_12
http://www.rnaworld.de/rnaworld/workuni ... id=6330945
estimated runtime on reference system: 8w 5d 20h 41m 32s (5344892.6310472 s)
forecast 8627070 sec (~14.25 weeks)
Failed on: 2 Jul 2014, 11:09:24 UTC
Failed at: 10,810,220 sec (~17.9 weeks)
Current Progress.txt: 98.765%
Current runtime: 311347 mins (~30.9 weeks)
Note: Has a wingman with a completion time of: 463,101.50 sec (~0.75 weeks; invalid)
3)
Name: cmsvm2_GA-p[e20-30MB_Lin64f]_1_Oryza-sativa-Japonica-Group_CM000147.lin.EMBL_RF00028_Intron_gpI_1349111823_64512_30
http://www.rnaworld.de/rnaworld/workuni ... id=6330855
estimated runtime on reference system: 8w 0d 21h 7m 47s (4914467.536505 s)
forecast 17647160 sec (~29.2 weeks)
Failed on: 20 Nov 2014, 20:11:45 UTC
Failed at: 13,672,580 sec (~22.6 weeks)
Current Progress.txt: 26.1737%
Current runtime: 76981 mins (~7.6 weeks) * Started running outside of BOINC using a pre-crash snapshot that was saved before "write errors" in VM
4)
Name: cmsvm2_GA-p[e20-30MB_Lin64f]_1_Oryza-sativa-Japonica-Group_CM000147.lin.EMBL_RF00028_Intron_gpI_1349111823_64512_30
http://www.rnaworld.de/rnaworld/workuni ... id=6330855
estimated runtime on reference system: 8w 0d 21h 7m 47s (4914467.536505 s)
forecast 17647160 sec (~29.2 weeks)
Failed on: 20 Nov 2014, 20:11:45 UTC
Failed at: 13,672,580 sec (~22.6 weeks)
Current Progress.txt: 82.5124%
Current runtime: 242683 mins (~24.1 weeks)
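The week figures in the list above can be checked with a little arithmetic. This is just a small Python sketch for doing that; the helper names `format_wdhms`, `seconds_to_weeks`, and `minutes_to_weeks` are made up for illustration and are not part of BOINC or the RNA World site.

```python
SECONDS_PER_WEEK = 7 * 24 * 60 * 60  # 604800
MINUTES_PER_WEEK = 7 * 24 * 60       # 10080


def format_wdhms(total_seconds):
    """Format a duration as 'Nw Nd Nh Nm Ns', dropping fractional seconds."""
    remaining = int(total_seconds)
    parts = []
    for suffix, size in (("w", 604800), ("d", 86400), ("h", 3600), ("m", 60)):
        count, remaining = divmod(remaining, size)
        parts.append(f"{count}{suffix}")
    parts.append(f"{remaining}s")
    return " ".join(parts)


def seconds_to_weeks(seconds):
    """Convert a duration in seconds to (fractional) weeks."""
    return seconds / SECONDS_PER_WEEK


def minutes_to_weeks(minutes):
    """Convert a duration in minutes to (fractional) weeks."""
    return minutes / MINUTES_PER_WEEK


if __name__ == "__main__":
    print(format_wdhms(6491964.4413781))        # 10w 5d 3h 19m 24s
    print(round(seconds_to_weeks(8345739), 1))  # 13.8
    print(round(minutes_to_weeks(278232), 1))   # 27.6
```

The three printed values match the reference-system estimate, the "Failed at" time, and the current runtime listed for task 1.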
-
Jacob Klein
- Brain-Bug

- Posts: 564
- Joined: 26.07.2013 15:41
Re: Assistance needed - 2 Long-running VMs just failed - Clo
I wanted to chime in with another status update.
Just this week, Oracle finally fixed the VirtualBox startup bug that prevented 4.3.18+ from starting properly on the latest versions of Windows 10, which I first reported to them 3 months ago. I have confirmed that Oracle VirtualBox 4.3.21 r97845 (link below) functions correctly on Windows 10 Technical Preview Build 9926 (released 2 days ago).
See here for the test build, if interested.
https://www.virtualbox.org/ticket/13665#comment:12
This means I can now move my 4 VMs from Nitro (my 1.73 GHz laptop, Windows 8.1), back to RacerX (my 3.74 GHz main rig, Windows 10)... so I can hopefully complete them faster. So, I have done that. Below is the current status of all 4 VMs. Still crunching away!
Thanks,
Jacob Klein
1)
Name: cmsvm2_GA-p[e20-30MB_Lin64f]_1_Drosophila-melanogaster-(fruit-fly)_AE014297.lin.EMBL_RF00028_Intron_gpI_1349111823_13748_9
http://www.rnaworld.de/rnaworld/workuni ... id=6330939
estimated runtime on reference system: 10w 5d 3h 19m 24s (6491964.4413781 s)
forecast 9655215 sec (~16 weeks)
Failed on: 2 Jul 2014, 11:09:24 UTC
Failed at: 8,345,739 sec (~13.8 weeks)
Current Progress.txt: 98.765%
Current runtime: 335882 mins (~33.3 weeks)
Note: Has a wingman with a completion time of: 18,268,430 sec (~30.2 weeks; probably valid)
2)
Name: cmsvm2_GA-p[e20-30MB_Lin64f]_1_Oryza-sativa-Japonica-Group_CM000142.lin.EMBL_RF00028_Intron_gpI_1349111823_57652_12
http://www.rnaworld.de/rnaworld/workuni ... id=6330945
estimated runtime on reference system: 8w 5d 20h 41m 32s (5344892.6310472 s)
forecast 8627070 sec (~14.25 weeks)
Failed on: 2 Jul 2014, 11:09:24 UTC
Failed at: 10,810,220 sec (~17.9 weeks)
Current Progress.txt: 98.765%
Current runtime: 368205 mins (~36.5 weeks)
Note: Has a wingman with a completion time of: 463,101.50 sec (~0.75 weeks; invalid)
3)
Name: cmsvm2_GA-p[e20-30MB_Lin64f]_1_Oryza-sativa-Japonica-Group_CM000147.lin.EMBL_RF00028_Intron_gpI_1349111823_64512_30
http://www.rnaworld.de/rnaworld/workuni ... id=6330855
estimated runtime on reference system: 8w 0d 21h 7m 47s (4914467.536505 s)
forecast 17647160 sec (~29.2 weeks)
Failed on: 20 Nov 2014, 20:11:45 UTC
Failed at: 13,672,580 sec (~22.6 weeks)
Current Progress.txt: 46.1178%
Current runtime: 135642 mins (~13.5 weeks) * Started running outside of BOINC using a pre-crash snapshot that was saved before "write errors" in VM
4)
Name: cmsvm2_GA-p[e20-30MB_Lin64f]_1_Oryza-sativa-Japonica-Group_CM000147.lin.EMBL_RF00028_Intron_gpI_1349111823_64512_30
http://www.rnaworld.de/rnaworld/workuni ... id=6330855
estimated runtime on reference system: 8w 0d 21h 7m 47s (4914467.536505 s)
forecast 17647160 sec (~29.2 weeks)
Failed on: 20 Nov 2014, 20:11:45 UTC
Failed at: 13,672,580 sec (~22.6 weeks)
Current Progress.txt: 98.765%
Current runtime: 300353 mins (~29.8 weeks)
-
Michael H.W. Weber
- Association Board

- Posts: 23026
- Joined: 07.01.2002 01:00
- Location: Marpurk
Re: Assistance needed - 2 Long-running VMs just failed - Clo
Thank you for contacting Oracle on this!
Michael.
Support, cooperate, and construct instead of demanding, competing, and consuming.


-
Jacob Klein
- Brain-Bug

- Posts: 564
- Joined: 26.07.2013 15:41
Re: Assistance needed - 2 Long-running VMs just failed - Clo
You're welcome. A few of us early adopters were voicing our concerns, but I think it took so long to fix because they are still dealing with hardened security issues, and Windows 10 wasn't on their radar. But, now that the big reveal event happened, and lots of people are starting to use Windows 10, I guess they finally thought it was time to fix it.
By the way, I don't recommend running Windows 10 right now, unless you like living on the bleeding edge or want to help give Microsoft valuable feedback. Parts of it are a bit buggy.
-
Michael H.W. Weber
- Association Board

- Posts: 23026
- Joined: 07.01.2002 01:00
- Location: Marpurk
Re: Assistance needed - 2 Long-running VMs just failed - Clo
If I remember correctly, Oracle has also changed something else which is making it difficult to use the latest VirtualBox versions with BOINC.
Christian knows all the details...
Michael.
Support, cooperate, and construct instead of demanding, competing, and consuming.

