XXL Work Unit checkpoint

Everything about the project RNA World
Jon Heels
Idle-Sammler
Posts: 6
Registered: 24.04.2013 17:54

XXL Work Unit checkpoint

#1 Post by Jon Heels » 28.04.2013 19:28

:cry2:
I am reluctantly having to stop running XXL work units as it is impossible to finish them due to the long run times and, more importantly, the lack of checkpoints.

I had reached about 40 hours of runtime on an XXL WU when an automatic Windows download caused the computer to shut down and restart overnight. This is a routine occurrence, and on accessing the computer the following morning everything was fine, except that this particular XXL WU was now showing just 2 hours and the percentage figure had decreased from 12% to 2%.

As an experiment I paused the same WU for a few minutes and exactly the same thing happened: the runtime decreased from 25 to 2 hours, with a similar decrease in the percentage completed figure.

The absence of checkpoints means that the XXL WUs need to be run on a very fast machine in order to complete the WU before something interrupts the work and the WU starts from the beginning again.

I am rather annoyed that I have wasted about 60 hours of CPU run time for nothing.

PS Perhaps you will actually publish this post unlike my previous 2

Dunuin
Club member
Posts: 1743
Registered: 23.03.2011 12:59
Location: Hamburg

Re: XXL Work Unit checkpoint

#2 Post by Dunuin » 28.04.2013 20:23

I'm using a VM. Install VirtualBox and set up a virtual Linux computer running BOINC and RNA World. Make a batch script like this

Code:

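rem Rotate two hourly snapshots of the VM "RNA-Cruncher-2".
rem On the first two runs some edit/delete steps fail because the snapshots do not exist yet; the script simply continues.
cd /d "C:\Program Files\Oracle\VirtualBox"
rem Pause the VM while the snapshots are rotated.
VBoxManage controlvm RNA-Cruncher-2 pause
VBoxManage snapshot RNA-Cruncher-2 edit previoushour --name deletemehour
VBoxManage snapshot RNA-Cruncher-2 edit currenthour --name previoushour
VBoxManage snapshot RNA-Cruncher-2 take currenthour
VBoxManage controlvm RNA-Cruncher-2 resume
rem Delete the oldest snapshot only after the VM is running again.
VBoxManage snapshot RNA-Cruncher-2 delete deletemehour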
and run it every hour. That gives you your own checkpointing, and you won't lose more than one hour of work.
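The rotation above can also be sketched in Python, which makes it easy to skip the rename and delete steps while the snapshots do not exist yet. This is only a sketch under stated assumptions, not part of the setup described in this post: the function names `rotation_commands` and `run_rotation` and the `existing` parameter are made up for illustration, and the caller is assumed to know which snapshot names already exist. The VM name and snapshot names are taken from the batch script.

```python
import subprocess


def rotation_commands(vm, existing, exe="VBoxManage"):
    """Return the VBoxManage calls for one rotation, in the order they should run.

    `existing` is the set of snapshot names the VM already has (an assumption:
    the caller tracks or looks this up; it is not queried here).
    """
    cmds = [[exe, "controlvm", vm, "pause"]]
    has_previous = "previoushour" in existing
    if has_previous:
        # Park the oldest snapshot under a temporary name.
        cmds.append([exe, "snapshot", vm, "edit", "previoushour",
                     "--name", "deletemehour"])
    if "currenthour" in existing:
        # Last hour's snapshot becomes the fallback.
        cmds.append([exe, "snapshot", vm, "edit", "currenthour",
                     "--name", "previoushour"])
    cmds.append([exe, "snapshot", vm, "take", "currenthour"])
    cmds.append([exe, "controlvm", vm, "resume"])
    if has_previous:
        # Delete the parked snapshot only after the VM is running again.
        cmds.append([exe, "snapshot", vm, "delete", "deletemehour"])
    return cmds


def run_rotation(vm, existing):
    """Execute one rotation; stops at the first failing command."""
    for cmd in rotation_commands(vm, existing):
        subprocess.run(cmd, check=True)
```

Calling `run_rotation("RNA-Cruncher-2", {"previoushour", "currenthour"})` once an hour, for example from the Windows Task Scheduler, would mirror the batch script; on a fresh VM, pass an empty set. Note that if a command fails after the pause, this sketch leaves the VM paused.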

Michael H.W. Weber
Club board member
Posts: 22419
Registered: 07.01.2002 01:00
Location: Marpurk

Re: XXL Work Unit checkpoint

#3 Post by Michael H.W. Weber » 29.04.2013 11:33

Jon Heels wrote: :cry2:
I am reluctantly having to stop running XXL work units as it is impossible to finish them due to the long run times and, more importantly, the lack of checkpoints.
For checkpointing, please use Linux, or use a VirtualBox VM under Windows.
Jon Heels wrote: I had reached about 40 hours of runtime on an XXL WU when an automatic Windows download caused the computer to shut down and restart overnight.
Yes, I am sorry about this. It really is a pity, but in general you should NOT allow Windows to restart your machine automatically.
Jon Heels wrote: I am rather annoyed that I have wasted about 60 hours of CPU run time for nothing.
Sure, but please realize that it is your Windows setting which caused this and not the RNA World project as such.
Jon Heels wrote: PS Perhaps you will actually publish this post unlike my previous 2
Hmm, what exactly do you mean by this? To the best of my knowledge, there is not a single post that we have not published. Promise. :D
In fact, except for newcomers posting for the first time, messages in our forums are not reviewed prior to publication.

Michael.
Support, cooperate and construct instead of demand, compete and consume.


Jon Heels
Idle-Sammler
Posts: 6
Registered: 24.04.2013 17:54

Re: XXL Work Unit checkpoint

#4 Post by Jon Heels » 29.04.2013 15:35

I am fully aware that my Windows settings caused the computer restart. These restarts do not seem to cause much of a problem with my other BOINC projects.

It is just a shame that the XXL WUs have no checkpoints and take so long to run on my machine!

I run BOINC projects just as a sort of hobby, and I have neither the inclination nor the computer expertise to start running virtual Linux machines for the sake of a few extra credits, but it was a nice thought for someone to suggest it.

Hopefully I will carry on with RNA World and start running the XXL WUs when I have a much more powerful computer to run them on.

X1900AIW
TuX-omane
Posts: 2868
Registered: 05.01.2008 16:34

Re: XXL Work Unit checkpoint

#5 Post by X1900AIW » 29.04.2013 19:33

Jon Heels wrote: It is just a shame that the XXL WUs have no checkpoints and take so long to run on my machine!

I run BOINC projects just as a sort of hobby, and I have neither the inclination nor the computer expertise to start running virtual Linux machines for the sake of a few extra credits, but it was a nice thought for someone to suggest it.

Hopefully I will carry on with RNA World and start running the XXL WUs when I have a much more powerful computer to run them on.
Why is it a shame? See the FAQ. The conditions for running RNA World work units are clear: it is risky.

Not having a faster machine is no shame, I think. We are mostly hobby crunchers and not professionals in any way, except perhaps the project operators. RNA World is a "small" project with limited human and financial resources; that's my perception. Crunchers are welcome, but they normally know the problems of running specific work units.

Checkpointing should be standard, but it isn't here. We can complain about this as often as we like, but complaining does not change the code. I lost 1000 hours with one single "daily" BOINC crash, and others lost much more crunching time, even with finished work units ... that's part of the game. :wave:
Coming together is a beginning, staying together is progress, working together is success.
Henry Ford

Jon Heels
Idle-Sammler
Posts: 6
Registered: 24.04.2013 17:54

Re: XXL Work Unit checkpoint

#6 Post by Jon Heels » 29.04.2013 23:59

X1900AIW,
Thank you for your post. When I said it was a shame that the XXL WUs take so long to run on my machine, this was my opinion and NOT a criticism of the RNA World project.

However, I did get the impression from your post that I was being criticised for not realising the potential problems of running the XXL WUs.

It will be a fair while before I can get a new computer, so my new computer, if and when it arrives, will be a lot more powerful than my present one as a consequence of Moore's law.

I am also a "hobby cruncher", and the loss of computer time due to system shutdowns is not really so bad. I just wish I had realised what would happen with these particular work units.

X1900AIW
TuX-omane
Posts: 2868
Registered: 05.01.2008 16:34

Re: XXL Work Unit checkpoint

#7 Post by X1900AIW » 30.04.2013 07:11

Upgrading the hardware is an option, but it mainly shifts your own expectations to a new level. Over the last few years I have steadily reduced my number of (powerful) computers and now concentrate on a single system with an efficient Ivy Bridge CPU, which can be tuned for anything from low-end to high-end crunching. At the same time I got rid of the power-hungry and noisy GPUs. I am quite happy with this solution.

To borrow the language of the folding@home project, RNA World plays at the "cutting edge of technology" when it comes to the absence of checkpointing and the very long run times; cutting edge means it makes you bleed if you are not careful. :wink: That's an elementary experience that can teach us something about the seriousness of life. Other projects do not offer us this learning effect. :lol:

The difference from folding@home lies in the pro forma deadlines, which can be extended an unlimited number of times, manually, thanks to Yoyo's honorable commitment. :good:

A second consequence: RNA World can be run on low-end hardware. I don't know whether such systems are in common use, but I appreciate this approach more than the kind of "high-end restricted access" at folding@home or GPUgrid, where bonus credits are the preferred part of the game and can be reached only with a huge investment in new or modern hardware as well as in electricity, with some exceptions.

Michael H.W. Weber
Club board member
Posts: 22419
Registered: 07.01.2002 01:00
Location: Marpurk

Re: XXL Work Unit checkpoint

#8 Post by Michael H.W. Weber » 30.04.2013 14:42

RNA World is indeed VERY demanding, absolutely agreed. :D
Thank you for giving it a try. We hope to come up with an improved version this year.

Michael.

robertmiles
XBOX360-Installer
Posts: 86
Registered: 23.02.2010 18:43
Location: northern Alabama, US

Re: XXL Work Unit checkpoint

#9 Post by robertmiles » 01.05.2013 00:53

Jon Heels,

I've read elsewhere that CPUs are now close to the limit of the region where Moore's law applies: the CPU and memory voltages are now hard to reduce any further while still getting the transistors to work, so about the only way left to get a speedup is to make them do more things AT THE SAME TIME, which works only when none of the items in a group depends on the results of any other item in the group. GPUs are very good at doing many things at the same time, but this project doesn't seem suitable for conversion to running on GPUs.
