Rechenkraft.net e.V.


All times are UTC + 1 hour [ DST ]




Posted: 31.12.11 20:56
Idle-Sammler

Joined: 31.12.11 20:51
Posts: 3
I have had to abort all the CMSearch (large) tasks because their state is lost on any kind of suspension, for whatever reason, and they appear to require anywhere from 2 days to more than 5 days of uninterrupted processing on an Intel i7 core. This is a waste of CPU time. Either the work units need to be made smaller so that they stand a reasonable chance of completing in under 1 day, or they need to save their state so that they can resume where they left off instead of discarding possibly days of CPU time after a suspension.


 
Posted: 31.12.11 21:02
Vereinsvorstand

Joined: 17.12.02 14:09
Posts: 6800
Location: Berlin
Neither is possible. That is why we run them as a separate XXL app, which you can deselect in your preferences. There is, after all, a small group of XXL fans.
yoyo

_________________
HELP out in the Rechenkraft wiki, here is what needs doing.
Wiki - FAQ - Association - Chat


 
Posted: 01.01.12 10:18
Idle-Sammler

Joined: 31.12.11 20:51
Posts: 3
That's a pity.

Where do I find the preference? I don't see it in either the BOINC Manager or the BAM! web pages. I'd like to disable any XXL WUs, as they are unlikely to ever finish and the CPU time would be better spent on other tasks or projects.

Thanks,
David.


 
Posted: 01.01.12 10:31
Vereinsvorstand

Joined: 07.01.02 01:00
Posts: 16514
Location: anne Lahn
DavidHoney wrote:
...I'd like to disable any XXL WUs, as they are unlikely to ever finish and the CPU time would be better spent on other tasks or projects.

These tasks definitely finish, give exorbitant credits as compensation for the (unavoidably) long runtimes, and are worth processing, as they correspond to interesting but complex RNA families. :D

Michael.

P.S.: And of course, the CMSEARCH XXL applications can be deselected in the RNA World project settings under the "Run only the selected applications" option.

_________________
Cooperation rather than competition is nature's most powerful principle. Or why do you think eukaryotes are the crown of evolution on earth?


 
Posted: 01.01.12 11:41
Idle-Sammler

Joined: 31.12.11 20:51
Posts: 3
Quote:
These tasks definitely finish, give exorbitant credits as compensation for the (unavoidably) long runtimes, and are worth processing, as they correspond to interesting but complex RNA families


It seems they will only complete if you never need to suspend BOINC or the project for at least some days, or possibly more than a week. The moment you have to suspend the project, suspend use of the processor/GPU for running BOINC project tasks, or reboot the machine (for example to install a Windows update or to install/uninstall an application), all the work for that RNA task is discarded. On my machine, where such a suspension is likely to be required at least twice a week, the result is that none of those cmsearch XXL tasks will ever complete. Thanks for the pointer on where I can disable cmsearch XXL WUs from running - I've now done that.

This is a pity. The applications from other BOINC projects save their state on suspension so that they can resume without discarding all the work achieved on them to date. This is the most user-friendly design, and one to which BOINC apps should aspire. After all, it's my machine, and when I need to run my own resource-intensive apps, I should be able to suspend BOINC so that resources are relinquished and I can run those applications.

David.


 
Posted: 01.01.12 12:03
Task-Killer

Joined: 10.02.10 22:26
Posts: 717
Location: Berlin
DavidHoney wrote:
The applications from other BOINC projects save their state on suspension so that they can resume without discarding all the work achieved on them to date. This is the most user-friendly design, and one to which BOINC apps should aspire.


Well... it's not as if we (the Rechenkraft.net association) aren't trying to make this possible, BUT the application wasn't originally developed for BOINC, and the coders didn't think it necessary to implement checkpoints. We'll have to make do with what we've got, and unless one of us can find a way to make checkpoints possible in our WUs, we're unfortunately stuck without them, sorry.
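
For anyone wondering what "implementing checkpoints" would roughly involve at the application level, here is a minimal Python sketch. It is only an illustration of the general idea, not how cmsearch or the RNA World wrapper works: the file name `checkpoint.json`, the function names and the toy workload (summing squares) are all made up for this example.

```python
import json
import os
import tempfile

CHECKPOINT = "checkpoint.json"  # hypothetical file name, not used by any real app


def load_state(path=CHECKPOINT):
    """Return the saved state, or a fresh one if no checkpoint exists."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {"next_item": 0, "total": 0}


def save_state(state, path=CHECKPOINT):
    """Write to a temp file, then rename, so a crash never leaves a torn file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
    os.replace(tmp, path)


def run(n_items, path=CHECKPOINT, stop_after=None, every=10):
    """Sum the squares of 0..n_items-1, checkpointing every `every` items.

    `stop_after` simulates a suspension after that many items in this run.
    Returns the total when finished, or None if the run was interrupted.
    """
    state = load_state(path)
    done_this_run = 0
    while state["next_item"] < n_items:
        if stop_after is not None and done_this_run >= stop_after:
            return None  # work done since the last checkpoint is lost
        i = state["next_item"]
        state["total"] += i * i
        state["next_item"] = i + 1
        done_this_run += 1
        if state["next_item"] % every == 0:
            save_state(state, path)
    save_state(state, path)
    return state["total"]
```

Calling `run(100, stop_after=25)` and then `run(100)` resumes from the last saved item instead of starting over. The hard part in a real science application is that its internal state is far more complex than two integers, which is why this cannot simply be bolted on from outside.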

_________________
Kind regards
MReed


 
Posted: 01.01.12 12:44
Vereinsvorstand

Joined: 07.01.02 01:00
Posts: 16514
Location: anne Lahn
DavidHoney wrote:
It seems they will only complete if you never need to suspend BOINC or the project for at least some days, or possibly more than a week. The moment you have to suspend the project, suspend use of the processor/GPU for running BOINC project tasks, or reboot the machine (for example to install a Windows update or to install/uninstall an application), all the work for that RNA task is discarded. On my machine, where such a suspension is likely to be required at least twice a week, the result is that none of those cmsearch XXL tasks will ever complete. Thanks for the pointer on where I can disable cmsearch XXL WUs from running - I've now done that.

This is a pity. The applications from other BOINC projects save their state on suspension so that they can resume without discarding all the work achieved on them to date. This is the most user-friendly design, and one to which BOINC apps should aspire. After all, it's my machine, and when I need to run my own resource-intensive apps, I should be able to suspend BOINC so that resources are relinquished and I can run those applications.

David.

Yes, it is correct and documented in the FAQ that RNA World supports checkpointing only on 32-bit Linux machines that have memory randomization disabled in the kernel. However, you should activate the "keep application in memory" switch in the general BOINC settings, and not only for the RNA World project, to avoid losing data when a task is paused. Sending Windows into hibernation should then NOT result in loss of the computational results and should consequently allow you to turn your machine off for a while (in hibernation mode only!).

As described on many earlier occasions in various discussion forums, we are not keen on implementing checkpointing at the science application level, for several reasons. Among these is the fact that the current application would have to be rewritten, and this application is not developed by our team. The second most important argument is that, unlike other distributed computing projects, our project already consists of multiple applications and will be massively extended in the future. We therefore require a universal checkpointing mechanism integrated into BOINC, as we cannot rewrite all the applications each time a new version is released. Such a universal approach would also benefit all the other projects and is therefore, in my view, of the utmost importance. Unfortunately, the BOINC developers seem to be making no effort to take this request seriously. As a consequence, I cannot rule out that RNA World might one day migrate to a different, more advanced DC infrastructure that satisfies our needs in a more timely fashion.

Michael.



 
Posted: 08.01.12 17:47
Idle-Sammler

Joined: 26.12.11 18:06
Posts: 6
I could live with the lack of suspend if the run time could be counted on to be close to what was predicted.

The job I'm currently running was supposed to have a run time of around 140 hours. It hit 100% after 333.5 hours at high priority, has since run another 381 hours at high priority, and still hasn't finished. It would have been nice to know what I was getting into.
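
To put those figures side by side, a quick Python calculation using only the numbers quoted in this post (the variable and function names are just for illustration):

```python
predicted_hours = 140.0        # estimate shown when the task started
hours_to_100_percent = 333.5   # high-priority hours until progress hit 100%
hours_after_100 = 381.0        # further hours run since then


def overrun(predicted, actual):
    """Return how many times longer than predicted the task has run."""
    return actual / predicted


elapsed = hours_to_100_percent + hours_after_100
print(f"elapsed so far: {elapsed} h")
print(f"overrun factor: {overrun(predicted_hours, elapsed):.1f}x")
```

That is 714.5 hours so far, or roughly 5.1 times the predicted run time, with the task still unfinished.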


 
Posted: 08.01.12 23:19
WU-Schieber

Joined: 27.04.08 18:37
Posts: 1184
Location: Nordlichter Köln
@Xenu: disabling XXL work if you prefer shorter tasks (described above: "Run only the selected applications") should relax the situation a bit.

Edit: The first call to the cmsearch program estimates the runtime to the 100% mark, and you're right, the estimate is sometimes not very close to reality:

wrapper: no checkpoint file found
wrapper: running cmsearch (--forecast 1 -T 0.0 --fil-T-hmm 0.0 --fil-T-qdb 0.0
RF00976_mir-583.cm Ornithorhynchus-anatinus-(platypus)_CM000409.lin.EMBL.fasta)
forecast.txt found.


This might be caused by the CPU-specific optimizations, which probably work better for the forecast than for the real calculation.

_________________
vi BOINC/checkin_notes
:1,$s/bug/feature/g
:wq!

Do biologists actually tell each other small-RNA jokes?


 
Posted: 10.01.12 17:34
Idle-Sammler

Joined: 26.12.11 18:06
Posts: 6
Thanks, I disabled XXL a little over a month ago.

I don't mind jobs that take a really long time, but by now it's been at 100% for close to 3 weeks. Will it complete today? Next week? The week after? Is it caught in a loop and will never complete? I've really given up guessing, but would hope that the coders could figure out some way of preventing this sort of thing.


 
Posted: 10.01.12 17:41
Vereinsvorstand

Joined: 17.12.02 14:09
Posts: 6800
Location: Berlin
Which workunit ID are you talking about?
yoyo



 
Posted: 10.01.12 19:18
Idle-Sammler

Joined: 26.12.11 18:06
Posts: 6
That would be WU 5759022, now at 764 hours on a 2.6 GHz core.


 