Long running work unit

Everything about the project RNA World
Ananas
WU-Schieber
Posts: 1184
Registered: 27.04.2008 18:37
Location: Nordlichter Köln

Re: Long running work unit

#325 Post by Ananas » 03.12.2011 12:53

The runtime estimate is very rough and, AFAIK, the 100% point is set to that estimate.

If the runtime has been underestimated, the task hangs at 100% for quite a while (this can even be weeks); if it has been overestimated, the task finishes before 100% is reached.
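
To make the behaviour above concrete, here is a minimal sketch. It assumes the progress bar is simply elapsed time divided by the runtime estimate, capped at 100%, which is my reading of the description and not the actual RNA World or BOINC code. The function name and all numbers are hypothetical.

```python
def displayed_progress(elapsed_hours: float, estimated_hours: float) -> float:
    """Fraction shown in the progress bar, capped at 1.0 (100%).

    Assumption: progress = elapsed / estimate. This is an illustration,
    not the project's real implementation.
    """
    if estimated_hours <= 0:
        raise ValueError("estimated_hours must be positive")
    return min(elapsed_hours / estimated_hours, 1.0)


# Underestimated task: the estimate says 40 h, the task really needs far longer.
print(displayed_progress(40, 40))    # bar reaches 100% after 40 h
print(displayed_progress(399, 40))   # still shows 100% while work continues

# Overestimated task: the estimate says 100 h, the task finishes after 60 h.
print(displayed_progress(60, 100))   # task ends before the bar fills
```

With a cap like this, the bar carries no information once the estimate has been exceeded, which matches tasks sitting at 100% for days or weeks.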
vi BOINC/checkin_notes
:1,$s/bug/feature/g
:wq!

Do biologists actually tell each other small-RNA jokes?

Zachariassen

Re: Long running work unit

#326 Post by Zachariassen » 06.12.2011 18:57

Hello yoyo
Could you please extend the following task?

13612793 - 5732991

It has run for almost 400 h, and the progress has been at 100% for a week or so.
I hope I can deliver this WU in 2011 (well, if I can avoid another power outage... :wink: )

raddoc
Idle-Sammler
Posts: 4
Registered: 09.12.2011 18:01

Re: Long running work unit

#327 Post by raddoc » 09.12.2011 18:11

This has been at 100% for some time and I have over 166 hours of work. Should I abort?


Name: cms_GA[e20-30MB_Lin64f]_1_Oryza-sativa-Japonica-Group_CM000145.lin.EMBL_RF00028_Intron_gpI_1307967123_61768_23
Workunit: 5288673
Created: 16 Nov 2011 18:10:44 UTC
Sent: 17 Nov 2011 0:18:42 UTC
Received: ---
Server state: Over
Outcome: No reply
Client state: New
Exit status: 0 (0x0)
Computer ID: 10516
Report deadline: 7 Dec 2011 0:18:42 UTC
Run time: 0
CPU time: 0
stderr out:
Validate state: Initial
Claimed credit: 0
Granted credit: 0
Application version: cmsearch XXL (large) 1.0.2 v0.31

mxplm
Partikel-Strecker
Posts: 966
Registered: 14.09.2009 13:56
Location: Bielefeld

Re: Long running work unit

#328 Post by mxplm » 13.12.2011 08:42

raddoc wrote: this is at 100% for some time and I have over 166 hours of work - should I abort?
Someone already finished it, but now the WU status is "too many error results". Perhaps yoyo can rescue this one.

@yoyo: Could you please extend Result 13429171 for me? Either you changed the SSH credentials or I forgot them.
:Wiki user page: (About me)
:fold.it: (Helping by gaming)

yoyo
Vereinsvorstand
Posts: 7855
Registered: 17.12.2002 14:09
Location: Berlin

Re: Long running work unit

#329 Post by yoyo » 13.12.2011 09:47

raddoc wrote: this is at 100% for some time and I have over 166 hours of work - should I abort?
BOINC has already canceled and deleted this workunit because it got too many errors, so you should abort your result as well.
yoyo
HELP out in the Rechenkraft wiki, here is what needs doing.
Wiki - FAQ - Association - Chat

yoyo
Vereinsvorstand
Posts: 7855
Registered: 17.12.2002 14:09
Location: Berlin

Re: Long running work unit

#330 Post by yoyo » 13.12.2011 09:48

mxplm wrote: @yoyo: Could you please extend Result 13429171 for me?
I extended it.
yoyo

ftpd

Re: Long running work unit

#331 Post by ftpd » 13.12.2011 16:36

Hi YoYo,

Can you please extend two WUs?
Hostid = 4208

5731106 - 13428348
5714106 - 13394064

Thx,

Ton

yoyo
Vereinsvorstand
Posts: 7855
Registered: 17.12.2002 14:09
Location: Berlin

Re: Long running work unit

#332 Post by yoyo » 13.12.2011 17:33

Extended.
yoyo

robertmiles
XBOX360-Installer
Posts: 75
Registered: 23.02.2010 18:43
Location: northern Alabama, US

Re: Long running work unit

#333 Post by robertmiles » 14.12.2011 01:38

Michael H.W. Weber wrote: Indeed, these WUs were not defective but just not complete. :roll: The problem is that the progress bar does not really work well, as described many times before. Again, RNA World has stochastic elements which underlie the calculations, and that prevents an accurate runtime prediction. In most cases the progress bar is fairly OK, but in some exceptional cases, such as those described above, it fails completely and may indicate runtimes of only 10% of the real runtime. We are of course sad about that, because there is no way to solve this problem except for having checkpointing.

Michael.
I've seen a suitable way in a previous application over at Rosetta@Home. As the reported progress approaches 100%, just decrease the size of the steps of reported progress so that the progress keeps increasing, but never reaches 100% until the workunit is actually finished.
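
A rough sketch of that idea, in case it helps the discussion. This is not Rosetta@Home's actual formula (I don't know what they used); the exponential curve, the 0.999 cap, and the function name are my own assumptions, chosen only because they give ever smaller steps as the bar approaches 100%.

```python
import math


def asymptotic_progress(elapsed_hours: float,
                        estimated_hours: float,
                        finished: bool = False) -> float:
    """Progress that keeps rising but stays below 1.0 until the task is done.

    Assumed curve: 1 - exp(-elapsed / estimate). Each additional hour adds
    a smaller step than the hour before it.
    """
    if finished:
        return 1.0
    if estimated_hours <= 0:
        raise ValueError("estimated_hours must be positive")
    raw = 1.0 - math.exp(-elapsed_hours / estimated_hours)
    # Cap below 1.0 so floating-point rounding can never report 100% early.
    return min(raw, 0.999)


# Hypothetical task with a 40 h estimate that really runs for 400 h.
for hours in (0, 20, 40, 80, 160, 400):
    print(hours, round(asymptotic_progress(hours, 40), 4))
print("done", asymptotic_progress(400, 40, finished=True))
```

The trade-off is that the bar no longer means "fraction of the estimate used up", so a correctly estimated task would show only about 63% at the moment its estimate is reached.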

Michael H.W. Weber
Vereinsvorstand
Posts: 20768
Registered: 07.01.2002 01:00
Location: Marpurk

Re: Long running work unit

#334 Post by Michael H.W. Weber » 14.12.2011 01:40

Too much effort for very little outcome.

Michael.
Support, cooperate, and construct instead of demand, compete, and consume.

http://signature.statseb.fr I: Broken page A
http://signature.statseb.fr II: Broken page B

robertmiles
XBOX360-Installer
Posts: 75
Registered: 23.02.2010 18:43
Location: northern Alabama, US

Re: Long running work unit

#335 Post by robertmiles » 14.12.2011 03:13

Michael H.W. Weber wrote:
ConflictingEmotions wrote: Really, you need to figure out why these intron WUs are so badly underestimated.
Well, we know that but, unfortunately, it cannot be fixed due to stochastic elements in the code.

Michael.
Do you at least get a chance to modify the estimated runtimes after the initial calculation? For example, multiply it by the typical ratio for which intron runtimes are incorrect?

Or, if it's easier, divide the estimated speed of the CPU by that ratio?
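
To show what I mean, a small sketch. It assumes the project has a history of (estimated, actual) runtimes for finished intron WUs; the function names and the numbers are made up for illustration, and using the median as the "typical ratio" is just one possible choice.

```python
from statistics import median


def correction_factor(history):
    """Typical actual/estimated ratio from (estimated_hours, actual_hours) pairs.

    Returns 1.0 (no correction) when there is no usable history.
    """
    ratios = [actual / estimated for estimated, actual in history if estimated > 0]
    return median(ratios) if ratios else 1.0


def corrected_estimate(raw_estimate_hours, factor):
    """Scale the initial estimate by the typical ratio."""
    return raw_estimate_hours * factor


# Hypothetical history of finished intron WUs: (estimated h, actual h).
history = [(10, 40), (20, 100), (5, 50)]
factor = correction_factor(history)      # ratios are 4, 5 and 10
print(corrected_estimate(12, factor))    # a 12 h forecast scaled by the median ratio
```

Dividing the estimated CPU speed by the same factor has the same effect, since the estimated runtime is work divided by speed. Because of the stochastic elements Michael mentions, a typical ratio would only shift the average; individual outliers could still be far off.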

Ananas
WU-Schieber
Posts: 1184
Registered: 27.04.2008 18:37
Location: Nordlichter Köln

Re: Long running work unit

#336 Post by Ananas » 14.12.2011 06:44

That's what the first cmsearch call is supposed to do; it is extremely unreliable, though.

Code:

wrapper: running unzip_cpufeat (cmsearch.zip)
wrapper: no checkpoint file found
wrapper: running cmsearch (--forecast 1 -T 0.0 --fil-T-hmm 0.0 --fil-T-qdb 0.0 RF00894_mir-790.cm Equus-caballus-(horse)_CM000405.lin.EMBL.fasta)
forecast.txt found.
This "--forecast" call is a runtime forecast. Check forecast.txt in your slot directories and you will see that it contains an estimated runtime. The file is human-readable.
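
If you don't want to open every slot by hand, a small sketch like the following collects the files. It assumes the usual BOINC layout with a "slots" folder inside the BOINC data directory; the function name is made up, and since the exact format of forecast.txt isn't described here, the sketch only returns the raw text without trying to parse it.

```python
from pathlib import Path


def read_forecasts(boinc_data_dir):
    """Return {slot name: raw forecast.txt text} for every slot that has one.

    Assumption: slot directories live under <boinc_data_dir>/slots.
    """
    slots = Path(boinc_data_dir) / "slots"
    forecasts = {}
    if not slots.is_dir():
        return forecasts
    for slot in sorted(slots.iterdir()):
        forecast_file = slot / "forecast.txt"
        if forecast_file.is_file():
            # The file is human-readable, so keep it as plain text.
            forecasts[slot.name] = forecast_file.read_text(errors="replace").strip()
    return forecasts
```

Call it with your own data directory, for example `read_forecasts("/var/lib/boinc-client")` on a typical Linux package install (that path is an assumption, adjust it for your system), and print the returned dictionary.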

P.S.: This applies only to cmsearch. cmcalibrate uses a loop count for the progress; the last loop needs somewhat more time, so it isn't an exact measure, but it is better than nothing.
