31.8 hour WU for 509.72 credits unacceptable
- Michael H.W. Weber
- Association board member
- Posts: 22435
- Joined: 07.01.2002 01:00
- Location: Marpurk
Re: 31.8 hour WU for 509.72 credits unacceptable
He got exactly the credits for the work that was actually done, which in RNA World is similar to, if not more than, what other projects would have granted. Also, 32 hours were not actually spent on computation.
Michael.
Support, cooperate and construct instead of demand, compete and consume.
Re: 31.8 hour WU for 509.72 credits unacceptable
oh shi
http://www.rnaworld.de/rnaworld/result. ... id=1291163
Seems this task was restarted a million times (and wasted 6 days).
Could you do something about it in the next app version, please?
I crunch for Ukraine
Re: 31.8 hour WU for 509.72 credits unacceptable
rilian wrote: oh shi http://www.rnaworld.de/rnaworld/result. ... id=1291163 seems this task was restarted a million times (and wasted 6 days); could you do something about it in the next app version, please
I'm not 100% sure, but I think you're wrong.
I might be wrong, but this "wrapper: windows. no checkpoint image" is not a restart.
A restart is always a new cmsearch call, like: "wrapper: running cmsearch (-o out cmfile in)"
You have that line only once, so it actually did run to the bitter end, where the RAM was exhausted.
P.S.: Compare yours to this one: http://www.rnaworld.de/rnaworld/result. ... id=1307857 which (IMO) has a restart in it.
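To make the check above concrete, here is a rough sketch of counting cmsearch launches in a task's stderr text. The two log strings are the ones quoted in this post; the function names are made up for illustration, and this is not part of the wrapper or of BOINC.

```python
# Marker taken from the stderr line quoted above.
LAUNCH_MARKER = "wrapper: running cmsearch"


def count_cmsearch_launches(stderr_text: str) -> int:
    """Count stderr lines that report a cmsearch launch."""
    return sum(1 for line in stderr_text.splitlines() if LAUNCH_MARKER in line)


def restarts_from_scratch(stderr_text: str) -> int:
    """Treat every launch after the first one as a restart."""
    return max(count_cmsearch_launches(stderr_text) - 1, 0)


# Shortened, illustrative stderr: one launch, several "no checkpoint image" lines.
sample = (
    "wrapper: running cmsearch (-o out cmfile in)\n"
    "wrapper: windows. no checkpoint image\n"
    "wrapper: windows. no checkpoint image\n"
)
print(restarts_from_scratch(sample))
```

With a single launch line the count of restarts stays at zero, no matter how often the "no checkpoint image" line shows up.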
Last edited by Ananas on 15.03.2010 00:20, edited 1 time in total.
vi BOINC/checkin_notes
:1,$s/bug/feature/g
:wq!
Do biologists actually tell each other small-RNA jokes?
Re: 31.8 hour WU for 509.72 credits unacceptable
OK, since that machine is crunching 24/7 (no machine restarts etc.) and the CPU time is 4000 sec while the run time is 500000 sec, this looked much like the symptoms of the WU in this topic; that's why I wrote "restarted".
I don't know what "wrapper: windows. no checkpoint image" is either.
Is there anything in the project folder that may help investigate this? The machine is quite remote, but I can try to get logs.
I crunch for Ukraine
Re: 31.8 hour WU for 509.72 credits unacceptable
It uploads what it has (at least most of it), though some stderr output currently seems to get lost.
We have a thread here with a report that cmsearch sometimes sits there doing nothing:
http://www.rechenkraft.net/phpBB/viewto ... 30#p116830
It's in German, but the screenshot with BOINC and the Task Manager should show what it is about. This might explain the big difference between CPU time and wall-clock time. We do not know yet why this happens.
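For a sense of scale, a quick back-of-the-envelope calculation with the rounded figures rilian gave (4000 sec CPU time, 500000 sec run time). The function name is only for illustration.

```python
def cpu_utilisation(cpu_seconds: float, wall_seconds: float) -> float:
    """Fraction of wall-clock time in which the task actually used the CPU."""
    if wall_seconds <= 0:
        raise ValueError("wall_seconds must be positive")
    return cpu_seconds / wall_seconds


# Rounded figures from rilian's post.
cpu_s, wall_s = 4_000, 500_000

ratio = cpu_utilisation(cpu_s, wall_s)
idle_days = (wall_s - cpu_s) / 86_400  # 86,400 seconds per day

print(f"CPU utilisation: {ratio:.1%}, idle: {idle_days:.1f} days")
```

That works out to under 1% CPU utilisation and a bit less than 6 idle days, which fits the "wasted 6 days" from the earlier post.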
vi BOINC/checkin_notes
:1,$s/bug/feature/g
:wq!
Do biologists actually tell each other small-RNA jokes?
Re: 31.8 hour WU for 509.72 credits unacceptable
Thank you Michael for explaining the situation.
I will be re-attaching to the project shortly.
Rilian, unfortunately that is not my WU. Mine completed successfully and the claimed credit was 458.15.
And my WU was restarted only a few times, as the BOINC Manager controls the time slots.
Thank you.
Re: 31.8 hour WU for 509.72 credits unacceptable
There are two types of restart.
The "soft / non-destructive" restart happens when the WU is just suspended (without being thrown out of memory) and BOINC reactivates it. I guess that's where the wrapper looks for this "checkpoint image" (which can only exist on Linux32) - Yoyo knows the wrapper very well, maybe he can confirm this?
The other one sometimes seems to happen after a crash, which means that it actually was a restart from scratch - a new cmsearch task with a fresh start from 0% - Gentilli, that's (most likely) what your WU did. So the first run gets totally lost (including its CPU time) and only the CPU time of the second (or final) run is reported to BOINC.
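A small sketch of what this means for the reported CPU time, assuming (as guessed above) that only the final run counts. The function name and the hour figures are made up purely for illustration; they are not taken from any real result.

```python
def split_cpu_time(run_cpu_seconds):
    """Return (reported, lost) CPU seconds for a task restarted from scratch.

    Assumption from the post above: only the final run's CPU time is
    reported, all earlier runs are lost.
    """
    if not run_cpu_seconds:
        return 0.0, 0.0
    reported = float(run_cpu_seconds[-1])
    lost = float(sum(run_cpu_seconds[:-1]))
    return reported, lost


# Made-up example: a crash after 3 h of CPU time, then a full 5 h run.
reported, lost = split_cpu_time([3 * 3600, 5 * 3600])
print(f"reported: {reported / 3600:.1f} h, lost: {lost / 3600:.1f} h")
```

In this made-up case the host did 8 hours of CPU work but only 5 hours would show up in the result.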
vi BOINC/checkin_notes
:1,$s/bug/feature/g
:wq!
Do biologists actually tell each other small-RNA jokes?
Re: 31.8 hour WU for 509.72 credits unacceptable
Ananas wrote: http://www.rechenkraft.net/phpBB/viewto ... 30#p116830
You are not authorised to read this forum.
Ananas wrote: There are two types of restart. The "soft / non-destructive" restart happens when the WU is just suspended (without being thrown out of memory) and BOINC reactivates it. I guess that's where the wrapper looks for this "checkpoint image" (which can only exist on Linux32) - Yoyo knows the wrapper very well, maybe he can confirm this?
If the WU is "left in memory while suspended", why should it search for a checkpoint anyway?
I crunch for Ukraine
Re: 31.8 hour WU for 509.72 credits unacceptable
Ananas wrote: http://www.rechenkraft.net/phpBB/viewto ... 30#p116830
rilian wrote: You are not authorised to read this forum.
Oops, sorry ... in short: it says that BOINC shows the WU as "running", but the Windows Task Manager shows that it does not consume CPU time.
vi BOINC/checkin_notes
:1,$s/bug/feature/g
:wq!
Do biologists actually tell each other small-RNA jokes?
Re: 31.8 hour WU for 509.72 credits unacceptable
Ananas wrote: oops, sorry ... in short: it says that BOINC shows the WU as "running", but the Windows Task Manager shows that it does not consume CPU time.
The WU is just waiting for Christmas.
(Dunno if this joke is popular in the English IT community, but it is quite well known in Russian.)
I crunch for Ukraine
Re: 31.8 hour WU for 509.72 credits unacceptable
rilian wrote: ... if wu is "left in memory while suspended", why should it search for checkpoint anyway?
I can only guess: when the application says that it knows about checkpoints (i.e. it has a checkpoint filename in the WU), the wrapper tries to evaluate the timestamp of that checkpoint file in order to see whether the WU has checkpointed or not (information taken from the Berkeley sample code).
The reason there is a checkpoint filename is that the WU generator creates the same WU XML files independently of the operating system (even though only Linux32 can handle it).
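Roughly, the timestamp idea guessed at above could look like the sketch below. This is only an illustration of comparing a file's modification time against the last value seen; it is not the actual wrapper code, and the function name and the file name "checkpoint.img" are made up.

```python
import os
import tempfile
from typing import Tuple


def has_checkpointed(checkpoint_path: str, last_seen_mtime: float) -> Tuple[bool, float]:
    """Return (checkpointed, newest_mtime) for a checkpoint file.

    A missing file counts as "no checkpoint", which is what one would
    expect on platforms where no checkpoint image is ever written.
    """
    try:
        mtime = os.path.getmtime(checkpoint_path)
    except OSError:
        return False, last_seen_mtime
    return mtime > last_seen_mtime, max(mtime, last_seen_mtime)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "checkpoint.img")  # hypothetical file name
    seen = 0.0

    first, seen = has_checkpointed(path, seen)   # file does not exist yet

    with open(path, "w") as fh:
        fh.write("state")
    os.utime(path, (1_000.0, 1_000.0))           # fixed timestamp for the demo

    second, seen = has_checkpointed(path, seen)  # file is newer than last seen
    third, seen = has_checkpointed(path, seen)   # nothing changed since then

    print(first, second, third)
```

If the file never appears, the check simply keeps reporting "no checkpoint", which would match a WU that names a checkpoint file on a platform that cannot write one.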
___________
We know this Christmas joke in German too - and if we want to express an even longer delay (actually infinite), we say "It will happen when Christmas and Easter fall on the same date".
vi BOINC/checkin_notes
:1,$s/bug/feature/g
:wq!
Do biologists actually tell each other small-RNA jokes?