What happens with some ECM_UC workunits? Very long running time

Everything about the project yoyo@home


Post by marsinph » 02.12.2018 19:02

Hello,
On three of my hosts, I have two very unusual WUs each.
Normally, I return a WU after a little more than one hour.
But these six WUs have already been running for between 17 and 22 hours and are stuck at 40%.
What these ECM 705.02 WUs have in common is the name pattern:
ecm_uc_1543642398_np_195_850e6_6_"xxx" (not P1, not P2!)
ecm_uc_1543642398_np_195_850e6_6_7300_0 (uses a constant 1.1 GB of RAM)
ecm_uc_1543642398_np_195_850e6_6_6535_1
ecm_uc_1543642398_np_195_850e6_6_7735_0
ecm_uc_1543642398_np_195_850e6_6_6660_0
ecm_uc_1543642398_np_195_850e6_6_725_1 (about 1 GB of RAM, but it varies)
and
ecm_uc_1543642398_np_195_850e6_6_365_0 (about 1 GB of RAM, but it varies)

Then there are other WUs whose names start with
ecm_uc_1543642398_np_195_2900e6_01_...... and which take about 8-9 hours to run!
But they are "P1" tasks, so that running time seems more or less normal.
I repeat, for months all my WUs finished after 60-90 minutes!
All those tasks use about 2 MB of RAM, and ecmwrapper about 1 MB.


What should we do with these very long WUs?
Restarting BAM is not a solution because there is no checkpoint!
marsinph

Post by yoyo » 02.12.2018 19:41

Yes, those ecm_uc_1543642398_np_195_850e6* workunits run long. I see runtimes from 15 to 30 hours. RAM consumption is high, but will not exceed 1.8 GB.

As for every ecm workunit (other than P1 or P2), a checkpoint is written every 20%.
Each ecm workunit runs 5 curves, and each curve has 2 stages: stage 1 runs long and needs little RAM, while stage 2 runs shorter but needs a lot of RAM. The runtime ratio between stage 1 and stage 2 is roughly 4:1.
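
A minimal sketch of the layout described above, assuming (this is not stated in the thread) that all five curves take the same time and that the 4:1 ratio applies within each curve. The function name `curve_schedule` and its parameters are hypothetical, chosen only for illustration.

```python
def curve_schedule(total_hours, curves=5, stage_ratio=(4, 1)):
    """Split a workunit's runtime into per-curve stage boundaries.

    Assumes equal-length curves (an assumption, not a project guarantee).
    Returns one tuple per curve, all times in hours:
    (curve_number, stage1_start, stage2_start, checkpoint_time)
    """
    stage1_part, stage2_part = stage_ratio
    per_curve = total_hours / curves
    stage1_len = per_curve * stage1_part / (stage1_part + stage2_part)
    schedule = []
    for i in range(curves):
        start = i * per_curve
        schedule.append((i + 1, start, start + stage1_len, start + per_curve))
    return schedule


if __name__ == "__main__":
    # 20 hours is an example value inside the 15 to 30 hour range quoted above.
    for curve, s1, s2, ckpt in curve_schedule(20.0):
        print(f"curve {curve}: stage 1 from {s1:.1f} h, "
              f"stage 2 (high RAM) from {s2:.1f} h, checkpoint at {ckpt:.1f} h")
```

Under these assumptions, a 20-hour workunit would spend about 3.2 hours in stage 1 and 0.8 hours in stage 2 per curve, so the high-RAM phase would be comparatively short and progress would only move in 20% steps.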

Post by marsinph » 02.12.2018 21:26

yoyo wrote: Yes, those ecm_uc_1543642398_np_195_850e6* workunits run long. I see runtimes from 15 to 30 hours. RAM consumption is high, but will not exceed 1.8 GB.

As for every ecm workunit (other than P1 or P2), a checkpoint is written every 20%.
Each ecm workunit runs 5 curves, and each curve has 2 stages: stage 1 runs long and needs little RAM, while stage 2 runs shorter but needs a lot of RAM. The runtime ratio between stage 1 and stage 2 is roughly 4:1.

Hello, thank you for the explanation.
For short WUs, there is a checkpoint every 1200 seconds! Why not for the huge WUs, which take so much longer to run?
As for the RAM use of the huge WUs: on all three of my hosts (all with the same CPU and RAM), ECM takes only a few megabytes!
Sometimes about one gigabyte, but not for long (about 10 minutes)!

About your ratio between stage 1 and stage 2 (I am not considering RAM here).
This is all a very rough estimate.
If stage 2 takes 25% of the stage 1 time, then when stage 1 takes 12 hours to run, stage 2 should need 3 hours to finish!
It does not!
Those monster WUs were at 20% after about 3 hours.
20 hours later, they were only at 40%! So I think the ratio is not 4:1 but 1:4.
That means those WUs will finish AFTER the deadline (only four days!).
The small WUs have a deadline of 10 days, the monsters only four days (running 24/7, without restart, logoff, ...)!


I hope the credits granted will also be proportionate!? On the same host (to allow a comparison):
A P1 WU runs on my host for 29,300 sec for a credit of 322.97 (ratio: 39.7 credits/hour).
An ECM_xy_ or hc WU runs for about 1.1 hours (3,900 sec for 76.98 credits) (ratio: 71 credits/hour).
Almost twice as much for the small WU!
This already shows: the longer the WU, the fewer the credits!
And that is without considering missing checkpoints, errors, and host restarts that lose the crunched part.
I cannot do any update that requires a restart on my host, because the very long WUs block everything!
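
The credit rates quoted above can be rechecked with a few lines. This is only a sketch of the arithmetic; the helper name `credits_per_hour` and the task labels are hypothetical, and the credit and runtime figures are simply the ones given in this post.

```python
def credits_per_hour(credits, runtime_seconds):
    """Convert granted credits and runtime in seconds to credits per hour."""
    return credits / (runtime_seconds / 3600.0)


# (credits, runtime in seconds) as quoted in the post; labels are illustrative.
tasks = {
    "P1 WU": (322.97, 29300),
    "short ECM WU": (76.98, 3900),
}

rates = {name: credits_per_hour(c, s) for name, (c, s) in tasks.items()}

if __name__ == "__main__":
    for name, rate in rates.items():
        print(f"{name}: {rate:.1f} credits/hour")
    ratio = rates["short ECM WU"] / rates["P1 WU"]
    print(f"short / P1 ratio: {ratio:.2f}")
```

With these inputs the short WU comes out at roughly 1.8 times the hourly rate of the P1 WU, which is close to, but a little under, a factor of two.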

My conclusion: the next time I receive such a "monster", I will cancel it.
Sorry. That is not the goal of the research, but I will not block some hosts for nothing.
Look at my signature; you will see I crunch for several projects, as does my team.

I will let the monster finish on one host, hoping for credits in line with the running time.
The others that are still running I will abort if they need more than 8 hours to completion.
Once again, sorry, but I need to do maintenance on my hosts.

Suggestion: reduce the size of the WUs! The monster requires 266 TFLOPs, yes, two hundred sixty-six TERA FLOPs!
On CPU only!
For comparison, an i7-2600K overclocked to 4.0 GHz delivers 4.22 GFLOPS (and 13.5 GINOPS for integer operations).
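
As a rough cross-check of the two figures above, the following sketch divides the quoted work by the quoted benchmark speed. It assumes that 266 TFLOP is the total number of floating-point operations estimated for one WU and that the 4.22 GFLOPS benchmark figure is sustained for the whole run; neither assumption is confirmed in this thread, and the function name `estimated_hours` is hypothetical.

```python
def estimated_hours(total_flop, flops_per_second):
    """Idealised runtime in hours: total work divided by a constant speed."""
    return total_flop / flops_per_second / 3600.0


TOTAL_FLOP = 266e12        # 266 TFLOP, as quoted above (assumed to be total work)
BENCHMARK_FLOPS = 4.22e9   # 4.22 GFLOPS, as quoted above (assumed sustained)

hours = estimated_hours(TOTAL_FLOP, BENCHMARK_FLOPS)

if __name__ == "__main__":
    print(f"idealised runtime: {hours:.1f} hours")
```

Under these assumptions the idealised runtime is about 17.5 hours, which falls inside the 15 to 30 hour range reported earlier in the thread. Real runtimes depend on how closely the application matches the benchmark speed.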


Best regards
marsinph

Post by marsinph » 03.12.2018 15:07

Hello,
As expected, the credit for the "monster" WU is much lower than for small WUs! 1,713 credits / 124,348 sec, so about 50 credits/hour.

How can I avoid receiving those WUs with an expected running time of 2 days?
The one above was predicted to run 9 hours; it took 34 hours!
I changed the local preferences in BAM to accept only 0.5 days of work, but I am still receiving "monsters" with a deadline of 5 days!
Considering the above and a ratio of 1/3 (expected/actual running time), the received WUs will never finish before the deadline!
So I need to abort them. That is not the goal of the project, but it is the only solution, unless someone has an idea how to avoid getting the ECM_UC_...._NP_195..... WUs?
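
The deadline argument above can be sketched as a simple scaling: take the one observed pair (9 hours predicted, 34 hours actual) and apply the same factor to a new estimate. This assumes the same slowdown applies to other WUs, which is only a guess from a single data point; the function names `projected_hours` and `meets_deadline` are hypothetical.

```python
def projected_hours(estimated_hours, ref_estimated=9.0, ref_actual=34.0):
    """Scale an estimated runtime by the observed actual/estimated ratio."""
    return estimated_hours * ref_actual / ref_estimated


def meets_deadline(estimated_hours, deadline_days):
    """True if the projected runtime fits in the deadline when running 24/7."""
    return projected_hours(estimated_hours) <= deadline_days * 24.0


if __name__ == "__main__":
    estimate = 48.0  # a WU with an expected running time of 2 days
    print(f"projected: {projected_hours(estimate) / 24.0:.1f} days")
    print(f"fits a 5-day deadline: {meets_deadline(estimate, 5)}")
```

With the figures from this post, a WU estimated at 2 days would project to about 7.6 days of continuous running, which is beyond a 5-day deadline.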
Best regards
marsinph

Post by Michael H.W. Weber » 04.12.2018 12:34

Excellent. I need to re-connect my machines to this project to get such nice long tasks.

Michael.