Anyone having issues with ecm 700.02 WUs?
-
- Idle-Sammler
- Posts: 6
- Joined: 18.12.2016 11:58
Anyone having issues with ecm 700.02 WUs?
Hi all,
I've been crunching some yoyo@home tasks and I have some ecm 700.02 tasks that start off needing 6 hours to crunch.
I now have at least 6 of these tasks which are now only at 20% completion after nearly 12 hours...so, do I leave them to finish or do I abort them and start some of the other WUs in the cache?
regards
Tim
PS The deadline for these is fast approaching and if the tasks are all going to take this long, then some will not be finished in time...
CPU is a 3.6GHz Intel i7, with 6 cores running BOINC...so, I cannot see the CPU is an issue.
Re: Anyone having issues with ecm 700.02 WUs?
Can you give me the resultID's of those tasks?
Re: Anyone having issues with ecm 700.02 WUs?
Yes, same problem. Mine have been running 36 hrs, some at 80% complete, some as low as 20%. 1 WU ended and only got 250 credits; if another ends with the same amount of credit, I'm cancelling the rest and opting out of ECM.
-
- Idle-Sammler
- Posts: 6
- Joined: 18.12.2016 11:58
Re: Anyone having issues with ecm 700.02 WUs?
Hi
yoyo wrote: Can you give me the resultID's of those tasks?
Thanks for the msg.
It's a bit difficult to give these, as the "Results for computer" page doesn't list the WU names - so I have to check each one in BOINC Manager and hope I get the right file names...
So, the issue is with ecm_op_1481818542_444741478546331_17M.C235 files:
_2515_0 - this one is currently at 80% after 23 hours, and the remaining time is just increasing...
_2965_0 - now on 20% after 6+ hrs - remaining 1d 5hr
_2920_0 - now on 20% after 6+ hrs - remaining 1d 3hr
_2960_0 - now on 20% after 6+ hrs - remaining 1d 2hr
_2415_0 - now on 20% after 6+ hrs - remaining 1d 0hr
_2305_0 - now on 20% after 6+ hrs - remaining 1d 0hr
_2440_0 - now on 20% after 6+ hrs - remaining 23hr
I have had no issue with the ecm_xy files...these seem to be OK...but NOT the ecm_op files.
I have already aborted a number of ecm_op files as they were not going to be completed by the deadline...but I kept the above, as with a 7 core i7, I thought I would give these a chance to finish...but it looks unlikely....
Thanks in advance for your help
Tim
Last edited by UBT - Timbo on 19.12.2016 22:43, edited 1 time in total.
Re: Anyone having issues with ecm 700.02 WUs?
This is really strange.
All of those WUs were finished with a runtime of 1 - 5 hours. Also a 32 bit Odroid (arm 32 bit) finished this WU after 11 hours.
There must be something wrong with your boinc.
Have you throttled cpu usage or is your cpu clock reduced?
Would be interesting to see the stderr of such workunits.
- check in boinc in which slot the wu is running
- shutdown boinc
- send me the stderr
I only saw some crediting issues with ecm_ru_ workunits. There the Windows app needs 4 times longer than the Linux app, while on other workunits it is the other way round. Anyway, for such workunits I now give 3 times more credits. I think this was the issue from @xyzzy.
yoyo
-
- Idle-Sammler
- Posts: 6
- Joined: 18.12.2016 11:58
Re: Anyone having issues with ecm 700.02 WUs?
Hi
All of those WUs were finished with an runtime of 1 - 5 hours.
- Maybe, but not on my PC !!
There must something wrong with your boinc.
- Maybe, but ALL other projects crunch tasks perfectly well...and even the yoyo muon tasks and ecm_xy tasks are all OK
Have you throttled cpu usage or is your cpu clock reduced?
- Nope - running at standard CPU speed
Would be interesting to see the stderr of such workunits.
- check in boinc in which slot the wu is running - I have 13 slots with yoyo files - slots 1 thru 5 and 11 thru 18 - this is because I suspended 6 ecm_op tasks and I'm running 6 ecm_xy and 1 ecm_op tasks.
- shutdown boinc - if I do that, I am not sure that the crunching time for the 7 "running" ecm tasks will be checkpointed...as I noticed before, on some earlier ecm_op tasks, that I lost many hours, (with elapsed time going from 13+ hrs down to 6+ hours ) when I shutdown BOINC, thinking it might be the issue...
- send me the stderr - that will have to wait until one ecm_oy file that is current, has finished.
regards and thanks
Tim
-
- Idle-Sammler
- Posts: 6
- Joined: 18.12.2016 11:58
Re: Anyone having issues with ecm 700.02 WUs?
UPDATE:
Ok, so I was WRONG about the ecm_xy tasks.
The 6 I had running were doing fine and got to about 97% or so and then, as one, they all restarted from zero again
So, I shut down BOINC Manager and restarted it....
Looked in the slots folders and I found the folders for each of the yoyo tasks.
for the ecm_op 2515_0 task, here's the content of the stderr file:
wrapper: starting
wrapper: running ecm (-v -nn -timestamp -chkpnt checkpnt -inp in -maxmem 1800 110e6)
No heartbeat from core client for 30 sec - exiting
wrapper: starting
wrapper: running ecm (-v -nn -timestamp -chkpnt checkpnt -inp in -maxmem 1800 110e6)
No heartbeat from core client for 30 sec - exiting
wrapper: starting
wrapper: running ecm (-v -nn -timestamp -chkpnt checkpnt -inp in -maxmem 1800 110e6)
No heartbeat from core client for 30 sec - exiting
No heartbeat from core client for 30 sec - exiting
wrapper: starting
wrapper: running ecm (-v -nn -timestamp -chkpnt checkpnt -inp in -maxmem 1800 110e6)
No heartbeat from core client for 30 sec - exiting
wrapper: starting
wrapper: running ecm (-v -nn -timestamp -chkpnt checkpnt -inp in -maxmem 1800 110e6)
No heartbeat from core client for 30 sec - exiting
wrapper: starting
wrapper: running ecm (-v -nn -timestamp -chkpnt checkpnt -inp in -maxmem 1800 110e6)
No heartbeat from core client for 30 sec - exiting
wrapper: starting
wrapper: running ecm (-v -nn -timestamp -chkpnt checkpnt -inp in -maxmem 1800 110e6)
wrapper: running ecm (-v -nn -timestamp -chkpnt checkpnt -inp in -maxmem 1800 110e6)
wrapper: starting
wrapper: running ecm (-v -nn -timestamp -chkpnt checkpnt -inp in -maxmem 1800 110e6)
wrapper: running ecm (-v -nn -timestamp -chkpnt checkpnt -inp in -maxmem 1800 110e6)
wrapper: running ecm (-v -nn -timestamp -chkpnt checkpnt -inp in -maxmem 1800 110e6)
wrapper: running ecm (-v -nn -timestamp -chkpnt checkpnt -inp in -maxmem 1800 110e6)
wrapper: starting
wrapper: running ecm (-v -nn -timestamp -chkpnt checkpnt -inp in -maxmem 1800 110e6)
No heartbeat from core client for 30 sec - exiting
No heartbeat from core client for 30 sec - exiting
No heartbeat from core client for 30 sec - exiting
wrapper: starting
wrapper: running ecm (-v -nn -timestamp -chkpnt checkpnt -inp in -maxmem 1800 110e6)
No heartbeat from core client for 30 sec - exiting
wrapper: starting
wrapper: running ecm (-v -nn -timestamp -chkpnt checkpnt -inp in -maxmem 1800 110e6)
wrapper: starting
wrapper: running ecm (-v -nn -timestamp -chkpnt checkpnt -inp in -maxmem 1800 110e6)
The checkpoint.txt file says this:
4 73902.187500
and there's an "out" file full of numbers...not sure if this is relevant as it is very long.
So, if any of the files that have been created in various slots folders are of any use to you, please let me know and I'll maybe "zip" them up and you can look at them in detail.
regards
Tim
Re: Anyone having issues with ecm 700.02 WUs?
This "no heartbeat" looks bad and explains a bit the long runtime. It means that boinc restarts the workunit.
Do you run the latest boinc version?
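Editor's note: to see how often a task was restarted, the stderr text posted above can be tallied with a few lines of Python. This is only a sketch: the function name `summarize_stderr` and the `sample` string are made up for illustration, and the two message strings are copied from the log in this thread rather than from any BOINC documentation.

```python
# Message strings copied from the stderr posted in this thread.
START_MSG = "wrapper: starting"
HEARTBEAT_MSG = "No heartbeat from core client"


def summarize_stderr(text):
    """Count wrapper starts and heartbeat-triggered exits in a stderr dump."""
    lines = [line.strip() for line in text.splitlines()]
    return {
        "wrapper_starts": sum(line == START_MSG for line in lines),
        "heartbeat_exits": sum(line.startswith(HEARTBEAT_MSG) for line in lines),
    }


# Hypothetical short excerpt in the same shape as the posted log.
sample = """wrapper: starting
wrapper: running ecm (-v -nn -timestamp -chkpnt checkpnt -inp in -maxmem 1800 110e6)
No heartbeat from core client for 30 sec - exiting
wrapper: starting
wrapper: running ecm (-v -nn -timestamp -chkpnt checkpnt -inp in -maxmem 1800 110e6)
"""

print(summarize_stderr(sample))
```

A high count of heartbeat exits relative to the elapsed time would fit yoyo's reading that the task keeps being restarted instead of making progress.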
- Beyond
- Prozessor-Polier
- Posts: 111
- Joined: 02.02.2008 01:48
- Location: Rum River watershed, MN, USA
Re: Anyone having issues with ecm 700.02 WUs?
Code: Select all
ecm_ru_1481927124_10_1155.c367_1670_0 44:26:31 (42:45:18) 20.000 177:46:02 55:32:25 23.9 °C 0 700.01 ecm Running High P. [1] 20:53:19 74.04 1140.00 MB
ecm_ru_1481927124_10_1155.c367_1770_0 44:26:31 (43:04:17) 20.000 177:46:07 55:31:23 23.9 °C 0 700.01 ecm Running High P. [1] 21:13:13 83.03 964.92 MB
ecm_ru_1481927124_10_1155.c367_2450_0 44:26:31 (42:48:33) 20.000 177:46:07 55:31:52 23.9 °C 0 700.01 ecm Running High P. [1] 20:57:55 75.37 1031.77 MB
ecm_ru_1481927124_10_1155.c367_2120_0 44:26:31 (43:02:36) 20.000 177:46:07 55:32:08 23.9 °C 0 700.01 ecm Running High P. [1] 21:11:53 88.70 995.49 MB
ecm_ru_1481927124_10_1155.c367_3020_0 44:26:31 (42:54:26) 20.000 177:46:07 55:37:18 23.9 °C 0 700.01 ecm Running High P. [1] 21:03:11 80.83 1051.81 MB
Edit: All 5 jumped to 40% done at about 45.5 hours. New figures:
Code: Select all
ecm_ru_1481927124_10_1155.c367_1770_0 46:24:33 (44:47:33) 40.000 69:36:50 53:33:21 23.9 °C 0 700.01 ecm Running High P. [2] 01:03:47 97.92 151.10 MB
ecm_ru_1481927124_10_1155.c367_2450_0 46:24:33 (44:31:09) 40.000 69:36:50 53:33:50 23.9 °C 0 700.01 ecm Running High P. [2] 00:49:15 98.31 151.09 MB
ecm_ru_1481927124_10_1155.c367_2120_0 46:24:33 (44:50:07) 40.000 69:36:50 53:34:06 23.9 °C 0 700.01 ecm Running High P. [2] 01:06:24 100.00 151.10 MB
ecm_ru_1481927124_10_1155.c367_1670_0 46:24:33 (44:26:46) 40.000 69:36:50 53:34:23 23.9 °C 0 700.01 ecm Running High P. [2] 00:42:35 97.45 151.09 MB
ecm_ru_1481927124_10_1155.c367_3020_0 46:24:33 (44:35:40) 40.000 69:36:50 53:39:16 23.9 °C 0 700.01 ecm Running High P. [2] 00:51:39 98.94 151.10 MB
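Editor's note: the remaining-time figures in the two listings look like a straight-line extrapolation from elapsed time and percent done. The sketch below checks that arithmetic. It assumes the first time column is elapsed time, the 20.000 / 40.000 column is percent done, and the next column is the estimated time left; the post does not label the columns, so that reading is a guess, and the function names are made up for illustration.

```python
def hms_to_seconds(hms):
    """Convert 'H:MM:SS' to a whole number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def seconds_to_hms(total):
    """Format a whole number of seconds as 'H:MM:SS'."""
    hours, rem = divmod(total, 3600)
    minutes, seconds = divmod(rem, 60)
    return f"{hours}:{minutes:02d}:{seconds:02d}"


def linear_remaining(elapsed_hms, percent_done):
    """Estimate time left as elapsed * (100 - p) / p, rounded to the nearest second."""
    elapsed = hms_to_seconds(elapsed_hms)
    remaining = (elapsed * (100 - percent_done) + percent_done // 2) // percent_done
    return seconds_to_hms(remaining)


print(linear_remaining("44:26:31", 20))  # 177:46:04
print(linear_remaining("46:24:33", 40))  # 69:36:50
```

These come out within a few seconds of the 177:46:02 / 177:46:07 and 69:36:50 shown in the listings. If the column reading is right, a task that reports progress only in 20% jumps will show a time-left estimate that keeps growing until the next jump.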
Re: Anyone having issues with ecm 700.02 WUs?
I deployed a version 700.02 for Win 64 which should be much faster.
I would abort those workunits. They might run too long and error out with time limit exceeded.
Usually they should finish in 50 hours.
-
- Idle-Sammler
- Posts: 6
- Joined: 18.12.2016 11:58
Re: Anyone having issues with ecm 700.02 WUs?
Hi
yoyo wrote: This "no heartbeat" looks bad and explains a bit the long runtime. It means that boinc restarts the workunit.
Do you run the latest boinc version?
OK - that's interesting.
The BOINC version is 7.6.22, so quite recent.
The PC spec is Win XP Pro, Intel i7 3820 @ 3.6GHz with 3Gb ram. So not a slow coach (usually).
I plan to abort these - I left 1x op and 1x xy running overnight and the elapsed time is higher but the %age to completion hasn't changed much.
Once aborted, I guess I will lose all the info in the folders, so if you need any info from the folders please advise. I will wait until I hear from you.
regards
Tim
- Beyond
- Prozessor-Polier
- Posts: 111
- Joined: 02.02.2008 01:48
- Location: Rum River watershed, MN, USA
Re: Anyone having issues with ecm 700.02 WUs?
yoyo wrote: I deployed a version 700.02 for Win 64 which should be much faster.
I would abort those workunits. They might run too long and error out with time limit exceeded.
Usually they should finish in 50 hours.
They all aborted themselves at exactly 58:10:45 hours. Strange:
Code: Select all
ecm_ru_1481927124_10_1155.c367_1770_0 58:10:45 (56:16:03) 12/20/2016 11:12:38 AM 12/20/2016 11:27:37 AM 96.71 Reported: Computation error (197,) 1317.20 MB 1274.58 MB
ecm_ru_1481927124_10_1155.c367_2450_0 58:10:45 (55:58:16) 12/20/2016 11:12:38 AM 12/20/2016 11:27:37 AM 96.20 Reported: Computation error (197,) 1317.20 MB 1255.02 MB
ecm_ru_1481927124_10_1155.c367_2120_0 58:10:45 (56:20:23) 12/20/2016 11:12:38 AM 12/20/2016 11:27:37 AM 96.84 Reported: Computation error (197,) 1317.19 MB 1219.29 MB
ecm_ru_1481927124_10_1155.c367_1670_0 58:10:45 (55:53:38) 12/20/2016 11:12:38 AM 12/20/2016 11:27:37 AM 96.07 Reported: Computation error (197,) 1317.20 MB 1212.13 MB
ecm_ru_1481927124_10_1155.c367_3020_0 58:10:45 (56:04:59) 12/20/2016 11:12:38 AM 12/20/2016 11:27:37 AM 96.40 Reported: Computation error (197,) 1317.20 MB 1179.36 MB
<message>
exceeded elapsed time limit 209444.61 (1000000.00G/4.77G)
</message>
Edit: these were sent out again to some poor soul. Shouldn't they be cancelled? They've already caused almost 300 hours of wasted CPU time.
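Editor's note: the identical 58:10:45 abort time is what the limit in the error message works out to. The sketch below converts the 209444.61 second limit to hours:minutes:seconds and back-computes the speed figure. Reading the two numbers in parentheses as a work bound divided by a speed estimate (both in units of G) is an assumption about the message format, not something stated in the thread; the function name is made up for illustration.

```python
def seconds_to_hms(seconds):
    """Format a duration in seconds as 'H:MM:SS', rounded to the nearest second."""
    total = round(seconds)
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"


# Figures copied from the <message> block above.
limit_seconds = 209444.61
work_bound_g = 1000000.00

print(seconds_to_hms(limit_seconds))  # 58:10:45

# Speed implied by the limit, assuming limit = work bound / speed.
implied_speed_g = work_bound_g / limit_seconds
print(round(implied_speed_g, 2))  # 4.77
```

So all five tasks hit the same limit rather than failing independently, and the implied speed rounds to the 4.77G printed in the message.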