# of errors
- Fingerzähler
- Posts: 2
- Registered: 04.10.2008 04:00
# of errors
It looks like something happened lately that is making the server cancel results if another computer errors out. Did the # of max errors get lowered since the end is near for OGR?
Older units showed:
max # of error/total/success results 20, 30, 10
http://www.rechenkraft.net/yoyo/workuni ... id=1287328
Newer units show:
max # of error/total/success results 1, 1, 10
and frequently the following, after they have been sent to me and later canceled by the server as a redundant result:
errors Too many error results Too many total results
http://www.rechenkraft.net/yoyo/workuni ... id=1313132
Re: # of errors
Hello,
Yes, I set max # of error/total/success results to 1, 1, 10 because the project is in its late phase. I need results fast now, not months later. If wus are resent multiple times, they come back too late for the project.
But I wonder why the Boinc server sends the wu a second time and sometimes later cancels this second copy. I have now made some changes in the config, and the second wu should no longer be sent out.
yoyo
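To illustrate how the three limits relate to the "Too many ... results" messages, here is a minimal Python sketch. It is a simplified model, not the BOINC server code: the function name, the field names, and the exact comparison rules (a flag is raised once a count would exceed its limit) are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Limits:
    """The three per-workunit limits shown on the workunit page."""
    max_error: int
    max_total: int
    max_success: int


def wu_error_flags(limits, n_error, n_success, n_total, n_more_needed=0):
    """Return the error messages a workunit would show in this toy model.

    ASSUMPTION: a flag is raised when a count exceeds its limit, and the
    total check also counts the replacement results the server would
    still have to send out (n_more_needed).
    """
    flags = []
    if n_error > limits.max_error:
        flags.append("Too many error results")
    if n_total + n_more_needed > limits.max_total:
        flags.append("Too many total results")
    if n_success > limits.max_success:
        flags.append("Too many success results")
    return flags


OLD = Limits(max_error=20, max_total=30, max_success=10)
NEW = Limits(max_error=1, max_total=1, max_success=10)

# One result came back with an error and one replacement would be needed.
print(wu_error_flags(OLD, n_error=1, n_success=0, n_total=1, n_more_needed=1))
print(wu_error_flags(NEW, n_error=1, n_success=0, n_total=1, n_more_needed=1))
```

Under these assumptions the old limits 20, 30, 10 leave plenty of room for resends, while 1, 1, 10 flags the workunit as soon as a second copy is involved.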
Re: # of errors
Hello,
I reverted these changes. The server had problems with them.
But there should be no problem on the client side. If a wu is canceled by the server, it only means that the server no longer needs the result, so it makes no sense to crunch it.
There seems to be a bug in the server version I'm using that causes a second result to be sent to clients. I will think about updating the server version in the future, but not before ogr finishes.
yoyo
Re: # of errors
Thanks for the quick reply. That's what I thought was going on, but I just wanted to make sure everything was working as it should. I have noticed that those units are no longer being canceled by the server; they are finished and granted credit, but they now show: errors Too many total results. As long as they are being sent in to dnet as valid, I guess it doesn't matter.
http://www.rechenkraft.net/yoyo/workuni ... id=1356088
I hope all goes well when you go to upgrade the server.
Re: # of errors
Yes, I saw it.
These wus are the first ones with big numbers, more than 50000 digits. I have the impression that ecm may not be stable with such big numbers.
Currently I do not know what
Code:
app exit status: 0xc00000fd
means.
I am trying to find the root cause.
yoyo
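For readers who hit a similar line in stderr: on Windows, an exit status in the 0xC0000000 range is usually an NTSTATUS exception code. The sketch below looks up a few well-known values; the function name and the selection of codes in the table are my own choices for illustration and are far from a complete list.

```python
# A few well-known Windows NTSTATUS exception codes (not a complete list).
NTSTATUS_NAMES = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC000001D: "STATUS_ILLEGAL_INSTRUCTION",
    0xC0000094: "STATUS_INTEGER_DIVIDE_BY_ZERO",
    0xC00000FD: "STATUS_STACK_OVERFLOW",
}


def describe_exit_status(text):
    """Translate a hex string such as '0xc00000fd' into a symbolic name."""
    code = int(text, 16) & 0xFFFFFFFF  # int() accepts the '0x' prefix with base 16
    return NTSTATUS_NAMES.get(code, "unknown (0x%08x)" % code)


print(describe_exit_status("0xc00000fd"))
```

With this table, 0xc00000fd maps to STATUS_STACK_OVERFLOW.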
Re: # of errors
Hello,
this crash seems to be a stack overflow, which appears mostly on Windows systems. If a Linux box gets the same work unit, it mostly runs fine.
So again your help is needed.
If you had ecm workunits which crashed with exit code 195 (0xc3) and stderr out contains
Code:
app exit status: 0xc00000fd
then please run the following tests:
- create a new directory
- copy this file into it
- copy the ecm exe from the boinc project folder into this directory as well
- run it on the command line with -v -nn -timestamp -chkpnt checkpnt -inp inp 50000 > out1
this should confirm that it crashes
- now run it with -k 2 -v -nn -timestamp -chkpnt checkpnt -inp inp 50000 > out2
if this also fails
- run it with -k 8 -v -nn -timestamp -chkpnt checkpnt -inp inp 50000 > out8
Afterwards please post the out files here.
What will happen:
- Each run will take as long as the wu ran before the crash, 1h or more. But it gives us the chance to find a solution for this crash.
- The -k option makes step 2 run in k blocks. Increasing k decreases the memory usage of step 2, at the expense of more cpu time. We will see this in the out files.
yoyo
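The three runs above can also be scripted. The following Python sketch is only a convenience wrapper around the commands given in this post: the helper names are hypothetical, the command-line options are copied unchanged from the steps above, and the rule "continue to the next -k value only if the run ended with 0xc00000fd" is my assumption about what counts as the crash. It also checks that the input file is really called inp before starting.

```python
import subprocess
from pathlib import Path

# Options copied from the test instructions in this thread.
BASE_ARGS = ["-v", "-nn", "-timestamp", "-chkpnt", "checkpnt", "-inp", "inp", "50000"]
STACK_OVERFLOW = 0xC00000FD


def build_runs():
    """Return (extra options, output file name) for the three runs, in order."""
    return [([], "out1"), (["-k", "2"], "out2"), (["-k", "8"], "out8")]


def command_for(exe, extra):
    """Build the full argument list for one run."""
    return [exe] + extra + BASE_ARGS


def is_stack_overflow(returncode):
    """True if the exit code equals 0xc00000fd (signed or unsigned form)."""
    return (returncode & 0xFFFFFFFF) == STACK_OVERFLOW


def run_tests(exe_name, workdir="."):
    """Run the tests in workdir, writing out1, out2 and out8 as needed."""
    workdir = Path(workdir)
    if not (workdir / "inp").is_file():
        raise FileNotFoundError(
            "no file named 'inp' in %s (was it saved as 'inp.txt'?)" % workdir
        )
    exe = str((workdir / exe_name).resolve())
    for extra, out_name in build_runs():
        with open(workdir / out_name, "w") as out:
            result = subprocess.run(command_for(exe, extra), cwd=workdir, stdout=out)
        print(out_name, "exit code", hex(result.returncode & 0xFFFFFFFF))
        if not is_stack_overflow(result.returncode):
            break  # ASSUMPTION: only escalate to the next -k value after this crash
```

To use it, put the script into the new directory next to the ecm exe and the inp file and call, for example, run_tests("ecm621_win64_core2.exe"). Each run can still take an hour or more.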
Re: # of errors
I can't find ecm.exe.
There is NO ecm.exe in G:\PRO\BOINC2D\projects\www.rechenkraft.net_yoyo\
only ecm621_win64_core2.exe and ecmwrapper_0.02_windows_x86_64.exe
and none anywhere on G: (it's my system disk).
And if I download http://www.rechenkraft.net/yoyo/downloa ... C58862/inp
with Download Master, it crashes ((((
With Firefox alone it is OK.
Last edited by (_KoDAk_) on 27.03.2009 22:23, edited 1 time in total.
Re: # of errors
The name of the ecm exe depends on whether you run win32 or win64 and should be
ecm621_win32_c2d.exe
or
ecm621_win64_core2.exe
yoyo
Re: # of errors
ecm621_win64_core2.exe -k 8 -v -nn -timestamp -chkpnt checkpnt -inp in 50000 > out8
ecm621_win64_core2.exe -k 2 -v -nn -timestamp -chkpnt checkpnt -inp in 50000 > out2
ecm621_win64_core2.exe -v -nn -timestamp -chkpnt checkpnt -inp in 50000 > out1
out8
out2
out1
are all empty (
They can't find the inp file.
Sorry, Firefox named it inp.txt ((((
But I get:
Can't find input file in
Re: # of errors
There were some small typos in my description above; I have corrected them.
yoyo