Harmonious Trees 0.03

Everything about the project yoyo@home

#1 Post by yoyo » 15.08.2011 15:30

This version contains a fix that aborts long-running workunits after ~2 days of runtime and sends the intermediate result back to the server. The server then reissues the workunit to continue from the point where it was aborted.

yoyo

#2 Post by x3mEn » 16.08.2011 20:41

yoyo, what should we do with WUs that are still running on ht 0.02?

#3 Post by yoyo » 16.08.2011 21:31

If it is a long-running WU, you should consider aborting it. If its result is validated against a result from a version 0.03 WU that was aborted early, it will most probably be marked invalid.
yoyo

#4 Post by Ananas » 17.08.2011 09:20

What if a slow host meets a fast one? Will they still validate? Two days on the slow one will surely produce a very different checkpoint file from the one on the fast host.

#5 Post by fwjmath » 17.08.2011 11:22

Hi Ananas,

The timing of the abort, or more precisely of the "premature completion when taking too much time", is controlled by an internal counter. As long as the counter is not messed up, things will work out. On the other hand, the counter is only an estimate, so the timing may not be exactly 2 days and depends on the machine.

fwjmath.

#6 Post by Mayoran » 06.09.2011 10:49

Hi! :wave:

I also have a task (Result ID 12294926) that is taking far too long for my taste (24h net so far, with just over 5% done). If it continues like this, it will never finish before the deadline. It might not even finish before I am finished :wink: . I am very reluctant to lend my computer time to a task with such an open-ended and long runtime, and I am seriously considering aborting it. You write that tasks will intentionally self-abort after about 2 days? Is any credit granted in that case? And can you give more details about the "counter" that was mentioned? Does the current counter show up anywhere, for example in the ckpt.txt file?

For info: my ckpt.txt file currently looks like this:

34
189721
68890281
590248
68295211
0
2176794844
0123433123333323332333233221233123
0123433123333323332333233221233123
01GHPL82?7:<>NAEDOCFI;49MBJ@K5=36D
406110275451

Can you read anything out of this, especially when the task might come to a regular end, or at least to an intentional premature self-abort?

#7 Post by fwjmath » 06.09.2011 15:00

Hello Mayoran,

I would like to clear up several things first.

For app version 0.03, the application will "prematurely self-abort": it stops running once a certain amount of computation has been reached and returns as a normal result, and the deserved credits are granted. From the point of view of volunteers and credits, such a result is no different from a normally finished one. We call it "premature" only because the segment covered by the workunit is only partially finished, and the remaining part has to be distributed again.

I don't know if you are familiar with a finished project called Rectilinear Crossing Number (RCN). The same kind of mechanism for avoiding extremely long workunits was used in RCN.

As for the progress bar, I have to admit that it does not really reflect progress for very long workunits. However, there is another way to check progress: look at the last line of ckpt.txt. The app will "prematurely self-abort" once that value reaches 2e12 and some more subtle criteria are met. Roughly speaking, you can treat the last line of ckpt.txt as an imprecise progress indicator, and 2e12 as a very soft threshold.
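
As a rough illustration of the check described here, the following Python sketch reads the last line of a ckpt.txt and compares it with the 2e12 soft threshold. The names `counter_fraction` and `SOFT_THRESHOLD` are made up for this example. It assumes only what the thread states, namely that the last line holds the counter, and it ignores the other, more subtle stop criteria, so the percentage is a hint rather than a prediction.

```python
SOFT_THRESHOLD = 2e12  # soft threshold quoted in this thread; not a hard limit


def counter_fraction(ckpt_text: str, threshold: float = SOFT_THRESHOLD) -> float:
    """Return (last-line counter) / threshold for the text of a ckpt.txt file."""
    lines = [line.strip() for line in ckpt_text.splitlines() if line.strip()]
    if not lines:
        raise ValueError("empty checkpoint text")
    counter = int(lines[-1])  # assumption: the last non-empty line is the counter
    return counter / threshold


# Checkpoint contents as posted in #6 (Result ID 12294926).
example = """34
189721
68890281
590248
68295211
0
2176794844
0123433123333323332333233221233123
0123433123333323332333233221233123
01GHPL82?7:<>NAEDOCFI;49MBJ@K5=36D
406110275451
"""

print(f"{counter_fraction(example):.1%} of the soft threshold")
```

To use it on a live task, the text could be read from the ckpt.txt in the task's slot directory and passed to the same function.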

Your workunit is a typical long-running one; just give it time.

fwjmath.

#8 Post by Mayoran » 06.09.2011 19:13

fwjmath wrote: Hello Mayoran,
Hello fwjmath, thank you for your fast reply. :good:
fwjmath wrote: I would like to clear up several things first.

For app version 0.03, the application will "prematurely self-abort": it stops running once a certain amount of computation has been reached and returns as a normal result, and the deserved credits are granted. From the point of view of volunteers and credits, such a result is no different from a normally finished one. We call it "premature" only because the segment covered by the workunit is only partially finished, and the remaining part has to be distributed again.
OK, that is how I understood it.
fwjmath wrote: I don't know if you are familiar with a finished project called Rectilinear Crossing Number (RCN). The same kind of mechanism for avoiding extremely long workunits was used in RCN.
Nope, I have not run that (sub)project yet. But I understand what you mean.
fwjmath wrote: As for the progress bar, I have to admit that it does not really reflect progress for very long workunits. However, there is another way to check progress: look at the last line of ckpt.txt. The app will "prematurely self-abort" once that value reaches 2e12 and some more subtle criteria are met. Roughly speaking, you can treat the last line of ckpt.txt as an imprecise progress indicator, and 2e12 as a very soft threshold.
That's why I posted my file; I suspected the last line would be important anyway. :wink:
However, I am currently at roughly 4e11 after about 24h of CPU time, so in linear terms about 20% done. That leaves roughly another 80% to go, which means about four full days of CPU time? That is hard to accept!!! :o :suspect: In practice it might come close to the deadline! :o

And what other, more 'subtle' criteria are you talking about? :suspect: Is it possible that my task will not end after these four long days? :unsure:
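
The linear extrapolation in this post can be written out as follows. It assumes, as the post does, that the counter grows at a constant rate and that the task stops exactly at 2e12. The thread describes that threshold as soft, so this is only a rough estimate. The symbols c (current counter), N (threshold) and t (time) are introduced here for illustration.

```latex
\[
\frac{c}{N} = \frac{4\times10^{11}}{2\times10^{12}} = 0.2,
\qquad
t_{\text{remaining}}
  = t_{\text{elapsed}}\cdot\frac{N-c}{c}
  = 24\,\text{h}\cdot\frac{1.6\times10^{12}}{4\times10^{11}}
  = 96\,\text{h}
  = 4\ \text{days}.
\]
```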

#9 Post by Mayoran » 08.09.2011 16:13

OK, the task has finally ended on my slow laptop, successfully. So far so good.

But the task ended with the counter just above 1e12, not at 2e12.

Hmmm...

#10 Post by Beyond » 14.09.2011 15:36

yoyo wrote: This version contains a fix that aborts long-running workunits after ~2 days of runtime and sends the intermediate result back to the server. The server then reissues the workunit to continue from the point where it was aborted.
yoyo
Don't think this fix is working. I have a .03 WU that's been running for 59 hours and is still at 25% completion:

0.03 harmtrees hat_737_34-104574-1727118432_R_1315461737_2
59:03:07 elapsed - 25% - 177:09:10 time left - 225:25:32 deadline - 54.6 °C - Running High P.

What should I do with it?
Regards/Beyond

#11 Post by fwjmath » 14.09.2011 18:55

Hi Beyond,

Could you please post the content of ckpt.txt here so that we can investigate further? Note that the ~2-day runtime limit is a soft one: it depends on both the machine and the workunit. Moreover, the progress bar is only an estimate, since the actual progress is very hard to measure.

However, from ckpt.txt we can get a better estimate of the running time on a case-by-case basis. As I said in previous posts, the workunit should finish once the last line of ckpt.txt exceeds 2e12 and a more subtle condition is satisfied. Let T be the first tree that makes the last line of ckpt.txt exceed 2e12. The workunit will stop once all trees with the same first subtree as T have been processed. However, it is not easy to estimate this number automatically. Therefore, if you would like to post your ckpt.txt here, I could give you an estimate of the remaining running time.
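
The stop rule described in this paragraph can be sketched in Python as below. This is not the application's code: the function name, the `(first_subtree, cost)` representation, and the idea that each tree adds a fixed cost to the counter are all assumptions made for illustration, since the thread does not say how the real counter is incremented. The sketch also assumes that trees sharing a first subtree are processed consecutively.

```python
from typing import Iterable, Tuple


def process_until_soft_limit(
    trees: Iterable[Tuple[str, int]], threshold: int
) -> Tuple[int, int]:
    """Process (first_subtree, cost) pairs in order.

    Returns (number of trees processed, final counter). Once the counter
    exceeds `threshold`, processing continues only until the first subtree
    changes, so the stop always falls on a group boundary.
    """
    counter = 0
    processed = 0
    stop_group = None  # first subtree of the tree T that crossed the threshold
    for first_subtree, cost in trees:
        if stop_group is not None and first_subtree != stop_group:
            break  # every tree sharing T's first subtree has been handled
        counter += cost
        processed += 1
        if stop_group is None and counter > threshold:
            stop_group = first_subtree
    return processed, counter


# Toy data: three groups "A", "B", "C"; the costs are arbitrary.
toy = [("A", 4), ("A", 3), ("B", 5), ("B", 2), ("B", 6), ("C", 1)]
print(process_until_soft_limit(toy, threshold=10))
```

In this toy run the counter crosses the threshold inside group "B" and keeps growing until that group is finished, which is one way a task could still be running with a last line above 2e12.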

But rest assured in any case: we can prove mathematically that this application, when fed with correct input, will terminate.

fwjmath.

#12 Post by Beyond » 14.09.2011 19:12

Here's the contents of ckpt.txt:

34
104573
256502765
22094756
233931644
0
2551069546
0123456745123456634123442123322222
0123456745123456634123442123322222
05E=FI;@HK9PM1B3L<D?7J>O26GN84C:A6
2470400999780

I restarted BOINC earlier to see if it would start progressing again. Hope that didn't mess it up. Looks like it's past 2e12 though.


Edit: the ckpt.txt now:

34
104573
278022765
23803576
253701846
0
2742674845
0123456745123456564543312345345111
0123456745123456564543312345345111
04KBJ@;3G25A<LMD6FO?>1CP789H=:IENC
2672443196488


Edit 2: the ckpt.txt this morning:

34
104573
339262765
28569057
310073384
0
3900813131
0123456745123456455523455542123453
0123456745123456455523455542123456
01FC5I4>GE@9:KPB3<7A;L86NHJMO2D?=K
3229669446809
Last edited by Beyond on 15.09.2011 14:30; edited 1 time in total.
