Another ECM project backoff and no new WU

Bugs and wishes for the project yoyo@home

Post by Dunckx » 07.05.2018 17:51

I had forgotten the signs of this, which I should have remembered. It looks like another failure of the kind described in my post dated 22-01-18 (viewtopic.php?f=56&t=16833#p173094). Anyhow, the log file contains:

07/05/2018 17:16:27 | yoyo@home | [sched_op] Starting scheduler request
07/05/2018 17:16:27 | yoyo@home | [work_fetch] request: CPU (1355855.83 sec, 6.00 inst) NVIDIA GPU (172800.00 sec, 1.00 inst)
07/05/2018 17:16:27 | yoyo@home | Sending scheduler request: To report completed tasks.
07/05/2018 17:16:27 | yoyo@home | Reporting 200 completed tasks
07/05/2018 17:16:27 | yoyo@home | Requesting new tasks for CPU and NVIDIA GPU
07/05/2018 17:16:27 | yoyo@home | [sched_op] CPU work request: 1355855.83 seconds; 6.00 devices
07/05/2018 17:16:27 | yoyo@home | [sched_op] NVIDIA GPU work request: 172800.00 seconds; 1.00 devices
07/05/2018 17:16:34 | yoyo@home | [error] No close tag in scheduler reply
07/05/2018 17:16:34 | yoyo@home | [sched_op] Deferring communication for 03:47:49
07/05/2018 17:16:34 | yoyo@home | [sched_op] Reason: can't parse scheduler reply
07/05/2018 17:16:34 | | [work_fetch] Request work fetch: RPC complete
07/05/2018 17:16:39 | | [work_fetch] ------- start work fetch state -------
07/05/2018 17:16:39 | | [work_fetch] target work buffer: 86400.00 + 86400.00 sec
07/05/2018 17:16:39 | | [work_fetch] --- project states ---
07/05/2018 17:16:39 | yoyo@home | [work_fetch] REC 6963.600 prio -0.002 can't request work: scheduler RPC backoff (13663.74 sec)
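
As a quick cross-check of the log above, the deferral of 03:47:49 and the backoff of 13663.74 sec reported five seconds later are the same timer. A minimal Python sketch, assuming the two log lines are pasted in as strings (the function names are made up for illustration):

```python
import re
from datetime import datetime

# Two lines copied verbatim from the log above.
DEFER_LINE = ("07/05/2018 17:16:34 | yoyo@home | [sched_op] "
              "Deferring communication for 03:47:49")
STATE_LINE = ("07/05/2018 17:16:39 | yoyo@home | [work_fetch] REC 6963.600 "
              "prio -0.002 can't request work: scheduler RPC backoff (13663.74 sec)")


def timestamp(line):
    """Parse the leading timestamp of a log line (day/month/year here)."""
    return datetime.strptime(line.split(" | ")[0], "%d/%m/%Y %H:%M:%S")


def deferral_seconds(line):
    """Convert the HH:MM:SS deferral in a [sched_op] line to seconds."""
    match = re.search(r"Deferring communication for (\d+):(\d+):(\d+)", line)
    hours, minutes, seconds = map(int, match.groups())
    return hours * 3600 + minutes * 60 + seconds


def backoff_seconds(line):
    """Extract the remaining backoff from a [work_fetch] project state line."""
    match = re.search(r"scheduler RPC backoff \(([\d.]+) sec\)", line)
    return float(match.group(1))


deferral = deferral_seconds(DEFER_LINE)
elapsed = (timestamp(STATE_LINE) - timestamp(DEFER_LINE)).total_seconds()
remaining = backoff_seconds(STATE_LINE)

# The deferral minus the elapsed time should roughly match the reported backoff.
print(deferral, elapsed, remaining, deferral - elapsed)
```

So the client really is going to wait almost four hours before it contacts the scheduler again.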

Can you just confirm this is what I think it is? The only way I was able to resolve it last time was by removing and reattaching the project, and I hate to lose a day's results, but I think there's no other option.

Do you want the sched_reply.xml file?
Dunckx
PDA user
Posts: 44
Joined: 12.11.2014 09:26

Re: Another ECM project backoff and no new WU

Post by yoyo » 07.05.2018 18:24

The server's reply message does contain the close tag, but the message is longer than the buffer in the BOINC client, so the close tag never made it into the buffer. I think the only way out is to reattach the project.
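
To illustrate the effect, here is a minimal Python sketch of a reply being cut off by a fixed-size buffer. It is not the BOINC client's code: the buffer size of 4096 and the helper names are assumptions chosen for the demonstration, and the reply body is a simplified stand-in for a real scheduler reply.

```python
CLOSE_TAG = "</scheduler_reply>"

# Hypothetical buffer size, for illustration only (not the real client value).
BUFFER_SIZE = 4096


def build_reply(n_results):
    """Build a simplified reply that acknowledges n_results reported tasks."""
    acks = "".join(
        f"<result_ack><name>wu_{i}</name></result_ack>\n" for i in range(n_results)
    )
    return f"<scheduler_reply>\n{acks}{CLOSE_TAG}\n"


def read_into_buffer(text, buffer_size=BUFFER_SIZE):
    """Keep only what fits into the buffer; the rest is silently dropped."""
    return text[:buffer_size]


def has_close_tag(text):
    """Return True if the close tag survived."""
    return CLOSE_TAG in text


small = build_reply(10)    # fits into the buffer
large = build_reply(200)   # e.g. 200 reported tasks, longer than the buffer

print(has_close_tag(read_into_buffer(small)))  # close tag still present
print(has_close_tag(large))                    # the server did send it
print(has_close_tag(read_into_buffer(large)))  # but it was cut off
```

The reply on the server side is complete in both cases; only the copy that ends up in the too-small buffer is missing its end.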
HELP out in the Rechenkraft wiki, there is work to be done.
yoyo
Association board member
Posts: 7631
Joined: 17.12.2002 14:09
Location: Berlin

Re: Another ECM project backoff and no new WU

Post by Dunckx » 07.05.2018 18:54

OK, I thought it was so. Pity. Ah well, there goes a day's crunching!

Thanks for letting me know.
Dunckx
PDA user
Posts: 44
Joined: 12.11.2014 09:26

Re: Another ECM project backoff and no new WU

Post by Dunckx » 15.07.2018 20:03

This has now happened to me twice this month. On 2nd July I got the same out-of-work message and project backoff and had to re-attach the project. Today it happened again; this time resetting the project was enough to fix the fault.

My computer has now lost two days' output in two weeks, 14% wasted effort.

Please can the BOINC programming team do something about the client buffer size problem?! This is getting to be a real drag, as they say.

Thanks.
Dunckx
Dunckx
PDA user
Posts: 44
Joined: 12.11.2014 09:26

Re: Another ECM project backoff and no new WU

Post by Dunckx » 22.07.2018 11:07

Today it has happened yet again, three times in one month! This time resetting the project was not enough; I had to delete it and re-install it. The new BOINC version 7.12.1 hasn't fixed it.

At least three days of crunching lost in 22!

Dunckx
Dunckx
PDA user
Posts: 44
Joined: 12.11.2014 09:26

Re: Another ECM project backoff and no new WU

Post by yoyo » 22.07.2018 14:24

Maybe you should set your buffers smaller.
In your first posting the message log says the client is reporting 200 results. That is a lot and might be the reason for the long reply message, which doesn't fit into the BOINC client buffer.
yoyo
Association board member
Posts: 7631
Joined: 17.12.2002 14:09
Location: Berlin

Re: Another ECM project backoff and no new WU

Post by ChristianB » 22.07.2018 14:44

What version of BOINC is that? I thought we increased the buffer size of the sched_reply message for the 7.10.x release.
ChristianB
Association board member
Posts: 1890
Joined: 23.02.2010 22:12

Re: Another ECM project backoff and no new WU

Post by gemini8 » 27.07.2018 09:41

The Rakesearch project suggests the following:
Dear crunchers!
If your powerful machines store a large number of tasks and try to report their completion to the project server all at once, they may face timeouts or errors. Please use the max_tasks_reported option in the cc_config.xml file to reduce the size of requests to the project server and speed up task reporting:
Code:
<cc_config>
    ...
    <options>
        ...
        <max_tasks_reported>N</max_tasks_reported>
        ...
    </options>
    ...
</cc_config>

For example, with N = 32 the maximum size of XML requests to the project server decreases significantly and they are processed faster.

Might this help with the afore-mentioned problem as well?
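
For anyone who would rather not edit the file by hand, here is a minimal Python sketch that sets the option in an existing cc_config.xml text without touching the other settings. The function name is made up for illustration, and reading and writing the actual file in the BOINC data directory is left out because its location depends on the installation.

```python
import xml.etree.ElementTree as ET


def set_max_tasks_reported(xml_text, n):
    """Return cc_config XML with <max_tasks_reported> set to n.

    An empty input is treated as "no config yet" and a new one is created.
    """
    if xml_text.strip():
        root = ET.fromstring(xml_text)
    else:
        root = ET.Element("cc_config")
    if root.tag != "cc_config":
        raise ValueError("expected <cc_config> as the root element")

    # Create <options> if the existing config does not have one yet.
    options = root.find("options")
    if options is None:
        options = ET.SubElement(root, "options")

    # Update the existing entry or add a new one.
    entry = options.find("max_tasks_reported")
    if entry is None:
        entry = ET.SubElement(options, "max_tasks_reported")
    entry.text = str(n)

    return ET.tostring(root, encoding="unicode")


existing = (
    "<cc_config><log_flags><sched_op_debug>1</sched_op_debug></log_flags>"
    "</cc_config>"
)
print(set_max_tasks_reported(existing, 32))
```

The client only picks up the change after it re-reads its config files or is restarted.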
Regards, Jens
- - - - - -
Low-end user and part-time cruncher

gemini8
Association member
Posts: 2250
Joined: 31.05.2011 10:30
Location: Hannover

