Possible stuck task

Everything about the project RNA World
BobCat13
PDA user
Posts: 40
Joined: 17.02.2010 19:33

#1 Post by BobCat13 » 18.12.2015 17:09

Running task 14950361 under Linux. At approximately 720 hours CPU time, the progress % reached 44.688. The CPU time is now almost 800 hours and the progress % is still 44.688.

I have stopped and restarted the client and rebooted the machine, but the progress % did not change. The two files written to the Snapshot directory keep the same sizes, but their crc32 values do change. Are the changing crc32 values an indication that the task is still running and has just hit a difficult part? Or is this task stuck and should it be aborted? One user did finish their copy, but it took 2.73 times the estimate from the reference machine. I don't mind letting my copy run to see if it will eventually advance.
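
The size and crc32 comparison described above could be scripted roughly as follows. This is only a sketch: the function names are made up for illustration, and the directory you pass in is whatever Snapshot directory your client uses. It reproduces the check itself and does not settle whether changing checksums mean the task is actually progressing.

```python
import zlib
from pathlib import Path


def fingerprint(directory):
    """Return {file name: (size in bytes, CRC32 as 8 hex digits)} for each regular file."""
    result = {}
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue
        crc = 0
        with path.open("rb") as handle:
            # Read in 1 MiB chunks so large snapshot files do not fill memory.
            for chunk in iter(lambda: handle.read(1 << 20), b""):
                crc = zlib.crc32(chunk, crc)
        result[path.name] = (path.stat().st_size, format(crc & 0xFFFFFFFF, "08x"))
    return result


def changed_files(before, after):
    """Return the names whose size or CRC32 differs between two fingerprints."""
    return sorted(name for name in after if before.get(name) != after[name])
```

Calling fingerprint() on the same directory twice, some minutes apart, and passing both results to changed_files() lists the files whose content changed in between.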

Thanks,
BobCat13

ChristianB
Association board member
Posts: 1915
Joined: 23.02.2010 22:12

Re: Possible stuck task

#2 Post by ChristianB » 19.12.2015 14:27

You should see some progress in the stderr.txt file. If not, the task is really stuck.

BobCat13
PDA user
Posts: 40
Joined: 17.02.2010 19:33

Re: Possible stuck task

#3 Post by BobCat13 » 21.12.2015 16:48

ChristianB wrote: You should see some progress in the stderr.txt file. If not, the task is really stuck.
I'm not sure what I should be seeing in that file. Here is a snippet of some entries from this morning (there was a power outage from 08:30 until 10:10):

2015-12-21 07:21:28 (2867): Creating new snapshot for VM.
2015-12-21 07:21:28 (2867): Restoring VM Process priority.
2015-12-21 07:21:45 (2867): Lowering VM Process priority.
2015-12-21 07:21:48 (2867): Deleting stale snapshot.
2015-12-21 07:22:03 (2867): Checkpoint completed.
2015-12-21 07:51:27 (2867): Creating new snapshot for VM.
2015-12-21 07:51:27 (2867): Restoring VM Process priority.
2015-12-21 07:51:43 (2867): Lowering VM Process priority.
2015-12-21 07:51:44 (2867): Deleting stale snapshot.
2015-12-21 07:51:53 (2867): Checkpoint completed.
2015-12-21 08:21:27 (2867): Creating new snapshot for VM.
2015-12-21 08:21:27 (2867): Restoring VM Process priority.
2015-12-21 08:21:44 (2867): Lowering VM Process priority.
2015-12-21 08:21:45 (2867): Deleting stale snapshot.
2015-12-21 08:21:47 (2867): Checkpoint completed.
2015-12-21 10:12:49 (2891): vboxwrapper (7.3.26085): starting
2015-12-21 10:12:49 (2891): Feature: Checkpoint interval offset (222 seconds)
2015-12-21 10:12:49 (2891): Feature: Enabling trickle-ups (Interval: 14400.000000)
2015-12-21 10:12:49 (2891): Detected: VirtualBox 4.3.30r101610
2015-12-21 10:12:53 (2891): Detected: Minimum checkpoint interval (1800.000000 seconds)
2015-12-21 10:12:54 (2891): Restore from previously saved snapshot.
2015-12-21 10:12:55 (2891): Restore completed.
2015-12-21 10:12:55 (2891): Starting VM.
2015-12-21 10:13:33 (2891): Successfully started VM. (PID = '3011')
2015-12-21 10:13:33 (2891): Reporting VM Process ID to BOINC.
2015-12-21 10:13:34 (2891): Lowering VM Process priority.
2015-12-21 10:13:35 (2891): VM state change detected. (old = 'poweroff', new = 'running')
2015-12-21 10:13:37 (2891): Status Report: Elapsed Time: '3125762.380967'
2015-12-21 10:13:37 (2891): Status Report: CPU Time: '3114374.600000'
2015-12-21 10:13:37 (2891): Status Report: Network Bytes Sent (Total): '0.000000'
2015-12-21 10:13:37 (2891): Status Report: Network Bytes Received (Total): '0.000000'
2015-12-21 10:13:37 (2891): Status Report: Trickle-Up Event.
2015-12-21 10:13:37 (2891): Preference change detected
2015-12-21 10:13:37 (2891): Setting CPU throttle for VM. (100%)
2015-12-21 10:13:37 (2891): Checkpoint Interval is now 600 seconds.
2015-12-21 10:23:11 (2891): Creating new snapshot for VM.
2015-12-21 10:23:11 (2891): Restoring VM Process priority.
2015-12-21 10:23:25 (2891): Lowering VM Process priority.
2015-12-21 10:23:28 (2891): Deleting stale snapshot.
2015-12-21 10:23:31 (2891): Checkpoint completed.
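
To compare the "Status Report" lines in a log like this with the CPU hours shown by the client, the values can be converted to hours. The sketch below assumes the reported numbers are in seconds; under that assumption the CPU time above works out to about 865 hours. The function name and the regular expression are illustrative only, not part of vboxwrapper.

```python
import re

# Matches lines such as:
#   ... Status Report: CPU Time: '3114374.600000'
STATUS = re.compile(r"Status Report: (Elapsed Time|CPU Time): '([0-9.]+)'")


def status_hours(log_text):
    """Return {label: hours} for the last elapsed/CPU status report in the log text."""
    hours = {}
    for label, value in STATUS.findall(log_text):
        # Assumption: the logged value is in seconds.
        hours[label] = float(value) / 3600.0
    return hours
```

Later reports overwrite earlier ones, so the result reflects the most recent status report in the text passed in.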

ChristianB
Association board member
Posts: 1915
Joined: 23.02.2010 22:12

Re: Possible stuck task

#4 Post by ChristianB » 23.01.2016 18:30

If you look at the content of the progress file, it should change. Maybe the change is not shown by the client? If the content does not change after an hour, the task is stuck.

Sorry for the late response. I was on vacation and just saw your message now.

MLx
Microcruncher
Posts: 23
Joined: 21.05.2011 17:49

Re: Possible stuck task

#5 Post by MLx » 23.01.2016 20:57

Hi Christian, I may be seeing the same problem. Task 6330703 (OS X) reached 98.765% very quickly (about 30 minutes after starting, I think) and has been stuck at that percentage for over 2 CPU days now.

Which is the progress file?
I see shared/progress.txt in the slot directory, but that's just text with the same percentage.

ChristianB
Association board member
Posts: 1915
Joined: 23.02.2010 22:12

Re: Possible stuck task

#6 Post by ChristianB » 23.01.2016 22:40

MLx wrote: Hi Christian, I may be seeing the same problem. Task 6330703 (OS X) reached 98.765% very quickly (about 30 minutes after starting, I think) and has been stuck at that percentage for over 2 CPU days now.

Which is the progress file?
I see shared/progress.txt in the slot directory, but that's just text with the same percentage.
No, that is normal. The percentage 98.765 was chosen deliberately for the case where we underestimated the time needed to finish the task. You just need to let it run from now on. The only thing you can check is the modification timestamp of the shared/progress.txt file.
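
Checking that modification timestamp could look roughly like the sketch below. The helper names are hypothetical, and the one-hour threshold is borrowed from the earlier remark about the file content not changing for an hour; applying it to the timestamp is an assumption, so treat the result as a hint rather than a verdict.

```python
import os
import time


def seconds_since_modified(path, now=None):
    """Return how many seconds ago the file at path was last modified."""
    now = time.time() if now is None else now
    return now - os.path.getmtime(path)


def looks_stuck(path, threshold_seconds=3600, now=None):
    """Return True if the file has not been modified within the threshold.

    Assumption: a progress file left untouched for longer than the
    threshold suggests the task is no longer advancing.
    """
    return seconds_since_modified(path, now) > threshold_seconds
```

It would be called with the full path to shared/progress.txt inside the task's slot directory, which depends on the local BOINC installation.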
