Task running for over 26 hours after clock change
Author | Message |
---|---|
Send message Joined: 11 Apr 10 Posts: 54 Credit: 382,341 RAC: 0 |
I've just spotted that one of my computers has been running a WUProp task for over 26 hours, stuck at 91.135%. Checking back through the BOINC event log and the NTP log file, it looks like I had a rogue NTP clock update yesterday which caused the date to jump forward 3 months for less than 30 seconds. The WUProp task checkpointed once during that period but hasn't made another checkpoint since the time was corrected. Restarting BOINC caused the elapsed time to wind back to its previous checkpoint value of 5:27:00, but the task still made no further checkpoints. I dug a bit deeper and the problem appears to be the contents of the checkpoint file in the slot directory: 329 The second line converts to UTC time 14:45:31 on 3rd March 2016. The last line converts to UTC time 09:14:36 on 3rd December 2015. At the same time the checkpoint file on my other computer contained: 11 The second line converts to UTC time 14:01:50 on 4th December 2015. The last line converts to UTC time 23:51:50 on 4th December 2015. To test if the checkpoint file was the problem, I stopped BOINC, changed the second line to the current time_t value and restarted BOINC. Checkpoints started being made again and the task was reported 33 minutes later, albeit with 24 hours of missing data (task 53970364). "The ultimate test of a moral society is the kind of world that it leaves to its children." - Dietrich Bonhoeffer |
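The manual fix described in this post (stop BOINC, replace a future-dated second line with the current time_t value, restart) could be sketched roughly as below. This is only an illustration under assumptions: the thread says nothing about the WUProp checkpoint format beyond "the second line is a time_t value", so the function name `reset_checkpoint_time`, the `tolerance` parameter, and the sample contents are all hypothetical.

```python
import time


def reset_checkpoint_time(text, now=None, tolerance=300):
    """Return checkpoint text with a future-dated second line reset to `now`.

    Assumption (from the post): line 2 holds a time_t value.
    Every other line is passed through unchanged.
    """
    if now is None:
        now = int(time.time())
    lines = text.splitlines()
    if len(lines) < 2:
        return text  # nothing recognisable to repair
    try:
        stamp = int(lines[1].strip())
    except ValueError:
        return text  # second line is not an integer timestamp
    if stamp > now + tolerance:
        # Timestamp lies in the future: overwrite it with the current time.
        lines[1] = str(now)
    trailing_newline = "\n" if text.endswith("\n") else ""
    return "\n".join(lines) + trailing_newline


# Hypothetical file contents, made up for illustration only.
sample = "329\n9000000\n500\n"
print(reset_checkpoint_time(sample, now=1000000))
```

As in the post, BOINC would need to be stopped before the file in the slot directory is edited, and any data for the skipped period would still be missing.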
Send message Joined: 20 Jun 12 Posts: 63 Credit: 94,685 RAC: 0 |
Great analysis! Jumping forward/back in time can cause problems in many programs. E.g. what happened with the (Windows XP) "Scheduled Tasks"? I noticed Process Lasso reports that some process "Ran for 0 ms" if I change the clock back and then exit this program/process.

This can also make BOINC itself "crazy" (e.g. it may make the statistics_*.xml files 'bad', so they will need a manual edit).

I don't know if the following was fixed in some BOINC version, but it exists/existed for many years:
- Set the computer clock back a few (5) minutes
- Watch the processes (tasks) started by BOINC (in Windows Task Manager, Process Explorer) - they will exit after 30 seconds but BOINC Manager will continue to show them "Running"
- After 5 minutes they will be restarted

I reported this in 2008 ("Computer Clock back stops BOINC"): http://setiathome.berkeley.edu/forum_thread.php?id=45717

I use the very small (10 KB) program Neutron: http://keir.net/neutron.html

With the following contents in Neutron.ini:

[Options]
AutoSync=0
AutoExit=0
Server=25
Retry=1
[Servers]
0="cuckoo.nevada.edu"
1="ntp.nblug.org"
2="ntp0.cornell.edu"
3="timekeeper.isi.edu"
4="nist1.symmetricom.com"
5="clock.via.net"
6="nist1.aol-ca.truetime.com"
7="nist.expertsmi.com"
8="nist1-dc.WiTime.net"
9="nist1-sj.WiTime.net"
10="time-a.nist.gov"
11="time-a.timefreq.bldrdoc.gov"
12="time-b.nist.gov"
13="time-b.timefreq.bldrdoc.gov"
14="time-c.timefreq.bldrdoc.gov"
15="utcnist.colorado.edu"
16="time.ien.it"
17="time.nrc.ca"
18="time.chu.nrc.ca"
19="clock.psu.edu"
20="tick.greyware.com"
21="ntp1.as34288.net"
22="ntp0.as34288.net"
23="time.ufe.cz"
24="ntp0.fau.de"
25="time.fu-berlin.de"
26="time.windows.com"
27="time.nist.gov"

I mostly use time.fu-berlin.de, which is also set in Windows XP (instead of time.windows.com). I have never had a problem with wrong time. (I have been using that server for 2-3 years.)

"The last line converts to UTC time 23:51:50 on 4th December 2015"

UNIX TimeStamp: 1449237110 = 04 Dec 2015 13:51:50 GMT
http://www.onlineconversion.com/unix_time.htm

 - ALF - "Find out what you don't do well ..... then don't do it!" :) |
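The timestamp conversion quoted in the reply can be double-checked without an online converter. Below is a minimal Python sketch; the helper name `utc_from_time_t` is made up for illustration, and only the standard library is used.

```python
from datetime import datetime, timezone


def utc_from_time_t(stamp):
    """Convert a Unix time_t value (seconds since 1970-01-01 UTC) to a UTC datetime."""
    return datetime.fromtimestamp(stamp, tz=timezone.utc)


print(utc_from_time_t(1449237110).isoformat())
# prints "2015-12-04T13:51:50+00:00"
```

This agrees with the 13:51:50 GMT figure. The 23:51:50 quoted from the first post is exactly 10 hours later, which might point to a local time zone being applied during the original conversion, though nothing in the thread confirms that.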
©2024 Sébastien