PrimeGrid PSP LLR not counting correctly
Author | Message |
---|---|
Send message Joined: 7 Apr 10 Posts: 224 Credit: 461,423 RAC: 0 |
Hi! As far as I can see, there is no thread about this issue yet, so I opened this one. I think there's definitely a problem with how WUProp counts the hours from PrimeGrid's subproject Prime Sierpinski Problem LLR. I recently started several tasks on two computers, and over the last two days I returned these three (columns: task, workunit, computer, sent, reported, status, run time in seconds, CPU time in seconds, credit, application):
530224709 383917896 246708 4 Mar 2014, 2:59:18 UTC 24 Mar 2014, 8:41:28 UTC Completed and validated 485,223.39 (134.78 hours) 417,523.13 10,597.77 Prime Sierpinski Problem (LLR) v6.15
530216327 383919258 246708 4 Mar 2014, 2:59:18 UTC 24 Mar 2014, 8:43:26 UTC Completed and validated 508,745.45 (141.32 hours) 422,600.52 10,668.74 Prime Sierpinski Problem (LLR) v6.15
530208321 383921185 246708 4 Mar 2014, 2:59:18 UTC 25 Mar 2014, 0:36:31 UTC Completed and validated 624,106.74 (173.36 hours) 531,300.39 11,764.39 Prime Sierpinski Problem (LLR) v6.15
All in all, the finished run time sums up to 449.46 hours. But my account currently lists only 142.95 hours, with 281.48 pending from another 4 tasks still running on my second computer. That pending value isn't correct either: those WUs have already crunched far more hours (and are not finished yet). I let a few more WUProp WUs report to see whether the value would change, but it apparently doesn't, and the tasks finished over a day ago. Judging from the numbers, it almost looks as if WUProp counted only one of the three tasks, the second one. Is this possible? Life is Science, and Science rules. To the universe and beyond Member of BOINC@Heidelberg My BOINC-Stats |
Send message Joined: 22 Jan 13 Posts: 107 Credit: 805,487 RAC: 48 |
Multicore machines sometimes do not show all pending hours. When they finish it does add up pretty well. I am running PSP and I have counted the hours and do not seem to miss any. Give it a chance. My movie https://vimeo.com/manage/videos/502242 |
Send message Joined: 7 Apr 10 Posts: 224 Credit: 461,423 RAC: 0 |
"Multicore machines sometimes do not show all pending hours."
Well, that is often the case. I have noticed it before on other projects too, and in most cases it really does sort itself out. But the pending hours are not the main point here.
"When they finish it does add up pretty well."
Well, they finished over two days ago now, and my WUProp account still shows only 142.95 completed hours. How long do you think it should take to show a value near the 449.46 hours I calculated in the first post? Normally it should have matched after the first WUProp WU upload following the end of the PG tasks, but it didn't. In the first post I marked the real run time of the finished results. I'm not sure which value WUProp actually counts (run time or CPU time), but even if it only picks up CPU time (the values annotated with hours below), that is still far more (380.95 hours) than what my account currently shows (142.95 hours, as a reminder).
530224709 383917896 246708 4 Mar 2014, 2:59:18 UTC 24 Mar 2014, 8:41:28 UTC Completed and validated 485,223.39 417,523.13 (115.98 hours) 10,597.77 Prime Sierpinski Problem (LLR) v6.15
530216327 383919258 246708 4 Mar 2014, 2:59:18 UTC 24 Mar 2014, 8:43:26 UTC Completed and validated 508,745.45 422,600.52 (117.39 hours) 10,668.74 Prime Sierpinski Problem (LLR) v6.15
530208321 383921185 246708 4 Mar 2014, 2:59:18 UTC 25 Mar 2014, 0:36:31 UTC Completed and validated 624,106.74 531,300.39 (147.58 hours) 11,764.39 Prime Sierpinski Problem (LLR) v6.15
It doesn't match at all, no matter how you look at it... I don't know how the WUProp mechanism works, but as far as I can judge, something isn't working properly here. It's as if only one WU was recorded, as I first suspected. Or how do you explain a gap of 238 hours (CPU time) or 306.51 hours (run time)? Life is Science, and Science rules. To the universe and beyond Member of BOINC@Heidelberg My BOINC-Stats |
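For anyone who wants to re-check the arithmetic in the post above, here is a minimal sketch in Python. It only converts the posted run times and CPU times from seconds to hours and compares the totals with the 142.95 hours shown in the WUProp account. The names `tasks`, `to_hours` and `WUPROP_COMPLETED_HOURS` are made up for this illustration, and nothing here says anything about how WUProp itself collects or sums hours.

```python
# Posted values for the three PSP LLR tasks: task id -> (run time s, CPU time s).
# All names in this sketch are hypothetical; only the numbers come from the thread.
tasks = {
    530224709: (485_223.39, 417_523.13),
    530216327: (508_745.45, 422_600.52),
    530208321: (624_106.74, 531_300.39),
}

# Completed hours currently listed in the WUProp account, as reported in the post.
WUPROP_COMPLETED_HOURS = 142.95


def to_hours(seconds: float) -> float:
    """Convert seconds to hours, rounded to two decimals as in the post."""
    return round(seconds / 3600, 2)


run_hours = [to_hours(run) for run, _cpu in tasks.values()]
cpu_hours = [to_hours(cpu) for _run, cpu in tasks.values()]

run_total = round(sum(run_hours), 2)
cpu_total = round(sum(cpu_hours), 2)

# Difference between what the tasks ran and what the account shows.
run_gap = round(run_total - WUPROP_COMPLETED_HOURS, 2)
cpu_gap = round(cpu_total - WUPROP_COMPLETED_HOURS, 2)

print("run time hours:", run_hours, "total", run_total, "gap", run_gap)
print("CPU time hours:", cpu_hours, "total", cpu_total, "gap", cpu_gap)
```

The totals and gaps it prints can be compared directly with the figures quoted in the post, whichever of the two time columns WUProp turns out to use.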
Send message Joined: 20 Jun 12 Posts: 63 Credit: 94,685 RAC: 0 |
I think this is the computer:
http://www.primegrid.com/show_host_detail.php?hostid=246708
http://wuprop.boinc-af.org/show_host_detail.php?hostid=37777
(you are lucky to get a hostid with four 7s in a row, '7777' ;) )
The last WUProp@Home tasks on this computer show some problems (which may be the reason for missing hours): http://wuprop.boinc-af.org/results.php?hostid=37777
One is 'Error while computing' after the full 'Run time' of 6 h (nothing that happened in these 6 h will be recorded in your sums). Stderr doesn't show anything: http://wuprop.boinc-af.org/result.php?resultid=39667670
For 'Exit status (0xffffffff80000003)' a Google search gives some results which you can check for hints about what happened:
https://www.google.bg/#q=Exit+status+(0xffffffff80000003)
https://www.google.bg/search?q=Exit+status+(0xffffffff80000003)+site:wuprop.boinc-af.org
E.g. this one: http://wuprop.boinc-af.org/forum_thread.php?id=208
To see what happened, also search your stdoutdae.txt and stdoutdae.old for: wu_v4_1394897173_223642
Other task was accepted (Valid): http://wuprop.boinc-af.org/result.php?resultid=39662155
... but it seems that during the run the WUProp@Home app had several problems getting info:
Stderr output
<core_client_version>7.0.8</core_client_version>
<![CDATA[
<stderr_txt>
No ATI library found.
Device GeForce GTX 660 already detected as NVIDIA GeForce GTX 660
[03/26/14 02:21:16] TRACE [14804]: RPC_CLIENT::init connect 2: Winsock error '10061'
[03/26/14 02:21:16] TRACE [14804]: RPC_CLIENT::init connect on 576 returned -1
02:21:16 (14236): can't connect to localhost
02:21:16 (14236): Erreur reception active_result
02:21:17 (14236): No heartbeat from client for 30 sec - exiting
No ATI library found.
Device GeForce GTX 660 already detected as NVIDIA GeForce GTX 660
03:51:03 (16976): Can't acquire lockfile (32) - waiting 35s
03:51:07 (8092): No heartbeat from client for 30 sec - exiting
No ATI library found.
Device GeForce GTX 660 already detected as NVIDIA GeForce GTX 660
05:46:19 (16976): called boinc_finish
</stderr_txt>
- ALF - "Find out what you don't do well ..... then don't do it!" :) |
Send message Joined: 7 Apr 10 Posts: 224 Credit: 461,423 RAC: 0 |
"The last WUProp@Home tasks on this computer show some problems (which may be the reason for missing hours):"
Well, if you look closely at the date and time, you will notice that the last WU has nothing to do with the PG tasks, because they had already finished over a day earlier... ;-) This WU is an exception and was lost because I updated my video driver. I waited too long with the reboot and unfortunately got a bluescreen all of a sudden. When the machine restarted, the WU failed for whatever reason. Other projects are missing some hours now, but that is negligible. ;-) If you think the lost time of the PG tasks could be due to lost WUProp results, I don't think that's possible. To get to the hours I described above it would take at least 13 lost WUProp tasks, and that definitely did not happen during the crunching, not even once.
"Other task was accepted (Valid):"
Well, since they are valid, I guess there can't have been much data loss. At least I can't see much of a loss on other projects except for the failure above, and I check my WUProp list regularly. If I didn't, I guess I wouldn't even have noticed the problem with the PG tasks. ;-) Well, it doesn't matter much to me anyway; I just thought I'd bring it up so that the project admins can check and possibly correct it. Once I have 500 hours from PG PSP LLR I move on to the next subprojects, and with the pending hours from the second computer I have already reached that. ;-) Btw: nice avatar and sig, live long and prosper and don't eat cats. ;-D
Edit 2: On a side note, I notice now that the pending hours on the second computer are quite correct, whereas they weren't on the first one. It's a rather odd theory, but maybe an OS issue could also cause the missing hours. Maybe WUProp counts multiple PSP LLRs correctly on Windows Vista, but not on Win 7... Life is Science, and Science rules. To the universe and beyond Member of BOINC@Heidelberg My BOINC-Stats |
Send message Joined: 28 Mar 10 Posts: 588 Credit: 1,220,149 RAC: 237 |
G'Day DoctorNow, Not sure if it is any help but that computer with Win 7 on it after the WU error a new WU was issued on the 26th but still has not been returned, so is there a problem with that computer? Counting the error WU, there has been no output from that computer since 4.47 UTC on the 26th. Just another thing for you to scratch your head over. Conan |
Send message Joined: 7 Apr 10 Posts: 224 Credit: 461,423 RAC: 0 |
"Not sure if it is any help but that computer with Win 7 on it after the WU error a new WU was issued on the 26th but still has not been returned, so is there a problem with that computer?"
It's just turned off atm. ;-) I switch between my two computers regularly. And before you ask: no, I never went past the deadline of the WUs (while crunching the PG tasks). It has nothing to do with possible data loss. Life is Science, and Science rules. To the universe and beyond Member of BOINC@Heidelberg My BOINC-Stats |
Send message Joined: 7 Sep 10 Posts: 453 Credit: 945,109 RAC: 0 |
Today I had 3 system restarts and 3 WUProp WUs failed with the same error: Exit status -2147483645 (0xffffffff80000003)
W7x64 system, 8-thread i7, 8GB DDR3, 1TB drive
At the time I was 'trying' to run climateprediction.net work. In the recent past I sometimes got WUProp failures when running MW, so I have stopped that. It appears that when I do not run WUProp, I don't have any crashes...
The stderr says, time ::: (number) : Can't open init data file - running in standalone mode debug A, in_s: in , and similar, ~30 times
Below are links to today's failures.
http://wuprop.boinc-af.org/result.php?resultid=40719381
http://wuprop.boinc-af.org/result.php?resultid=40717505
http://wuprop.boinc-af.org/result.php?resultid=40711887
Error logs are all similar.
Task 40711887
Name wu_v4_1398018388_424326_0
Workunit 40109856
Created 10 May 2014 1:45:16 UTC
Sent 10 May 2014 13:02:29 UTC
Received 10 May 2014 18:59:30 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status -2147483645 (0xffffffff80000003)
Computer ID 66846
Report deadline 20 May 2014 13:02:29 UTC
Run time 18,052.59
CPU time 3.34
Validate state Invalid
Credit 0.00
Application version Data collect version 4 v4.14 (nci)
Stderr output
<core_client_version>7.2.42</core_client_version>
<![CDATA[
<message>
One or more arguments are invalid (0x80000003) - exit code -2147483645 (0x80000003)
</message>
<stderr_txt>
No NVIDIA library found
calInit() returned 4
ERROR: Invalid parameter detected in function (null). File: (null) Line: 0
ERROR: Expression: (null)
Unhandled Exception Detected...
- Unhandled Exception Record -
Reason: Breakpoint Encountered (0x80000003) at address 0x000007FEFCD53CA2
Engaging BOINC Windows Runtime Debugger...
******************** BOINC Windows Runtime Debugger Version 7.0.64 Dump Timestamp : 05/10/14 19:58:20 Install Directory : C:\Program Files\BOINC\ Data Directory : C:\ProgramData\BOINC Project Symstore : Loaded Library : C:\Program Files\BOINC\\dbghelp.dll Loaded Library : C:\Program Files\BOINC\\symsrv.dll Loaded Library : C:\Program Files\BOINC\\srcsrv.dll LoadLibraryA( C:\Program Files\BOINC\\version.dll ): GetLastError = 126 Loaded Library : version.dll Debugger Engine : 4.0.5.0 Symbol Search Path: C:\ProgramData\BOINC\slots\5;C:\ProgramData\BOINC\projects\wuprop.boinc-af.org ModLoad: 0000000040000000 00000000000f1000 C:\ProgramData\BOINC\projects\wuprop.boinc-af.org\data_collect_v4_4.14_windows_x86_64__nci.exe (-nosymbols- Symbols Loaded) Linked PDB Filename : c:\Documents and Settings\Seb\Mes documents\Visual Studio 2005\Projects\data_collect\x64\release\data_collect.pdb ModLoad: 0000000076e60000 00000000001a9000 C:\Windows\SYSTEM32\ntdll.dll (6.1.7601.18247) (-exported- Symbols Loaded) Linked PDB Filename : ntdll.pdb File Version : 6.1.7600.16385 (win7_rtm.090713-1255) Company Name : Microsoft Corporation Product Name : Microsoft® Windows® Operating System Product Version : 6.1.7600.16385 ModLoad: 0000000076c40000 000000000011f000 C:\Windows\system32\kernel32.dll (6.1.7601.18409) (-exported- Symbols Loaded) Linked PDB Filename : kernel32.pdb File Version : 6.1.7601.18015 (win7sp1_gdr.121129-1432) Company Name : Microsoft Corporation Product Name : Microsoft® Windows® Operating System Product Version : 6.1.7601.18015 ModLoad: 00000000fcd20000 000000000006b000 C:\Windows\system32\KERNELBASE.dll (6.1.7601.18229) (-exported- Symbols Loaded) Linked PDB Filename : kernelbase.pdb File Version : 6.1.7601.18015 (win7sp1_gdr.121129-1432) Company Name : Microsoft Corporation Product Name : Microsoft® Windows® Operating System Product Version : 6.1.7601.18015 ModLoad: 00000000fdf20000 000000000004d000 C:\Windows\system32\WS2_32.dll (6.1.7601.17514) (-exported- Symbols 
Loaded) Linked PDB Filename : ws2_32.pdb File Version : 6.1.7600.16385 (win7_rtm.090713-1255) Company Name : Microsoft Corporation Product Name : Microsoft® Windows® Operating System Product Version : 6.1.7600.16385 ModLoad: 00000000fd360000 000000000009f000 C:\Windows\system32\msvcrt.dll (7.0.7601.17744) (-exported- Symbols Loaded) Linked PDB Filename : msvcrt.pdb File Version : 7.0.7601.17744 (win7sp1_gdr.111215-1535) Company Name : Microsoft Corporation Product Name : Microsoft® Windows® Operating System Product Version : 7.0.7601.17744 ModLoad: 00000000fd620000 000000000012d000 C:\Windows\system32\RPCRT4.dll (6.1.7601.18205) (-exported- Symbols Loaded) Linked PDB Filename : rpcrt4.pdb File Version : 6.1.7600.16385 (win7_rtm.090713-1255) Company Name : Microsoft Corporation Product Name : Microsoft® Windows® Operating System Product Version : 6.1.7600.16385 ModLoad: 00000000fdf10000 0000000000008000 C:\Windows\system32\NSI.dll (6.1.7600.16385) (-exported- Symbols Loaded) Linked PDB Filename : nsi.pdb File Version : 6.1.7600.16385 (win7_rtm.090713-1255) Company Name : Microsoft Corporation Product Name : Microsoft® Windows® Operating System Product Version : 6.1.7600.16385 ModLoad: 0000000076d60000 00000000000fa000 C:\Windows\system32\USER32.dll (6.1.7601.17514) (-exported- Symbols Loaded) Linked PDB Filename : user32.pdb File Version : 6.1.7601.17514 (win7sp1_rtm.101119-1850) Company Name : Microsoft Corporation Product Name : Microsoft® Windows® Operating System Product Version : 6.1.7601.17514 ModLoad: 00000000fd750000 0000000000067000 C:\Windows\system32\GDI32.dll (6.1.7601.18275) (-exported- Symbols Loaded) Linked PDB Filename : gdi32.pdb File Version : 6.1.7601.18275 (win7sp1_gdr.131002-1533) Company Name : Microsoft Corporation Product Name : Microsoft® Windows® Operating System Product Version : 6.1.7601.18275 ModLoad: 00000000fd400000 000000000000e000 C:\Windows\system32\LPK.dll (6.1.7601.18177) (-exported- Symbols Loaded) Linked PDB Filename : lpk.pdb 
File Version : 6.1.7601.18177 (win7sp1_gdr.130605-1534) Company Name : Microsoft Corporation Product Name : Microsoft® Windows® Operating System Product Version : 6.1.7601.18177 ModLoad: 00000000fe290000 00000000000c9000 C:\Windows\system32\USP10.dll (1.626.7601.18009) (-exported- Symbols Loaded) Linked PDB Filename : usp10.pdb File Version : 1.0626.7601.18009 (win7sp1_gdr.121121-1431) Company Name : Microsoft Corporation Product Name : Microsoft(R) Uniscribe Unicode script processor Product Version : 1.0626.7601.18009 ModLoad: 0000000077030000 0000000000007000 C:\Windows\system32\PSAPI.DLL (6.1.7600.16385) (-exported- Symbols Loaded) Linked PDB Filename : psapi.pdb File Version : 6.1.7600.16385 (win7_rtm.090713-1255) Company Name : Microsoft Corporation Product Name : Microsoft® Windows® Operating System Product Version : 6.1.7600.16385 ModLoad: 00000000fd050000 00000000000db000 C:\Windows\system32\ADVAPI32.dll (6.1.7601.18247) (-exported- Symbols Loaded) Linked PDB Filename : advapi32.pdb File Version : 6.1.7600.16385 (win7_rtm.090713-1255) Company Name : Microsoft Corporation Product Name : Microsoft® Windows® Operating System Product Version : 6.1.7600.16385 ModLoad: 00000000ff150000 000000000001f000 C:\Windows\SYSTEM32\sechost.dll (6.1.7600.16385) (-exported- Symbols Loaded) Linked PDB Filename : sechost.pdb File Version : 6.1.7600.16385 (win7_rtm.090713-1255) Company Name : Microsoft Corporation Product Name : Microsoft® Windows® Operating System Product Version : 6.1.7600.16385 ModLoad: 00000000fdee0000 000000000002e000 C:\Windows\system32\IMM32.DLL (6.1.7600.16385) (-exported- Symbols Loaded) Linked PDB Filename : imm32.pdb File Version : 6.1.7600.16385 (win7_rtm.090713-1255) Company Name : Microsoft Corporation Product Name : Microsoft® Windows® Operating System Product Version : 6.1.7600.16385 ModLoad: 00000000fdb20000 0000000000109000 C:\Windows\system32\MSCTF.dll (6.1.7600.16385) (-exported- Symbols Loaded) Linked PDB Filename : msctf.pdb File Version : 
6.1.7600.16385 (win7_rtm.090713-1255) Company Name : Microsoft Corporation Product Name : Microsoft® Windows® Operating System Product Version : 6.1.7600.16385 ModLoad: 0000000073da0000 000000000015e000 C:\Program Files\BOINC\dbghelp.dll (6.8.4.0) (-exported- Symbols Loaded) Linked PDB Filename : dbghelp.pdb File Version : 6.8.0004.0 (debuggers(dbg).070519-0745) Company Name : Microsoft Corporation Product Name : Debugging Tools for Windows(R) Product Version : 6.8.0004.0 ModLoad: 0000000073d50000 000000000004e000 C:\Program Files\BOINC\symsrv.dll (6.8.4.0) (-exported- Symbols Loaded) Linked PDB Filename : symsrv.pdb File Version : 6.8.0004.0 (debuggers(dbg).070519-0745) Company Name : Microsoft Corporation Product Name : Debugging Tools for Windows(R) Product Version : 6.8.0004.0 ModLoad: 0000000073d10000 000000000003e000 C:\Program Files\BOINC\srcsrv.dll (6.8.4.0) (-exported- Symbols Loaded) Linked PDB Filename : srcsrv.pdb File Version : 6.8.0004.0 (debuggers(dbg).070519-0745) Company Name : Microsoft Corporation Product Name : Debugging Tools for Windows(R) Product Version : 6.8.0004.0 ModLoad: 00000000fbd60000 000000000000c000 C:\Windows\system32\version.dll (6.1.7600.16385) (-exported- Symbols Loaded) Linked PDB Filename : version.pdb File Version : 6.1.7600.16385 (win7_rtm.090713-1255) Company Name : Microsoft Corporation Product Name : Microsoft® Windows® Operating System Product Version : 6.1.7600.16385 *** Dump of the Process Statistics: *** - I/O Operations Counters - Read: 0, Write: 0, Other 0 - I/O Transfers Counters - Read: 0, Write: 0, Other 0 - Paged Pool Usage - QuotaPagedPoolUsage: 0, QuotaPeakPagedPoolUsage: 0 QuotaNonPagedPoolUsage: 0, QuotaPeakNonPagedPoolUsage: 0 - Virtual Memory Usage - VirtualSize: 0, PeakVirtualSize: 0 - Pagefile Usage - PagefileUsage: 0, PeakPagefileUsage: 0 - Working Set Size - WorkingSetSize: 0, PeakWorkingSetSize: 0, PageFaultCount: 0 *** Dump of thread ID 3104 (state: Initialized): *** - Information - Status: Base 
Priority: Normal, Priority: Normal, , Kernel Time: 0.000000, User Time: 0.000000, Wait Time: 0.000000 - Unhandled Exception Record - Reason: Breakpoint Encountered (0x80000003) at address 0x000007FEFCD53CA2 - Registers - rax=000000000000001a rbx=0000000000000000 rcx=00000000400a06e8 rdx=0000000000000002 rsi=0000000000000000 rdi=0000000000000000 r8=0000000000000000 r9=0000000000000000 r10=0000000040000000 r11=0000000000000200 r12=000000000000000a r13=0000000000000000 r14=0000000000000000 r15=0000000000000000 rip=00000000fcd53ca2 rsp=000000000012c5c8 rbp=0000000000000000 cs=0033 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00000206 - Callstack - ChildEBP RetAddr Args to Child 0012c5c0 4005b84f 00000000 00000000 00000000 01d812f0 KERNELBASE!DebugBreak+0x0 0012cb80 4005d4b1 00000000 0000003c 00000000 400684b9 data_collect_v4_4.14_windows_x8!+0x0 0012cbf0 4005d76a 00000201 00000000 00000000 00000000 data_collect_v4_4.14_windows_x8!+0x0 0012cc30 4000db11 ffffffff 000003ca 00000000 4009d3a0 data_collect_v4_4.14_windows_x8!+0x0 0012ff00 4005ce05 00000000 00000000 00000000 00000006 data_collect_v4_4.14_windows_x8!+0x0 0012ff50 76c559ed 00000000 00000000 00000000 00000000 data_collect_v4_4.14_windows_x8!+0x0 0012ff80 76e8c541 00000000 00000000 00000000 00000000 kernel32!BaseThreadInitThunk+0x0 0012ffd0 00000000 00000000 00000000 00000000 00000000 ntdll!RtlUserThreadStart+0x0 *** Dump of thread ID 3220 (state: Initialized): *** - Information - Status: Base Priority: Normal, Priority: Normal, , Kernel Time: 0.000000, User Time: 0.000000, Wait Time: 0.000000 - Registers - rax=00000000bac14649 rbx=0000000000000000 rcx=0000000076eb15fa rdx=0000000000000000 rsi=0000000000000064 rdi=0000000000000000 r8=000000000264fe88 r9=0000000000000000 r10=0000000000000000 r11=0000000000000246 r12=000000000264feb0 r13=0000000000000000 r14=0000000000000000 r15=0000000000000000 rip=0000000076eb15fa rsp=000000000264fe88 rbp=0000000000000000 cs=0033 ss=002b ds=0000 es=0000 fs=0000 gs=0000 
efl=00000246 - Callstack - ChildEBP RetAddr Args to Child 0264fe80 fcd21203 0264ff48 00000000 00000000 00000000 ntdll!ZwDelayExecution+0x0 0264ff20 4002dc9f 00000000 00000000 00000000 00000000 KERNELBASE!SleepEx+0x0 0264ff50 76c559ed 00000000 00000000 00000000 00000000 data_collect_v4_4.14_windows_x8!+0x0 0264ff80 76e8c541 00000000 00000000 00000000 00000000 kernel32!BaseThreadInitThunk+0x0 0264ffd0 00000000 00000000 00000000 00000000 00000000 ntdll!RtlUserThreadStart+0x0 *** Debug Message Dump **** *** Foreground Window Data *** Window Name : Window Class : Window Process ID: 0 Window Thread ID : 0 Exiting... </stderr_txt> ]]> |
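A side note on the exit status quoted in the post above: the three notations, -2147483645, 0x80000003 and 0xffffffff80000003, appear to be the same 32-bit value shown as a signed decimal, as plain hex, and as hex after sign extension to 64 bits. The small Python sketch below only demonstrates that conversion; `to_signed32` is a helper name invented here, not something from BOINC or WUProp. On Windows, 0x80000003 is the NTSTATUS code for a breakpoint exception, which fits the "Breakpoint Encountered" line in the dump, but that says nothing about what triggered it.

```python
def to_signed32(value: int) -> int:
    """Interpret the low 32 bits of value as a signed 32-bit integer.

    Hypothetical helper for illustration only.
    """
    low = value & 0xFFFFFFFF
    # If bit 31 is set, the number is negative in two's complement.
    return low - 0x1_0000_0000 if low & 0x8000_0000 else low


# The status as the task page prints it (sign-extended to 64 bits).
reported = 0xFFFFFFFF80000003

signed = to_signed32(reported)
low32_hex = hex(reported & 0xFFFFFFFF)

print(signed)     # signed decimal form of the exit status
print(low32_hex)  # the same value as unsigned 32-bit hex
```

So when searching for similar reports, any of the three spellings should lead to the same kind of failure.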
Send message Joined: 7 Apr 10 Posts: 224 Credit: 461,423 RAC: 0 |
You should have opened a new thread; this has nothing to do with the problem I reported here... Life is Science, and Science rules. To the universe and beyond Member of BOINC@Heidelberg My BOINC-Stats |
Send message Joined: 7 Sep 10 Posts: 453 Credit: 945,109 RAC: 0 |
In my case the issue was related to system memory. Swapped it and no further problems, so far... =sticky tape= |
©2024 Sébastien