WUProps WU will not complete

marmot
Message 8177 - Posted: 30 May 2021, 1:30:30 UTC

My 1090T, which has contributed many hours and hosts my best GPUs, hasn't completed a WUProp WU since May 13 (the weather has been much hotter and it was off for 10+ days).

I reset the project several times, then pulled and reinstalled it after making sure the slots and data folders were empty of WUProps remains.

The WU from today is now going on 36,000 seconds and shouldn't go past 23,000.

Process Hacker shows the WU using 0.06% CPU roughly every 30 seconds and writing about 24 KB to the drive about as often. I'm not seeing any network activity.

This machine runs Windows X, 2017 edition. The only things that have changed since early May: I reinstalled the NVIDIA and AMD GPU drivers, and I turned off a few services, but I turned them back on and still have the issue. (This should not be relevant.)

The only thing I was doing differently around May 13 was manually setting the affinity of all 8 Minecraft@Home WUs to core 5 to see whether NVIDIA OpenCL's locking of cores could be alleviated (it helped greatly, but I need a script/app to automate the affinity shift).
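
For the affinity automation mentioned above, here is a minimal Python sketch, not a tested tool. `affinity_mask` builds the bitmask form of an affinity setting (core 0 is bit 0), and `pin_matching_processes` applies a core list to every process whose name contains a given fragment. Both function names are hypothetical, the second one assumes the third-party `psutil` package is installed, and the process-name fragment you pass in is whatever the science app is called on your host.

```python
def affinity_mask(cores):
    """Return the affinity bitmask with one bit set per core index (core 0 = bit 0)."""
    mask = 0
    for core in cores:
        if core < 0:
            raise ValueError("core index must be non-negative")
        mask |= 1 << core
    return mask


def pin_matching_processes(name_fragment, cores):
    """Pin every process whose name contains name_fragment to the given cores.

    Assumes the third-party psutil package; returns the PIDs that were changed.
    """
    import psutil  # imported lazily so affinity_mask works without it

    pinned = []
    for proc in psutil.process_iter(["name"]):
        name = proc.info["name"] or ""
        if name_fragment.lower() in name.lower():
            try:
                proc.cpu_affinity(list(cores))
                pinned.append(proc.pid)
            except (psutil.AccessDenied, psutil.NoSuchProcess):
                continue  # skip processes we cannot touch or that just exited
    return pinned


if __name__ == "__main__":
    # Core 5 alone corresponds to bit 5 of the mask.
    print(hex(affinity_mask([5])))
```

Running `pin_matching_processes` from a scheduled task every few minutes would catch newly started WUs, since each new task starts as a fresh process with the default affinity.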

Can playing with affinities earn a WUProps ban?

Is just changing the BOINC host CPID enough to fix this?
I haven't done that in a while. Just changing the computer name (equivalent to the domain name) in Windows is enough to force a new host CPID, but then I'd probably have to merge the machine with its old stats on every project it's attached to, which is distressingly many.
Maybe detaching WUProps first, then changing the domain name, then reattaching would force only WUProps to generate a new host CPID, while the other projects would not generate new ones or, even if they do, would keep the stats properly merged with the old ID?

Will a new WUProps CrossID even fix my issue?

Is there another fix?
Like figuring out why this is happening to my machine?
mikey
Message 8178 - Posted: 30 May 2021, 2:54:34 UTC - in response to Message 8177.  

Mine run for 6 hours, 36k seconds, of clock time but show only 21k seconds of CPU time when I get credit for them. I would let it run for a few days and see what happens.

And setting CPU affinity didn't have any effect on WUProp when I did it a while back.

I have no idea about new host CPIDs and WUProp, but I do know they can take a few days to migrate across the stats sites.
[AF>WildWildWest] Sébastie...
Project administrator
Message 8179 - Posted: 30 May 2021, 7:03:25 UTC - in response to Message 8177.  
Last modified: 30 May 2021, 7:04:15 UTC

@Marmot:
The workunit can't connect to BOINC (Winsock error '10061').
Could you check if the antivirus is blocking the connection?
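
Winsock error 10061 is WSAECONNREFUSED: the TCP connection attempt was actively refused, which usually means nothing is accepting connections on that address and port. A minimal Python sketch for checking this from the affected host follows. It assumes the BOINC client is listening on its default GUI RPC port, 31416, on localhost; change the port if the client is configured differently. The function name `probe_tcp` is hypothetical.

```python
import socket


def probe_tcp(host="127.0.0.1", port=31416, timeout=2.0):
    """Try one TCP connection and report 'open', 'refused', or 'error: ...'."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # On Windows this is the case reported as Winsock error 10061.
        return "refused"
    except OSError as exc:
        # Timeouts, unreachable hosts, and similar failures end up here.
        return f"error: {exc}"


if __name__ == "__main__":
    print("BOINC GUI RPC port:", probe_tcp())
```

If this prints "open" while the workunit still reports 10061, the refusal is probably specific to the account or environment the science app runs in rather than to the client being down.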
marmot
Message 8189 - Posted: 3 Jun 2021, 7:13:43 UTC - in response to Message 8179.  
Last modified: 3 Jun 2021, 7:19:21 UTC

I thought that might be the issue, so I completely shut down Windows Firewall, and it didn't seem to help.
It would be odd anyway, since the machine was running WUs fine with the firewall on and the filter list didn't contain the WUProps app.
The machine has been off since just after I posted; I'll fire it back up and try another WU tomorrow.

Sorry, I hadn't looked into the WU logs. Now I see that error in the aborted WU.

Thanks for the response; this should be enough info for me to fix the issue.
marmot
Message 8249 - Posted: 15 Jul 2021, 17:18:12 UTC
Last modified: 15 Jul 2021, 17:20:20 UTC

So we had a heat wave and I just shut the machines off for weeks.

I noticed Gravitational Wave search O3 All-Sky #1 was available and had to fix this issue because it's on the GPU host.


The error is "RPC_CLIENT::init connect 2: Winsock error '10061'" and has nothing to do with the firewall.

(Just checking the firewall GUI turned Windows Defender back on, and now all the tricks I previously used to disable it are not working. I've forgotten a trick, *sigh*.)

This error could have been caused by the 2 RPC Windows services not being enabled (they are enabled).

I read several odd posts around the web and decided to upgrade BOINC. BOINC spent a very long time checking folder privileges, but the upgrade did not fix the issue.
I always install BOINC for all users rather than only the current user.

Finally, my brain said "ah ha!": I right-clicked the BOINC/Data folder and gave write/modify privileges to all users on the machine, and now WUProps works.
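
A quick way to see whether a given account can write to the data folder is to try creating a file there. The Python sketch below does only that. It is a rough check under stated assumptions: `can_write` is a hypothetical helper, the path `C:\ProgramData\BOINC` is the usual default data directory on Windows and may differ on your install, and the result reflects only the account running the script, which may not be the account the BOINC client or the science apps run under.

```python
import os
import tempfile


def can_write(directory):
    """Return True if the current user can create and delete a file in directory."""
    try:
        fd, path = tempfile.mkstemp(prefix="write_test_", dir=directory)
    except OSError:
        # Covers permission errors as well as a missing directory.
        return False
    os.close(fd)
    os.remove(path)
    return True


if __name__ == "__main__":
    data_dir = r"C:\ProgramData\BOINC"  # assumed default location; adjust to your install
    print(data_dir, "writable:", can_write(data_dir))
```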

This machine was running WUProps fine in early May, and I did not upgrade BOINC or modify users or the OS back in May when this stopped working. No idea how this happened!

Maybe if I installed BOINC for the current admin user running BOINC, I could change the folder privileges back, BUT since it's working, I do not want to waste any more time on this.

Thanks for the responses, mikey and [AF>WildWildWest]Sebastien.

Be back more regularly in September.

TLDR: Gave write/modify permissions to all users for the BOINC Data folder to fix the RPC call error.
mikey
Message 8253 - Posted: 15 Jul 2021, 17:45:39 UTC - in response to Message 8249.  

If BOINC would give clearer error messages, it sure would help.


©2024 Sébastien