Invalid WU's
Author | Message |
---|---|
Send message Joined: 11 Jun 10 Posts: 12 Credit: 996,343 RAC: 48 |
I started three computers on this project and two are working just fine. The third, computer ID 5522, keeps getting its results marked "Completed, marked as invalid". This is a partial listing of the stderr output: <core_client_version>6.10.17</core_client_version> <![CDATA[ <stderr_txt> SIGPIPE: write on a pipe with no reader 19:30:42 (2573): Interrogation impossible Broken pipe. SIGPIPE: write on a pipe with no reader 19:30:43 (2573): Interrogation impossible Broken pipe. SIGPIPE: write on a pipe with no reader 19:30:44 (2573): Interrogation impossible Broken pipe. ... SIGPIPE: write on a pipe with no reader 07:33:03 (14831): Interrogation impossible Broken pipe. SIGPIPE: write on a pipe with no reader 07:33:05 (14831): Interrogation impossible Broken pipe. 07:33:06 (14831): Deconnection impossible Transport endpoint is not connected. 07:33:06 (14831): Connection impossible Connection refused. SIGPIPE: write on a pipe with no reader 07:33:06 (14831): Interrogation impossible Broken pipe. SIGPIPE: write on a pipe with no reader ... SIGPIPE: write on a pipe with no reader 07:43:03 (14831): Interrogation impossible Broken pipe. SIGPIPE: write on a pipe with no reader 07:43:04 (14831): Interrogation impossible Broken pipe. SIGPIPE: write on a pipe with no reader 07:43:05 (14831): Interrogation impossible Broken pipe. 07:43:06 (14831): Deconnection impossible Transport endpoint is not connected. 07:43:06 (14831): called boinc_finish </stderr_txt> ]]> Can someone explain what is going on here, and whether there is anything I should be doing on my end to fix the problem? This computer is running several projects and only CPU WUs. |
Send message Joined: 11 Jun 10 Posts: 12 Credit: 996,343 RAC: 48 |
I tried a few of the newest WU's and got the same messages again. Can anyone help me figure this out? Thanks John |
Send message Joined: 28 Mar 10 Posts: 588 Credit: 1,220,597 RAC: 237 |
I had something like this on a remote computer and on my two home Windows computers. I found that the anti-virus programme I am running had blocked the WU application when it was updated and the file version changed from 1.37 to 1.38. This happens each time the version number changes: the WU sits there spinning its wheels, increasing run time but not actually running, because the anti-virus programme is waiting for me to confirm that the programme is safe to run. It buggered up on three computers this time because the version number changed twice in quick succession, from 1.36 to 1.37 and then to 1.38. The Linux computers have been unaffected by this issue. So I will have to check out my anti-virus programme (Zone Alarm). Conan. |
Send message Joined: 28 Mar 10 Posts: 2871 Credit: 538,595 RAC: 132 |
"I tried a few of the newest WU's and got the same messages again. Can anyone help me figure this out?" Which distribution are you using? Are you using SELinux? I will try to reproduce the problem. |
Send message Joined: 11 Jun 10 Posts: 12 Credit: 996,343 RAC: 48 |
This machine is running Debian Linux. I will get the exact details when I get home tonight and let you know what they are. I do know it is a very recent version of Debian. Thanks. |
Send message Joined: 11 Jun 10 Posts: 12 Credit: 996,343 RAC: 48 |
OK, a few more details. The failing setup is Debian Sid with either kernel 2.6.32-trunk-686 (Debian 2.6.32-5) or 2.6.32-5-686 (Debian 2.6.32-20); both were used at various times and I'm not sure which was in use when. The machine is now running the 2.6.32-5-686-bigmem kernel, so I will enable this project again and see what it does. I will keep you informed of the progress. jna |
Send message Joined: 28 Mar 10 Posts: 2871 Credit: 538,595 RAC: 132 |
I installed 32-bit Debian Sid in a virtual machine. The data collection runs fine there. |
Send message Joined: 11 Jun 10 Posts: 12 Credit: 996,343 RAC: 48 |
Started running WUs again and got the same error messages. This machine is running Debian sid and Linux kernel 2.6.32-5-686-bigmem. So there is something different about this machine that causes problems. Are there any ports that this application uses that I could be blocking that could cause this problem? Thanks jna |
Send message Joined: 28 Mar 10 Posts: 2871 Credit: 538,595 RAC: 132 |
The application uses the same port as the BOINC Manager (31416). In a shell, could you type the commands below? telnet 127.0.0.1 31416 <get_cc_status> You should obtain
|
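For anyone without telnet at hand, the same check (is anything listening on the port the application expects?) can be sketched in a few lines of Python. This is only a minimal illustration using the standard `socket` module; the function name `port_is_open` is made up for this example and is not part of BOINC or of this project's application.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an errno value
        # (for example "connection refused") instead of raising.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    # 31416 is the port named in this thread; 31415 is the
    # non-standard port that turns up later in the discussion.
    for port in (31416, 31415):
        state = "open" if port_is_open("127.0.0.1", port) else "closed or refused"
        print(f"127.0.0.1:{port} is {state}")
```

A port reported as closed or refused matches the "Connection refused" lines in the stderr output quoted at the top of the thread.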
Send message Joined: 11 Jun 10 Posts: 12 Credit: 996,343 RAC: 48 |
Here is the result of the telnet command: telnet 127.0.0.1 31416 (UNKNOWN) [127.0.0.1] 31416 (?) : Connection refused telnet 127.0.0.1 31415 <get_cc_status> <boinc_gui_rpc_reply> <cc_status> <network_status>2</network_status> <ams_password_error>0</ams_password_error> <task_suspend_reason>0</task_suspend_reason> <network_suspend_reason>0</network_suspend_reason> <task_mode>2</task_mode> <task_mode_perm>2</task_mode_perm> <task_mode_delay>0.000000</task_mode_delay> <gpu_mode>2</gpu_mode> <gpu_mode_perm>2</gpu_mode_perm> <gpu_mode_delay>0.000000</gpu_mode_delay> <network_mode>2</network_mode> <network_mode_perm>2</network_mode_perm> <network_mode_delay>0.000000</network_mode_delay> <disallow_attach>0</disallow_attach> <simple_gui_only>0s</simple_gui_only> </cc_status> </boinc_gui_rpc_reply> It just occurred to me that I am using a non-standard port for BOINC. This is probably the cause of all the problems. I'm not sure whether you want to fix your app to handle non-standard ports, but I will wait for your reply before trying the standard port. jna |
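A reply like the one pasted above is plain XML, so its fields can be pulled out with Python's standard `xml.etree.ElementTree` if you want to inspect them programmatically. The sketch below is illustrative only: `SAMPLE_REPLY` is a shortened copy of the reply from this post, and `parse_cc_status` is a hypothetical helper name, not something provided by BOINC.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the reply shown in the post above.
SAMPLE_REPLY = """<boinc_gui_rpc_reply>
<cc_status>
<network_status>2</network_status>
<task_mode>2</task_mode>
<gpu_mode>2</gpu_mode>
<network_mode>2</network_mode>
<disallow_attach>0</disallow_attach>
</cc_status>
</boinc_gui_rpc_reply>"""


def parse_cc_status(reply_xml: str) -> dict[str, str]:
    """Map each child tag of <cc_status> to its text value."""
    root = ET.fromstring(reply_xml)
    status = root.find("cc_status")
    if status is None:
        raise ValueError("reply has no <cc_status> element")
    return {child.tag: (child.text or "").strip() for child in status}


if __name__ == "__main__":
    for tag, value in parse_cc_status(SAMPLE_REPLY).items():
        print(f"{tag} = {value}")
```

Getting any well-formed `<cc_status>` back is the useful signal here: it shows the client is answering on that port.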
Send message Joined: 28 Mar 10 Posts: 2871 Credit: 538,595 RAC: 132 |
I modified the application. To use a non-standard port for BOINC, you should download version 1.39 by aborting the current WU. You should also add a file named app_info.xml in the directory wuprop.boinc-af.org. The content of app_info.xml should be similar to this:
You should restart the BOINC client. |
Send message Joined: 11 Jun 10 Posts: 12 Credit: 996,343 RAC: 48 |
It works! The file stderr.txt contains: Utilisation du port 31415 ("Using port 31415") after 15 minutes of runtime! Yay! Thank you very much! (jna's girlfriend) |
Send message Joined: 14 May 10 Posts: 5 Credit: 96,576 RAC: 0 |
All I can say is that it is a friggen waste of computer time to have a work unit trudge along for 12 hours of run time, only to have it come up as invalid. Again, a friggen waste of my computer time. Can I get a rebate on my electric bill, please! |
Send message Joined: 14 May 10 Posts: 5 Credit: 96,576 RAC: 0 |
Yes |
Send message Joined: 14 May 10 Posts: 5 Credit: 96,576 RAC: 0 |
YES, a complete waste of my time. |
Send message Joined: 4 May 10 Posts: 8 Credit: 238,189 RAC: 0 |
"All I can say is that it is a friggen waste of computer time to have a work unit trudge along for 12 hours of run time, only to have it come up as invalid. Again, a friggen waste of my computer time. Can I get a rebate on my electric bill, please!" Well, considering that these work units are not CPU-intensive, their impact on your electric bill is minimal, to say the least. What other CPU projects were you running at that time on this computer? Consider yourself lucky that you never ran an Orbit work unit for 12 days and then had it come up invalid. |
Send message Joined: 25 Jul 10 Posts: 13 Credit: 33,946 RAC: 0 |
I ran 3 Orbit WUs and all three crashed. Must be Orbit's std MOD. They're off my radar for that. lol. |
Send message Joined: 14 May 10 Posts: 5 Credit: 96,576 RAC: 0 |
The work unit doesn't get done when the computer is off and, dumbfoundingly, when you turn your computer off in the middle of a work unit and you pass that 12-hour mark, the result is sent in no matter how far along you got the previous day. So: power on uses electricity, and the work unit trudges along to completion after 12 hours of work. Hopefully it passes muster and you get your 25 credits. The same electricity is being used for 12 hours only to get an "INVALID" upon completion. Any way you read it, the computer was powered up for those 12 hours of time that went down the drain. A total waste of the computer's processing time and electricity. |
Send message Joined: 4 May 10 Posts: 8 Credit: 238,189 RAC: 0 |
"The work unit doesn't get done when the computer is off and, dumbfoundingly, when you turn your computer off in the middle of a work unit and you pass that 12-hour mark, the result is sent in no matter how far along you got the previous day. So: power on uses electricity, and the work unit trudges along to completion after 12 hours of work. Hopefully it passes muster and you get your 25 credits. The same electricity is being used for 12 hours only to get an "INVALID" upon completion. Any way you read it, the computer was powered up for those 12 hours of time that went down the drain. A total waste of the computer's processing time and electricity." Please don't tell me this computer was turned on for the sole purpose of crunching work units here. |
Send message Joined: 11 Jun 10 Posts: 12 Credit: 996,343 RAC: 48 |
I'm not sure what mscharmack is talking about, but the machine I was having problems with was running 6 other projects. So the fact that this project's WUs were coming back invalid did not bother me in the least, and I did not consider it a waste of electricity. (OK, it bothered me enough to try to find out why they were coming back invalid and fix the problem.) If mscharmack is running just this project on his computers, that IS a waste of time and electricity, as it will not give credit if no other projects are running. Besides, this project does not award large amounts of credit per day anyway. You are much better off with just about any other project if all you want is credit. I first try to find projects that are doing something interesting that I like, and then I try to maximize credit. All I need to really start bringing in the credit is a GPU, but that will have to wait until I finish renovating my house. jna |
©2024 Sébastien