Errors on ARM

Message boards : Number crunching : Errors on ARM

ebahapo
Joined: 6 Apr 10
Posts: 41
Credit: 471,539
RAC: 0
Message 841 - Posted: 7 Feb 2013, 15:50:39 UTC

My ARM box has been failing WUs over the last week. Some complete fine, some fail after a while, and some fail immediately.

I have the impression that this started around the time badges were added.

Please advise.

Pooh Bear 27
Joined: 22 Jan 13
Posts: 107
Credit: 805,609
RAC: 48
Message 844 - Posted: 7 Feb 2013, 17:56:01 UTC

Have you updated NativeBOINC? There was an update to handle the new WUProp code for badges.

ebahapo
Joined: 6 Apr 10
Posts: 41
Credit: 471,539
RAC: 0
Message 847 - Posted: 7 Feb 2013, 21:20:13 UTC - in response to Message 844.  

It doesn't run Android, but regular Linux on ARM. It continues to run Oproject just fine; the problem occurs only with WUProp.

BilBg
Joined: 20 Jun 12
Posts: 63
Credit: 94,685
RAC: 0
Message 850 - Posted: 8 Feb 2013, 2:57:24 UTC - in response to Message 847.  


Even your 'Valid' tasks on this machine report some errors:
http://wuprop.boinc-af.org/result.php?resultid=23689917

Stderr output
<core_client_version>7.1.0</core_client_version>
<![CDATA[
<stderr_txt>
17:21:58 (9384): Nombre de GPU non conforme: 0
17:21:58 (9384): Erreur wu_terminee (wu deja reportee)
18:21:59 (9384): Erreur reception active_result
18:30:12 (9384): Erreur reception state
18:30:12 (9384): Erreur wu_terminee (chargement state)
20:21:58 (9384): called boinc_finish

</stderr_txt>
]]>


An 'Invalid' task (credit 0.00, presumably because the result was 'empty', i.e. no info was returned about your other projects/apps):
http://wuprop.boinc-af.org/result.php?resultid=23621481

Stderr output
<core_client_version>7.1.0</core_client_version>
<![CDATA[
<stderr_txt>
00:08:52 (14492): Nombre de GPU non conforme: 0
00:08:52 (14492): Erreur wu_terminee (wu deja reportee)
01:08:52 (14492): Erreur reception active_result
03:01:52 (14492): Erreur reception active_result
03:08:52 (14492): called boinc_finish

</stderr_txt>
]]>


'Error while computing'
http://wuprop.boinc-af.org/result.php?resultid=23637143

Stderr output
<core_client_version>7.1.0</core_client_version>
<![CDATA[
<message>
process got signal 11
</message>
<stderr_txt>
09:10:36 (26026): Nombre de GPU non conforme: 0
09:10:36 (26026): Erreur wu_terminee (wu deja reportee)
09:45:36 (26026): Erreur reception active_result
10:10:36 (26026): Erreur reception active_result
11:12:37 (26026): Erreur reception active_result

</stderr_txt>
]]>


(I think 'process got signal 11' means something killed the process)
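For readers who don't read French, here is a rough Python sketch that tallies the messages in a stderr excerpt like the ones above and attaches English glosses. The glosses are my own translation, not official WUProp documentation, and `summarize` and `GLOSSARY` are hypothetical names made up for this illustration.

```python
import re
from collections import Counter

# English glosses for the French messages seen in the logs above.
# These are informal translations, not official WUProp wording.
GLOSSARY = {
    "Nombre de GPU non conforme": "Non-conforming number of GPUs",
    "Erreur wu_terminee (wu deja reportee)": "Error wu_terminee (WU already reported)",
    "Erreur wu_terminee (chargement state)": "Error wu_terminee (loading state)",
    "Erreur reception active_result": "Error receiving active_result",
    "Erreur reception state": "Error receiving state",
}

# Matches lines such as "09:10:36 (26026): Erreur reception active_result".
LINE_RE = re.compile(r"^(\d{2}:\d{2}:\d{2}) \((\d+)\): (.*)$")


def summarize(stderr_text):
    """Count each message in a stderr excerpt, keyed by its English gloss."""
    counts = Counter()
    for line in stderr_text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip blank lines and XML wrapper lines
        # Drop a trailing ": value" part, e.g. the ": 0" after the GPU message.
        key = match.group(3).split(":")[0].strip()
        counts[GLOSSARY.get(key, key)] += 1
    return counts


sample = """\
09:10:36 (26026): Nombre de GPU non conforme: 0
09:10:36 (26026): Erreur wu_terminee (wu deja reportee)
09:45:36 (26026): Erreur reception active_result
10:10:36 (26026): Erreur reception active_result
11:12:37 (26026): Erreur reception active_result
"""

for message, count in summarize(sample).items():
    print(f"{count} x {message}")
```

Messages that are not in the glossary (for example `called boinc_finish`) are counted under their original text.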





- ALF - "Find out what you don't do well ..... then don't do it!" :)

ebahapo
Joined: 6 Apr 10
Posts: 41
Credit: 471,539
RAC: 0
Message 854 - Posted: 8 Feb 2013, 17:08:11 UTC - in response to Message 850.  

Signal #11 is SIGSEGV, or segmentation violation, AKA "oh, $#!%".
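If you want to check a signal number yourself, a minimal Python sketch using the standard `signal` module looks like this. `describe_signal` is just a hypothetical helper name for the example, and the numbering assumes a Linux host like the one discussed here.

```python
import signal


def describe_signal(number):
    """Return the symbolic name for a signal number on this platform."""
    return signal.Signals(number).name


print(describe_signal(11))  # SIGSEGV on Linux
```

A SIGSEGV means the process touched memory it was not allowed to access, so it points at a crash inside the application rather than at something external killing it.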

I don't understand why some WUs work and others don't. In the past, most worked; now, apparently since the addition of badges, only a few seem to work. Should the ARM application be updated like those for mainstream hosts?

TIA

ebahapo
Joined: 6 Apr 10
Posts: 41
Credit: 471,539
RAC: 0
Message 1012 - Posted: 6 Mar 2013, 16:15:30 UTC
Last modified: 6 Mar 2013, 16:18:39 UTC

The new version of the app has been failing consistently (e.g. http://bit.ly/XSXNG4).

Please advise.

ebahapo
Joined: 6 Apr 10
Posts: 41
Credit: 471,539
RAC: 0
Message 1020 - Posted: 7 Mar 2013, 21:07:04 UTC - in response to Message 1012.  

The new version of the app has been failing consistently (e.g. http://bit.ly/XSXNG4).

Coleslaw
Joined: 11 Apr 10
Posts: 182
Credit: 8,446,943
RAC: 962
Message 1021 - Posted: 9 Mar 2013, 5:00:12 UTC

My cell phones running NativeBOINC on Android are running fine. It must be the Linux app.

skgiven
Joined: 7 Sep 10
Posts: 453
Credit: 945,109
RAC: 0
Message 1023 - Posted: 9 Mar 2013, 11:45:37 UTC - in response to Message 1020.  

Augustine, have you checked the BOINC folder security?

ebahapo
Joined: 6 Apr 10
Posts: 41
Credit: 471,539
RAC: 0
Message 1024 - Posted: 9 Mar 2013, 14:23:27 UTC

This is a Linux ARM system, a NAS actually. WUProp used to run OK on it; some WUs would fail, but since the new version came out, almost all of them have failed.

Please advise.

Jaska*
Joined: 29 Dec 10
Posts: 4
Credit: 714,739
RAC: 174
Message 1044 - Posted: 12 Mar 2013, 18:20:25 UTC

Here, check out my phone's vast number of failures.

That IS an Android system on ARM. It's basically running Linux, only not really, because it's Android. I tried using the official Berkeley installer, but all the projects said "nope no android-linux-gnu please okay bye".

I'm no expert on the ARM architecture, but has your system got ANY swap space at all? What kind of runtime settings are being used?

Beyond that, I think I'll just queue up behind you for answers...

ebahapo
Joined: 6 Apr 10
Posts: 41
Credit: 471,539
RAC: 0
Message 1072 - Posted: 17 Mar 2013, 16:41:01 UTC

Are these issues ever going to be given any attention???

[AF>WildWildWest] Sébastie...
Project administrator
Joined: 28 Mar 10
Posts: 2871
Credit: 538,697
RAC: 133
Message 1073 - Posted: 17 Mar 2013, 19:43:13 UTC - in response to Message 1072.  

I updated the application.
This should correct the problem.

ebahapo
Joined: 6 Apr 10
Posts: 41
Credit: 471,539
RAC: 0
Message 1074 - Posted: 17 Mar 2013, 20:13:33 UTC - in response to Message 1073.  

Yes, it seems to be going well. I'll let you know how it ends.

Thank you.

ebahapo
Joined: 6 Apr 10
Posts: 41
Credit: 471,539
RAC: 0
Message 1075 - Posted: 18 Mar 2013, 3:18:19 UTC - in response to Message 1074.  

The WUs are now being completed and validated by my ARM Linux host too.

Thanks.

ebahapo
Joined: 6 Apr 10
Posts: 41
Credit: 471,539
RAC: 0
Message 1077 - Posted: 19 Mar 2013, 20:36:27 UTC

It seems that some WUs (~10%) fail for no obvious reason, like this one and this one.

Please advise.

ebahapo
Joined: 6 Apr 10
Posts: 41
Credit: 471,539
RAC: 0
Message 1215 - Posted: 10 May 2013, 19:50:39 UTC - in response to Message 1077.  
Last modified: 10 May 2013, 20:01:16 UTC

Bump!

I've been seeing about 30% of the WUs for this host fail with a segmentation fault (signal #11), and the host quickly reaches its quota for the day.

Please advise.

ebahapo
Joined: 6 Apr 10
Posts: 41
Credit: 471,539
RAC: 0
Message 1235 - Posted: 17 May 2013, 20:52:29 UTC
Last modified: 17 May 2013, 20:53:09 UTC

With so many errors, the daily quota for the host was decreased to just 3. Now almost 100% of the WUs fail, and the host may go a whole day without getting any WU.
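To see how a run of failures can drag a quota down like this, here is a small Python sketch. It assumes a simple rule (each error lowers the quota by one, never below 1, and each success doubles it up to a project maximum), which is my understanding of how a BOINC scheduler adjusts per-host quotas, not something confirmed for this project. `simulate_quota` and all the numbers are hypothetical.

```python
def simulate_quota(outcomes, start, max_quota):
    """Track a per-host daily quota across a sequence of task outcomes.

    Assumed rule (not confirmed for WUProp): an error lowers the quota
    by 1 (floor of 1); a success doubles it (capped at max_quota).
    """
    quota = start
    history = []
    for succeeded in outcomes:
        if succeeded:
            quota = min(max_quota, quota * 2)
        else:
            quota = max(1, quota - 1)
        history.append(quota)
    return history


# Hypothetical run: seven failures in a row, then one success.
print(simulate_quota([False] * 7 + [True], start=10, max_quota=10))
```

Under that assumed rule, a host that mostly fails keeps sliding toward the floor, and the occasional success is not enough to recover.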

Please advise.

Ray_GTI-R
Joined: 25 Oct 12
Posts: 6
Credit: 347,215
RAC: 0
Message 1377 - Posted: 19 Jul 2013, 21:39:46 UTC
Last modified: 19 Jul 2013, 21:40:21 UTC

For the device: http://wuprop.boinc-af.org/show_host_detail.php?hostid=54606

Some WUs fail (*), and some show as Completed and Validated but still report errors (**), although credit is granted for them.

*
Results for Task http://wuprop.boinc-af.org/result.php?resultid=30480624

<core_client_version>6.12.38</core_client_version>
<stderr_txt>
&#133;0W&#128; Erreur reception host_info
&#128; Nombre de GPU non conforme: 0
&#128; Nombre de GPU non conforme: 0
&#128; Nombre de GPU non conforme: 0
&#128; Nombre de GPU non conforme: 0
&#128; Erreur reception host_info
&#128; Nombre de GPU non conforme: 0
... etc, etc

</stderr_txt>
<message>
upload failure: <file_xfer_error>
<file_name>wu_v3_1373905034_156447_0_0</file_name>
<error_code>-161</error_code>
</file_xfer_error>

---------------------------------------

**
Results for Task http://wuprop.boinc-af.org/result.php?resultid=30430344

<core_client_version>6.12.38</core_client_version>
<![CDATA[
<stderr_txt>
10:00:27 (505): Nombre de GPU non conforme: 0
10:00:27 (505): Erreur wu_terminee (wu deja reportee)
11:00:27 (505): Erreur reception active_result
13:00:27 (505): called boinc_finish

</stderr_txt>

Note this device has no GPU/coprocessor (!!!) as confirmed by ...
http://wuprop.boinc-af.org/show_host_detail.php?hostid=54606
Has anyone figured out what's going on?

Cheers, Ray

©2024 Sébastien