Errors on ARM

Message boards : Number crunching : Errors on ARM

ebahapo
   
Joined: 6 Apr 10
Posts: 41
Credit: 471,109
RAC: 0
Total hours: 351,238
Message 841 - Posted: 7 Feb 2013, 15:50:39 UTC

My ARM box has been failing WUs for the last week. Some go through fine, some fail after a while, and some fail immediately.

I have the impression that this started around the time badges were added.

Please advise.
____________

Profile Pooh Bear 27
 
Joined: 22 Jan 13
Posts: 106
Credit: 795,993
RAC: 44
Total hours: 1,919,554
Message 844 - Posted: 7 Feb 2013, 17:56:01 UTC

Have you updated NativeBOINC? There was an update to handle the new WUProp code for badges.

ebahapo
   
Joined: 6 Apr 10
Posts: 41
Credit: 471,109
RAC: 0
Total hours: 351,238
Message 847 - Posted: 7 Feb 2013, 21:20:13 UTC - in response to Message 844.

It doesn't run Android, but regular Linux on ARM. It continues to run Oproject just fine; the problem is only with WUProp.
____________

Profile BilBg
Joined: 20 Jun 12
Posts: 63
Credit: 94,685
RAC: 0
Total hours: 108,788
Message 850 - Posted: 8 Feb 2013, 2:57:24 UTC - in response to Message 847.


Even your 'Valid' tasks on this machine report some errors:
http://wuprop.boinc-af.org/result.php?resultid=23689917

Stderr output
<core_client_version>7.1.0</core_client_version>
<![CDATA[
<stderr_txt>
17:21:58 (9384): Nombre de GPU non conforme: 0
17:21:58 (9384): Erreur wu_terminee (wu deja reportee)
18:21:59 (9384): Erreur reception active_result
18:30:12 (9384): Erreur reception state
18:30:12 (9384): Erreur wu_terminee (chargement state)
20:21:58 (9384): called boinc_finish

</stderr_txt>
]]>


An 'Invalid' task (credit 0.00, since the result was evidently 'empty': no info was returned about your other projects/apps):
http://wuprop.boinc-af.org/result.php?resultid=23621481

Stderr output
<core_client_version>7.1.0</core_client_version>
<![CDATA[
<stderr_txt>
00:08:52 (14492): Nombre de GPU non conforme: 0
00:08:52 (14492): Erreur wu_terminee (wu deja reportee)
01:08:52 (14492): Erreur reception active_result
03:01:52 (14492): Erreur reception active_result
03:08:52 (14492): called boinc_finish

</stderr_txt>
]]>


'Error while computing'
http://wuprop.boinc-af.org/result.php?resultid=23637143

Stderr output
<core_client_version>7.1.0</core_client_version>
<![CDATA[
<message>
process got signal 11
</message>
<stderr_txt>
09:10:36 (26026): Nombre de GPU non conforme: 0
09:10:36 (26026): Erreur wu_terminee (wu deja reportee)
09:45:36 (26026): Erreur reception active_result
10:10:36 (26026): Erreur reception active_result
11:12:37 (26026): Erreur reception active_result

</stderr_txt>
]]>


(I think 'process got signal 11' means something killed the process)
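
For anyone who doesn't read French, the stderr lines quoted above can be glossed and tallied with a short script. This is only a sketch: the English glosses are unofficial translations, not strings from the WUProp application, and `GLOSSARY` and `parse_stderr` are hypothetical names made up for illustration.

```python
import re
from collections import Counter

# Unofficial English glosses for the French messages seen in the logs above.
GLOSSARY = {
    "Nombre de GPU non conforme": "Unexpected number of GPUs",
    "Erreur wu_terminee (wu deja reportee)": "Error wu_terminee (WU already reported)",
    "Erreur wu_terminee (chargement state)": "Error wu_terminee (loading state)",
    "Erreur reception active_result": "Error receiving active_result",
    "Erreur reception state": "Error receiving state",
    "Erreur reception host_info": "Error receiving host_info",
}

# Matches lines such as "17:21:58 (9384): Erreur reception state".
LINE_RE = re.compile(r"^(\d{2}:\d{2}:\d{2}) \((\d+)\): (.*)$")

def parse_stderr(text):
    """Return (time, pid, message) tuples, with known messages glossed in English."""
    entries = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip XML tags, blank lines, and anything without a timestamp
        time, pid, message = match.groups()
        for french, english in GLOSSARY.items():
            if message.startswith(french):
                # Keep any trailing value, e.g. the ": 0" after the GPU count message.
                message = english + message[len(french):]
                break
        entries.append((time, int(pid), message))
    return entries

sample = """\
09:10:36 (26026): Nombre de GPU non conforme: 0
09:10:36 (26026): Erreur wu_terminee (wu deja reportee)
09:45:36 (26026): Erreur reception active_result
10:10:36 (26026): Erreur reception active_result
11:12:37 (26026): Erreur reception active_result
"""

counts = Counter(message for _, _, message in parse_stderr(sample))
for message, n in counts.most_common():
    print(n, message)
```

Lines the glossary doesn't know, such as `called boinc_finish`, pass through unchanged, so the script can be pointed at any of the three logs above.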


____________



- ALF - "Find out what you don't do well ..... then don't do it!" :)

ebahapo
   
Joined: 6 Apr 10
Posts: 41
Credit: 471,109
RAC: 0
Total hours: 351,238
Message 854 - Posted: 8 Feb 2013, 17:08:11 UTC - in response to Message 850.

Signal #11 is SIGSEGV, a segmentation violation (invalid memory access), AKA "oh, $#!%".

I don't understand why some WUs work and others don't. In the past, most worked; now, apparently since badges were added, only a few do. Should the ARM application be updated like those of mainstream hosts?
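
To check which signal a number refers to on a given host, Python's standard `signal` module can do the lookup. A minimal sketch; `signal_name` is a hypothetical helper, not part of BOINC or WUProp:

```python
import signal

def signal_name(number):
    """Return the symbolic name for a signal number, or a fallback string."""
    try:
        return signal.Signals(number).name
    except ValueError:
        return f"unknown signal {number}"

# Print the local number for SIGSEGV alongside its name (11 on Linux).
print(int(signal.SIGSEGV), signal_name(signal.SIGSEGV))
```

Running it on the affected host confirms what the "process got signal 11" message in the task output refers to.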

TIA
____________

ebahapo
   
Joined: 6 Apr 10
Posts: 41
Credit: 471,109
RAC: 0
Total hours: 351,238
Message 1012 - Posted: 6 Mar 2013, 16:15:30 UTC
Last modified: 6 Mar 2013, 16:18:39 UTC

The new version of the app has been failing consistently (e.g. http://bit.ly/XSXNG4).

Please, advise.
____________

ebahapo
   
Joined: 6 Apr 10
Posts: 41
Credit: 471,109
RAC: 0
Total hours: 351,238
Message 1020 - Posted: 7 Mar 2013, 21:07:04 UTC - in response to Message 1012.

The new version of the app has been failing consistently (e.g. http://bit.ly/XSXNG4).


____________

Profile Coleslaw
         
Joined: 11 Apr 10
Posts: 182
Credit: 8,212,959
RAC: 1,586
Total hours: 19,524,262
Message 1021 - Posted: 9 Mar 2013, 5:00:12 UTC

My cell phones running NativeBOINC on Android are running fine. It must be the Linux app.
____________

Profile skgiven
       
Joined: 7 Sep 10
Posts: 453
Credit: 945,109
RAC: 0
Total hours: 2,101,570
Message 1023 - Posted: 9 Mar 2013, 11:45:37 UTC - in response to Message 1020.

Augustine, have you checked the Boinc folder security?

ebahapo
   
Joined: 6 Apr 10
Posts: 41
Credit: 471,109
RAC: 0
Total hours: 351,238
Message 1024 - Posted: 9 Mar 2013, 14:23:27 UTC

This is a Linux ARM system, a NAS actually. WUProp used to run OK on it; some WUs would fail, but since the new version came out, almost all of them have failed.

Please, advise.
____________

Profile Jaska*
   
Joined: 29 Dec 10
Posts: 4
Credit: 670,985
RAC: 231
Total hours: 1,319,313
Message 1044 - Posted: 12 Mar 2013, 18:20:25 UTC

Here, check out my phone's vast number of failures.

That IS an Android system on ARM. It's basically running Linux, only not really, because it's Android. I tried using the official Berkeley installer but all the projects said "nope no android-linux-gnu please okay bye"

I'm no expert on the ARM architecture, but has your system got ANY swap space at all? What kind of runtime settings are being used?

Beyond that, I think I'll just queue up behind you for answers...
____________


ebahapo
   
Joined: 6 Apr 10
Posts: 41
Credit: 471,109
RAC: 0
Total hours: 351,238
Message 1072 - Posted: 17 Mar 2013, 16:41:01 UTC

Are these issues ever going to be given any attention???
____________

Profile [AF>WildWildWest] Sebastien
     
Dictator
Joined: 28 Mar 10
Posts: 2691
Credit: 516,263
RAC: 94
Total hours: 1,442,894
Message 1073 - Posted: 17 Mar 2013, 19:43:13 UTC - in response to Message 1072.

I updated the application.
This should correct the problem.
____________

ebahapo
   
Joined: 6 Apr 10
Posts: 41
Credit: 471,109
RAC: 0
Total hours: 351,238
Message 1074 - Posted: 17 Mar 2013, 20:13:33 UTC - in response to Message 1073.

Yes, it seems to be going on well. I'll let you know how it ends.

Thank you.
____________

ebahapo
   
Joined: 6 Apr 10
Posts: 41
Credit: 471,109
RAC: 0
Total hours: 351,238
Message 1075 - Posted: 18 Mar 2013, 3:18:19 UTC - in response to Message 1074.

The WUs are now being completed and validated by my ARM Linux host too.

Thanks.
____________

ebahapo
   
Joined: 6 Apr 10
Posts: 41
Credit: 471,109
RAC: 0
Total hours: 351,238
Message 1077 - Posted: 19 Mar 2013, 20:36:27 UTC

It seems that some WUs (~10%) fail without an obvious reason, like this one and this one.

Please, advise.
____________

ebahapo
   
Joined: 6 Apr 10
Posts: 41
Credit: 471,109
RAC: 0
Total hours: 351,238
Message 1215 - Posted: 10 May 2013, 19:50:39 UTC - in response to Message 1077.
Last modified: 10 May 2013, 20:01:16 UTC

Bump!

I've been seeing about 30% of the WUs for this host failing with a segmentation fault (signal #11) and it quickly reaches its quota for the day.

Please, advise.
____________

ebahapo
   
Joined: 6 Apr 10
Posts: 41
Credit: 471,109
RAC: 0
Total hours: 351,238
Message 1235 - Posted: 17 May 2013, 20:52:29 UTC
Last modified: 17 May 2013, 20:53:09 UTC

With so many errors, the daily quota for the host was decreased to just 3. Now almost 100% of the WUs fail, and the host may go a whole day without getting any WU.

Please, advise.

____________

Profile Ray_GTI-R
Joined: 25 Oct 12
Posts: 6
Credit: 347,215
RAC: 0
Total hours: 408,411
Message 1377 - Posted: 19 Jul 2013, 21:39:46 UTC
Last modified: 19 Jul 2013, 21:40:21 UTC

For the device: http://wuprop.boinc-af.org/show_host_detail.php?hostid=54606

Some WUs fail (*), and some show as Completed and Validated yet still report errors (**), although they are given credit.

*
Results for Task http://wuprop.boinc-af.org/result.php?resultid=30480624

<core_client_version>6.12.38</core_client_version>
<stderr_txt>
&#133;0W&#128; Erreur reception host_info
&#128; Nombre de GPU non conforme: 0
&#128; Nombre de GPU non conforme: 0
&#128; Nombre de GPU non conforme: 0
&#128; Nombre de GPU non conforme: 0
&#128; Erreur reception host_info
&#128; Nombre de GPU non conforme: 0
... etc, etc

</stderr_txt>
<message>
upload failure: <file_xfer_error>
<file_name>wu_v3_1373905034_156447_0_0</file_name>
<error_code>-161</error_code>
</file_xfer_error>

---------------------------------------

**
Results for Task http://wuprop.boinc-af.org/result.php?resultid=30430344

<core_client_version>6.12.38</core_client_version>
<![CDATA[
<stderr_txt>
10:00:27 (505): Nombre de GPU non conforme: 0
10:00:27 (505): Erreur wu_terminee (wu deja reportee)
11:00:27 (505): Erreur reception active_result
13:00:27 (505): called boinc_finish

</stderr_txt>

Note this device has no GPU/coprocessor (!!!) as confirmed by ...
http://wuprop.boinc-af.org/show_host_detail.php?hostid=54606
Has anyone figured out what's going on?

Cheers, Ray
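
For anyone comparing several failed tasks, the upload failure block in the first result above can be pulled out of the task message programmatically. A rough sketch only: `upload_errors` is a hypothetical helper, and the regular expression assumes the `<file_xfer_error>` layout exactly as quoted in this post.

```python
import re

def upload_errors(message):
    """Return (file_name, error_code) pairs found in a task's message text."""
    pattern = re.compile(
        r"<file_xfer_error>\s*"
        r"<file_name>(.*?)</file_name>\s*"
        r"<error_code>(-?\d+)</error_code>\s*"
        r"</file_xfer_error>",
        re.DOTALL,
    )
    return [(name, int(code)) for name, code in pattern.findall(message)]

message = """upload failure: <file_xfer_error>
<file_name>wu_v3_1373905034_156447_0_0</file_name>
<error_code>-161</error_code>
</file_xfer_error>"""

print(upload_errors(message))
```

Collecting the codes from a batch of results makes it easier to tell whether every failure on the host is the same upload error or a mix of problems.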



Copyright © 2024 Sebastien