counting GPU time?


Message boards : Number crunching : counting GPU time?

zombie67 [MM]
             
Joined: 30 Mar 10
Posts: 219
Credit: 8,803,727
RAC: 1,234
Total hours: 30,507,241
Message 1002 - Posted: 4 Mar 2013, 2:50:33 UTC
Last modified: 4 Mar 2013, 2:52:10 UTC

I am crunching primegrid PPS Sieve exclusively on only one machine, on 3x GPUs and no CPUs. I've been tracking my hours here on my account page, and I am adding about 24 hours/day. So why are my stats not accumulating 72 hours/day?

I thought maybe the wuprop app is tracking CPU time instead of wall time. So I looked at the tasks, and the wall time is ~1600 seconds per task, and the CPU time is ~400 seconds per task. To make the math work, the ratio needs to be 1/3, not the 1/4 it really is.
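A quick check of the arithmetic in this post, using only the figures quoted above (3 GPUs, ~1600 s wall time and ~400 s CPU time per task, ~24 hours/day credited). The variable names are made up for illustration, and the sketch assumes each GPU runs one task at a time around the clock:

```python
from fractions import Fraction

# Figures quoted in the post; nothing here is measured independently.
GPUS = 3
WALL_SECONDS_PER_TASK = 1600
CPU_SECONDS_PER_TASK = 400
CREDITED_HOURS_PER_DAY = 24

# Assumption: each GPU runs one task at a time, 24 hours a day.
expected_wall_hours = GPUS * 24

# Fraction of the expected wall time that actually shows up in the stats.
credited_ratio = Fraction(CREDITED_HOURS_PER_DAY, expected_wall_hours)

# Fraction you would expect if CPU time were counted for all three GPUs.
cpu_ratio = Fraction(CPU_SECONDS_PER_TASK, WALL_SECONDS_PER_TASK)
cpu_time_hours = expected_wall_hours * cpu_ratio

print(expected_wall_hours)  # 72
print(credited_ratio)       # 1/3
print(cpu_ratio)            # 1/4
print(cpu_time_hours)       # 18
```

Under these assumptions, counting CPU time on all three GPUs would give about 18 hours/day rather than the 24 observed, which is why the wall time of a single GPU looks like the better fit.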

So, I am left with guessing that the wuprop is counting only one of my GPU task hours, not all three. Can anyone else confirm this behavior?
____________
Reno, NV
Team: SETI.USA

Profile skgiven
       
Joined: 7 Sep 10
Posts: 453
Credit: 945,109
RAC: 0
Total hours: 2,101,570
Message 1004 - Posted: 4 Mar 2013, 22:47:11 UTC - in response to Message 1002.
Last modified: 4 Mar 2013, 22:54:23 UTC

I think you are right; my take is that the WUProp app tallies up CPU runtimes, but is only seeing the one app rather than 3 apps on 3 GPUs. So it's a bit like the MT issue.
Conversely, when running with an app_info, the WUProp app will count 4 CPU runtimes if you run 4 GPU tasks per GPU. For the likes of POEM, which is really a CPU+GPU project, this is fine; you do more work and actually use 4 CPUs (or most of 4).

Profile [AF>WildWildWest] Sebastien
     
Dictator
Joined: 28 Mar 10
Posts: 2677
Credit: 513,703
RAC: 95
Total hours: 1,427,238
Message 1005 - Posted: 5 Mar 2013, 6:38:20 UTC

What is the ID of the host?
____________

Profile skgiven
       
Joined: 7 Sep 10
Posts: 453
Credit: 945,109
RAC: 0
Total hours: 2,101,570
Message 1006 - Posted: 6 Mar 2013, 11:22:36 UTC - in response to Message 1005.
Last modified: 6 Mar 2013, 11:49:19 UTC

[edited] because it's been Fixed, Thanks

Last Night,
Poem@Home POEM++ OpenCL version 2,695.08 1.28

Earlier Today,
Poem@Home POEM++ OpenCL version 1,402.53 0.88

Now,
Poem@Home POEM++ OpenCL version 2,719.90 0.70
____________
=sticky tape=

Profile Pooh Bear 27
 
Joined: 22 Jan 13
Posts: 106
Credit: 794,799
RAC: 46
Total hours: 1,908,756
Message 1007 - Posted: 6 Mar 2013, 11:44:50 UTC

I have 1 machine where the GPU seems to be ignored. I believe this because I am currently running Collatz and the one GPU only runs Mini, but the other GPU runs both. So Mini should be outperforming the regular but it is not. Only the GPU from the one box is being counted.

I know I had a notification early on that WUProp didn't understand my video card but I filled out the information and have not had that notification since.

ID: 47412 has an ATI HD5670 video card. These were not very popular cards and I understand why, but it is what I have and I wish the work on it to be counted.

zombie67 [MM]
             
Joined: 30 Mar 10
Posts: 219
Credit: 8,803,727
RAC: 1,234
Total hours: 30,507,241
Message 1009 - Posted: 6 Mar 2013, 14:19:10 UTC - in response to Message 1005.
Last modified: 6 Mar 2013, 14:23:35 UTC

> What is the ID of the host?


If that was directed to me, my host is:

http://wuprop.boinc-af.org/show_host_detail.php?hostid=34980

However, it is running multiple GPU projects now. I can set it back to running PPS Sieve exclusively, if that helps. Let me know.

Edit: Here is the same machine at PG, if that helps:

http://www.primegrid.com/show_host_detail.php?hostid=362114
____________
Reno, NV
Team: SETI.USA

Profile Pooh Bear 27
 
Joined: 22 Jan 13
Posts: 106
Credit: 794,799
RAC: 46
Total hours: 1,908,756
Message 1019 - Posted: 7 Mar 2013, 16:57:31 UTC

Experiment to see if ID: 47412 is really reporting GPU work or not.
It's the only one running mini-collatz at the moment; no other boxes are. If, after a WUProp task is returned, there is no advancement on mini-collatz, then I know that GPU is not being counted. Since they only take about 600 seconds each to run, there should be plenty of data in a WUProp unit to give results. I will report after the next unit is reported and processed.

Current:
mini_collatz 509.52

Profile Pooh Bear 27
 
Joined: 22 Jan 13
Posts: 106
Credit: 794,799
RAC: 46
Total hours: 1,908,756
Message 1033 - Posted: 10 Mar 2013, 15:30:51 UTC

Experiment finished, but I am still not understanding GPU counting.

I have 2 GPUs running Collatz 24x7
One runs both Mini and Regular
One runs Mini only

Since they both run Mini, I figured Mini would be outproducing Regular, but that is not the case.

Yes one card is nearly double the speed of the other, but I thought it was counting clock cycles? What are we really counting?

Profile Van Fanel
   
Joined: 28 May 12
Posts: 18
Credit: 110,476
RAC: 0
Total hours: 232,752
Message 1210 - Posted: 10 May 2013, 8:52:14 UTC

Hi all!

I would like to add my two cents on this issue. I have an ATI HD 5450 dedicated to Collatz. It has been running both mini and regular WUs. However, in recent days, all work done by this GPU has not been accounted for in my stats on WUProp. Apparently, all work done in this GPU is completely ignored.

I would like to know if this is an application problem or a WUProp problem, and what can be done to mitigate this issue.

Thank you!


Curly
   
Joined: 6 Apr 10
Posts: 20
Credit: 1,031,633
RAC: 32
Total hours: 1,349,391
Message 1211 - Posted: 10 May 2013, 9:11:56 UTC

Host 21159 has returned 6 Seti Beta tasks so far. The application is "SETI@home v7 v7.00" (type cuda32, meaning the GPU app).

Together they have a runtime of ~9.2 hrs and a CPU time of ~0.47 hrs.
Yet the application is shown with 0.10 hrs under Reported data. No pending time.

How is the Running time calculated for this application?

Curly
   
Joined: 6 Apr 10
Posts: 20
Credit: 1,031,633
RAC: 32
Total hours: 1,349,391
Message 1212 - Posted: 10 May 2013, 9:33:20 UTC

Another host (host id 1103) just returned a wuprop task.
The pendings for Primegrid PPS Sieve changed from 0.03 to 1.00 hrs, although it should have added 3 hrs. No change in Running time.

The host is a dual core with a single GPU (3 cores in total). This makes me wonder, since the 1 hour added is 1/3 of the expected 3 hours.

Curly
   
Joined: 6 Apr 10
Posts: 20
Credit: 1,031,633
RAC: 32
Total hours: 1,349,391
Message 1219 - Posted: 12 May 2013, 21:09:20 UTC

The GPU WUs are still credited with only a fraction of the Running time that they actually run.

Profile x3mEn
 
Joined: 24 May 11
Posts: 15
Credit: 323,643
RAC: 0
Total hours: 354,633
Message 1220 - Posted: 13 May 2013, 11:16:52 UTC
Last modified: 13 May 2013, 11:17:36 UTC

http://wuprop.boinc-af.org/results.php?hostid=51423

I can confirm, WUProp app is tracking CPU time instead of wall time.
Collatz Conjecture (collatz) is using 0.02 CPU + 1.00 NVIDIA and the progress of Running time (hours) growth is horribly slow.
FYI, I'm using app_config.xml for CPU usage limitation for collatz app.

<app_config>
  <app>
    <name>collatz</ name>
    <gpu_versions>
      <gpu_usage>0.02</gpu_usage>
      <cpu_usage>1.00</ cpu_usage>
    </gpu_versions>
  </app>
</app_config>

Profile Van Fanel
   
Joined: 28 May 12
Posts: 18
Credit: 110,476
RAC: 0
Total hours: 232,752
Message 1223 - Posted: 13 May 2013, 18:20:28 UTC - in response to Message 1220.

> http://wuprop.boinc-af.org/results.php?hostid=51423
>
> I can confirm, WUProp app is tracking CPU time instead of wall time.
> Collatz Conjecture (collatz) is using 0.02 CPU + 1.00 NVIDIA and the progress of Running time (hours) growth is horribly slow.


I confirm that the problem of counting CPU time instead of wall time also occurs for the GPUGrid application on both Linux and Windows clients.

Bummer... :'(

Profile [AF>WildWildWest] Sebastien
     
Dictator
Joined: 28 Mar 10
Posts: 2677
Credit: 513,703
RAC: 95
Total hours: 1,427,238
Message 1224 - Posted: 13 May 2013, 18:54:02 UTC
Last modified: 13 May 2013, 18:56:41 UTC

I fixed the problem
____________

Profile BilBg
Joined: 20 Jun 12
Posts: 63
Credit: 94,685
RAC: 0
Total hours: 108,788
Message 1228 - Posted: 14 May 2013, 2:28:52 UTC - in response to Message 1220.
Last modified: 14 May 2013, 2:35:14 UTC

> FYI, I'm using app_config.xml for CPU usage limitation for collatz app.
>
> <app_config>
>   <app>
>     <name>collatz</ name>
>     <gpu_versions>
>       <gpu_usage>0.02</gpu_usage>
>       <cpu_usage>1.00</ cpu_usage>
>     </gpu_versions>
>   </app>
> </app_config>

What do you think you accomplish by this?
There are several (severe) errors in the posted file ...

And you can't do "CPU usage limitation" using app_config.xml
http://boinc.berkeley.edu/wiki/Client_configuration#Application_configuration

If the file you posted were working (it is not), you would be instructing BOINC to run 50 GPU apps simultaneously per GPU (1 / 0.02 = 50) and to reserve one free CPU 'core' for each GPU app instance.


____________



- ALF - "Find out what you don't do well ..... then don't do it!" :)

Profile x3mEn
 
Joined: 24 May 11
Posts: 15
Credit: 323,643
RAC: 0
Total hours: 354,633
Message 1229 - Posted: 14 May 2013, 16:14:13 UTC
Last modified: 14 May 2013, 16:14:52 UTC

BilBg, you are wrong and right simultaneously.
Wrong because I don't have 50 free CPU cores, and right because I confused gpu and cpu usages.

<app_config>
  <app>
    <name>collatz</ name>
    <gpu_versions>
      <cpu_usage>0.02</cpu_usage>
      <gpu_usage>1.00</gpu_usage>
    </gpu_versions>
  </app>
</app_config>


In any case, the issue has already been fixed, so it doesn't matter.

Profile BilBg
Avatar
Joined: 20 Jun 12
Posts: 63
Credit: 94,685
RAC: 0
Total hours: 108,788
Message 1231 - Posted: 17 May 2013, 4:30:41 UTC - in response to Message 1229.
Last modified: 17 May 2013, 4:50:51 UTC

> BilBg, you are wrong and right simultaneously.
> Wrong because I don't have 50 free CPU cores, and right because I confused gpu and cpu usages.
>
> <app_config>
>   <app>
>     <name>collatz</ name>
>     <gpu_versions>
>       <cpu_usage>0.02</cpu_usage>
>       <gpu_usage>1.00</gpu_usage>
>     </gpu_versions>
>   </app>
> </app_config>
>
> In any case, the issue has already been fixed, so it doesn't matter.

You didn't fix this:
<name>collatz</ name>

It has to be:
<name>collatz</name>
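
One way to catch this kind of mistake before handing the file to BOINC is to run it through any XML parser. The sketch below uses Python's standard xml.etree.ElementTree; the helper name is_well_formed is made up for illustration, and the two strings are cut-down versions of the config in this thread. It only checks XML well-formedness. BOINC reads the file with its own parser, which may accept or reject things differently, so treat this as a sanity check rather than a statement about BOINC's behaviour.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text: str) -> bool:
    """Return True if xml_text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True

# Closing tag with a space after "</", as in the posted file.
bad = "<app_config><app><name>collatz</ name></app></app_config>"
# Same fragment with the closing tag corrected.
good = "<app_config><app><name>collatz</name></app></app_config>"

print(is_well_formed(bad))   # False
print(is_well_formed(good))  # True
```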


And:
<cpu_usage>0.02</cpu_usage>

... will not have any impact on the CPU usage of the app.

This only informs BOINC; the app will use whatever it likes/needs (e.g. 90% CPU). You can't limit the real CPU usage of the app this way.

BOINC uses this value to see if it needs to free a core (from CPU tasks).
E.g. if <cpu_usage>0.4</cpu_usage> and ...
1 GPU app (task) runs, BOINC will free 0 cores (1 * 0.4 = 0.4 cores)
2 GPU apps (tasks) run, BOINC will free 0 cores (2 * 0.4 = 0.8 cores)
3 GPU apps (tasks) run, BOINC will free 1 core (3 * 0.4 = 1.2 cores)
4 GPU apps (tasks) run, BOINC will free 1 core (4 * 0.4 = 1.6 cores)
5 GPU apps (tasks) run, BOINC will free 2 cores (5 * 0.4 = 2.0 cores)
...
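
The table above follows a simple rule: multiply the number of running GPU tasks by cpu_usage and round down. A minimal sketch of that rule follows. The function name cores_freed is hypothetical, and this only reproduces the numbers in the table; it is not taken from BOINC's scheduler, which may weigh other things as well. Decimal is used so that 3 * 0.4 does not pick up floating-point noise.

```python
import math
from decimal import Decimal

def cores_freed(gpu_tasks: int, cpu_usage: str) -> int:
    """Whole CPU cores set aside, per the table above: floor(tasks * cpu_usage)."""
    return math.floor(gpu_tasks * Decimal(cpu_usage))

# Reproduce the table for <cpu_usage>0.4</cpu_usage>.
for n in range(1, 6):
    print(n, "GPU task(s):", cores_freed(n, "0.4"), "core(s) freed")
```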

In fact people use mainly this:
<gpu_usage>0.5</gpu_usage>

... to run 2 GPU tasks/apps on every GPU

For fast GPUs 2-3 tasks at a time usually is more effective but it depends on the project/app. I'm familiar only with SETI apps.
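
The relationship between gpu_usage and the number of tasks sharing one GPU can be sketched the same way: each task claims gpu_usage of a GPU, so roughly floor(1 / gpu_usage) tasks fit. The function name tasks_per_gpu is made up for illustration, and this is only the arithmetic implied by the posts in this thread, not BOINC's actual scheduling code.

```python
import math
from decimal import Decimal

def tasks_per_gpu(gpu_usage: str) -> int:
    """Concurrent tasks per GPU implied by a gpu_usage value: floor(1 / gpu_usage)."""
    return math.floor(Decimal(1) / Decimal(gpu_usage))

print(tasks_per_gpu("0.5"))   # 2
print(tasks_per_gpu("0.02"))  # 50
print(tasks_per_gpu("1.00"))  # 1
```

The 0.02 case is the "50 GPU apps per GPU" situation from the mistaken config earlier in the thread.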


____________



- ALF - "Find out what you don't do well ..... then don't do it!" :)




Copyright © 2024 Sebastien