Hours severely underestimated for short GPU tasks
Author | Message |
---|---|
Joined: 21 Jul 13 Posts: 69 Credit: 691,597 RAC: 0 |
I'm running the project "Private GFN Server" (which, despite its name, is not private), and this project has two extremely short applications: GFN-13 Prime Search and GFN-14 Prime Search. To keep my GPU fully utilised, I'm running 4 of these at a time. GFN-13 tasks take about 1 minute each (so an effective 15 seconds per task), and GFN-14 tasks take about 2 minutes each (an effective 30 seconds per task).
Looking at my daily reported hours, I'm seeing drastic underestimation for these, particularly GFN-13. If you look at my last day of work on that computer (image available here), you'll see that there are ~16.6 hours of data, based on the NCI tasks. Collatz Sieve is running on the integrated GPU, so you can ignore that. The only tasks running on the dedicated GPU are Asteroids@Home and GFN-13. Asteroids seems to be tracking correctly at 5.25 hours, so there are around 11.35 hours left over, all of which were presumably occupied by GFN-13.
In theory, GFN-13 should be credited 4 hours for every actual hour it runs, since I'm running 4 tasks at a time, which works out to roughly 45 hours. There's likely a fair amount of overhead in starting and running such short tasks, but that is nowhere near enough to turn ~45 hours into the 0.38 hours reported. I haven't paid as much attention to GFN-14, but it seems to have a similar issue, although a far less drastic one.
I don't really know what's happening here. Maybe WUProp is erroneously tracking CPU time; maybe it just can't cope with such short work units and misses most of them (which would explain GFN-14 tracking better); or maybe it's something else entirely. Either way, the GFN-13 hours are massively under-reported. These tasks run at 0.01 CPUs + 0.25 GPUs, so maybe it's recording the CPU time with a 0.01 "multithreaded" multiplier? 0.38 hours * 100 would be close to what I'd expect, with the remaining handful of hours easily explained by the start/stop overhead.
Looking at the global stats it certainly doesn't seem like the top users there are having that problem, though. I'm just confused at this point. |
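The arithmetic in the post above can be checked with a short sketch. The figures are the ones quoted in the post; the variable names are illustrative, and treating the 0.01 CPU share as a multiplier is only the poster's hypothesis, not confirmed WUProp behaviour.

```python
# Figures quoted in the post (hours over the last day of work).
total_hours = 16.6            # wall-clock hours covered, based on the NCI tasks
asteroids_hours = 5.25        # Asteroids@Home hours on the dedicated GPU
concurrent_tasks = 4          # GFN-13 tasks run side by side
reported_gfn13_hours = 0.38   # hours WUProp actually reported for GFN-13
cpu_share = 0.01              # CPU fraction each GFN task is configured with

# Wall-clock time left over for GFN-13 on the dedicated GPU.
gfn13_wall_hours = total_hours - asteroids_hours

# With 4 tasks in parallel, each wall-clock hour should count as 4 task hours.
expected_gfn13_hours = gfn13_wall_hours * concurrent_tasks

# Poster's hypothesis: reported hours were scaled down by the CPU share.
hypothetical_rescaled_hours = reported_gfn13_hours / cpu_share

print(f"GFN-13 wall-clock hours:     {gfn13_wall_hours:.2f}")
print(f"Expected GFN-13 task hours:  {expected_gfn13_hours:.2f}")
print(f"Reported hours / CPU share:  {hypothetical_rescaled_hours:.2f}")
```

The gap between the expected figure and the rescaled figure is the part the post attributes to start/stop overhead.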
Joined: 22 Aug 16 Posts: 448 Credit: 2,093,099 RAC: 693 |
I think I recall that WUProp captures the running tasks once a minute, so very short tasks can end up with too few hours. The same thing happens at other projects with short tasks. You can run more tasks at once than you strictly need for GPU utilisation, which extends each task's runtime. |
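To illustrate why once-a-minute sampling could under-count short tasks, here is a toy model. It is not WUProp's actual implementation: it assumes (hypothetically) that polls happen at fixed intervals and that a task is credited only with the elapsed time seen at the last poll during its run. The function names `credited_seconds` and `average_fraction` are made up for this sketch, and the model is not expected to reproduce the exact figures in this thread, only the trend that longer tasks are tracked better.

```python
import math


def credited_seconds(start, duration, poll_interval=60.0):
    """Elapsed time observed at the last poll that falls inside the task's run.

    Hypothetical model: polls occur at t = 0, poll_interval, 2 * poll_interval, ...
    and the tracker only knows the elapsed time it saw at the last poll
    taken while the task was still running.
    """
    end = start + duration
    # Index of the last poll strictly before the task ends.
    last_poll_index = math.ceil(end / poll_interval) - 1
    last_poll = last_poll_index * poll_interval
    if last_poll < start:
        return 0.0  # the task started and finished between two polls
    return last_poll - start


def average_fraction(duration, poll_interval=60.0, steps=600):
    """Average share of a task's runtime credited, over evenly spaced start offsets."""
    total = sum(
        credited_seconds(i * poll_interval / steps, duration, poll_interval)
        for i in range(steps)
    )
    return total / (steps * duration)


for seconds in (15, 60, 120):
    print(f"{seconds:>3} s task: {average_fraction(seconds):.0%} of runtime credited")
```

Under these assumptions, a task shorter than the poll interval is often never seen at all, while a task spanning several polls loses only its final partial interval.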
Joined: 21 Jul 13 Posts: 69 Credit: 691,597 RAC: 0 |
I doubled the number of running tasks from 4 to 8 (doubling their runtime to about 2 minutes), and it now seems to be tracking accurately. Thanks for the suggestion. |
©2024 Sébastien