Why has the project been down for multiple consecutive hours in last number of days?
Author | Message |
---|---|
Send message Joined: 29 Jul 11 Posts: 334 Credit: 1,240,251 RAC: 321 |
This was the 4th time the project was down in less than a week for multiple consecutive hours without any explanation. And now it's back online with all the projects I have worked on in the last day reporting negative hours. Did the project have another database problem, or did something else break? |
Send message Joined: 9 Feb 14 Posts: 26 Credit: 1,154,782 RAC: 96 |
My hours are not increasing but no negatives this time. Edit: Now incrementing. Better to return to 6 hour work units to reduce server load. Paul. |
Send message Joined: 20 Jun 12 Posts: 141 Credit: 342,004 RAC: 64 |
Seems to have been down for about 7 hours, and this time I have negative hours. I agree, better to return to the 6-hour WUs (or even better: let us choose, like Rosetta). That will not only reduce the server load; we will also get a better chance of making it through an outage without losing any hours. |
Send message Joined: 20 May 10 Posts: 552 Credit: 1,901,961 RAC: 792 |
"Seems to have been down for about 7 hours, and this time I have negative hours. I agree, better to return to the 6-hour WUs (or even better: let us choose, like Rosetta). That will not only reduce the server load; we will also get a better chance of making it through an outage without losing any hours." I like this idea, as it might make things a lot better for Android devices with the smaller tasks, while desktops, servers and laptops can easily handle the 6-hour tasks with a lot less bandwidth and wear and tear on the WUProp hardware. But what I'd really like to know is whether there is anything we users can do to help alleviate future unplanned outages, i.e. help with newer and bigger hard drives, more memory for the server or PC(s) it runs on, more bandwidth, etc. |
Send message Joined: 29 Jul 11 Posts: 334 Credit: 1,240,251 RAC: 321 |
Another 6 to 8 hour outage today. The multi-hour outages appear to come in waves, approx. two days apart in a week. Losing many hours on apps due to the extended outages. Agreed, time to go back to a 6-hour task across the board. *** EDIT TO ADD >>> AND THE NEGATIVE HOURS ARE BACK ALSO <<< *** |
Send message Joined: 20 May 10 Posts: 552 Credit: 1,901,961 RAC: 792 |
"Another 6 to 8 hour outage today. The multi-hour outages appear to come in waves, approx. two days apart in a week." Another thought I had was to treat the tasks the way other BOINC projects do and just send out several at a time. That way, if the server is down, we just move on to the next task, then the task after that, and so on. BOINC uses a first-in, first-out basis for its tasks, which can then be affected by return times, but instead of banging on the server we would just hold the completed tasks and not lose any time until the server is ready to take them all back again. |
Send message Joined: 20 Jun 12 Posts: 141 Credit: 342,004 RAC: 64 |
"BOINC uses a first-in, first-out basis for its tasks, which can then be affected by return times, but instead of banging on the server we would just hold the completed tasks and not lose any time until the server is ready to take them all back again." Won't it run them all at once, since they are NCI? Goofyxgrid also sent just one task per app, and they were all running concurrently. |
Send message Joined: 29 Jul 11 Posts: 334 Credit: 1,240,251 RAC: 321 |
"BOINC uses a first-in, first-out basis for its tasks, which can then be affected by return times, but instead of banging on the server we would just hold the completed tasks and not lose any time until the server is ready to take them all back again." More than likely. I don't think BOINC NCI works that way; it's an all-or-nothing run. Every NCI project I have attached to and completed work on has only sent one task per computer, except that the old GoofyGrid would send multiples at times, and occasionally I get multiple tasks running at once on iThena's main NCI project. |
Send message Joined: 29 Jul 11 Posts: 334 Credit: 1,240,251 RAC: 321 |
And we're back from another multi-hour outage, along with negative hours, for the third day now... |
Send message Joined: 9 May 13 Posts: 98 Credit: 762,755 RAC: 280 |
I am happy that we have this project even if it is occasionally intermittent. Complaining each and every time that the server hiccups doesn't do anyone any good. Sebastien needs sleep just like the rest of us, cut him some slack. |
Send message Joined: 20 May 10 Posts: 552 Credit: 1,901,961 RAC: 792 |
"I am happy that we have this project even if it is occasionally intermittent. Complaining each and every time that the server hiccups doesn't do anyone any good. Sebastien needs sleep just like the rest of us, cut him some slack." I don't think it's the complaining so much as that there's been no explanation of why, or of whether any of us can help solve the problem. I know Sebastien got some helpers a while back when the project was on the verge of collapsing; does he need more? Does he need some hardware that keeps failing? Is it a software problem? Is it an ISP problem? In short, people are trying to figure out if they can help, but with no word coming from the team, it's kind of hard. |
Send message Joined: 20 Jun 12 Posts: 141 Credit: 342,004 RAC: 64 |
"Complaining each and every time that the server hiccups doesn't do anyone any good." We are not complaining, at least I'm not, just reporting an issue, which he might not even notice otherwise if it always fixes itself after a couple of hours. I don't see anything wrong with reporting bugs or other issues to the admin/developer. In most cases that's how admins/devs get to know there's something wrong with their servers/software/whatever at all; they might not notice the issue from their end without the reports. |
Send message Joined: 28 Jan 13 Posts: 40 Credit: 1,408,392 RAC: 551 |
Getting a couple of projects reporting negative hours myself. |
Send message Joined: 20 May 10 Posts: 552 Credit: 1,901,961 RAC: 792 |
"Getting a couple of projects reporting negative hours myself." I just got a huge update on some of my projects!! |
Send message Joined: 28 Jan 13 Posts: 40 Credit: 1,408,392 RAC: 551 |
Same, Mikey |
Send message Joined: 29 Jul 11 Posts: 334 Credit: 1,240,251 RAC: 321 |
Same here. It appears the "extra" hours are counting about 2.5 to 3 calendar days of back reporting of work based on my last 24 hours per device page. |
Send message Joined: 8 Oct 12 Posts: 33 Credit: 1,900,432 RAC: 830 |
Now it is "Server error: feeder not running" EDIT: and back online again |
Send message Joined: 20 Jun 12 Posts: 141 Credit: 342,004 RAC: 64 |
"Agreed, time to go back to a 6-hour task across the board." If the issue is server load from too many clients contacting the scheduler, then perhaps increasing <next_rpc_delay> might help at least a bit. Currently I see this stupid behavior for each WU: 29/04/2023 12:38:19 | WUProp@Home | Sending scheduler request: Requested by project. The first request is completely unnecessary, and without <report_results_immediately/> in app_config.xml it slows down getting a new WU by a few seconds. To avoid this while still keeping the forced scheduler requests, <next_rpc_delay> should be increased from the current 3600 to 3700 seconds. |
Send message Joined: 29 Jul 11 Posts: 334 Credit: 1,240,251 RAC: 321 |
I see we had two "short" outages in about 48 hours. The WUProp database is back up, along with the usual NEGATIVE HOURS. |
Send message Joined: 13 Dec 15 Posts: 174 Credit: 2,269,994 RAC: 304 |
I'm just glad the project is still with us. Is there a Patreon donation link? |
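
The <next_rpc_delay> suggestion made earlier in the thread can be sanity-checked with some rough arithmetic. The sketch below is only an illustration, not BOINC code. It assumes a client makes one scheduler contact whenever the project-requested delay expires and one whenever a task finishes, and that any contact restarts the delay timer. The task length of 3610 seconds is a hypothetical value chosen for the example; the thread does not state the real one, and the function name is made up.

```python
def scheduler_contacts(task_seconds: int, next_rpc_delay: int, n_tasks: int) -> int:
    """Count scheduler contacts for n_tasks run back to back (simplified model)."""
    if task_seconds <= 0 or next_rpc_delay <= 0 or n_tasks < 0:
        raise ValueError("task_seconds and next_rpc_delay must be positive, n_tasks non-negative")

    contacts = 0
    for _ in range(n_tasks):
        # Assumption: the previous contact happened when the last task
        # finished, so the delay timer starts from zero for every task.
        remaining = task_seconds
        # Forced contacts that fire before the task is done.
        while next_rpc_delay < remaining:
            contacts += 1
            remaining -= next_rpc_delay
        # One contact when the task finishes (report result, fetch new work).
        contacts += 1
    return contacts


if __name__ == "__main__":
    # Hypothetical task length: just over one hour.
    print(scheduler_contacts(3610, 3600, 10))  # 20: forced request plus report per task
    print(scheduler_contacts(3610, 3700, 10))  # 10: the report alone resets the timer
```

Under these assumptions, a delay slightly longer than the task length halves the number of contacts per task, which is the effect the 3600 to 3700 proposal is after. Whether the real server load behaves this way depends on details the thread does not cover.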
©2024 Sébastien