RE: ticket #2425
Can be reproduced with very busy server --> load average: 5.42, 7.35, 12.84.
I don't know exactly what the server is, but your load average seems to explain your I/O problems. If several processes are waiting for I/O, that can explain the problem when apache2 is restarted. The diagnosis is quite simple here: you should improve the server's performance. ispCP should not have to take care of busy servers; that is the administrator's job. ispCP should be used in a normal execution context.
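As a rough illustration of how the load average above can be read, here is a minimal sketch. It assumes the usual rule of thumb that a load average above the number of CPU cores means tasks are queuing (on Linux, this count also includes processes blocked on disk I/O). The function name `looks_overloaded`, the threshold of 1.0 per core, and the core counts are assumptions for illustration; the ticket does not say how many cores the server has.

```python
import os


def looks_overloaded(load_averages, cpu_count, threshold=1.0):
    """Return True if any load average exceeds `threshold` per CPU core.

    `load_averages` is the (1 min, 5 min, 15 min) triple, as printed by
    `uptime` or returned by os.getloadavg(). The threshold is a rule of
    thumb, not a hard limit.
    """
    return any(value / cpu_count > threshold for value in load_averages)


def current_host_overloaded():
    """Apply the same check to the machine this script runs on (Unix only)."""
    return looks_overloaded(os.getloadavg(), os.cpu_count() or 1)
```

With the values from the ticket, `looks_overloaded((5.42, 7.35, 12.84), 4)` reports an overloaded machine for a hypothetical 4-core server, while the same numbers on a hypothetical 16-core server would not.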
Imagine the following:
I have a server with the following hardware:
CPU: Intel Celeron 1.2 GHz
RAM: 1 GB
On this server, I use ispCP. After some days, I have 200 domain names hosted on it. Sure enough, I will encounter a problem like yours. Is it ispCP's fault? No, it is simply because the server hardware is too small to handle the workload. So, what is my solution? Ask the ispCP team to add a sleep statement of one or more seconds before restarting Apache, to be sure that the work has finished on my poor server?
Surely, if someone asked this question, the team would answer: your server is too small to handle the work for 200 domains.
I had not thought about this before reading your load average. Now it is clearer to me.
I repeat, it is the admin's job to take care of server performance. In your case, you have two solutions:
1. Get a new server and move some domains from the current one to the new one
or
2. Hack your panel yourself by adding a sleep statement (which will result in slower processing).
If you want, maybe we can schedule an IRC session to talk about this.
@Kilburn
An fsync() call will have the same effect as the sleep statement. Depending on the load average, it is better than the sleep statement because it does not return until the data has been written for each file. In a normal execution context, the data will be written faster. On a busy server, the work will be slower. Other tasks will not be processed before fsync() returns 0.
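To make the comparison with a fixed sleep concrete, here is a minimal sketch of the fsync() idea. It is not ispCP code: the helper names `write_and_sync` and `write_files_then_restart`, and the `restart` callback standing in for the Apache restart, are hypothetical. Each file is flushed and synced before the restart step runs, so the wait lasts only as long as the disk needs instead of a fixed number of seconds.

```python
import os


def write_and_sync(path, content):
    """Write `content` to `path` and return only after fsync() succeeds."""
    with open(path, "w") as handle:
        handle.write(content)
        handle.flush()               # move Python's buffer into the kernel
        os.fsync(handle.fileno())    # block until the kernel has flushed the file


def write_files_then_restart(files, restart):
    """Sync every file in `files` (path -> content), then call `restart`.

    `restart` is a hypothetical callback standing in for whatever
    restarts the web server.
    """
    for path, content in files.items():
        write_and_sync(path, content)
    restart()
```

On an idle server the fsync() calls return almost immediately; on a busy one they block for as long as the flush takes, which is the trade-off described above.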
The race condition is not due to ispCP.
A file system like ext3 can wait 1 to 5 seconds before really writing the data to disk. With ext4, that can be longer (1 minute). Imagine that the data is scheduled to be written, so the data in the cache should be written to the file system. But imagine that during the write attempt, due to the load average, the flush process waits for I/O (here is the race condition), and then another process like ispcp-serv-mngr restarts Apache... It is hard to explain with my poor English, but I'm pretty sure that is the problem.
Now, imagine that with multiple processes waiting for I/O on a very busy server...
Note: This answer was written before Marc deleted their answer.