
sci2tech · 08-19-2010 03:29 PM

For some reason I can't use Trac, so I'll continue here.

From the sync manual:

Quote:Notes
On Linux, sync is only guaranteed to schedule the dirty blocks for writing; it can actually take a short time before all the blocks are finally written. The reboot(8) and halt(8) commands take this into account by sleeping for a few seconds after calling sync(2).

So sync will not solve the problem; we still need to sleep :)

sci2tech · 08-23-2010 06:02 AM

Since my Trac account does not work, and since kilburn does not seem to believe my report, I managed to reproduce it:

Quote:[Sun Aug 22 22:21:31 2010] [notice] mod_fcgid: process /var/www/ispcp/gui/index.php(31670) exit(shutting down), terminated by calling exit(), return code: 0
[Sun Aug 22 22:21:34 2010] [emerg] mod_fcgid: server is restarted, 1308 must exit
[Sun Aug 22 22:21:34 2010] [emerg] (22)Invalid argument: mod_fcgid: can't get lock, pid: 1308
apache2: Syntax error on line 281 of /etc/apache2/apache2.conf: Syntax error on line 25980 of /etc/apache2/sites-enabled/ispcp.conf: /etc/apache2/sites-enabled/ispcp.conf:25980: <VirtualHost> was not closed.

Quote:22.08.2010 22:19 ...: add domain: ...
22.08.2010 22:19 ...: add user: ... (for domain ...)
22.08.2010 22:19 ...: Auto Add User To: ..., From: ..., Status: |OK|!
22.08.2010 22:19 ...: deletes customer order.
22.08.2010 22:19 ...: deletes customer order.
22.08.2010 22:17 ...: add user: ... (for domain ...)
22.08.2010 22:17 ...: Auto Add User To: ..., From: ..., Status: |OK|!
22.08.2010 22:17 ...: add domain: ...
22.08.2010 22:17 ...: add user: ... (for domain ...)
22.08.2010 22:17 ...: add domain: ...

And

Quote:/etc/init.d/apache2 restart
httpd (pid 23071?) not running
.

Quote:uptime:
22:40:31 up 10 days, 4:16, 1 user, load average: 5.42, 7.35, 12.84

Quote:Account name ...
Admin users 2
Reseller users 3
Normal users 478
Domains 479
Subdomains 93
Domain aliases 0
Mail accounts 206/267
FTP accounts 379
SQL databases 353
SQL users 319

Maybe this will help you.

08-23-2010 06:31 AM

Read the related ticket! Your server seems very busy. That's the problem.

Quote:Account name ...
Admin users 2
Reseller users 3
Normal users 478
Domains 479
Subdomains 93
Domain aliases 0
Mail accounts 206/267
FTP accounts 379
SQL databases 353
SQL users 319

Hmm, what exactly is your server? 479 domains and 93 subdomains :(

sci2tech · 08-23-2010 06:37 AM

The server is not mine, and I did read the ticket. I just wanted to show that it can be reproduced. Good luck.

08-23-2010 07:35 AM

It can be reproduced on a very busy server --> load average: 5.42, 7.35, 12.84.

I don't know exactly what the server is, but your load average seems to explain your I/O problems. If several processes are waiting for I/O, that can explain the problem when apache2 is restarted. The diagnosis is quite simple here: you should improve the server's performance. ispCP should not have to account for overloaded servers; that is the administrator's job. ispCP should be used in a normal execution context.
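To put a load average in context, it helps to divide it by the number of CPU cores. A minimal Python sketch, assuming the `uptime` output format quoted earlier in the thread; the function name `load_per_core` and the core count passed in are illustrative assumptions, not anything taken from ispCP:

```python
import re

def load_per_core(uptime_line, cores):
    """Parse the 1/5/15-minute load averages from an `uptime` line
    and divide each by the number of CPU cores (caller-supplied)."""
    match = re.search(
        r"load average:\s*([\d.]+),\s*([\d.]+),\s*([\d.]+)", uptime_line
    )
    if match is None:
        raise ValueError("no load average found in line")
    return tuple(float(value) / cores for value in match.groups())
```

A value well above 1.0 per core over the 15-minute window suggests that processes are regularly queuing for CPU or I/O.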

Imagine the following:

I have a server with the following hardware:

CPU: Intel Celeron 1.2 GHz
1 GB RAM

On this server, I use ispCP. After some days, I have 200 domain names hosted on it. Sure enough, I'll run into a problem like yours. Is it ispCP's fault? No, it's simply because the hardware is too small to handle the workload. So, what is my solution? Ask the ispCP team to add a sleep statement of one or more seconds before restarting Apache, to be sure that the work has finished on my poor server?

Sure, if someone asks this question, the team will surely answer the following: your server is too small to handle the workload for 200 domains.

I had not thought about this before reading your load average. Now it's clearer to me.

I repeat, it's the admin's job to take care of server performance. In your case, you have two solutions:

1. Get a new server and move some domains from the current one to the new one
or
2. Hack your panel yourself by adding a sleep statement (which will make everything run slower).

If you want, maybe we can schedule an IRC session to talk about this.

@Kilburn

An fsync() call will have the same effect as the sleep statement. Depending on the load average, it's better than the sleep statement because it does not return before the data has been written for each file. In a normal execution context, the data will be written faster. On a busy server, the work will be slower. The script will not move on to other tasks until fsync() returns 0.
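For illustration, here is a minimal Python sketch of what such an fsync() call looks like. ispCP's own scripts are not written in Python, and the helper name `write_and_fsync` is an assumption made up for this example:

```python
import os

def write_and_fsync(path, text):
    """Write text to path and block until the kernel reports
    that the file's data has been flushed to the storage device."""
    with open(path, "w") as fh:
        fh.write(text)
        fh.flush()             # push Python's userspace buffer to the kernel
        os.fsync(fh.fileno())  # block until the kernel has flushed this file
```

Unlike a fixed sleep, the call returns as soon as the flush completes, so it costs little on an idle server and waits as long as needed on a busy one.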

The race condition is not due to ispCP.

A file system like ext3 can wait 1 to 5 seconds before actually writing the data to disk. With ext4, it can be longer (1 minute). Imagine that the data is scheduled to be written, so the data in the cache should be written to the file system. But imagine that during the write attempt, due to the load average, the flush process waits for I/O (this is the race condition), and then another process like ispcp-serv-mngr restarts Apache... It's hard to explain with my poor English, but I'm pretty sure that is the problem.

Now, imagine that with multiple processes waiting for I/O on a very busy server...

Note: this answer was written before Marc deleted his answer.

**kilburn** · 08-23-2010 08:43 AM

What I'm saying is that this should not be a cache coherency problem (as you describe with the wait until data is written to disk), but a race condition when reading/writing the files.

Anyway, this is easy to check. If it gets solved by an "fsync()" call, then you were probably right and I was wrong. Otherwise, try to introduce a lock to protect reading/editing /etc/apache2/sites-available/ispcp.conf, and if that fixes the issue, I was right :P
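A lock of that kind could look roughly like the following Python sketch. It uses an advisory flock() on a separate lock file; the names `locked` and `append_vhost` and the idea of a dedicated lock file are assumptions for illustration, not how ispCP actually does it:

```python
import fcntl
from contextlib import contextmanager

@contextmanager
def locked(lock_path):
    """Hold an exclusive advisory lock on lock_path for the duration of the block."""
    with open(lock_path, "w") as lock_fh:
        fcntl.flock(lock_fh, fcntl.LOCK_EX)   # blocks until the lock is free
        try:
            yield
        finally:
            fcntl.flock(lock_fh, fcntl.LOCK_UN)

def append_vhost(conf_path, lock_path, vhost_block):
    """Append one vhost block to the config while holding the lock."""
    with locked(lock_path):
        with open(conf_path, "a") as fh:
            fh.write(vhost_block)
```

The lock is advisory: it only helps if every script that edits the file, and the step that restarts Apache, takes the same lock. Apache itself will not honor it.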

08-23-2010 08:48 AM

@Marc

Normally, we already have a lock file on the master.

@Sci2tech:

Does the problem also occur when you run ispcp-update?

(08-23-2010 08:43 AM)kilburn Wrote: What I'm saying is that this should not be a cache coherency problem (as you describe with the wait until data is written to disk), but a race condition when reading/writing the files.

Hehe, imagine an I/O operation delayed by the kernel because the server is busy, which happens at the same moment that Apache reads the file. I don't know exactly how the synchronization is done (atomic or not)...
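On the "atomic or not" question, one common pattern (not necessarily what ispCP does) is to write the new config to a temporary file in the same directory and then rename it over the old one. On POSIX file systems the rename is atomic, so a reader sees either the complete old file or the complete new one. A minimal Python sketch, with `atomic_write` as a hypothetical helper name:

```python
import os
import tempfile

def atomic_write(path, text):
    """Replace path with text so that readers never see a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live on the same file system as the target.
    # Note: mkstemp creates it with mode 0600; adjust permissions if needed.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as fh:
            fh.write(text)
            fh.flush()
            os.fsync(fh.fileno())    # data is on disk before the rename
        os.replace(tmp_path, path)   # atomic rename over the old file
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)      # do not leave a stray temp file behind
        raise
```

With this pattern an unclosed `<VirtualHost>` caused by a partially written file should not be observable, although it does nothing about two writers overwriting each other's changes.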

***RatS*** · 08-23-2010 04:19 PM

The ultimate solution will be one config file per domain. In that case it won't (or will hardly) happen that the same file is read twice before a modification is written. But this is a heavy task.
If I see it correctly, we cannot use flock() for file locking, because the issue is file-system based: the cached data may not have been written yet even though the ispcp script has already ended and released the lock on the file.
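The one-file-per-domain idea could be sketched as follows in Python. The directory layout and the helper name `write_domain_vhost` are assumptions for illustration; the main config would then pull the files in with Apache's `Include` directive, which accepts a directory or a wildcard:

```python
import os

def write_domain_vhost(conf_dir, domain, vhost_block):
    """Write one domain's vhost block to its own file, <conf_dir>/<domain>.conf."""
    if "/" in domain or domain.startswith("."):
        raise ValueError("unsafe domain name: %r" % domain)
    path = os.path.join(conf_dir, domain + ".conf")
    with open(path, "w") as fh:
        fh.write(vhost_block)
    return path
```

Adding or changing one domain then touches only that domain's file, so two scripts working on different domains never edit the same file.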

08-23-2010 04:27 PM

Yeah Benedikt, I think that you're right. Another solution is fsync(), to force the data to be written. Now, Marc says that normally, even if the data has not been written, it is served from the cache... That's not my opinion, but still.

Marc and I plan to run some stress tests tonight on a busy server.

sci2tech · 08-24-2010 04:48 AM

That server is a quad-core Xeon CPU at 2.4 GHz with 2560 MiB RAM, not a VPS, nor a lousy Celeron! The same server successfully managed 1400 domains and more than 100 subdomains with ispCP v1.0.0. The same problem happened with a

Quote:load average: 0.90, 0.80, 0.72

.
The update does not cause the same problem.
So I am starting to suspect a race condition, as Kilburn said.
RatS, Apache (at least on Debian) has a hard-coded limit of 1024 simultaneously open files, so using one file per domain will limit the maximum number of domains that can be added. We would end up with the same problem that ispCP had with logs (by the way, ispcp-apache-logger was coded by me, so there is no need to add moleSoftware as author, and since the code is based on vlogger, which is GPL, it cannot be released as MPL).
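The per-process open-file limit can be checked with a few lines of Python. This is only a sketch: it reports the limit of the process that runs it, not of Apache, and it does not show which files Apache actually keeps open. The function name `open_file_limits` is made up for this example:

```python
import resource

def open_file_limits():
    """Return the (soft, hard) limit on open file descriptors for this process."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft, hard
```

Comparing the soft limit with the number of domains gives a rough idea of how close a one-file-per-domain layout would come to it, assuming every file really is held open.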

ticket #2425