Bug #58404 apc.write_lock not working?
Submitted: 2008-11-03 12:08 UTC Modified: 2008-11-08 21:31 UTC
From: oliver at realtsp dot com Assigned:
Status: Closed Package: APC (PECL)
PHP Version: 5.2.5 OS: FreeBSD 7.0
Private report: No CVE-ID: None
 [2008-11-03 12:08 UTC] oliver at realtsp dot com
Description:
------------
We have a situation where APC is getting "slammed" on
startup. We are using FastCGI with the latest stable APC.

# php-cgi -v
PHP 5.2.6 (cgi-fcgi) (built: Sep 11 2008 14:35:00)

We are loading about 350 class files per request. We can
reliably reproduce the "slam" if we empty APC with zero load
and then send 10 concurrent users. 

We get lots of "semaphore waits" and not much action
otherwise.

56074 www         1  -4    0  1112M 45164K semwai 0   0:08 39.36% php-cgi
56071 www         1  -4    0  1111M 44344K semwai 1   0:07 39.45% php-cgi
56066 www         1  -4    0  1111M 44796K semwai 1   0:08 40.19% php-cgi
56069 www         1  -4    0  1111M 44464K semwai 2   0:08 39.79% php-cgi
56070 www         1  -4    0  1111M 44496K semwai 4   0:07 38.96% php-cgi
56073 www         1  -4    0  1110M 44020K semwai 5   0:08 39.70% php-cgi
56067 www         1  -4    0  1111M 44752K semwai 6   0:08 38.87% php-cgi
56072 www         1  -4    0  1114M 47284K semwai 7   0:07 39.16% php-cgi
56075 www         1  -4    0  1111M 44652K semwai 7   0:07 40.19% php-cgi

With 10 concurrent users APC eventually recovers with
terribly fragmented memory. With 40 concurrent users it
never recovers, becomes completely locked out even when the
load is removed.

As you can see from the STATE column above we have compiled
APC with semaphores using the standard FreeBSD Ports
makefile config option, which adds --enable-apc-sem. We did
this because the default fcntl locks became a bottleneck,
lots of "flock" shown on "top" display. Semaphores are much
faster and solved this problem. However we are reasonably
sure this behaviour will also occur on fcntl locks.

It seems like a classic "startup cache slam", that
apc.write_lock is designed to prevent. However write_lock is
already disables as is the default.

From studying the source code it appears that respecting
write_lock=1 is dependent upon 

#if NONBLOCKING_LOCK_AVAILABLE

and NONBLOCKING_LOCK_AVAILABLE is only =1 for

APC_PTHREADMUTEX_LOCKS and
APC_FUTEX_LOCKS

ie not for 
APC_SEM_LOCKS

Does this mean that apc.write_lock cannot work while using
semaphores? That would certainly explain why the cache is
getting slammed.
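
The reading of the source above can be sketched as follows. This is a minimal illustration, not APC's actual code: the helper name write_lock_effective is hypothetical, and the macro logic only mirrors the description in this report (the non-blocking lock is available for pthread mutex and futex locks only).

```c
/* Hypothetical stand-in for APC's build-time lock selection, as described
 * in this report: only pthread mutex and futex builds get a non-blocking lock. */
#if defined(APC_PTHREADMUTEX_LOCKS) || defined(APC_FUTEX_LOCKS)
# define NONBLOCKING_LOCK_AVAILABLE 1
#else
# define NONBLOCKING_LOCK_AVAILABLE 0
#endif

/* Returns 1 if the apc.write_lock setting can take effect in this build,
 * 0 if the setting is silently ignored. (Illustrative helper, not an APC API.) */
int write_lock_effective(int write_lock_setting)
{
#if NONBLOCKING_LOCK_AVAILABLE
    return write_lock_setting != 0;
#else
    (void)write_lock_setting;   /* setting has no effect without a try-lock */
    return 0;
#endif
}
```

Built without either macro defined (as in a semaphore build), the helper returns 0 even when the setting is 1, which is the situation suspected here.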

To test this theory we tried using the deprecated apc.slam-
from caching any files at all. Can't understand why it won't
work since the code for it still seems to in there...unless
rand() is not working??



Reproduce code:
---------------
Hit the cache with 10 concurrent users, each requiring 400 files.


Expected result:
----------------
cache does not get slammed

Actual result:
--------------
cache gets slammed

 [2008-11-03 12:11 UTC] oliver at realtsp dot com
sorry typo..

"However write_lock is
already disables as is the default."

should read

"However write_lock is
enabled as per default."
 [2008-11-03 12:13 UTC] oliver at realtsp dot com
Another typo:

this:

To test this theory we tried using the deprecated apc.slam-
from caching any files at all. Can't understand why it won't
work since the code for it still seems to in there...unless
rand() is not working??

should read

To test this theory we tried using the deprecated
apc.slam_defense. However when setting this to any value
greater than 1 prevents apc from caching any files at all.
Can't understand why it won't work since the code for it
still seems to in there...unless rand() is not working??
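
For comparison with the behaviour described above: apc.slam_defense is documented as the percentage chance that a process skips caching an uncached file. A minimal sketch of that idea follows. It is an assumption about the intended semantics, not APC's actual code, and the function name slam_defense_should_skip is hypothetical.

```c
#include <stdlib.h>

/* Returns 1 if this process should skip caching an uncached file, 0 otherwise.
 * `percent` is the configured skip probability, expected in the range 0..100.
 * (Illustrative helper, not an APC API.) */
int slam_defense_should_skip(int percent)
{
    if (percent <= 0)   return 0;   /* defense disabled: always try to cache */
    if (percent >= 100) return 1;   /* always skip */

    /* Scale rand() into 0..99 and compare against the threshold. */
    int roll = (int)(100.0 * rand() / (RAND_MAX + 1.0));
    return roll < percent;
}
```

With a check like this, a setting of 2 should skip only about 2% of the time, so "no files cached at all" would point to a problem elsewhere.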
 [2008-11-03 12:14 UTC] oliver at realtsp dot com
changing summary
 [2008-11-03 12:15 UTC] rasmus@php.net
Did I read that right?  "We are loading about 350 class files per request."

Of course, APC should work in this case, but that just seems nuts to me.  Around here if I see more than 10 includes on a request I take a serious look at refactoring and simplifying things.
 [2008-11-03 13:19 UTC] oliver at realtsp dot com
Yes you did. 

This is a 200-table e-commerce platform with a lot of dynamic personalisation features. We use Propel for ORM, which admittedly increases the class file count a lot: 4-5 generated classes per table (not all are used on every request, of course).

In general we have a policy of "make developer time productive by focusing them on the commercial issues, not on creating hard to maintain, super fast code". Significant time is spent on performance as well, but that is mostly db optimisation and app level caching. Our average request execution times are ~200ms and only 20-30ms of that is APC class load time. We recently switched to dual-quad-core AMD Opteron machines for the FastCGI layer, and with 8 cores the APC concurrency became the bottleneck for us. Switching to semaphores from fcntl locks solved the problem in steady state, but at startup we have the above problem.

I guess there are more and more projects like ours which are becoming "enterprise scale" and stretching php. However this is good news no? Is that not what you want to happen to your baby Rasmus? ;-)
 [2008-11-03 15:47 UTC] shire@php.net
Is there a reason you haven't tried using the Pthread Mutex locking?  (Spin locks are also available but it's possible you could run into some signal handling problems with Zend timeouts, still considered experimental).
 [2008-11-03 17:13 UTC] oliver at realtsp dot com
when compiling with: --enable-apc-pthreadmutex

i get

checking Checking whether we should use pthread mutex
locking... yes
Unable to set PTHREAD_PROCESS_SHARED
(pthread_mutexattr_setpshared), your system may not support
shared mutex's.
configure: WARNING: It doesn't appear that pthread mutex's
are supported on your system

i suspect that i would need to install the FreeBSD port
"linuxthreads" but this is only for i386:

root@torbay# cd /usr/ports/devel/linuxthreads
root@torbay# make install
===>  linuxthreads-2.2.3_23 is only for i386, while you are
running amd64.
*** Error code 1

Stop in /usr/ports/devel/linuxthreads.
 [2008-11-03 20:12 UTC] shire@php.net
I see, sorry I didn't realize BSD didn't support that option on pthreads.  It seems *if* the IPC_NOWAIT works correctly we can support the nonblocking lock here.  I've made the following quick patch, if you're building from source perhaps you could test it out to see if it performs better under your environment?

http://tekrat.com/downloads/bits/apc_sem_nonblocking.patch
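
A minimal sketch of the idea, assuming a single System V semaphore used as a lock: passing IPC_NOWAIT to semop() makes the acquire fail with EAGAIN instead of sleeping when the lock is held. The function names and the union name are hypothetical and this is not the contents of the patch.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* Some systems (e.g. Linux) require the caller to define the semctl()
 * argument union; a private name avoids clashing with systems that
 * already define `union semun`. */
union sketch_semun {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

/* Create one semaphore initialised to 1 (unlocked). Returns its id or -1. */
int sem_lock_create(void)
{
    int id = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);
    if (id < 0) return -1;

    union sketch_semun arg;
    arg.val = 1;
    if (semctl(id, 0, SETVAL, arg) < 0) {
        semctl(id, 0, IPC_RMID);    /* do not leak the set on failure */
        return -1;
    }
    return id;
}

/* Non-blocking acquire: 1 = lock obtained, 0 = lock busy, -1 = error. */
int sem_try_lock(int id)
{
    struct sembuf op = { .sem_num = 0, .sem_op = -1,
                         .sem_flg = SEM_UNDO | IPC_NOWAIT };
    if (semop(id, &op, 1) == 0) return 1;
    return (errno == EAGAIN) ? 0 : -1;
}

/* Release the lock. Returns 0 on success, -1 on error. */
int sem_unlock(int id)
{
    struct sembuf op = { .sem_num = 0, .sem_op = 1, .sem_flg = SEM_UNDO };
    return semop(id, &op, 1);
}
```

A caller that gets 0 back can simply skip caching the file for this request instead of queueing behind the process that is already compiling it.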
 [2008-11-04 04:13 UTC] oliver at realtsp dot com
Brilliant!

Took a little bit of fiddling to get apc to build correctly
from within the FreeBSD port. By converting your patch to
use the standard port format and making it part of the port
build process it seems to (!!) work like a charm. Even with
40 concurrent users hitting an empty cache apc does not get
"slammed/locked" at all.

Obviously right now I am getting lots of these.

[Tue Nov  4 09:06:45 2008] [apc-warning] nonblocking lock: 1

It seems one for each file which is cached, which looks
right from the code? (I am not a C programmer)

Is there any tidying up that needs doing here? 

Do you want to commit this change to APC or shall I get the
FreeBSD port maintainer for APC to add the final patch to
the port?

Thanks a lot.
 [2008-11-04 04:14 UTC] gopalv82 at yahoo dot com
FastCGI spawns multiple independent processes.

I still haven't got any sane way to share locks between independent processes, portably.

Every php-fcgi spawned has its own cache.
 [2008-11-04 04:29 UTC] oliver at realtsp dot com
Hi Gopal

Not sure what you are saying here. We have been using APC
with fastcgi for a year now, first using fcntl and now sysV
IPC semaphores. It works well and does *not* spawn a new
cache for every fastcgi child process (ie the cache seems to
be attached to the fastcgi parent process which is similar
to an apache pre-fork situation).

One small gripe we have with the IPC sysV semaphore
implementation is that the semaphores are not
"removed/deallocated" when APC exits and because there are a
limited number of semaphores available in the FreeBSD kernel
(adjustable but limited anyway), if you restart fastcgi/APC
enough times you get:

[Mon Nov  3 23:04:58 2008] [apc-error] apc_sem_unlock:
semop(3080192) failed: No space left on device

right now we are getting around this with some bash
scripting on restart:

/usr/local/etc/rc.d/fastcgi-php.sh stop

# clean up the semaphores
for id in `ipcs | grep www | awk '{print $2}'`
do
  ipcrm -s $id
done

/usr/local/etc/rc.d/fastcgi-php.sh start


Is there any way this can be cleaned up from within APC,
during "cache shutdown"?

Oliver
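
For illustration, in-process cleanup could look roughly like the sketch below: the process that created the semaphore set removes it with semctl(..., IPC_RMID) at shutdown. The function names and the owner-pid check are assumptions made for this example and do not describe how APC is structured.

```c
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>
#include <unistd.h>

static int   g_sem_id    = -1;   /* id of the semaphore set, -1 if none */
static pid_t g_owner_pid = -1;   /* pid of the process that created it  */

/* Create the semaphore set and remember which process owns it.
 * Returns the set id, or -1 on failure. */
int sem_set_create(void)
{
    g_sem_id = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);
    if (g_sem_id >= 0) g_owner_pid = getpid();
    return g_sem_id;
}

/* Remove the set, but only from the creating process, so that forked
 * children exiting do not pull the lock out from under their siblings.
 * Returns 0 on success or when there is nothing to do, -1 on error. */
int sem_set_destroy(void)
{
    if (g_sem_id < 0 || getpid() != g_owner_pid) return 0;

    int rc = semctl(g_sem_id, 0, IPC_RMID);
    g_sem_id = -1;
    return rc;
}
```

Calling sem_set_destroy() from the parent's shutdown path would make the ipcrm loop above unnecessary for clean restarts; a crashed parent would still leave the set behind.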
 [2008-11-04 11:54 UTC] oliver at realtsp dot com
I am moving the semaphore cleanup issue to a new bug after
some investigation, as it is really unrelated.

new bug for this issue is

http://pecl.php.net/bugs/bug.php?id=14957
 [2008-11-07 09:58 UTC] oliver at realtsp dot com
Hi Shire

We have now tested your patch more rigorously and it is
performing very well even on production systems. I have
changed it slightly to only print lock status 

#ifdef __DEBUG_APC__

and I have made the patch compatible with the FreeBSD ports
system to make it part of our standard build.

When removing the apc_wprint with the compiler conditional
it became necessary to "return 1;" explicitly in the case
where the lock was obtained. This confused us a bit as the
presence of apc_wprint seemed to make it work correctly
before. However explicitly returning 1 for an "int" defined
function is more correct anyway no?
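
On the explicit return: in C, if control reaches the end of a non-void function without a return statement and the caller uses the result, the behaviour is undefined, so the earlier version may have appeared to work only by accident. A minimal sketch of the corrected shape follows. The function name and the lock_is_free flag are hypothetical stand-ins; only __DEBUG_APC__ and the message text come from this thread.

```c
#include <stdio.h>

/* Hypothetical stand-in for the real semaphore state. */
static int lock_is_free = 1;

/* Try to take the lock without blocking.
 * Returns 1 if the lock was obtained, 0 if it was already held.
 * Every path returns a value explicitly. */
int nonblocking_lock_sketch(void)
{
    if (!lock_is_free) {
        return 0;
    }
    lock_is_free = 0;

#ifdef __DEBUG_APC__
    /* Diagnostic output only in debug builds. */
    fprintf(stderr, "nonblocking lock: 1\n");
#endif

    return 1;   /* must not depend on whether the debug print is compiled in */
}
```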

Here is the revised patch:

http://www.realtsp.com/download/patch-apc_sem_non_blocking.c

Thanks again, and let me know whether you want to commit
this to APC or whether I try to get the port maintainer to
add it to the FreeBSD APC port.

Oliver
 [2008-11-08 21:31 UTC] shire@php.net
This bug has been fixed in CVS.

In case this was a documentation problem, the fix will show up at the
end of next Sunday (CET) on pecl.php.net.

In case this was a pecl.php.net website problem, the change will show
up on the website in short time.
 
Thank you for the report, and for helping us make PECL better.

I've committed a patch with corrections.  Let us know if you run into further issues.
 