One flew over the allocator's nest
Sometimes even an experienced guide through the land of Solaris runs into a trap, even when the guide is aware of that trap.
I have a standard tactic when I'm trying to diagnose performance problems: at first … prstat -mL to get an overview of what the threads are really doing:
<small>
<pre>   PID USERNAME USR SYS TRP TFL DFL LCK SLP LAT VCX ICX SCL SIG PROCESS/LWPID
 26235 app      4.0 0.1 0.0 0.0 0.0  96 0.2 0.2 643  10 894   0 SomeApp/5
 18107 app      4.0 0.1 0.0 0.0 0.0  95 1.0 0.1  1K   7  1K   0 SomeApp/8
 26235 app      3.7 0.1 0.0 0.0 0.0  96 0.2 0.4 584  36 836   0 SomeApp/4</pre>
</small>
The first thing that springs to mind is the high percentage of time spent in the LCK state. So you start the next tool to get some additional information about that high LCK rate.
Obviously the weapon of choice is plockstat -A -p 18107:
<small>
<pre>Count nsec Lock Caller
-------------------------------------------------------------------------------
78 69427 libc.so.1`libc_malloc_lock libzip.so`Java_java_util_zip_Deflater_deflateBytes+0x29c
44 99867 libc.so.1`libc_malloc_lock libzip.so`Java_java_util_zip_Deflater_deflateBytes+0x36c
21 35373 libc.so.1`libc_malloc_lock libzip.so`Java_java_util_zip_Deflater_deflateBytes+0x29c
11 58851 libc.so.1`libc_malloc_lock libzip.so`Java_java_util_zip_Deflater_deflateBytes+0x2dc
9 71014 libc.so.1`libc_malloc_lock libzip.so`Java_java_util_zip_Deflater_deflateBytes+0x36c
[... ca. 25 lines deleted ...]
1 10483 libc.so.1`libc_malloc_lock libnsl.so.1`netconfig_free+0x10
1 4401 libc.so.1`libc_malloc_lock libnsl.so.1`netconfig_free+0x10
</pre>
</small>
Hey, libc_malloc_lock, my old friend. The wait time in the locks summed up to 15,126,580 ns. Looks like this application is in need of an MT-hot memory allocator. You can switch to one by preloading a different allocator library. As the application was started by SMF, it was easy to use SMF to enable the preloading:
# svccfg -s application/someapp setenv LD_PRELOAD libumem.so
# svcadm refresh someapp
# svcadm restart someapp
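Just as a hedged sanity check (not part of the original session, and the pgrep pattern is only an assumption about the process name): after the restart you can verify that the preload really took effect by looking at the environment and the loaded objects of the new process:
<small>
<pre># pargs -e `pgrep -f SomeApp` | grep LD_PRELOAD
# pldd `pgrep -f SomeApp` | grep libumem</pre>
</small>
If LD_PRELOAD shows up in the environment and libumem.so.1 in the list of loaded objects, the new allocator is really in place.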
This procedure had the intended effect. An additional run of plockstat yielded much more favorable results:
<small>
<pre>Count nsec Lock Caller
-------------------------------------------------------------------------------
5 37929 0x3b400 libumem.so.1`umem_cache_free+0x6c
2 76420 0x3b400 libumem.so.1`umem_cache_alloc+0x50
12 5828 0x3b400 libumem.so.1`umem_cache_free+0x6c
2 31008 libumem.so.1`vmem0+0x4a8 libumem.so.1`vmem_alloc+0xf0
8 6141 0x3b400 libumem.so.1`umem_cache_alloc+0x50
1 40411 libumem.so.1`vmem0+0x4a8 libumem.so.1`vmem_xalloc+0x140
1 5922 0x37080 libumem.so.1`umem_cache_alloc+0x50</pre>
</small>
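In case you wonder where these totals come from: the nsec column of plockstat is the average block time per event, so the total time spent blocked is simply Count times nsec, summed over all rows. For the libumem run above that works out to:
<small>
<pre>5*37929 + 2*76420 + 12*5828 + 2*31008 + 8*6141 + 1*40411 + 1*5922 = 569898 ns</pre>
</small>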
All mutex blocks summed up to 569,898 ns. That's just roughly 3.8% of the time needed with the libc malloc. There were just two things nagging at me: 15,126,580 ns was high, but not that high. And even more puzzling: the LCK time now rose to 100% most of the time for many threads, where it had been at 90% before.
It's good that I prefer trains to get to my customers, otherwise my car insurance would be unaffordable by now, because my thoughts are always somewhere else. Nevertheless, the pencil connecting the dots was rather slow yesterday. My brain experienced several majf. The first dot was a measurement from the day before, when I had gone after the source of the high number of syscalls and the reasons for the high number of context switches. In hindsight, this was the first dot:
# dtrace -n 'sched:::off-cpu /execname == "SomeApp"/{@[ustack()] = count()}END{trunc(@,5)}'
dtrace: description 'sched:::off-cpu' matched 3 probes
^C
CPU ID FUNCTION:NAME
1 2 :END
libc.so.1`___lwp_cond_wait+0x4
[...]
528
libc.so.1`___lwp_cond_wait+0x4
[...]
553
libc.so.1`___lwp_cond_wait+0x4
[...]
570
libc.so.1`___lwp_cond_wait+0x4
[...]
638
libc.so.1`___lwp_cond_wait+0x4
[...]
860
Hmmm … that was interesting, but not unexpected. After that I looked at the number of system calls, and there the second dot appeared:
# dtrace -n 'syscall:::entry/execname == "SomeApp"/{@[execname,probefunc] = count()}'
dtrace: description 'syscall:::entry' matched 234 probes
^C
SomeApp lwp_cond_wait 15683
The third dot was the strange behaviour of the LCK time after changing the memory allocator. It was 17:30 and I went home. The tip of the pencil moved at a whopping 1 attoparsec per week. There was something ringing in my head, but I had no idea where the ringing came from. Back home I made myself a tea … a glacier attempted to overtake the tip of the pencil in the fast lane. Still that ringing. Drove me nuts. Should I call the mental asylum? Even Gregory House, M.D. was there, and he slept with a character played by Franka Potente … not that bad ;) I watched my newly purchased “District 9” DVD. And now, to quote my thought in the middle of the movie: “Fuck! That prstat/cond_wait issue.” Suddenly the tip of the pencil connecting the dots moved at approximately light speed (as a pencil has mass, it can never quite reach light speed).
The reason for the counterintuitive behaviour of the numbers in the LCK column of prstat is the old issue “sleeping on CVs is ‘LCK’ time, should be ‘SLP’”. When a thread waits on a condition variable and sleeps, that time isn't recorded as SLP time, it's recorded as LCK time, although SLP would be correct. There is a good explanation of the issue, and of the problems in fixing it, in the bug description.
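If you want to see how much of that LCK time is really just threads sleeping on condition variables, a small DTrace sketch (not from the original session; it simply uses the standard syscall provider probes for lwp_cond_wait) can sum up the wall-clock time each thread spends inside that call:
<small>
<pre># dtrace -n '
syscall::lwp_cond_wait:entry /execname == "SomeApp"/ { self->ts = timestamp; }
syscall::lwp_cond_wait:return /self->ts/
{
        /* book the elapsed wall-clock time against this pid/thread */
        @cvtime[pid, tid] = sum(timestamp - self->ts);
        self->ts = 0;
}
END { printa("pid %d thread %d waited %@d ns on CVs\n", @cvtime); }'</pre>
</small>
Threads with large totals here are exactly the ones prstat books under LCK, although they are merely idle and waiting for work.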
The customer's application uses a vast number of threads (“Oh my god, it's full of threads”), so the conventional interpretation of the LCK column was totally misleading. The strange increase in LCK time when load decreased was logical, too: the less work reaches the threads, the more time they spend waiting in lwp_cond_wait. And as this time counts towards LCK, it's only natural that your system seems to burn a lot of cycles in locks.