SOURCES/glibc-rh1284959-1.patch

Description: Makes trimming work consistently across arenas.
Author: Mel Gorman <mgorman@suse.de>
Origin: git://sourceware.org/git/glibc.git
Bug-RHEL: N/A
Bug-Fedora: N/A
Bug-Upstream: #17195
Upstream status: committed

Part of commit 8a35c3fe122d49ba76dff815b3537affb5a50b45 is also included
to allow the use of ALIGN_UP within malloc/arena.c.
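
For reference, ALIGN_UP and ALIGN_DOWN round a value to a multiple of a
power-of-two alignment. Their definitions in glibc's include/libc-internal.h
look roughly like this (quoted here only as a sketch of what the patch relies
on):

  /* Round down to the nearest multiple of SIZE (a power of two).  */
  #define ALIGN_DOWN(base, size) ((base) & -((__typeof__ (base)) (size)))
  /* Round up to the nearest multiple of SIZE (a power of two).  */
  #define ALIGN_UP(base, size)   ALIGN_DOWN ((base) + (size) - 1, (size))

For example, ALIGN_DOWN(x, 4096) clears the low bits of x and
ALIGN_UP(x, 4096) rounds x up to the next page boundary.
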
commit c26efef9798914e208329c0e8c3c73bb1135d9e3
Author: Mel Gorman <mgorman@suse.de>
Date:   Thu Apr 2 12:14:14 2015 +0530

    malloc: Consistently apply trim_threshold to all heaps [BZ #17195]
    
    Trimming heaps is a balance between saving memory and the system overhead
    required to update page tables and discard allocated pages. The malloc
    option M_TRIM_THRESHOLD is a tunable that users are meant to use to decide
    where this balance point is, but it is only applied to the main arena.
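    
    As a point of reference, an application can move this balance point itself
    through the public mallopt() interface. A minimal sketch (the 32 MiB value
    below is purely illustrative, not a recommendation):
    
        #include <malloc.h>
    
        int main (void)
        {
          /* Do not return free memory to the kernel until at least 32 MiB
             could be released from the top of a heap.  */
          if (mallopt (M_TRIM_THRESHOLD, 32 * 1024 * 1024) != 1)
            return 1;
          /* ... allocation-heavy work ... */
          return 0;
        }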
    
    For scalability reasons, glibc malloc has per-thread heaps but these are
    shrunk with madvise() if there is one page free at the top of the heap.
    In some circumstances this can lead to high system overhead if a thread
    has a control flow like
    
        while (data_to_process) {
            buf = malloc(large_size);
            do_stuff();
            free(buf);
        }
    
    For a large size, the free() will call madvise (pagetable teardown, page
    free and TLB flush) every time followed immediately by a malloc (fault,
    kernel page alloc, zeroing and charge accounting). The kernel overhead
    can dominate such a workload.
    
    This patch allows the user to tune when madvise gets called by applying
    the trim threshold to the per-thread heaps and using similar logic to the
    main arena when deciding whether to shrink. Alternatively, if the dynamic
    brk/mmap threshold gets adjusted then the new values will be obeyed by
    the per-thread heaps.
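    
    In effect, a per-thread heap is now only shrunk when the releasable area at
    its top is at least trim_threshold bytes. A simplified sketch of the new
    decision (should_trim() is an illustrative name, not a glibc function; the
    real logic lives in heap_trim() in the diff below):
    
        /* Round X down to a multiple of the (power-of-two) page size.  */
        #define ALIGN_DOWN(x, pagesz) ((x) & ~((pagesz) - 1))
        /* Glibc's minimum chunk size; 32 bytes is typical on 64-bit targets
           and is used here purely for illustration.  */
        #define MINSIZE 32L
    
        int should_trim (long top_size, long pad, long pagesz,
                         long trim_threshold)
        {
          long top_area = top_size - MINSIZE - 1; /* always keep a minimal chunk */
          if (top_area <= pad)
            return 0;                             /* honour the requested top pad */
          long extra = ALIGN_DOWN (top_area - pad, pagesz);
          return extra >= trim_threshold;         /* old rule: extra >= pagesz */
        }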
    
    Bug 17195 was a test case motivated by a problem encountered in scientific
    applications written in Python that performed badly due to high page fault
    overhead. The basic operation of such a program was posted by Julian Taylor
    https://sourceware.org/ml/libc-alpha/2015-02/msg00373.html
    
    With this patch applied, the overhead is eliminated. All numbers in this
    report are in seconds and were recorded by running Julian's program 30
    times.
    
    pyarray
                                     glibc               madvise
                                      2.21                    v2
    System  min             1.81 (  0.00%)        0.00 (100.00%)
    System  mean            1.93 (  0.00%)        0.02 ( 99.20%)
    System  stddev          0.06 (  0.00%)        0.01 ( 88.99%)
    System  max             2.06 (  0.00%)        0.03 ( 98.54%)
    Elapsed min             3.26 (  0.00%)        2.37 ( 27.30%)
    Elapsed mean            3.39 (  0.00%)        2.41 ( 28.84%)
    Elapsed stddev          0.14 (  0.00%)        0.02 ( 82.73%)
    Elapsed max             4.05 (  0.00%)        2.47 ( 39.01%)
    
                   glibc     madvise
                    2.21          v2
    User          141.86      142.28
    System         57.94        0.60
    Elapsed       102.02       72.66
    
    Note that almost a minute's worth of system time is eliminated and the
    program completes 28% faster on average.
    
    To illustrate the problem without Python, this is a basic test case for
    the worst-case scenario where every free is a madvise followed by an alloc:
    
    /* gcc bench-free.c -lpthread -o bench-free */
    #include <pthread.h>
    #include <stdlib.h>
    
    static int num = 1024;
    
    void __attribute__((noinline,noclone)) dostuff (void *p)
    {
    }
    
    void *worker (void *data)
    {
      int i;
    
      for (i = num; i--;)
        {
          void *m = malloc (48*4096);
          dostuff (m);
          free (m);
        }
    
      return NULL;
    }
    
    int main()
    {
      int i;
      pthread_t t;
      void *ret;
      if (pthread_create (&t, NULL, worker, NULL))
        exit (2);
      if (pthread_join (t, &ret))
        exit (3);
      return 0;
    }
    
    Before the patch, this resulted in 1024 calls to madvise. With the patch applied,
    madvise is called twice because the default trim threshold is high enough to avoid
    this.
    
    This is a more complex case where there is a mix of frees. It's simply a different worker
    function for the test case above:
    
    void *worker (void *data)
    {
      int i;
      int j = 0;
      void *free_index[num];
    
      for (i = num; i--;)
        {
          void *m = malloc ((i % 58) *4096);
          dostuff (m);
          if (i % 2 == 0) {
            free (m);
          } else {
            free_index[j++] = m;
          }
        }
      for (j--; j >= 0; j--)
        {
          free(free_index[j]);
        }
    
      return NULL;
    }
    
    glibc 2.21 calls madvise 90305 times but with the patch applied, it's
    called 13438 times. Increasing the trim threshold will decrease the number of
    times it's called, with the option of eliminating the overhead.
    
    ebizzy is meant to generate a workload resembling common web application
    server workloads. It is threaded with a large working set that at its core
    has an allocation, do_stuff, free loop that also hits this case. The primary
    metric of the benchmark is records processed per second. This is running on
    my desktop, a single-socket machine with an i7-4770 (4 cores, 8 threads).
    Each thread count was run for 30 seconds. It was only run once as the
    performance difference is so high that the variation is insignificant.
    
                    glibc 2.21              patch
    threads 1            10230              44114
    threads 2            19153              84925
    threads 4            34295             134569
    threads 8            51007             183387
    
    Note that the saving happens to be a coincidence as the size allocated
    by ebizzy was less than the default threshold. If a different number of
    chunks were specified then it may also be necessary to tune the threshold
    to compensate.
    
    This roughly quadruples the performance of this benchmark. The difference in
    system CPU usage illustrates why.
    
    ebizzy running 1 thread with glibc 2.21
    10230 records/s 306904
    real 30.00 s
    user  7.47 s
    sys  22.49 s
    
    22.49 seconds was spent in the kernel for a workload that ran for 30 seconds. With the
    patch applied:
    
    ebizzy running 1 thread with patch applied
    44126 records/s 1323792
    real 30.00 s
    user 29.97 s
    sys   0.00 s
    
    System CPU usage was zero with the patch applied. strace shows that glibc
    running this workload calls madvise approximately 9000 times a second. With
    the patch applied madvise was called twice during the workload (or 0.06
    times per second).
    
    2015-02-10  Mel Gorman  <mgorman@suse.de>
    
      [BZ #17195]
      * malloc/arena.c (free): Apply trim threshold to per-thread heaps
        as well as the main arena.

Index: glibc-2.17-c758a686/malloc/arena.c
===================================================================
--- glibc-2.17-c758a686.orig/malloc/arena.c
+++ glibc-2.17-c758a686/malloc/arena.c
@@ -661,7 +661,7 @@ heap_trim(heap_info *heap, size_t pad)
   unsigned long pagesz = GLRO(dl_pagesize);
   mchunkptr top_chunk = top(ar_ptr), p, bck, fwd;
   heap_info *prev_heap;
-  long new_size, top_size, extra, prev_size, misalign;
+  long new_size, top_size, top_area, extra, prev_size, misalign;
 
   /* Can this heap go away completely? */
   while(top_chunk == chunk_at_offset(heap, sizeof(*heap))) {
@@ -695,9 +695,16 @@ heap_trim(heap_info *heap, size_t pad)
     set_head(top_chunk, new_size | PREV_INUSE);
     /*check_chunk(ar_ptr, top_chunk);*/
   }
+
+  /* Uses similar logic for per-thread arenas as the main arena with systrim
+     by preserving the top pad and at least a page.  */
   top_size = chunksize(top_chunk);
-  extra = (top_size - pad - MINSIZE - 1) & ~(pagesz - 1);
-  if(extra < (long)pagesz)
+  top_area = top_size - MINSIZE - 1;
+  if (top_area <= pad)
+    return 0;
+
+  extra = ALIGN_DOWN(top_area - pad, pagesz);
+  if ((unsigned long) extra < mp_.trim_threshold)
     return 0;
   /* Try to shrink. */
   if(shrink_heap(heap, extra) != 0)
Index: glibc-2.17-c758a686/malloc/malloc.c
===================================================================
--- glibc-2.17-c758a686.orig/malloc/malloc.c
+++ glibc-2.17-c758a686/malloc/malloc.c
@@ -236,6 +236,8 @@
 /* For va_arg, va_start, va_end.  */
 #include <stdarg.h>
 
+/* For ALIGN_UP.  */
+#include <libc-internal.h>
 
 /*
   Debugging: