| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
| |
Speedup on N4:
MemAllocTest 3044 -> 2396 (~21% reduction)
BinaryTrees 4101 -> 2929 (~26% reduction)
Bug: 9986565
Change-Id: Ia1d1a37b9e001f903c3c056e8ec68fc8c623a78b
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Allow ValgrindMallocSpace wrapper for RosAlloc.Requires refactoring,
as ValgrindMallocSpace was bound to the signature of DlMallocSpace.
Also turn of native stack dumping when running under Valgrind to
work around b/18119146.
Ritzperf before and after
Mean 3190.725 3082.475
Standard Error 11.68407 10.37911
Mode 3069 2980
Median 3182.5 3051.5
Variance 16382.117 12927.125
Standard Deviation 127.99264 113.69751
Kurtosis 1.1065632 0.3657799
Skewness 0.9013805 0.9117792
Range 644 528
Minimum 2991 2928
Maximum 3635 3456
Count 120 120
Bug: 18119146
Change-Id: I25558ea7cb578406011dede9d3d0bdbfee4ff4d5
|
|
|
|
| |
Change-Id: I4e4ef3a2002fc59ebd9097087f150eaf3f2a7e08
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Used by SS/GSS collectors since these run with mutators suspended and
only allocate from a single thread. Added AllocThreadUnsafe to
BumpPointerSpace and RosAllocSpace. Added AllocThreadUnsafe which uses
current runs as thread local runs for a thread unsafe allocation.
Added code to revoke current runs which are the same idx as thread
local runs.
Changed:
The number of thread local runs in each thread is now the the number
of thread local runs in RosAlloc instead of the number of size
brackets.
Total GC time / time on EvaluateAndApplyChanges.
TLAB SS:
Before: 36.7s / 7254
After: 16.1s / 4837
TLAB GSS:
Before: 6.9s / 3973
After: 5.7s / 3778
Bug: 8981901
Change-Id: Id1d264ade3799f431bf7ebbdcca6146aefbeb632
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Enabling this flag greatly reduces how much time was spent in the GC.
It was not done previously since it was regressing MemAllocTest. With
these RosAlloc changes, the benchmark score no longer regresses after
we enable the flag.
Changed Run::AllocSlot to only have one mode of allocation. The new
mode is finding the first free bit in the bitmap. This was
previously the slow path but is now the fast path. Some optimizations
which enabled this include always having the alloc bitmap bits which
correspond to invalid slots be set to 1. This prevents us from needing
a bound check since we will never end up allocating there.
Changed revoking thread local buffer to point to an invalid run. The
invalid run is just a run which always has all the allocation bits set
to 1. When a thread attempts to do a thread local allocation from here
it will always fail and go slow path. This eliminates the need for a
null check for revoked runs.
Changed zeroing of memory to happen during free, AllocPages should
always return zeroed memory. Added prefetching which happens when we
allocate a run.
Some refactoring to reduce duplicated code.
Ergonomics changes: Changed kStickyGcThroughputAdjustment to 1.0,
this helps reduce GC time.
Measurements (3 samples per benchmark):
Before: MemAllocTest scores: 3463, 3445, 3431
EvaluateAndApplyChanges score | total GC time
Iter 1: 3485, 23.602436s
Iter 2: 3434, 22.499882s
Iter 3: 3483, 23.253274s
After: MemAllocTest scores: 3495, 3417, 3409
EvaluateAndApplyChanges score | total GC time:
Iter 1: 3375, 17.463462s
Iter 2: 3358, 16.185188s
Iter 3: 3367, 15.822312s
Bug: 8788501
Bug: 11790317
Bug: 9986565
Change-Id: Ifd273a054824028dabed27c07c081dde1816f93c
|
|
Bug: 9986565
Change-Id: I9bc411b8ae39379f9d730f40974857a585405fde
|