<feed xmlns='http://www.w3.org/2005/Atom'>
<title>art/runtime/arch/quick_alloc_entrypoints.cc, branch cm-13.0</title>
<subtitle>Unnamed repository; edit this file 'description' to name the repository.
</subtitle>
<link rel='alternate' type='text/html' href='https://git.replicant.us/replicant/art/'/>
<entry>
<title>Guard entrypoint changing by runtime shutdown lock.</title>
<updated>2014-03-04T01:08:30+00:00</updated>
<author>
<name>Mathieu Chartier</name>
<email>mathieuc@google.com</email>
</author>
<published>2014-03-02T21:28:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.replicant.us/replicant/art/commit/?id=d889178ec78930538d9d6a66c3df9ee9afaffbb4'/>
<id>d889178ec78930538d9d6a66c3df9ee9afaffbb4</id>
<content type='text'>
There was a race when we changed the allocation entrypoints where a
new thread would be starting (Thread::Init) and initialize to the
wrong entrypoints. Guarding allocation entrypoint changing
with the runtime shutdown lock fixes this race condition since
Thread::Init is only called with the runtime shutdown lock held.

Bug: 13250963

Change-Id: I8eb209c124b6bf17020de874e1b0083f158b8200
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
There was a race when we changed the allocation entrypoints where a
new thread would be starting (Thread::Init) and initialize to the
wrong entrypoints. Guarding allocation entrypoint changing
with the runtime shutdown lock fixes this race condition since
Thread::Init is only called with the runtime shutdown lock held.

Bug: 13250963

Change-Id: I8eb209c124b6bf17020de874e1b0083f158b8200
</pre>
</div>
</content>
</entry>
<entry>
<title>Embed array class pointers at array allocation sites.</title>
<updated>2014-01-28T00:50:29+00:00</updated>
<author>
<name>Hiroshi Yamauchi</name>
<email>yamauchi@google.com</email>
</author>
<published>2014-01-28T00:50:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.replicant.us/replicant/art/commit/?id=bb8f0ab736b61db8f543e433859272e83f96ee9b'/>
<id>bb8f0ab736b61db8f543e433859272e83f96ee9b</id>
<content type='text'>
Following https://android-review.googlesource.com/#/c/79302, embed
array class pointers at array allocation sites in the compiled code.

Change-Id: I67a1292466dfbb7f48e746e5060e992dd93525c5
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Following https://android-review.googlesource.com/#/c/79302, embed
array class pointers at array allocation sites in the compiled code.

Change-Id: I67a1292466dfbb7f48e746e5060e992dd93525c5
</pre>
</div>
</content>
</entry>
<entry>
<title>Use direct class pointers at allocation sites in the compiled code.</title>
<updated>2014-01-23T23:29:12+00:00</updated>
<author>
<name>Hiroshi Yamauchi</name>
<email>yamauchi@google.com</email>
</author>
<published>2014-01-15T19:46:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.replicant.us/replicant/art/commit/?id=be1ca55db3362f5b100c4c65da5342fd299520bb'/>
<id>be1ca55db3362f5b100c4c65da5342fd299520bb</id>
<content type='text'>
- Rather than looking up a class from its type ID (and checking if
  it's resolved/initialized, resolving/initializing if not), use
  direct class pointers, if possible (boot-code-to-boot-class pointers
  and app-code-to-boot-class pointers.)
- This results in a 1-2% speedup in Ritz MemAllocTest on Nexus 4.
- Embedding the object size (along with class pointers) caused a 1-2%
  slowdown in MemAllocTest and isn't implemented in this change.
- TODO: do the same for array allocations.
- TODO: when/if an application gets its own image, implement
  app-code-to-app-class pointers.
- Fix a -XX:gc bug.
  cf. https://android-review.googlesource.com/79460/
- Add /tmp/android-data/dalvik-cache to the list of locations to
  remove oat files in clean-oat-host.
  cf. https://android-review.googlesource.com/79550
- Add back a dropped UNLIKELY in FindMethodFromCode().
  cf. https://android-review.googlesource.com/74205

Bug: 9986565
Change-Id: I590b96bd21f7a7472f88e36752e675547559a5b1
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
- Rather than looking up a class from its type ID (and checking if
  it's resolved/initialized, resolving/initializing if not), use
  direct class pointers, if possible (boot-code-to-boot-class pointers
  and app-code-to-boot-class pointers.)
- This results in a 1-2% speedup in Ritz MemAllocTest on Nexus 4.
- Embedding the object size (along with class pointers) caused a 1-2%
  slowdown in MemAllocTest and isn't implemented in this change.
- TODO: do the same for array allocations.
- TODO: when/if an application gets its own image, implement
  app-code-to-app-class pointers.
- Fix a -XX:gc bug.
  cf. https://android-review.googlesource.com/79460/
- Add /tmp/android-data/dalvik-cache to the list of locations to
  remove oat files in clean-oat-host.
  cf. https://android-review.googlesource.com/79550
- Add back a dropped UNLIKELY in FindMethodFromCode().
  cf. https://android-review.googlesource.com/74205

Bug: 9986565
Change-Id: I590b96bd21f7a7472f88e36752e675547559a5b1
</pre>
</div>
</content>
</entry>
<entry>
<title>Background compaction support.</title>
<updated>2014-01-08T22:16:12+00:00</updated>
<author>
<name>Mathieu Chartier</name>
<email>mathieuc@google.com</email>
</author>
<published>2013-12-16T19:54:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.replicant.us/replicant/art/commit/?id=e6da9af8dfe0a3e3fbc2be700554f6478380e7b9'/>
<id>e6da9af8dfe0a3e3fbc2be700554f6478380e7b9</id>
<content type='text'>
When the process state changes to a state which does not perceives
jank, we copy from the main free-list backed allocation space to
the bump pointer space and enable the semispace allocator.

When we transition back to foreground, we copy back to a free-list
backed space.

Create a seperate non-moving space which only holds non-movable
objects. This enables us to quickly wipe the current alloc space
(DlMalloc / RosAlloc) when we transition to background.

Added multiple alloc space support to the sticky mark sweep GC.

Added a -XX:BackgroundGC option which lets you specify
which GC to use for background apps. Passing in
-XX:BackgroundGC=SS makes the heap compact the heap for apps which
do not perceive jank.

Results:
Simple background foreground test:
0. Reboot phone, unlock.
1. Open browser, click on home.
2. Open calculator, click on home.
3. Open calendar, click on home.
4. Open camera, click on home.
5. Open clock, click on home.
6. adb shell dumpsys meminfo

PSS Normal ART:
Sample 1:
    88468 kB: Dalvik
     3188 kB: Dalvik Other
Sample 2:
    81125 kB: Dalvik
     3080 kB: Dalvik Other

PSS Dalvik:
Total PSS by category:
Sample 1:
    81033 kB: Dalvik
    27787 kB: Dalvik Other
Sample 2:
    81901 kB: Dalvik
    28869 kB: Dalvik Other

PSS ART + Background Compaction:
Sample 1:
    71014 kB: Dalvik
     1412 kB: Dalvik Other
Sample 2:
    73859 kB: Dalvik
     1400 kB: Dalvik Other

Dalvik other reduction can be explained by less deep allocation
stacks / less live bitmaps / less dirty cards.

TODO improvements: Recycle mem-maps which are unused in the current
state. Not hardcode 64 MB capacity of non movable space (avoid
returning linear alloc nightmares). Figure out ways to deal with low
virtual address memory problems.

Bug: 8981901

Change-Id: Ib235d03f45548ffc08a06b8ae57bf5bada49d6f3
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When the process state changes to a state which does not perceives
jank, we copy from the main free-list backed allocation space to
the bump pointer space and enable the semispace allocator.

When we transition back to foreground, we copy back to a free-list
backed space.

Create a seperate non-moving space which only holds non-movable
objects. This enables us to quickly wipe the current alloc space
(DlMalloc / RosAlloc) when we transition to background.

Added multiple alloc space support to the sticky mark sweep GC.

Added a -XX:BackgroundGC option which lets you specify
which GC to use for background apps. Passing in
-XX:BackgroundGC=SS makes the heap compact the heap for apps which
do not perceive jank.

Results:
Simple background foreground test:
0. Reboot phone, unlock.
1. Open browser, click on home.
2. Open calculator, click on home.
3. Open calendar, click on home.
4. Open camera, click on home.
5. Open clock, click on home.
6. adb shell dumpsys meminfo

PSS Normal ART:
Sample 1:
    88468 kB: Dalvik
     3188 kB: Dalvik Other
Sample 2:
    81125 kB: Dalvik
     3080 kB: Dalvik Other

PSS Dalvik:
Total PSS by category:
Sample 1:
    81033 kB: Dalvik
    27787 kB: Dalvik Other
Sample 2:
    81901 kB: Dalvik
    28869 kB: Dalvik Other

PSS ART + Background Compaction:
Sample 1:
    71014 kB: Dalvik
     1412 kB: Dalvik Other
Sample 2:
    73859 kB: Dalvik
     1400 kB: Dalvik Other

Dalvik other reduction can be explained by less deep allocation
stacks / less live bitmaps / less dirty cards.

TODO improvements: Recycle mem-maps which are unused in the current
state. Not hardcode 64 MB capacity of non movable space (avoid
returning linear alloc nightmares). Figure out ways to deal with low
virtual address memory problems.

Bug: 8981901

Change-Id: Ib235d03f45548ffc08a06b8ae57bf5bada49d6f3
</pre>
</div>
</content>
</entry>
<entry>
<title>Thread local bump pointer allocator.</title>
<updated>2013-12-17T00:57:37+00:00</updated>
<author>
<name>Mathieu Chartier</name>
<email>mathieuc@google.com</email>
</author>
<published>2013-11-30T01:24:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.replicant.us/replicant/art/commit/?id=692fafd9778141fa6ef0048c9569abd7ee0253bf'/>
<id>692fafd9778141fa6ef0048c9569abd7ee0253bf</id>
<content type='text'>
Added a thread local allocator to the heap, each thread has three
pointers which specify the thread local buffer: start, cur, and
end. When the remaining space in the thread local buffer isn't large
enough for the allocation, the allocator allocates a new thread
local buffer using the bump pointer allocator.

The bump pointer space had to be modified to accomodate thread
local buffers. These buffers are called "blocks", where a block
is a buffer which contains a set of adjacent objects. Blocks
aren't necessarily full and may have wasted memory towards the
end. Blocks have an 8 byte header which specifies their size and is
required for traversing bump pointer spaces.

Memory usage is in between full bump pointer and ROSAlloc since
madvised memory limits wasted ram to an average of 1/2 page per
block.

Added a runtime option -XX:UseTLAB which specifies whether or
not to use the thread local allocator. Its a NOP if the garbage
collector is not the semispace collector.

TODO: Smarter block accounting to prevent us reading objects until
we either hit the end of the block or GetClass() == null which
signifies that the block isn't 100% full. This would provide a
slight speedup to BumpPointerSpace::Walk.

Timings: -XX:HeapMinFree=4m -XX:HeapMaxFree=8m -Xmx48m
ritzperf memalloc:
Dalvik -Xgc:concurrent: 11678
Dalvik -Xgc:noconcurrent: 6697
-Xgc:MS: 5978
-Xgc:SS: 4271
-Xgc:CMS: 4150
-Xgc:SS -XX:UseTLAB: 3255

Bug: 9986565
Bug: 12042213

Change-Id: Ib7e1d4b199a8199f3b1de94b0a7b6e1730689cad
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Added a thread local allocator to the heap, each thread has three
pointers which specify the thread local buffer: start, cur, and
end. When the remaining space in the thread local buffer isn't large
enough for the allocation, the allocator allocates a new thread
local buffer using the bump pointer allocator.

The bump pointer space had to be modified to accomodate thread
local buffers. These buffers are called "blocks", where a block
is a buffer which contains a set of adjacent objects. Blocks
aren't necessarily full and may have wasted memory towards the
end. Blocks have an 8 byte header which specifies their size and is
required for traversing bump pointer spaces.

Memory usage is in between full bump pointer and ROSAlloc since
madvised memory limits wasted ram to an average of 1/2 page per
block.

Added a runtime option -XX:UseTLAB which specifies whether or
not to use the thread local allocator. Its a NOP if the garbage
collector is not the semispace collector.

TODO: Smarter block accounting to prevent us reading objects until
we either hit the end of the block or GetClass() == null which
signifies that the block isn't 100% full. This would provide a
slight speedup to BumpPointerSpace::Walk.

Timings: -XX:HeapMinFree=4m -XX:HeapMaxFree=8m -Xmx48m
ritzperf memalloc:
Dalvik -Xgc:concurrent: 11678
Dalvik -Xgc:noconcurrent: 6697
-Xgc:MS: 5978
-Xgc:SS: 4271
-Xgc:CMS: 4150
-Xgc:SS -XX:UseTLAB: 3255

Bug: 9986565
Bug: 12042213

Change-Id: Ib7e1d4b199a8199f3b1de94b0a7b6e1730689cad
</pre>
</div>
</content>
</entry>
<entry>
<title>Refactor allocation entrypoints.</title>
<updated>2013-11-20T19:14:11+00:00</updated>
<author>
<name>Mathieu Chartier</name>
<email>mathieuc@google.com</email>
</author>
<published>2013-11-15T01:45:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.replicant.us/replicant/art/commit/?id=cbb2d20bea2861f244da2e2318d8c088300a3710'/>
<id>cbb2d20bea2861f244da2e2318d8c088300a3710</id>
<content type='text'>
Adds support for switching entrypoints during runtime. Enables
addition of new allocators with out requiring significant copy
paste. Slight speedup on ritzperf probably due to more inlining.

TODO: Ensuring that the entire allocation path is inlined so
that the switch statement in the allocation code is optimized
out.

Rosalloc measurements:
4583
4453
4439
4434
4751

After change:
4184
4287
4131
4335
4097

Change-Id: I1352a3cbcdf6dae93921582726324d91312df5c9
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Adds support for switching entrypoints during runtime. Enables
addition of new allocators with out requiring significant copy
paste. Slight speedup on ritzperf probably due to more inlining.

TODO: Ensuring that the entire allocation path is inlined so
that the switch statement in the allocation code is optimized
out.

Rosalloc measurements:
4583
4453
4439
4434
4751

After change:
4184
4287
4131
4335
4097

Change-Id: I1352a3cbcdf6dae93921582726324d91312df5c9
</pre>
</div>
</content>
</entry>
</feed>
