| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Optimizing + quick tests are passing, devices boot.
TODO: Test and fix bugs in mips64.
Saves 16 bytes per most ArtMethod, 7.5MB reduction in system PSS.
Some of the savings are from removal of virtual methods and direct
methods object arrays.
Bug: 19264997
(cherry picked from commit e401d146407d61eeb99f8d6176b2ac13c4df1e33)
Change-Id: I622469a0cfa0e7082a2119f3d6a9491eb61e3f3d
Fix some ArtMethod related bugs
Added root visiting for runtime methods, not currently required
since the GcRoots in these methods are null.
Added missing GetInterfaceMethodIfProxy in GetMethodLine, fixes
--trace run-tests 005, 044.
Fixed optimizing compiler bug where we used a normal stack location
instead of double on ARM64, this fixes the debuggable tests.
TODO: Fix JDWP tests.
Bug: 19264997
Change-Id: I7c55f69c61d1b45351fd0dc7185ffe5efad82bd3
ART: Fix casts for 64-bit pointers on 32-bit compiler.
Bug: 19264997
Change-Id: Ief45cdd4bae5a43fc8bfdfa7cf744e2c57529457
Fix JDWP tests after ArtMethod change
Fixes Throwable::GetStackDepth for exception event detection after
internal stack trace representation change.
Adds missing ArtMethod::GetInterfaceMethodIfProxy call in case of
proxy method.
Bug: 19264997
Change-Id: I363e293796848c3ec491c963813f62d868da44d2
Fix accidental IMT and root marking regression
Was always using the conflict trampoline. Also included fix for
regression in GC time caused by extra roots. Most of the regression
was IMT.
Fixed bug in DumpGcPerformanceInfo where we would get SIGABRT due to
detached thread.
EvaluateAndApplyChanges:
From ~2500 -> ~1980
GC time: 8.2s -> 7.2s due to 1s less of MarkConcurrentRoots
Bug: 19264997
Change-Id: I4333e80a8268c2ed1284f87f25b9f113d4f2c7e0
Fix bogus image test assert
Previously we were comparing the size of the non moving space to
size of the image file.
Now we properly compare the size of the image space against the size
of the image file.
Bug: 19264997
Change-Id: I7359f1f73ae3df60c5147245935a24431c04808a
[MIPS64] Fix art_quick_invoke_stub argument offsets.
ArtMethod reference's size got bigger, so we need to move other args
and leave enough space for ArtMethod* and 'this' pointer.
This fixes mips64 boot.
Bug: 19264997
Change-Id: I47198d5f39a4caab30b3b77479d5eedaad5006ab
|
|
|
|
|
|
|
|
|
| |
This reverts commit 2a7a1d7808f003bea908023ebd11eb442d2fca39.
Fix the problem that a long long >> 63 got the wrong answer. The
problem was that a shr was used instead of a sar.
Change-Id: I0327f79c718016ddec9272a605fc50ec15ec4566
|
|
|
|
| |
Change-Id: I7ede0f59d5109644887bf5d39201d4e1bf043f34
|
|\ |
|
| |
| |
| |
| |
| |
| |
| | |
This reverts commit 386ce406f150645158d6067c4e0a36565aefc44f.
Bug: 20413424
Change-Id: I6e93ff132907f2653f1ae12d6676ff2298f62ca1
|
|/
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The algorithm of ParallelMoveResolverNoSwap() is almost the same with
ParallelMoveResolverWithSwap(), except the way we resolve the circular
dependency. NoSwap() uses additional scratch register to resolve the
circular dependency. For example, (0->1) (1->2) (2->0) will be performed
as (2->scratch) (1->2) (0->1) (scratch->0).
On architectures without swap register support, NoSwap() can reduce the
number of moves from 3x(N-1) to (N+1) when there is circular dependency
with N moves.
And also, NoSwap() algorithm does not depend on architecture register
layout information, which means it can support register pairs on arm32
and X/W, D/S registers on arm64 without additional modification.
Change-Id: Idf56bd5469bb78c0e339e43ab16387428a082318
|
|\ |
|
| |
| |
| |
| |
| |
| |
| |
| | |
This reverts commit a5c19ce8d200d68a528f2ce0ebff989106c4a933.
This commit introduces a performance regression on CaffeineLogic of 30%.
Change-Id: I917e206e249d44e1748537bc1b2d31054ea4959d
|
|\ \ |
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Test fails on arm.
This reverts commit 2d45b4df3838d9c0e5a213305ccd1d7009e01437.
Change-Id: Id2864917b52f7ffba459680303a2d15b34f16a4e
|
|\| | |
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
long-to-fp conversion implemented using SSE loses the precision.
The test is included. CL uses FPU to provide the correct result.
Change-Id: I8eaf3c46819a8cb52642a7e7d7c4e3e0edbc88db
Signed-off-by: Serguei Katkov <serguei.i.katkov@intel.com>
|
|\ \ \ |
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
This reverts commit 222fcf96c9b73bbb739012575e7e413caf9348ec.
Reverting this CL as it is breaking a few tests (see http://build.chromium.org/p/client.art/builders/host-x86/builds/3251/steps/test%20optimizing/logs/stdio). Will investigate ASAP.
Change-Id: Iddd8363e83a24aa49fbdf0f0c9dc12e63b4848de
|
|\ \ \ \
| |_|_|/
|/| | | |
|
| | | |
| | | |
| | | |
| | | | |
Change-Id: Ibf39cbc8ac1d773599d70be2cb1e941674b60f1d
|
| |/ /
|/| |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Add a new constructor to ScratchRegisterScope that will supply a
register if there is a free one, but not spill to force one. Use this
to generated alternate code that doesn't use a temporary, as the
spill/restore of a register generates extra instructions that aren't
necessary on x86.
Here is the benefit for a 32 bit memory-to-memory exchange with no
free registers:
< 50 push eax
< 53 push ebx
< 8B44244C mov eax, [esp + 76]
< 8B5C246C mov ebx, [esp + 108]
< 8944246C mov [esp + 108], eax
< 895C244C mov [esp + 76], ebx
< 5B pop ebx
< 58 pop eax
---
> FF742444 push [esp + 68]
> FF742468 push [esp + 104]
> 8F44244C pop [esp + 72]
> 8F442468 pop [esp + 100]
Avoid using xchg instruction, as it is slow on smaller processors.
Change-Id: Id29ee3abd998577baaee552d55d23e60ae0c7871
Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Support memory operands for integer shifts. Generate better code for
long shifts by constants.
Change-Id: Icc92fa1b59cc280d4894af6f054e19b01977d5ce
Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
|\| |
| |/
|/| |
|
| |
| |
| |
| |
| |
| | |
This is done using the algorithms in Hacker's Delight chapter 10.
Change-Id: I7bacefe10067569769ed31a1f7834f796fb41119
|
|\ \ |
|
| |/
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Implement floor/ceil/round/RoundFloat on x86 and x86_64.
Implement RoundDouble on x86_64.
Add support for roundss and roundsd on both architectures. Support them
in the disassembler as well.
Add the instruction set features for x86, as the 'round' instruction is
only supported if SSE4.1 is supported.
Fix the tests to handle the addition of passing the instruction set
features to x86 and x86_64.
Add assembler tests for roundsd and roundss to x86_64 assembler tests.
Change-Id: I9742d5930befb0bbc23f3d6c83ce0183ed9fe04f
Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
|/
|
|
|
|
| |
This reverts commit 0ba627337274ccfb8c9cb9bf23fffb1e1b9d1430.
Change-Id: I1ca10d15bbb49897a0cf541ab160431ec180a006
|
|
|
|
| |
Change-Id: Ia540df98755ac493fe61bd63f0bd94f6d97fbb57
|
|
|
|
|
|
|
|
|
|
|
| |
Implement the supported intrinsics for X86.
Enhance the graph visualizer to print <U> for unallocated locations, to
allow calling the graph dumper from within register allocation for
debugging purposes.
Change-Id: I3b0319eb70a9a4ea228f67065b4c52d13a1ae775
Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
|
|
|
|
|
|
|
|
|
| |
This breaks compiling the core image:
Error after BCE: art::SSAChecker: Instruction 219 in block 1 does not dominate use 221 in block 1.
This reverts commit e295e6ec5beaea31be5d7d3c996cd8cfa2053129.
Change-Id: Ieeb48797d451836ed506ccb940872f1443942e4e
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A mechanism is introduced that a runtime method can be called
from code compiled with optimizing compiler to deoptimize into
interpreter. This can be used to establish invariants in the managed code
If the invariant does not hold at runtime, we will deoptimize and continue
execution in the interpreter. This allows to optimize the managed code as
if the invariant was proven during compile time. However, the exception
will be thrown according to the semantics demanded by the spec.
The invariant and optimization included in this patch are based on the
length of an array. Given a set of array accesses with constant indices
{c1, ..., cn}, we can optimize away all bounds checks iff all 0 <= min(ci) and
max(ci) < array-length. The first can be proven statically. The second can be
established with a deoptimization-based invariant. This replaces n bounds
checks with one invariant check (plus slow-path code).
Change-Id: I8c6e34b56c85d25b91074832d13dba1db0a81569
|
|
|
|
|
|
| |
This reverts commit 154552e666347d41d95d7619c6ee56249ff4feca.
Change-Id: Idc726551c249a888b7ff5fde8508ae50e81b2e13
|
|
|
|
|
|
|
|
| |
Few libcore failures.
This reverts commit b4ba354cf8d22b261205494875cc014f18587b50.
Change-Id: I4a28d853e730dff9b69aec9555505803cf2fcd63
|
|
|
|
| |
Change-Id: I9006972a65a1f191c45691104a960366747f9d16
|
|
|
|
|
|
|
| |
When a block branches to a non-following block, but blocks
in-between do branch to it, we can avoid doing the branch.
Change-Id: I9b343f662a4efc718cd4b58168f93162a24e1219
|
|\ |
|
| |
| |
| |
| |
| |
| |
| |
| | |
Add support for the new ABI passing FP parameters in XMM0-XMM3. This
allows us to optimize for x86 methods that don't use 'long'.
Change-Id: Ic79a24767173451e7d7095ccc2a00b307593a868
Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
|/
|
|
| |
Change-Id: I044757a2f06e535cdc1480c4fc8182b89635baf6
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit 949c91fb91f40a4a80b2b492913cf8541008975e.
This time, don't clobber EBX before saving it.
Redo some of the macros to make register usage explicit.
Change-Id: I8db8662877cd006816e16a28f42444ab7c36bfef
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
And the 3 Mac build fixes. Fix conflicts in context_x86.* .
This reverts commits
3d2c8e74c27efee58e24ec31441124f3f21384b9 ,
34eda1dd66b92a361797c63d57fa19e83c08a1b4 ,
f601d1954348b71186fa160a0ae6a1f4f1c5aee6 ,
bc503348a1da573488503cc2819c9e30807bea31 .
Bug: 19150481
Change-Id: I6650ee30a7d261159380fe2119e14379e4dc9970
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Use XMM0-XMM3 as parameter registers for float/double on X86. X86_64
already uses XMM0-XMM7 for parameters.
Change the 'hidden' argument register from XMM0 to XMM7 to avoid a
conflict.
Add support for FPR save/restore in runtime/arch/x86.
Minimal support for Optimizing baseline compiler.
Bump the version in runtime/oat.h because this is an ABI change.
Change-Id: Ia6fe150e8488b9e582b0178c0dda65fc81d5a8ba
Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
|
|
|
|
|
|
|
|
| |
- Share the computation of core_spill_mask and fpu_spill_mask
between backends.
- Remove explicit stack overflow check support: we need to adjust
them and since they are not tested, they will easily bitrot.
Change-Id: I0b619b8de4e1bdb169ea1ae7c6ede8df0d65837a
|
|
|
|
|
|
| |
Will work on other architectures and FP support in other CLs.
Change-Id: I8cef0343eedc7202d206f5217fdf0349035f0e4d
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Replace the calls to fmod/fmodf by inline code as is done in the Quick
compiler.
Remove the quick fmod/fmodf runtime entries, as they are no longer in
use.
64 bit code generator Move() routine needed to be enhanced to handle
constants, as Location::Any() allows them to be generated.
Change-Id: I6b6a42f6faeed4b0b3c940453e487daf5b25d184
Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
|
|
|
|
|
|
|
|
| |
- for backends: arm, arm64, x86, x86_64
- fixed parameter passing for CodeGenerator
- 003-omnibus-opcodes test verifies that NullPointerExceptions work as
expected
Change-Id: I1b302acd353342504716c9169a80706cf3aba2c8
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The current stack frame calculation assumes that each live register to
be saved/restored has the word size of the machine. This fails for X86,
where a double in an XMM register takes up 8 bytes. Change the
calculation to keep track of the number of core registers and number of
fp registers to handle this distinction.
This is slightly pessimal, as the registers may not be active at the
same time, but the only way to handle this would be to allocate both
classes of registers simultaneously, or remember all the active
intervals, matching them up and compute the size of each safepoint
interval.
Change-Id: If7860aa319b625c214775347728cdf49a56946eb
Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
The basic approach is:
- An instruction that needs two registers gets two intervals.
- When allocating the low part, we also allocate the high part.
- When splitting a low (or high) interval, we also split the high
(or low) equivalent.
- Allocation follows the (S/D register) requirement that low
registers are always even and the high equivalent is low + 1.
Change-Id: I06a5148e05a2ffc7e7555d08e871ed007b4c2797
|
|
|
|
|
|
|
|
|
|
|
|
| |
- for backends: arm, x86, x86_64
- added necessary instructions to assemblies
- clean up code gen for field set/get
- fixed InstructionDataEquals for some instructions
- fixed comments in compiler_enums
* 003-opcode test verifies basic volatile functionality
Change-Id: I144393efa312dfb2c332cb84056b00edffee338a
|
|
|
|
|
|
| |
Added SHL, SHR, USHR for arm, x86, x86_64.
Change-Id: I971f594e270179457e6958acf1401ff7630df07e
|
|
|
|
|
|
|
|
| |
These constants were defined prior to k{InstructionSet}PointerSize. So
use them consistently in optimizing as a first step. We can discuss
whether we should remove them in a second step.
Change-Id: If129de1a3bb8b65f8d9c816a8ad466815fb202e6
|
|
|
|
|
|
|
| |
- for arm, x86, x86_64
- minor cleanup/fix in div tests
Change-Id: I240874010206a5a9b3aaffbc81a885b94c248f93
|
|\ |
|
| |
| |
| |
| |
| |
| |
| |
| | |
The two locations of the index and length could overlap,
so we need a parallel move. Also factorize the code for
doing a parallel move based on two locations.
Change-Id: Iee8b3459e2eed6704d45e9a564fb2cd050741ea4
|
|/
|
|
| |
Change-Id: I7cf6da1fd334a7177a5580931b8f174dd40b7cec
|