Don't use odd-numbered FP registers for single-precision data on
MIPS32r6 (64-bit FPU).
Bug: 23050326
Change-Id: I35cc19df091149773411e2336b01c170929376bc
(cherry picked from commit fc8156a3df88e259c892d50bf23f7c4f11531844)
(cherry picked from commit 22bb5a2ebc1e2724179faf4660b2735dcb185f21)
Bug: 21555893
Change-Id: I2a995be128a5603d08753c14956dd8c8240ac63c
CFI is necessary for stack unwinding in gdb, lldb, and libunwind.
Change-Id: Ic3b84c9dc91c4bae80e27cda02190f3274e95ae8
Code from compiler/dex/quick/mips64 is merged with the code
in the mips folder.
Change-Id: I785983c21549141306484647da86a0bb4815daaa
Add Mips32r6 compiler support.
Don't use deprecated Mips32r2 instructions if running in Mips32r6
mode.
Change-Id: I54e689aa8c026ccb75c4af515aa2794f471c9f67
Make several fields const in CompilationUnit. May benefit some Mir2Lir
code that repeats tests, and in general immutability is good.
Remove compiler_internals.h and refactor some other headers to reduce
overly broad imports (and thus forced recompiles on changes).
Change-Id: I898405907c68923581373b5981d8a85d2e5d185a
Fix the indentation to be standard.
Change-Id: I39a16716be3429dfef6df0a585e24423b46363a2
On ARM, after emitting invoke-interface we didn't have any
free temps to use for storing the result, so we would crash
if the result was an unpromoted Dalvik register with a stack
location too far from SP.
Bug: 18769895
(cherry picked from commit d6bd06c713e8ec69de96510ef57bdf7adb4781ed)
Change-Id: Id88f6f3788eaf6ecbc7bd68880b445423f6e4f94
Now every architecture must provide a mapper between
VR parameters and physical registers. Additionally, an
architecture can provide a bulk-copy helper for the
GenDalvikArgs utility.
Everything else becomes common code:
GetArgMappingToPhysicalReg, GenDalvikArgsNoRange,
GenDalvikArgsRange, FlushIns.
The mapper now uses the shorty representation of the input
parameters; this is required because the location alone is not
enough to determine the type of a parameter (fp or core).
For details,
see https://android-review.googlesource.com/#/c/113936/.
Change-Id: Ie762b921e0acaa936518ee6b63c9a9d25f83e434
Signed-off-by: Serguei Katkov <serguei.i.katkov@intel.com>
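The idea of typing arguments from the shorty rather than from their locations can be illustrated with a small sketch; the RegKind enum and MapArgsFromShorty helper below are hypothetical names for illustration, not the actual ART interfaces.

    #include <vector>

    // Hypothetical register kinds an argument can be mapped to.
    enum class RegKind { kCore, kFp };

    // Walk the shorty (return type first, then one character per parameter) and
    // decide whether each parameter needs a core or an FP register.  A value's
    // location alone cannot tell a float apart from an int, which is why the
    // shorty has to be consulted.
    std::vector<RegKind> MapArgsFromShorty(const char* shorty) {
      std::vector<RegKind> kinds;
      for (const char* p = shorty + 1; *p != '\0'; ++p) {  // skip return type
        switch (*p) {
          case 'F':  // float
          case 'D':  // double (wide: also occupies the next vreg)
            kinds.push_back(RegKind::kFp);
            break;
          default:   // ints, longs ('J'), booleans, object references ('L'), ...
            kinds.push_back(RegKind::kCore);
            break;
        }
      }
      return kinds;
    }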
Also, refactor how feature strings are handled so they are additive or
subtractive.
Make MIPS have features for a 32-bit FPU and MIPS v2. Use these in the quick
compiler rather than #ifdefs, which would not have worked for cross-compilation.
Add SIMD features for x86/x86-64 proposed in:
https://android-review.googlesource.com/#/c/112370/
Bug: 18056890
Change-Id: Ic88ff84a714926bd277beb74a430c5c7d5ed7666
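A sketch of what additive/subtractive feature strings mean in practice; the bit names and the ApplyFeatureString helper are made up for illustration and are not the real instruction-set-features code.

    #include <cstdint>
    #include <sstream>
    #include <string>

    // Hypothetical feature bits.
    constexpr uint32_t kFpu32Bit = 1u << 0;  // 32-bit FPU
    constexpr uint32_t kMips2Bit = 1u << 1;  // MIPS v2 instructions

    // Apply a comma-separated feature string to a default mask.  A leading '-'
    // subtracts a feature, anything else adds one, so strings compose:
    // "mips2,-fpu32" enables MIPS v2 and disables the 32-bit FPU.
    uint32_t ApplyFeatureString(uint32_t features, const std::string& str) {
      std::istringstream in(str);
      std::string item;
      while (std::getline(in, item, ',')) {
        const bool subtract = !item.empty() && item[0] == '-';
        const std::string name = subtract ? item.substr(1) : item;
        const uint32_t bit = (name == "fpu32") ? kFpu32Bit
                           : (name == "mips2") ? kMips2Bit
                           : 0u;  // unknown names are ignored in this sketch
        features = subtract ? (features & ~bit) : (features | bit);
      }
      return features;
    }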
Fix associated errors about unused parameters and implicit sign conversions.
For sign conversions this was largely in the area of enums, so add ostream
operators for the affected enums and fix tools/generate-operator-out.py.
Tidy arena allocation code and arena allocated data types, rather than fixing
new and delete operators.
Remove dead code.
Change-Id: I5b433e722d2f75baacfacae4d32aef4a828bfe1b
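For reference, an ostream operator for one of these enums looks like the sketch below; the enum here is invented, and the real operators are generated by tools/generate-operator-out.py.

    #include <ostream>

    // Stand-in for one of the compiler enums.
    enum class SampleRegClass { kCoreReg, kFPReg, kRefReg };

    // Streaming the symbolic name avoids logging raw integral values (and the
    // implicit sign conversions that come with them).
    std::ostream& operator<<(std::ostream& os, SampleRegClass rhs) {
      switch (rhs) {
        case SampleRegClass::kCoreReg: return os << "kCoreReg";
        case SampleRegClass::kFPReg:   return os << "kFPReg";
        case SampleRegClass::kRefReg:  return os << "kRefReg";
      }
      return os << "SampleRegClass[" << static_cast<int>(rhs) << "]";
    }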
Purge GrowableArray from Quick and Portable.
Remove GrowableArray<T>::Iterator.
Change-Id: I92157d3a6ea5975f295662809585b2dc15caa1c6
Clean up the compiler: fewer extern functions, disentangle the
compilers, hide some compiler specifics, reduce global includes.
Change-Id: Ibaf88d02505d86994d7845cf0075be5041cc8438
To reduce the complexity of calling trampolines in generic code,
introduce an enumeration for entrypoints. Introduce a header that lists
the entrypoint enum and exposes a templatized method that translates an
enum value to the corresponding thread offset value.
Call helpers are rewritten to have an enum parameter instead of the
thread offset. Also rewrite LoadHelper and GenConversionCall this way.
It is now LoadHelper's duty to select the right thread offset size.
Introduce an InvokeTrampoline virtual method in Mir2Lir. This allows us to
further simplify the call helpers, as well as make OpThreadMem specific
to X86 only (removed from Mir2Lir).
Make GenInlinedCharAt virtual, move a copy to X86 backend, and simplify
both copies. Remove LoadBaseIndexedDisp and OpRegMem from Mir2Lir, as they
are now specific to X86 only.
Remove StoreBaseIndexedDisp from Mir2Lir, as it was only ever used in the
X86 backend.
Remove OpTlsCmp from Mir2Lir, as it was only ever used in the X86 backend.
Remove OpLea from Mir2Lir, as it was only ever defined in the X86 backend.
Remove GenImmedCheck from Mir2Lir as it was neither used nor implemented.
Change-Id: If0a6182288c5d57653e3979bf547840a4c47626e
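A minimal sketch of the enum-to-thread-offset translation described above; the entrypoint names, the table start offset and the GetThreadOffset signature are illustrative assumptions, not ART's actual definitions.

    #include <cstddef>

    // Hypothetical subset of the quick entrypoints.
    enum QuickEntrypointEnum {
      kQuickAllocObject,
      kQuickCheckCast,
      kQuickThrowNullPointer,
    };

    // Translate an entrypoint enum value into the byte offset of its slot in a
    // per-thread entrypoint table.  The template parameter carries the pointer
    // size, so callers no longer pass raw thread offsets around.
    template <size_t kPointerSize>
    constexpr size_t GetThreadOffset(QuickEntrypointEnum ep, size_t table_start) {
      return table_start + static_cast<size_t>(ep) * kPointerSize;
    }

    // Example: offset of kQuickCheckCast for a 64-bit target whose entrypoint
    // table (hypothetically) starts 128 bytes into the Thread object.
    constexpr size_t kCheckCastOffset64 = GetThreadOffset<8>(kQuickCheckCast, 128);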
This patch fixes the following art test failures for Mips:
003-omnibus-opcodes
030-bad-finalizer
041-narrowing
059-finalizer-throw
Change-Id: I4e0e9ff75f949c92059dd6b8d579450dc15f4467
Signed-off-by: Douglas Leung <douglas@mips.com>
Not sufficiently tested for 64-bit targets, but should be
fairly close.
A significant amount of refactoring could still be done (in
later CLs).
With this change we are not making any changes to the vmap
scheme. As a result, it is a requirement that if a vreg
is promoted to both a 32-bit view and the low half of a
64-bit view it must share the same physical register. We
may change this restriction later on to allow for more flexibility
for 32-bit Arm.
For example, if v4, v5, v4/v5 and v5/v6 are all hot enough to
promote, we'd end up with something like:
v4 (as an int) -> r10
v4/v5 (as a long) -> r10
v5 (as an int) -> r11
v5/v6 (as a long) -> r11
Fix a couple of ARM64 bugs on the way...
Change-Id: I6a152b9c164d9f1a053622266e165428045362f3
This patch enables quick mode for Mips and allows the emulator to boot.
However, the emulator is still not 100% functional; it still has problems
launching some apps.
Change-Id: Id46a39a649a2fd431a9f13b06ecf34cbd1d20930
Signed-off-by: Douglas Leung <douglas@mips.com>
Reduce LIR memory usage by holding masks in the LIR by pointer
rather than by value, and by using pre-defined const masks
for the common cases, allocating very few on the arena.
Change-Id: I0f6d27ef6867acd157184c8c74f9612cebfe6c16
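A sketch of the mask-by-pointer layout, using a made-up ResourceMask and LIR definition rather than the actual Quick structures.

    #include <cstdint>

    // Hypothetical resource mask: which registers/flags an instruction touches.
    struct ResourceMask {
      uint64_t bits;
    };

    // Pre-defined masks for the common cases live in read-only storage; most
    // LIR nodes simply point at them, and only uncommon masks are built on the
    // arena.
    constexpr ResourceMask kNoResources{0};
    constexpr ResourceMask kAllResources{~uint64_t{0}};

    struct LIR {
      int opcode = 0;
      // Held by pointer rather than by value, which shrinks every LIR node.
      const ResourceMask* use_mask = &kNoResources;
      const ResourceMask* def_mask = &kNoResources;
    };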
This patch shows our efforts on resolving the following ART limitations:
- passing "float"/"double" arguments via FPRs
- passing "long" arguments via a single GPR, not a pair
- passing more than 3 arguments via GPRs.
Work done:
- Extended the SpecialTargetRegister enum with kARG4, kARG5, fARG4..fARG7.
- Created an initial LoadArgRegs/GenDalvikX/FlushIns version in X86Mir2Lir.
- Support for an unlimited number of long/double/float arguments.
- Refactored (v2).
Change-Id: I5deadd320b4341d5b2f50ba6fa4a98031abc3902
Signed-off-by: Serguei Katkov <serguei.i.katkov@intel.com>
Signed-off-by: Dmitry Petrochenko <dmitry.petrochenko@intel.com>
Signed-off-by: Chao-ying Fu <chao-ying.fu@intel.com>
Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
Create a helper template class ArrayRef and use it instead
of std::vector<> for register pools in target_<arch>.cc to
avoid these heap allocations during program startup.
Change-Id: I4ab0205af9c1d28a239c0a105fcdc60ba800a70a
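A minimal sketch of a non-owning ArrayRef-style view; only the idea from the message (a pointer plus a length over a statically defined pool) is shown, and the real helper's interface may differ.

    #include <cstddef>

    // Non-owning view over a contiguous array: a pointer plus a length.
    template <typename T>
    class ArrayRef {
     public:
      constexpr ArrayRef() : data_(nullptr), size_(0u) {}
      constexpr ArrayRef(T* data, size_t size) : data_(data), size_(size) {}
      template <size_t N>
      constexpr ArrayRef(T (&array)[N]) : data_(array), size_(N) {}

      T* begin() const { return data_; }
      T* end() const { return data_ + size_; }
      size_t size() const { return size_; }
      T& operator[](size_t i) const { return data_[i]; }

     private:
      T* data_;
      size_t size_;
    };

    // Register pools can now be plain static arrays wrapped at no cost,
    // instead of std::vector<> instances populated during startup.
    static int core_regs[] = {0, 1, 2, 3, 4, 5};
    static ArrayRef<int> core_reg_pool(core_regs);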
For 32-bit targets, object references are 32 bits wide both in
Dalvik virtual registers and in core physical registers. Because of
this, object references and non-floating point values were both
handled as if they had the same register class (kCoreReg).
However, for 64-bit systems, references are 32 bits in Dalvik vregs, but
64 bits in physical registers. Although the same underlying physical
core registers will still be used for object reference and non-float
values, different register class views will be used to represent them.
For example, an object reference in arm64 might be held in x3 at some
point, while the same underlying physical register, w3, would be used
to hold a 32-bit int.
This CL breaks apart the handling of object reference and non-float values
to allow the proper register class (or register view) to be used. A
new register class, kRefReg, is introduced which will map to a 32-bit
core register on 32-bit targets, and 64-bit core registers on 64-bit
targets. From this point on, object references should be allocated
registers in the kRefReg class rather than kCoreReg.
Change-Id: I6166827daa8a0ea3af326940d56a6a14874f5810
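A small sketch of the point being made: the physical width backing a register class depends on the target, and references get their own class so this can differ from kCoreReg. The enum and helper below are illustrative only.

    // Hypothetical register-class views.
    enum RegisterClass { kCoreReg, kRefReg, kFPReg };

    // Width, in bits, of the physical register backing a class.  References
    // are always 32 bits in a Dalvik vreg, but the physical view that holds
    // them follows the target's pointer size.
    int PhysicalWidthBits(RegisterClass rc, bool target_is_64bit) {
      switch (rc) {
        case kRefReg:  return target_is_64bit ? 64 : 32;
        case kCoreReg: return 32;
        case kFPReg:   return 32;  // single-precision view in this sketch
      }
      return 0;
    }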
Add a 64-bit temp register allocation path. The recent physical
register handling rework supports multiple views of the same
physical register (or, such as for Arm's float/double regs,
different parts of the same physical register).
This CL adds a 64-bit core register view for 64-bit targets. In
short, each core register will have a 64-bit name, and a 32-bit
name. The different views will be kept in separate register pools,
but aliasing will be tracked. The core temp register allocation
routines will be largely identical - except for 32-bit targets,
which will continue to use pairs of 32-bit core registers for holding
long values.
Change-Id: I8f118e845eac7903ad8b6dcec1952f185023c053
It turns out that the register pool sanity checker was not
working as expected, leaving some inconsistencies unreported.
This could result in "out of registers" failures, as well
as other more subtle problems.
This CL fixes the sanity checker, adds a lot more checks, and cleans
up the previously undetected episodes of insanity.
Cherry-pick of internal change 468162
Change-Id: Id2da97e99105a4c272c5fd256205a94b904ecea8
We do not emit barriers on non-SMP systems. But on ARM we have
places that need to execute conditionally, which is done through
an IT instruction, and the set of instructions guarded by that IT
thus changes between SMP and non-SMP systems.
To cleanly approach this, change the API so that GenMemBarrier
returns whether it generated an instruction. ARM has to query
the result and update any dependent IT.
Throw a build-system error if TARGET_CPU_SMP is not set.
Fix runtime/Android.mk to work with new multilib host.
Bug: 14989275
Change-Id: I9e611b770e8a1cd4ca19367d7dae0573ec08dc61
This duplicates all methods with ThreadOffset parameters, so that
both ThreadOffset<4> and ThreadOffset<8> can be handled. Dynamic
checks against the compilation unit's instruction set determine
which pointer size to use and therefore which methods to call.
Methods with unsupported pointer sizes should fatally fail, as
this indicates an issue during method selection.
Change-Id: Ifdb445b3732d3dc5e6a220db57374a55e91e1bf6
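A sketch of the duplication scheme under stated assumptions: one overload per pointer size, selected by a dynamic check on the instruction set. The type and function names below are placeholders, not the real Mir2Lir methods.

    #include <cstddef>
    #include <cstdio>

    enum class InstructionSet { kArm, kArm64, kX86, kX86_64, kMips };

    template <size_t kPointerSize>
    struct ThreadOffset {
      explicit constexpr ThreadOffset(size_t v) : value(v) {}
      size_t value;
    };

    // One helper per pointer size; reaching the wrong one would indicate a bug
    // in method selection.
    void CallHelper(ThreadOffset<4> offset) { std::printf("32-bit helper @%zu\n", offset.value); }
    void CallHelper(ThreadOffset<8> offset) { std::printf("64-bit helper @%zu\n", offset.value); }

    // The dynamic check against the compilation unit's instruction set decides
    // which pointer size, and therefore which overload, is used.
    void CallHelperFor(InstructionSet isa, size_t raw_offset) {
      const bool is_64bit = (isa == InstructionSet::kArm64 || isa == InstructionSet::kX86_64);
      if (is_64bit) {
        CallHelper(ThreadOffset<8>(raw_offset));
      } else {
        CallHelper(ThreadOffset<4>(raw_offset));
      }
    }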
Bug: 14112919
Change-Id: I79316f438dd3adea9b2653ffc968af83671ad282
Significant refactoring of register handling to unify usage across
all targets & 32/64 backends.
Reworked RegStorage encoding to allow expanded use of
x86 xmm registers; removed vector registers as a separate
register type. Reworked RegisterInfo to describe aliased
physical registers. Eliminated quite a bit of target-specific code
and generalized common code.
Use of RegStorage instead of int for registers now propagated down
to the NewLIRx() level. In future CLs, the NewLIRx() routines will
be replaced with versions that are explicit about what kind of
operand they expect (RegStorage, displacement, etc.). The goal
is to eventually use RegStorage all the way to the assembly phase.
TBD: MIPS needs verification.
TBD: Re-enable liveness tracking.
Change-Id: I388c006d5fa9b3ea72db4e37a19ce257f2a15964
This CL replaces the typical use of LoadWord/StoreWord
utilities (which, in practice, were 32-bit load/store) in
favor of a new set that make the size explicit. We now have:
LoadWordDisp/StoreWordDisp:
32 or 64 depending on target. Load or store the natural
word size. Expect this to be used infrequently - generally
when we know we're dealing with a native pointer or flushed
register not holding a Dalvik value (Dalvik values will flush
to home location sizes based on Dalvik, rather than the target).
Load32Disp/Store32Disp:
Load or store 32 bits, regardless of target.
Load64Disp/Store64Disp:
Load or store 64 bits, regardless of target.
LoadRefDisp:
Load a 32-bit compressed reference, and expand it to the
natural word size in the target register.
StoreRefDisp:
Compress a reference held in a register of the natural word
size and store it as a 32-bit compressed reference.
Change-Id: I50fcbc8684476abd9527777ee7c152c61ba41c6f
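A sketch of the wrapper scheme: one base routine that takes the size explicitly, plus thin wrappers that document intent at each call site. The OpSize values and signatures are illustrative, not the exact Quick utility interfaces.

    // Minimal stand-ins for the sketch.
    struct RegStorage { int reg; };
    struct LIR {};
    enum OpSize { k32, k64, kWord /* natural word of the target */, kReference };

    // Single underlying implementation that takes the size explicitly.
    LIR* LoadBaseDisp(RegStorage base, int displacement, RegStorage dest, OpSize size) {
      (void)base; (void)displacement; (void)dest; (void)size;
      return nullptr;  // stub: a real backend emits the load here
    }

    // Thin wrappers that make every call site state its intent.
    LIR* Load32Disp(RegStorage base, int disp, RegStorage dest) {
      return LoadBaseDisp(base, disp, dest, k32);        // always 32 bits
    }
    LIR* Load64Disp(RegStorage base, int disp, RegStorage dest) {
      return LoadBaseDisp(base, disp, dest, k64);        // always 64 bits
    }
    LIR* LoadWordDisp(RegStorage base, int disp, RegStorage dest) {
      return LoadBaseDisp(base, disp, dest, kWord);      // 32 or 64, per target
    }
    LIR* LoadRefDisp(RegStorage base, int disp, RegStorage dest) {
      return LoadBaseDisp(base, disp, dest, kReference); // compressed ref, widened
    }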
MIPS architecture includes internal registers HI and LO.
Similar to condition codes in other architectures, these internal
resources must be accounted for during instruction scheduling.
Previously, the Quick backend for MIPS dealt with them by defining
rHI and rLO pseudo registers - treating them as actual registers for
def/use masks. This CL changes the handling of these resources to
be in line with how condition codes are used elsewhere - leaving
register definitions to be used for registers.
Change-Id: Idcd77f3107b0c9b081ad05b1aab663fb9f41492d
Begin a fuller implementation of x86-64 REX prefixes.
Doesn't implement 64-bit thread offset support for the JNI compiler.
Change-Id: If9af2f08a1833c21ddb4b4077f9b03add1a05147
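For context, a REX prefix byte is composed as in the sketch below; this follows the x86-64 encoding itself rather than ART's assembler code.

    #include <cstdint>

    // REX is 0100WRXB: W selects 64-bit operand size, and R/X/B extend the
    // ModRM.reg, SIB.index and ModRM.rm/SIB.base fields so that r8-r15 (and
    // xmm8-xmm15) become reachable.
    uint8_t MakeRexPrefix(bool w, int reg, int index, int base) {
      uint8_t rex = 0x40;
      if (w)          rex |= 0x08;  // REX.W
      if (reg >= 8)   rex |= 0x04;  // REX.R
      if (index >= 8) rex |= 0x02;  // REX.X
      if (base >= 8)  rex |= 0x01;  // REX.B
      return rex;
    }

    // Example: a 64-bit operation with destination r9 and base r13 needs
    // REX.W | REX.R | REX.B, i.e. 0x4D.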
Ready for review.
Continue the process of using RegStorage rather than
ints to hold register values in the top layers of codegen.
Given the huge number of changes in this CL, I've attempted
to minimize the number of actual logic changes. With this
CL, the use of ints for registers has largely been eliminated
except in the lowest utility levels. "Wide" utility routines
have been updated to take a single RegStorage rather than
a pair of ints representing low and high registers.
Upcoming CLs will be smaller and more targeted. My expectations:
o Allocate floating-point double registers as a single double rather
than a pair of single-precision registers.
o Refactor to push code which assumes long and double Dalvik
values are held in a pair of registers down to the
target-dependent layer.
o Clean-up of the xxx_mir.h files to reduce the amount of #defines
for registers. May also do a register renumbering to bring all
of our targets' register naming more consistent. Possibly
introduce a target-independent float/non-float test at the
RegStorage level.
Change-Id: I646de7392bdec94595dd2c6f76e0f1c4331096ff
This adds the ability to use SEGV signals
to throw NullPointerException exceptions from Java code rather
than having the compiler generate explicit comparisons and
branches. It does this by using sigaction to trap SIGSEGV; when triggered,
the handler makes sure the fault is in compiled code and, if so, sets the
return address to the entry point that throws the exception.
It also uses this signal mechanism to determine whether to check
for thread suspension. Instead of the compiler generating calls
to a function to check for threads being suspended, the compiler
will now load indirect via an address in the TLS area. To trigger
a suspend, the contents of this address are changed from something
valid to 0. A SIGSEGV will occur and the handler will check
for a valid instruction pattern before invoking the thread
suspension check code.
If a user program traps SIGSEGV itself it will prevent our signal handler
from working, which will cause a failure in the runtime.
There are two signal handlers at present. You can control them
individually using the -implicit-checks: flag on the runtime
command line. It takes a comma-separated set of strings, each of
which can be one of:
  none      switch everything off
  null      null pointer checks
  suspend   suspend checks
  all       all checks
So to switch only suspend checks on, pass:
-implicit-checks:suspend
There is also -explicit-checks to provide the reverse once
we change the default.
For dalvikvm, pass --runtime-arg -implicit-checks:foo,bar
The default is -implicit-checks:none
There is also a property, 'dalvik.vm.implicit_checks', whose value is the same
string as the command-line option. The default is 'none'. For example, to switch
on null checks using the property:
setprop dalvik.vm.implicit_checks null
It only works for ARM right now.
Bumps OAT version number due to change to Thread offsets.
Bug: 13121132
Change-Id: If743849138162f3c7c44a523247e413785677370
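A minimal sketch of the trapping mechanism on Linux/x86-64: install a SIGSEGV handler with sigaction and redirect the faulting thread to a throw stub by rewriting the program counter in the signal context. The stub name and the near-null test are placeholder assumptions; the real handler (ARM-only at this point) also verifies that the fault came from compiled code.

    #include <csignal>
    #include <cstdint>
    #include <ucontext.h>

    // Placeholder for the runtime stub that builds and throws the NPE.
    extern "C" void art_throw_npe_from_signal() {}

    static void SegvHandler(int sig, siginfo_t* info, void* raw_context) {
      ucontext_t* uc = static_cast<ucontext_t*>(raw_context);
      // Treat only near-null faults as Java null dereferences.
      if (reinterpret_cast<uintptr_t>(info->si_addr) < 4096) {
        // x86-64 Linux: on return from the handler, resume at the throw stub.
        uc->uc_mcontext.gregs[REG_RIP] =
            reinterpret_cast<greg_t>(&art_throw_npe_from_signal);
        return;
      }
      // Not ours: restore the default disposition and re-raise.
      signal(sig, SIG_DFL);
      raise(sig);
    }

    void InstallImplicitCheckHandler() {
      struct sigaction sa = {};
      sa.sa_sigaction = SegvHandler;
      sa.sa_flags = SA_SIGINFO;
      sigemptyset(&sa.sa_mask);
      sigaction(SIGSEGV, &sa, nullptr);
    }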
This saves more than 0.5s of boot.oat compilation time
on Nexus 5.
TODO: Move other stuff to the scoped allocator. This CL
alone increases the peak memory allocation. By reusing
the memory for other parts of the compilation we should
reduce this overhead.
Change-Id: Ifbc00aab4f3afd0000da818dfe68b96713824a08
This reverts commit 86ec520fc8b696ed6f164d7b756009ecd6e4aace.
Ready. Fixed the original typo, plus some mechanical changes
for rebasing.
Still needs additional testing, but the problem with the original
CL appears to have been a typo in the definition of the x86
double return template RegLocation.
Change-Id: I828c721f91d9b2546ef008c6ea81f40756305891
This reverts commit 2c1ed456dcdb027d097825dd98dbe48c71599b6c.
Change-Id: If88d69ba88e0af0b407ff2240566d7e4545d8a99
For historical reasons, the Quick backend found it convenient
to consider all 64-bit Dalvik values held in registers
to be contained in a pair of 32-bit registers. Though this
worked well for ARM (with double-precision registers also
treated as a pair of 32-bit single-precision registers) it doesn't
play well with other targets. And, it is somewhat problematic
for 64-bit architectures.
This is the first of several CLs that will rework the way the
Quick backend deals with physical registers. The goal is to
eliminate the "64-bit value backed with 32-bit register pair"
requirement from the target-independent portions of the backend
and support 64-bit registers throughout.
The key RegLocation struct, which describes the location of
Dalvik virtual register & register pairs, previously contained
fields for high and low physical registers. The low_reg and
high_reg fields are being replaced with a new type: RegStorage.
There will be a single instance of RegStorage for each RegLocation.
Note that RegStorage does not increase the space used. It is
16 bits wide, the same as the sum of the 8-bit low_reg and
high_reg fields.
At a target-independent level, it will describe whether the physical
register storage associated with the Dalvik value is a single 32
bit, single 64 bit, pair of 32 bit or vector. The actual register
number encoding is left to the target-dependent code layer.
Because physical register handling is pervasive throughout the
backend, this restructuring necessarily involves large CLs with
lots of changes. I'm going to roll these out in stages, and
attempt to segregate the CLs with largely mechanical changes from
those which restructure or rework the logic.
This CL is of the mechanical change variety - it replaces low_reg
and high_reg from RegLocation and introduces RegStorage. It also
includes a lot of new code (such as many calls to GetReg())
that should go away in upcoming CLs.
The tentative plan for the subsequent CLs is:
o Rework standard register utilities such as AllocReg() and
FreeReg() to use RegStorage instead of ints.
o Rework the target-independent GenXXX, OpXXX, LoadValue,
StoreValue, etc. routines to take RegStorage rather than
int register encodings.
o Take advantage of the vector representation and eliminate
the current vector field in RegLocation.
o Replace the "wide" variants of codegen utilities that take
low_reg/high_reg pairs with versions that use RegStorage.
o Add 64-bit register target independent codegen utilities
where possible, and where not virtualize with 32-bit general
register and 64-bit general register variants in the target
dependent layer.
o Expand/rework the LIR def/use flags to allow for more registers
(currently, we lose out on 16 MIPS floating point regs as
well as ARM's D16..D31 for lack of space in the masks).
o [Possibly] move the float/non-float determination of a register
from the target-dependent encoding to RegStorage. In other
words, replace IsFpReg(register_encoding_bits).
At the end of the day, all code in the target independent layer
should be using RegStorage, as should much of the target dependent
layer. Ideally, we won't be using the physical register number
encoding extracted from RegStorage (i.e. GetReg()) until the
NewLIRx() layer.
Change-Id: Idc5c741478f720bdd1d7123b94e4288be5ce52cb
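A minimal sketch of the idea behind RegStorage: a 16-bit value holding a shape plus a target-defined register number. The field layout below is invented for illustration and is not ART's actual encoding.

    #include <cstdint>

    // 16-bit descriptor for how a Dalvik value is held in physical registers.
    class RegStorage {
     public:
      enum Shape : uint16_t {
        kInvalid   = 0x0000,
        k32BitSolo = 0x1000,  // single 32-bit register
        k64BitSolo = 0x2000,  // single 64-bit register
        k64BitPair = 0x3000,  // pair of 32-bit registers
        kShapeMask = 0xf000,
      };

      RegStorage(Shape shape, int reg)
          : bits_(static_cast<uint16_t>(shape | (reg & 0x0fff))) {}

      bool Is64Bit() const {
        return (bits_ & kShapeMask) == k64BitSolo || (bits_ & kShapeMask) == k64BitPair;
      }
      bool IsPair() const { return (bits_ & kShapeMask) == k64BitPair; }
      // The target-dependent layer interprets the low bits as its own
      // register number(s); for a pair they would encode both halves.
      int GetRawBits() const { return bits_ & ~kShapeMask; }

     private:
      uint16_t bits_;  // same footprint as the old 8-bit low_reg + high_reg
    };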
Moved GenSpecialCase from being ARM specific to common code to allow
it to be used by x86 quick as well.
Change-Id: I728733e8f4c4da99af6091ef77e5c76ae0fee850
Signed-off-by: Razvan A Lupusoru <razvan.a.lupusoru@intel.com>
Also correct header file inclusion ordering.
Change-Id: I8fb99e80cf1487e8b2278d4c1d110d14ed18c086
Use snprintf rather than sprintf to avoid Werror failures.
Work around an annotalysis bug when compiling -O0.
Change-Id: Ie7e0a70dbceea5fa85f98262b91bcdbd74fdef1c
Change-Id: I6a72703a11985e2753fa9b4520c375a164301433
Change-Id: Idf7fe85e1293453a8ad862ff2380dcd5db4e3a39
This CL re-instates the select pattern optimization disabled by
CL 374310, and fixes the underlying problem: improper handling of
the kPseudoBarrier LIR opcode. The bug was introduced in the
recent assembler restructuring. In short, LIR pseudo opcodes (which
have values < 0) should always have size 0 - and thus cause no
bits to be emitted during assembly. In this case, bad logic caused
us to set the size of a kPseudoBarrier opcode via lookup through the
EncodingMap.
Because all pseudo ops are < 0, this meant we did an array underflow
load, picking up whatever garbage was located before the EncodingMap.
This explains why this error showed up recently - we'd previously just
gotten a lucky layout.
This CL corrects the faulty logic, and adds DCHECKs to uses of
the EncodingMap to ensure that we don't try to access w/ a
pseudo op. Additionally, the existing is_pseudo_op() macro is
replaced with IsPseudoLirOp(), named similar to the existing
IsPseudoMirOp().
Change-Id: I46761a0275a923d85b545664cadf052e1ab120dc
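A sketch of the fixed lookup logic: pseudo opcodes are negative, take no space, and must never index the EncodingMap. The table contents and names here are toy stand-ins for the per-target encoding tables.

    #include <cassert>
    #include <cstddef>

    struct EncodingEntry {
      size_t size;  // bytes emitted; real entries also describe the bit layout
    };

    // Toy per-target encoding table, indexed by real (non-negative) opcodes.
    const EncodingEntry EncodingMap[] = { {4}, {4}, {2} };

    // LIR pseudo opcodes are negative and emit no bits during assembly.
    inline bool IsPseudoLirOp(int opcode) { return opcode < 0; }

    size_t GetInsnSize(int opcode) {
      // Indexing the map with a negative opcode would read before the start of
      // the array - the original bug.  Pseudo ops always have size 0.
      if (IsPseudoLirOp(opcode)) {
        return 0;
      }
      assert(static_cast<size_t>(opcode) < sizeof(EncodingMap) / sizeof(EncodingMap[0]));
      return EncodingMap[opcode].size;
    }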
Not as much compile-time gain from reworking the assembly phase as I'd
hoped, but still worthwhile. Should see ~2% improvement thanks to
the assembly rework. On the other hand, expect some huge gains for some
application thanks to better detection of large machine-generated init
methods. Thinkfree shows a 25% improvement.
The major assembly change was to thread the LIR nodes that
require fixup into a fixup chain. Only those are processed during the
final assembly pass(es). This doesn't help for methods which only
require a single pass to assemble, but does speed up the larger methods
which required multiple assembly passes.
Also replaced the block_map_ basic block lookup table (which contained
space for a BasicBlock* for each dex instruction unit) with a block id
map - cutting its space requirements by half in a 32-bit pointer
environment.
Changes:
o Reduce size of LIR struct by 12.5% (one of the big memory users)
o Repurpose the use/def portion of the LIR after optimization complete.
o Encode instruction bits to LIR
o Thread LIR nodes requiring pc fixup
o Change follow-on assembly passes to only consider fixup LIRs
o Switch on pc-rel fixup kind
o Fast-path for small methods - single pass assembly
o Avoid using cb[n]z for null checks (almost always exceed displacement)
o Improve detection of large initialization methods.
o Rework def/use flag setup.
o Remove a sequential search from FindBlock using lookup table of 16-bit
block ids rather than full block pointers.
o Eliminate pcRelFixup and use fixup kind instead.
o Add check for 16-bit overflow on dex offset.
Change-Id: I4c6615f83fed46f84629ad6cfe4237205a9562b4
This CL yields about a 4% improvement in the compilation phase
of dex2oat (single-threaded; multi-threaded compilation is
more difficult to accurately measure). The register utilities
could stand to be completely rewritten, but this gets most of the
easy benefit.
Next up: the assembly phase.
Change-Id: Ife5a474e9b1a6d9e501e888dda6749d34eb77e96
Before we were creating arenas for each method. The issue with doing this
is that we needed to memset each memory allocation. This can be improved
if you start out with arenas that contain all zeroed memory and recycle
them for each method. When you give memory back to the arena pool you do
a single memset to zero out all of the memory that you used.
Always inlined the fast path of the allocation code.
Removed the "zero" parameter since the new arena allocator always returns
zeroed memory.
Host dex2oat time on target oat apks (2 samples each).
Before:
real 1m11.958s
user 4m34.020s
sys 1m28.570s
After:
real 1m9.690s
user 4m17.670s
sys 1m23.960s
Target device dex2oat samples (Mako, Thinkfree.apk):
Without new arena allocator:
0m26.47s real 0m54.60s user 0m25.85s system
0m25.91s real 0m54.39s user 0m26.69s system
0m26.61s real 0m53.77s user 0m27.35s system
0m26.33s real 0m54.90s user 0m25.30s system
0m26.34s real 0m53.94s user 0m27.23s system
With new arena allocator:
0m25.02s real 0m54.46s user 0m19.94s system
0m25.17s real 0m55.06s user 0m20.72s system
0m24.85s real 0m55.14s user 0m19.30s system
0m24.59s real 0m54.02s user 0m20.07s system
0m25.06s real 0m55.00s user 0m20.42s system
Correctness of Thinkfree.apk.oat verified by diffing both of the oat files.
Change-Id: I5ff7b85ffe86c57d3434294ca7a621a695bf57a9
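A minimal sketch of the recycle-and-zero scheme: arenas come out of the pool already zeroed, allocation is a bump of a pointer over pre-zeroed memory, and the single memset happens when an arena is returned. Sizes and names are illustrative.

    #include <cstddef>
    #include <cstring>
    #include <vector>

    class Arena {
     public:
      explicit Arena(size_t size) : storage_(size, 0u), used_(0u) {}

      // Fast path: bump-pointer allocation from memory that is already zero,
      // so callers never need their own memset.
      void* Alloc(size_t bytes) {
        bytes = (bytes + 7u) & ~size_t{7u};  // 8-byte alignment
        if (used_ + bytes > storage_.size()) {
          return nullptr;  // sketch only: the real allocator chains arenas
        }
        void* result = storage_.data() + used_;
        used_ += bytes;
        return result;
      }

      // One memset over the bytes actually used, instead of one per allocation.
      void Reset() {
        std::memset(storage_.data(), 0, used_);
        used_ = 0u;
      }

     private:
      std::vector<unsigned char> storage_;
      size_t used_;
    };

    class ArenaPool {
     public:
      Arena* Obtain() {
        if (free_.empty()) {
          return new Arena(128 * 1024);  // fresh arenas are zeroed by the ctor
        }
        Arena* arena = free_.back();
        free_.pop_back();
        return arena;  // already zeroed by Release()
      }
      void Release(Arena* arena) {
        arena->Reset();
        free_.push_back(arena);
      }
     private:
      std::vector<Arena*> free_;
    };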
Create set of entry points needed for image methods to avoid fix-up at load time:
- interpreter - bridge to interpreter, bridge to compiled code
- jni - dlsym lookup
- quick - resolution and bridge to interpreter
- portable - resolution and bridge to interpreter
Fix the JNI workaround to use the JNI workaround argument-rewriting code that
had been accidentally disabled.
Remove the abstract method error stub; use the interpreter bridge instead.
Consolidate trampoline (previously stub) generation in generic helper.
Simplify trampolines to jump directly into assembly code, keeping the stack crawlable.
Dex: replace use of int with ThreadOffset for values that are thread offsets.
Tidy entry point routines between interpreter, jni, quick and portable.
Change-Id: I52a7c2bbb1b7e0ff8a3c3100b774212309d0828e
(cherry picked from commit 848871b4d8481229c32e0d048a9856e5a9a17ef9)
Change-Id: Iae286862c85fb8fd8901eae1204cd6d271d69496
Change-Id: I730bd87b476bfa36e93b42e816ef358006b69ba5
Change-Id: I456fc8d80371d6dfc07e6d109b7f478c25602b65