-rw-r--r-- | docs/dalvik-bytecode.css | 165
-rw-r--r-- | docs/dalvik-bytecode.html | 1551
-rw-r--r-- | docs/dex-format.css | 387
-rw-r--r-- | docs/dex-format.html | 3049
-rw-r--r-- | docs/instruction-formats.css | 129
-rw-r--r-- | docs/instruction-formats.html | 461
6 files changed, 0 insertions, 5742 deletions
diff --git a/docs/dalvik-bytecode.css b/docs/dalvik-bytecode.css deleted file mode 100644 index e4a5caa3c..000000000 --- a/docs/dalvik-bytecode.css +++ /dev/null @@ -1,165 +0,0 @@ -h1 { - font-family: serif; - color: #222266; -} - -h2 { - font-family: serif; - border-top-style: solid; - border-top-width: 2px; - border-color: #ccccdd; - padding-top: 12px; - margin-top: 48px; - margin-bottom: 2px; - color: #222266; -} - -@media print { - table { - font-size: 8pt; - } -} - -@media screen { - table { - font-size: 10pt; - } -} - - -/* general for all tables */ - -table { - border-collapse: collapse; - margin-top: 12px; -} - -table th { - font-family: sans-serif; - background: #aabbff; -} - -table td { - font-family: sans-serif; - border-top-style: solid; - border-bottom-style: solid; - border-width: 1px; - border-color: #aaaaff; - padding-top: 4px; - padding-bottom: 4px; - padding-left: 4px; - padding-right: 6px; - background: #eeeeff; -} - -table td p { - margin-top: 4pt; - margin-bottom: 0pt; -} - - - -/* opcodes table */ - -table.instruc { - margin-top: 24px; - margin-bottom: 24px; - margin-left: 48px; - margin-right: 48px; -} - -table.instruc td { - font-family: sans-serif; - border-top-style: solid; - border-bottom-style: solid; - border-width: 1px; - padding-top: 4px; - padding-bottom: 4px; - padding-left: 2px; - padding-right: 2px; -} - -table.instruc td:first-child { - font-family: monospace; - font-size: 90%; - vertical-align: top; - width: 12%; -} - -table.instruc td:first-child + td { - font-family: monospace; - font-size: 90%; - vertical-align: top; - width: 23%; -} - -table.instruc td:first-child + td i { - font-family: sans-serif; - font-size: 90%; -} - -table.instruc td:first-child + td + td { - vertical-align: top; - width: 28%; -} - -table.instruc td:first-child + td + td + td { - vertical-align: top; - width: 37%; -} - - -/* supplemental opcode format table */ - -table.supplement { - margin-top: 24px; - margin-bottom: 24px; - margin-left: 48px; - 
margin-right: 48px; -} - -table.supplement td:first-child { - font-family: monospace; - vertical-align: top; - width: 20%; -} - -table.supplement td:first-child + td { - font-family: monospace; - vertical-align: top; - width: 20%; -} - -table.supplement td:first-child + td + td { - font-family: sans-serif; - vertical-align: top; - width: 60%; -} - - -/* math details table */ - -table.math { - margin-top: 24px; - margin-bottom: 24px; - margin-left: 48px; - margin-right: 48px; -} - -table.math td:first-child { - font-family: monospace; - vertical-align: top; - width: 10%; -} - -table.math td:first-child + td { - font-family: monospace; - vertical-align: top; - width: 30%; -} - -table.math td:first-child + td + td { - font-family: sans-serif; - vertical-align: top; - width: 60%; -} diff --git a/docs/dalvik-bytecode.html b/docs/dalvik-bytecode.html deleted file mode 100644 index 66c9c4881..000000000 --- a/docs/dalvik-bytecode.html +++ /dev/null @@ -1,1551 +0,0 @@ -<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> - -<html> - -<head> -<title>Bytecode for the Dalvik VM</title> -<link rel=stylesheet href="dalvik-bytecode.css"> -</head> - -<body> - -<h1>Bytecode for the Dalvik VM</h1> -<p>Copyright © 2007 The Android Open Source Project - -<h2>General Design</h2> - -<ul> -<li>The machine model and calling conventions are meant to approximately - imitate common real architectures and C-style calling conventions: - <ul> - <li>The VM is register-based, and frames are fixed in size upon creation. - Each frame consists of a particular number of registers (specified by - the method) as well as any adjunct data needed to execute the method, - such as (but not limited to) the program counter and a reference to the - <code>.dex</code> file that contains the method. - </li> - <li>When used for bit values (such as integers and floating point - numbers), registers are considered 32 bits wide. 
Adjacent register - pairs are used for 64-bit values. There is no alignment requirement - for register pairs. - </li> - <li>When used for object references, registers are considered wide enough - to hold exactly one such reference. - </li> - <li>In terms of bitwise representation, <code>(Object) null == (int) - 0</code>. - </li> - <li>The <i>N</i> arguments to a method land in the last <i>N</i> registers - of the method's invocation frame, in order. Wide arguments consume - two registers. Instance methods are passed a <code>this</code> reference - as their first argument. - </li> - </ul> -<li>The storage unit in the instruction stream is a 16-bit unsigned quantity. - Some bits in some instructions are ignored / must-be-zero. -</li> -<li>Instructions aren't gratuitously limited to a particular type. For - example, instructions that move 32-bit register values without interpretation - don't have to specify whether they are moving ints or floats. -</li> -<li>There are separately enumerated and indexed constant pools for - references to strings, types, fields, and methods. -</li> -<li>Bitwise literal data is represented in-line in the instruction stream.</li> -<li>Because, in practice, it is uncommon for a method to need more than - 16 registers, and because needing more than eight registers <i>is</i> - reasonably common, many instructions are limited to only addressing - the first 16 - registers. When reasonably possible, instructions allow references to - up to the first 256 registers. In addition, some instructions have variants - that allow for much larger register counts, including a pair of catch-all - <code>move</code> instructions that can address registers in the range - <code>v0</code> – <code>v65535</code>. 
- In cases where an instruction variant isn't - available to address a desired register, it is expected that the register - contents get moved from the original register to a low register (before the - operation) and/or moved from a low result register to a high register - (after the operation). -</li> -<li>There are several "pseudo-instructions" that are used to hold - variable-length data payloads, which are referred to by regular - instructions (for example, - <code>fill-array-data</code>). Such instructions must never be - encountered during the normal flow of execution. In addition, the - instructions must be located on even-numbered bytecode offsets (that is, - 4-byte aligned). In order to meet this requirement, dex generation tools - must emit an extra <code>nop</code> instruction as a spacer if such an - instruction would otherwise be unaligned. Finally, though not required, - it is expected that most tools will choose to emit these instructions at - the ends of methods, since otherwise it would likely be the case that - additional instructions would be needed to branch around them. -</li> -<li>When installed on a running system, some instructions may be altered, - changing their format, as an install-time static linking optimization. - This is to allow for faster execution once linkage is known. - See the associated - <a href="instruction-formats.html">instruction formats document</a> - for the suggested variants. The word "suggested" is used advisedly; - it is not mandatory to implement these. 
-</li> -<li>Human-syntax and mnemonics: - <ul> - <li>Dest-then-source ordering for arguments.</li> - <li>Some opcodes have a disambiguating name suffix to indicate the type(s) - they operate on: - <ul> - <li>Type-general 32-bit opcodes are unmarked.</li> - <li>Type-general 64-bit opcodes are suffixed with <code>-wide</code>.</li> - <li>Type-specific opcodes are suffixed with their type (or a - straightforward abbreviation), one of: <code>-boolean</code> - <code>-byte</code> <code>-char</code> <code>-short</code> - <code>-int</code> <code>-long</code> <code>-float</code> - <code>-double</code> <code>-object</code> <code>-string</code> - <code>-class</code> <code>-void</code>.</li> - </ul> - </li> - <li>Some opcodes have a disambiguating suffix to distinguish - otherwise-identical operations that have different instruction layouts - or options. These suffixes are separated from the main names with a slash - ("<code>/</code>") and mainly exist at all to make there be a one-to-one - mapping with static constants in the code that generates and interprets - executables (that is, to reduce ambiguity for humans). - </li> - <li>In the descriptions here, the width of a value (indicating, e.g., the - range of a constant or the number of registers possibly addressed) is - emphasized by the use of a character per four bits of width. 
- </li> - <li>For example, in the instruction - "<code>move-wide/from16 vAA, vBBBB</code>": - <ul> - <li>"<code>move</code>" is the base opcode, indicating the base operation - (move a register's value).</li> - <li>"<code>wide</code>" is the name suffix, indicating that it operates - on wide (64 bit) data.</li> - <li>"<code>from16</code>" is the opcode suffix, indicating a variant - that has a 16-bit register reference as a source.</li> - <li>"<code>vAA</code>" is the destination register (implied by the - operation; again, the rule is that destination arguments always come - first), which must be in the range <code>v0</code> – - <code>v255</code>.</li> - <li>"<code>vBBBB</code>" is the source register, which must be in the - range <code>v0</code> – <code>v65535</code>.</li> - </ul> - </li> - </ul> -</li> -<li>See the <a href="instruction-formats.html">instruction formats - document</a> for more details about the various instruction formats - (listed under "Op & Format") as well as details about the opcode - syntax. -</li> -<li>See the <a href="dex-format.html"><code>.dex</code> file format - document</a> for more details about where the bytecode fits into - the bigger picture. -</li> -</ul> - -<h2>Summary of Instruction Set</h2> - -<table class="instruc"> -<thead> -<tr> - <th>Op & Format</th> - <th>Mnemonic / Syntax</th> - <th>Arguments</th> - <th>Description</th> -</tr> -</thead> -<tbody> -<tr> - <td>00 10x</td> - <td>nop</td> - <td> </td> - <td>Waste cycles. - <p><b>Note:</b> - Data-bearing pseudo-instructions are tagged with this opcode, in which - case the high-order byte of the opcode unit indicates the nature of - the data. 
See "<code>packed-switch-payload</code> Format", - "<code>sparse-switch-payload</code> Format", and - "<code>fill-array-data-payload</code> Format" below.</p> - </td> -</tr> -<tr> - <td>01 12x</td> - <td>move vA, vB</td> - <td><code>A:</code> destination register (4 bits)<br/> - <code>B:</code> source register (4 bits)</td> - <td>Move the contents of one non-object register to another.</td> -</tr> -<tr> - <td>02 22x</td> - <td>move/from16 vAA, vBBBB</td> - <td><code>A:</code> destination register (8 bits)<br/> - <code>B:</code> source register (16 bits)</td> - <td>Move the contents of one non-object register to another.</td> -</tr> -<tr> - <td>03 32x</td> - <td>move/16 vAAAA, vBBBB</td> - <td><code>A:</code> destination register (16 bits)<br/> - <code>B:</code> source register (16 bits)</td> - <td>Move the contents of one non-object register to another.</td> -</tr> -<tr> - <td>04 12x</td> - <td>move-wide vA, vB</td> - <td><code>A:</code> destination register pair (4 bits)<br/> - <code>B:</code> source register pair (4 bits)</td> - <td>Move the contents of one register-pair to another. - <p><b>Note:</b> - It is legal to move from <code>v<i>N</i></code> to either - <code>v<i>N-1</i></code> or <code>v<i>N+1</i></code>, so implementations - must arrange for both halves of a register pair to be read before - anything is written.</p> - </td> -</tr> -<tr> - <td>05 22x</td> - <td>move-wide/from16 vAA, vBBBB</td> - <td><code>A:</code> destination register pair (8 bits)<br/> - <code>B:</code> source register pair (16 bits)</td> - <td>Move the contents of one register-pair to another. - <p><b>Note:</b> - Implementation considerations are the same as <code>move-wide</code>, - above.</p> - </td> -</tr> -<tr> - <td>06 32x</td> - <td>move-wide/16 vAAAA, vBBBB</td> - <td><code>A:</code> destination register pair (16 bits)<br/> - <code>B:</code> source register pair (16 bits)</td> - <td>Move the contents of one register-pair to another. 
- <p><b>Note:</b> - Implementation considerations are the same as <code>move-wide</code>, - above.</p> - </td> -</tr> -<tr> - <td>07 12x</td> - <td>move-object vA, vB</td> - <td><code>A:</code> destination register (4 bits)<br/> - <code>B:</code> source register (4 bits)</td> - <td>Move the contents of one object-bearing register to another.</td> -</tr> -<tr> - <td>08 22x</td> - <td>move-object/from16 vAA, vBBBB</td> - <td><code>A:</code> destination register (8 bits)<br/> - <code>B:</code> source register (16 bits)</td> - <td>Move the contents of one object-bearing register to another.</td> -</tr> -<tr> - <td>09 32x</td> - <td>move-object/16 vAAAA, vBBBB</td> - <td><code>A:</code> destination register (16 bits)<br/> - <code>B:</code> source register (16 bits)</td> - <td>Move the contents of one object-bearing register to another.</td> -</tr> -<tr> - <td>0a 11x</td> - <td>move-result vAA</td> - <td><code>A:</code> destination register (8 bits)</td> - <td>Move the single-word non-object result of the most recent - <code>invoke-<i>kind</i></code> into the indicated register. - This must be done as the instruction immediately after an - <code>invoke-<i>kind</i></code> whose (single-word, non-object) result - is not to be ignored; anywhere else is invalid.</td> -</tr> -<tr> - <td>0b 11x</td> - <td>move-result-wide vAA</td> - <td><code>A:</code> destination register pair (8 bits)</td> - <td>Move the double-word result of the most recent - <code>invoke-<i>kind</i></code> into the indicated register pair. - This must be done as the instruction immediately after an - <code>invoke-<i>kind</i></code> whose (double-word) result - is not to be ignored; anywhere else is invalid.</td> -</tr> -<tr> - <td>0c 11x</td> - <td>move-result-object vAA</td> - <td><code>A:</code> destination register (8 bits)</td> - <td>Move the object result of the most recent <code>invoke-<i>kind</i></code> - into the indicated register. 
This must be done as the instruction - immediately after an <code>invoke-<i>kind</i></code> or - <code>filled-new-array</code> - whose (object) result is not to be ignored; anywhere else is invalid.</td> -</tr> -<tr> - <td>0d 11x</td> - <td>move-exception vAA</td> - <td><code>A:</code> destination register (8 bits)</td> - <td>Save a just-caught exception into the given register. This must - be the first instruction of any exception handler whose caught - exception is not to be ignored, and this instruction must <i>only</i> - ever occur as the first instruction of an exception handler; anywhere - else is invalid.</td> -</tr> -<tr> - <td>0e 10x</td> - <td>return-void</td> - <td> </td> - <td>Return from a <code>void</code> method.</td> -</tr> -<tr> - <td>0f 11x</td> - <td>return vAA</td> - <td><code>A:</code> return value register (8 bits)</td> - <td>Return from a single-width (32-bit) non-object value-returning - method. - </td> -</tr> -<tr> - <td>10 11x</td> - <td>return-wide vAA</td> - <td><code>A:</code> return value register-pair (8 bits)</td> - <td>Return from a double-width (64-bit) value-returning method.</td> -</tr> -<tr> - <td>11 11x</td> - <td>return-object vAA</td> - <td><code>A:</code> return value register (8 bits)</td> - <td>Return from an object-returning method.</td> -</tr> -<tr> - <td>12 11n</td> - <td>const/4 vA, #+B</td> - <td><code>A:</code> destination register (4 bits)<br/> - <code>B:</code> signed int (4 bits)</td> - <td>Move the given literal value (sign-extended to 32 bits) into - the specified register.</td> -</tr> -<tr> - <td>13 21s</td> - <td>const/16 vAA, #+BBBB</td> - <td><code>A:</code> destination register (8 bits)<br/> - <code>B:</code> signed int (16 bits)</td> - <td>Move the given literal value (sign-extended to 32 bits) into - the specified register.</td> -</tr> -<tr> - <td>14 31i</td> - <td>const vAA, #+BBBBBBBB</td> - <td><code>A:</code> destination register (8 bits)<br/> - <code>B:</code> arbitrary 32-bit constant</td> - 
<td>Move the given literal value into the specified register.</td> -</tr> -<tr> - <td>15 21h</td> - <td>const/high16 vAA, #+BBBB0000</td> - <td><code>A:</code> destination register (8 bits)<br/> - <code>B:</code> signed int (16 bits)</td> - <td>Move the given literal value (right-zero-extended to 32 bits) into - the specified register.</td> -</tr> -<tr> - <td>16 21s</td> - <td>const-wide/16 vAA, #+BBBB</td> - <td><code>A:</code> destination register (8 bits)<br/> - <code>B:</code> signed int (16 bits)</td> - <td>Move the given literal value (sign-extended to 64 bits) into - the specified register-pair.</td> -</tr> -<tr> - <td>17 31i</td> - <td>const-wide/32 vAA, #+BBBBBBBB</td> - <td><code>A:</code> destination register (8 bits)<br/> - <code>B:</code> signed int (32 bits)</td> - <td>Move the given literal value (sign-extended to 64 bits) into - the specified register-pair.</td> -</tr> -<tr> - <td>18 51l</td> - <td>const-wide vAA, #+BBBBBBBBBBBBBBBB</td> - <td><code>A:</code> destination register (8 bits)<br/> - <code>B:</code> arbitrary double-width (64-bit) constant</td> - <td>Move the given literal value into - the specified register-pair.</td> -</tr> -<tr> - <td>19 21h</td> - <td>const-wide/high16 vAA, #+BBBB000000000000</td> - <td><code>A:</code> destination register (8 bits)<br/> - <code>B:</code> signed int (16 bits)</td> - <td>Move the given literal value (right-zero-extended to 64 bits) into - the specified register-pair.</td> -</tr> -<tr> - <td>1a 21c</td> - <td>const-string vAA, string@BBBB</td> - <td><code>A:</code> destination register (8 bits)<br/> - <code>B:</code> string index</td> - <td>Move a reference to the string specified by the given index into the - specified register.</td> -</tr> -<tr> - <td>1b 31c</td> - <td>const-string/jumbo vAA, string@BBBBBBBB</td> - <td><code>A:</code> destination register (8 bits)<br/> - <code>B:</code> string index</td> - <td>Move a reference to the string specified by the given index into the - specified 
register.</td> -</tr> -<tr> - <td>1c 21c</td> - <td>const-class vAA, type@BBBB</td> - <td><code>A:</code> destination register (8 bits)<br/> - <code>B:</code> type index</td> - <td>Move a reference to the class specified by the given index into the - specified register. In the case where the indicated type is primitive, - this will store a reference to the primitive type's degenerate - class.</td> -</tr> -<tr> - <td>1d 11x</td> - <td>monitor-enter vAA</td> - <td><code>A:</code> reference-bearing register (8 bits)</td> - <td>Acquire the monitor for the indicated object.</td> -</tr> -<tr> - <td>1e 11x</td> - <td>monitor-exit vAA</td> - <td><code>A:</code> reference-bearing register (8 bits)</td> - <td>Release the monitor for the indicated object. - <p><b>Note:</b> - If this instruction needs to throw an exception, it must do - so as if the pc has already advanced past the instruction. - It may be useful to think of this as the instruction successfully - executing (in a sense), and the exception getting thrown <i>after</i> - the instruction but <i>before</i> the next one gets a chance to - run. This definition makes it possible for a method to use - a monitor cleanup catch-all (e.g., <code>finally</code>) block as - the monitor cleanup for that block itself, as a way to handle the - arbitrary exceptions that might get thrown due to the historical - implementation of <code>Thread.stop()</code>, while still managing - to have proper monitor hygiene.</p> - </td> -</tr> -<tr> - <td>1f 21c</td> - <td>check-cast vAA, type@BBBB</td> - <td><code>A:</code> reference-bearing register (8 bits)<br/> - <code>B:</code> type index (16 bits)</td> - <td>Throw a <code>ClassCastException</code> if the reference in the - given register cannot be cast to the indicated type. 
- <p><b>Note:</b> Since <code>A</code> must always be a reference - (and not a primitive value), this will necessarily fail at runtime - (that is, it will throw an exception) if <code>B</code> refers to a - primitive type.</p> - </td> -</tr> -<tr> - <td>20 22c</td> - <td>instance-of vA, vB, type@CCCC</td> - <td><code>A:</code> destination register (4 bits)<br/> - <code>B:</code> reference-bearing register (4 bits)<br/> - <code>C:</code> type index (16 bits)</td> - <td>Store in the given destination register <code>1</code> - if the indicated reference is an instance of the given type, - or <code>0</code> if not. - <p><b>Note:</b> Since <code>B</code> must always be a reference - (and not a primitive value), this will always result - in <code>0</code> being stored if <code>C</code> refers to a primitive - type.</td> -</tr> -<tr> - <td>21 12x</td> - <td>array-length vA, vB</td> - <td><code>A:</code> destination register (4 bits)<br/> - <code>B:</code> array reference-bearing register (4 bits)</td> - <td>Store in the given destination register the length of the indicated - array, in entries</td> -</tr> -<tr> - <td>22 21c</td> - <td>new-instance vAA, type@BBBB</td> - <td><code>A:</code> destination register (8 bits)<br/> - <code>B:</code> type index</td> - <td>Construct a new instance of the indicated type, storing a - reference to it in the destination. The type must refer to a - non-array class.</td> -</tr> -<tr> - <td>23 22c</td> - <td>new-array vA, vB, type@CCCC</td> - <td><code>A:</code> destination register (8 bits)<br/> - <code>B:</code> size register<br/> - <code>C:</code> type index</td> - <td>Construct a new array of the indicated type and size. 
The type - must be an array type.</td> -</tr> -<tr> - <td>24 35c</td> - <td>filled-new-array {vC, vD, vE, vF, vG}, type@BBBB</td> - <td> - <code>A:</code> array size and argument word count (4 bits)<br/> - <code>B:</code> type index (16 bits)<br/> - <code>C..G:</code> argument registers (4 bits each) - </td> - <td>Construct an array of the given type and size, filling it with the - supplied contents. The type must be an array type. The array's - contents must be single-word (that is, - no arrays of <code>long</code> or <code>double</code>, but reference - types are acceptable). The constructed - instance is stored as a "result" in the same way that the method invocation - instructions store their results, so the constructed instance must - be moved to a register with an immediately subsequent - <code>move-result-object</code> instruction (if it is to be used).</td> -</tr> -<tr> - <td>25 3rc</td> - <td>filled-new-array/range {vCCCC .. vNNNN}, type@BBBB</td> - <td><code>A:</code> array size and argument word count (8 bits)<br/> - <code>B:</code> type index (16 bits)<br/> - <code>C:</code> first argument register (16 bits)<br/> - <code>N = A + C - 1</code></td> - <td>Construct an array of the given type and size, filling it with - the supplied contents. Clarifications and restrictions are the same - as <code>filled-new-array</code>, described above.</td> -</tr> -<tr> - <td>26 31t</td> - <td>fill-array-data vAA, +BBBBBBBB <i>(with supplemental data as specified - below in "<code>fill-array-data-payload</code> Format")</i></td> - <td><code>A:</code> array reference (8 bits)<br/> - <code>B:</code> signed "branch" offset to table data pseudo-instruction - (32 bits) - </td> - <td>Fill the given array with the indicated data. The reference must be - to an array of primitives, and the data table must match it in type and - must contain no more elements than will fit in the array. 
That is, - the array may be larger than the table, and if so, only the initial - elements of the array are set, leaving the remainder alone. - </td> -</tr> -<tr> - <td>27 11x</td> - <td>throw vAA</td> - <td><code>A:</code> exception-bearing register (8 bits)<br/></td> - <td>Throw the indicated exception.</td> -</tr> -<tr> - <td>28 10t</td> - <td>goto +AA</td> - <td><code>A:</code> signed branch offset (8 bits)</td> - <td>Unconditionally jump to the indicated instruction. - <p><b>Note:</b> - The branch offset must not be <code>0</code>. (A spin - loop may be legally constructed either with <code>goto/32</code> or - by including a <code>nop</code> as a target before the branch.)</p> - </td> -</tr> -<tr> - <td>29 20t</td> - <td>goto/16 +AAAA</td> - <td><code>A:</code> signed branch offset (16 bits)<br/></td> - <td>Unconditionally jump to the indicated instruction. - <p><b>Note:</b> - The branch offset must not be <code>0</code>. (A spin - loop may be legally constructed either with <code>goto/32</code> or - by including a <code>nop</code> as a target before the branch.)</p> - </td> -</tr> -<tr> - <td>2a 30t</td> - <td>goto/32 +AAAAAAAA</td> - <td><code>A:</code> signed branch offset (32 bits)<br/></td> - <td>Unconditionally jump to the indicated instruction.</td> -</tr> -<tr> - <td>2b 31t</td> - <td>packed-switch vAA, +BBBBBBBB <i>(with supplemental data as - specified below in "<code>packed-switch-payload</code> Format")</i></td> - <td><code>A:</code> register to test<br/> - <code>B:</code> signed "branch" offset to table data pseudo-instruction - (32 bits) - </td> - <td>Jump to a new instruction based on the value in the - given register, using a table of offsets corresponding to each value - in a particular integral range, or fall through to the next - instruction if there is no match. 
- </td> -</tr> -<tr> - <td>2c 31t</td> - <td>sparse-switch vAA, +BBBBBBBB <i>(with supplemental data as - specified below in "<code>sparse-switch-payload</code> Format")</i></td> - <td><code>A:</code> register to test<br/> - <code>B:</code> signed "branch" offset to table data pseudo-instruction - (32 bits) - </td> - <td>Jump to a new instruction based on the value in the given - register, using an ordered table of value-offset pairs, or fall - through to the next instruction if there is no match. - </td> -</tr> -<tr> - <td>2d..31 23x</td> - <td>cmp<i>kind</i> vAA, vBB, vCC<br/> - 2d: cmpl-float <i>(lt bias)</i><br/> - 2e: cmpg-float <i>(gt bias)</i><br/> - 2f: cmpl-double <i>(lt bias)</i><br/> - 30: cmpg-double <i>(gt bias)</i><br/> - 31: cmp-long - </td> - <td><code>A:</code> destination register (8 bits)<br/> - <code>B:</code> first source register or pair<br/> - <code>C:</code> second source register or pair</td> - <td>Perform the indicated floating point or <code>long</code> comparison, - storing <code>0</code> if the two arguments are equal, <code>1</code> - if the second argument is larger, or <code>-1</code> if the first - argument is larger. The "bias" listed for the floating point operations - indicates how <code>NaN</code> comparisons are treated: "Gt bias" - instructions return <code>1</code> for <code>NaN</code> comparisons, - and "lt bias" instructions return - <code>-1</code>. 
- <p>For example, to check to see if floating point - <code>a < b</code>, then it is advisable to use - <code>cmpg-float</code>; a result of <code>-1</code> indicates that - the test was true, and the other values indicate it was false either - due to a valid comparison or because one or the other values was - <code>NaN</code>.</p> - </td> -</tr> -<tr> - <td>32..37 22t</td> - <td>if-<i>test</i> vA, vB, +CCCC<br/> - 32: if-eq<br/> - 33: if-ne<br/> - 34: if-lt<br/> - 35: if-ge<br/> - 36: if-gt<br/> - 37: if-le<br/> - </td> - <td><code>A:</code> first register to test (4 bits)<br/> - <code>B:</code> second register to test (4 bits)<br/> - <code>C:</code> signed branch offset (16 bits)</td> - <td>Branch to the given destination if the given two registers' values - compare as specified. - <p><b>Note:</b> - The branch offset must not be <code>0</code>. (A spin - loop may be legally constructed either by branching around a - backward <code>goto</code> or by including a <code>nop</code> as - a target before the branch.)</p> - </td> -</tr> -<tr> - <td>38..3d 21t</td> - <td>if-<i>test</i>z vAA, +BBBB<br/> - 38: if-eqz<br/> - 39: if-nez<br/> - 3a: if-ltz<br/> - 3b: if-gez<br/> - 3c: if-gtz<br/> - 3d: if-lez<br/> - </td> - <td><code>A:</code> register to test (8 bits)<br/> - <code>B:</code> signed branch offset (16 bits)</td> - <td>Branch to the given destination if the given register's value compares - with 0 as specified. - <p><b>Note:</b> - The branch offset must not be <code>0</code>. 
(A spin - loop may be legally constructed either by branching around a - backward <code>goto</code> or by including a <code>nop</code> as - a target before the branch.)</p> - </td> -</tr> -<tr> - <td>3e..43 10x</td> - <td><i>(unused)</i></td> - <td> </td> - <td><i>(unused)</i></td> -</tr> -<tr> - <td>44..51 23x</td> - <td><i>arrayop</i> vAA, vBB, vCC<br/> - 44: aget<br/> - 45: aget-wide<br/> - 46: aget-object<br/> - 47: aget-boolean<br/> - 48: aget-byte<br/> - 49: aget-char<br/> - 4a: aget-short<br/> - 4b: aput<br/> - 4c: aput-wide<br/> - 4d: aput-object<br/> - 4e: aput-boolean<br/> - 4f: aput-byte<br/> - 50: aput-char<br/> - 51: aput-short - </td> - <td><code>A:</code> value register or pair; may be source or dest - (8 bits)<br/> - <code>B:</code> array register (8 bits)<br/> - <code>C:</code> index register (8 bits)</td> - <td>Perform the identified array operation at the identified index of - the given array, loading or storing into the value register.</td> -</tr> -<tr> - <td>52..5f 22c</td> - <td>i<i>instanceop</i> vA, vB, field@CCCC<br/> - 52: iget<br/> - 53: iget-wide<br/> - 54: iget-object<br/> - 55: iget-boolean<br/> - 56: iget-byte<br/> - 57: iget-char<br/> - 58: iget-short<br/> - 59: iput<br/> - 5a: iput-wide<br/> - 5b: iput-object<br/> - 5c: iput-boolean<br/> - 5d: iput-byte<br/> - 5e: iput-char<br/> - 5f: iput-short - </td> - <td><code>A:</code> value register or pair; may be source or dest - (4 bits)<br/> - <code>B:</code> object register (4 bits)<br/> - <code>C:</code> instance field reference index (16 bits)</td> - <td>Perform the identified object instance field operation with - the identified field, loading or storing into the value register. 
- <p><b>Note:</b> These opcodes are reasonable candidates for static linking, - altering the field argument to be a more direct offset.</p> - </td> -</tr> -<tr> - <td>60..6d 21c</td> - <td>s<i>staticop</i> vAA, field@BBBB<br/> - 60: sget<br/> - 61: sget-wide<br/> - 62: sget-object<br/> - 63: sget-boolean<br/> - 64: sget-byte<br/> - 65: sget-char<br/> - 66: sget-short<br/> - 67: sput<br/> - 68: sput-wide<br/> - 69: sput-object<br/> - 6a: sput-boolean<br/> - 6b: sput-byte<br/> - 6c: sput-char<br/> - 6d: sput-short - </td> - <td><code>A:</code> value register or pair; may be source or dest - (8 bits)<br/> - <code>B:</code> static field reference index (16 bits)</td> - <td>Perform the identified object static field operation with the identified - static field, loading or storing into the value register. - <p><b>Note:</b> These opcodes are reasonable candidates for static linking, - altering the field argument to be a more direct offset.</p> - </td> -</tr> -<tr> - <td>6e..72 35c</td> - <td>invoke-<i>kind</i> {vC, vD, vE, vF, vG}, meth@BBBB<br/> - 6e: invoke-virtual<br/> - 6f: invoke-super<br/> - 70: invoke-direct<br/> - 71: invoke-static<br/> - 72: invoke-interface - </td> - <td> - <code>A:</code> argument word count (4 bits)<br/> - <code>B:</code> method reference index (16 bits)<br/> - <code>C..G:</code> argument registers (4 bits each) - </td> - <td>Call the indicated method. The result (if any) may be stored - with an appropriate <code>move-result*</code> variant as the immediately - subsequent instruction. - <p><code>invoke-virtual</code> is used to invoke a normal virtual - method (a method that is not <code>private</code>, <code>static</code>, - or <code>final</code>, and is also not a constructor).</p> - <p><code>invoke-super</code> is used to invoke the closest superclass's - virtual method (as opposed to the one with the same <code>method_id</code> - in the calling class). 
The same method restrictions hold as for - <code>invoke-virtual</code>.</p> - <p><code>invoke-direct</code> is used to invoke a non-<code>static</code> - direct method (that is, an instance method that is by its nature - non-overridable, namely either a <code>private</code> instance method - or a constructor).</p> - <p><code>invoke-static</code> is used to invoke a <code>static</code> - method (which is always considered a direct method).</p> - <p><code>invoke-interface</code> is used to invoke an - <code>interface</code> method, that is, on an object whose concrete - class isn't known, using a <code>method_id</code> that refers to - an <code>interface</code>.</p> - <p><b>Note:</b> These opcodes are reasonable candidates for static linking, - altering the method argument to be a more direct offset - (or pair thereof).</p> - </td> -</tr> -<tr> - <td>73 10x</td> - <td><i>(unused)</i></td> - <td> </td> - <td><i>(unused)</i></td> -</tr> -<tr> - <td>74..78 3rc</td> - <td>invoke-<i>kind</i>/range {vCCCC .. vNNNN}, meth@BBBB<br/> - 74: invoke-virtual/range<br/> - 75: invoke-super/range<br/> - 76: invoke-direct/range<br/> - 77: invoke-static/range<br/> - 78: invoke-interface/range - </td> - <td><code>A:</code> argument word count (8 bits)<br/> - <code>B:</code> method reference index (16 bits)<br/> - <code>C:</code> first argument register (16 bits)<br/> - <code>N = A + C - 1</code></td> - <td>Call the indicated method. See first <code>invoke-<i>kind</i></code> - description above for details, caveats, and suggestions. 
- </td> -</tr> -<tr> - <td>79..7a 10x</td> - <td><i>(unused)</i></td> - <td> </td> - <td><i>(unused)</i></td> -</tr> -<tr> - <td>7b..8f 12x</td> - <td><i>unop</i> vA, vB<br/> - 7b: neg-int<br/> - 7c: not-int<br/> - 7d: neg-long<br/> - 7e: not-long<br/> - 7f: neg-float<br/> - 80: neg-double<br/> - 81: int-to-long<br/> - 82: int-to-float<br/> - 83: int-to-double<br/> - 84: long-to-int<br/> - 85: long-to-float<br/> - 86: long-to-double<br/> - 87: float-to-int<br/> - 88: float-to-long<br/> - 89: float-to-double<br/> - 8a: double-to-int<br/> - 8b: double-to-long<br/> - 8c: double-to-float<br/> - 8d: int-to-byte<br/> - 8e: int-to-char<br/> - 8f: int-to-short - </td> - <td><code>A:</code> destination register or pair (4 bits)<br/> - <code>B:</code> source register or pair (4 bits)</td> - <td>Perform the identified unary operation on the source register, - storing the result in the destination register.</td> -</tr> - -<tr> - <td>90..af 23x</td> - <td><i>binop</i> vAA, vBB, vCC<br/> - 90: add-int<br/> - 91: sub-int<br/> - 92: mul-int<br/> - 93: div-int<br/> - 94: rem-int<br/> - 95: and-int<br/> - 96: or-int<br/> - 97: xor-int<br/> - 98: shl-int<br/> - 99: shr-int<br/> - 9a: ushr-int<br/> - 9b: add-long<br/> - 9c: sub-long<br/> - 9d: mul-long<br/> - 9e: div-long<br/> - 9f: rem-long<br/> - a0: and-long<br/> - a1: or-long<br/> - a2: xor-long<br/> - a3: shl-long<br/> - a4: shr-long<br/> - a5: ushr-long<br/> - a6: add-float<br/> - a7: sub-float<br/> - a8: mul-float<br/> - a9: div-float<br/> - aa: rem-float<br/> - ab: add-double<br/> - ac: sub-double<br/> - ad: mul-double<br/> - ae: div-double<br/> - af: rem-double - </td> - <td><code>A:</code> destination register or pair (8 bits)<br/> - <code>B:</code> first source register or pair (8 bits)<br/> - <code>C:</code> second source register or pair (8 bits)</td> - <td>Perform the identified binary operation on the two source registers, - storing the result in the destination register.</td> -</tr> -<tr> - <td>b0..cf 12x</td> - 
<td><i>binop</i>/2addr vA, vB<br/> - b0: add-int/2addr<br/> - b1: sub-int/2addr<br/> - b2: mul-int/2addr<br/> - b3: div-int/2addr<br/> - b4: rem-int/2addr<br/> - b5: and-int/2addr<br/> - b6: or-int/2addr<br/> - b7: xor-int/2addr<br/> - b8: shl-int/2addr<br/> - b9: shr-int/2addr<br/> - ba: ushr-int/2addr<br/> - bb: add-long/2addr<br/> - bc: sub-long/2addr<br/> - bd: mul-long/2addr<br/> - be: div-long/2addr<br/> - bf: rem-long/2addr<br/> - c0: and-long/2addr<br/> - c1: or-long/2addr<br/> - c2: xor-long/2addr<br/> - c3: shl-long/2addr<br/> - c4: shr-long/2addr<br/> - c5: ushr-long/2addr<br/> - c6: add-float/2addr<br/> - c7: sub-float/2addr<br/> - c8: mul-float/2addr<br/> - c9: div-float/2addr<br/> - ca: rem-float/2addr<br/> - cb: add-double/2addr<br/> - cc: sub-double/2addr<br/> - cd: mul-double/2addr<br/> - ce: div-double/2addr<br/> - cf: rem-double/2addr - </td> - <td><code>A:</code> destination and first source register or pair - (4 bits)<br/> - <code>B:</code> second source register or pair (4 bits)</td> - <td>Perform the identified binary operation on the two source registers, - storing the result in the first source register.</td> -</tr> -<tr> - <td>d0..d7 22s</td> - <td><i>binop</i>/lit16 vA, vB, #+CCCC<br/> - d0: add-int/lit16<br/> - d1: rsub-int (reverse subtract)<br/> - d2: mul-int/lit16<br/> - d3: div-int/lit16<br/> - d4: rem-int/lit16<br/> - d5: and-int/lit16<br/> - d6: or-int/lit16<br/> - d7: xor-int/lit16 - </td> - <td><code>A:</code> destination register (4 bits)<br/> - <code>B:</code> source register (4 bits)<br/> - <code>C:</code> signed int constant (16 bits)</td> - <td>Perform the indicated binary op on the indicated register (first - argument) and literal value (second argument), storing the result in - the destination register. - <p><b>Note:</b> - <code>rsub-int</code> does not have a suffix since this version is the - main opcode of its family. Also, see below for details on its semantics. 
- </p> - </td> -</tr> -<tr> - <td>d8..e2 22b</td> - <td><i>binop</i>/lit8 vAA, vBB, #+CC<br/> - d8: add-int/lit8<br/> - d9: rsub-int/lit8<br/> - da: mul-int/lit8<br/> - db: div-int/lit8<br/> - dc: rem-int/lit8<br/> - dd: and-int/lit8<br/> - de: or-int/lit8<br/> - df: xor-int/lit8<br/> - e0: shl-int/lit8<br/> - e1: shr-int/lit8<br/> - e2: ushr-int/lit8 - </td> - <td><code>A:</code> destination register (8 bits)<br/> - <code>B:</code> source register (8 bits)<br/> - <code>C:</code> signed int constant (8 bits)</td> - <td>Perform the indicated binary op on the indicated register (first - argument) and literal value (second argument), storing the result - in the destination register. - <p><b>Note:</b> See below for details on the semantics of - <code>rsub-int</code>.</p> - </td> -</tr> -<tr> - <td>e3..ff 10x</td> - <td><i>(unused)</i></td> - <td> </td> - <td><i>(unused)</i></td> -</tr> -</tbody> -</table> - -<h2><code>packed-switch-payload</code> Format</h2> - -<table class="supplement"> -<thead> -<tr> - <th>Name</th> - <th>Format</th> - <th>Description</th> -</tr> -</thead> -<tbody> -<tr> - <td>ident</td> - <td>ushort = 0x0100</td> - <td>identifying pseudo-opcode</td> -</tr> -<tr> - <td>size</td> - <td>ushort</td> - <td>number of entries in the table</td> -</tr> -<tr> - <td>first_key</td> - <td>int</td> - <td>first (and lowest) switch case value</td> -</tr> -<tr> - <td>targets</td> - <td>int[]</td> - <td>list of <code>size</code> relative branch targets. The targets are - relative to the address of the switch opcode, not of this table. 
- </td> -</tr> -</tbody> -</table> - -<p><b>Note:</b> The total number of code units for an instance of this -table is <code>(size * 2) + 4</code>.</p> - -<h2><code>sparse-switch-payload</code> Format</h2> - -<table class="supplement"> -<thead> -<tr> - <th>Name</th> - <th>Format</th> - <th>Description</th> -</tr> -</thead> -<tbody> -<tr> - <td>ident</td> - <td>ushort = 0x0200</td> - <td>identifying pseudo-opcode</td> -</tr> -<tr> - <td>size</td> - <td>ushort</td> - <td>number of entries in the table</td> -</tr> -<tr> - <td>keys</td> - <td>int[]</td> - <td>list of <code>size</code> key values, sorted low-to-high</td> -</tr> -<tr> - <td>targets</td> - <td>int[]</td> - <td>list of <code>size</code> relative branch targets, each corresponding - to the key value at the same index. The targets are - relative to the address of the switch opcode, not of this table. - </td> -</tr> -</tbody> -</table> - -<p><b>Note:</b> The total number of code units for an instance of this -table is <code>(size * 4) + 2</code>.</p> - -<h2><code>fill-array-data-payload</code> Format</h2> - -<table class="supplement"> -<thead> -<tr> - <th>Name</th> - <th>Format</th> - <th>Description</th> -</tr> -</thead> -<tbody> -<tr> - <td>ident</td> - <td>ushort = 0x0300</td> - <td>identifying pseudo-opcode</td> -</tr> -<tr> - <td>element_width</td> - <td>ushort</td> - <td>number of bytes in each element</td> -</tr> -<tr> - <td>size</td> - <td>uint</td> - <td>number of elements in the table</td> -</tr> -<tr> - <td>data</td> - <td>ubyte[]</td> - <td>data values</td> -</tr> -</tbody> -</table> - -<p><b>Note:</b> The total number of code units for an instance of this -table is <code>(size * element_width + 1) / 2 + 4</code>.</p> - - -<h2>Mathematical Operation Details</h2> - -<p><b>Note:</b> Floating point operations must follow IEEE 754 rules, using -round-to-nearest and gradual underflow, except where stated otherwise.</p> - -<table class="math"> -<thead> -<tr> - <th>Opcode</th> - <th>C Semantics</th> - 
<th>Notes</th> -</tr> -</thead> -<tbody> -<tr> - <td>neg-int</td> - <td>int32 a;<br/> - int32 result = -a; - </td> - <td>Unary twos-complement.</td> -</tr> -<tr> - <td>not-int</td> - <td>int32 a;<br/> - int32 result = ~a; - </td> - <td>Unary ones-complement.</td> -</tr> -<tr> - <td>neg-long</td> - <td>int64 a;<br/> - int64 result = -a; - </td> - <td>Unary twos-complement.</td> -</tr> -<tr> - <td>not-long</td> - <td>int64 a;<br/> - int64 result = ~a; - </td> - <td>Unary ones-complement.</td> -</tr> -<tr> - <td>neg-float</td> - <td>float a;<br/> - float result = -a; - </td> - <td>Floating point negation.</td> -</tr> -<tr> - <td>neg-double</td> - <td>double a;<br/> - double result = -a; - </td> - <td>Floating point negation.</td> -</tr> -<tr> - <td>int-to-long</td> - <td>int32 a;<br/> - int64 result = (int64) a; - </td> - <td>Sign extension of <code>int32</code> into <code>int64</code>.</td> -</tr> -<tr> - <td>int-to-float</td> - <td>int32 a;<br/> - float result = (float) a; - </td> - <td>Conversion of <code>int32</code> to <code>float</code>, using - round-to-nearest. This loses precision for some values. - </td> -</tr> -<tr> - <td>int-to-double</td> - <td>int32 a;<br/> - double result = (double) a; - </td> - <td>Conversion of <code>int32</code> to <code>double</code>.</td> -</tr> -<tr> - <td>long-to-int</td> - <td>int64 a;<br/> - int32 result = (int32) a; - </td> - <td>Truncation of <code>int64</code> into <code>int32</code>.</td> -</tr> -<tr> - <td>long-to-float</td> - <td>int64 a;<br/> - float result = (float) a; - </td> - <td>Conversion of <code>int64</code> to <code>float</code>, using - round-to-nearest. This loses precision for some values. - </td> -</tr> -<tr> - <td>long-to-double</td> - <td>int64 a;<br/> - double result = (double) a; - </td> - <td>Conversion of <code>int64</code> to <code>double</code>, using - round-to-nearest. This loses precision for some values. 
- </td> -</tr> -<tr> - <td>float-to-int</td> - <td>float a;<br/> - int32 result = (int32) a; - </td> - <td>Conversion of <code>float</code> to <code>int32</code>, using - round-toward-zero. <code>NaN</code> and <code>-0.0</code> (negative zero) - convert to the integer <code>0</code>. Infinities and values with - too large a magnitude to be represented get converted to either - <code>0x7fffffff</code> or <code>-0x80000000</code> depending on sign. - </td> -</tr> -<tr> - <td>float-to-long</td> - <td>float a;<br/> - int64 result = (int64) a; - </td> - <td>Conversion of <code>float</code> to <code>int64</code>, using - round-toward-zero. The same special case rules as for - <code>float-to-int</code> apply here, except that out-of-range values - get converted to either <code>0x7fffffffffffffff</code> or - <code>-0x8000000000000000</code> depending on sign. - </td> -</tr> -<tr> - <td>float-to-double</td> - <td>float a;<br/> - double result = (double) a; - </td> - <td>Conversion of <code>float</code> to <code>double</code>, preserving - the value exactly. - </td> -</tr> -<tr> - <td>double-to-int</td> - <td>double a;<br/> - int32 result = (int32) a; - </td> - <td>Conversion of <code>double</code> to <code>int32</code>, using - round-toward-zero. The same special case rules as for - <code>float-to-int</code> apply here. - </td> -</tr> -<tr> - <td>double-to-long</td> - <td>double a;<br/> - int64 result = (int64) a; - </td> - <td>Conversion of <code>double</code> to <code>int64</code>, using - round-toward-zero. The same special case rules as for - <code>float-to-long</code> apply here. - </td> -</tr> -<tr> - <td>double-to-float</td> - <td>double a;<br/> - float result = (float) a; - </td> - <td>Conversion of <code>double</code> to <code>float</code>, using - round-to-nearest. This loses precision for some values. 
- </td> -</tr> -<tr> - <td>int-to-byte</td> - <td>int32 a;<br/> - int32 result = (a << 24) >> 24; - </td> - <td>Truncation of <code>int32</code> to <code>int8</code>, sign - extending the result. - </td> -</tr> -<tr> - <td>int-to-char</td> - <td>int32 a;<br/> - int32 result = a & 0xffff; - </td> - <td>Truncation of <code>int32</code> to <code>uint16</code>, without - sign extension. - </td> -</tr> -<tr> - <td>int-to-short</td> - <td>int32 a;<br/> - int32 result = (a << 16) >> 16; - </td> - <td>Truncation of <code>int32</code> to <code>int16</code>, sign - extending the result. - </td> -</tr> -<tr> - <td>add-int</td> - <td>int32 a, b;<br/> - int32 result = a + b; - </td> - <td>Twos-complement addition.</td> -</tr> -<tr> - <td>sub-int</td> - <td>int32 a, b;<br/> - int32 result = a - b; - </td> - <td>Twos-complement subtraction.</td> -</tr> -<tr> - <td>rsub-int</td> - <td>int32 a, b;<br/> - int32 result = b - a; - </td> - <td>Twos-complement reverse subtraction.</td> -</tr> -<tr> - <td>mul-int</td> - <td>int32 a, b;<br/> - int32 result = a * b; - </td> - <td>Twos-complement multiplication.</td> -</tr> -<tr> - <td>div-int</td> - <td>int32 a, b;<br/> - int32 result = a / b; - </td> - <td>Twos-complement division, rounded towards zero (that is, truncated to - integer). This throws <code>ArithmeticException</code> if - <code>b == 0</code>. - </td> -</tr> -<tr> - <td>rem-int</td> - <td>int32 a, b;<br/> - int32 result = a % b; - </td> - <td>Twos-complement remainder after division. The sign of the result - is the same as that of <code>a</code>, and it is more precisely - defined as <code>result == a - (a / b) * b</code>. This throws - <code>ArithmeticException</code> if <code>b == 0</code>. 
- </td> -</tr> -<tr> - <td>and-int</td> - <td>int32 a, b;<br/> - int32 result = a & b; - </td> - <td>Bitwise AND.</td> -</tr> -<tr> - <td>or-int</td> - <td>int32 a, b;<br/> - int32 result = a | b; - </td> - <td>Bitwise OR.</td> -</tr> -<tr> - <td>xor-int</td> - <td>int32 a, b;<br/> - int32 result = a ^ b; - </td> - <td>Bitwise XOR.</td> -</tr> -<tr> - <td>shl-int</td> - <td>int32 a, b;<br/> - int32 result = a << (b & 0x1f); - </td> - <td>Bitwise shift left (with masked argument).</td> -</tr> -<tr> - <td>shr-int</td> - <td>int32 a, b;<br/> - int32 result = a >> (b & 0x1f); - </td> - <td>Bitwise signed shift right (with masked argument).</td> -</tr> -<tr> - <td>ushr-int</td> - <td>uint32 a, b;<br/> - int32 result = a >> (b & 0x1f); - </td> - <td>Bitwise unsigned shift right (with masked argument).</td> -</tr> -<tr> - <td>add-long</td> - <td>int64 a, b;<br/> - int64 result = a + b; - </td> - <td>Twos-complement addition.</td> -</tr> -<tr> - <td>sub-long</td> - <td>int64 a, b;<br/> - int64 result = a - b; - </td> - <td>Twos-complement subtraction.</td> -</tr> -<tr> - <td>mul-long</td> - <td>int64 a, b;<br/> - int64 result = a * b; - </td> - <td>Twos-complement multiplication.</td> -</tr> -<tr> - <td>div-long</td> - <td>int64 a, b;<br/> - int64 result = a / b; - </td> - <td>Twos-complement division, rounded towards zero (that is, truncated to - integer). This throws <code>ArithmeticException</code> if - <code>b == 0</code>. - </td> -</tr> -<tr> - <td>rem-long</td> - <td>int64 a, b;<br/> - int64 result = a % b; - </td> - <td>Twos-complement remainder after division. The sign of the result - is the same as that of <code>a</code>, and it is more precisely - defined as <code>result == a - (a / b) * b</code>. This throws - <code>ArithmeticException</code> if <code>b == 0</code>. 
- </td> -</tr> -<tr> - <td>and-long</td> - <td>int64 a, b;<br/> - int64 result = a & b; - </td> - <td>Bitwise AND.</td> -</tr> -<tr> - <td>or-long</td> - <td>int64 a, b;<br/> - int64 result = a | b; - </td> - <td>Bitwise OR.</td> -</tr> -<tr> - <td>xor-long</td> - <td>int64 a, b;<br/> - int64 result = a ^ b; - </td> - <td>Bitwise XOR.</td> -</tr> -<tr> - <td>shl-long</td> - <td>int64 a, b;<br/> - int64 result = a << (b & 0x3f); - </td> - <td>Bitwise shift left (with masked argument).</td> -</tr> -<tr> - <td>shr-long</td> - <td>int64 a, b;<br/> - int64 result = a >> (b & 0x3f); - </td> - <td>Bitwise signed shift right (with masked argument).</td> -</tr> -<tr> - <td>ushr-long</td> - <td>uint64 a, b;<br/> - int64 result = a >> (b & 0x3f); - </td> - <td>Bitwise unsigned shift right (with masked argument).</td> -</tr> -<tr> - <td>add-float</td> - <td>float a, b;<br/> - float result = a + b; - </td> - <td>Floating point addition.</td> -</tr> -<tr> - <td>sub-float</td> - <td>float a, b;<br/> - float result = a - b; - </td> - <td>Floating point subtraction.</td> -</tr> -<tr> - <td>mul-float</td> - <td>float a, b;<br/> - float result = a * b; - </td> - <td>Floating point multiplication.</td> -</tr> -<tr> - <td>div-float</td> - <td>float a, b;<br/> - float result = a / b; - </td> - <td>Floating point division.</td> -</tr> -<tr> - <td>rem-float</td> - <td>float a, b;<br/> - float result = a % b; - </td> - <td>Floating point remainder after division. This function is different - than IEEE 754 remainder and is defined as - <code>result == a - roundTowardZero(a / b) * b</code>. 
- </td> -</tr> -<tr> - <td>add-double</td> - <td>double a, b;<br/> - double result = a + b; - </td> - <td>Floating point addition.</td> -</tr> -<tr> - <td>sub-double</td> - <td>double a, b;<br/> - double result = a - b; - </td> - <td>Floating point subtraction.</td> -</tr> -<tr> - <td>mul-double</td> - <td>double a, b;<br/> - double result = a * b; - </td> - <td>Floating point multiplication.</td> -</tr> -<tr> - <td>div-double</td> - <td>double a, b;<br/> - double result = a / b; - </td> - <td>Floating point division.</td> -</tr> -<tr> - <td>rem-double</td> - <td>double a, b;<br/> - double result = a % b; - </td> - <td>Floating point remainder after division. This function is different - than IEEE 754 remainder and is defined as - <code>result == a - roundTowardZero(a / b) * b</code>. - </td> -</tr> -</tbody> -</table> - -</body> -</html> diff --git a/docs/dex-format.css b/docs/dex-format.css deleted file mode 100644 index 153dd4e8d..000000000 --- a/docs/dex-format.css +++ /dev/null @@ -1,387 +0,0 @@ -h1 { - font-family: serif; - border-top-style: solid; - border-top-width: 5px; - padding-top: 9pt; - margin-top: 40pt; - color: #222266; -} - -h1.title { - border: none; -} - -h2 { - font-family: serif; - border-top-style: solid; - border-top-width: 2px; - border-color: #ccccdd; - padding-top: 9pt; - margin-top: 40pt; - margin-bottom: 2pt; - color: #222266; -} - -h3 { - font-family: serif; - font-style: bold; - margin-top: 20pt; - margin-bottom: 2pt; - color: #222266; -} - -h4 { - font-family: serif; - font-style: italic; - margin-top: 2pt; - margin-bottom: 2pt; - color: #666688; -} - -@media print { - table { - font-size: 8pt; - } -} - -@media screen { - table { - font-size: 10pt; - } -} - -pre { - background: #eeeeff; - border-color: #aaaaff; - border-style: solid; - border-width: 1px; - margin-left: 40pt; - margin-right: 40pt; - padding: 6pt; -} - -table { - border-collapse: collapse; - margin-top: 10pt; - margin-left: 40pt; - margin-right: 40pt; -} - -table th { - 
font-family: sans-serif; - background: #aabbff; -} - -table td { - font-family: sans-serif; - border-top-style: solid; - border-bottom-style: solid; - border-width: 1px; - border-color: #aaaaff; - padding-top: 3pt; - padding-bottom: 3pt; - padding-left: 3pt; - padding-right: 4pt; - background: #eeeeff; -} - -table p { - margin-bottom: 0pt; -} - -/* for the bnf syntax sections */ - -table.bnf { - background: #eeeeff; - border-color: #aaaaff; - border-style: solid; - border-width: 1px; - margin-top: 3pt; - margin-bottom: 3pt; - padding-top: 2pt; - padding-bottom: 6pt; - padding-left: 6pt; - padding-right: 6pt; -} - -table.bnf td { - border: none; - padding-left: 6pt; - padding-right: 6pt; - padding-top: 1pt; - padding-bottom: 1pt; -} - -table.bnf td:first-child { - padding-right: 0pt; - width: 8pt; -} - -table.bnf td:first-child td { - padding-left: 0pt; -} - -table.bnf td.def { - padding-top: 6pt; -} - -table.bnf td.bar { - padding-left: 15pt; -} - -table.bnf code { - font-weight: bold; -} - - -/* for the type name guide */ - -table.guide { - margin-top: 20pt; - margin-bottom: 20pt; -} - -table.guide td:first-child { - font-family: monospace; - width: 15%; -} - -table.guide td:first-child + td { - font-family: sans-serif; - width: 85%; -} - - -/* for the LEB128 example tables */ - -table.leb128Bits { - margin-top: 20pt; - margin-bottom: 20pt; -} - -table.leb128Bits td { - border-left: solid #aaaaff 1px; - border-right: solid #aaaaff 1px; -} - -table.leb128Bits td.start1 { - border-left: none; -} - -table.leb128Bits td.start2 { - border-left: solid #000 2px; -} - -table.leb128Bits td.end2 { - border-right: none; -} - -table.leb128 { - margin-top: 20pt; - margin-bottom: 20pt; -} - -table.leb128 td:first-child { - font-family: monospace; - text-align: center; - width: 31%; -} - -table.leb128 td:first-child + td { - font-family: monospace; - text-align: center; - width: 23%; -} - -table.leb128 td:first-child + td + td { - font-family: monospace; - text-align: center; - 
width: 23%; -} - -table.leb128 td:first-child + td + td + td { - font-family: monospace; - text-align: center; - width: 23%; -} - - -/* for the general format tables */ - -table.format { - margin-top: 20pt; - margin-bottom: 20pt; -} - -table.format td:first-child { - font-family: monospace; - width: 20%; -} - -table.format td:first-child + td { - font-family: monospace; - width: 20%; -} - -table.format td:first-child + td + td { - width: 60%; -} - -table.format td i { - font-family: sans-serif; -} - - -/* for the type code table */ - -table.typeCodes { - margin-top: 20pt; - margin-bottom: 20pt; -} - -table.typeCodes td:first-child { - font-family: monospace; - width: 30%; -} - -table.typeCodes td:first-child + td { - font-family: monospace; - width: 30%; -} - -table.typeCodes td:first-child + td + td { - font-family: monospace; - width: 10%; -} - -table.typeCodes td:first-child + td + td + td { - font-family: monospace; - width: 30%; -} - -table.typeCodes td i { - font-family: sans-serif; -} - - -/* for the access flags table */ - -table.accessFlags { - margin-top: 20pt; - margin-bottom: 20pt; -} - -table.accessFlags td:first-child { - font-family: monospace; - width: 10%; -} - -table.accessFlags td:first-child + td { - font-family: monospace; - width: 6%; -} - -table.accessFlags td:first-child + td + td { - width: 28%; -} - -table.accessFlags td:first-child + td + td + td { - width: 28%; -} - -table.accessFlags td:first-child + td + td + td + td { - width: 28%; -} - -table.accessFlags i { - font-family: sans-serif; -} - - -/* for the descriptor table */ - -table.descriptor { - margin-top: 20pt; - margin-bottom: 20pt; -} - -table.descriptor td:first-child { - font-family: monospace; - width: 25%; -} - -table.descriptor td:first-child + td { - font-family: sans-serif; - width: 75%; -} - - -/* for the debug bytecode table */ - -table.debugByteCode { - margin-top: 20pt; - margin-bottom: 20pt; -} - -table.debugByteCode td:first-child { - font-family: monospace; - 
width: 20%; -} - -table.debugByteCode td:first-child + td { - font-family: monospace; - width: 5%; -} - -table.debugByteCode td:first-child + td + td{ - font-family: monospace; - width: 15%; -} - -table.debugByteCode td:first-child + td + td + td { - width: 25%; -} - -table.debugByteCode td:first-child + td + td + td + td { - width: 35%; -} - -table.debugByteCode i { - font-family: sans-serif; -} - - -/* for the encoded value table */ - -table.encodedValue { - margin-top: 20pt; - margin-bottom: 20pt; -} - -table.encodedValue td:first-child { - font-family: monospace; - width: 12%; -} - -table.encodedValue td:first-child + td { - font-family: monospace; - width: 10%; -} - -table.encodedValue td:first-child + td + td { - font-family: monospace; - width: 15%; -} - -table.encodedValue td:first-child + td + td + td { - font-family: monospace; - width: 15%; -} - -table.encodedValue td:first-child + td + td + td + td { - width: 48%; -} - -table.encodedValue td i { - font-family: sans-serif; -} diff --git a/docs/dex-format.html b/docs/dex-format.html deleted file mode 100644 index 81c0b3645..000000000 --- a/docs/dex-format.html +++ /dev/null @@ -1,3049 +0,0 @@ -<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> - -<html> - -<head> -<title>.dex — Dalvik Executable Format</title> -<link rel=stylesheet href="dex-format.css"> -</head> - -<body> - -<h1 class="title"><code>.dex</code> — Dalvik Executable Format</h1> -<p>Copyright © 2007 The Android Open Source Project - -<p>This document describes the layout and contents of <code>.dex</code> -files, which are used to hold a set of class definitions and their associated -adjunct data.</p> - -<h1>Guide To Types</h1> - -<table class="guide"> -<thead> -<tr> - <th>Name</th> - <th>Description</th> -</tr> -</thead> -<tbody> -<tr> - <td>byte</td> - <td>8-bit signed int</td> -</tr> -<tr> - <td>ubyte</td> - <td>8-bit unsigned int</td> -</tr> -<tr> - <td>short</td> - <td>16-bit 
signed int, little-endian</td> -</tr> -<tr> - <td>ushort</td> - <td>16-bit unsigned int, little-endian</td> -</tr> -<tr> - <td>int</td> - <td>32-bit signed int, little-endian</td> -</tr> -<tr> - <td>uint</td> - <td>32-bit unsigned int, little-endian</td> -</tr> -<tr> - <td>long</td> - <td>64-bit signed int, little-endian</td> -</tr> -<tr> - <td>ulong</td> - <td>64-bit unsigned int, little-endian</td> -</tr> -<tr> - <td>sleb128</td> - <td>signed LEB128, variable-length (see below)</td> -</tr> -<tr> - <td>uleb128</td> - <td>unsigned LEB128, variable-length (see below)</td> -</tr> -<tr> - <td>uleb128p1</td> - <td>unsigned LEB128 plus <code>1</code>, variable-length (see below)</td> -</tr> -</tbody> -</table> - -<h3>LEB128</h3> - -<p>LEB128 ("<b>L</b>ittle-<b>E</b>ndian <b>B</b>ase <b>128</b>") is a -variable-length encoding for -arbitrary signed or unsigned integer quantities. The format was -borrowed from the <a href="http://dwarfstd.org/Dwarf3Std.php">DWARF3</a> -specification. In a <code>.dex</code> file, LEB128 is only ever used to -encode 32-bit quantities.</p> - -<p>Each LEB128 encoded value consists of one to five -bytes, which together represent a single 32-bit value. Each -byte has its most significant bit set except for the final byte in the -sequence, which has its most significant bit clear. The remaining -seven bits of each byte are payload, with the least significant seven -bits of the quantity in the first byte, the next seven in the second -byte and so on. In the case of a signed LEB128 (<code>sleb128</code>), -the most significant payload bit of the final byte in the sequence is -sign-extended to produce the final value. In the unsigned case -(<code>uleb128</code>), any bits not explicitly represented are -interpreted as <code>0</code>. 
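As a rough illustration of the decoding rules just described, here is a minimal Python sketch. The function names (`decode_uleb128`, `decode_sleb128`, `decode_uleb128p1`) and the `(value, next_offset)` return convention are assumptions made for this example; they are not part of the format or of any particular implementation.

```python
def decode_uleb128(data, offset=0):
    """Decode an unsigned LEB128 value; return (value, next_offset)."""
    result = 0
    for i in range(5):  # a 32-bit quantity needs at most five bytes
        byte = data[offset + i]
        # Low seven bits are payload, least significant group first.
        result |= (byte & 0x7F) << (7 * i)
        if (byte & 0x80) == 0:  # high bit clear marks the final byte
            return result & 0xFFFFFFFF, offset + i + 1
    raise ValueError("LEB128 sequence longer than five bytes")


def decode_sleb128(data, offset=0):
    """Decode a signed LEB128 value; return (value, next_offset)."""
    result = 0
    shift = 0
    for i in range(5):
        byte = data[offset + i]
        result |= (byte & 0x7F) << shift
        shift += 7
        if (byte & 0x80) == 0:
            # Sign-extend from the top payload bit of the final byte.
            if byte & 0x40:
                result -= 1 << shift
            return result, offset + i + 1
    raise ValueError("LEB128 sequence longer than five bytes")


def decode_uleb128p1(data, offset=0):
    """Decode a uleb128p1 value (stored as value plus one)."""
    value, next_offset = decode_uleb128(data, offset)
    return value - 1, next_offset
```

For instance, under these assumptions `decode_sleb128(bytes([0x80, 0x7f]))` yields the value `-128`, while `decode_uleb128` on the same two bytes yields `16256`.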
- -<table class="leb128Bits"> -<thead> -<tr><th colspan="16">Bitwise diagram of a two-byte LEB128 value</th></tr> -<tr> - <th colspan="8">First byte</th> - <th colspan="8">Second byte</th> -</tr> -</thead> -<tbody> -<tr> - <td class="start1"><code>1</code></td> - <td>bit<sub>6</sub></td> - <td>bit<sub>5</sub></td> - <td>bit<sub>4</sub></td> - <td>bit<sub>3</sub></td> - <td>bit<sub>2</sub></td> - <td>bit<sub>1</sub></td> - <td>bit<sub>0</sub></td> - <td class="start2"><code>0</code></td> - <td>bit<sub>13</sub></td> - <td>bit<sub>12</sub></td> - <td>bit<sub>11</sub></td> - <td>bit<sub>10</sub></td> - <td>bit<sub>9</sub></td> - <td>bit<sub>8</sub></td> - <td class="end2">bit<sub>7</sub></td> -</tr> -</tbody> -</table> - -<p>The variant <code>uleb128p1</code> is used to represent a signed -value, where the representation is of the value <i>plus one</i> encoded -as a <code>uleb128</code>. This makes the encoding of <code>-1</code> -(alternatively thought of as the unsigned value <code>0xffffffff</code>), -but of no other negative number, a single byte, and is -useful in exactly those cases where the represented number must either -be non-negative or <code>-1</code> (or <code>0xffffffff</code>), -and where no other negative values are allowed (or where large unsigned -values are unlikely to be needed).</p> - -<p>Here are some examples of the formats:</p> - -<table class="leb128"> -<thead> -<tr> - <th>Encoded Sequence</th> - <th>As <code>sleb128</code></th> - <th>As <code>uleb128</code></th> - <th>As <code>uleb128p1</code></th> -</tr> -</thead> -<tbody> - <tr><td>00</td><td>0</td><td>0</td><td>-1</td></tr> - <tr><td>01</td><td>1</td><td>1</td><td>0</td></tr> - <tr><td>7f</td><td>-1</td><td>127</td><td>126</td></tr> - <tr><td>80 7f</td><td>-128</td><td>16256</td><td>16255</td></tr> -</tbody> -</table> - -<h1>Overall File Layout</h1> - -<table class="format"> -<thead> -<tr> - <th>Name</th> - <th>Format</th> - <th>Description</th> -</tr> -</thead> -<tbody> -<tr> - 
<td>header</td> - <td>header_item</td> - <td>the header</td> -</tr> -<tr> - <td>string_ids</td> - <td>string_id_item[]</td> - <td>string identifiers list. These are identifiers for all the strings - used by this file, either for internal naming (e.g., type descriptors) - or as constant objects referred to by code. This list must be sorted - by string contents, using UTF-16 code point values (not in a - locale-sensitive manner), and it must not contain any duplicate entries. - </td> -</tr> -<tr> - <td>type_ids</td> - <td>type_id_item[]</td> - <td>type identifiers list. These are identifiers for all types (classes, - arrays, or primitive types) referred to by this file, whether defined - in the file or not. This list must be sorted by <code>string_id</code> - index, and it must not contain any duplicate entries. - </td> -</tr> -<tr> - <td>proto_ids</td> - <td>proto_id_item[]</td> - <td>method prototype identifiers list. These are identifiers for all - prototypes referred to by this file. This list must be sorted in - return-type (by <code>type_id</code> index) major order, and then - by arguments (also by <code>type_id</code> index). The list must not - contain any duplicate entries. - </td> -</tr> -<tr> - <td>field_ids</td> - <td>field_id_item[]</td> - <td>field identifiers list. These are identifiers for all fields - referred to by this file, whether defined in the file or not. This - list must be sorted, where the defining type (by <code>type_id</code> - index) is the major order, field name (by <code>string_id</code> index) - is the intermediate order, and type (by <code>type_id</code> index) - is the minor order. The list must not contain any duplicate entries. - </td> -</tr> -<tr> - <td>method_ids</td> - <td>method_id_item[]</td> - <td>method identifiers list. These are identifiers for all methods - referred to by this file, whether defined in the file or not. 
This - list must be sorted, where the defining type (by <code>type_id</code> - index) is the major order, method name (by <code>string_id</code> - index) is the intermediate order, and method prototype (by - <code>proto_id</code> index) is the minor order. The list must not - contain any duplicate entries. - </td> -</tr> -<tr> - <td>class_defs</td> - <td>class_def_item[]</td> - <td>class definitions list. The classes must be ordered such that a given - class's superclass and implemented interfaces appear in the - list earlier than the referring class. Furthermore, it is invalid for - a definition for the same-named class to appear more than once in - the list. - </td> -</tr> -<tr> - <td>data</td> - <td>ubyte[]</td> - <td>data area, containing all the support data for the tables listed above. - Different items have different alignment requirements, and - padding bytes are inserted before each item if necessary to achieve - proper alignment. - </td> -</tr> -<tr> - <td>link_data</td> - <td>ubyte[]</td> - <td>data used in statically linked files. The format of the data in - this section is left unspecified by this document. - This section is empty in unlinked files, and runtime implementations - may use it as they see fit. - </td> -</tr> -</tbody> -</table> - -<h1>Bitfield, String, and Constant Definitions</h1> - -<h2><code>DEX_FILE_MAGIC</code></h2> -<h4>embedded in <code>header_item</code></h4> - -<p>The constant array/string <code>DEX_FILE_MAGIC</code> is the list of -bytes that must appear at the beginning of a <code>.dex</code> file -in order for it to be recognized as such. The value intentionally -contains a newline (<code>"\n"</code> or <code>0x0a</code>) and a -null byte (<code>"\0"</code> or <code>0x00</code>) in order to help -in the detection of certain forms of corruption. 
The value also -encodes a format version number as three decimal digits, which is -expected to increase monotonically over time as the format evolves.</p> - -<pre> -ubyte[8] DEX_FILE_MAGIC = { 0x64 0x65 0x78 0x0a 0x30 0x33 0x35 0x00 } - = "dex\n035\0" -</pre> - -<p><b>Note:</b> At least a couple earlier versions of the format have -been used in widely-available public software releases. For example, -version <code>009</code> was used for the M3 releases of the -Android platform (November–December 2007), -and version <code>013</code> was used for the M5 releases of the Android -platform (February–March 2008). In several respects, these earlier -versions of the format differ significantly from the version described in this -document.</p> - -<h2><code>ENDIAN_CONSTANT</code> and <code>REVERSE_ENDIAN_CONSTANT</code></h2> -<h4>embedded in <code>header_item</code></h4> - -<p>The constant <code>ENDIAN_CONSTANT</code> is used to indicate the -endianness of the file in which it is found. Although the standard -<code>.dex</code> format is little-endian, implementations may choose -to perform byte-swapping. 
Should an implementation come across a -header whose <code>endian_tag</code> is <code>REVERSE_ENDIAN_CONSTANT</code> -instead of <code>ENDIAN_CONSTANT</code>, it would know that the file -has been byte-swapped from the expected form.</p> - -<pre> -uint ENDIAN_CONSTANT = 0x12345678; -uint REVERSE_ENDIAN_CONSTANT = 0x78563412; -</pre> - -<h2><code>NO_INDEX</code></h2> -<h4>embedded in <code>class_def_item</code> and -<code>debug_info_item</code></h4> - -<p>The constant <code>NO_INDEX</code> is used to indicate that -an index value is absent.</p> - -<p><b>Note:</b> This value isn't defined to be -<code>0</code>, because that is in fact typically a valid index.</p> - -<p><b>Also Note:</b> The chosen value for <code>NO_INDEX</code> is -representable as a single byte in the <code>uleb128p1</code> encoding.</p> - -<pre> -uint NO_INDEX = 0xffffffff; // == -1 if treated as a signed int -</pre> - -<h2><code>access_flags</code> Definitions</h2> -<h4>embedded in <code>class_def_item</code>, -<code>encoded_field</code>, <code>encoded_method</code>, and -<code>InnerClass</code></h4> - -<p>Bitfields of these flags are used to indicate the accessibility and -overall properties of classes and class members.</p> - -<table class="accessFlags"> -<thead> -<tr> - <th>Name</th> - <th>Value</th> - <th>For Classes (and <code>InnerClass</code> annotations)</th> - <th>For Fields</th> - <th>For Methods</th> -</tr> -</thead> -<tbody> -<tr> - <td>ACC_PUBLIC</td> - <td>0x1</td> - <td><code>public</code>: visible everywhere</td> - <td><code>public</code>: visible everywhere</td> - <td><code>public</code>: visible everywhere</td> -</tr> -<tr> - <td>ACC_PRIVATE</td> - <td>0x2</td> - <td><super>*</super> - <code>private</code>: only visible to defining class - </td> - <td><code>private</code>: only visible to defining class</td> - <td><code>private</code>: only visible to defining class</td> -</tr> -<tr> - <td>ACC_PROTECTED</td> - <td>0x4</td> - <td><super>*</super> - <code>protected</code>: visible 
to package and subclasses - </td> - <td><code>protected</code>: visible to package and subclasses</td> - <td><code>protected</code>: visible to package and subclasses</td> -</tr> -<tr> - <td>ACC_STATIC</td> - <td>0x8</td> - <td><super>*</super> - <code>static</code>: is not constructed with an outer - <code>this</code> reference</td> - <td><code>static</code>: global to defining class</td> - <td><code>static</code>: does not take a <code>this</code> argument</td> -</tr> -<tr> - <td>ACC_FINAL</td> - <td>0x10</td> - <td><code>final</code>: not subclassable</td> - <td><code>final</code>: immutable after construction</td> - <td><code>final</code>: not overridable</td> -</tr> -<tr> - <td>ACC_SYNCHRONIZED</td> - <td>0x20</td> - <td> </td> - <td> </td> - <td><code>synchronized</code>: associated lock automatically acquired - around call to this method. <b>Note:</b> This is only valid to set when - <code>ACC_NATIVE</code> is also set.</td> -</tr> -<tr> - <td>ACC_VOLATILE</td> - <td>0x40</td> - <td> </td> - <td><code>volatile</code>: special access rules to help with thread - safety</td> - <td> </td> -</tr> -<tr> - <td>ACC_BRIDGE</td> - <td>0x40</td> - <td> </td> - <td> </td> - <td>bridge method, added automatically by compiler as a type-safe - bridge</td> -</tr> -<tr> - <td>ACC_TRANSIENT</td> - <td>0x80</td> - <td> </td> - <td><code>transient</code>: not to be saved by default serialization</td> - <td> </td> -</tr> -<tr> - <td>ACC_VARARGS</td> - <td>0x80</td> - <td> </td> - <td> </td> - <td>last argument should be treated as a "rest" argument by compiler</td> -</tr> -<tr> - <td>ACC_NATIVE</td> - <td>0x100</td> - <td> </td> - <td> </td> - <td><code>native</code>: implemented in native code</td> -</tr> -<tr> - <td>ACC_INTERFACE</td> - <td>0x200</td> - <td><code>interface</code>: multiply-implementable abstract class</td> - <td> </td> - <td> </td> -</tr> -<tr> - <td>ACC_ABSTRACT</td> - <td>0x400</td> - <td><code>abstract</code>: not directly instantiable</td> - <td> </td> - 
<td><code>abstract</code>: unimplemented by this class</td> -</tr> -<tr> - <td>ACC_STRICT</td> - <td>0x800</td> - <td> </td> - <td> </td> - <td><code>strictfp</code>: strict rules for floating-point arithmetic</td> -</tr> -<tr> - <td>ACC_SYNTHETIC</td> - <td>0x1000</td> - <td>not directly defined in source code</td> - <td>not directly defined in source code</td> - <td>not directly defined in source code</td> -</tr> -<tr> - <td>ACC_ANNOTATION</td> - <td>0x2000</td> - <td>declared as an annotation class</td> - <td> </td> - <td> </td> -</tr> -<tr> - <td>ACC_ENUM</td> - <td>0x4000</td> - <td>declared as an enumerated type</td> - <td>declared as an enumerated value</td> - <td> </td> -</tr> -<tr> - <td><i>(unused)</i></td> - <td>0x8000</td> - <td> </td> - <td> </td> - <td> </td> -</tr> -<tr> - <td>ACC_CONSTRUCTOR</td> - <td>0x10000</td> - <td> </td> - <td> </td> - <td>constructor method (class or instance initializer)</td> -</tr> -<tr> - <td>ACC_DECLARED_<br/>SYNCHRONIZED</td> - <td>0x20000</td> - <td> </td> - <td> </td> - <td>declared <code>synchronized</code>. <b>Note:</b> This has no effect on - execution (other than in reflection of this flag, per se). - </td> -</tr> -</tbody> -</table> - -<p><super>*</super> Only allowed on for <code>InnerClass</code> annotations, -and must not ever be on in a <code>class_def_item</code>.</p> - -<h2>MUTF-8 (Modified UTF-8) Encoding</h2> - -<p>As a concession to easier legacy support, the <code>.dex</code> format -encodes its string data in a de facto standard modified UTF-8 form, hereafter -referred to as MUTF-8. 
This form is identical to standard UTF-8, except:</p> - -<ul> - <li>Only the one-, two-, and three-byte encodings are used.</li> - <li>Code points in the range <code>U+10000</code> … - <code>U+10ffff</code> are encoded as a surrogate pair, each of - which is represented as a three-byte encoded value.</li> - <li>The code point <code>U+0000</code> is encoded in two-byte form.</li> - <li>A plain null byte (value <code>0</code>) indicates the end of - a string, as is the standard C language interpretation.</li> -</ul> - -<p>The first two items above can be summarized as: MUTF-8 -is an encoding format for UTF-16, instead of being a more direct -encoding format for Unicode characters.</p> - -<p>The final two items above make it simultaneously possible to include -the code point <code>U+0000</code> in a string <i>and</i> still manipulate -it as a C-style null-terminated string.</p> - -<p>However, the special encoding of <code>U+0000</code> means that, unlike -normal UTF-8, the result of calling the standard C function -<code>strcmp()</code> on a pair of MUTF-8 strings does not always -indicate the properly signed result of comparison of <i>unequal</i> strings. -When ordering (not just equality) is a concern, the most straightforward -way to compare MUTF-8 strings is to decode them character by character, -and compare the decoded values. (However, more clever implementations are -also possible.)</p> - -<p>Please refer to <a href="http://unicode.org">The Unicode -Standard</a> for further information about character encoding. -MUTF-8 is actually closer to the (relatively less well-known) encoding -<a href="http://www.unicode.org/reports/tr26/">CESU-8</a> than to UTF-8 -per se.</p> - -<h2><code>encoded_value</code> Encoding</h2> -<h4>embedded in <code>annotation_element</code> and -<code>encoded_array_item</code></h4> - -<p>An <code>encoded_value</code> is an encoded piece of (nearly) -arbitrary hierarchically structured data. 
The encoding is meant to -be both compact and straightforward to parse.</p> - -<table class="format"> -<thead> -<tr> - <th>Name</th> - <th>Format</th> - <th>Description</th> -</tr> -</thead> -<tbody> -<tr> - <td>(value_arg << 5) | value_type</td> - <td>ubyte</td> - <td>byte indicating the type of the immediately subsequent - <code>value</code> along - with an optional clarifying argument in the high-order three bits. - See below for the various <code>value</code> definitions. - In most cases, <code>value_arg</code> encodes the length of - the immediately-subsequent <code>value</code> in bytes, as - <code>(size - 1)</code>, e.g., <code>0</code> means that - the value requires one byte, and <code>7</code> means it requires - eight bytes; however, there are exceptions as noted below. - </td> -</tr> -<tr> - <td>value</td> - <td>ubyte[]</td> - <td>bytes representing the value, variable in length and interpreted - differently for different <code>value_type</code> bytes, though - always little-endian. See the various value definitions below for - details. 
- </td> -</tr> -</tbody> -</table> - -<h3>Value Formats</h3> - -<table class="encodedValue"> -<thead> -<tr> - <th>Type Name</th> - <th><code>value_type</code></th> - <th><code>value_arg</code> Format</th> - <th><code>value</code> Format</th> - <th>Description</th> -</tr> -</thead> -<tbody> -<tr> - <td>VALUE_BYTE</td> - <td>0x00</td> - <td><i>(none; must be <code>0</code>)</i></td> - <td>ubyte[1]</td> - <td>signed one-byte integer value</td> -</tr> -<tr> - <td>VALUE_SHORT</td> - <td>0x02</td> - <td>size - 1 (0…1)</td> - <td>ubyte[size]</td> - <td>signed two-byte integer value, sign-extended</td> -</tr> -<tr> - <td>VALUE_CHAR</td> - <td>0x03</td> - <td>size - 1 (0…1)</td> - <td>ubyte[size]</td> - <td>unsigned two-byte integer value, zero-extended</td> -</tr> -<tr> - <td>VALUE_INT</td> - <td>0x04</td> - <td>size - 1 (0…3)</td> - <td>ubyte[size]</td> - <td>signed four-byte integer value, sign-extended</td> -</tr> -<tr> - <td>VALUE_LONG</td> - <td>0x06</td> - <td>size - 1 (0…7)</td> - <td>ubyte[size]</td> - <td>signed eight-byte integer value, sign-extended</td> -</tr> -<tr> - <td>VALUE_FLOAT</td> - <td>0x10</td> - <td>size - 1 (0…3)</td> - <td>ubyte[size]</td> - <td>four-byte bit pattern, zero-extended <i>to the right</i>, and - interpreted as an IEEE754 32-bit floating point value - </td> -</tr> -<tr> - <td>VALUE_DOUBLE</td> - <td>0x11</td> - <td>size - 1 (0…7)</td> - <td>ubyte[size]</td> - <td>eight-byte bit pattern, zero-extended <i>to the right</i>, and - interpreted as an IEEE754 64-bit floating point value - </td> -</tr> -<tr> - <td>VALUE_STRING</td> - <td>0x17</td> - <td>size - 1 (0…3)</td> - <td>ubyte[size]</td> - <td>unsigned (zero-extended) four-byte integer value, - interpreted as an index into - the <code>string_ids</code> section and representing a string value - </td> -</tr> -<tr> - <td>VALUE_TYPE</td> - <td>0x18</td> - <td>size - 1 (0…3)</td> - <td>ubyte[size]</td> - <td>unsigned (zero-extended) four-byte integer value, - interpreted as an index into - 
the <code>type_ids</code> section and representing a reflective - type/class value - </td> -</tr> -<tr> - <td>VALUE_FIELD</td> - <td>0x19</td> - <td>size - 1 (0…3)</td> - <td>ubyte[size]</td> - <td>unsigned (zero-extended) four-byte integer value, - interpreted as an index into - the <code>field_ids</code> section and representing a reflective - field value - </td> -</tr> -<tr> - <td>VALUE_METHOD</td> - <td>0x1a</td> - <td>size - 1 (0…3)</td> - <td>ubyte[size]</td> - <td>unsigned (zero-extended) four-byte integer value, - interpreted as an index into - the <code>method_ids</code> section and representing a reflective - method value - </td> -</tr> -<tr> - <td>VALUE_ENUM</td> - <td>0x1b</td> - <td>size - 1 (0…3)</td> - <td>ubyte[size]</td> - <td>unsigned (zero-extended) four-byte integer value, - interpreted as an index into - the <code>field_ids</code> section and representing the value of - an enumerated type constant - </td> -</tr> -<tr> - <td>VALUE_ARRAY</td> - <td>0x1c</td> - <td><i>(none; must be <code>0</code>)</i></td> - <td>encoded_array</td> - <td>an array of values, in the format specified by - "<code>encoded_array</code> Format" below. The size - of the <code>value</code> is implicit in the encoding. - </td> -</tr> -<tr> - <td>VALUE_ANNOTATION</td> - <td>0x1d</td> - <td><i>(none; must be <code>0</code>)</i></td> - <td>encoded_annotation</td> - <td>a sub-annotation, in the format specified by - "<code>encoded_annotation</code> Format" below. The size - of the <code>value</code> is implicit in the encoding. - </td> -</tr> -<tr> - <td>VALUE_NULL</td> - <td>0x1e</td> - <td><i>(none; must be <code>0</code>)</i></td> - <td><i>(none)</i></td> - <td><code>null</code> reference value</td> -</tr> -<tr> - <td>VALUE_BOOLEAN</td> - <td>0x1f</td> - <td>boolean (0…1)</td> - <td><i>(none)</i></td> - <td>one-bit value; <code>0</code> for <code>false</code> and - <code>1</code> for <code>true</code>. The bit is represented in the - <code>value_arg</code>. 
- </td> -</tr> -</tbody> -</table> - -<h3><code>encoded_array</code> Format</h3> - -<table class="format"> -<thead> -<tr> - <th>Name</th> - <th>Format</th> - <th>Description</th> -</tr> -</thead> -<tbody> -<tr> - <td>size</td> - <td>uleb128</td> - <td>number of elements in the array</td> -</tr> -<tr> - <td>values</td> - <td>encoded_value[size]</td> - <td>a series of <code>size</code> <code>encoded_value</code> byte - sequences in the format specified by this section, concatenated - sequentially. - </td> -</tr> -</tbody> -</table> - -<h3><code>encoded_annotation</code> Format</h3> - -<table class="format"> -<thead> -<tr> - <th>Name</th> - <th>Format</th> - <th>Description</th> -</tr> -</thead> -<tbody> -<tr> - <td>type_idx</td> - <td>uleb128</td> - <td>type of the annotation. This must be a class (not array or primitive) - type. - </td> -</tr> -<tr> - <td>size</td> - <td>uleb128</td> - <td>number of name-value mappings in this annotation</td> -</tr> -<tr> - <td>elements</td> - <td>annotation_element[size]</td> - <td>elements of the annotation, represented directly in-line (not as - offsets). Elements must be sorted in increasing order by - <code>string_id</code> index. - </td> -</tr> -</tbody> -</table> - -<h3><code>annotation_element</code> Format</h3> - -<table class="format"> -<thead> -<tr> - <th>Name</th> - <th>Format</th> - <th>Description</th> -</tr> -</thead> -<tbody> -<tr> - <td>name_idx</td> - <td>uleb128</td> - <td>element name, represented as an index into the - <code>string_ids</code> section. The string must conform to the - syntax for <i>MemberName</i>, defined below. - </td> -</tr> -<tr> - <td>value</td> - <td>encoded_value</td> - <td>element value</td> -</tr> -</tbody> -</table> - -<h2>String Syntax</h2> - -<p>There are several kinds of item in a <code>.dex</code> file which -ultimately refer to a string.
The following BNF-style definitions -indicate the acceptable syntax for these strings.</p> - -<h3><i>SimpleName</i></h3> - -<p>A <i>SimpleName</i> is the basis for the syntax of the names of other -things. The <code>.dex</code> format allows a fair amount of latitude -here (much more than most common source languages). In brief, a simple -name consists of any low-ASCII alphabetic character or digit, a few -specific low-ASCII symbols, and most non-ASCII code points that are not -control, space, or special characters. Note that surrogate code points -(in the range <code>U+d800</code> … <code>U+dfff</code>) are not -considered valid name characters, per se, but Unicode supplemental -characters <i>are</i> valid (which are represented by the final -alternative of the rule for <i>SimpleNameChar</i>), and they should be -represented in a file as pairs of surrogate code points in the MUTF-8 -encoding.</p> - -<table class="bnf"> - <tr><td colspan="2" class="def"><i>SimpleName</i> →</td></tr> - <tr> - <td/> - <td><i>SimpleNameChar</i> (<i>SimpleNameChar</i>)*</td> - </tr> - - <tr><td colspan="2" class="def"><i>SimpleNameChar</i> →</td></tr> - <tr> - <td/> - <td><code>'A'</code> … <code>'Z'</code></td> - </tr> - <tr> - <td class="bar">|</td> - <td><code>'a'</code> … <code>'z'</code></td> - </tr> - <tr> - <td class="bar">|</td> - <td><code>'0'</code> … <code>'9'</code></td> - </tr> - <tr> - <td class="bar">|</td> - <td><code>'$'</code></td> - </tr> - <tr> - <td class="bar">|</td> - <td><code>'-'</code></td> - </tr> - <tr> - <td class="bar">|</td> - <td><code>'_'</code></td> - </tr> - <tr> - <td class="bar">|</td> - <td><code>U+00a1</code> … <code>U+1fff</code></td> - </tr> - <tr> - <td class="bar">|</td> - <td><code>U+2010</code> … <code>U+2027</code></td> - </tr> - <tr> - <td class="bar">|</td> - <td><code>U+2030</code> … <code>U+d7ff</code></td> - </tr> - <tr> - <td class="bar">|</td> - <td><code>U+e000</code> … <code>U+ffef</code></td> - </tr> - <tr> - <td 
class="bar">|</td> - <td><code>U+10000</code> … <code>U+10ffff</code></td> - </tr> -</table> - -<h3><i>MemberName</i></h3> -<h4>used by <code>field_id_item</code> and <code>method_id_item</code></h4> - -<p>A <i>MemberName</i> is the name of a member of a class, members being -fields, methods, and inner classes.</p> - -<table class="bnf"> - <tr><td colspan="2" class="def"><i>MemberName</i> →</td></tr> - <tr> - <td/> - <td><i>SimpleName</i></td> - </tr> - <tr> - <td class="bar">|</td> - <td><code>'<'</code> <i>SimpleName</i> <code>'>'</code></td> - </tr> -</table> - -<h3><i>FullClassName</i></h3> - -<p>A <i>FullClassName</i> is a fully-qualified class name, including an -optional package specifier followed by a required name.</p> - -<table class="bnf"> - <tr><td colspan="2" class="def"><i>FullClassName</i> →</td></tr> - <tr> - <td/> - <td><i>OptionalPackagePrefix</i> <i>SimpleName</i></td> - </tr> - - <tr><td colspan="2" class="def"><i>OptionalPackagePrefix</i> →</td></tr> - <tr> - <td/> - <td>(<i>SimpleName</i> <code>'/'</code>)*</td> - </tr> -</table> - -<h3><i>TypeDescriptor</i></h3> -<h4>used by <code>type_id_item</code></h4> - -<p>A <i>TypeDescriptor</i> is the representation of any type, including -primitives, classes, arrays, and <code>void</code>. 
See below for -the meaning of the various versions.</p> - -<table class="bnf"> - <tr><td colspan="2" class="def"><i>TypeDescriptor</i> →</td></tr> - <tr> - <td/> - <td><code>'V'</code></td> - </tr> - <tr> - <td class="bar">|</td> - <td><i>FieldTypeDescriptor</i></td> - </tr> - - <tr><td colspan="2" class="def"><i>FieldTypeDescriptor</i> →</td></tr> - <tr> - <td/> - <td><i>NonArrayFieldTypeDescriptor</i></td> - </tr> - <tr> - <td class="bar">|</td> - <td>(<code>'['</code> * 1…255) - <i>NonArrayFieldTypeDescriptor</i></td> - </tr> - - <tr> - <td colspan="2" class="def"><i>NonArrayFieldTypeDescriptor</i>→</td> - </tr> - <tr> - <td/> - <td><code>'Z'</code></td> - </tr> - <tr> - <td class="bar">|</td> - <td><code>'B'</code></td> - </tr> - <tr> - <td class="bar">|</td> - <td><code>'S'</code></td> - </tr> - <tr> - <td class="bar">|</td> - <td><code>'C'</code></td> - </tr> - <tr> - <td class="bar">|</td> - <td><code>'I'</code></td> - </tr> - <tr> - <td class="bar">|</td> - <td><code>'J'</code></td> - </tr> - <tr> - <td class="bar">|</td> - <td><code>'F'</code></td> - </tr> - <tr> - <td class="bar">|</td> - <td><code>'D'</code></td> - </tr> - <tr> - <td class="bar">|</td> - <td><code>'L'</code> <i>FullClassName</i> <code>';'</code></td> - </tr> -</table> - -<h3><i>ShortyDescriptor</i></h3> -<h4>used by <code>proto_id_item</code></h4> - -<p>A <i>ShortyDescriptor</i> is the short form representation of a method -prototype, including return and parameter types, except that there is -no distinction between various reference (class or array) types. 
Instead, -all reference types are represented by a single <code>'L'</code> character.</p> - -<table class="bnf"> - <tr><td colspan="2" class="def"><i>ShortyDescriptor</i> →</td></tr> - <tr> - <td/> - <td><i>ShortyReturnType</i> (<i>ShortyFieldType</i>)*</td> - </tr> - - <tr><td colspan="2" class="def"><i>ShortyReturnType</i> →</td></tr> - <tr> - <td/> - <td><code>'V'</code></td> - </tr> - <tr> - <td class="bar">|</td> - <td><i>ShortyFieldType</i></td> - </tr> - - <tr><td colspan="2" class="def"><i>ShortyFieldType</i> →</td></tr> - <tr> - <td/> - <td><code>'Z'</code></td> - </tr> - <tr> - <td class="bar">|</td> - <td><code>'B'</code></td> - </tr> - <tr> - <td class="bar">|</td> - <td><code>'S'</code></td> - </tr> - <tr> - <td class="bar">|</td> - <td><code>'C'</code></td> - </tr> - <tr> - <td class="bar">|</td> - <td><code>'I'</code></td> - </tr> - <tr> - <td class="bar">|</td> - <td><code>'J'</code></td> - </tr> - <tr> - <td class="bar">|</td> - <td><code>'F'</code></td> - </tr> - <tr> - <td class="bar">|</td> - <td><code>'D'</code></td> - </tr> - <tr> - <td class="bar">|</td> - <td><code>'L'</code></td> - </tr> -</table> - -<h2><i>TypeDescriptor</i> Semantics</h2> - -<p>This is the meaning of each of the variants of <i>TypeDescriptor</i>.</p> - -<table class="descriptor"> -<thead> -<tr> - <th>Syntax</th> - <th>Meaning</th> -</tr> -</thead> -<tbody> -<tr> - <td>V</td> - <td><code>void</code>; only valid for return types</td> -</tr> -<tr> - <td>Z</td> - <td><code>boolean</code></td> -</tr> -<tr> - <td>B</td> - <td><code>byte</code></td> -</tr> -<tr> - <td>S</td> - <td><code>short</code></td> -</tr> -<tr> - <td>C</td> - <td><code>char</code></td> -</tr> -<tr> - <td>I</td> - <td><code>int</code></td> -</tr> -<tr> - <td>J</td> - <td><code>long</code></td> -</tr> -<tr> - <td>F</td> - <td><code>float</code></td> -</tr> -<tr> - <td>D</td> - <td><code>double</code></td> -</tr> -<tr> - <td>L<i>fully/qualified/Name</i>;</td> - <td>the class 
<code><i>fully.qualified.Name</i></code></td> -</tr> -<tr> - <td>[<i>descriptor</i></td> - <td>array of <code><i>descriptor</i></code>, usable recursively for - arrays-of-arrays, though it is invalid to have more than 255 - dimensions. - </td> -</tr> -</tbody> -</table> - -<h1>Items and Related Structures</h1> - -<p>This section includes definitions for each of the top-level items that -may appear in a <code>.dex</code> file. - -<h2><code>header_item</code></h2> -<h4>appears in the <code>header</code> section</h4> -<h4>alignment: 4 bytes</h4> - -<table class="format"> -<thead> -<tr> - <th>Name</th> - <th>Format</th> - <th>Description</th> -</tr> -</thead> -<tbody> -<tr> - <td>magic</td> - <td>ubyte[8] = DEX_FILE_MAGIC</td> - <td>magic value. See discussion above under "<code>DEX_FILE_MAGIC</code>" - for more details. - </td> -</tr> -<tr> - <td>checksum</td> - <td>uint</td> - <td>adler32 checksum of the rest of the file (everything but - <code>magic</code> and this field); used to detect file corruption - </td> -</tr> -<tr> - <td>signature</td> - <td>ubyte[20]</td> - <td>SHA-1 signature (hash) of the rest of the file (everything but - <code>magic</code>, <code>checksum</code>, and this field); used - to uniquely identify files - </td> -</tr> -<tr> - <td>file_size</td> - <td>uint</td> - <td>size of the entire file (including the header), in bytes -</tr> -<tr> - <td>header_size</td> - <td>uint = 0x70</td> - <td>size of the header (this entire section), in bytes. This allows for at - least a limited amount of backwards/forwards compatibility without - invalidating the format. - </td> -</tr> -<tr> - <td>endian_tag</td> - <td>uint = ENDIAN_CONSTANT</td> - <td>endianness tag. See discussion above under "<code>ENDIAN_CONSTANT</code> - and <code>REVERSE_ENDIAN_CONSTANT</code>" for more details. 
- </td> -</tr> -<tr> - <td>link_size</td> - <td>uint</td> - <td>size of the link section, or <code>0</code> if this file isn't - statically linked</td> -</tr> -<tr> - <td>link_off</td> - <td>uint</td> - <td>offset from the start of the file to the link section, or - <code>0</code> if <code>link_size == 0</code>. The offset, if non-zero, - should be to an offset into the <code>link_data</code> section. The - format of the data pointed at is left unspecified by this document; - this header field (and the previous) are left as hooks for use by - runtime implementations. - </td> -</tr> -<tr> - <td>map_off</td> - <td>uint</td> - <td>offset from the start of the file to the map item, or - <code>0</code> if this file has no map. The offset, if non-zero, - should be to an offset into the <code>data</code> section, - and the data should be in the format specified by "<code>map_list</code>" - below. - </td> -</tr> -<tr> - <td>string_ids_size</td> - <td>uint</td> - <td>count of strings in the string identifiers list</td> -</tr> -<tr> - <td>string_ids_off</td> - <td>uint</td> - <td>offset from the start of the file to the string identifiers list, or - <code>0</code> if <code>string_ids_size == 0</code> (admittedly a - strange edge case). The offset, if non-zero, - should be to the start of the <code>string_ids</code> section. - </td> -</tr> -<tr> - <td>type_ids_size</td> - <td>uint</td> - <td>count of elements in the type identifiers list</td> -</tr> -<tr> - <td>type_ids_off</td> - <td>uint</td> - <td>offset from the start of the file to the type identifiers list, or - <code>0</code> if <code>type_ids_size == 0</code> (admittedly a - strange edge case). The offset, if non-zero, - should be to the start of the <code>type_ids</code> - section. 
- </td> -</tr> -<tr> - <td>proto_ids_size</td> - <td>uint</td> - <td>count of elements in the prototype identifiers list</td> -</tr> -<tr> - <td>proto_ids_off</td> - <td>uint</td> - <td>offset from the start of the file to the prototype identifiers list, or - <code>0</code> if <code>proto_ids_size == 0</code> (admittedly a - strange edge case). The offset, if non-zero, - should be to the start of the <code>proto_ids</code> - section. - </td> -</tr> -<tr> - <td>field_ids_size</td> - <td>uint</td> - <td>count of elements in the field identifiers list</td> -</tr> -<tr> - <td>field_ids_off</td> - <td>uint</td> - <td>offset from the start of the file to the field identifiers list, or - <code>0</code> if <code>field_ids_size == 0</code>. The offset, if - non-zero, should be to the start of the <code>field_ids</code> - section.</td> -</tr> -<tr> - <td>method_ids_size</td> - <td>uint</td> - <td>count of elements in the method identifiers list</td> -</tr> -<tr> - <td>method_ids_off</td> - <td>uint</td> - <td>offset from the start of the file to the method identifiers list, or - <code>0</code> if <code>method_ids_size == 0</code>. The offset, if - non-zero, should be to the start of the <code>method_ids</code> - section.</td> -</tr> -<tr> - <td>class_defs_size</td> - <td>uint</td> - <td>count of elements in the class definitions list</td> -</tr> -<tr> - <td>class_defs_off</td> - <td>uint</td> - <td>offset from the start of the file to the class definitions list, or - <code>0</code> if <code>class_defs_size == 0</code> (admittedly a - strange edge case). The offset, if non-zero, - should be to the start of the <code>class_defs</code> section. - </td> -</tr> -<tr> - <td>data_size</td> - <td>uint</td> - <td>Size of <code>data</code> section in bytes. Must be an even - multiple of sizeof(uint).</td> -</tr> -<tr> - <td>data_off</td> - <td>uint</td> - <td>offset from the start of the file to the start of the - <code>data</code> section.
- </td> -</tr> -</tbody> -</table> - -<h2><code>map_list</code></h2> -<h4>appears in the <code>data</code> section</h4> -<h4>referenced from <code>header_item</code></h4> -<h4>alignment: 4 bytes</h4> - -<p>This is a list of the entire contents of a file, in order. It -contains some redundancy with respect to the <code>header_item</code> -but is intended to be an easy form to use to iterate over an entire -file. A given type must appear at most once in a map, but there is no -restriction on what order types may appear in, other than the -restrictions implied by the rest of the format (e.g., a -<code>header</code> section must appear first, followed by a -<code>string_ids</code> section, etc.). Additionally, the map entries must -be ordered by initial offset and must not overlap.</p> - -<table class="format"> -<thead> -<tr> - <th>Name</th> - <th>Format</th> - <th>Description</th> -</tr> -</thead> -<tbody> -<tr> - <td>size</td> - <td>uint</td> - <td>size of the list, in entries</td> -</tr> -<tr> - <td>list</td> - <td>map_item[size]</td> - <td>elements of the list</td> -</tr> -</tbody> -</table> - -<h3><code>map_item</code> Format</h3> - -<table class="format"> -<thead> -<tr> - <th>Name</th> - <th>Format</th> - <th>Description</th> -</tr> -</thead> -<tbody> -<tr> - <td>type</td> - <td>ushort</td> - <td>type of the items; see table below</td> -</tr> -<tr> - <td>unused</td> - <td>ushort</td> - <td><i>(unused)</i></td> -</tr> -<tr> - <td>size</td> - <td>uint</td> - <td>count of the number of items to be found at the indicated offset</td> -</tr> -<tr> - <td>offset</td> - <td>uint</td> - <td>offset from the start of the file to the items in question</td> -</tr> -</tbody> -</table> - - -<h3>Type Codes</h3> - -<table class="typeCodes"> -<thead> -<tr> - <th>Item Type</th> - <th>Constant</th> - <th>Value</th> - <th>Item Size In Bytes</th> -</tr> -</thead> -<tbody> -<tr> - <td>header_item</td> - <td>TYPE_HEADER_ITEM</td> - <td>0x0000</td> - <td>0x70</td> -</tr> -<tr> - 
<td>string_id_item</td> - <td>TYPE_STRING_ID_ITEM</td> - <td>0x0001</td> - <td>0x04</td> -</tr> -<tr> - <td>type_id_item</td> - <td>TYPE_TYPE_ID_ITEM</td> - <td>0x0002</td> - <td>0x04</td> -</tr> -<tr> - <td>proto_id_item</td> - <td>TYPE_PROTO_ID_ITEM</td> - <td>0x0003</td> - <td>0x0c</td> -</tr> -<tr> - <td>field_id_item</td> - <td>TYPE_FIELD_ID_ITEM</td> - <td>0x0004</td> - <td>0x08</td> -</tr> -<tr> - <td>method_id_item</td> - <td>TYPE_METHOD_ID_ITEM</td> - <td>0x0005</td> - <td>0x08</td> -</tr> -<tr> - <td>class_def_item</td> - <td>TYPE_CLASS_DEF_ITEM</td> - <td>0x0006</td> - <td>0x20</td> -</tr> -<tr> - <td>map_list</td> - <td>TYPE_MAP_LIST</td> - <td>0x1000</td> - <td>4 + (item.size * 12)</td> -</tr> -<tr> - <td>type_list</td> - <td>TYPE_TYPE_LIST</td> - <td>0x1001</td> - <td>4 + (item.size * 2)</td> -</tr> -<tr> - <td>annotation_set_ref_list</td> - <td>TYPE_ANNOTATION_SET_REF_LIST</td> - <td>0x1002</td> - <td>4 + (item.size * 4)</td> -</tr> -<tr> - <td>annotation_set_item</td> - <td>TYPE_ANNOTATION_SET_ITEM</td> - <td>0x1003</td> - <td>4 + (item.size * 4)</td> -</tr> -<tr> - <td>class_data_item</td> - <td>TYPE_CLASS_DATA_ITEM</td> - <td>0x2000</td> - <td><i>implicit; must parse</i></td> -</tr> -<tr> - <td>code_item</td> - <td>TYPE_CODE_ITEM</td> - <td>0x2001</td> - <td><i>implicit; must parse</i></td> -</tr> -<tr> - <td>string_data_item</td> - <td>TYPE_STRING_DATA_ITEM</td> - <td>0x2002</td> - <td><i>implicit; must parse</i></td> -</tr> -<tr> - <td>debug_info_item</td> - <td>TYPE_DEBUG_INFO_ITEM</td> - <td>0x2003</td> - <td><i>implicit; must parse</i></td> -</tr> -<tr> - <td>annotation_item</td> - <td>TYPE_ANNOTATION_ITEM</td> - <td>0x2004</td> - <td><i>implicit; must parse</i></td> -</tr> -<tr> - <td>encoded_array_item</td> - <td>TYPE_ENCODED_ARRAY_ITEM</td> - <td>0x2005</td> - <td><i>implicit; must parse</i></td> -</tr> -<tr> - <td>annotations_directory_item</td> - <td>TYPE_ANNOTATIONS_DIRECTORY_ITEM</td> - <td>0x2006</td> - <td><i>implicit; must 
parse</i></td> -</tr> -</tbody> -</table> - - -<h2><code>string_id_item</code></h2> -<h4>appears in the <code>string_ids</code> section</h4> -<h4>alignment: 4 bytes</h4> - -<table class="format"> -<thead> -<tr> - <th>Name</th> - <th>Format</th> - <th>Description</th> -</tr> -</thead> -<tbody> -<tr> - <td>string_data_off</td> - <td>uint</td> - <td>offset from the start of the file to the string data for this - item. The offset should be to a location - in the <code>data</code> section, and the data should be in the - format specified by "<code>string_data_item</code>" below. - There is no alignment requirement for the offset. - </td> -</tr> -</tbody> -</table> - -<h2><code>string_data_item</code></h2> -<h4>appears in the <code>data</code> section</h4> -<h4>alignment: none (byte-aligned)</h4> - -<table class="format"> -<thead> -<tr> - <th>Name</th> - <th>Format</th> - <th>Description</th> -</tr> -</thead> -<tbody> -<tr> - <td>utf16_size</td> - <td>uleb128</td> - <td>size of this string, in UTF-16 code units (which is the "string - length" in many systems). That is, this is the decoded length of - the string. (The encoded length is implied by the position of - the <code>0</code> byte.)</td> -</tr> -<tr> - <td>data</td> - <td>ubyte[]</td> - <td>a series of MUTF-8 code units (a.k.a. octets, a.k.a. bytes) - followed by a byte of value <code>0</code>. See - "MUTF-8 (Modified UTF-8) Encoding" above for details and - discussion about the data format. - <p><b>Note:</b> It is acceptable to have a string which includes - (the encoded form of) UTF-16 surrogate code units (that is, - <code>U+d800</code> … <code>U+dfff</code>) - either in isolation or out-of-order with respect to the usual - encoding of Unicode into UTF-16. 
It is up to higher-level uses of - strings to reject such invalid encodings, if appropriate.</p> - </td> -</tr> -</tbody> -</table> - -<h2><code>type_id_item</code></h2> -<h4>appears in the <code>type_ids</code> section</h4> -<h4>alignment: 4 bytes</h4> - -<table class="format"> -<thead> -<tr> - <th>Name</th> - <th>Format</th> - <th>Description</th> -</tr> -</thead> -<tbody> -<tr> - <td>descriptor_idx</td> - <td>uint</td> - <td>index into the <code>string_ids</code> list for the descriptor - string of this type. The string must conform to the syntax for - <i>TypeDescriptor</i>, defined above. - </td> -</tr> -</tbody> -</table> - -<h2><code>proto_id_item</code></h2> -<h4>appears in the <code>proto_ids</code> section</h4> -<h4>alignment: 4 bytes</h4> - -<table class="format"> -<thead> -<tr> - <th>Name</th> - <th>Format</th> - <th>Description</th> -</tr> -</thead> -<tbody> -<tr> - <td>shorty_idx</td> - <td>uint</td> - <td>index into the <code>string_ids</code> list for the short-form - descriptor string of this prototype. The string must conform to the - syntax for <i>ShortyDescriptor</i>, defined above, and must correspond - to the return type and parameters of this item. - </td> -</tr> -<tr> - <td>return_type_idx</td> - <td>uint</td> - <td>index into the <code>type_ids</code> list for the return type - of this prototype - </td> -</tr> -<tr> - <td>parameters_off</td> - <td>uint</td> - <td>offset from the start of the file to the list of parameter types - for this prototype, or <code>0</code> if this prototype has no - parameters. This offset, if non-zero, should be in the - <code>data</code> section, and the data there should be in the - format specified by <code>"type_list"</code> below. Additionally, there - should be no reference to the type <code>void</code> in the list. 
- </td> -</tr> -</tbody> -</table> - -<h2><code>field_id_item</code></h2> -<h4>appears in the <code>field_ids</code> section</h4> -<h4>alignment: 4 bytes</h4> - -<table class="format"> -<thead> -<tr> - <th>Name</th> - <th>Format</th> - <th>Description</th> -</tr> -</thead> -<tbody> -<tr> - <td>class_idx</td> - <td>ushort</td> - <td>index into the <code>type_ids</code> list for the definer of this - field. This must be a class type, and not an array or primitive type. - </td> -</tr> -<tr> - <td>type_idx</td> - <td>ushort</td> - <td>index into the <code>type_ids</code> list for the type of - this field - </td> -</tr> -<tr> - <td>name_idx</td> - <td>uint</td> - <td>index into the <code>string_ids</code> list for the name of this - field. The string must conform to the syntax for <i>MemberName</i>, - defined above. - </td> -</tr> -</tbody> -</table> - -<h2><code>method_id_item</code></h2> -<h4>appears in the <code>method_ids</code> section</h4> -<h4>alignment: 4 bytes</h4> - -<table class="format"> -<thead> -<tr> - <th>Name</th> - <th>Format</th> - <th>Description</th> -</tr> -</thead> -<tbody> -<tr> - <td>class_idx</td> - <td>ushort</td> - <td>index into the <code>type_ids</code> list for the definer of this - method. This must be a class or array type, and not a primitive type. - </td> -</tr> -<tr> - <td>proto_idx</td> - <td>ushort</td> - <td>index into the <code>proto_ids</code> list for the prototype of - this method - </td> -</tr> -<tr> - <td>name_idx</td> - <td>uint</td> - <td>index into the <code>string_ids</code> list for the name of this - method. The string must conform to the syntax for <i>MemberName</i>, - defined above. 
- </td> -</tr> -</tbody> -</table> - -<h2><code>class_def_item</code></h2> -<h4>appears in the <code>class_defs</code> section</h4> -<h4>alignment: 4 bytes</h4> - -<table class="format"> -<thead> -<tr> - <th>Name</th> - <th>Format</th> - <th>Description</th> -</tr> -</thead> -<tbody> -<tr> - <td>class_idx</td> - <td>uint</td> - <td>index into the <code>type_ids</code> list for this class. - This must be a class type, and not an array or primitive type. - </td> -</tr> -<tr> - <td>access_flags</td> - <td>uint</td> - <td>access flags for the class (<code>public</code>, <code>final</code>, - etc.). See "<code>access_flags</code> Definitions" for details. - </td> -</tr> -<tr> - <td>superclass_idx</td> - <td>uint</td> - <td>index into the <code>type_ids</code> list for the superclass, or - the constant value <code>NO_INDEX</code> if this class has no - superclass (i.e., it is a root class such as <code>Object</code>). - If present, this must be a class type, and not an array or primitive type. - </td> -</tr> -<tr> - <td>interfaces_off</td> - <td>uint</td> - <td>offset from the start of the file to the list of interfaces, or - <code>0</code> if there are none. This offset - should be in the <code>data</code> section, and the data - there should be in the format specified by - "<code>type_list</code>" below. Each of the elements of the list - must be a class type (not an array or primitive type), and there - must not be any duplicates. - </td> -</tr> -<tr> - <td>source_file_idx</td> - <td>uint</td> - <td>index into the <code>string_ids</code> list for the name of the - file containing the original source for (at least most of) this class, - or the special value <code>NO_INDEX</code> to represent a lack of - this information. The <code>debug_info_item</code> of any given method - may override this source file, but the expectation is that most classes - will only come from one source file. 
- </td> -</tr> -<tr> - <td>annotations_off</td> - <td>uint</td> - <td>offset from the start of the file to the annotations structure - for this class, or <code>0</code> if there are no annotations on - this class. This offset, if non-zero, should be in the - <code>data</code> section, and the data there should be in - the format specified by "<code>annotations_directory_item</code>" below, - with all items referring to this class as the definer. - </td> -</tr> -<tr> - <td>class_data_off</td> - <td>uint</td> - <td>offset from the start of the file to the associated - class data for this item, or <code>0</code> if there is no class - data for this class. (This may be the case, for example, if this class - is a marker interface.) The offset, if non-zero, should be in the - <code>data</code> section, and the data there should be in the - format specified by "<code>class_data_item</code>" below, with all - items referring to this class as the definer. - </td> -</tr> -<tr> - <td>static_values_off</td> - <td>uint</td> - <td>offset from the start of the file to the list of initial - values for <code>static</code> fields, or <code>0</code> if there - are none (and all <code>static</code> fields are to be initialized with - <code>0</code> or <code>null</code>). This offset should be in the - <code>data</code> section, and the data there should be in the - format specified by "<code>encoded_array_item</code>" below. The size - of the array must be no larger than the number of <code>static</code> - fields declared by this class, and the elements correspond to the - <code>static</code> fields in the same order as declared in the - corresponding <code>field_list</code>. The type of each array - element must match the declared type of its corresponding field. - If there are fewer elements in the array than there are - <code>static</code> fields, then the leftover fields are initialized - with a type-appropriate <code>0</code> or <code>null</code>. 
- </td> -</tr> -</tbody> -</table> - -<h2><code>class_data_item</code></h2> -<h4>referenced from <code>class_def_item</code></h4> -<h4>appears in the <code>data</code> section</h4> -<h4>alignment: none (byte-aligned)</h4> - -<table class="format"> -<thead> -<tr> - <th>Name</th> - <th>Format</th> - <th>Description</th> -</tr> -</thead> -<tbody> -<tr> - <td>static_fields_size</td> - <td>uleb128</td> - <td>the number of static fields defined in this item</td> -</tr> -<tr> - <td>instance_fields_size</td> - <td>uleb128</td> - <td>the number of instance fields defined in this item</td> -</tr> -<tr> - <td>direct_methods_size</td> - <td>uleb128</td> - <td>the number of direct methods defined in this item</td> -</tr> -<tr> - <td>virtual_methods_size</td> - <td>uleb128</td> - <td>the number of virtual methods defined in this item</td> -</tr> -<tr> - <td>static_fields</td> - <td>encoded_field[static_fields_size]</td> - <td>the defined static fields, represented as a sequence of - encoded elements. The fields must be sorted by - <code>field_idx</code> in increasing order. - </td> -</tr> -<tr> - <td>instance_fields</td> - <td>encoded_field[instance_fields_size]</td> - <td>the defined instance fields, represented as a sequence of - encoded elements. The fields must be sorted by - <code>field_idx</code> in increasing order. - </td> -</tr> -<tr> - <td>direct_methods</td> - <td>encoded_method[direct_methods_size]</td> - <td>the defined direct (any of <code>static</code>, <code>private</code>, - or constructor) methods, represented as a sequence of - encoded elements. The methods must be sorted by - <code>method_idx</code> in increasing order. - </td> -</tr> -<tr> - <td>virtual_methods</td> - <td>encoded_method[virtual_methods_size]</td> - <td>the defined virtual (none of <code>static</code>, <code>private</code>, - or constructor) methods, represented as a sequence of - encoded elements. 
This list should <i>not</i> include inherited - methods unless overridden by the class that this item represents. The - methods must be sorted by <code>method_idx</code> in increasing order. - </td> -</tr> -</tbody> -</table> - -<p><b>Note:</b> All elements' <code>field_id</code>s and -<code>method_id</code>s must refer to the same defining class.</p> - -<h3><code>encoded_field</code> Format</h3> - -<table class="format"> -<thead> -<tr> - <th>Name</th> - <th>Format</th> - <th>Description</th> -</tr> -</thead> -<tbody> -<tr> - <td>field_idx_diff</td> - <td>uleb128</td> - <td>index into the <code>field_ids</code> list for the identity of this - field (includes the name and descriptor), represented as a difference - from the index of previous element in the list. The index of the - first element in a list is represented directly. - </td> -</tr> -<tr> - <td>access_flags</td> - <td>uleb128</td> - <td>access flags for the field (<code>public</code>, <code>final</code>, - etc.). See "<code>access_flags</code> Definitions" for details. - </td> -</tr> -</tbody> -</table> - -<h3><code>encoded_method</code> Format</h3> - -<table class="format"> -<thead> -<tr> - <th>Name</th> - <th>Format</th> - <th>Description</th> -</tr> -</thead> -<tbody> -<tr> - <td>method_idx_diff</td> - <td>uleb128</td> - <td>index into the <code>method_ids</code> list for the identity of this - method (includes the name and descriptor), represented as a difference - from the index of previous element in the list. The index of the - first element in a list is represented directly. - </td> -</tr> -<tr> - <td>access_flags</td> - <td>uleb128</td> - <td>access flags for the method (<code>public</code>, <code>final</code>, - etc.). See "<code>access_flags</code> Definitions" for details. 
- </td> -</tr> -<tr> - <td>code_off</td> - <td>uleb128</td> - <td>offset from the start of the file to the code structure for this - method, or <code>0</code> if this method is either <code>abstract</code> - or <code>native</code>. The offset should be to a location in the - <code>data</code> section. The format of the data is specified by - "<code>code_item</code>" below. - </td> -</tr> -</tbody> -</table> - -<h2><code>type_list</code></h2> -<h4>referenced from <code>class_def_item</code> and -<code>proto_id_item</code></h4> -<h4>appears in the <code>data</code> section</h4> -<h4>alignment: 4 bytes</h4> - -<table class="format"> -<thead> -<tr> - <th>Name</th> - <th>Format</th> - <th>Description</th> -</tr> -</thead> -<tbody> -<tr> - <td>size</td> - <td>uint</td> - <td>size of the list, in entries</td> -</tr> -<tr> - <td>list</td> - <td>type_item[size]</td> - <td>elements of the list</td> -</tr> -</tbody> -</table> - -<h3><code>type_item</code> Format</h3> - -<table class="format"> -<thead> -<tr> - <th>Name</th> - <th>Format</th> - <th>Description</th> -</tr> -</thead> -<tbody> -<tr> - <td>type_idx</td> - <td>ushort</td> - <td>index into the <code>type_ids</code> list</td> -</tr> -</tbody> -</table> - -<h2><code>code_item</code></h2> -<h4>referenced from <code>encoded_method</code></h4> -<h4>appears in the <code>data</code> section</h4> -<h4>alignment: 4 bytes</h4> - -<table class="format"> -<thead> -<tr> - <th>Name</th> - <th>Format</th> - <th>Description</th> -</tr> -</thead> -<tbody> -<tr> - <td>registers_size</td> - <td>ushort</td> - <td>the number of registers used by this code</td> -</tr> -<tr> - <td>ins_size</td> - <td>ushort</td> - <td>the number of words of incoming arguments to the method that this - code is for</td> -</tr> -<tr> - <td>outs_size</td> - <td>ushort</td> - <td>the number of words of outgoing argument space required by this - code for method invocation - </td> -</tr> -<tr> - <td>tries_size</td> - <td>ushort</td> - <td>the number of 
<code>try_item</code>s for this instance. If non-zero, - then these appear as the <code>tries</code> array just after the - <code>insns</code> in this instance. - </td> -</tr> -<tr> - <td>debug_info_off</td> - <td>uint</td> - <td>offset from the start of the file to the debug info (line numbers + - local variable info) sequence for this code, or <code>0</code> if - there simply is no information. The offset, if non-zero, should be - to a location in the <code>data</code> section. The format of - the data is specified by "<code>debug_info_item</code>" below. - </td> -</tr> -<tr> - <td>insns_size</td> - <td>uint</td> - <td>size of the instructions list, in 16-bit code units</td> -</tr> -<tr> - <td>insns</td> - <td>ushort[insns_size]</td> - <td>actual array of bytecode. The format of code in an <code>insns</code> - array is specified by the companion document - <a href="dalvik-bytecode.html">"Bytecode for the Dalvik VM"</a>. Note - that though this is defined as an array of <code>ushort</code>, there - are some internal structures that prefer four-byte alignment. Also, - if this happens to be in an endian-swapped file, then the swapping is - <i>only</i> done on individual <code>ushort</code>s and not on the - larger internal structures. - </td> -</tr> -<tr> - <td>padding</td> - <td>ushort <i>(optional)</i> = 0</td> - <td>two bytes of padding to make <code>tries</code> four-byte aligned. - This element is only present if <code>tries_size</code> is non-zero - and <code>insns_size</code> is odd. - </td> -</tr> -<tr> - <td>tries</td> - <td>try_item[tries_size] <i>(optional)</i></td> - <td>array indicating where in the code exceptions are caught and - how to handle them. Elements of the array must be non-overlapping in - range and in order from low to high address. This element is only - present if <code>tries_size</code> is non-zero. 
- </td> -</tr> -<tr> - <td>handlers</td> - <td>encoded_catch_handler_list <i>(optional)</i></td> - <td>bytes representing a list of lists of catch types and associated - handler addresses. Each <code>try_item</code> has a byte-wise offset - into this structure. This element is only present if - <code>tries_size</code> is non-zero. - </td> -</tr> -</tbody> -</table> - -<h3><code>try_item</code> Format </h3> - -<table class="format"> -<thead> -<tr> - <th>Name</th> - <th>Format</th> - <th>Description</th> -</tr> -</thead> -<tbody> -<tr> - <td>start_addr</td> - <td>uint</td> - <td>start address of the block of code covered by this entry. The address - is a count of 16-bit code units to the start of the first covered - instruction. - </td> -</tr> -<tr> - <td>insn_count</td> - <td>ushort</td> - <td>number of 16-bit code units covered by this entry. The last code - unit covered (inclusive) is <code>start_addr + insn_count - 1</code>. - </td> -</tr> -<tr> - <td>handler_off</td> - <td>ushort</td> - <td>offset in bytes from the start of the associated - <code>encoded_catch_handler_list</code> to the - <code>encoded_catch_handler</code> for this entry. This must be an - offset to the start of an <code>encoded_catch_handler</code>.
- </td> -</tr> -</tbody> -</table> - -<h3><code>encoded_catch_handler_list</code> Format</h3> - -<table class="format"> -<thead> -<tr> - <th>Name</th> - <th>Format</th> - <th>Description</th> -</tr> -</thead> -<tbody> -<tr> - <td>size</td> - <td>uleb128</td> - <td>size of this list, in entries</td> -</tr> -<tr> - <td>list</td> - <td>encoded_catch_handler[handlers_size]</td> - <td>actual list of handler lists, represented directly (not as offsets), - and concatenated sequentially</td> -</tr> -</tbody> -</table> - -<h3><code>encoded_catch_handler</code> Format</h3> - -<table class="format"> -<thead> -<tr> - <th>Name</th> - <th>Format</th> - <th>Description</th> -</tr> -</thead> -<tbody> -<tr> - <td>size</td> - <td>sleb128</td> - <td>number of catch types in this list. If non-positive, then this is - the negative of the number of catch types, and the catches are followed - by a catch-all handler. For example: A <code>size</code> of <code>0</code> - means that there is a catch-all but no explicitly typed catches. - A <code>size</code> of <code>2</code> means that there are two explicitly - typed catches and no catch-all. And a <code>size</code> of <code>-1</code> - means that there is one typed catch along with a catch-all. - </td> -</tr> -<tr> - <td>handlers</td> - <td>encoded_type_addr_pair[abs(size)]</td> - <td>stream of <code>abs(size)</code> encoded items, one for each caught - type, in the order that the types should be tested. - </td> -</tr> -<tr> - <td>catch_all_addr</td> - <td>uleb128 <i>(optional)</i></td> - <td>bytecode address of the catch-all handler. This element is only - present if <code>size</code> is non-positive. 
- </td> -</tr> -</tbody> -</table> - -<h3><code>encoded_type_addr_pair</code> Format</h3> - -<table class="format"> -<thead> -<tr> - <th>Name</th> - <th>Format</th> - <th>Description</th> -</tr> -</thead> -<tbody> -<tr> - <td>type_idx</td> - <td>uleb128</td> - <td>index into the <code>type_ids</code> list for the type of the - exception to catch - </td> -</tr> -<tr> - <td>addr</td> - <td>uleb128</td> - <td>bytecode address of the associated exception handler</td> -</tr> -</tbody> -</table> - -<h2><code>debug_info_item</code></h2> -<h4>referenced from <code>code_item</code></h4> -<h4>appears in the <code>data</code> section</h4> -<h4>alignment: none (byte-aligned)</h4> - -<p>Each <code>debug_info_item</code> defines a DWARF3-inspired byte-coded -state machine that, when interpreted, emits the positions -table and (potentially) the local variable information for a -<code>code_item</code>. The sequence begins with a variable-length -header (the length of which depends on the number of method -parameters), is followed by the state machine bytecodes, and ends -with a <code>DBG_END_SEQUENCE</code> byte.</p> - -<p>The state machine consists of five registers. The -<code>address</code> register represents the instruction offset in the -associated <code>insns_item</code> in 16-bit code units. The -<code>address</code> register starts at <code>0</code> at the beginning of each -<code>debug_info</code> sequence and must only monotonically increase. -The <code>line</code> register represents what source line number -should be associated with the next positions table entry emitted by -the state machine. It is initialized in the sequence header, and may -change in positive or negative directions but must never be less than -<code>1</code>. The <code>source_file</code> register represents the -source file that the line number entries refer to. It is initialized to -the value of <code>source_file_idx</code> in <code>class_def_item</code>.
-The other two variables, <code>prologue_end</code> and -<code>epilogue_begin</code>, are boolean flags (initialized to -<code>false</code>) that indicate whether the next position emitted -should be considered a method prologue or epilogue. The state machine -must also track the name and type of the last local variable live in -each register for the <code>DBG_RESTART_LOCAL</code> code.</p> - -<p>The header is as follows:</p> - -<table class="format"> -<thead> -<tr> - <th>Name</th> - <th>Format</th> - <th>Description</th> -</tr> -</thead> -<tbody> -<tr> - <td>line_start</td> - <td>uleb128</td> - <td>the initial value for the state machine's <code>line</code> register. - Does not represent an actual positions entry. - </td> -</tr> -<tr> - <td>parameters_size</td> - <td>uleb128</td> - <td>the number of parameter names that are encoded. There should be - one per method parameter, excluding an instance method's <code>this</code>, - if any. - </td> -</tr> -<tr> - <td>parameter_names</td> - <td>uleb128p1[parameters_size]</td> - <td>string index of the method parameter name. An encoded value of - <code>NO_INDEX</code> indicates that no name - is available for the associated parameter. The type descriptor - and signature are implied from the method descriptor and signature. 
- </td> -</tr> -</tbody> -</table> - -<p>The byte code values are as follows:</p> - -<table class="debugByteCode"> -<thead> -<tr> - <th>Name</th> - <th>Value</th> - <th>Format</th> - <th>Arguments</th> - <th>Description</th> -</tr> -</thead> -<tbody> -<tr> - <td>DBG_END_SEQUENCE</td> - <td>0x00</td> - <td></td> - <td><i>(none)</i></td> - <td>terminates a debug info sequence for a <code>code_item</code></td> -</tr> -<tr> - <td>DBG_ADVANCE_PC</td> - <td>0x01</td> - <td>uleb128 addr_diff</td> - <td><code>addr_diff</code>: amount to add to address register</td> - <td>advances the address register without emitting a positions entry</td> -</tr> -<tr> - <td>DBG_ADVANCE_LINE</td> - <td>0x02</td> - <td>sleb128 line_diff</td> - <td><code>line_diff</code>: amount to change line register by</td> - <td>advances the line register without emitting a positions entry</td> -</tr> -<tr> - <td>DBG_START_LOCAL</td> - <td>0x03</td> - <td>uleb128 register_num<br/> - uleb128p1 name_idx<br/> - uleb128p1 type_idx - </td> - <td><code>register_num</code>: register that will contain local<br/> - <code>name_idx</code>: string index of the name<br/> - <code>type_idx</code>: type index of the type - </td> - <td>introduces a local variable at the current address. Either - <code>name_idx</code> or <code>type_idx</code> may be - <code>NO_INDEX</code> to indicate that that value is unknown. - </td> -</tr> -<tr> - <td>DBG_START_LOCAL_EXTENDED</td> - <td>0x04</td> - <td>uleb128 register_num<br/> - uleb128p1 name_idx<br/> - uleb128p1 type_idx<br/> - uleb128p1 sig_idx - </td> - <td><code>register_num</code>: register that will contain local<br/> - <code>name_idx</code>: string index of the name<br/> - <code>type_idx</code>: type index of the type<br/> - <code>sig_idx</code>: string index of the type signature - </td> - <td>introduces a local with a type signature at the current address. 
- Any of <code>name_idx</code>, <code>type_idx</code>, or - <code>sig_idx</code> may be <code>NO_INDEX</code> - to indicate that that value is unknown. (If <code>sig_idx</code> is - <code>-1</code>, though, the same data could be represented more - efficiently using the opcode <code>DBG_START_LOCAL</code>.) - <p><b>Note:</b> See the discussion under - "<code>dalvik.annotation.Signature</code>" below for caveats about - handling signatures.</p> - </td> -</tr> -<tr> - <td>DBG_END_LOCAL</td> - <td>0x05</td> - <td>uleb128 register_num</td> - <td><code>register_num</code>: register that contained local</td> - <td>marks a currently-live local variable as out of scope at the current - address - </td> -</tr> -<tr> - <td>DBG_RESTART_LOCAL</td> - <td>0x06</td> - <td>uleb128 register_num</td> - <td><code>register_num</code>: register to restart</td> - <td>re-introduces a local variable at the current address. The name - and type are the same as the last local that was live in the specified - register. - </td> -</tr> -<tr> - <td>DBG_SET_PROLOGUE_END</td> - <td>0x07</td> - <td></td> - <td><i>(none)</i></td> - <td>sets the <code>prologue_end</code> state machine register, - indicating that the next position entry that is added should be - considered the end of a method prologue (an appropriate place for - a method breakpoint). The <code>prologue_end</code> register is - cleared by any special (<code>>= 0x0a</code>) opcode. - </td> -</tr> -<tr> - <td>DBG_SET_EPILOGUE_BEGIN</td> - <td>0x08</td> - <td></td> - <td><i>(none)</i></td> - <td>sets the <code>epilogue_begin</code> state machine register, - indicating that the next position entry that is added should be - considered the beginning of a method epilogue (an appropriate place - to suspend execution before method exit). - The <code>epilogue_begin</code> register is cleared by any special - (<code>>= 0x0a</code>) opcode. 
- </td> -</tr> -<tr> - <td>DBG_SET_FILE</td> - <td>0x09</td> - <td>uleb128p1 name_idx</td> - <td><code>name_idx</code>: string index of source file name; - <code>NO_INDEX</code> if unknown - </td> - <td>indicates that all subsequent line number entries make reference to this - source file name, instead of the default name specified in - <code>class_def_item</code> - </td> -</tr> -<tr> - <td><i>Special Opcodes</i></td> - <!-- When updating the range below, make sure to search for other - instances of 0x0a in this section. --> - <td>0x0a…0xff</td> - <td></td> - <td><i>(none)</i></td> - <td>advances the <code>line</code> and <code>address</code> registers, - emits a position entry, and clears <code>prologue_end</code> and - <code>epilogue_begin</code>. See below for description. - </td> -</tr> -</tbody> -</table> - -<h3>Special Opcodes</h3> - -<p>Opcodes with values between <code>0x0a</code> and <code>0xff</code> -(inclusive) move both the <code>line</code> and <code>address</code> -registers by a small amount and then emit a new position table entry. -The formulas for the increments are as follows:</p> - -<pre> -DBG_FIRST_SPECIAL = 0x0a // the smallest special opcode -DBG_LINE_BASE = -4 // the smallest line number increment -DBG_LINE_RANGE = 15 // the number of line increments represented - -adjusted_opcode = opcode - DBG_FIRST_SPECIAL - -line += DBG_LINE_BASE + (adjusted_opcode % DBG_LINE_RANGE) -address += (adjusted_opcode / DBG_LINE_RANGE) -</pre> - -<h2><code>annotations_directory_item</code></h2> -<h4>referenced from <code>class_def_item</code></h4> -<h4>appears in the <code>data</code> section</h4> -<h4>alignment: 4 bytes</h4> - -<table class="format"> -<thead> -<tr> - <th>Name</th> - <th>Format</th> - <th>Description</th> -</tr> -</thead> -<tbody> -<tr> - <td>class_annotations_off</td> - <td>uint</td> - <td>offset from the start of the file to the annotations made directly - on the class, or <code>0</code> if the class has no direct annotations.
- The offset, if non-zero, should be to a location in the - <code>data</code> section. The format of the data is specified - by "<code>annotation_set_item</code>" below. - </td> -</tr> -<tr> - <td>fields_size</td> - <td>uint</td> - <td>count of fields annotated by this item</td> -</tr> -<tr> - <td>annotated_methods_size</td> - <td>uint</td> - <td>count of methods annotated by this item</td> -</tr> -<tr> - <td>annotated_parameters_size</td> - <td>uint</td> - <td>count of method parameter lists annotated by this item</td> -</tr> -<tr> - <td>field_annotations</td> - <td>field_annotation[fields_size] <i>(optional)</i></td> - <td>list of associated field annotations. The elements of the list must - be sorted in increasing order, by <code>field_idx</code>. - </td> -</tr> -<tr> - <td>method_annotations</td> - <td>method_annotation[methods_size] <i>(optional)</i></td> - <td>list of associated method annotations. The elements of the list must - be sorted in increasing order, by <code>method_idx</code>. - </td> -</tr> -<tr> - <td>parameter_annotations</td> - <td>parameter_annotation[parameters_size] <i>(optional)</i></td> - <td>list of associated method parameter annotations. The elements of the - list must be sorted in increasing order, by <code>method_idx</code>. - </td> -</tr> -</tbody> -</table> - -<p><b>Note:</b> All elements' <code>field_id</code>s and -<code>method_id</code>s must refer to the same defining class.</p> - -<h3><code>field_annotation</code> Format</h3> - -<table class="format"> -<thead> -<tr> - <th>Name</th> - <th>Format</th> - <th>Description</th> -</tr> -</thead> -<tbody> -<tr> - <td>field_idx</td> - <td>uint</td> - <td>index into the <code>field_ids</code> list for the identity of the - field being annotated - </td> -</tr> -<tr> - <td>annotations_off</td> - <td>uint</td> - <td>offset from the start of the file to the list of annotations for - the field. The offset should be to a location in the <code>data</code> - section. 
The format of the data is specified by - "<code>annotation_set_item</code>" below. - </td> -</tr> -</tbody> -</table> - -<h3><code>method_annotation</code> Format</h3> - -<table class="format"> -<thead> -<tr> - <th>Name</th> - <th>Format</th> - <th>Description</th> -</tr> -</thead> -<tbody> -<tr> - <td>method_idx</td> - <td>uint</td> - <td>index into the <code>method_ids</code> list for the identity of the - method being annotated - </td> -</tr> -<tr> - <td>annotations_off</td> - <td>uint</td> - <td>offset from the start of the file to the list of annotations for - the method. The offset should be to a location in the - <code>data</code> section. The format of the data is specified by - "<code>annotation_set_item</code>" below. - </td> -</tr> -</tbody> -</table> - -<h3><code>parameter_annotation</code> Format</h3> - -<table class="format"> -<thead> -<tr> - <th>Name</th> - <th>Format</th> - <th>Description</th> -</tr> -</thead> -<tbody> -<tr> - <td>method_idx</td> - <td>uint</td> - <td>index into the <code>method_ids</code> list for the identity of the - method whose parameters are being annotated - </td> -</tr> -<tr> - <td>annotations_off</td> - <td>uint</td> - <td>offset from the start of the file to the list of annotations for - the method parameters. The offset should be to a location in the - <code>data</code> section. The format of the data is specified by - "<code>annotation_set_ref_list</code>" below.
- </td> -</tr> -</tbody> -</table> - -<h2><code>annotation_set_ref_list</code></h2> -<h4>referenced from <code>parameter_annotations_item</code></h4> -<h4>appears in the <code>data</code> section</h4> -<h4>alignment: 4 bytes</h4> - -<table class="format"> -<thead> -<tr> - <th>Name</th> - <th>Format</th> - <th>Description</th> -</tr> -</thead> -<tbody> -<tr> - <td>size</td> - <td>uint</td> - <td>size of the list, in entries</td> -</tr> -<tr> - <td>list</td> - <td>annotation_set_ref_item[size]</td> - <td>elements of the list</td> -</tr> -</tbody> -</table> - -<h3><code>annotation_set_ref_item</code> Format</h3> - -<table class="format"> -<thead> -<tr> - <th>Name</th> - <th>Format</th> - <th>Description</th> -</tr> -</thead> -<tbody> -<tr> - <td>annotations_off</td> - <td>uint</td> - <td>offset from the start of the file to the referenced annotation set - or <code>0</code> if there are no annotations for this element. - The offset, if non-zero, should be to a location in the <code>data</code> - section. The format of the data is specified by - "<code>annotation_set_item</code>" below. - </td> -</tr> -</tbody> -</table> - -<h2><code>annotation_set_item</code></h2> -<h4>referenced from <code>annotations_directory_item</code>, -<code>field_annotations_item</code>, -<code>method_annotations_item</code>, and -<code>annotation_set_ref_item</code></h4> -<h4>appears in the <code>data</code> section</h4> -<h4>alignment: 4 bytes</h4> - -<table class="format"> -<thead> -<tr> - <th>Name</th> - <th>Format</th> - <th>Description</th> -</tr> -</thead> -<tbody> -<tr> - <td>size</td> - <td>uint</td> - <td>size of the set, in entries</td> -</tr> -<tr> - <td>entries</td> - <td>annotation_off_item[size]</td> - <td>elements of the set. The elements must be sorted in increasing order, - by <code>type_idx</code>. 
- </td> -</tr> -</tbody> -</table> - -<h3><code>annotation_off_item</code> Format</h3> - -<table class="format"> -<thead> -<tr> - <th>Name</th> - <th>Format</th> - <th>Description</th> -</tr> -</thead> -<tbody> -<tr> - <td>annotation_off</td> - <td>uint</td> - <td>offset from the start of the file to an annotation. - The offset should be to a location in the <code>data</code> section, - and the format of the data at that location is specified by - "<code>annotation_item</code>" below. - </td> -</tr> -</tbody> -</table> - - -<h2><code>annotation_item</code></h2> -<h4>referenced from <code>annotation_set_item</code></h4> -<h4>appears in the <code>data</code> section</h4> -<h4>alignment: none (byte-aligned)</h4> - -<table class="format"> -<thead> -<tr> - <th>Name</th> - <th>Format</th> - <th>Description</th> -</tr> -</thead> -<tbody> -<tr> - <td>visibility</td> - <td>ubyte</td> - <td>intended visibility of this annotation (see below)</td> -</tr> -<tr> - <td>annotation</td> - <td>encoded_annotation</td> - <td>encoded annotation contents, in the format described by - "<code>encoded_annotation</code> Format" under - "<code>encoded_value</code> Encoding" above. 
- </td> -</tr> -</tbody> -</table> - -<h3>Visibility values</h3> - -<p>These are the options for the <code>visibility</code> field in an -<code>annotation_item</code>:</p> - -<table class="format"> -<thead> -<tr> - <th>Name</th> - <th>Value</th> - <th>Description</th> -</tr> -</thead> -<tbody> -<tr> - <td>VISIBILITY_BUILD</td> - <td>0x00</td> - <td>intended only to be visible at build time (e.g., during compilation - of other code) - </td> -</tr> -<tr> - <td>VISIBILITY_RUNTIME</td> - <td>0x01</td> - <td>intended to be visible at runtime</td> -</tr> -<tr> - <td>VISIBILITY_SYSTEM</td> - <td>0x02</td> - <td>intended to be visible at runtime, but only to the underlying system - (and not to regular user code) - </td> -</tr> -</tbody> -</table> - -<h2><code>encoded_array_item</code></h2> -<h4>referenced from <code>class_def_item</code></h4> -<h4>appears in the <code>data</code> section</h4> -<h4>alignment: none (byte-aligned)</h4> - -<table class="format"> -<thead> -<tr> - <th>Name</th> - <th>Format</th> - <th>Description</th> -</tr> -</thead> -<tbody> -<tr> - <td>value</td> - <td>encoded_array</td> - <td>bytes representing the encoded array value, in the format specified - by "<code>encoded_array</code> Format" under "<code>encoded_value</code> - Encoding" above. - </td> -</tr> -</tbody> -</table> - -<h1>System Annotations</h1> - -<p>System annotations are used to represent various pieces of reflective -information about classes (and methods and fields). This information is -generally only accessed indirectly by client (non-system) code.</p> - -<p>System annotations are represented in <code>.dex</code> files as -annotations with visibility set to <code>VISIBILITY_SYSTEM</code>. 
- -<h2><code>dalvik.annotation.AnnotationDefault</code></h2> -<h4>appears on methods in annotation interfaces</h4> - -<p>An <code>AnnotationDefault</code> annotation is attached to each -annotation interface which wishes to indicate default bindings.</p> - -<table class="format"> -<thead> -<tr> - <th>Name</th> - <th>Format</th> - <th>Description</th> -</tr> -</thead> -<tbody> -<tr> - <td>value</td> - <td>Annotation</td> - <td>the default bindings for this annotation, represented as an annotation - of this type. The annotation need not include all names defined by the - annotation; missing names simply do not have defaults. - </td> -</tr> -</tbody> -</table> - -<h2><code>dalvik.annotation.EnclosingClass</code></h2> -<h4>appears on classes</h4> - -<p>An <code>EnclosingClass</code> annotation is attached to each class -which is either defined as a member of another class, per se, or is -anonymous but not defined within a method body (e.g., a synthetic -inner class). Every class that has this annotation must also have an -<code>InnerClass</code> annotation. Additionally, a class must not have -both an <code>EnclosingClass</code> and an -<code>EnclosingMethod</code> annotation.</p> - -<table class="format"> -<thead> -<tr> - <th>Name</th> - <th>Format</th> - <th>Description</th> -</tr> -</thead> -<tbody> -<tr> - <td>value</td> - <td>Class</td> - <td>the class which most closely lexically scopes this class</td> -</tr> -</tbody> -</table> - -<h2><code>dalvik.annotation.EnclosingMethod</code></h2> -<h4>appears on classes</h4> - -<p>An <code>EnclosingMethod</code> annotation is attached to each class -which is defined inside a method body. Every class that has this -annotation must also have an <code>InnerClass</code> annotation. 
-Additionally, a class must not have both an <code>EnclosingClass</code> -and an <code>EnclosingMethod</code> annotation.</p> - -<table class="format"> -<thead> -<tr> - <th>Name</th> - <th>Format</th> - <th>Description</th> -</tr> -</thead> -<tbody> -<tr> - <td>value</td> - <td>Method</td> - <td>the method which most closely lexically scopes this class</td> -</tr> -</tbody> -</table> - -<h2><code>dalvik.annotation.InnerClass</code></h2> -<h4>appears on classes</h4> - -<p>An <code>InnerClass</code> annotation is attached to each class -which is defined in the lexical scope of another class's definition. -Any class which has this annotation must also have <i>either</i> an -<code>EnclosingClass</code> annotation <i>or</i> an -<code>EnclosingMethod</code> annotation.</p> - -<table class="format"> -<thead> -<tr> - <th>Name</th> - <th>Format</th> - <th>Description</th> -</tr> -</thead> -<tbody> -<tr> - <td>name</td> - <td>String</td> - <td>the originally declared simple name of this class (not including any - package prefix). If this class is anonymous, then the name is - <code>null</code>. - </td> -</tr> -<tr> - <td>accessFlags</td> - <td>int</td> - <td>the originally declared access flags of the class (which may differ - from the effective flags because of a mismatch between the execution - models of the source language and target virtual machine) - </td> -</tr> -</tbody> -</table> - -<h2><code>dalvik.annotation.MemberClasses</code></h2> -<h4>appears on classes</h4> - -<p>A <code>MemberClasses</code> annotation is attached to each class -which declares member classes. 
(A member class is a direct inner class -that has a name.)</p> - -<table class="format"> -<thead> -<tr> - <th>Name</th> - <th>Format</th> - <th>Description</th> -</tr> -</thead> -<tbody> -<tr> - <td>value</td> - <td>Class[]</td> - <td>array of the member classes</td> -</tr> -</tbody> -</table> - -<h2><code>dalvik.annotation.Signature</code></h2> -<h4>appears on classes, fields, and methods</h4> - -<p>A <code>Signature</code> annotation is attached to each class, -field, or method which is defined in terms of a more complicated type -than is representable by a <code>type_id_item</code>. The -<code>.dex</code> format does not define the format for signatures; it -is merely meant to be able to represent whatever signatures a source -language requires for successful implementation of that language's -semantics. As such, signatures are not generally parsed (or verified) -by virtual machine implementations. The signatures simply get handed -off to higher-level APIs and tools (such as debuggers). Any use of a -signature, therefore, should be written so as not to make any -assumptions about only receiving valid signatures, explicitly guarding -itself against the possibility of coming across a syntactically -invalid signature.</p> - -<p>Because signature strings tend to have a lot of duplicated content, -a <code>Signature</code> annotation is defined as an <i>array</i> of -strings, where duplicated elements naturally refer to the same -underlying data, and the signature is taken to be the concatenation of -all the strings in the array. 
There are no rules about how to pull -apart a signature into separate strings; that is entirely up to the -tools that generate <code>.dex</code> files.</p> - -<table class="format"> -<thead> -<tr> - <th>Name</th> - <th>Format</th> - <th>Description</th> -</tr> -</thead> -<tbody> -<tr> - <td>value</td> - <td>String[]</td> - <td>the signature of this class or member, as an array of strings that - is to be concatenated together</td> -</tr> -</tbody> -</table> - -<h2><code>dalvik.annotation.Throws</code></h2> -<h4>appears on methods</h4> - -<p>A <code>Throws</code> annotation is attached to each method which is -declared to throw one or more exception types.</p> - -<table class="format"> -<thead> -<tr> - <th>Name</th> - <th>Format</th> - <th>Description</th> -</tr> -</thead> -<tbody> -<tr> - <td>value</td> - <td>Class[]</td> - <td>the array of exception types thrown</td> -</tr> -</tbody> -</table> - -</body> -</html> diff --git a/docs/instruction-formats.css b/docs/instruction-formats.css deleted file mode 100644 index a2dc42f9d..000000000 --- a/docs/instruction-formats.css +++ /dev/null @@ -1,129 +0,0 @@ -h1 { - font-family: serif; - color: #222266; -} - -h2 { - font-family: serif; - border-top-style: solid; - border-top-width: 2px; - border-color: #ccccdd; - padding-top: 12px; - margin-top: 48px; - margin-bottom: 2px; - color: #222266; -} - -h3 { - font-family: serif; - color: #222266; -} - -@media print { - table { - font-size: 8pt; - } -} - -@media screen { - table { - font-size: 10pt; - } -} - -table th { - font-family: sans-serif; - background: #aaaaff; -} - -table { - border-collapse: collapse; -} - -table td { - font-family: sans-serif; - border-top-style: solid; - border-bottom-style: solid; - border-width: 1px; - border-color: #aaaaff; - padding-top: 4px; - padding-bottom: 4px; - padding-left: 2px; - padding-right: 2px; - background: #eeeeff; -} - - -/* the mnemonic guide */ - -table.letters { - margin-top: 24px; - margin-bottom: 24px; - margin-left: 48px; - 
margin-right: 48px; -} - -table.letters td:first-child { - font-family: monospace; - width: 10%; - text-align: center; -} - -table.letters td:first-child + td { - width: 10%; - text-align: center; -} - -table.letters td:first-child + td + td { - width: 80%; -} - - -/* the formats, per se */ - -table.format { - background: #aaaaaa; - border-collapse: collapse; - margin-top: 24px; - margin-bottom: 24px; - margin-left: 48px; - margin-right: 48px; -} - -table.format td { - font-family: monospace; -} - -table.format td + td i { - font-family: sans-serif; -} - -table.format td sub { - font-family: sans-serif; -} - -table.format td sub { - font-family: sans-serif; - font-style: italic; - font-size: 70% -} - -table.format th:first-child { - width: 28%; -} - -table.format th:first-child + th { - width: 5%; -} - -table.format th:first-child + th + th { - width: 45%; -} - -table.format th:first-child + th + th + th { - width: 22%; -} - -table.format p { - margin-bottom: 0pt; -}
\ No newline at end of file diff --git a/docs/instruction-formats.html b/docs/instruction-formats.html deleted file mode 100644 index f320a2d13..000000000 --- a/docs/instruction-formats.html +++ /dev/null @@ -1,461 +0,0 @@ -<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> - -<html> - -<head> -<title>Dalvik VM Instruction Formats</title> -<link rel=stylesheet href="instruction-formats.css"> -</head> - -<body> - -<h1>Dalvik VM Instruction Formats</h1> -<p>Copyright © 2007 The Android Open Source Project - -<h2>Introduction and Overview</h2> - -<p>This document lists the instruction formats used by Dalvik bytecode -and is meant to be used in conjunction with the -<a href="dalvik-bytecode.html">bytecode reference document</a>.</p> - -<h3>Bitwise descriptions</h3> - -<p>The first column in the format table lists the bitwise layout of -the format. It consists of one or more space-separated "words" each of -which describes a 16-bit code unit. Each character in a word -represents four bits, read from high bits to low, with vertical bars -("<code>|</code>") interspersed to aid in reading. Uppercase letters -in sequence from "<code>A</code>" are used to indicate fields within -the format (which then get defined further by the syntax column). The term -"<code>op</code>" is used to indicate the position of an eight-bit -opcode within the format. A slashed zero -("<code>Ø</code>") is used to indicate that all bits must be -zero in the indicated position.</p> - -<p>For the most part, lettering proceeds from earlier code units to -later code units, and low-order to high-order within a code unit. -However, there are a few exceptions to this general rule, which are -done in order to make the naming of similar-meaning parts be the same -across different instruction formats. 
These cases are noted explicitly -in the format descriptions.</p> - -<p>For example, the format "<code>B|A|<i>op</i> CCCC</code>" indicates -that the format consists of two 16-bit code units. The first word -consists of the opcode in the low eight bits and a pair of four-bit -values in the high eight bits; and the second word consists of a single -16-bit value.</p> - -<h3>Format IDs</h3> - -<p>The second column in the format table indicates the short identifier -for the format, which is used in other documents and in code to identify -the format.</p> - -<p>Most format IDs consist of three characters, two digits followed by a -letter. The first digit indicates the number of 16-bit code units in the -format. The second digit indicates the maximum number of registers that the -format contains (maximum, since some formats can accommodate a variable -number of registers), with the special designation "<code>r</code>" indicating -that a range of registers is encoded. The final letter semi-mnemonically -indicates the type of any extra data encoded by the format. For example, -format "<code>21t</code>" is of length two, contains one register reference, -and additionally contains a branch target.</p> - -<p>Suggested static linking formats have an additional -"<code>s</code>" suffix, making them four characters total. Similarly, -suggested "inline" linking formats have an additional "<code>i</code>" -suffix. (In this context, inline linking is like static linking, -except with more direct ties into a virtual machine's implementation.) -Finally, a couple of oddball suggested formats (e.g., -"<code>20bc</code>") include two pieces of data which are both -represented in the format ID.</p> - -<p>The full list of typecode letters is as follows. 
Note that some -forms have different sizes, depending on the format:</p> - -<table class="letters"> -<thead> -<tr> - <th>Mnemonic</th> - <th>Bit Sizes</th> - <th>Meaning</th> -</tr> -</thead> -<tbody> -<tr> - <td>b</td> - <td>8</td> - <td>immediate signed <b>b</b>yte</td> -</tr> -<tr> - <td>c</td> - <td>16, 32</td> - <td><b>c</b>onstant pool index</td> -</tr> -<tr> - <td>f</td> - <td>16</td> - <td>inter<b>f</b>ace constants (only used in statically linked formats) - </td> -</tr> -<tr> - <td>h</td> - <td>16</td> - <td>immediate signed <b>h</b>at (high-order bits of a 32- or 64-bit - value; low-order bits are all <code>0</code>) - </td> -</tr> -<tr> - <td>i</td> - <td>32</td> - <td>immediate signed <b>i</b>nt, or 32-bit float</td> -</tr> -<tr> - <td>l</td> - <td>64</td> - <td>immediate signed <b>l</b>ong, or 64-bit double</td> -</tr> -<tr> - <td>m</td> - <td>16</td> - <td><b>m</b>ethod constants (only used in statically linked formats)</td> -</tr> -<tr> - <td>n</td> - <td>4</td> - <td>immediate signed <b>n</b>ibble</td> -</tr> -<tr> - <td>s</td> - <td>16</td> - <td>immediate signed <b>s</b>hort</td> -</tr> -<tr> - <td>t</td> - <td>8, 16, 32</td> - <td>branch <b>t</b>arget</td> -</tr> -<tr> - <td>x</td> - <td>0</td> - <td>no additional data</td> -</tr> -</tbody> -</table> - -<h3>Syntax</h3> - -<p>The third column of the format table indicates the human-oriented -syntax for instructions which use the indicated format. Each instruction -starts with the named opcode and is optionally followed by one or -more arguments, themselves separated with commas.</p> - -<p>Wherever an argument refers to a field from the first column, the -letter for that field is indicated in the syntax, repeated once for -each four bits of the field. For example, an eight-bit field labeled -"<code>BB</code>" in the first column would also be labeled -"<code>BB</code>" in the syntax column.</p> - -<p>Arguments which name a register have the form "<code>v<i>X</i></code>". 
-The prefix "<code>v</code>" was chosen instead of the more common -"<code>r</code>" exactly to avoid conflicting with (non-virtual) architectures -on which a Dalvik virtual machine might be implemented which themselves -use the prefix "<code>r</code>" for their registers. (That is, this -decision makes it possible to talk about both virtual and real registers -together without the need for circumlocution.)</p> - -<p>Arguments which indicate a literal value have the form -"<code>#+<i>X</i></code>". Some formats indicate literals that only -have non-zero bits in their high-order bits; for these, the zeroes -are represented explicitly in the syntax, even though they do not -appear in the bitwise representation.</p> - -<p>Arguments which indicate a relative instruction address offset have the -form "<code>+<i>X</i></code>".</p> - -<p>Arguments which indicate a literal constant pool index have the form -"<code><i>kind</i>@<i>X</i></code>", where "<code><i>kind</i></code>" -indicates which constant pool is being referred to. Each opcode that -uses such a format explicitly allows only one kind of constant; see -the opcode reference to figure out the correspondence. The four -kinds of constant pool are "<code>string</code>" (string pool index), -"<code>type</code>" (type pool index), "<code>field</code>" (field -pool index), and "<code>meth</code>" (method pool index).</p> - -<p>Similar to the representation of constant pool indices, there are -also suggested (optional) forms that indicate prelinked offsets or -indices. 
There are two types of suggested prelinked value: vtable offsets -(indicated as "<code>vtaboff</code>") and field offsets (indicated as -"<code>fieldoff</code>").</p> - -<p>In the cases where a format value isn't explicitly part of the syntax -but instead picks a variant, each variant is listed with the prefix -"<code>[<i>X</i>=<i>N</i>]</code>" (e.g., "<code>[A=2]</code>") to indicate -the correspondence.</p> - -<h2>The Formats</h2> - -<table class="format"> -<thead> -<tr> - <th>Format</th> - <th>ID</th> - <th>Syntax</th> - <th>Notable Opcodes Covered</th> -</tr> -</thead> -<tbody> -<tr> - <td><i>N/A</i></td> - <td>00x</td> - <td><i><code>N/A</code></i></td> - <td><i>pseudo-format used for unused opcodes; suggested for use as the - nominal format for a breakpoint opcode</i></td> -</tr> -<tr> - <td>ØØ|<i>op</i></td> - <td>10x</td> - <td><i><code>op</code></i></td> - <td> </td> -</tr> -<tr> - <td rowspan="2">B|A|<i>op</i></td> - <td>12x</td> - <td><i><code>op</code></i> vA, vB</td> - <td> </td> -</tr> -<tr> - <td>11n</td> - <td><i><code>op</code></i> vA, #+B</td> - <td> </td> -</tr> -<tr> - <td rowspan="2">AA|<i>op</i></td> - <td>11x</td> - <td><i><code>op</code></i> vAA</td> - <td> </td> -</tr> -<tr> - <td>10t</td> - <td><i><code>op</code></i> +AA</td> - <td>goto</td> -</tr> -<tr> - <td>ØØ|<i>op</i> AAAA</td> - <td>20t</td> - <td><i><code>op</code></i> +AAAA</td> - <td>goto/16</td> -</tr> -<tr> - <td>AA|<i>op</i> BBBB</td> - <td>20bc</td> - <td><i><code>op</code></i> AA, kind@BBBB</td> - <td><i>suggested format for statically determined verification errors; - A is the type of error and B is an index into a type-appropriate - table (e.g. 
method references for a no-such-method error)</i></td> -</tr> -<tr> - <td rowspan="5">AA|<i>op</i> BBBB</td> - <td>22x</td> - <td><i><code>op</code></i> vAA, vBBBB</td> - <td> </td> -</tr> -<tr> - <td>21t</td> - <td><i><code>op</code></i> vAA, +BBBB</td> - <td> </td> -</tr> -<tr> - <td>21s</td> - <td><i><code>op</code></i> vAA, #+BBBB</td> - <td> </td> -</tr> -<tr> - <td>21h</td> - <td><i><code>op</code></i> vAA, #+BBBB0000<br/> - <i><code>op</code></i> vAA, #+BBBB000000000000 - </td> - <td> </td> -</tr> -<tr> - <td>21c</td> - <td><i><code>op</code></i> vAA, type@BBBB<br/> - <i><code>op</code></i> vAA, field@BBBB<br/> - <i><code>op</code></i> vAA, string@BBBB - </td> - <td>check-cast<br/> - const-class<br/> - const-string - </td> -</tr> -<tr> - <td rowspan="2">AA|<i>op</i> CC|BB</td> - <td>23x</td> - <td><i><code>op</code></i> vAA, vBB, vCC</td> - <td> </td> -</tr> -<tr> - <td>22b</td> - <td><i><code>op</code></i> vAA, vBB, #+CC</td> - <td> </td> -</tr> -<tr> - <td rowspan="4">B|A|<i>op</i> CCCC</td> - <td>22t</td> - <td><i><code>op</code></i> vA, vB, +CCCC</td> - <td> </td> -</tr> -<tr> - <td>22s</td> - <td><i><code>op</code></i> vA, vB, #+CCCC</td> - <td> </td> -</tr> -<tr> - <td>22c</td> - <td><i><code>op</code></i> vA, vB, type@CCCC<br/> - <i><code>op</code></i> vA, vB, field@CCCC - </td> - <td>instance-of</td> -</tr> -<tr> - <td>22cs</td> - <td><i><code>op</code></i> vA, vB, fieldoff@CCCC</td> - <td><i>suggested format for statically linked field access instructions of - format 22c</i> - </td> -</tr> -<tr> - <td>ØØ|<i>op</i> AAAA<sub>lo</sub> AAAA<sub>hi</sub></td> - <td>30t</td> - <td><i><code>op</code></i> +AAAAAAAA</td> - <td>goto/32</td> -</tr> -<tr> - <td>ØØ|<i>op</i> AAAA BBBB</td> - <td>32x</td> - <td><i><code>op</code></i> vAAAA, vBBBB</td> - <td> </td> -</tr> -<tr> - <td rowspan="3">AA|<i>op</i> BBBB<sub>lo</sub> BBBB<sub>hi</sub></td> - <td>31i</td> - <td><i><code>op</code></i> vAA, #+BBBBBBBB</td> - <td> </td> -</tr> -<tr> - <td>31t</td> - 
<td><i><code>op</code></i> vAA, +BBBBBBBB</td> - <td> </td> -</tr> -<tr> - <td>31c</td> - <td><i><code>op</code></i> vAA, string@BBBBBBBB</td> - <td>const-string/jumbo</td> -</tr> -<tr> - <td rowspan="3">A|G|<i>op</i> BBBB F|E|D|C</td> - <td>35c</td> - <td><i>[<code>A=5</code>] <code>op</code></i> {vC, vD, vE, vF, vG}, - meth@BBBB<br/> - <i>[<code>A=5</code>] <code>op</code></i> {vC, vD, vE, vF, vG}, - type@BBBB<br/> - <i>[<code>A=4</code>] <code>op</code></i> {vC, vD, vE, vF}, - <i><code>kind</code></i>@BBBB<br/> - <i>[<code>A=3</code>] <code>op</code></i> {vC, vD, vE}, - <i><code>kind</code></i>@BBBB<br/> - <i>[<code>A=2</code>] <code>op</code></i> {vC, vD}, - <i><code>kind</code></i>@BBBB<br/> - <i>[<code>A=1</code>] <code>op</code></i> {vC}, - <i><code>kind</code></i>@BBBB<br/> - <i>[<code>A=0</code>] <code>op</code></i> {}, - <i><code>kind</code></i>@BBBB<br/> - <p><i>The unusual choice in lettering here reflects a desire to make - the count and the reference index have the same label as in format - 3rc.</i></p> - </td> - <td> </td> -</tr> -<tr> - <td>35ms</td> - <td><i>[<code>A=5</code>] <code>op</code></i> {vC, vD, vE, vF, vG}, - vtaboff@BBBB<br/> - <i>[<code>A=4</code>] <code>op</code></i> {vC, vD, vE, vF}, - vtaboff@BBBB<br/> - <i>[<code>A=3</code>] <code>op</code></i> {vC, vD, vE}, - vtaboff@BBBB<br/> - <i>[<code>A=2</code>] <code>op</code></i> {vC, vD}, - vtaboff@BBBB<br/> - <i>[<code>A=1</code>] <code>op</code></i> {vC}, - vtaboff@BBBB<br/> - <p><i>The unusual choice in lettering here reflects a desire to make - the count and the reference index have the same label as in format - 3rms.</i></p> - </td> - <td><i>suggested format for statically linked <code>invoke-virtual</code> - and <code>invoke-super</code> instructions of format 35c</i> - </td> -</tr> -<tr> - <td>35mi</td> - <td><i>[<code>A=5</code>] <code>op</code></i> {vC, vD, vE, vF, vG}, - inline@BBBB<br/> - <i>[<code>A=4</code>] <code>op</code></i> {vC, vD, vE, vF}, - inline@BBBB<br/> - 
<i>[<code>A=3</code>] <code>op</code></i> {vC, vD, vE}, - inline@BBBB<br/> - <i>[<code>A=2</code>] <code>op</code></i> {vC, vD}, - inline@BBBB<br/> - <i>[<code>A=1</code>] <code>op</code></i> {vC}, - inline@BBBB<br/> - <p><i>The unusual choice in lettering here reflects a desire to make - the count and the reference index have the same label as in format - 3rmi.</i></p> - </td> - <td><i>suggested format for inline linked <code>invoke-static</code> - and <code>invoke-virtual</code> instructions of format 35c</i> - </td> -</tr> -<tr> - <td rowspan="3">AA|<i>op</i> BBBB CCCC</td> - <td>3rc</td> - <td><i><code>op</code></i> {vCCCC .. vNNNN}, meth@BBBB<br/> - <i><code>op</code></i> {vCCCC .. vNNNN}, type@BBBB<br/> - <p><i>where <code>NNNN = CCCC+AA-1</code>, that is <code>A</code> - determines the count <code>0..255</code>, and <code>C</code> - determines the first register</i></p> - </td> - <td> </td> -</tr> -<tr> - <td>3rms</td> - <td><i><code>op</code></i> {vCCCC .. vNNNN}, vtaboff@BBBB<br/> - <p><i>where <code>NNNN = CCCC+AA-1</code>, that is <code>A</code> - determines the count <code>0..255</code>, and <code>C</code> - determines the first register</i></p> - </td> - <td><i>suggested format for statically linked <code>invoke-virtual</code> - and <code>invoke-super</code> instructions of format <code>3rc</code></i> - </td> -</tr> -<tr> - <td>3rmi</td> - <td><i><code>op</code></i> {vCCCC .. vNNNN}, inline@BBBB<br/> - <p><i>where <code>NNNN = CCCC+AA-1</code>, that is <code>A</code> - determines the count <code>0..255</code>, and <code>C</code> - determines the first register</i></p> - </td> - <td><i>suggested format for inline linked <code>invoke-static</code> - and <code>invoke-virtual</code> instructions of format 3rc</i> - </td> -</tr> -<tr> - <td>AA|<i>op</i> BBBB<sub>lo</sub> BBBB BBBB BBBB<sub>hi</sub></td> - <td>51l</td> - <td><i><code>op</code></i> vAA, #+BBBBBBBBBBBBBBBB</td> - <td>const-wide</td> -</tr> -</tbody> -</table> - -</body> -</html> |
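
Note on the removed instruction-formats.html: the bitwise layouts in its format table can be read mechanically. The following is a minimal sketch, not taken from the removed files, of how three of the layouts (`B|A|op CCCC`, `A|G|op BBBB F|E|D|C`, and `AA|op BBBB CCCC`) split into fields. The function names and the dictionary keys are hypothetical, and each instruction is assumed to arrive as a list of 16-bit code units already read from the file.

```python
def decode_b_a_op_cccc(units):
    """Layout `B|A|op CCCC` (formats 22t, 22s, 22c, 22cs)."""
    first, second = units
    return {
        "op": first & 0xFF,         # opcode sits in the low eight bits
        "A": (first >> 8) & 0xF,    # low nibble of the high byte
        "B": (first >> 12) & 0xF,   # high nibble of the high byte
        "C": second & 0xFFFF,       # whole second code unit
    }


def decode_35c(units):
    """Layout `A|G|op BBBB F|E|D|C`; A selects the register-list variant."""
    first, index, regs = units
    count = (first >> 12) & 0xF
    if count > 5:
        raise ValueError("the format table lists variants only for A=0..5")
    g = (first >> 8) & 0xF
    c, d, e, f = ((regs >> shift) & 0xF for shift in (0, 4, 8, 12))
    # Registers are listed as {vC, vD, vE, vF, vG}, truncated to the count.
    return {
        "op": first & 0xFF,
        "index": index & 0xFFFF,
        "registers": [c, d, e, f, g][:count],
    }


def decode_3rc(units):
    """Layout `AA|op BBBB CCCC`; registers run vCCCC .. vNNNN."""
    first, index, start = units
    count = (first >> 8) & 0xFF     # AA, the count 0..255
    # NNNN = CCCC + AA - 1, so the range covers `count` registers.
    return {
        "op": first & 0xFF,
        "index": index & 0xFFFF,
        "registers": list(range(start, start + count)),
    }


if __name__ == "__main__":
    # Arbitrary code units chosen only to exercise the bit layout.
    print(decode_b_a_op_cccc([0x3152, 0x0007]))
    print(decode_35c([0x206E, 0x0010, 0x0021]))
    print(decode_3rc([0x0374, 0x0010, 0x0005]))
```

The opcode bytes in the usage lines are placeholders; which opcode uses which format is defined by the bytecode reference, not by this sketch.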