path: root/libc/arch-arm/cortex-a9
Commit message | Author | Age | Files | Lines
* cortex-a9: Fix reference to __memcpy_base_aligned. | Kyle Repinski | 2016-06-27 | 1 | -1/+1
    With a different memcpy, __memcpy_base_aligned ceased to exist.
    Instead, point to the name defined by whatever includes memcpy_base.S.

    Change-Id: I242cf49cbada35337ba155d7f170e86a905ff55f
* libc: arm: add optimized memchr implementation | Yingshiuan Pan | 2015-10-31 | 1 | -0/+1
    This optimization is extracted from cortex-strings and bionic-ized, and
    applied to arm-v7a cpus (a7, a9, a15, a53, denver, krait).

    I ran stringbench[1] on ARM Juno; this optimization could outperform the
    original C implementation by 77%.

    [1] https://android.git.linaro.org/gitweb/platform/external/stringbench.git

    Change-Id: I1c3fb0c89ce2b3ee7e44f492367b6caf6db58ccf
    Signed-off-by: Yingshiuan Pan <yingshiuan.pan@linaro.org>
* Fix over read in strcpy/stpcpy/strcat. | Christopher Ferris | 2015-10-29 | 2 | -134/+148
    This bug will happen when these circumstances are met:
    - Destination address & 0x7 == 1, strlen of src is 11, 12, 13.
    - Destination address & 0x7 == 2, strlen of src is 10, 11, 12.
    - Destination address & 0x7 == 3, strlen of src is 9, 10, 11.
    - Destination address & 0x7 == 4, strlen of src is 8, 9, 10.

    In these cases, the dest alignment code does a ldr which reads 4 bytes,
    and it will read past the end of the source. In most cases, this is
    probably benign, but if this crosses into a new page it could cause a
    crash.

    Fix the labels in the cortex-a9 strcat.

    Modify the overread test to vary the dst alignment to expose this bug.
    Also, shrink the strcat/strlcat overread cases since the dst alignment
    variation increases the runtime too much.

    Bug: 24345899
    Change-Id: Ib34a559bfcebd89861985b29cae6c1e47b5b5855
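The four trigger cases listed in this commit follow one pattern: for a destination alignment `a` (1 to 4), the affected source lengths run from `12 - a` to `14 - a`. The sketch below encodes only that list as a C predicate. The helper name is hypothetical, and nothing here is taken from the assembly itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true when (dst, src_len) matches one of the four
 * cases listed in the commit message. Derived from that list only. */
static bool hits_listed_overread_case(const void *dst, size_t src_len)
{
    assert(dst != NULL);
    uintptr_t align = (uintptr_t)dst & 0x7;
    if (align < 1 || align > 4) {
        return false;   /* the commit lists only alignments 1 through 4 */
    }
    /* align 1 -> 11..13, align 2 -> 10..12, align 3 -> 9..11, align 4 -> 8..10 */
    return src_len >= 12 - align && src_len <= 14 - align;
}
```

A predicate like this could be used to pick the (alignment, length) pairs that an overread test should place right before an unreadable page.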
* Remove pushes from memsets (krait/cortex-a9). | Christopher Ferris | 2015-10-29 | 1 | -16/+11
    On the path that only uses r0 in both the krait and cortex-a9 memset,
    remove the push and use r3 instead.

    In addition, for cortex-a9, remove the artificial function since it's
    not needed since dwarf unwinding is now supported on arm.

    Change-Id: Ia4ed1cc435b03627a7193215e76c8ea3335f949a
* Replace bx lr with update of pc from the stack. | Christopher Ferris | 2015-10-29 | 2 | -6/+3
    When there is arm assembler of this format:

      ldmxx sp!, {..., lr}
    or
      pop {..., lr}

      bx lr

    It can be replaced with:

      ldmxx sp!, {..., pc}
    or
      pop {..., pc}

    Change-Id: Ic27048c52f90ac4360ad525daf0361a830dc22a3
* Use unified syntax to compile with both llvm and gcc. | Chih-Hung Hsieh | 2015-05-15 | 1 | -14/+15
    All arch-arm and arch-arm64 .S files were compiled by gcc with and
    without this patch. The output object files were identical. When
    compiled with llvm and this patch, the output files were also identical
    to gcc's output.

    BUG: 18061004
    Change-Id: I458914d512ddf5496e4eb3d288bf032cd526d32b
    (cherry picked from commit 33f33515b503b634d9fbc57dda7123ea9cf23fc6)
* Use assembly memmove for all arm32 processors. | Christopher Ferris | 2015-04-08 | 1 | -3/+5
    Bug: 15110993
    Change-Id: Ia3dcd6b8c4032f8c72b6f2e628b635ce99667c09
* Clean up <stdlib.h> slightly. | Elliott Hughes | 2015-01-26 | 1 | -1/+3
    Interestingly, this mostly involves cleaning up our implementation of
    various <string.h> functions.

    Change-Id: Ifaef49b5cb997134f7bc0cc31bdac844bdb9e089
* Move the generic arm memcmp.S into the generic directory. | Elliott Hughes | 2014-12-15 | 1 | -0/+1
    Change-Id: I48e4d14a0dcddbb246edbac6d0329619574ab44d
* Add stpcpy assembler version. | Christopher Ferris | 2014-09-30 | 4 | -436/+571
    For generic, continue to use the C version of the code.

    Bug: 13746695
    Change-Id: I77426a70b06131f2373bb51265bea1240bb3f101
* Cleanup arm assembly. | Christopher Ferris | 2014-09-29 | 7 | -110/+100
    Remove the old arm directives.
    Change the non-local labels to .L labels.
    Add cfi directives to strcpy.S.

    Change-Id: I9bafee1ffe5d85c92d07cfa8a85338cef9759562
* denver: optimize memmove | Shu Zhang | 2014-05-20 | 1 | -0/+1
    Optimize 32-bit denver memmove with reversal memcpy.

    Change-Id: Iaad0a9475248cdd7e4f50d58bea9db1b767abc88
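The commit does not spell out what "reversal memcpy" does. Assuming it means a copy that runs from the end of the buffer backwards, a memmove can pick between a forward and a backward copy depending on how the buffers overlap. The following byte-at-a-time C sketch illustrates that choice. The function name is hypothetical and this is not the denver assembly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name; byte-at-a-time illustration only. */
static void *sketch_memmove(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    assert(n == 0 || (dst != NULL && src != NULL));

    /* Unsigned subtraction wraps when dst is below src, so this single test
     * is true whenever dst lies outside [src, src + n). */
    if ((uintptr_t)d - (uintptr_t)s >= n) {
        for (size_t i = 0; i < n; i++) {
            d[i] = s[i];            /* forward copy */
        }
    } else {
        for (size_t i = n; i > 0; i--) {
            d[i - 1] = s[i - 1];    /* backward ("reversed") copy */
        }
    }
    return dst;
}
```

The backward branch is what keeps an overlapping move with `dst` above `src` from overwriting source bytes before they are read.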
* Unify our assembler macros. | Elliott Hughes | 2014-02-20 | 8 | -13/+12
    Our <machine/asm.h> files were modified from upstream, to the extent
    that no architecture was actually using the upstream ENTRY or END
    macros, assuming that architecture even had such a macro upstream. This
    patch moves everyone to the same macros, with just a few tweaks
    remaining in the <machine/asm.h> files, which no one should now use
    directly.

    I've removed most of the unused cruft from the <machine/asm.h> files,
    though there's still rather a lot in the mips/mips64 ones.

    Bug: 12229603
    Change-Id: I2fff287dc571ac1087abe9070362fb9420d85d6d
* Reconfig libc's Android.mk to build for multilib | Ying Wang | 2014-02-12 | 1 | -10/+9
    1. Moved arch-specific setup to their own files:
       - <arch>/<arch>.mk, arch-specific configs. Variables in those
         configs end with the arch name.
       - Removed the extra complexity introduced by function
         libc-add-cpu-variant-src, which seems to be not very useful these
         days.
    2. Separated out the crt object files generation rules and set up the
       rules for both TARGET_ARCH and TARGET_2ND_ARCH.
    3. Build all the libraries for both TARGET_ARCH and TARGET_2ND_ARCH,
       with the arch-specific LOCAL_ variables.

    Bug: 11654773
    Change-Id: I9c2d85db0affa49199d182236d2210060a321421
* Add .cfi_startproc/.cfi_endproc to ENTRY/END. | Christopher Ferris | 2013-11-19 | 6 | -43/+1
    Bug: 10414953
    Change-Id: I711718098b9f3cc0ba8277778df64557e9c7b2a0
* 'Avoid confusing "read prevented write" log messages' 2. | Elliott Hughes | 2013-10-15 | 4 | -4/+4
    This time it's assembler.

    Change-Id: Iae6369833b8046b8eda70238bb4ed0cae64269ea
* Fix x86_64 build, clean up intermediate libraries. | Elliott Hughes | 2013-10-09 | 4 | -4/+4
    The x86_64 build was failing because clone.S had a call to
    __thread_entry which was being added to a different intermediate .a on
    the way to making libc.so, and the linker couldn't guarantee statically
    that such a relocation would be possible.

      ld: error: out/target/product/generic_x86_64/obj/STATIC_LIBRARIES/libc_common_intermediates/libc_common.a(clone.o): requires dynamic R_X86_64_PC32 reloc against '__thread_entry' which may overflow at runtime; recompile with -fPIC

    This patch addresses that by ensuring that the caller and callee end up
    in the same intermediate .a. While I'm here, I've tried to clean up some
    of the mess that led to this situation too. In particular, this removes
    libc/private/ from the default include path (except for the DNS code),
    and splits out the DNS code into its own library (since it's a weird
    special case of upstream NetBSD code that's diverged so heavily it's
    unlikely ever to get back in sync).

    There's more cleanup of the DNS situation possible, but this is
    definitely a step in the right direction, and it's more than enough to
    get x86_64 building cleanly.

    Change-Id: I00425a7245b7a2573df16cc38798187d0729e7c4
* Make error messages even better! | Nick Kralevich | 2013-10-04 | 4 | -4/+4
    Change-Id: I72bd1eb1d526dc59833e5bc3c636171f7f9545af
* Remove the __ARM_FEATURE_DSP check. | Christopher Ferris | 2013-10-02 | 1 | -11/+0
    The check for __ARM_FEATURE_DSP being defined is pointless since it is
    always defined.

    Bug: 10971279

    Merge from internal master.
    (cherry-picked from d2642fa70cfbd77286514e1123fcd280d7f7047f)

    Change-Id: If23ab3271f4da0c38cd531ffdc9a7e5eed6ec5dc
* libc: don't export unnecessary symbols | Nick Kralevich | 2013-10-02 | 5 | -6/+6
    Symbols associated with the internal implementation of memcpy-like
    routines should be private.

    Change-Id: I2b1d1f59006395c29d518c153928437b08f93d16
* __memcpy_chk: Fix signed cmp of unsigned values. | Christopher Ferris | 2013-09-20 | 3 | -3/+3
    I accidentally did a signed comparison of the size_t values passed in
    for three of the _chk functions. Changing them to unsigned compares.

    Add three new tests to verify this failure is fixed.

    Bug: 10691831

    Merge from internal master.
    (cherry-picked from 883ef2499c2ff76605f73b1240f719ca6282e554)

    Change-Id: Id9a96b549435f5d9b61dc132cf1082e0e30889f5
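To illustrate why a signed comparison of size_t values defeats a bounds check, here is a small C sketch with hypothetical function names. It models the shape of the bug in C and is not the assembly from this commit. The cast to a signed type assumes the usual two's-complement conversion, which is implementation-defined in older C standards.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Buggy shape: both size_t values are compared as signed integers, so a
 * huge count looks negative and appears to fit. */
static bool fits_signed_compare(size_t count, size_t dest_len)
{
    assert(sizeof(intptr_t) == sizeof(size_t));  /* assumption of this sketch */
    return (intptr_t)count <= (intptr_t)dest_len;
}

/* Fixed shape: plain unsigned comparison of the size_t values. */
static bool fits_unsigned_compare(size_t count, size_t dest_len)
{
    return count <= dest_len;
}
```

With `count = SIZE_MAX` and `dest_len = 16`, the signed version accepts the copy while the unsigned version rejects it.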
* Fix all debug directives. | Christopher Ferris | 2013-09-20 | 6 | -82/+130
    The backtrace when a fortify check failed was not correct. This change
    adds all of the necessary directives to get a correct backtrace.

    Fix the strcmp directives and change all labels to local labels.

    Testing:
    - Verify that the runtime can decode the stack for __memcpy_chk,
      __memset_chk, __strcpy_chk, __strcat_chk fortify failures.
    - Verify that gdb can decode the stack properly when hitting a fortify
      check.
    - Verify that the runtime can decode the stack for a seg fault for all
      of the _chk functions and for memcpy/memset.
    - Verify that gdb can decode the stack for a seg fault for all of the
      _chk functions and for memcpy/memset.
    - Verify that the runtime can decode the stack for a seg fault for
      strcmp.
    - Verify that gdb can decode the stack for a seg fault in strcmp.

    Bug: 10342460
    Bug: 10345269

    Merge from internal master.
    (cherry-picked from 05332f2ce7e542d32ff4d5cd9f60248ad71fbf0d)

    Change-Id: Ibc919b117cfe72b9ae97e35bd48185477177c5ca
* Update all debug directives. | Christopher Ferris | 2013-09-20 | 6 | -1/+42
    The libcorkscrew stack unwinder does not understand cfi directives, so
    add .save directives so that it can function properly.

    Also add the directives in to strcmp.S and fix a missing set of
    directives in cortex-a9/memcpy_base.S.

    Bug: 10345269

    Merge from internal master.
    (cherry-picked from 5f7ccea3ffab05aeceecb85c821003cf580630d3)

    Change-Id: If48a216203216a643807f5d61906015984987189
* Create optimized __strcpy_chk/__strcat_chk. | Christopher Ferris | 2013-08-15 | 6 | -180/+652
    This change pulls the memcpy code out into a new file so that the
    __strcpy_chk and __strcat_chk can use it with an include.

    The new versions of the two chk functions use assembly versions of
    strlen and memcpy to implement this check. This allows near parity with
    the assembly versions of strcpy/strcat. It also means that as memcpy
    implementations get faster, so do the chk functions.

    Other included changes:
    - Change all of the assembly labels to local labels. The other labels
      confuse gdb and mess up backtracing.
    - Add .cfi_startproc and .cfi_endproc directives so that gdb is not
      confused when falling through from one function to another.
    - Change all functions to use cfi directives since they are more
      powerful.
    - Move the memcpy_chk fail code outside of the memcpy function
      definition so that backtraces work properly.
    - Preserve lr before the calls to __fortify_chk_fail so that the
      backtrace actually works.

    Testing:
    - Ran the bionic unit tests. Verified all error messages in logs are
      set correctly.
    - Ran libc_test, replacing strcpy with __strcpy_chk and replacing
      strcat with __strcat_chk.
    - Ran the debugger on nexus10, nexus4, and old nexus7. Verified that
      the backtrace is correct for all fortify check failures. Also
      verified that when falling through from __memcpy_chk to memcpy the
      backtrace is still correct. Also verified the same for __memset_chk
      and bzero. Verified the two different paths in the cortex-a9 memset
      routine that save variables to the stack still show the backtrace
      properly.

    Bug: 9293744

    (cherry-picked from 2be91915dcecc956d14ff281db0c7d216ca98af2)

    Change-Id: Ia407b74d3287d0b6af0139a90b6eb3bfaebf2155
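The idea of building a checked strcpy from strlen plus memcpy can be outlined in C as follows. This is only a sketch under stated assumptions: the function names are hypothetical, the failure handler is a stand-in that calls `abort()`, and the real routines are written in assembly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the fortify failure handler. */
static void sketch_fortify_fail(void)
{
    abort();
}

/* Hypothetical C outline of a checked strcpy built from strlen + memcpy. */
static char *sketch_strcpy_chk(char *dst, const char *src, size_t dst_len)
{
    assert(dst != NULL && src != NULL);

    size_t n = strlen(src) + 1;      /* bytes to copy, including the NUL */
    if (n > dst_len) {
        sketch_fortify_fail();       /* does not return */
    }
    return memcpy(dst, src, n);      /* a faster memcpy speeds this up too */
}
```

Because the copy itself is delegated to memcpy, any improvement to memcpy carries over to the checked function, which is the property the commit message points out.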
* Optimize __memset_chk, __memcpy_chk. DO NOT MERGE. | Christopher Ferris | 2013-08-14 | 2 | -0/+49
    This change creates assembler versions of __memcpy_chk/__memset_chk
    that are implemented in the memcpy/memset assembler code. This change
    avoids an extra call to memcpy/memset, instead allowing a simple fall
    through to occur from the chk code into the body of the real
    implementation.

    Testing:
    - Ran the libc_test on __memcpy_chk/__memset_chk on all nexus devices.
    - Wrote a small test executable that has three calls to __memcpy_chk
      and three calls to __memset_chk. First call dest_len is length + 1.
      Second call dest_len is length. Third call dest_len is length - 1.
      Verified that the first two calls pass, and the third fails. Examined
      the logcat output on all nexus devices to verify that the fortify
      error message was sent properly.
    - I benchmarked the new __memcpy_chk and __memset_chk on all systems.
      For __memcpy_chk and large copies, the savings is relatively small
      (about 1%). For small copies, the savings is large on
      cortex-a15/krait devices (between 5% and 30%). For cortex-a9 and
      small copies, the speed up is present, but relatively small (about 3%
      to 5%). For __memset_chk and large copies, the savings is also small
      (about 1%). However, all processors show larger speed-ups on small
      copies (about 30% to 100%).

    Bug: 9293744

    Merge from internal master.
    (cherry-picked from 7c860db0747f6276a6e43984d43f8fa5181ea936)

    Change-Id: I916ad305e4001269460ca6ebd38aaa0be8ac7f52
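The three-call test described in this commit (dest_len of length + 1, length, and length - 1) can be modeled in C roughly as below. The function is a hypothetical model that returns `false` where the real routine would report a fortify failure, so the failing case can be observed without aborting the process.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of a checked memcpy: returns false instead of
 * reporting a fortify failure. */
static bool sketch_memcpy_chk(void *dst, const void *src,
                              size_t count, size_t dst_len)
{
    assert(dst != NULL && src != NULL);
    if (count > dst_len) {
        return false;            /* the real code branches to the fail path */
    }
    memcpy(dst, src, count);     /* the assembly falls through into memcpy */
    return true;
}
```

Calling it with a fixed `count` and the three `dst_len` values reproduces the pass, pass, fail pattern from the commit message.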
* Optimize strcat/strcpy, small tweaks to strlen. DO NOT MERGE | Christopher Ferris | 2013-08-08 | 4 | -2/+1174
    Create one version of strcat/strcpy/strlen for cortex-a15/krait and
    another version for cortex-a9.

    Tested with the libc_test strcat/strcpy/strlen tests, including new
    tests that verify that the src for strcat/strcpy is not overread across
    page boundaries.

    NOTE: The handling of unaligned strcpy (same code in strcat) could
    probably be optimized further such that the src is read 64 bits at a
    time instead of the partial reads occurring now.

    strlen improves slightly since it was recently optimized.

    Performance improvements for strcpy and strcat (using an empty dest
    string):

    cortex-a9
    - Small copies vary from about 5% to 20% as the size gets above 10
      bytes.
    - Copies >= 1024, about a 60% improvement.
    - Unaligned copies, from about 40% improvement.

    cortex-a15
    - Most small copies exhibit a 100% improvement, a few copies only
      improve by 20%.
    - Copies >= 1024, about 150% improvement.
    - Unaligned copies, about 100% improvement.

    krait
    - Most small copies vary widely, but on average 20% improvement, then
      the performance gets better, hitting about a 100% improvement when
      copying 64 bytes of data.
    - Copies >= 1024, about 100% improvement.
    - When copying MBs of data, about 50% improvement.
    - Unaligned copies, about 90% improvement.

    As strcat destination strings get larger in size:

    cortex-a9
    - about 40% improvement for small dst strings (>= 32).
    - about 250% improvement for dst strings >= 1024.

    cortex-a15
    - about 200% improvement for small dst strings (>= 32).
    - about 250% improvement for dst strings >= 1024.

    krait
    - about 25% improvement for small dst strings (>= 32).
    - about 100% improvement for dst strings >= 1024.

    Merge from internal master.
    (cherry-picked from d119b7b6f48fe507088cfb98bcafa99b320fd884)

    Change-Id: I296463b251ef9fab004ee4dded2793feca5b547a
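One common way to check that a string routine does not read past the end of its source across a page boundary is to place the string so that its terminating NUL is the last byte of a readable page and make the next page inaccessible. The C sketch below shows that setup with POSIX `mmap` and `mprotect`. The helper name is hypothetical, and this is an assumption about how such a test can be built, not the libc_test code.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS under strict -std modes on glibc */
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: maps two pages, makes the second one inaccessible,
 * and returns a string of `len` 'x' bytes whose NUL terminator is the last
 * byte of the first page. Returns NULL on failure. The mapping is
 * intentionally left in place for the caller to use. */
static char *string_ending_at_page_boundary(size_t len)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || len + 1 > (size_t)page) {
        return NULL;
    }
    char *base = mmap(NULL, 2 * (size_t)page, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED) {
        return NULL;
    }
    if (mprotect(base + page, (size_t)page, PROT_NONE) != 0) {
        munmap(base, 2 * (size_t)page);
        return NULL;
    }
    char *s = base + page - (len + 1);
    memset(s, 'x', len);
    s[len] = '\0';
    return s;
}
```

Passing the returned pointer as the source of strcpy or strcat makes any read beyond the terminator fault immediately instead of going unnoticed.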
* Add new optimized strlen for arm. | Christopher Ferris | 2013-07-16 | 1 | -0/+2
    This optimized version is primarily targeted at cortex-a15.

    Tested on all nexus devices using the system/extras/libc_test strlen
    test. Tested alignments from 1 to 32 that are powers of 2.

    Tested that strlen does not cross page boundaries at all alignments.

    Speed improvements listed below:

    cortex-a15
    - Sizes >= 32 bytes, ~75% improvement.
    - Sizes >= 1024 bytes, ~250% improvement.

    cortex-a9
    - Sizes >= 32 bytes, ~75% improvement.
    - Sizes >= 1024 bytes, ~85% improvement.

    krait
    - Sizes >= 32 bytes, ~95% improvement.
    - Sizes >= 1024 bytes, ~160% improvement.

    Merge from internal master.
    (cherry-picked from 2fc071797743b88a9a47427d46baed7c7b24f4d2)

    Change-Id: I1ceceb4e745fd68e9d946f96d1d42e0cdaff6ccf
* Create arch specific versions of strcmp. | Christopher Ferris | 2013-03-20 | 2 | -0/+545
    This uses the new strcmp.a15.S code as the basis for new versions of
    strcmp.S.

    The cortex-a15 code is the performance optimized version of
    strcmp.a15.S taken with only the addition of a few pld instructions.

    The cortex-a9 code is the same as the cortex-a15 code except that the
    unaligned strcmp code was taken from the original strcmp.S.

    The krait code is the same as the cortex-a15 code except that one path
    in the unaligned strcmp code was taken from the original strcmp.S code
    (the 2 byte overlap case).

    The generic code is the original unmodified strcmp.S from the bionic
    subdirectory.

    All three new versions underwent these test cases:

    Strings the same, all same size:
    - Both pointers double word aligned.
    - One pointer double word aligned, one pointer word aligned.
    - Both pointers word aligned.
    - One pointer double word aligned, one pointer 1 off a word alignment.
    - One pointer double word aligned, one pointer 2 off a word alignment.
    - One pointer double word aligned, one pointer 3 off a word alignment.
    - One pointer word aligned, one pointer 1 off a word alignment.
    - One pointer word aligned, one pointer 2 off a word alignment.
    - One pointer word aligned, one pointer 3 off a word alignment.

    For all cases where it made sense, the two pointers were also tested
    swapped.

    Different strings, all same size:
    - Single difference at double word boundary.
    - Single difference at word boundary.
    - Single difference at 1 off a word alignment.
    - Single difference at 2 off a word alignment.
    - Single difference at 3 off a word alignment.

    Different sized strings, strings the same until the end:
    - Shorter string ends on a double word boundary.
    - Shorter string ends on word boundary.
    - Shorter string ends at 1 off a word boundary.
    - Shorter string ends at 2 off a word boundary.
    - Shorter string ends at 3 off a word boundary.

    For all different cases, run them through the same pointer alignment
    cases when the strings are the same size. For all cases the two
    pointers were also tested swapped.

    Bug: 8005082

    Merge from internal master.
    (cherry-picked from commit a9a5870d166f8060a8182cd61e5536b0becea74e)

    Change-Id: I4c2b98f8a50804fb98ab67f75e9d660f1315a144
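A test matrix like the one above can be driven by a small harness that places the same text at every combination of offsets from a double-word-aligned base and cross-checks strcmp against a byte-at-a-time reference. The C sketch below is one way to do that. All names are hypothetical, it covers only the "same strings" and "single difference" groups, and it is not the test code used for this commit.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Byte-at-a-time reference used to cross-check the function under test. */
static int reference_strcmp(const char *a, const char *b)
{
    while (*a != '\0' && *a == *b) {
        a++;
        b++;
    }
    return (unsigned char)*a - (unsigned char)*b;
}

static int sign_of(int v)
{
    return (v > 0) - (v < 0);
}

/* Hypothetical harness: copies `text` to every pair of offsets 0..7 from
 * 8-byte-aligned buffers, optionally alters one byte of the second copy,
 * and compares strcmp with the reference in both argument orders. */
static bool check_all_alignments(const char *text, size_t diff_at, bool differ)
{
    alignas(8) char a[72];
    alignas(8) char b[72];
    size_t len = strlen(text);

    if (len == 0 || len > 63 || diff_at >= len) {
        return false;
    }
    for (size_t oa = 0; oa < 8; oa++) {
        for (size_t ob = 0; ob < 8; ob++) {
            memcpy(a + oa, text, len + 1);
            memcpy(b + ob, text, len + 1);
            if (differ) {
                b[ob + diff_at] ^= 0x20;   /* change exactly one byte */
            }
            const char *pa = a + oa;
            const char *pb = b + ob;
            if (sign_of(strcmp(pa, pb)) != sign_of(reference_strcmp(pa, pb)) ||
                sign_of(strcmp(pb, pa)) != sign_of(reference_strcmp(pb, pa))) {
                return false;
            }
        }
    }
    return true;
}
```

Only the sign of the result is compared, since strcmp guarantees the sign but not the magnitude of its return value.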
* Break bionic implementations into arch versions. | Christopher Ferris | 2013-03-12 | 3 | -0/+367
    Move arch specific code for arm, mips, x86 into separate makefiles.
    In addition, add different arm cpu versions of memcpy/memset.

    Bug: 8005082

    Merge from internal master (acdde8c1cf8e8beed98c052757d96695b820b50c).

    Change-Id: I04f3d0715104fab618e1abf7cf8f7eec9bec79df