aboutsummaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* android: Build kms_swrast for the Android platformhistory/lineage-16.0-base-oldRob Herring2019-02-095-3/+38
| | | | | | Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: Robert Foss <robert.foss@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
* platform/android: Enable kms_swrast fallbackRobert Foss2019-02-091-6/+8
| | | | | | | | | | | | Add support for the ForceSoftware option, which is togglable on the Android platform through setting the "drm.gpu.force_software" property to a non-zero value. kms_swrast is also enabled as a fallback for when a driver is not able to be loaded for for a drm node that was opened. Signed-off-by: Robert Foss <robert.foss@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
* egl/android: Add Android property for forcing software renderingRobert Foss2019-02-091-0/+10
| | | | | | | | | | In order to simplify Android bringup on new devices, provide the property "drm.gpu.force_software" which forces kms_swrast to be used. Signed-off-by: Robert Foss <robert.foss@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
* softpipe: Add assert verifying successful pipe_transfer_mapRobert Foss2019-02-091-0/+1
| | | | | | | | | This failure mode is a bit tricky to debug and manifests itself later as a null pointer dereference, for which finding the origin is needlessly tricky. Signed-off-by: Robert Foss <robert.foss@collabora.com> Reviewed-by: Brian Paul <brianp@vmware.com>
* radeonsi: fix pk2h breakageMarek Olšák2018-07-231-2/+5
|
* radeonsi: reduce LDS stalls by 40% for tessellationMarek Olšák2018-07-234-6/+14
| | | | | | | | 40% is the decrease in the LGKM counter (which includes SMEM too) for the GFX9 LSHS stage. This will make the LDS size slightly larger, but I wasn't able to increase the patch stride without corruption, so I'm increasing the vertex stride.
* radeonsi: Add debug option to enable LLVM GlobalISel (v2)Tom Stellard2018-07-235-2/+22
| | | | | | | | | | R600_DEBUG=gisel will tell LLVM to use GlobalISel rather than SelectionDAG for instruction selection. v2: mareko: move the helper to src/amd/common Signed-off-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Tom Stellard <tstellar@redhat.com>
* intel/compiler: Account for built-in uniforms in analyze_ubo_rangesJason Ekstrand2018-07-238-9/+45
| | | | | | | | | | | | The original pass only looked for load_uniform intrinsics but there are a number of other places that could end up loading a push constant. One obvious omission was images which always implicitly use a push constant. Legacy VS clip planes also get pushed into the shader. This fixes some new Vulkan CTS tests that test random combinations of bindings and, in particular, test lots of UBOs and images together. Cc: mesa-stable@lists.freedesktop.org Cc: Kenneth Graunke <kenneth@whitecape.org>
* radv: enable VK_KHR_16bit_storage extension / 16bit storage featuresDaniel Schürmann2018-07-233-4/+8
| | | | Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
* ac: add support for 16bit load_push_constantDaniel Schürmann2018-07-231-0/+20
| | | | Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
* radv: add support for 16bit input/outputDaniel Schürmann2018-07-232-18/+80
| | | | Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
* nir: add 16bit type information to glsl typesDaniel Schürmann2018-07-233-0/+28
| | | | Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
* ac: add support for 16bit buffer loadsDaniel Schürmann2018-07-231-40/+55
| | | | | | v2: Fixed dvec3 loads (bas) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
* ac: add support for 16bit UBO loadsDaniel Schürmann2018-07-233-3/+51
| | | | Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
* ac: add support for 16bit ssbo storesDaniel Schürmann2018-07-231-60/+84
| | | | Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
* ac: add 16bit conversion operationsDaniel Schürmann2018-07-232-9/+31
| | | | Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
* r600: enable tess_input_info for TESDave Airlie2018-07-231-14/+6
| | | | | | | | | | | There might be a nicer way to do this, but this is at least correct. This fixes: KHR-GL44.tessellation_shader.single.max_patch_vertices KHR-GL44.tessellation_shader.tessellation_control_to_tessellation_evaluation.gl_PatchVerticesIn Reviewed-By: Gert Wollny <gert.wollny@collabora.com> Cc: mesa-stable@lists.freedesktop.org
* docs/features: fix virgl gles3.1 entriesDave Airlie2018-07-241-2/+2
|
* draw: force draw pipeline if there's more than 65535 verticesRoland Scheidegger2018-07-233-0/+21
| | | | | | | | | | | | | | | | | | | | | | | | The pt emit path can only handle 65535 - the number of vertices is truncated to a ushort, resulting in a too small buffer allocation, which will crash. Forcing the pipeline path looks suboptimal, then again this bug is probably there ever since GS is supported, so it seems it's not happening often. (Note that the vertex_id in the vertex header is 16 bit too, however this is only used by the draw pipeline, and it denotes the emit vertex nr, and that uses vbuf code, which will only emit smaller chunks, so should be fine I think.) Other solutions would be to simply allow 32bit counts for vertex allocation, however 65535 is already larger than this was intended for (the idea being it should be more cache friendly). Or could try to teach the pt emit path to split the emit in smaller chunks (only the non-index path can be affected, since gs output is always linear), but it's a bit tricky (we don't know the primitive boundaries up-front). Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=107295 Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
* docs/features: note ARB_copy_image is working on virglDave Airlie2018-07-241-1/+1
|
* Revert "virgl: remove unused stride-arguments"Dave Airlie2018-07-245-5/+33
| | | | | | This reverts commit dc938b8398c0dafb60507e41685f7518b681c24d. This adds warnings in vtest, and possibly breaks it.
* docs/features: note ssbo and atomic counters done for virglDave Airlie2018-07-241-5/+5
|
* virgl: add initial shader_storage_buffer_object support. (v2)Dave Airlie2018-07-249-0/+98
| | | | | | | | | | | This adds the guest side support for ARB_shader_storage_buffer_object. Co-authors: Gurchetan Singh <gurchetansingh@chromium.org> v2: move to using separate maximums (fixup macros) Reviewed-By: Gert Wollny <gert.wollny@collabora.com>
* nir: Add a couple trivial abs optimizationsJason Ekstrand2018-07-231-0/+2
| | | | | | | Spotted in a shader in Batman: Arkham City. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
* glsl: remove delegating constructors to allow build with C++98Caio Marcelo de Oliveira Filho2018-07-231-6/+8
| | | | | | | | | | | | | | Delegating constructors is a C++11 feature, so this was breaking when compiling with C++98. Change the copy_propagation_state() calls that used the convenience constructor to use a static member function instead. Since copy_propagation_state is expected to be heap allocated, this change is a good fit. Tested-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107305
* v3d: Implement a small immediates optimization, based on VC4's.Eric Anholt2018-07-238-19/+143
| | | | | | | | | We can do one per instruction, and we have to be careful not to overwrite raddr_b, but this greatly reduces the pressure on uniform loads (particularly around ldvpm/stvpm instructions). total instructions in shared programs: 90768 -> 88220 (-2.81%) instructions in affected programs: 82711 -> 80163 (-3.08%)
* v3d: Return an invalid src number if asked for a missing implicit uniform.Eric Anholt2018-07-232-3/+3
| | | | | | Sometimes when iterating over sources, we might want to check if it's the implicit one. We wouldn't want to match on a non-implicit src using this function.
* v3d: Skip emitting texture config parameter 2 if it's just the defaults.Eric Anholt2018-07-231-1/+5
| | | | | | shader-db: total instructions in shared programs: 91275 -> 90768 (-0.56%) instructions in affected programs: 20702 -> 20195 (-2.45%)
* v3d: Update an XXX comment for a path we handled in HW on V3D 4.x.Eric Anholt2018-07-231-1/+1
|
* v3d: Switch to using the new SFU instructions on V3D 4.x.Eric Anholt2018-07-238-24/+118
| | | | | | | | | | | | | | | | These instructions let us write directly to the phys regfile, instead of just R4. That lets us avoid moving out of R4 to avoid conflicting with other SFU results, and to avoid conflicting with thread switches. There is still an extra instruction of latency, which is not represented in the scheduler at the moment. If you use the result before it's ready, the QPU will just stall, unlike the magic R4 mode where you'd read the previous value. That means that the following shader-db results aren't quite representative (since we now cause some stalls instead of emitting nops), but they're impressive enough that I'm happy with the change. total instructions in shared programs: 95669 -> 91275 (-4.59%) instructions in affected programs: 82590 -> 78196 (-5.32%)
* v3d: Add QPU pack/unpack for the new SFU instructions.Eric Anholt2018-07-234-0/+32
| | | | | These instructions allow writing the result to any register, instead of a special writeback to r4.
* v3d: Fix the name of the "flpop" operation.Eric Anholt2018-07-236-6/+7
| | | | | Noticed while trying to sort a new op into the appropriate place to match the documentation.
* v3d: Print the instruction we're testing in the QPU disasm/pack round-trip.Eric Anholt2018-07-231-2/+3
| | | | | If we fail initial disassembly, it's good to know what instruction it was that failed.
* v3d: Drop unused vir_SAT() operation.Eric Anholt2018-07-231-8/+0
| | | | We lower saturates in NIR.
* v3d: Rotate through registers to improve post-RA scheduling options.Eric Anholt2018-07-231-0/+45
| | | | | | | | | | | Similarly to VC4's implementation, by not picking r0 immediately upon freeing it, we give the scheduler more of a chance to fit later writes in earlier. I'm not clear on whether there's any real cost to picking phys over accumulators, so keep that behavior for now. shader-db: total instructions in shared programs: 96831 -> 95669 (-1.20%) instructions in affected programs: 77254 -> 76092 (-1.50%)
* v3d: Allow reading from physical regs written in the previous instruction.Eric Anholt2018-07-231-24/+0
| | | | | | | | | This restriction existed in V3D 2.x, but lifting it was a major change in 3.x. shader-db results: total instructions in shared programs: 98117 -> 96831 (-1.31%) instructions in affected programs: 48520 -> 47234 (-2.65%)
* anv: remove unnecessary runtime copy of static stringEric Engestrom2018-07-231-2/+1
| | | | | | | | It's actually also a bit safer, since now the compiler will warn if the string is larger than the `.name` array. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
* anv: Pay attention to VK_ACCESS_MEMORY_(READ|WRITE)_BITAlex Smith2018-07-231-0/+9
| | | | | | | | | | | | | According to the spec, these should apply to all read/write access types (so would be equivalent to specifying all other access types individually). Currently, they were doing nothing. v2: Handle VK_ACCESS_MEMORY_WRITE_BIT in dstAccessMask. Signed-off-by: Alex Smith <asmith@feralinteractive.com> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
* virgl: remove unused stride-argumentsErik Faye-Lund2018-07-235-33/+5
| | | | | | | | The IOCTLs doesn't pass this along, so computing them in the first place is kinda pointless. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
* radv: print a big warning when RADV_TRACE_FILE is setSamuel Pitoiset2018-07-231-0/+4
| | | | | | | | Users shouldn't use this debugging option except when we ask them to do! Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
* radv: fix a memleak for merged shaders on GFX9Samuel Pitoiset2018-07-231-1/+1
| | | | | | | | | | modules[i] can be NULL for merged shaders but we have to free the NIR code. radv_can_dump_shader_stats() already handles if modules[i] is NULL, no need to check it twice. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
* intel/blorp: Fix blits to R8G8B8_UNORM_SRGB sRGB harderJason Ekstrand2018-07-231-3/+11
| | | | | | | | | | | | The first fix attempt contained a nasty typo which somehow didn't get caught in review. It also didn't work as intended because the sRGB conversion was happening but then throwing away all but the red channel because it dind't know it was RGB. Really, it's my fault for trying to fix a bug without first writing tests. I've now written tests and they pass with this change. :) Fixes: 11712b9ca17 "intel/blorp: Fix blits to R8G8B8_UNORM_SRGB" Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
* anv: Stop setting 3DSTATE_PS_EXTRA::PixelShaderHasUAVJason Ekstrand2018-07-221-31/+22
| | | | | | | | | | | | | | We've had several broadwell hangs that have come down to this bit just not working correctly. Most recently, we've had a pile of hangs reported with apps running under DXVK: https://github.com/doitsujin/dxvk/issues/469 Instead, use the bit that doesn't try to imply weird D3D coherency things and just force-enables the PS like we want. cc: mesa-stable@lists.freedesktop.org Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* anv: Properly handle GetImageSubresourceLayout on complex imagesJason Ekstrand2018-07-221-7/+16
| | | | | | | | We support mipmapped and arrayed linear images so we need to support vkGetImageSubresourceLayout on them. Fortunately, it's just a trivial call into ISL. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
* radeonsi/nir: make use of nir_lower_load_const_to_scalar()Timothy Arceri2018-07-231-0/+2
| | | | | | | | | This allows NIR to CSE more operations. LLVM does this also so the impact is limited, however doing this in NIR allows other opts to make progress. For example some loops in Civilization Beyond Earth shaders are unrolled. Reviewed-by: Marek Olšák <marek.olsak@amd.com>
* anv/gen9: expose VK_EXT_post_depth_coverageIlia Mirkin2018-07-223-2/+10
| | | | | | | | | | Note that the use of ICMS_INNER_CONSERVATIVE disagrees with the GL driver. Perhaps it's more performant than ICMS_NORMAL and is otherwise permitted? Not sure, so I left it as-is. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
* spirv: add support for SPV_KHR_post_depth_coverageIlia Mirkin2018-07-222-0/+10
| | | | | | | | | Allow the capability to be exposed, and convert the new execution mode into fs state. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
* android: util/disk_cache: fix building errors in gallium driversMauro Rossi2018-07-211-0/+1
| | | | | | | | | | | | | | | | | | This patch applies the necessary changes in Android.common.mk as per automake rules, to avoid following building error: external/mesa/src/gallium/drivers/nouveau/nouveau_screen.c:159:8: error: implicit declaration of function 'disk_cache_get_function_timestamp' is invalid in C99 [-Werror,-Wimplicit-function-declaration] if (disk_cache_get_function_timestamp(nouveau_disk_cache_create, ^ 1 error generated. (v2) -DENABLE_SHADER_CACHE Android cflag is kept, to leave the AS-IS capability enabled Fixes: cc10b34 ("util/disk_cache: Fix disk_cache_get_function_timestamp with disabled cache.") Signed-off-by: Mauro Rossi <issor.oruam@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
* Android: fix a missing nir_intrinsics.h errorChih-Wei Huang2018-07-212-0/+3
| | | | | | | | | | | | | | | | | | | | | The commit 76dfed8ae2d5 changed nir_intrinsics.h to be a generated header, but the corresponding dependency was not updated for Android. It causes the error: [ 0% 19/4336] target C: libmesa_pipe_radeonsi <= external/mesa/src/gallium/drivers/radeonsi/si_debug.c ... In file included from external/mesa/src/gallium/drivers/radeonsi/si_debug.c:25: In file included from external/mesa/src/gallium/drivers/radeonsi/si_pipe.h:28: In file included from external/mesa/src/gallium/drivers/radeonsi/si_shader.h:140: In file included from external/mesa/src/amd/common/ac_llvm_build.h:30: external/mesa/src/compiler/nir/nir.h:966:10: fatal error: 'nir_intrinsics.h' file not found ^~~~~~~~~~~~~~~~~~ 1 error generated. Fixes: 76dfed8ae2d5 ("nir: mako all the intrinsics") Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Mauro Rossi <issor.oruam@gmail.com>
* nir: Fix end of function without return warning/error.Bas Nieuwenhuizen2018-07-201-0/+2
| | | | | | | | | There always is a continue block, so let us just do unreachable. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Fixes: 8cacf38f527 "nir: Do not use continue block after removing it." CC: 18.1 <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107312