Search Results
Now showing 1 - 10 of 31
Item A Parallel Approach to Compression and Decompression of Triangle Meshes using the GPU (The Eurographics Association and John Wiley & Sons Ltd., 2017) Jakob, Johannes; Buchenau, Christoph; Guthe, Michael; Bærentzen, Jakob Andreas and Hildebrandt, Klaus
Most state-of-the-art compression algorithms use complex connectivity traversal and prediction schemes, which are not efficient enough for online compression of large meshes. In this paper we propose a scalable massively parallel approach for compression and decompression of large triangle meshes using the GPU. Our method traverses the input mesh in a parallel breadth-first manner and encodes the connectivity data similarly to the well-known cut-border machine. Geometry data is compressed using a local prediction strategy. In contrast to the original cut-border machine, we can additionally handle triangle meshes with inconsistently oriented faces. Our approach is more than one order of magnitude faster than currently used methods and achieves competitive compression rates.

Item Iso Photographic Rendering (The Eurographics Association, 2018) Porral, Philippe; Lucas, Laurent; Muller, Thomas; Randrianandrasana, Joël; Reinhard Klein and Holly Rushmeier
In the field of computer graphics, the simulation of the visual appearance of materials requires an accurate computation of the light transport equation. Consequently, material models need to take into account various factors which may influence the spectral radiance perceived by the human eye. Though numerous relevant studies on the reflectance properties of materials have been conducted to date, environment maps used to simulate visual behaviors remain chiefly trichromatic. Whereas questions regarding the accurate characterization of natural lighting have been raised for some time, there are still no real sky environment maps that include both spectral radiance and polarization data.
Under these conditions the simulations carried out are approximate and therefore insufficient for the industrial world, where investment-sensitive decisions are often made based on these very calculations.

Item Ray-Traced Collision Detection: Interpenetration Control and Multi-GPU Performance (The Eurographics Association, 2013) Lehericey, Francois; Gouranton, Valérie; Arnaldi, Bruno; Betty Mohler and Bruno Raffin and Hideo Saito and Oliver Staadt
We proposed in [LGA13] an iterative ray-traced collision detection algorithm (IRTCD) that exploits spatial and temporal coherency and proved to be computationally efficient, but at the price of some geometrical approximations that allow more interpenetration than needed. In this paper, we present two methods to efficiently control and reduce the interpenetration without noticeable computation overhead. The first method predicts the next potentially colliding vertices. These predictions are used to make our IRTCD algorithm more robust to the above-mentioned approximations, thereby reducing the errors by up to 91%. We also present a ray re-projection algorithm that improves the physical response of ray-traced collision detection algorithms. This algorithm also reduces the interpenetration between objects in a virtual environment by up to 52%. Our last contribution shows that our algorithm, when implemented on multi-GPU architectures, is far faster.

Item Reduced Precision for Hardware Ray Tracing in GPUs (The Eurographics Association, 2014) Keely, Sean; Ingo Wald and Jonathan Ragan-Kelley
We propose a high-performance, GPU-integrated hardware ray tracing system. We present and make use of a new analysis of ray traversal in axis-aligned bounding volume hierarchies. This analysis enables compact traversal hardware through the use of reduced-precision arithmetic. We also propose a new cache-based technique for scheduling ray traversal.
With the addition of our compact fixed-function traversal unit and cache mechanism, we show that current GPU architectures are well suited for hardware-accelerated ray tracing, requiring only small modifications to provide high performance. By making use of existing GPU resources we are able to keep all rays and scheduling traffic on chip and out of caches. We used simulations to estimate the performance of our architecture. Our system achieves an average ray rate of 3.4 billion rays per second while path tracing our test scenes.

Item Path Tracing on Massively Parallel Neuromorphic Hardware (The Eurographics Association, 2012) Richmond, Paul; Allerton, David J.; Hamish Carr and Silvester Czanner
Ray tracing on parallel hardware has recently benefited from significant advances in graphics hardware and associated software tools. Despite this, the SIMD nature of graphics card architectures means they perform well only on groups of coherent rays which exhibit little in the way of divergence. This paper presents SpiNNaker, a massively parallel system based on low-power ARM cores, as an architecture suitable for ray tracing applications. The asynchronous design allows us to demonstrate a linear performance increase with respect to the number of cores. The performance per Watt ratio achieved within the fixed-point path tracing example presented is far greater than that of a multi-core CPU and similar to that of a GPU under optimal conditions.

Item Packet-Oriented Streamline Tracing on Modern SIMD Architectures (The Eurographics Association, 2015) Hentschel, Bernd; Göbbert, Jens Henrik; Klemm, Michael; Springer, Paul; Schnorr, Andrea; Kuhlen, Torsten W.; C. Dachsbacher and P. Navrátil
The advection of integral lines is an important computational kernel in vector field visualization. We investigate how this kernel can profit from vector (SIMD) extensions in modern CPUs.
As a baseline, we formulate a streamline tracing algorithm that facilitates auto-vectorization by an optimizing compiler. We analyze this algorithm and propose two different optimizations. Our results show that particle tracing does not per se benefit from SIMD computation. Based on a careful analysis of the auto-vectorized code, we propose an optimized data access routine and a re-packing scheme which increases average SIMD efficiency. We evaluate our approach on three different turbulent flow fields. Our optimized approaches increase integration performance by up to a factor of 5.6 over our baseline measurement. We conclude with a discussion of current limitations and aspects for future work.

Item GPU Collision Detection in Conformal Geometric Space (The Eurographics Association, 2021) Roa, Eduardo; Theoktisto, Víctor; Fairén, Marta; Navazo, Isabel; Silva, F. and Gutierrez, D. and Rodríguez, J. and Figueiredo, M.
We derive a conformal algebra treatment unifying all types of collisions among points, vectors, areas (defined by bivectors and trivectors) and 3D solid objects (defined by trivectors and quadvectors), based on a reformulation of collision queries from R^3 to conformal R^{4,1} space. The algebraic formulation in this 5D space is then implemented on the GPU to allow faster parallel computation of queries. Results show the expected orders-of-magnitude improvements when computing collisions among known mesh models, allowing interactive rates without using optimizations or bounding volume hierarchies.

Item Stochastic Depth Buffer Compression using Generalized Plane Encoding (The Eurographics Association and Blackwell Publishing Ltd., 2013) Andersson, Magnus; Munkberg, Jacob; Akenine-Möller, Tomas; I. Navazo, P. Poulin
In this paper, we derive compact representations of the depth function for a triangle undergoing motion or defocus blur. Unlike a static primitive, where the depth function is planar, the depth function is a rational function in time and the lens parameters.
Furthermore, we show how these compact depth functions can be used to design an efficient depth buffer compressor/decompressor, which significantly lowers total depth buffer bandwidth usage for a range of test scenes. In addition, our compressor/decompressor is simpler in the number of operations needed to execute, which makes our algorithm more amenable to hardware implementation than previous methods.

Item A Parallel Architecture for IISPH Fluids (The Eurographics Association, 2014) Thaler, Felix; Solenthaler, Barbara; Gross, Markus; Jan Bender and Christian Duriez and Fabrice Jaillet and Gabriel Zachmann
We present an architecture for parallel computation of incompressible IISPH simulations on distributed memory systems. We use orthogonal recursive bisection for domain decomposition and present a stable and fast-converging load balancing controller. The neighbor search data structure is derived such that it optimally fits into the parallel pipeline. We further show how symmetry aspects of the simulation can be integrated into the architecture. Simultaneous communication and computation are used to minimize parallelization overhead. The seamless integration of these parallel concepts into IISPH results in near-linear scaling for large-scale simulations.

Item Bandwidth-Efficient BVH Layout for Incremental Hardware Traversal (The Eurographics Association, 2016) Liktor, Gabor; Vaidyanathan, Karthik; Ulf Assarsson and Warren Hunt
The memory footprint of bounding volume hierarchies (BVHs) can be significantly reduced using incremental encoding, which enables the coarse quantization of bounding volumes. However, this compression alone does not necessarily yield a comparable improvement in memory bandwidth. While the bounding volumes of the BVH nodes can be aggressively quantized, the size of the child node pointers remains a significant overhead. Moreover, as BVH nodes become comparably small to practical cache line sizes, the BVH is cached less efficiently.
In this paper we introduce a novel memory layout and node addressing scheme and map it to a system architecture for fixed-function ray traversal. We evaluate this scheme using an architecture simulator and demonstrate a significant reduction in memory bandwidth, compared to previous approaches.