2024
Permanent URI for this collection
Browse
Browsing 2024 by Title
Now showing 1 - 3 of 3
Results Per Page
Sort Options
Item Computational models of visual attention and gaze behavior in virtual reality(2024-03-08) Martin, DanielVirtual reality (VR) is an emerging medium that has the potential to unlock unprecedented experiences. Since the late 1960s, this technology has advanced steadily, and can nowadays be a gateway to a completely different world. VR offers a degree of realism, immersion, and engagement never seen before, and lately we have witnessed how newer virtual content is being continuously created. However, to get the most out of this promising medium, there is still much to learn about people’s visual attention and gaze behavior in the virtual universe. Questions like “What attracts users’ attention?” or “How malleable is the human brain when in a virtual experience?” have no definite answer yet. We argue that it is important to build a principled understanding of viewing and attentional behavior in VR. This thesis presents contributions in two key aspects: Understanding and modeling users’ gaze behavior, and leveraging imperceptible manipulations to improve the virtual experience. In the first part of this thesis we have focused on developing computational models of gaze behavior in virtual environments. First, and resorting to the well-known concept of saliency, we have devised models of user attention in 360o images and 360o videos that are able to predict which parts of a virtual scene are more likely to draw viewers’ attention. Then, we have designed another two computational models for spatio-temporal attention prediction, one of them able to simulate thousands of virtual observers per second by generating realistic sequences of gaze points in 360o images, and the other one predicting different, yet plausible sequences of fixations on traditional images. Additionally, we have explored how attention works in 3D meshes. All such models have allowed us to delve into the particularities of human gaze behavior under different environments. Besides that, we have aimed at achieving a deeper understanding on visual attention in multimodal environments. First, we have exhaustively reviewed a vast literature on the use of additional sensory modalities, like audio, haptics, or proprioception, in virtual reality - also known as multimodality -, and its role and benefits in several disciplines. Then, we have gathered and analyzed the largest dataset of viewing behavior in ambisonic 360o videos to date, finding effects on different factors like type of content, or gender, among others. We have finally analyzed how viewing behavior varies depending on the performed tasks: We have delved into attention in the very specific case of driving scenarios, and we have also studied how significant effects in gaze behavior can be found when performing different tasks in immersive environments. The second part of this thesis attempts to improve virtual experiences by means of imperceptible manipulations. We have firstly focused on lateral movement in VR, and have devised thresholds for the detection of such manipulations, which we then applied in three key problems in VR that have no definite solution yet, namely 6-DoF viewing of 3-DoF content, overcoming physical space constraints, and reducing motion sickness. On the other hand, we have explored the manipulation of the virtual scene, resorting to the phenomenon of change blindness, and have derived insights and guidelines on how to elicit or avoid such an effect, and how human brains’ limitations affect it.Item Improving the efficiency of point cloud data management(TUprints, 2024-07) Bormann, PascalThe collection of point cloud data has increased drastically in recent years, which poses challenges for the data management layer. Multi-billion point datasets are commonplace and users are getting accustomed to real-time data exploration in the Web. To make this possible, existing point cloud data management approaches rely on optimized data formats which are time- and resource-intensive to generate. This introduces long wait times before data can be used and frequent data duplication, since these optimized formats are often domain- or application-specific. As a result, data management is a challenging and expensive aspect when developing applications that use point cloud data. We observe that the interaction between applications and the point cloud data management layer can be modeled as a series of queries similar to those found in traditional databases. Based on this observation, we evaluate current point cloud data management using three query metrics: Responsiveness, throughput, and expressiveness. We contribute to the current state of the art by improving these metrics for both the handling of raw files without preprocessing, as well as indexed point clouds. In the domain of unindexed point cloud data, we introduce the concept of ad-hoc queries, which are queries executed ad-hoc on raw point cloud files. We demonstrate that ad-hoc queries can improve query responsiveness significantly as they do not require long wait times for indexing or database imports. Using columnar memory layouts, queries on datasets of up to a billion points can be answered in interactive or near-interactive time, with throughputs of more than one hundred million points per second on unindexed data. A demonstration of an adaptive indexing method shows that spending a few seconds per query on index creation can improve responsiveness by up to an order of magnitude. Our experiments also confirm the importance of high-throughput systems when querying point cloud data, as the overhead of data transmission has a significant effect on the overall query performance. For situations where indexing is mandatory, we demonstrate improvements to the runtime performance of existing point cloud indexing tools. We developed a fast indexer based on task-parallel programming, using Morton indices to efficiently sort and distribute point batches onto worker threads. This system, called Schwarzwald, outperformed existing indexers by up to a factor 9 when it was first published, and still has competitive performance to current out-of-core capable indexers. Additionally we adapted our indexing algorithm for distributed processing in a Cloud-environment and demonstrate that its horizontal scalability allows it to outperform all existing indexers by up to a factor of 3. Lastly we demonstrated point cloud indexing in real-time during Light Detection And Ranging (LiDAR) capturing, based on a similar task-based algorithm but optimized for progressive indexing. Our real-time indexer is able to keep up with current LiDAR sensors in a real-world test, with end-to-end latencies as low as 0.1 seconds. Together, our improvements significantly reduce wait times for working with point cloud data and increase the overall efficiency of the data access layer.Item Visual Insights into Memory Behavior of GPU Ray Tracers(TUprints, 2024-07) von Buelow, MaxRay tracing is a fundamental rendering technique that typically projects three-dimensional representations of a scene onto a two-dimensional display. This is achieved by perspectively sampling a set of rays into the scene and computing intersections against the relevant geometry. Secondary rays may be sent out from these intersection points, allowing for physically correct global illumination on the reverse photon direction. Real-time rendering has historically used classical rasterization pipelines, which are straightforward to implement on hardware as they form a data-parallel problem projecting the whole scene into the coordinate system of the image. In contrast, task-parallel ray tracing suffers from incoherency between rays. However, recent advances in ray tracing have led to more efficient approaches, resulting in even more efficient embedded hardware implementations. While these approaches are already capable of rendering realistic images, further improvements in run-time performance can compensate for computational time to achieve higher framerates, display resolutions, ray-tracing recursion depths, or reducing the energy footprint of ray-tracing data centers. A fundamental technique for improving ray-tracing performance is the use of bounding-volume hierarchies (BVH), which prevent rays from intersecting the entire scene, especially in occluded or distant regions. In addition to the structural efficiency of a BVH, the primary bottlenecks of GPU ray tracing are memory latency and work distribution. These factors mainly result in more coherent memory accesses, making caching more efficient. Creating programs with the goal of achieving higher caching rates typically requires increased programming efforts and a deep understanding of the hardware, as an additional abstraction layer is introduced, making the memory pipeline less transparent. General-purpose profilers aim to support the implementation process. However, they typically display caching rates based on kernel calls. This is because these values are measured using basic hardware counters that do not distinguish between the context of a memory access. In many cases, it would be useful to have a more detailed representation of memory-related profiling metrics, such as the number of recordings per memory allocation or projections into other domains, such as the framebuffer or the scene geometry. This thesis presents a new method for simulating the GPU memory pipeline accurately. The method uses memory traces exported by dynamic binary instrumentation, which can be applied to any compiled GPU binaries, similar to standard profilers. The exported memory profiles can be used for performance visualization purposes in individual domains, as well as traditional memory profiling metrics that can be displayed in finer granularity than usual. A method for mapping memory metrics onto the original scene is included, allowing users to explore profiling results within the scene domain, making the profiling process more intuitive. In addition, this thesis presents a novel compressed ray-tracing implementation that optimizes its memory footprint by making assumptions about the topological properties of the scene to be rendered. The findings can be used to evaluate and optimize a wide range of ray tracing and ray marching applications in a user-friendly manner.