Search Results
Now showing 1 - 4 of 4

Item: Inpainting Normal Maps for Lightstage data
(The Eurographics Association, 2023) Zuo, Hancheng; Tiddeman, Bernard; Vangorp, Peter; Hunter, David

This paper presents a new method for inpainting normal maps using a generative adversarial network (GAN) model. Normal maps can be acquired from a lightstage, but when they are used for performance capture there is a risk of areas of the face being obscured by movement (e.g. by arms, hair or props). Inpainting aims to fill the missing areas of an image with plausible data. This work builds on previous work on general image inpainting, using a bow-tie-like generator network and a discriminator network, and alternating the training of the generator and discriminator. The generator tries to synthesise images that match the ground truth and that can also fool the discriminator, which classifies real vs. processed images; the discriminator is occasionally retrained to improve its performance at identifying the processed images. In addition, our method takes into account the nature of the normal map data, which requires a modification to the loss function: we replace the mean squared error loss with a cosine loss when training the generator. Because of the small amount of training data available, even when using synthetic datasets, we require significant augmentation, which also needs to respect the particular nature of the input data: image flipping and in-plane rotations need to properly flip and rotate the normal vectors. During training, we monitored key performance metrics, including the average loss, Structural Similarity Index Measure (SSIM) and Peak Signal-to-Noise Ratio (PSNR) of the generator, alongside the average loss and accuracy of the discriminator. Our analysis reveals that the proposed model generates high-quality, realistic inpainted normal maps, demonstrating its potential for application to performance capture. The results of this investigation provide a baseline on which future researchers could build, with more advanced networks and a comparison against inpainting the source images used to generate the normal maps.
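The loss and augmentation changes described in this abstract are easy to illustrate. Below is a minimal sketch, not the authors' implementation, assuming PyTorch and normal maps stored as (batch, 3, H, W) tensors of unit vectors; the function names, the channel ordering, and the optional inpainting mask are all illustrative assumptions.

```python
# Minimal sketch (assumed layout: normals as (batch, 3, H, W) PyTorch tensors).
import torch
import torch.nn.functional as F

def cosine_loss(pred, target, mask=None):
    """Penalise the angle between predicted and ground-truth normals:
    mean(1 - cos(theta)) is 0 when they agree, 2 when they are opposite."""
    pred = F.normalize(pred, dim=1, eps=1e-6)     # re-normalise to unit length
    target = F.normalize(target, dim=1, eps=1e-6)
    loss = 1.0 - (pred * target).sum(dim=1, keepdim=True)  # per-pixel 1 - cos
    if mask is not None:                          # restrict to inpainted region
        return (loss * mask).sum() / mask.sum().clamp(min=1.0)
    return loss.mean()

def flip_normals_horizontal(normal_map):
    """A left-right image flip must also mirror the vectors themselves:
    negate the x component (assumed to be channel 0) after flipping."""
    flipped = torch.flip(normal_map, dims=[-1])
    flipped[:, 0] = -flipped[:, 0]
    return flipped
```

An in-plane rotation would be handled analogously: rotate the image grid, then rotate the (x, y) components of every normal vector by the same angle.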
Item: Exploring Language Pedagogy with Virtual Reality and Artificial Intelligence
(The Eurographics Association, 2023) Michael, Brandon; Aburumman, Nadine; Vangorp, Peter; Hunter, David

Virtual Reality (VR) offers a highly immersive and interactive experience that engrosses users in a 3D virtual environment. Recent technological advancements, with high-resolution headset displays and accurate six-degrees-of-freedom tracking paired with controllers, allow life-like renditions of real-world scenarios, as well as fictional scenarios, without potential environmental risks. This paper explores the use of Virtual Reality in education by incorporating current pedagogical approaches into an interactive 3D virtual environment. The focus of this study is language pedagogy; specifically, the tool developed teaches users fundamental Mandarin Chinese. This educational VR application enables users to practice their reading and writing skills through a calligraphy lesson and engages them in a listening and speaking lesson through natural conversation. To achieve an organic dialogue, phrases spoken by the user during a lesson are validated immediately through an intuitive phrase recognition system developed using machine learning. The developed prototype has undergone testing to ensure its efficacy. An initial investigation found that the majority of participants were supportive of the concept and believed that it would improve engagement with digital education.

Item: Investigating Deep Learning for Identification of Crabs and Lobsters on Fishing Boats
(The Eurographics Association, 2023) Iftikhar, Muhammad; Tiddeman, Bernard; Neal, Marie; Hold, Natalie; Neal, Mark; Vangorp, Peter; Hunter, David

This paper describes a collaboration between marine and computer scientists to improve fisheries data collection. We evaluate deep learning (DL)-based solutions for identifying crabs and lobsters onboard fishing boats. A custom-made electronic camera system onboard the fishing boats captures video clips, and an automated frame-extraction process collects images of crabs and lobsters for training and evaluating DL networks. We train Faster R-CNN, Single Shot Detector (SSD), and You Only Look Once (YOLO) with multiple backbones and input sizes. We also evaluate the efficiency of lightweight models for the low-power devices installed on fishing boats, comparing the results of MobileNet-based SSD and the YOLO-tiny versions. Models trained with larger input sizes run at lower frames per second (FPS), and vice versa. Base models are more accurate but incur higher computational and run-time costs; lighter versions are easier to deploy but achieve lower mAP than the full models. Pre-trained weights used when training the models on the new dataset have a negligible impact on the results. YOLOv4-tiny offers a balanced trade-off between accuracy and speed for object detection on low-power devices and forms the main step of our proposed pipeline for the automated recognition and measurement of crabs and lobsters on fishing boats.

Item: Augmenting Anomaly Detection Datasets with Reactive Synthetic Elements
(The Eurographics Association, 2023) Nikolov, Ivan; Vangorp, Peter; Hunter, David

Automatic anomaly detection for surveillance purposes has become an integral part of accident prevention and early-warning systems. The lack of sufficient real datasets for training and testing such detectors has pushed a lot of research into synthetic data generation, and a hybrid approach that combines real images with synthetic elements has been shown to produce the best training results. We aim to extend this hybrid approach by combining the backgrounds and real people captured in datasets with synthetic elements that dynamically react to the real pedestrians, creating more coherent video sequences. Our pipeline is the first to directly attach synthetic objects such as handbags and suitcases to real pedestrians and to provide dynamic occlusion between real and synthetic elements in the images. The pipeline can easily be used to produce a continuous stream of randomized augmented normal and abnormal data for training and testing. As a basis for our augmented images, we use one of the most widely used classical datasets for anomaly detection, the UCSD dataset. We show that the synthetic data produced by our proposed pipeline can be used to make the dataset harder for state-of-the-art models by introducing more varied and challenging anomalies. We also demonstrate that the additional synthetic normal data can boost the performance of some models. Our solution can easily be extended with additional 3D models, animations, and anomaly scenarios.
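As a rough illustration of the occlusion-aware compositing the last abstract describes, the sketch below blends a rendered synthetic layer (e.g. a suitcase) into a real surveillance frame while letting a real pedestrian occlude it. This is an assumption-laden simplification, not the paper's pipeline: the array names, layouts, and the precomputed pedestrian mask are all hypothetical.

```python
# Minimal sketch: frame (H, W, 3) uint8, synth_rgba (H, W, 4) uint8 aligned
# with the frame, pedestrian_mask (H, W) binary, 1 where the person is in front.
import numpy as np

def composite_with_occlusion(frame, synth_rgba, pedestrian_mask):
    """Alpha-blend the synthetic layer over the frame, except where the
    real pedestrian should occlude the synthetic object."""
    rgb = synth_rgba[..., :3].astype(np.float32)
    alpha = synth_rgba[..., 3:].astype(np.float32) / 255.0
    # Suppress the synthetic alpha wherever the pedestrian is in front,
    # so the real person correctly hides the object behind them.
    alpha *= 1.0 - pedestrian_mask[..., None].astype(np.float32)
    out = alpha * rgb + (1.0 - alpha) * frame.astype(np.float32)
    return out.astype(np.uint8)
```

Updating the mask every frame as the pedestrian moves is what turns static pasting into the dynamic occlusion between real and synthetic elements that the abstract describes.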