
    Real-time 3D Human Body Pose Estimation from Monocular RGB Input

    Date
    2020-10
    Author
    Mehta, Dushyant
    Abstract
    Human motion capture finds extensive application in movies, games, sports and biomechanical analysis. However, existing motion capture solutions require cumbersome external and/or on-body instrumentation, or use active sensors with limits on the possible capture volume dictated by power consumption. The ubiquity and ease of deployment of RGB cameras make monocular RGB-based human motion capture an extremely useful problem to solve: a solution would lower the barrier to entry for content creators to employ motion capture tools and enable newer applications of human motion capture. This thesis demonstrates the first real-time monocular RGB-based motion capture solutions that work in general scene settings. They are based on neural-network-based approaches that address the ill-posed problem of estimating 3D human pose from a single RGB image, in combination with model-based fitting. In particular, the contributions of this work make advances towards three key aspects of real-time monocular RGB-based motion capture, namely speed, accuracy, and the ability to work for general scenes. New training datasets are proposed for single-person and multi-person scenarios, which, together with the proposed transfer-learning-based training pipeline, allow learning-based approaches to be appearance invariant. The training datasets are accompanied by evaluation benchmarks with multiple avenues of fine-grained evaluation. The evaluation benchmarks differ visually from the training datasets, so as to promote efforts towards solutions that generalize to in-the-wild scenes. The proposed task formulations for the single-person and multi-person cases allow higher accuracy and incorporate additional qualities, such as occlusion robustness, that are helpful in the context of a full motion capture solution.
    The multi-person formulations are designed to have a nearly constant inference time regardless of the number of subjects in the scene and, combined with contributions towards fast neural network inference, enable real-time 3D pose estimation for multiple subjects. Combining the proposed learning-based approaches with a model-based kinematic skeleton fitting step provides temporally stable joint angle estimates, which can be readily employed for driving virtual characters.
    URI
    https://diglib.eg.org:443/handle/10.2312/2632998
    Collections
    • 2020

    Eurographics Association copyright © 2013 - 2020 
    Send Feedback | Contact - Imprint | Data Privacy Policy | Disable Google Analytics
    Theme by @mire NV
    System hosted at  Graz University of Technology.