
    Learning a Generative Model for Multi-Step Human-Object Interactions from Videos

    View/Open
    v38i2pp367-378.pdf (11.77Mb)
    suppmat.pdf (1.910Mb)
    suppvideo.mp4 (250.0Mb)
    Date
    2019
    Author
    Wang, He
    Pirk, Sören
    Yumer, Ersin
    Kim, Vladimir
    Sener, Ozan
    Sridhar, Srinath
    Guibas, Leonidas
    Abstract
    Creating dynamic virtual environments consisting of humans interacting with objects is a fundamental problem in computer graphics. While it is well accepted that agent interactions play an essential role in synthesizing such scenes, most extant techniques focus exclusively on static scenes, leaving the dynamic component out. In this paper, we present a generative model to synthesize plausible multi-step dynamic human-object interactions. Generating multi-step interactions is challenging since the space of such interactions is exponential in the number of objects, activities, and time steps. We propose to handle this combinatorial complexity by learning a lower-dimensional space of plausible human-object interactions. We use action plots to represent interactions as a sequence of discrete actions along with the participating objects and their states. To build action plots, we present an automatic method that uses state-of-the-art computer vision techniques on RGB videos to detect individual objects and their states, extract the involved hands, and recognize the actions performed. The action plots are built from observing videos of everyday activities and are used to train a generative model based on a Recurrent Neural Network (RNN). The network learns the causal dependencies and constraints between individual actions and can be used to generate novel and diverse multi-step human-object interactions. Our representation and generative model allow new capabilities in a variety of applications such as interaction prediction, animation synthesis, and motion planning for a real robotic agent.
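    The abstract describes action plots as sequences of discrete actions (with participating objects and states) modeled by an RNN. The sketch below is a minimal illustration of that idea, not the paper's implementation: each plot step is flattened into a single token from a toy vocabulary, an LSTM is trained to predict the next step, and new multi-step interactions are sampled autoregressively. All names, the vocabulary, and the hyperparameters are assumptions made for illustration.

    # Illustrative sketch only (not the authors' code): an action-plot step is
    # represented as one token combining (action, object), and an LSTM learns to
    # predict the next step of the plot.
    import torch
    import torch.nn as nn

    # Hypothetical action-plot vocabulary for a toy "make a drink" activity.
    VOCAB = ["<start>", "open(fridge)", "take(milk)", "close(fridge)",
             "pour(milk,cup)", "drink(cup)", "<end>"]
    tok = {t: i for i, t in enumerate(VOCAB)}

    class ActionPlotRNN(nn.Module):
        def __init__(self, vocab_size, embed_dim=32, hidden_dim=64):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, embed_dim)
            self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
            self.head = nn.Linear(hidden_dim, vocab_size)

        def forward(self, tokens, state=None):
            x = self.embed(tokens)            # (B, T, E)
            out, state = self.lstm(x, state)  # (B, T, H)
            return self.head(out), state      # logits over the next action

    # Toy training pair: predict each next step of one observed action plot.
    plot = torch.tensor([[tok[t] for t in VOCAB]])   # (1, T)
    inputs, targets = plot[:, :-1], plot[:, 1:]

    model = ActionPlotRNN(len(VOCAB))
    opt = torch.optim.Adam(model.parameters(), lr=1e-2)
    for _ in range(200):
        logits, _ = model(inputs)
        loss = nn.functional.cross_entropy(logits.reshape(-1, len(VOCAB)),
                                           targets.reshape(-1))
        opt.zero_grad(); loss.backward(); opt.step()

    # Generation: sample a novel multi-step interaction token by token.
    seq, state, steps = torch.tensor([[tok["<start>"]]]), None, []
    for _ in range(10):
        logits, state = model(seq, state)
        probs = torch.softmax(logits[:, -1], dim=-1)
        seq = torch.multinomial(probs, 1)
        if seq.item() == tok["<end>"]:
            break
        steps.append(VOCAB[seq.item()])
    print(steps)

    In the paper, the plots are extracted automatically from RGB videos and the model additionally tracks object states; the sketch collapses all of that into a flat token sequence purely to show the recurrent next-action structure.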
    BibTeX
    @article {10.1111:cgf.13644,
    journal = {Computer Graphics Forum},
    title = {{Learning a Generative Model for Multi-Step Human-Object Interactions from Videos}},
    author = {Wang, He and Pirk, Sören and Yumer, Ersin and Kim, Vladimir and Sener, Ozan and Sridhar, Srinath and Guibas, Leonidas},
    year = {2019},
    publisher = {The Eurographics Association and John Wiley & Sons Ltd.},
    ISSN = {1467-8659},
    DOI = {10.1111/cgf.13644}
    }
    URI
    https://doi.org/10.1111/cgf.13644
    https://diglib.eg.org:443/handle/10.1111/cgf13644
    Collections
    • 38-Issue 2

    Eurographics Association copyright © 2013 - 2020 