Evaluation of Deep Pose Detectors for Automatic Analysis of Film Style

Authors: Wu, Hui-Yin; Nguyen, Luan; Tabei, Yoldoz; Sassatelli, Lucile
Editors: Ronfard, Rémi; Wu, Hui-Yin
Date: 2022-04-20
Year: 2022
ISBN: 978-3-03868-173-1
ISSN: 2411-9733
DOI: https://doi.org/10.2312/wiced.20221047
URL: https://diglib.eg.org:443/handle/10.2312/wiced20221047
Pages: 5-12 (8 pages)

Abstract: Identifying human characters and how they are portrayed on-screen is inherently linked to how we perceive and interpret the story and artistic value of visual media. Building computational models that are sensitive to story will thus require a formal representation of the character. Yet this kind of data is complex and tedious to annotate on a large scale. Human pose estimation (HPE) can facilitate this task by identifying features such as position, size, and movement that can be transformed into input for machine learning models, enabling higher-level artistic and storytelling interpretation. However, current HPE methods operate mainly on non-professional image content, with no comprehensive evaluation of their performance on artistic film. Our goal in this paper is thus to evaluate the performance of HPE methods on artistic film content. We first propose a formal representation of the character based on cinematography theory, then sample and annotate 2700 images from three datasets with this representation, one of which we introduce to the community. An in-depth analysis is then conducted to measure the general performance of two recent HPE methods on metrics of precision and recall for character detection, and to examine the impact of cinematographic style. From these findings, we highlight the advantages of HPE for automated film analysis, and propose future directions to improve their performance on artistic film content.

CCS Concepts: Computing methodologies --> Computer vision; Neural networks; Applied computing --> Media arts