A Multimodal Dataset for Dialogue Intent Recognition through Human Movement and Nonverbal Cues

dc.contributor.author: Lin, Shu-Wei en_US
dc.contributor.author: Zhang, Jia-Xiang en_US
dc.contributor.author: Lu, Jun-Fu Lin en_US
dc.contributor.author: Huang, Yi-Jheng en_US
dc.contributor.author: Zhang, Junpo en_US
dc.contributor.editor: Christie, Marc en_US
dc.contributor.editor: Han, Ping-Hsuan en_US
dc.contributor.editor: Lin, Shih-Syun en_US
dc.contributor.editor: Pietroni, Nico en_US
dc.contributor.editor: Schneider, Teseo en_US
dc.contributor.editor: Tsai, Hsin-Ruey en_US
dc.contributor.editor: Wang, Yu-Shuen en_US
dc.contributor.editor: Zhang, Eugene en_US
dc.date.accessioned: 2025-10-07T06:05:23Z
dc.date.available: 2025-10-07T06:05:23Z
dc.date.issued: 2025
dc.description.abstract: This paper presents a multimodal dataset designed to advance dialogue intent recognition through skeleton-based representations and temporal human movement features. Rather than proposing a new model, our objective is to provide a high-quality, annotated dataset that captures the subtle nonverbal cues that precede human speech and interaction. The dataset includes skeletal joint coordinates, facial orientation, and contextual object data (e.g., microphone positions), collected from diverse participants across varied conversational scenarios. In future research, we will benchmark three types of learning methods and offer comparative insights: handcrafted feature models, sequence models (LSTM), and graph-based models (GCN). This resource aims to facilitate the development of more natural, sensor-free, data-driven human-computer interaction systems by providing a robust foundation for training and evaluation. en_US
dc.description.sectionheaders: Posters and Demos
dc.description.seriesinformation: Pacific Graphics Conference Papers, Posters, and Demos
dc.identifier.doi: 10.2312/pg.20251310
dc.identifier.isbn: 978-3-03868-295-0
dc.identifier.pages: 2 pages
dc.identifier.uri: https://doi.org/10.2312/pg.20251310
dc.identifier.uri: https://diglib.eg.org/handle/10.2312/pg20251310
dc.publisher: The Eurographics Association en_US
dc.rights: Attribution 4.0 International License
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/
dc.title: A Multimodal Dataset for Dialogue Intent Recognition through Human Movement and Nonverbal Cues en_US
Files
Original bundle
Name: pg20251310.pdf
Size: 354.54 KB
Format: Adobe Portable Document Format