DeFT-Net: Dual-Window Extended Frequency Transformer for Rhythmic Motion Prediction

dc.contributor.author: Ademola, Adeyemi [en_US]
dc.contributor.author: Sinclair, David [en_US]
dc.contributor.author: Koniaris, Babis [en_US]
dc.contributor.author: Hannah, Samantha [en_US]
dc.contributor.author: Mitchell, Kenny [en_US]
dc.contributor.editor: Hunter, David [en_US]
dc.contributor.editor: Slingsby, Aidan [en_US]
dc.date.accessioned: 2024-09-09T05:44:59Z
dc.date.available: 2024-09-09T05:44:59Z
dc.date.issued: 2024
dc.description.abstract: Enabling online virtual reality (VR) users to dance and move in a way that mirrors the real world requires more accurate prediction of human motion sequences, paving the way for an immersive and connected experience. However, latency in networked motion tracking is a critical obstacle to creating a sense of complete engagement, so remote motions must be predicted for online synchronization. To address this challenge, we propose a novel approach that leverages a synthetically generated dataset based on supervised foot anchor placement timings of rhythmic motions to ensure periodicity, resulting in reduced prediction error. Specifically, our model comprises a discrete cosine transform (DCT) to encode motion, refine high frequencies, smooth motion sequences, and prevent jittery motions. We introduce a feed-forward attention mechanism that learns from dual-window pairs of 3D keypoint pose histories to predict future motions. Quantitative and qualitative experiments on the Human3.6M dataset show improvements in the MPJPE evaluation metric compared with prior state-of-the-art. [en_US]
dc.description.sectionheaders: 3D Rendering and Virtual Reality (VR)
dc.description.seriesinformation: Computer Graphics and Visual Computing (CGVC)
dc.identifier.doi: 10.2312/cgvc.20241220
dc.identifier.isbn: 978-3-03868-249-3
dc.identifier.pages: 7 pages
dc.identifier.uri: https://doi.org/10.2312/cgvc.20241220
dc.identifier.uri: https://diglib.eg.org/handle/10.2312/cgvc20241220
dc.publisher: The Eurographics Association [en_US]
dc.rights: Attribution 4.0 International License
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/
dc.subject: CCS Concepts: Computing methodologies → Machine Learning; Motion Processing; Virtual Reality
dc.subject: Computing methodologies → Machine Learning
dc.subject: Motion Processing
dc.subject: Virtual Reality
dc.title: DeFT-Net: Dual-Window Extended Frequency Transformer for Rhythmic Motion Prediction [en_US]
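
The abstract mentions two general techniques: encoding motion with a discrete cosine transform (DCT) to suppress jitter, and scoring predictions with MPJPE (mean per-joint position error). The following is a minimal sketch of both ideas, not the authors' implementation. The function names, the 17-joint toy skeleton, the noise level, and the number of retained coefficients are all assumptions made for illustration; the record does not specify how DeFT-Net applies the DCT.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis as an (n, n) matrix; row k is frequency k."""
    k = np.arange(n)[:, None]
    t = np.arange(n)[None, :]
    basis = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * t + 1) * k / (2 * n))
    basis[0] /= np.sqrt(2.0)  # rescale the constant row so the basis is orthonormal
    return basis

def smooth_motion(seq, keep):
    """Low-pass a (frames, joints, 3) sequence by keeping the first `keep` DCT coefficients."""
    frames = seq.shape[0]
    basis = dct_matrix(frames)
    flat = seq.reshape(frames, -1)      # one column per joint coordinate
    coeffs = basis @ flat               # forward DCT along the time axis
    coeffs[keep:] = 0.0                 # discard high-frequency content
    return (basis.T @ coeffs).reshape(seq.shape)  # inverse DCT

def mpjpe(pred, target):
    """Mean Euclidean distance between predicted and target joints over all frames."""
    return float(np.linalg.norm(pred - target, axis=-1).mean())

# Toy rhythmic trajectory (hypothetical data): 50 frames, 17 identical joints.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 2.0 * np.pi, 50)
joint = np.stack([np.cos(t), np.cos(2 * t), 0.5 * np.cos(3 * t)], axis=-1)
clean = joint[:, None, :].repeat(17, axis=1)
noisy = clean + 0.05 * rng.standard_normal(clean.shape)
smoothed = smooth_motion(noisy, keep=10)

print("MPJPE noisy:   ", mpjpe(noisy, clean))
print("MPJPE smoothed:", mpjpe(smoothed, clean))
```

Because the basis is orthonormal, the inverse transform is just the transpose, and keeping every coefficient reproduces the input. Truncating to low frequencies trades a small bias for reduced jitter, which is the general effect the abstract attributes to the DCT stage.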
Files
Original bundle
Name: cgvc20241220.pdf
Size: 2.21 MB
Format: Adobe Portable Document Format
Name: paper1037.mp4
Size: 44.78 MB
Format: Video MP4