Learning Transformation-Isomorphic Latent Space for Accurate Hand Pose Estimation

dc.contributor.author: Ren, Kaiwen
dc.contributor.author: Hu, Lei
dc.contributor.author: Zhang, Zhiheng
dc.contributor.author: Ye, Yongjing
dc.contributor.author: Xia, Shihong
dc.contributor.editor: Christie, Marc
dc.contributor.editor: Han, Ping-Hsuan
dc.contributor.editor: Lin, Shih-Syun
dc.contributor.editor: Pietroni, Nico
dc.contributor.editor: Schneider, Teseo
dc.contributor.editor: Tsai, Hsin-Ruey
dc.contributor.editor: Wang, Yu-Shuen
dc.contributor.editor: Zhang, Eugene
dc.date.accessioned: 2025-10-07T06:03:16Z
dc.date.available: 2025-10-07T06:03:16Z
dc.date.issued: 2025
dc.description.abstract: Vision-based regression tasks, such as hand pose estimation, have achieved higher accuracy and faster convergence through representation learning. However, existing representation learning methods often encounter the following issues: the high semantic level of features extracted from images is inadequate for regressing low-level information, and the extracted features include task-irrelevant information, reducing their compactness and interfering with regression tasks. To address these challenges, we propose TI-Net, a highly versatile visual network backbone designed to construct a Transformation-Isomorphic latent space. Specifically, we employ linear transformations to model geometric transformations in the latent space and ensure that TI-Net aligns them with those in the image space. This ensures that the latent features capture compact, low-level information beneficial for pose estimation tasks. We evaluated TI-Net on the hand pose estimation task to demonstrate the network's superiority. On the DexYCB dataset, TI-Net achieved a 10% improvement in the PA-MPJPE metric compared to specialized state-of-the-art (SOTA) hand pose estimation methods. Our code is available at https://github.com/Mine268/TI-Net.
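The abstract's central idea, that a latent space is "transformation-isomorphic" when a geometric transformation of the image corresponds to a linear map of its latent code, can be sketched as a training objective. The sketch below is a hypothetical toy illustration, not the paper's actual architecture: the encoder, dimensions, transformation (a flip of a flattened image), and the linear map `M` are all stand-ins for whatever TI-Net learns in practice.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy encoder standing in for TI-Net's backbone:
# maps a flattened "image" vector to a latent vector.
d_img, d_lat = 16, 8
W1 = rng.normal(size=(d_img, 32)) * 0.1
W2 = rng.normal(size=(32, d_lat)) * 0.1

def encoder(x):
    return np.tanh(x @ W1) @ W2

def isomorphism_loss(x, transform_img, M):
    """Penalize mismatch between (a) encoding the transformed image
    and (b) applying the linear latent-space map M to the original
    encoding, i.e. encourage encoder(T(x)) ~= M @ encoder(x)."""
    z_transformed = encoder(transform_img(x))  # encode T(x)
    z_mapped = encoder(x) @ M                  # linear map in latent space
    return np.mean((z_transformed - z_mapped) ** 2)

# Toy geometric transformation: reverse the flattened image.
flip = lambda x: x[::-1]
M = rng.normal(size=(d_lat, d_lat)) * 0.1  # learnable in practice

x = rng.normal(size=d_img)
loss = isomorphism_loss(x, flip, M)
print(loss)
```

In a real training loop this loss would be minimized jointly over the encoder weights and the per-transformation linear maps, alongside the downstream pose-regression loss; the alignment term is what pushes the latent features to retain the low-level geometric information the abstract argues is lost by purely semantic features.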
dc.description.sectionheaders: Detecting & Estimating from Images
dc.description.seriesinformation: Pacific Graphics Conference Papers, Posters, and Demos
dc.identifier.doi: 10.2312/pg.20251270
dc.identifier.isbn: 978-3-03868-295-0
dc.identifier.pages: 10 pages
dc.identifier.uri: https://doi.org/10.2312/pg.20251270
dc.identifier.uri: https://diglib.eg.org/handle/10.2312/pg20251270
dc.publisher: The Eurographics Association
dc.rights: Attribution 4.0 International License
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/
dc.subject: CCS Concepts: Computing methodologies → Motion capture; Image representations; Tracking
dc.title: Learning Transformation-Isomorphic Latent Space for Accurate Hand Pose Estimation
Files (original bundle, 2 of 2):
- pg20251270.pdf (2.89 MB, Adobe Portable Document Format)
- paper1003_mm1.pdf (105.44 KB, Adobe Portable Document Format)