Title: Perspective Crop Based Egocentric Hand Pose Estimation via Fisheye Stereo Vision
Authors: Hur, Hyejin; Baek, Seongmin; Gil, Younhee; Kim, Sangpil
Editors: Günther, Tobias; Montazeri, Zahra
Date: 2025-05-09
Year: 2025
ISBN: 978-3-03868-269-1
ISSN: 1017-4656
DOI: https://doi.org/10.2312/egp.20251016
URI: https://diglib.eg.org/handle/10.2312/egp20251016
Pages: 2 pages
License: Attribution 4.0 International License
CCS Concepts: Computing methodologies → Computer vision; Vision for robotics

Abstract: In this paper, we propose a method to improve the performance of hand pose estimation from an egocentric view. To accurately capture hands moving over a wide range during daily activities, we mounted a fisheye stereo camera on a head-mounted display to obtain wide-angle egocentric images. Our proposed two-stage method addresses the camera distortion introduced by this setup. The 2D hand keypoints estimated by the stage-1 HandNet are converted into 3D hand keypoints through triangulation for perspective cropping. The stage-2 HandNet then predicts the final 2D hand keypoints from the undistorted hand crop image. To train the stage-1 HandNet for perspective cropping, we built the FisheyeEgoHAND dataset, which consists of three categories of scenarios (separate hand, hand-hand, and hand-object) that reflect various hand interactions in an egocentric view. Through experiments, we demonstrate that our two-stage 2D hand pose estimation outperforms a one-stage approach without perspective cropping.
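
The triangulation-then-perspective-crop step described in the abstract can be sketched as follows. This is a minimal illustration under assumed conditions (undistorted stereo keypoints, toy pinhole projection matrices, a look-at construction for the virtual crop camera); it is not the paper's implementation, and every function name and parameter value here is hypothetical.

import numpy as np

def triangulate_point(P_left, P_right, x_left, x_right):
    """Linear (DLT) triangulation of one keypoint from a stereo pair.
    P_left, P_right: 3x4 projection matrices; x_left, x_right: 2D pixel
    coordinates, assumed already undistorted from the fisheye model."""
    A = np.stack([
        x_left[0]  * P_left[2]  - P_left[0],
        x_left[1]  * P_left[2]  - P_left[1],
        x_right[0] * P_right[2] - P_right[0],
        x_right[1] * P_right[2] - P_right[1],
    ])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]  # 3D point in the reference camera frame

def perspective_crop_rotation(center_3d):
    """Rotation that aims a virtual pinhole camera at the 3D hand center,
    so the hand crop can be rendered free of fisheye distortion."""
    z = center_3d / np.linalg.norm(center_3d)        # new optical axis
    x = np.cross([0.0, 1.0, 0.0], z)
    x /= np.linalg.norm(x)
    y = np.cross(z, x)
    return np.stack([x, y, z])                        # rows are the crop-camera axes

# Toy usage with assumed calibration: identity-pose left camera and a
# right camera offset by a 6 cm baseline.
K = np.array([[400.0, 0.0, 320.0], [0.0, 400.0, 240.0], [0.0, 0.0, 1.0]])
P_l = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P_r = K @ np.hstack([np.eye(3), np.array([[-0.06], [0.0], [0.0]])])
hand_center = triangulate_point(P_l, P_r, np.array([350.0, 250.0]), np.array([330.0, 250.0]))
R_crop = perspective_crop_rotation(hand_center)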