Title: Emotion-based Interaction Technique Using User's Voice and Facial Expressions in Virtual and Augmented Reality
Authors: Ko, Beom-Seok; Kang, Ho-San; Lee, Kyuhong; Braunschweiler, Manuel; Zünd, Fabio; Sumner, Robert W.; Choi, Soo-Mi
Editors: Chaine, Raphaëlle; Deng, Zhigang; Kim, Min H.
Date issued: 2023 (available 2023-10-09)
ISBN: 978-3-03868-234-9
DOI: https://doi.org/10.2312/pg.20231286
URI: https://diglib.eg.org:443/handle/10.2312/pg20231286
Pages: 121-122 (2 pages)
License: Attribution 4.0 International License
CCS Concepts: Human-centered computing -> Human computer interaction (HCI); Hardware -> VIVE Pro Eye; Facial Tracker
Keywords: Human centered computing; Human computer interaction (HCI); Hardware; VIVE Pro Eye; Facial Tracker

Abstract: This paper presents a novel interaction approach based on a user's emotions within augmented reality (AR) and virtual reality (VR) environments to achieve immersive interaction with virtual intelligent characters. To identify the user's emotions through voice, the Google Speech-to-Text API is used to transcribe speech, and the RoBERTa language model is then utilized to classify emotions. In the AR environment, the intelligent character can change the styles and properties of objects based on the user's recognized emotions during a dialog. In the VR environment, on the other hand, the movement of the user's eyes and lower face is tracked by the VIVE Pro Eye and Facial Tracker, and EmotionNet is used for emotion recognition; the virtual environment can then be changed based on the user's recognized emotions. Our findings present an interesting idea for integrating emotionally intelligent characters in AR/VR using generative AI and facial expression recognition.
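The voice pipeline the abstract describes (speech transcription, then RoBERTa-based emotion classification, then emotion-driven changes to scene objects) might be sketched as below. This is a minimal illustration, not the authors' implementation: the emotion label set, the style mapping, and all function names are assumptions, the Google Speech-to-Text call is stubbed because it requires credentials, and a keyword heuristic stands in for a fine-tuned RoBERTa classifier (a real version would load one via, e.g., Hugging Face Transformers).

```python
# Sketch of the voice-driven emotion pipeline from the paper's abstract.
# Assumptions (not from the paper): the emotion labels, the style mapping,
# and stubs in place of Google Speech-to-Text and the RoBERTa classifier.

def transcribe(audio_bytes: bytes) -> str:
    # Placeholder for a Google Speech-to-Text API call (needs credentials).
    return "I love how this room looks today"

def classify_emotion(text: str) -> str:
    # Stand-in for a fine-tuned RoBERTa emotion classifier; here a trivial
    # keyword heuristic so the sketch runs offline.
    lowered = text.lower()
    if any(w in lowered for w in ("love", "great", "happy")):
        return "joy"
    if any(w in lowered for w in ("hate", "awful", "angry")):
        return "anger"
    return "neutral"

# Hypothetical mapping from a recognized emotion to AR object properties
# that the intelligent character could apply during a dialog.
EMOTION_TO_STYLE = {
    "joy": {"color": "warm_yellow", "animation": "bounce"},
    "anger": {"color": "deep_red", "animation": "shake"},
    "neutral": {"color": "soft_gray", "animation": "idle"},
}

def react_to_user(audio_bytes: bytes) -> dict:
    """Full loop: speech -> text -> emotion -> object-style change."""
    text = transcribe(audio_bytes)
    emotion = classify_emotion(text)
    return {"emotion": emotion, **EMOTION_TO_STYLE[emotion]}
```

With the stubbed transcriber, `react_to_user(b"")` yields a joy-styled result; swapping in real transcription and a real classifier leaves the surrounding loop unchanged.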