Title: Characterizing User Behavior for Speech and Sketch-based Video Retrieval Interfaces
Authors: Altıok, Ozan Can; Sezgin, Tevfik Metin
Editors: Holger Winnemoeller and Lyn Bartram
Date: 2017-10-18
ISBN: 978-1-4503-5081-5
DOI: https://doi.org/10.1145/3122791.3122801
URL: https://diglib.eg.org:443/handle/10.2312/npar2017a15

Abstract: From a user interaction perspective, speech and sketching make a good couple for describing motion. Speech allows easy specification of content, events, and relationships, while sketching brings in spatial expressiveness. Yet, we have insufficient knowledge of how sketching and speech can be used for motion-based video retrieval, because there are no existing retrieval systems that support such interaction. In this paper, we describe a Wizard-of-Oz protocol and a set of tools that we have developed to engage users in a sketch- and speech-based video retrieval task. We report how the tools and the protocol fit together using "retrieval of soccer videos" as a use case scenario. Our software is highly customizable, and our protocol is easy to follow. We believe that together they will serve as a convenient and powerful duo for studying a wide range of multimodal use cases.

Keywords: Human-centered computing; Systems and tools for interaction design; Empirical studies in HCI; sketch-based interfaces; human-centered design; motion; multimedia retrieval