WEBVTT

6b39d5bb-6843-4d88-943e-b0a58874b947-0
00:00:00.560 --> 00:00:05.158
In this study, we focus on a
visual analysis of text data

6b39d5bb-6843-4d88-943e-b0a58874b947-1
00:00:05.158 --> 00:00:09.757
originally annotated for the
tasks of computational humor

6b39d5bb-6843-4d88-943e-b0a58874b947-2
00:00:09.757 --> 00:00:13.879
analysis, specifically the Humicroedit data set.

ff4fc5a7-5a13-4fb8-aede-21e5765234d8-0
00:00:15.080 --> 00:00:19.242
This data set was used as part
of the shared task at the

ff4fc5a7-5a13-4fb8-aede-21e5765234d8-1
00:00:19.242 --> 00:00:21.360
Semantic Evaluation workshop.

75d79b6b-3169-4c87-8e0e-d6211ff85cc7-0
00:00:22.600 --> 00:00:27.078
The data set consists of news
headlines that were edited using

75d79b6b-3169-4c87-8e0e-d6211ff85cc7-1
00:00:27.078 --> 00:00:31.702
substitutions with the intention
of making the edited headlines

75d79b6b-3169-4c87-8e0e-d6211ff85cc7-2
00:00:31.702 --> 00:00:32.280
funnier.

c7c87add-d17f-4a54-b3c7-94fde60b0b42-0
00:00:33.360 --> 00:00:37.677
The actual level of funniness of
the edited headlines was

c7c87add-d17f-4a54-b3c7-94fde60b0b42-1
00:00:37.677 --> 00:00:41.846
individually assessed by five
annotators on a scale from

c7c87add-d17f-4a54-b3c7-94fde60b0b42-2
00:00:41.846 --> 00:00:44.080
zero (not funny) to three (funny).

7c5e7f12-80ea-4e50-9a66-f770df1b124b-0
00:00:45.200 --> 00:00:49.465
The original shared task focused
on the development of natural

7c5e7f12-80ea-4e50-9a66-f770df1b124b-1
00:00:49.465 --> 00:00:53.460
language processing methods for
predicting or ranking such

7c5e7f12-80ea-4e50-9a66-f770df1b124b-2
00:00:53.460 --> 00:00:56.440
edited headlines with respect to
funniness.

fd42dcad-808a-4500-bdc1-aa0de7422df7-0
00:00:57.240 --> 00:01:01.983
However, our focus in this study
is not to develop such an NLP

fd42dcad-808a-4500-bdc1-aa0de7422df7-1
00:01:01.983 --> 00:01:06.952
model, but rather to investigate
the respective annotated data in

fd42dcad-808a-4500-bdc1-aa0de7422df7-2
00:01:06.952 --> 00:01:07.480
depth.

07b43bcb-a483-4459-9259-cfb890de99b3-0
00:01:08.640 --> 00:01:11.937
For that purpose, we have
designed a pipeline that

07b43bcb-a483-4459-9259-cfb890de99b3-1
00:01:11.937 --> 00:01:14.240
combines several stages and
methods.

52da6ba7-a369-4af2-87cc-eca2e1321e4b-0
00:01:15.120 --> 00:01:20.221
The original Humicroedit
training data is stored as a CSV

52da6ba7-a369-4af2-87cc-eca2e1321e4b-1
00:01:20.221 --> 00:01:20.640
file.

2da2abb6-6b25-4e72-be5e-e75f99e48ac6-0
00:01:21.240 --> 00:01:24.760
We preprocess it to form edited
headline strings.

82628d4e-f53e-4303-a4f6-dcdf4cfe0622-0
00:01:25.680 --> 00:01:30.391
Then we apply several NLP
analyses to identify sentiment

82628d4e-f53e-4303-a4f6-dcdf4cfe0622-1
00:01:30.391 --> 00:01:35.346
polarity, topics, keywords, and
named entities in the text data

82628d4e-f53e-4303-a4f6-dcdf4cfe0622-2
00:01:35.346 --> 00:01:38.839
and store the results in another
CSV file.

540153e0-8a74-4658-a355-990398df7289-0
00:01:39.880 --> 00:01:43.618
All these results can then be
loaded and explored in our

540153e0-8a74-4658-a355-990398df7289-1
00:01:43.618 --> 00:01:45.520
interactive visual interface.

3f2c5538-b7e3-415d-8de4-9ef56f11f9f4-0
00:01:47.320 --> 00:01:51.832
The user interface of our
prototype consists of several

3f2c5538-b7e3-415d-8de4-9ef56f11f9f4-1
00:01:51.832 --> 00:01:56.183
panels, with the left panel
including the file picker,

3f2c5538-b7e3-415d-8de4-9ef56f11f9f4-2
00:01:56.183 --> 00:02:00.937
controls for selecting the
computational results available

3f2c5538-b7e3-415d-8de4-9ef56f11f9f4-3
00:02:00.937 --> 00:02:05.530
in the provided data, such as
the VADER or RoBERTa-based

3f2c5538-b7e3-415d-8de4-9ef56f11f9f4-4
00:02:05.530 --> 00:02:09.640
sentiment analysis results, and
filtering controls.

fc930c60-5d00-4b48-86b9-9d51f17076eb-0
00:02:10.960 --> 00:02:15.277
The range sliders allow the
users to filter a data subset

fc930c60-5d00-4b48-86b9-9d51f17076eb-1
00:02:15.277 --> 00:02:19.155
with specific annotated
funniness scores or computed

fc930c60-5d00-4b48-86b9-9d51f17076eb-2
00:02:19.155 --> 00:02:22.960
sentiment polarity scores for
the edited headlines.

e1dad3ac-5837-4709-b44f-8973a92f7422-0
00:02:24.400 --> 00:02:29.084
The prototype also provides an
overview of the list of computed

e1dad3ac-5837-4709-b44f-8973a92f7422-1
00:02:29.084 --> 00:02:33.842
topics as well as the option to
filter the respective results in

e1dad3ac-5837-4709-b44f-8973a92f7422-2
00:02:33.842 --> 00:02:35.160
the central panel.

fe9c18a4-1444-4a77-8ebd-ae04d9c702eb-0
00:02:35.400 --> 00:02:39.596
The prototype provides a main
view, which is currently set to

fe9c18a4-1444-4a77-8ebd-ae04d9c702eb-1
00:02:39.596 --> 00:02:43.136
a treemap, but could be
switched to a dimensionality

fe9c18a4-1444-4a77-8ebd-ae04d9c702eb-2
00:02:43.136 --> 00:02:44.119
reduction plot.

d8bc59f9-5331-4fa2-ac5f-e54abb325d8e-0
00:02:50.240 --> 00:02:53.920
A detailed data table is located
below the main view.

c88ccd7b-3f22-4996-8b57-7d89d1b8ecf8-0
00:03:00.280 --> 00:03:04.997
The secondary view is available
in the right panel, followed by

c88ccd7b-3f22-4996-8b57-7d89d1b8ecf8-1
00:03:04.997 --> 00:03:09.346
several histogram plots focusing
on the average annotated

c88ccd7b-3f22-4996-8b57-7d89d1b8ecf8-2
00:03:09.346 --> 00:03:13.474
funniness score, sentiment, and
topic membership for the

c88ccd7b-3f22-4996-8b57-7d89d1b8ecf8-3
00:03:13.474 --> 00:03:17.822
currently visible filtered data
set, as well as a specific

c88ccd7b-3f22-4996-8b57-7d89d1b8ecf8-4
00:03:17.822 --> 00:03:18.560
selection.

e11dd661-b7ad-4df2-9850-871314f056df-0
00:03:20.360 --> 00:03:23.801
Most of the views support
details on demand with

e11dd661-b7ad-4df2-9850-871314f056df-1
00:03:23.801 --> 00:03:24.120
tooltips.

c0b6ea3a-9c06-412b-9017-3d7b90f97493-0
00:03:26.440 --> 00:03:30.766
For instance, the tooltips for
individual cells within the main

c0b6ea3a-9c06-412b-9017-3d7b90f97493-1
00:03:30.766 --> 00:03:34.760
treemap view provide details
for the respective headlines.

b07c3910-4f03-4c6e-92f0-c088f594d0c2-0
00:03:42.260 --> 00:03:46.946
After clicking on one of these
treemap cells, the data table is

b07c3910-4f03-4c6e-92f0-c088f594d0c2-1
00:03:46.946 --> 00:03:51.187
populated, including the
original headline, for which no

b07c3910-4f03-4c6e-92f0-c088f594d0c2-2
00:03:51.187 --> 00:03:55.650
funniness score is provided in
the data, as it only contains

b07c3910-4f03-4c6e-92f0-c088f594d0c2-3
00:03:55.650 --> 00:03:58.180
annotations for edited
headlines.

45eefb31-6676-4961-9085-9df919bab64f-0
00:03:59.940 --> 00:04:03.978
Notice that the treemap cells
corresponding to the related

45eefb31-6676-4961-9085-9df919bab64f-1
00:04:03.978 --> 00:04:07.949
edited headline variants are
highlighted with a different

45eefb31-6676-4961-9085-9df919bab64f-2
00:04:07.949 --> 00:04:08.360
color.

4a02e6bc-02d7-419d-9d1e-bd08292ba7d0-0
00:04:25.370 --> 00:04:28.955
Clicking to select an
intermediate node of the treemap

4a02e6bc-02d7-419d-9d1e-bd08292ba7d0-1
00:04:28.955 --> 00:04:32.410
will result in selecting the
respective data subset.

609ecff4-ef07-4557-987f-850c17532bf0-0
00:05:02.440 --> 00:05:06.937
By exploring the patterns of the
annotated funniness score

609ecff4-ef07-4557-987f-850c17532bf0-1
00:05:06.937 --> 00:05:11.130
distribution versus the
sentiment polarity detected in

609ecff4-ef07-4557-987f-850c17532bf0-2
00:05:11.130 --> 00:05:15.551
the respective original and
edited headlines, the topics,

609ecff4-ef07-4557-987f-850c17532bf0-3
00:05:15.551 --> 00:05:20.354
and also further analyses such
as named entities and keywords,

609ecff4-ef07-4557-987f-850c17532bf0-4
00:05:20.354 --> 00:05:24.851
the user gains insight into the
relationship between these

609ecff4-ef07-4557-987f-850c17532bf0-5
00:05:24.851 --> 00:05:29.196
facets of the text and the
notion of funniness that the

609ecff4-ef07-4557-987f-850c17532bf0-6
00:05:29.196 --> 00:05:31.560
respective annotators followed.

458f206f-9aac-45c1-9175-ea0755478d38-0
00:05:44.730 --> 00:05:47.170
Thank you for watching this
demonstration.