folksonomy, social tagging, tag clouds, automatic folksonomy construction, word clouds, wordle, context-preserving word cloud visualisation, CPWCV, seam carving, inflate and push, star forest, cycle cover, quantitative metrics, realized adjacencies, distortion, area utilization, compactness, aspect ratio, running time, semantics in language technology
Lecture: Semantic Word Clouds
1. Semantic Analysis in Language Technology
http://stp.lingfil.uu.se/~santinim/sais/2016/sais_2016.htm
Semantic Word Clouds
Marina Santini
santinim@stp.lingfil.uu.se
Department of Linguistics and Philology
Uppsala University, Uppsala, Sweden
Spring 2016
3. Semantic Web & Ontologies
• The
goal
of
the
Seman(c
Web
is
to
allow
web
informa(on
and
services
to
be
more
effec(vely
exploited
by
humans
and
automated
tools.
• Essen(ally,
the
focus
of
the
seman(c
web
is
to
share
data
instead
of
documents.
• This
data
must
be
”meaningful”
both
for
human
and
for
machines
(ie
automated
tools
and
web
applica(ons)
• Q:
How
are
we
going
to
represent
meaning
and
knowledge
on
the
web?
• A:
…
via
annota&on.
• Knowledge
is
represented
in
the
form
of
rich
conceptual
schemas/formalisms
called
ontologies.
• Therefore,
ontologies
are
the
backbone
of
the
Seman(c
Web.
• Ontologies
give
formally
defined
meanings
to
the
terms
used
in
annota&ons,
transforming
them
into
seman&c
annota&ons.
3
4. Ontologies are…
• … concepts that are hierarchically organized
Tree of Porphyry (III AD); Wordnet (XXI AD). (See Lect 5, e.g. similarity measures.)
5. Reasoning: RDF/OWL vs Databases (and other data structures)
OWL axioms behave like inference rules rather than database constraints.

Class: Phoenix
    SubClassOf: isPetOf only Wizard

Individual: Fawkes
    Types: Phoenix
    Facts: isPetOf Dumbledore

• Fawkes is said to be a Phoenix and to be the pet of Dumbledore, and it is also stated that only a Wizard can have a pet Phoenix.
• In OWL, this leads to the implication that Dumbledore is a Wizard. That is, if we were to query the ontology for instances of Wizard, then Dumbledore would be part of the answer.
• In a database setting the schema could include a similar statement about the Phoenix class, but in this case it would be interpreted as a constraint on the data: adding the fact that Fawkes isPetOf Dumbledore without Dumbledore already being known to be a Wizard would lead to an invalid database state, and such an update would therefore be rejected by a database management system as a constraint violation.
6. So, what is an ontology for us?
"An ontology is a FORMAL, EXPLICIT specification of a SHARED conceptualization"
Studer, Benjamins, Fensel. Knowledge Engineering: Principles and Methods. Data and Knowledge Engineering, 25 (1998), 161-197.

"An ontology is an explicit specification of a conceptualization"
Gruber, T. A Translation Approach to Portable Ontology Specifications. Knowledge Acquisition, Vol. 5, 1993, 199-220.

• Abstract model and simplified view of some phenomenon in the world that we want to represent
• Machine-readable
• Concepts, properties, relations, functions, constraints, axioms are explicitly defined
• Consensual knowledge
7. How to build an ontology
Generally (and roughly) speaking, when designing an ontology, four main components are used:
1. Classes
2. Relations
3. Axioms
4. Instances
8. Practical Activity: emotions
Your remarks:
• Emotions are ambiguous: e.g. happiness can also be ill-directed
• The polarity of some emotions cannot be assessed…
• etc.
Classes, Relations, Axioms, Instances, etc.
9. Occupational psychology (Wikipedia)
• Industrial and organizational psychology (also known as I–O psychology, occupational psychology, work psychology, WO psychology, IWO psychology and business psychology) is the scientific study of human behavior in the workplace and applies psychological theories and principles to organizations and individuals in their workplace.
• I–O psychologists are trained in the scientist–practitioner model. I–O psychologists contribute to an organization's success by improving the performance, motivation, job satisfaction, occupational safety and health, as well as the overall health and well-being of its employees. An I–O psychologist conducts research on employee behaviors and attitudes, and how these can be improved through hiring practices, training programs, feedback, and management systems.
10. In summary… Why build an ontology?
• To share a common understanding of the structure of information among people or machines
• To make domain assumptions explicit
• Often based on a controlled vocabulary
• To analyze domain knowledge
• To enable reuse of domain knowledge
11. Ontologies and Tags
• Ontologies and tagging systems are two different ways to organize the knowledge present on the Web.
• The first has a formal foundation that derives from description logic and artificial intelligence. Domain experts decide the terms.
• The other is simpler and integrates heterogeneous contents; it is based on the collaboration of users in the Web 2.0: user-generated annotation.
12. Folksonomies
• Tagging facilities within Web 2.0 applications have shown how it might be possible for user communities to collaboratively annotate web content, and create simple forms of ontology via the development of loosely-hierarchically organised sets of tags, often called folksonomies…
13. Folksonomy = Social Tagging
• Folksonomies (also known as social tagging) are user-defined metadata collections.
• Users do not deliberately create folksonomies and there is rarely a prescribed purpose, but a folksonomy evolves when many users create or store content at particular sites and identify what they think the content is about.
• "Tag clouds" pinpoint the frequency of certain tags.
14. • A common way to organize tags is in tag clouds…
15. Automatic folksonomy construction
• The collective knowledge expressed through user-generated tags has great potential.
• However, we need tools to efficiently aggregate data from large numbers of users with highly idiosyncratic vocabularies and invented words or expressions.
• Many approaches to automatic folksonomy construction combine tags using statistical methods...
• Ample space for improvement…
16. Ontology, taxonomy, folksonomy, etc.
• Many different definitions…
• A good summary and interpretation is here: http://www.ideaeng.com/taxonomies-ontologies-0602
17. Today…
• We will talk more generally about word clouds…
18. Further Reading
Semantic Similarity from Natural Language and Ontology Analysis, by Sébastien Harispe, Sylvie Ranwez, Stefan Janaqi, and Jacky Montmain. Synthesis Lectures on Human Language Technologies, May 2015, Vol. 8, No. 1.
• The two state-of-the-art approaches for estimating and quantifying semantic similarities/relatedness of semantic entities are presented in detail: the first one relies on corpora analysis and is based on Natural Language Processing techniques and semantic models, while the second is based on more or less formal, computer-readable and workable forms of knowledge such as semantic networks, thesauri or ontologies.
20. Acknowledgements
This presentation is based on the following paper:
• Barth et al. (2014). Experimental Comparison of Semantic Word Clouds. In Experimental Algorithms, Volume 8504 of the series Lecture Notes in Computer Science, pp. 247-258.
  – Link: https://www.cs.arizona.edu/~kobourov/wordle2.pdf
Some slides have been borrowed from Sergey Pupyrev.
21. Today
• Experiments on semantics-preserving word clouds, in which semantically related words are close to each other.
22. Outline
• What is a Word Cloud?
• 3 early algorithms
• 3 new algorithms
• Metrics & Quantitative Evaluation
23. Word Clouds
• Word clouds have become a standard tool for abstracting, visualizing and comparing texts…
• We could apply the same or similar techniques to the huge amounts of tags produced by users interacting in social networks.
24. Comparison & Conceptualization Tool
• Word Clouds as a tool for "conceptualizing" documents. Cf. ontologies.
• Ex: 2008, comparison of speeches: Obama vs McCain.
Cf. Lect 10: Extractive summarization & abstractive summarization.
25. Word Clouds and Tag Clouds…
• … are often used to represent importance among terms (e.g. band popularity) or serve as a navigation tool (e.g. Google search results).
26. The Problem…
• How to compute semantics-preserving word clouds in which semantically related words are close to each other?
27. Wordle (http://www.wordle.net)
• Practical tools, like Wordle, make word cloud visualization easy. They offer an appealing way to SUMMARIZE text…
• Shortcoming: they do not capture the relationships between words in any way, since word placement is independent of context.
28. Many word clouds are arranged randomly (look also at the scattered colours).
29. Patterns and Vicinity/Adjacency
Humans are spontaneously pattern-seekers: if they see two words close to each other in a word cloud, they spontaneously think they are related…
30. In Linguistics and NLP…
• This natural tendency to link spatial vicinity to semantic relatedness is exploited as evidence that words are semantically related or semantically similar…
Remember? "You shall know a word by the company it keeps" (Firth, J. R. 1957:11)
31. So, it makes sense to place such related words close to each other (look also at the color distribution).
32. Semantic word clouds have higher user satisfaction compared to other layouts…
33. All recent word cloud visualization tools aim to incorporate semantics in the layout…
34. … but none of them provides any guarantee about the quality of the layout in terms of semantics.
35. Early algorithms: Force-Directed Graph
• Most of the existing algorithms are based on force-directed graph layout.
• Force-directed graph drawing algorithms are a class of algorithms for drawing graphs in an aesthetically pleasing way:
  – Attractive forces between pairs reduce empty space
  – Repulsive forces ensure that words do not overlap
  – A final force preserves semantic relations between words.

Some of the most flexible algorithms for calculating layouts of simple undirected graphs belong to a class known as force-directed algorithms. Such algorithms calculate the layout of a graph using only information contained within the structure of the graph itself, rather than relying on domain-specific knowledge. Graphs drawn with these algorithms tend to be aesthetically pleasing, exhibit symmetries, and tend to produce crossing-free layouts for planar graphs.
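The force scheme described above can be sketched in a few lines. This is a toy illustration, not the layout engine used in the paper: words are treated as points, similarities are assumed to lie in [0, 1], and the constants `k_attr`/`k_rep` are arbitrary choices.

```python
import math

def force_directed_step(pos, sim, k_attr=0.01, k_rep=0.5):
    """One iteration of a toy force-directed layout.

    pos: dict word -> [x, y]; sim: dict (w1, w2) -> similarity in [0, 1].
    Attraction pulls semantically similar words together, while a
    repulsive term keeps every pair from collapsing onto one spot.
    """
    words = list(pos)
    force = {w: [0.0, 0.0] for w in words}
    for i, a in enumerate(words):
        for b in words[i + 1:]:
            dx = pos[b][0] - pos[a][0]
            dy = pos[b][1] - pos[a][1]
            d = math.hypot(dx, dy) or 1e-9
            s = sim.get((a, b), sim.get((b, a), 0.0))
            f = k_attr * s * d - k_rep / d  # attraction minus repulsion
            fx, fy = f * dx / d, f * dy / d
            force[a][0] += fx; force[a][1] += fy
            force[b][0] -= fx; force[b][1] -= fy
    for w in words:  # move every word along its accumulated force
        pos[w][0] += force[w][0]
        pos[w][1] += force[w][1]
    return pos
```

Iterating this step from an initial placement drives similar word pairs toward a finite equilibrium distance while dissimilar pairs drift apart.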
36. Newer Algorithms: rectangle representation of graphs
• Vertex-weighted and edge-weighted graph:
  – The vertices of the graph are the words
    • Their weight corresponds to some measure of importance (e.g. word frequencies)
  – The edges capture the semantic relatedness of pairs of words (e.g. co-occurrence)
    • Their weight corresponds to the strength of the relation
  – Each vertex can be drawn as a box (rectangle) with dimensions determined by its weight
  – A realized adjacency is the sum of the edge weights for all pairs of touching boxes.
  – The goal is to maximize the realized adjacencies.
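The realized-adjacency objective above is easy to compute for a given layout. A minimal sketch, assuming axis-aligned boxes given as `(x, y, w, h)` tuples (these data structures are my choice, not the paper's):

```python
def touching(b1, b2, eps=1e-9):
    """True if two axis-aligned boxes (x, y, w, h) share a boundary segment."""
    x1, y1, w1, h1 = b1
    x2, y2, w2, h2 = b2
    # vertical contact: one box's right side meets the other's left side
    share_x = abs(x1 + w1 - x2) < eps or abs(x2 + w2 - x1) < eps
    # horizontal contact: one box's top meets the other's bottom
    share_y = abs(y1 + h1 - y2) < eps or abs(y2 + h2 - y1) < eps
    overlap_y = min(y1 + h1, y2 + h2) - max(y1, y2) > eps
    overlap_x = min(x1 + w1, x2 + w2) - max(x1, x2) > eps
    return (share_x and overlap_y) or (share_y and overlap_x)

def realized_adjacencies(boxes, weights):
    """Sum of edge weights over all pairs of touching boxes.

    boxes: dict word -> (x, y, w, h); weights: dict (w1, w2) -> edge weight.
    """
    total = 0.0
    words = list(boxes)
    for i, a in enumerate(words):
        for b in words[i + 1:]:
            if touching(boxes[a], boxes[b]):
                total += weights.get((a, b), weights.get((b, a), 0.0))
    return total
```

A layout that maximizes this sum places strongly related words in contact with each other.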
37. Purpose of the experiments that are shown here:
• Semantics preservation in terms of closeness/vicinity/adjacency
38. Example
• A contact of two boxes is a common boundary.
• The contact of two boxes is interpreted as semantic relatedness.
• The contact of two boxes can be calculated, so the adjacency can be computed and evaluated.
41. Lect 6: Repetition

Co-occurrence counts with the context words large, data and computer:

              large   data   computer
  apricot       1       0       0
  digital       0       1       2
  information   1       6       1

Which pair of words is more similar?

  cos(v, w) = (v · w) / (|v| |w|) = Σᵢ vᵢwᵢ / (√(Σᵢ vᵢ²) · √(Σᵢ wᵢ²))

  cosine(apricot, information) = (1+0+0) / (√(1+0+0) · √(1+36+1)) = 1/√38 = .16
  cosine(digital, information) = (0+6+2) / (√(0+1+4) · √(1+36+1)) = 8/(√5 · √38) = .58
  cosine(apricot, digital)     = (0+0+0) / (√(1+0+0) · √(0+1+4)) = 0
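The worked example above can be checked directly in code; the vectors are the rows of the co-occurrence table:

```python
import math

def cosine(v, w):
    """Cosine similarity between two count vectors."""
    dot = sum(vi * wi for vi, wi in zip(v, w))
    norm = math.sqrt(sum(vi * vi for vi in v)) * math.sqrt(sum(wi * wi for wi in w))
    return dot / norm if norm else 0.0

# context-word counts over (large, data, computer), as in the table above
apricot     = [1, 0, 0]
digital     = [0, 1, 2]
information = [1, 6, 1]
```

Evaluating these reproduces the slide's numbers: digital and information are the most similar pair (.58), apricot and information are weakly similar (.16), and apricot and digital share no context at all (0).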
43. Input - Output
• The input for all algorithms is:
  – a collection of n rectangles, each with a fixed width and height proportional to the rank of the word
  – a similarity/dissimilarity matrix
• The output is a set of non-overlapping positions for the rectangles.
44. Early Algorithms
1. Wordle (Random)
2. Context-Preserving Word Cloud Visualization (CPWCV)
3. Seam Carving
45. Wordle → Random
• The Wordle algorithm places one word at a time in a greedy fashion, i.e. aiming to use space as efficiently as possible.
• First the words are sorted by weight/rank in decreasing order.
• Then, for each word in this order, a position is picked at random.
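The greedy placement loop above can be sketched as follows. This is only an approximation of Wordle's behaviour: the real tool moves a word along a spiral from its random start until a free spot is found, whereas this sketch simply resamples random positions, and the canvas size and retry count are invented parameters.

```python
import random

def wordle_layout(words, width=100.0, height=100.0, tries=1000, seed=0):
    """Greedy random placement in the spirit of Wordle.

    words: list of (word, w, h) already sorted by decreasing weight/rank.
    Each word is given a random position; if it overlaps an already
    placed rectangle, another random position is tried.
    """
    rng = random.Random(seed)
    placed = {}  # word -> (x, y, w, h)

    def overlaps(x, y, w, h):
        return any(x < px + pw and px < x + w and y < py + ph and py < y + h
                   for px, py, pw, ph in placed.values())

    for word, w, h in words:
        for _ in range(tries):
            x = rng.uniform(0, width - w)
            y = rng.uniform(0, height - h)
            if not overlaps(x, y, w, h):
                placed[word] = (x, y, w, h)
                break
    return placed
```

Because position is random, word placement is independent of context, which is exactly the shortcoming noted for Wordle on the earlier slide.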
52. Context-Preserving Word Cloud Visualization (CPWCV)
• First, a dissimilarity matrix is computed and Multidimensional Scaling (MDS) is performed.
• Second, an effort is made to create a compact layout.
Multidimensional Scaling (MDS) aims at detecting meaningful underlying dimensions in the data.
64. 3 New Algorithms
1. Inflate and Push
2. Star Forest
3. Cycle Cover
65. Inflate-and-Push
• A simple heuristic method for word layout, which aims to preserve semantic relations between pairs of words.
• Based on:
1. Heuristics: scaling down all word rectangles by some constant;
2. Computing MDS (multidimensional scaling) on the dissimilarity matrix;
3. Iteratively increasing the size of the rectangles by 5% (i.e. "inflating" the words);
4. When words overlap, applying a force-directed algorithm to "push" words away.
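The inflate/push loop (steps 3-4) can be sketched as below. Assumptions: the rectangles have already been scaled down and positioned (e.g. by MDS), and the paper's force-directed push is replaced here by a much simpler pairwise separation along the axis of least overlap.

```python
def inflate_and_push(boxes, steps=20, grow=1.05):
    """Toy inflate-and-push loop over boxes: dict word -> [x, y, w, h]."""
    for _ in range(steps):
        for b in boxes.values():          # "inflate": enlarge 5% around the centre
            cx, cy = b[0] + b[2] / 2, b[1] + b[3] / 2
            b[2] *= grow
            b[3] *= grow
            b[0], b[1] = cx - b[2] / 2, cy - b[3] / 2
        words = list(boxes)
        for i, a in enumerate(words):     # "push": separate overlapping pairs
            for c in words[i + 1:]:
                A, B = boxes[a], boxes[c]
                ox = min(A[0] + A[2], B[0] + B[2]) - max(A[0], B[0])
                oy = min(A[1] + A[3], B[1] + B[3]) - max(A[1], B[1])
                if ox > 0 and oy > 0:     # overlap: push along the easier axis
                    if ox < oy:
                        s = ox / 2 if A[0] < B[0] else -ox / 2
                        A[0] -= s
                        B[0] += s
                    else:
                        s = oy / 2 if A[1] < B[1] else -oy / 2
                        A[1] -= s
                        B[1] += s
    return boxes
```

Because words start close to their MDS positions and are only nudged apart when they collide, the relative arrangement (and hence the semantics) of the initial embedding is largely preserved.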
71. Star Forest
• A star is a tree in which one central vertex is connected to all the other vertices.
• A star forest is a forest whose connected components are all stars.
72. Repetition: trees and graphs
• A tree is a special form of graph, i.e. a minimally connected graph, having only one path between any two vertices.
• In a graph there can be more than one path, i.e. a graph can have uni-directional or bi-directional paths (edges) between nodes.
73. Three steps
1. Extracting the star forest: partition a graph into disjoint stars
2. Realising a star: build a word cloud for every star
3. Pack all the stars together
74. Star Forest: star = tree
1. Extract stars greedily from a dissimilarity matrix → disjoint stars = star forest
2. Compute the optimal stars, i.e. the best set of words to be adjacent
3. Apply an attractive force to get a compact layout
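Step 1 above, greedy star extraction, can be sketched as follows. This is a simplified stand-in for the paper's procedure: the centre-selection rule and the `threshold` cut-off for attaching leaves are my own assumptions.

```python
def extract_star_forest(words, sim, threshold=0.3):
    """Greedily partition words into disjoint stars.

    Repeatedly chooses as star centre the unassigned word with the
    highest total similarity to the remaining words, then attaches
    every unassigned word whose similarity to that centre exceeds
    `threshold`. Returns a list of (centre, [leaves]) stars.
    """
    s = lambda a, b: sim.get((a, b), sim.get((b, a), 0.0))
    unassigned = set(words)
    stars = []
    while unassigned:
        centre = max(unassigned,
                     key=lambda c: sum(s(c, u) for u in unassigned if u != c))
        unassigned.remove(centre)
        leaves = [u for u in unassigned if s(centre, u) > threshold]
        unassigned -= set(leaves)
        stars.append((centre, leaves))
    return stars
```

Each star can then be laid out on its own (leaves touching the centre box) and the stars packed together with an attractive force, per steps 2-3.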
75. Cycle Cover
• This algorithm is based on a similarity matrix.
• First, a similarity path is created.
• Then, the optimal level of compactness is computed.
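To give a feel for the first step, here is a much simpler stand-in for building a "similarity path": a greedy nearest-neighbour chain over the similarity matrix. The actual algorithm solves a cycle cover rather than chaining greedily, so treat this only as an illustration of the idea.

```python
def similarity_path(words, sim, start=None):
    """Greedy nearest-neighbour ordering of words by similarity.

    Starting from `start` (or the first word), repeatedly appends the
    unvisited word most similar to the current end of the path.
    """
    s = lambda a, b: sim.get((a, b), sim.get((b, a), 0.0))
    path = [start or words[0]]
    remaining = [w for w in words if w != path[0]]
    while remaining:
        nxt = max(remaining, key=lambda w: s(path[-1], w))
        path.append(nxt)
        remaining.remove(nxt)
    return path
```

Placing the words along such a path keeps consecutive, highly similar words adjacent; the compactness of the final drawing is then tuned separately.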
76. Quantitative Metrics
1. Realized Adjacencies – how close are similar words to each other?
2. Distortion – how distant are dissimilar words?
3. Uniform Area Utilization – uniformity of the distribution (overpopulated vs sparse areas in the word cloud)
4. Compactness – how well utilized is the drawing area?
5. Aspect Ratio – width and height of the bounding box
6. Running Time – execution time
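Two of these metrics, compactness and aspect ratio, depend only on the final rectangle positions and are straightforward to compute. A minimal sketch, assuming non-overlapping boxes given as `(x, y, w, h)` tuples (the exact normalizations in the paper may differ):

```python
def bounding_box(boxes):
    """Bounding box (x, y, w, h) of a set of placed rectangles."""
    xs = [b[0] for b in boxes]
    ys = [b[1] for b in boxes]
    xe = [b[0] + b[2] for b in boxes]
    ye = [b[1] + b[3] for b in boxes]
    return min(xs), min(ys), max(xe) - min(xs), max(ye) - min(ys)

def compactness(boxes):
    """Fraction of the bounding box covered by word rectangles
    (assumes the rectangles do not overlap)."""
    _, _, W, H = bounding_box(boxes)
    used = sum(b[2] * b[3] for b in boxes)
    return used / (W * H)

def aspect_ratio(boxes):
    """Width over height of the drawing's bounding box."""
    _, _, W, H = bounding_box(boxes)
    return W / H
```

Higher compactness means less wasted space in the drawing area; the aspect ratio is usually judged against a target such as the golden ratio or the screen's proportions.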
77. 2 datasets
(1) WIKI, a set of 112 plain-text articles extracted from the English Wikipedia, each consisting of at least 200 distinct words
(2) PAPERS, a set of 56 research papers published in conferences on experimental algorithms (SEA and ALENEX) in 2011-2012.