MappingBooks Project
Lectures 5-6
The project:
MappingBooks: Let me jump in the book!
Linking Book Characters
Building A Corpus Encoding
Relations Between Entities
Dan Cristea1,2, Eugen Ignat1
1 Alexandru Ioan Cuza University of Iași, Faculty of Computer Science
2 Romanian Academy, the Iași branch, Institute for Computer Science
{dcristea, eugen.ignat}@info.uaic.ro
SpeD, Cluj-Napoca, 15-17 October 2013
http://www.sped2013.ro/
I like to read books and to travel
Going out of the book
I need help to remember all kinship relations between characters
Characters in The Forsyte Saga
The old Forsytes
Ann, the eldest of the family
Old Jolyon, the patriarch of the family, having made a fortune in tea
James, a solicitor, married to Emily, a most tranquil woman
Swithin, James's twin brother with aristocratic pretensions; a bachelor
Roger, "the original Forsyte"
Julia (Juley), a fluttery dowager; Mrs. Septimus Small
Hester, an old maid
Nicholas, the wealthiest in the family
Timothy, the most cautious man in England
Susan, the married sister
The young Forsytes
Young Jolyon, Old Jolyon's artistic and free-thinking son, married three times
Soames, James and Emily's son, an intense, unimaginative and possessive solicitor, married to the unhappy Irene, who later marries Young Jolyon
Winifred, Soames's sister, one of the three daughters of James and Emily, married to the foppish and lethargic Montague Dartie
George, Roger's son, a dyed-in-the-wool mocker
Francie, George's sister and Roger's daughter, emancipated from God
Their children
June, Young Jolyon's defiant daughter from his first marriage; engaged to an architect, Philip Bosinney, who becomes Irene's lover
Jolly, Young Jolyon's son from his second marriage; dies of enteric fever during the Boer Wars
Holly, Young Jolyon's daughter from his second marriage, to June's governess
Jon, Young Jolyon's son from his third marriage, to Irene, Soames's first wife
Fleur, Soames's daughter from his second marriage, to a French Soho shopgirl Annette; Jon's lover; later marries a baronet, Michael Mont
Val, Winifred and Montague's son; fights in the Boer Wars; marries his cousin Holly
Imogen, Winifred and Montague's daughter
Others
Parfitt, Old Jolyon's butler
Smither, Aunts Ann, Juley and Hester's housekeeper
Warmson, James and Emily's butler
Bilson, Soames's housemaid
Prosper Profond, Winifred's admirer and Annette's lover
This presentation
The entity linking problem
The MappingBooks project proposal
Design conventions of a corpus
Preliminary statistics and what next
The art
Classes studied in classification schemes: Container-Contained, Time-Event, Product-Producer; classes like Tool-Object are more vaguely defined
SemEval-07 Task 4: 7 important nominal relations
Patterns: use context information for classification and extraction of relationships
Supervised machine learning: match entity mentions onto their corresponding KB records
Examples
Syntactic structures of the kind N1 N2
dog food => food consumed by dogs
summer morning => a morning that happens in the summer
Patterns of the type:
[Prefix] CW1 [Infix] CW2 [Postfix]
with [Prefix], [Infix], [Postfix] frequent words and CW1, CW2 content words
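A minimal sketch of how a phrase could be split into the [Prefix] CW1 [Infix] CW2 [Postfix] slots. The frequent-word list and the function name `split_pattern` are assumptions made for illustration; a real system would presumably derive the frequent words from corpus counts.

```python
# Hypothetical list of frequent (function) words; not taken from the slides.
FREQUENT_WORDS = {"the", "a", "an", "of", "in", "by", "for", "that", "and", "to"}

def split_pattern(tokens):
    """Split tokens into (prefix, cw1, infix, cw2, postfix).

    Returns None unless the phrase contains exactly two content words,
    where a content word is any token outside FREQUENT_WORDS.
    """
    content_idx = [i for i, t in enumerate(tokens) if t.lower() not in FREQUENT_WORDS]
    if len(content_idx) != 2:
        return None
    i, j = content_idx
    return (tokens[:i], tokens[i], tokens[i + 1:j], tokens[j], tokens[j + 1:])

print(split_pattern("a morning in the summer".split()))
```

Here "morning" and "summer" fill CW1 and CW2, while "a" and "in the" are the frequent-word prefix and infix.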
MappingBooks: a project proposal
A MappedBook is a book connected with locations/events in the virtual and real world and sensitive to the instantaneous location of a reader (as sensed by the mobile phone or tablet).
The information made available may differ depending on the moment and the place of the reader.
MappingBooks: a project proposal
multi-dimensional mash-ups combining textual, geographical and temporal data
adequate presentation to the reader
links sensitive to:
the context of the mentions in the book,
the moment the user initiates an access,
the current location of the user
make heavy use of entity linking techniques
spot the book mentions (persons and locations) in the real and virtual world
MappingBooks: an architecture
Our aims
1) connect entity mentions in the form of nominals (noun phrases) => one coreferential chain corresponds to each entity;
2) no preliminary records about linked entities => the knowledge base evolves from scratch;
3) look for 4 types of relations => referential, kinship, affective, social;
4) texts under investigation: fiction => reference area: well delimited, BUT form of expression: totally unrestricted.
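The first aim (one coreferential chain per entity) can be sketched with a union-find structure that merges mentions connected by coref links. This is only an illustration: the function name and the use of plain strings as mention identifiers are assumptions, and a real annotation would use unique mention IDs, since the same string can occur as several mentions.

```python
def build_chains(mentions, coref_links):
    """Group mentions into coreferential chains.

    mentions: list of mention identifiers, in text order.
    coref_links: list of (anaphor, antecedent) pairs.
    """
    parent = {m: m for m in mentions}

    def find(m):
        # Walk up to the chain representative, shortening the path on the way.
        while parent[m] != m:
            parent[m] = parent[parent[m]]
            m = parent[m]
        return m

    for anaphor, antecedent in coref_links:
        parent[find(anaphor)] = find(antecedent)

    chains = {}
    for m in mentions:
        chains.setdefault(find(m), []).append(m)
    return list(chains.values())

chains = build_chains(["John", "Maria", "He", "her"], [("He", "John"), ("her", "Maria")])
print(chains)
```

Each resulting chain stands for one entity, so the knowledge base can grow from scratch as new chains appear.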
Entities
Types: PERSON, LOCATION
Individuals (Marcus Vinicius, the emperor), groups (his mother and father, the soldiers) and classes (emperor)
Syntactic realisation: NPs (determiners, adjectives, complement PPs included; but NO relative clauses)
Characterised by distinctive heads of person type
If intersected => imbricated ([[the young lady]'s mother])
Entities
Types PERSON, LOCATION
Only identifying descriptions, not characterising ones ([the man with a straw hat], but only: [that man], corrupted to the marrow of his bones)
included entities, in Romanian in the position of subject (dar îl iubea din tot sufletul => but (3rd pers. sing.)-loved him with the whole soul)
Relations
Anaphoric
Non-anaphoric:
kinship
affection
social
Anaphoric relations
coref
member-of, has-as-member (inverse)
isa, class-of (inverse)
part-of, has-as-part (inverse)
subgroup-of, has-as-subgroup (inverse)
has-name, name-of (inverse)
Kinship relations
parent-of, child-of (inverse)
grandparent-of, grandchild-of (inverse)
sibling (symmetrical)
aunt-uncle-of, nephew-of (inverse relations)
cousin-of (symmetrical)
spouse-of (symmetrical)
unknown
Affective relations
friend-of, enemy-of (inverse)
love, hate (inverse)
worship
Social relations
inferior-of, superior-of (inverse)
colleague-with
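The inverse and symmetrical relations listed on the preceding slides could be stored in a small lookup table, so that an annotated triple can be read in either direction. The sketch below is an assumption about one possible encoding, not the project's actual data model; it covers only the pairs the slides mark as inverse or symmetrical in the strict sense, and `inverse_triple` is a hypothetical helper.

```python
# Relation names as listed on the slides; the table layout is illustrative.
INVERSE_PAIRS = [
    ("member-of", "has-as-member"),
    ("isa", "class-of"),
    ("part-of", "has-as-part"),
    ("subgroup-of", "has-as-subgroup"),
    ("has-name", "name-of"),
    ("parent-of", "child-of"),
    ("grandparent-of", "grandchild-of"),
    ("aunt-uncle-of", "nephew-of"),
    ("inferior-of", "superior-of"),
]
SYMMETRIC = {"sibling", "cousin-of", "spouse-of"}

INVERSE = {}
for a, b in INVERSE_PAIRS:
    INVERSE[a] = b
    INVERSE[b] = a
for r in SYMMETRIC:
    INVERSE[r] = r  # a symmetrical relation is its own inverse

def inverse_triple(source, relation, destination):
    """Return the same fact read from destination to source."""
    return (destination, INVERSE[relation], source)

print(inverse_triple("Soames", "parent-of", "Fleur"))
```

With such a table only one direction of each relation needs to be annotated by hand.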
Poles and directionality
In case of referential:
from anaphor towards antecedent
In case of non-referential:
from source towards destination
How to identify poles
Imbricated:
the anaphor/source is larger than the antecedent/destination
Non-imbricated:
referential: the anaphor is to the right of the antecedent
non-referential: from source to destination, as the trigger reads
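The rules for identifying poles can be sketched as a function over mention spans. Representing a mention as a `(start, end)` token-offset pair and the name `order_poles` are assumptions made for this illustration.

```python
def contains(outer, inner):
    """True if span `outer` strictly includes span `inner`."""
    return outer != inner and outer[0] <= inner[0] and inner[1] <= outer[1]

def order_poles(span_a, span_b, referential):
    """Return (anaphor_or_source, antecedent_or_destination), or None.

    Spans are (start, end) token offsets, end exclusive.
    None means position alone cannot decide and the trigger must be read.
    """
    # Imbricated: the larger span is the anaphor/source.
    if contains(span_a, span_b):
        return span_a, span_b
    if contains(span_b, span_a):
        return span_b, span_a
    # Non-imbricated, referential: the anaphor is the mention further right.
    if referential:
        return (span_a, span_b) if span_a[0] > span_b[0] else (span_b, span_a)
    # Non-imbricated, non-referential: direction follows the trigger.
    return None

# [[their] father]: the whole NP (0, 2) includes [their] (0, 1).
print(order_poles((0, 1), (0, 2), referential=False))
```

For the non-imbricated, non-referential case the function deliberately returns nothing, since the slides leave the direction to the reading of the trigger.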
Poles and triggers
Referentiality: coref
John met Maria on the ski slope. He raced her.
(figure labels: anaphor, antecedent)
Poles and trigger
Kinship: parent-of
their father
(figure labels: trigger, destination, source)
Poles and trigger
Social: inferior-of
Cesar's principal courtiers
(figure labels: trigger, destination, source)
Poles and trigger
Affective: worship
Lygia dropped on her knees to implore someone else.
(figure labels: trigger, destination, source)
Entities
Petronius
Vinicius was the son of his oldest sister, who years before had married his father, a man of consular dignity from the time of Tiberius.
Referential relations: coref
Referential relations: class-of
Kinship relations: sibling
Kinship relations: parent-of
Kinship relations: spouse-of
Social relations: inferior-of
The Quo Vadis RO-EntLink corpus
        words    ent     ref    aff   kin   soc
Now     25314    3663    2045   39    29    15
Whole   137510   19483   9523   301   127   103
annotators: 12 first-year master's students in CL, 3 experts
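To compare the two rows of the table, the counts can be normalised per 1000 words. The short sketch below only does this arithmetic on the figures shown; it makes no assumption about how the "Now" and "Whole" rows were obtained, and the function name is illustrative.

```python
COLUMNS = ["words", "ent", "ref", "aff", "kin", "soc"]
ROWS = {
    "Now": [25314, 3663, 2045, 39, 29, 15],
    "Whole": [137510, 19483, 9523, 301, 127, 103],
}

def per_thousand_words(row):
    """Convert raw counts into counts per 1000 words."""
    counts = dict(zip(COLUMNS, row))
    words = counts.pop("words")
    return {name: 1000 * value / words for name, value in counts.items()}

for label, row in ROWS.items():
    rates = per_thousand_words(row)
    print(label, {name: round(rate, 1) for name, rate in rates.items()})
```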
We do not mark
Negated relations
1:[Lygia] could not become 2:[the concubine of any man]
Characterisations addressing subjects, expressed by predicative nouns
in 1:[her] there was 2:[something uncommon]
Interpreted relations, besides coreferential
Petronius felt that beneath a statue of 1:[that maiden] one might write 2:["Spring."] (name-of-interpret)
We do not mark
More than the minimum number of relations
Se repezi la 1:[Petru] și, luându-2:[i] 3:[mâinile], începu să 4:[i] 5:[le] sărute. (He rushed to Peter and, taking his hands, began to kiss them.)
[2] coref [1],
[3] part-of [1] (or [3] part-of [2]),
[4] coref [1] (or [4] coref [2]),
[5] coref [3]
[5] part-of [1] is superfluous; it follows by transitivity from [5] coref [3] and [3] part-of [1]
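The transitivity argument can be checked mechanically: replace every mention by the head of its coref chain and see whether the candidate relation is already annotated. This is a sketch under stated assumptions: `coref` maps each anaphor to a single antecedent with no cycles, and `representative` and `is_superfluous` are hypothetical helper names.

```python
def representative(mention, coref):
    """Follow coref links (anaphor -> antecedent) to the head of the chain."""
    while mention in coref:
        mention = coref[mention]
    return mention

def is_superfluous(candidate, coref, relations):
    """True if `candidate` already follows from coref plus annotated relations."""
    src, rel, dst = candidate
    key = (representative(src, coref), rel, representative(dst, coref))
    return any(
        (representative(s, coref), r, representative(d, coref)) == key
        for s, r, d in relations
    )

# Annotation of the example: [2] coref [1], [4] coref [1], [5] coref [3], [3] part-of [1].
coref = {2: 1, 4: 1, 5: 3}
relations = [(3, "part-of", 1)]
print(is_superfluous((5, "part-of", 1), coref, relations))
```

The same check also explains why [3] part-of [1] and [3] part-of [2] are interchangeable: both reduce to the same pair of chain heads.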
Our thanks go to
the class of first-year master's students in Computational Linguistics, from the Faculty of Computer Science, Alexandru Ioan Cuza University of Iași, second term 2012-2013
And my thanks to
YOU!
And now
Let's build the Mapped Books!
Features we want to have
Understand what a text says about
Know who is who
Recognise real world entities
Know where I am
What real world entities are in my proximity
Trace on Google Maps a path described in the book
Fetch, process and make use of geo-data
Mix images with generated info
Attractive user interfaces
Client-server
Features we want to have
Understand what a text says about
the capacity to see a text as something other than a string of letters
TEXT ANALYTICS
Features we want to have
What real world entities are mentioned in the book
virtual world vs real world
link textual mentions of entities in the virtual world
decide on relevant info from virtual to be presented to
Master use multiple sources
ENTITY CRAWLING
Features we want to have
Know where I am
What real world entities are in my proximity
detection of my position
computation of distances from the mentioned places
signalling interesting locations in proximity
LOCALISATION
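The "computation of distances from the mentioned places" step can be sketched with the haversine great-circle formula. Everything specific here is an assumption for illustration: the helper names, the search radius, and the coordinates, which are rough values and not project data.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def nearby(reader, places, radius_km):
    """Names of places within radius_km of the reader's (lat, lon) position."""
    return sorted(
        name
        for name, (lat, lon) in places.items()
        if haversine_km(reader[0], reader[1], lat, lon) <= radius_km
    )

# Approximate, illustrative coordinates (latitude, longitude).
reader_position = (47.16, 27.59)  # roughly Iasi
mentioned_places = {"Cluj-Napoca": (46.77, 23.59), "Rome": (41.90, 12.50)}
print(nearby(reader_position, mentioned_places, radius_km=400))
```

Signalling interesting locations in proximity then reduces to choosing a radius and filtering the places mentioned in the book.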
Features we want to have
Trace on Google Maps a path described in the book
path detection in text
use Google Maps APIs
trace locations and paths on Google Maps
PATH DETECTION
MAPS&TRAJECTORIES
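Once a path has been detected in the text, it has to be handed to a mapping component. The sketch below does not call any Google Maps API; it only assumes that the detected stops are available as (name, latitude, longitude) tuples and encodes them as a GeoJSON LineString, a format many mapping tools can load. The function name and the coordinates are illustrative.

```python
import json

def path_to_geojson(stops):
    """Encode an ordered list of (name, lat, lon) stops as a GeoJSON Feature."""
    if len(stops) < 2:
        raise ValueError("a LineString needs at least two positions")
    return {
        "type": "Feature",
        "geometry": {
            "type": "LineString",
            # GeoJSON positions are [longitude, latitude], in that order.
            "coordinates": [[lon, lat] for _, lat, lon in stops],
        },
        "properties": {"stops": [name for name, _, _ in stops]},
    }

# Approximate, illustrative coordinates.
path = path_to_geojson([("Iasi", 47.16, 27.59), ("Cluj-Napoca", 46.77, 23.59)])
print(json.dumps(path, indent=2))
```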
Features we want to have
Mix images with generated info
process images => segment, contours, recognition
sense orientation of the camera
decide info to be displayed
use info provided by the camera focus
AUGMENTED REALITY
Features we want to have
Client-server
the users' portrait
the database
the architecture
standards and communication protocols
CLIENT-SERVER
Modules
1. TEXT ANALYTICS
2. NAMED ENTITY RECOGNITION
3. ENTITY CRAWLING
4. LOCALISATION
5. PATH DETECTION
6. MAPS&TRAJECTORIES
7. GEOGRAPHY
8. AUGMENTED REALITY
9. INTERFACES
10. CLIENT-SERVER
11. RESOURCES
12. MANAGEMENT AND EVALUATION
Organisational details
2 projects!
Two groups: led by Ionuț Pistol and Mădălina Răschip
Each project includes a number of modules (12?)
Each module is built by a team
Each team has approx. 10-15 members
Only one Management & Organisation team
Immediate deadline
Teams organised next week!