The MappingBooks Project


  • 8/13/2019 Proiect MappingBooks


    Lectures 5-6

    The project:

    MappingBooks: Let me jump in the book!


    Linking Book Characters

    Building a Corpus Encoding Relations Between Entities

    Dan Cristea1,2, Eugen Ignat1

    1 Alexandru Ioan Cuza University of Iași, Faculty of Computer Science

    2 Romanian Academy, the Iași branch, Institute for Computer Science

    {dcristea, eugen.ignat}@info.uaic.ro

    SpeD, Cluj-Napoca, 15-17 October 2013

    http://www.sped2013.ro/

    I like to read books and to travel


    Going out of the book


    I need help to remember all kinship relations between characters


    Characters in The Forsyte Saga

    The old Forsytes

    Ann, the eldest of the family

    Old Jolyon, the patriarch of the family, having made a fortune in tea

    James, a solicitor, married to Emily, a most tranquil woman

    Swithin, James's twin brother with aristocratic pretensions; a bachelor

    Roger, "the original Forsyte"

    Julia (Juley), a fluttery dowager; Mrs. Septimus Small

    Hester, an old maid

    Nicholas, the wealthiest in the family

    Timothy, the most cautious man in England

    Susan, the married sister

    The young Forsytes

    Young Jolyon, Old Jolyon's artistic and free-thinking son, married three times

    Soames, James and Emily's son, an intense, unimaginative and possessive solicitor, married to the unhappy Irene, who later marries Young Jolyon

    Winifred, Soames's sister, one of the three daughters of James and Emily, married to the foppish and lethargic Montague Dartie

    George, Roger's son, a dyed-in-the-wool mocker

    Francie, George's sister and Roger's daughter, emancipated from God

    Their children

    June, Young Jolyon's defiant daughter from his first marriage; engaged to an architect, Philip Bosinney, who becomes Irene's lover

    Jolly, Young Jolyon's son from his second marriage; dies of enteric fever during the Boer Wars

    Holly, Young Jolyon's daughter from his second marriage, to June's governess

    Jon, Young Jolyon's son from his third marriage, to Irene, Soames's first wife

    Fleur, Soames's daughter from his second marriage, to a French Soho shopgirl Annette; Jon's lover; later marries a baronet, Michael Mont

    Val, Winifred and Montague's son; fights in the Boer Wars; marries his cousin Holly

    Imogen, Winifred and Montague's daughter

    Others

    Parfitt, Old Jolyon's butler

    Smither, Aunts Ann, Juley and Hester's housekeeper

    Warmson, James and Emily's butler

    Bilson, Soames's housemaid

    Prosper Profond, Winifred's admirer and Annette's lover


    This presentation

    The entity linking problem

    The MappingBooks project proposal

    Design conventions of a corpus

    Preliminary statistics and what next


    The art

    Classes studied in classification schemes: Container-Contained, Time-Event, Product-Producer; classes like Tool-Object are more vaguely defined

    SemEval-07 Task 4: 7 important nominal relations

    Patterns: use context information for classification and extraction of relationships

    Supervised machine learning: match entity mentions onto their corresponding knowledge base (KB) records


    Examples

    Syntactic structures of the kind N1 N2

    dog food => food consumed by dogs

    summer morning => a morning that happens in the summer

    Patterns of the type:

    [Prefix] CW1 [Infix] CW2 [Postfix]

    with [Prefix], [Infix], [Postfix] frequent words and CW1, CW2 content words
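
    The [Prefix] CW1 [Infix] CW2 [Postfix] pattern can be illustrated with a minimal Python sketch. It assumes a tiny hand-written list of frequent words (FREQUENT) and a naive tokenizer, neither of which comes from the slides; the function names are hypothetical, and a real system would derive the frequent-word list from corpus counts.

```python
import re

# Hypothetical closed list of "frequent words" (an assumption for illustration).
FREQUENT = {"the", "a", "of", "in", "by", "for", "is", "was", "to", "and", "that"}


def tokenize(text):
    """Lowercase the text and keep only alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def extract_patterns(text):
    """Return (prefix, cw1, infix, cw2, postfix) tuples for neighbouring content words.

    CW1 and CW2 are two content words (tokens outside FREQUENT) with only
    frequent words between them. Prefix, infix and postfix are the (possibly
    empty) runs of frequent words before, between and after the pair.
    """
    tokens = tokenize(text)
    content = [i for i, tok in enumerate(tokens) if tok not in FREQUENT]
    patterns = []
    for k, (i, j) in enumerate(zip(content, content[1:])):
        # The prefix starts right after the previous content word, if any.
        prev_end = content[k - 1] + 1 if k > 0 else 0
        # The postfix stops right before the next content word, if any.
        next_start = content[k + 2] if k + 2 < len(content) else len(tokens)
        patterns.append((
            tuple(tokens[prev_end:i]),
            tokens[i],
            tuple(tokens[i + 1:j]),
            tokens[j],
            tuple(tokens[j + 1:next_start]),
        ))
    return patterns


for pattern in extract_patterns("a morning in the summer"):
    print(pattern)
```

    Grouping such tuples by their (prefix, infix, postfix) part and counting them is one possible way to find recurring contexts that signal a relation between CW1 and CW2.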


    MappingBooks: a project proposal

    A MappedBook is a book connected with locations/events in the virtual and real world and sensitive to the instantaneous location (as sensed by the mobile/tablet) of a reader.

    The information made available could differ depending on the moment and the place of the reader.


    MappingBooks: a project proposal

    multi-dimensional mash-ups combining textual, geographical and temporal data

    adequate presentation to the reader

    links sensitive to:

    the context of the mentions in the book,

    the moment the user initiates an access,

    the current location of the user

    make heavy use of entity linking techniques

    spot the book mentions (persons and locations) in the real and virtual world


    MappingBooks: an architecture


    Our aims

    1) connect entity mentions in the form of nominals (noun phrases) => one coreferential chain corresponds to each entity;

    2) no preliminary records about linked entities => the knowledge base evolves from scratch;

    3) look for 4 types of relations => referential, kinship, affective, social;

    4) texts under investigation: fiction => reference area: well delimited, BUT form of expression: totally unrestricted.


    Entities

    Types PERSON, LOCATION

    Individuals (Marcus Vinicius, the emperor), groups (his mother and father, the soldiers) and classes (emperor)

    Syntactic realisation: NPs (determiners, adjectives, complement PPs included; but NO relative clauses)

    Characterised by distinctive, person-type heads

    If intersected, then imbricated (nested): [[the young lady]'s mother]


    Entities

    Types PERSON, LOCATION

    Only identification, not also characterisation descriptions ([the man with a straw hat], but only [that man], corrupted to the marrow of his bones)

    Included entities (in Romanian, in the position of subject): dar îl iubea din tot sufletul => but (3rd pers. sing.)-loved him with the whole soul


    Relations

    Anaphoric

    Non-anaphoric:

    kinship

    affection

    social


    Anaphoric relations

    coref

    member-of, has-as-member (inverse)

    isa, class-of (inverse)

    part-of, has-as-part (inverse)

    subgroup-of, has-as-subgroup (inverse)

    has-name, name-of (inverse)


    Kinship relations

    parent-of, child-of (inverse)

    grandparent-of, grandchild-of (inverse)

    sibling (symmetrical)

    aunt-uncle-of, nephew-of (inverse relations)

    cousin-of (symmetrical)

    spouse-of (symmetrical)

    unknown


    Affective relations

    friend-of, enemy-of (inverse)

    love, hate (inverse)

    worship


    Social relations

    inferior-of, superior-of (inverse)

    colleague-with
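
    The inverse and symmetrical labels listed on the relation slides can be captured in a small lookup table. The sketch below is only one possible encoding: the names INVERSE, SYMMETRIC, inverse_of and flip are hypothetical, and it covers only the anaphoric, kinship and social pairs, leaving out labels whose opposite direction the slides do not spell out (coref, worship, colleague-with) as well as the affective pairs.

```python
# Pairs marked "(inverse)" on the anaphoric, kinship and social slides.
INVERSE = {
    "member-of": "has-as-member",
    "isa": "class-of",
    "part-of": "has-as-part",
    "subgroup-of": "has-as-subgroup",
    "has-name": "name-of",
    "parent-of": "child-of",
    "grandparent-of": "grandchild-of",
    "aunt-uncle-of": "nephew-of",
    "inferior-of": "superior-of",
}

# Labels marked "(symmetrical)" on the kinship slide.
SYMMETRIC = {"sibling", "cousin-of", "spouse-of"}


def inverse_of(label):
    """Return the label holding in the opposite direction, or None if unknown."""
    if label in SYMMETRIC:
        return label
    for forward, backward in INVERSE.items():
        if label == forward:
            return backward
        if label == backward:
            return forward
    return None


def flip(relation):
    """Swap source and destination of a (source, label, destination) triple."""
    source, label, destination = relation
    inverse = inverse_of(label)
    if inverse is None:
        raise ValueError(f"no inverse recorded for {label!r}")
    return (destination, inverse, source)


print(flip(("Old Jolyon", "parent-of", "Young Jolyon")))
```

    With such a table an annotator needs to mark a relation in one direction only; the opposite direction can be generated when the knowledge base is built.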


    Poles and directionality

    In case of referential:

    from anaphor towards antecedent

    In case of non-referential:

    from source towards destination


    How to identify poles

    Imbricated:

    the anaphor/source is larger than the antecedent/destination

    Non-imbricated:

    referential: the anaphor is to the right of the antecedent

    non-referential: from source to destination, as the trigger reads
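
    The rules for identifying poles can be sketched as a small function over token spans. This is an illustration under assumptions: the Span type, the half-open token offsets and the function names are hypothetical, and the non-imbricated, non-referential case returns None because the slides leave it to the reading of the trigger, which span positions alone cannot decide.

```python
from typing import NamedTuple, Optional, Tuple


class Span(NamedTuple):
    start: int  # index of the first token (inclusive)
    end: int    # index one past the last token (exclusive)


def contains(outer: Span, inner: Span) -> bool:
    """True if inner lies strictly inside outer (imbricated mentions)."""
    return outer != inner and outer.start <= inner.start and inner.end <= outer.end


def order_poles(a: Span, b: Span, referential: bool) -> Optional[Tuple[Span, Span]]:
    """Return (anaphor_or_source, antecedent_or_destination), or None if undecidable."""
    # Imbricated: the larger mention is the anaphor/source.
    if contains(a, b):
        return (a, b)
    if contains(b, a):
        return (b, a)
    # Non-imbricated, referential: the anaphor is the mention further to the right.
    if referential:
        return (a, b) if a.start > b.start else (b, a)
    # Non-imbricated, non-referential: depends on how the trigger reads.
    return None


# "their father": the whole NP (tokens 0-1) contains "their" (token 0).
whole_np, possessive = Span(0, 2), Span(0, 1)
print(order_poles(possessive, whole_np, referential=False))
```

    For "their father" the sketch returns the whole noun phrase as source and "their" as destination, which matches the parent-of example in the slides.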


    Poles and triggers

    Referentiality: coref

    John met Maria on the ski slope. He raced her.

    anaphor, antecedent


    Poles and trigger

    Kinship: parent-of

    their father

    trigger

    destination

    source


    Poles and trigger

    Social: inferior-of

    Caesar's principal courtiers

    trigger

    destination source


    Poles and trigger

    Affective: worship

    Lygia dropped on her knees to implore someone else.

    trigger

    destination source


    Entities

    Petroniu

    Vinicius was the son of his oldest sister, who years before had married his father, a man of consular dignity from the time of Tiberius.


    Referential relations: coref


    Referential relations: class-of


    Kinship relations: sibling


    Kinship relations: parent-of


    Kinship relations: spouse-of


    Social relations: inferior-of


    The Quo Vadis RO-EntLink corpus

              words  entities  referential  affective  kinship  social
    Now       25314      3663         2045         39       29      15
    Whole    137510     19483         9523        301      127     103

    annotators: 12 master students, first year in CL

    3 experts


    We do not mark

    Negated relations

    1:[Lygia] could not become 2:[the concubine of any man]

    Characterisations addressing subjects, expressed by predicative nouns

    in 1:[her] there was 2:[something uncommon]

    Interpreted relations, besides coreferential

    Petronius felt that beneath a statue of 1:[that maiden] one might write 2:["Spring."] (name-of-interpret)


    We do not mark

    More than the minimum number of relations

    Se repezi la 1:[Petru] și, luându-2:[i] 3:[mâinile], începu să 4:[i] 5:[le] sărute. (He rushed to Peter and, taking his hands, began to kiss them.)

    [2] coref [1],

    [3] part-of [1] (or [3] part-of [2]),

    [4] coref [1] (or [4] coref [2]),

    [5] coref [3]

    [5] part-of [1] is superfluous: it follows by transitivity from [5] coref [3] and [3] part-of [1]
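
    The "minimum number of relations" convention can be checked mechanically. The sketch below is a one-step illustration, not a full closure computation: it flags a part-of link as superfluous when it follows from a coref link and another part-of link, as in the example above. The triple format and the function name are assumptions made for this illustration.

```python
def superfluous_part_of(relations):
    """Return annotated part-of triples derivable from 'x coref y' plus 'y part-of z'."""
    annotated = set(relations)
    derived = set()
    for (x, label1, y) in annotated:
        if label1 != "coref":
            continue
        for (y2, label2, z) in annotated:
            if label2 == "part-of" and y2 == y:
                derived.add((x, "part-of", z))
    # Only report links that were actually annotated and are also derivable.
    return annotated & derived


# Mentions are referred to by their index in the example sentence.
annotated = {
    (2, "coref", 1),
    (3, "part-of", 1),
    (4, "coref", 1),
    (5, "coref", 3),
    (5, "part-of", 1),  # the link the guidelines call superfluous
}

redundant = superfluous_part_of(annotated)
minimal = annotated - redundant
print(sorted(redundant))
```

    Running such a check over the annotations is one way to detect links that go beyond the minimal set the guidelines ask for.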


    Our thanks go to

    the class of first year master students in Computational Linguistics, from the Faculty of Computer Science, Alexandru Ioan Cuza University of Iași, second term 2012-2013


    And my thanks to

    YOU!


    And now

    Let's build the Mapped Books!


    Features we want to have

    Understand what a text says about

    Know who is who

    Recognise real world entities

    Know where I am

    What real world entities are in my proximity

    Trace on Google Maps a path described in the book

    Fetch, process and make use of geo-data

    Mix images with generated info

    Attractive user interfaces

    Client-server


    Features we want to have

    Understand what a text says about

    the capacity to see a text as more than a string of letters

    TEXT ANALYTICS


    Features we want to have

    What real world entities are mentioned in the book

    virtual world vs real world

    link textual mentions of entities in the virtual world

    decide on relevant info from the virtual world to be presented to Master

    use multiple sources

    ENTITY CRAWLING


    Features we want to have

    Know where I am

    What real world entities are in my proximity

    detection of my position

    computation of distances from the mentioned places

    signalling interesting locations in proximity

    LOCALISATION
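
    The distance computation behind the LOCALISATION module could look like the following sketch. It assumes the haversine great-circle formula and a mean Earth radius of 6371 km; the slides do not say which formula or positioning service the project uses, and the function names and sample coordinates are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (latitude, longitude) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    # min() guards against rounding pushing the square root slightly above 1.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def places_nearby(reader, places, radius_km):
    """Return (name, distance_km) pairs within radius_km of the reader, nearest first."""
    lat, lon = reader
    hits = [(name, haversine_km(lat, lon, plat, plon))
            for name, (plat, plon) in places.items()]
    return sorted((hit for hit in hits if hit[1] <= radius_km), key=lambda hit: hit[1])


# Made-up coordinates, for illustration only.
mentioned_places = {"near": (0.0, 0.01), "far": (0.0, 10.0)}
for name, distance in places_nearby((0.0, 0.0), mentioned_places, radius_km=5.0):
    print(f"{name}: {distance:.2f} km")
```

    The reader's own position would come from the device; here it is passed in as a plain (latitude, longitude) pair.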


    Features we want to have

    Trace on Google Maps a path described in the book

    path detection in text

    use Google Maps APIs

    trace locations and paths on Google Maps

    PATH DETECTION

    MAPS&TRAJECTORIES
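
    Once a path has been detected in the text, one lightweight way to show it is to build a Google Maps directions link from the ordered list of places. The sketch below assumes the public Google Maps URLs scheme (api=1, origin, destination, waypoints, travelmode); the slides only say "use Google Maps APIs", so the choice of this scheme, the function name and the example stops are assumptions.

```python
from urllib.parse import urlencode


def directions_url(stops, travelmode="walking"):
    """Build a Google Maps directions URL visiting the stops in order."""
    if len(stops) < 2:
        raise ValueError("a path needs at least an origin and a destination")
    params = {
        "api": "1",
        "origin": stops[0],
        "destination": stops[-1],
        "travelmode": travelmode,
    }
    if len(stops) > 2:
        # Intermediate stops are passed as one pipe-separated parameter.
        params["waypoints"] = "|".join(stops[1:-1])
    return "https://www.google.com/maps/dir/?" + urlencode(params)


# Hypothetical path; the place names are for illustration only.
print(directions_url(["Roman Forum, Rome", "Colosseum, Rome", "Appian Way, Rome"]))
```

    A full implementation would first resolve each mention to coordinates in the entity-linking step, since place names in fiction can be ambiguous or historical.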


    Features we want to have

    Mix images with generated info

    process images => segment, contours, recognition

    sense orientation of the camera

    decide info to be displayed

    use info provided by the camera focus

    AUGMENTED REALITY


    Features we want to have

    Client-server

    user's portrait

    the database

    the architecture

    standards and communication protocols

    CLIENT-SERVER


    Modules

    1. TEXT ANALYTICS

    2. NAMED ENTITY RECOGNITION

    3. ENTITY CRAWLING

    4. LOCALISATION

    5. PATH DETECTION

    6. MAPS&TRAJECTORIES

    7. GEOGRAPHY

    8. AUGMENTED REALITY

    9. INTERFACES

    10. CLIENT-SERVER

    11. RESOURCES

    12. MANAGEMENT AND EVALUATION


    Organisational details

    2 projects!

    Two groups: led by Ionuț Pistol and Mădălina Răschip

    Each project includes a number of modules (12?)

    Each module is built by a team

    Each team has approx. 10-15 members

    Only one Management & Organisation team


    Immediate deadline

    Teams organised next week!