Semantic Parsing
- Mixing Weak Learners in Semantic Parsing, Rodney D. Nielsen,
Sameer Pradhan, In Proceedings of EMNLP 2004
[PDF]
- Uses random forests for semantic parsing
- Suggests a cool feature-space dimensionality reduction algorithm.
- Features drawn from Gildea and Jurafsky 2002, Pradhan et al. 2003,
& Surdeanu et al. 2003.
- Pradhan et al. 2003 & Surdeanu et al. 2003 features
- Named entities
- Head word POS
- Content word
- Verb Cluster
- Half Path
- Introduces 2 features: governing preposition (GP), &
content word base (normalized content word - singular form,
no prefixes, digits replaced with 'n'); a rough sketch of this
normalization follows this entry.
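- A minimal sketch of the content word base normalization, assuming a
naive singularization rule; prefix stripping is omitted since the
exact rules aren't given in these notes:

      import re

      def content_word_base(word):
          # Rough approximation of the normalization described above;
          # the singularization heuristic is a guess, and prefix
          # stripping is left out because the exact rules are unclear.
          w = re.sub(r"\d", "n", word.lower())   # digits replaced with 'n'
          if w.endswith("es") and len(w) > 3:
              w = w[:-2]                         # crude singular form
          elif w.endswith("s") and len(w) > 2:
              w = w[:-1]
          return w

      print(content_word_base("Cars"))    # -> car
      print(content_word_base("1990s"))   # -> nnnn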
- Using Predicate-Argument Structures for Information Extraction,
Mihai Surdeanu, Sanda Harabagiu, John Williams, and Paul
Aarseth, In Proceedings of ACL 2003
[PDF]
- Uses semantic parsing for information extraction
- Introduces named entity based features, POS features, as well as
a 'content' word feature (i.e.
the content word is a heuristically determined informative word
that in many cases will be different from the head of a constituent)
- Semantic parsing based on a decision tree classifier (C5).
- The Necessity of Parsing for Predicate Argument Recognition, Daniel
Gildea, & Martha Palmer, In proceedings of ACL 2002
[PDF]
- Gildea & Jurafsky features and back-off based semantic parser
applied to (early version of) PropBank rather than FrameNet
- Compares and contrasts Framenet and PropBank results
- Using automatic parsers, performance on FrameNet is better -
(P:64.6,R:61.2) vs (P:57.7,R:50.0)
- Examines the value of accurate parses in the semantic role labeling
task
- Using 'gold standard' parses, PropBank performance increases to
(P:71.1,R:64.4), and subsequent filtering out of under-represented
examples further increases performance to (P:73.5,R:71.7).
- Conclusion --> Having good parses is critical to achieving good
performance
- Argues for the necessity of syntactic parsing in semantic role
labeling by comparing system that uses features extracted from
a parse tree to a system that relies on an idealized* base-level
constituent chunker (* - constituent chunks are based on gold standard
parses)
- Chunker based system is a bit of a 'straw man'; for a better one
see Hacioglu et al. 2003 (HLT proceedings)
- Chunker based system has terrible performance using even a
very liberal scoring metric (P:49.5,R:35.1), with strict scoring
the results are even worse (P:27.6,R:22.0)
- However, the system does illustrate the relative importance of
two features, head word & path, in the performance of the
parse tree based system (a sketch of the path feature follows
this entry).
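- A minimal sketch of computing the Gildea & Jurafsky path feature
from a parse tree; nltk trees are used, and the positions are
assumed to be the tree positions of the constituent and of the
predicate's preterminal:

      from nltk.tree import Tree

      def path_feature(tree, const_pos, pred_pos):
          # Lowest common ancestor = longest shared position prefix.
          i = 0
          while (i < min(len(const_pos), len(pred_pos))
                 and const_pos[i] == pred_pos[i]):
              i += 1
          # Labels from the constituent up to (and including) the LCA...
          ups = [tree[const_pos[:j]].label()
                 for j in range(len(const_pos), i - 1, -1)]
          # ...then down from below the LCA to the predicate.
          downs = [tree[pred_pos[:j]].label()
                   for j in range(i + 1, len(pred_pos) + 1)]
          return "↑".join(ups) + "".join("↓" + d for d in downs)

      t = Tree.fromstring("(S (NP (NNP He)) (VP (VBD ate) (NP (NN pasta))))")
      # Subject NP to the predicate 'ate': NP↑S↓VP↓VBD
      print(path_feature(t, (0,), (1, 0)))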
- Target Word Detection and Semantic Role Chunking using
Support Vector Machines, Kadri Hacioglu, & Wayne Ward,
In Proceedings of HLT-NAACL 2003
[PDF]
- Demonstrates that better than expected semantic role labeling
performance can be achieved by a chunking based system
- Competitive with Gildea & Jurafsky 2002
- achieved overall (P:67.6,R:55.9) - in comparison
Gildea & Jurafsky achieved (P:65.0,R:61.0).
- Used FrameNet data set with the same set of semantic role
mappings used in Gildea & Jurafsky 2002.
- Performed both the task of identification of the target word
and labeling of semantic roles
- achieved (P:76.8,R:73.1) when identifying the target word
- Features - all contained in a five-word sliding window around
the current word being labeled: word identities, part of
speech, constituent chunk information, and the classifier
labels assigned to the two preceding words (see the sketch
at the end of this entry)
- Implemented using Yamcha/TinySVM.
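- A minimal sketch of the five-word window features; the feature
names and exact window contents are assumptions, since the paper's
actual feature templates may differ:

      def window_features(words, pos_tags, chunks, prev_labels, i, w=2):
          # Features for token i drawn from positions i-2 .. i+2, plus
          # the classifier's labels for the two preceding words.
          feats = {}
          for off in range(-w, w + 1):
              j = i + off
              if 0 <= j < len(words):
                  feats["word[%d]" % off] = words[j]
                  feats["pos[%d]" % off] = pos_tags[j]
                  feats["chunk[%d]" % off] = chunks[j]
          feats["label[-1]"], feats["label[-2]"] = prev_labels
          return feats

      print(window_features(["He", "ate", "pasta"],
                            ["PRP", "VBD", "NN"],
                            ["B-NP", "B-VP", "B-NP"],
                            ("B-A0", "O"), 2))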
- Maximum Entropy Models for FrameNet Classification, Michael Fleischman,
Namhee Kwon, & Eduard Hovy, In Proceedings of EMNLP 2003
[PDF]
- Training & evaluation done using the same division of the FrameNet
data set seen in Gildea & Jurafsky 2002
- Uses maximum entropy to estimate model probabilities
- Using just the features drawn from G&J 2002, but with maximum
entropy rather than the G&J back-off model, increases
performance from 78.5% to 81.7% on the labeling task.
- Introduces 3 new features for use in the labeling task
- New Features:
- Order - the linear position of the frame element in
the context of the other frame elements to be labeled
(i.e. whether the frame element is the first, second, third, etc.
frame element in the sentence that is to be labeled)
- Syntactic pattern - a global feature for a sentence that
reflects the phrase type & logical function
(governing cat in G&J 2002) of
each frame element to be labeled, ordered according to
their linear position in the sentence with the position of the
target reflected by a 'target' entry in the list
- Previous role - a feature that reflects the role assigned to the
previous frame element or the roles assigned to the previous 2 frame
elements. Using this feature involves performing a Viterbi search
over possible assignments of roles to the frame elements (a sketch
follows this list)
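- A minimal sketch of that Viterbi search; the local scorer is
assumed to return log P(role | FE features, previous role) from
the max-ent model:

      def viterbi_roles(frame_elems, roles, score):
          # score(fe, role, prev_role) -> log probability under the model.
          best = {r: score(frame_elems[0], r, None) for r in roles}
          back = []
          for fe in frame_elems[1:]:
              ptrs, nxt = {}, {}
              for r in roles:
                  prev, s = max(((p, best[p] + score(fe, r, p))
                                 for p in roles), key=lambda x: x[1])
                  ptrs[r], nxt[r] = prev, s
              back.append(ptrs)
              best = nxt
          # Recover the best sequence by following back-pointers.
          last = max(best, key=best.get)
          seq = [last]
          for ptrs in reversed(back):
              seq.append(ptrs[seq[-1]])
          return list(reversed(seq))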
- Using syntactic pattern & order, in addition to the baseline
features,
increases performance to 83.6 on the labeling task.
- Using all three new features (+baseline features) increases
performance to 84.7 on the labeling task.
- Frame element identification - performed using only three sets of
features: path, path /\ target, & target /\ head word.
- Achieves (P:.736,R:.679) on the pure identification task
- When identification is combined with frame element classification,
the system's performance is (P:.6,R:.554).
- In comparison
G&J 2002 report (P:.726,R:.631) and (P:.67,R:.468), on
pure identification & identification + labeling, respectively.
- Experiments involving varying the size of the training set suggest that
further increases in the amount of data available could
substantially
improve performance on the classification task, but with smaller
gains on the identification task. This suggests a more sophisticated
model is necessary for further improvements in frame element
identification.
- SENSEVAL Automatic Labeling of Semantic Roles using Maximum Entropy,
Namhee Kwon, Michael Fleischman, & Eduard Hovy, In proceedings of
Senseval-3 (2004)
[PDF]
- Builds on Fleischman, Kwon & Hovy 2003
- Uses features drawn from Gildea & Jurafsky 2002, and
Fleischman, Kwon, & Hovy 2003.
- Incorporates three new features
motivated by the task definition for the senseval track on
automatic labeling of semantic roles.
- Frame - identity of the frame
- lexical unit - base form of the predicate being labeled
represented in conjunction with the predicate's
grammatical type (i.e. verb, noun, adjective)
- lexical unit type - grammatical type of the predicate
(i.e. verb, noun, adjective),
as derived from the representation of the lexical unit.
- Introduces a 'partial path' feature - identical to the standard path
feature iff the constituent being labeled is under the same "S" as the
target word. Otherwise, it is set to the value "nopath".
- Introduces a sentence segmentation step into the labeling pipeline
- New pipeline: segmentation->identification->tagging
- Segmentation consists of selecting the sequence of
constituents that cover the entire input sentence
and are at the highest level possible in the parse tree
while still allowing the target to be contained within its
own segment (see the sketch below)
- Advantages:
- Fewer candidate FEs during the identification step (results in
faster training)
- Allows the identification of FEs to be done using a
straightforward application of a MEMM to the resulting
sequence of segments. This allows for easy inclusion of features
that encode dependencies between FE labels in the sequence
- Disadvantage: while 85.8% of FEs correspond to a constituent in
the parse tree, only 79.5% of the FEs correspond to the resulting
segments. Thus, the upper bound on successful FE identification
is lowered.
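- A minimal sketch of that segmentation over an nltk tree;
target_pos is assumed to be the tree position of the target's
preterminal:

      from nltk.tree import Tree

      def segment(tree, target_pos):
          # Keep the highest constituents possible, recursing only into
          # nodes that dominate the target so it gets its own segment.
          def walk(node, pos):
              if isinstance(node, str):
                  return [[node]]
              dominates = pos == target_pos[:len(pos)]
              if not dominates or pos == target_pos:
                  return [node.leaves()]
              segs = []
              for i, child in enumerate(node):
                  segs.extend(walk(child, pos + (i,)))
              return segs
          return walk(tree, ())

      t = Tree.fromstring("(S (NP (NNP He)) (VP (VBD ate) (NP (NN pasta))))")
      print(segment(t, (1, 0)))   # -> [['He'], ['ate'], ['pasta']]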
- Performance
- Using the scoring script from Senseval -
Restricted task (id+labeling): (P:.802,R:.654) label overlap 0.784;
Unrestricted task (labeling): (P:.867,R:.858) label overlap 0.866
- Using exact match scores for the test set -
Restricted task (id+labeling): (P:.711,R:.585);
Unrestricted task (labeling): (P:.867,R:.858)
- Best performing Senseval-3 system (UTDMorarescu) -
Restricted task (id+labeling): (P:.899,R:.772) label overlap 0.882;
Unrestricted task (labeling): (P:.946,R:.907) label overlap 0.946
- SENSEVAL-3 Task: Automatic Labeling of Semantic Roles,
Kenneth C. Litkowski, In proceedings of Senseval-3 (2004)
[PDF]
- Summarizes senseval-3 automatic semantic role labeling task and
results
- Used subset of FrameNet 1.1 database
- Complete FrameNet 1.1 database - 132,968 annotated sentences,
487 frames, 696 distinctly-named frame elements
- For this task: 8,002 annotated sentences
(in the evaluation set), 40 frames
- Frames chosen at random from those with at least 370
annotations
- Answers submitted as a plain text file with the results for one
annotated sentence per line. Each such line indicates the frame
of the sentence, the sentence's unique id number, and the semantic
roles present in the frame as well as the character positions at
which those semantic roles occur in the sentence
(e.g. "Motion.1087911 Theme(82,88) Path(0,0)"); a sketch of
parsing this format follows below.
- Null instantiations, i.e. semantic roles that are conceptually
present but not explicitly represented in the sentence, can be
indicated by a system via a semantic role with start and end
positions of 0.
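- A minimal sketch of reading one such answer line (format as given
in the example above):

      import re

      def parse_answer_line(line):
          head, *rest = line.split()
          frame, sent_id = head.split(".")
          roles = []
          for tok in rest:
              m = re.fullmatch(r"(.+)\((\d+),(\d+)\)", tok)
              # (0,0) marks a null instantiation
              roles.append((m.group(1), int(m.group(2)), int(m.group(3))))
          return frame, sent_id, roles

      print(parse_answer_line("Motion.1087911 Theme(82,88) Path(0,0)"))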
- Two variations on the task
- Restricted - The information the system can use during
evaluation is restricted to: the sentence to be labeled,
the identity (/position of the) target predicate, &
the lexical unit
- Unrestricted - During evaluation the system can use all of
the information in the FrameNet database except for the
identity of the frame element to be labeled
- Scoring (a sketch of these metrics follows this list)
- Precision & Recall of semantic roles returned by the
system - whereby such semantic roles must overlap with the
manually annotated semantic roles by at least one character
position
- Overlap - the degree to which correctly labeled semantic roles
returned by the system overlap with manually annotated semantic
roles, as measured by the fraction of overlapping characters
- Attempted - number of semantic roles returned by the system
divided by the number of manually annotated semantic roles.
- Null instantiations did not affect any of the scoring metrics
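- A minimal sketch of these metrics; spans are (role, start, end)
character positions, and the overlap denominator is assumed to be
the gold span length, which may differ from the official script:

      def senseval_score(system, gold):
          def inter(a, b):   # number of shared character positions
              return max(0, min(a[2], b[2]) - max(a[1], b[1]) + 1)
          correct, fractions = 0, []
          for s in system:
              for g in gold:
                  if s[0] == g[0] and inter(s, g) > 0:
                      correct += 1
                      fractions.append(inter(s, g) / (g[2] - g[1] + 1))
                      break
          return {"precision": correct / len(system),
                  "recall": correct / len(gold),
                  "overlap": sum(fractions) / len(fractions) if fractions else 0.0,
                  "attempted": len(system) / len(gold)}

      print(senseval_score([("Theme", 82, 88), ("Path", 10, 20)],
                           [("Theme", 80, 88), ("Path", 30, 40)]))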
- Submissions: 20 systems by 8 teams
- Results:
- Restricted task
- Average - Prec: 0.595, Rec: 0.481
- Best (UTDMorarescu) - Prec: 0.899, Rec: 0.772, Over: 0.882,
Att: 85.9
- Unrestricted task
- Average - Prec: 0.803, Rec: 0.757
- Best (UTDMorarescu) - Prec: 0.946, Rec: 0.907, Over: 0.946,
Att: 95.8
- Semantic Parsing Based on FrameNet, Cosmin Adrian Bejan,
Alessandro Moschitti, Paul Morarescu, Gabriel Nicolae,
and Sanda Harabagiu, In Proceedings of Senseval-3 (2004)
[PDF]
- Best results for SENSEVAL-3 Automatic labeling of semantic roles
bakeoff
- SVM based system
- For the labeling task, a separate multi-class SVM based
classifier was trained for each frame
- Each multi-class classifier used for the labeling task was
implemented as a set of one vs. all (OVA) binary classifiers.
In the case of two binary classifiers attempting to assign a label
to the same frame element, they "select(ed) the classification
which was assigned the highest score by the SVM". Presumably
this means the output of the SVM in which the FE was
furthest away from the decision boundary (see the sketch below)
- While not entirely clear from the paper, I would guess that
separate binary classifiers, one for each frame, were used for
the identification task as well
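- A minimal sketch of that OVA tie-breaking on toy data; LinearSVC
stands in for the paper's SVM, and decision_function gives the
signed distance to the separating hyperplane:

      import numpy as np
      from sklearn.svm import LinearSVC

      rng = np.random.RandomState(0)
      X, y = rng.randn(200, 10), rng.randint(0, 3, 200)   # 3 toy roles

      # One binary classifier per role label.
      clfs = {r: LinearSVC(max_iter=5000).fit(X, (y == r).astype(int))
              for r in range(3)}

      def predict(x):
          # If several classifiers claim the example, keep the label whose
          # SVM scores it furthest on the positive side of its boundary.
          scores = {r: c.decision_function(x.reshape(1, -1))[0]
                    for r, c in clfs.items()}
          return max(scores, key=scores.get)

      print(predict(X[0]))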
- Features
- Drawn from Gildea & Jurafsky 2002, Surdeanu et al. 2003,
and Pradhan et al. 2004.
- Gildea & Jurafsky 2002: phrase type, parse tree path,
position, voice, head word, governing category,
target word
- Surdeanu et al. 2003: Content word, part of speech of
head word, part of speech of content word,
named entity class of content word, boolean named entity
flags
- Pradhan et al. 2004: Parse tree path w/o direction of
transitions, partial path, first word in constituent,
last word in constituent, part of speech of first word,
part of speech of last word, left constituent phrase type,
left constituent head, left constituent head POS,
right constituent phrase type, right constituent head,
right constituent head POS, pp preposition, tree distance
to target
- Introduced a ton of new features
- Human - true if phrase is a personal pronoun or
is a hyponym of PERSON sense 1 in WordNet
- Support verb - if the target is in a VP, then is set
to the POS of the VP. Otherwise, it's set to NULL.
- Target type - the target's lexical class
- List constituent FEs - a list of the phrase types of the
frame elements in a sentence
- Grammatical function - as given in the FrameNet
database for each labeled FE
- List grammatical function - a list of the grammatical
functions of the frame elements in a sentence
- Number FE - the number of frame elements in a
sentence
- Frame name - the name of the frame under which the
frame elements are to be labeled
- Coverage - Boolean feature that indicates whether
or not the FE is perfectly covered by a constituent in the
parse tree
- Coreness - whether the frame element being labeled
is core, peripheral, or extra-thematic
- Sub corpus - the subcorpus that contains the sentence
being labeled
- For the identification task, the features used are those from
Gildea & Jurafsky 2002, Surdeanu et al. 2003, and Pradhan et al. 2004,
as well as support verb, target type, frame name, and sub corpus
from the new feature set defined above
- Parse normalization heuristics introduced:
- For the unlabeled task, if there is no single constituent
that exactly matches the boundaries of the FE but there is
a sequence of constituents that are exactly contained within it,
then a new NP-merge node is introduced to join the smaller
constituents (a sketch follows these heuristics)
- If the target word is a noun or an adjective, consecutive nouns
within the same noun phrase of the target are combined into a
new larger NP
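- A minimal sketch of the NP-merge normalization over an nltk tree;
the span-matching logic is simplified to an already-identified
contiguous sequence of sister constituents, and the node label is
illustrative:

      from nltk.tree import Tree

      def np_merge(parent, start, end):
          # If children start..end of `parent` exactly cover an FE that
          # no single constituent matches, splice in a new merged node.
          merged = Tree("NP-MERGE", [parent[i] for i in range(start, end + 1)])
          parent[start:end + 1] = [merged]
          return parent

      t = Tree.fromstring("(VP (VBD ate) (NP (DT the) (NN pasta)) (ADVP (RB quickly)))")
      # Suppose the FE spans 'the pasta quickly': merge children 1..2.
      print(np_merge(t, 1, 2))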
- Results
- Unrestricted: (P:.945,R:0.906)
- Restricted: (P:.824,R:.711)
- Calibrating Features for Semantic Role Labeling, Nianwen Xue and
Martha Palmer, In proceedings of EMNLP 2004
[PDF]
- Theme - Prior research has not fully exploited all of the information
that is present in a parse tree and useful for semantic parsing. Further,
it is possible to only use parse tree based features, along
with log-linear classifiers rather than SVMs, in order to
obtain performance that is very close to more sophisticated systems.
- Dataset - Propbank (version released on 2/4/2004)
- Classifier - Maximum entropy. Given the much smaller amount of
training time required for maximum entropy relative to SVMs, it can
be used to much more rapidly explore which features are most
effective.
- Features - based on features given in Gildea & Jurafsky 2002,
with a couple of new features, and with the rest being carefully
engineered conjunctions of features (a sketch of such conjunctions
follows this list):
- Identification
- Path
- Head
- Head part of speech
- Phrase type /\ predicate
- Head /\ Predicate
- Predicate /\ Distance from predicate
- New features (/feature combinations) for classification
- Syntactic frame - list of the FE phrase types being
labeled, with the target's position being indicated with
'target', and with flagging of the current FE
being labeled (e.g. np_v_NP_np, whereby NP corresponds
to the FE currently being labeled)
- Head of PP parent - if the constituent is immediately
embedded in a PP, then the head of that PP.
- Lexicalized constituent type - phrase type /\ predicate
- Lexicalized head word - head word /\ predicate
- Voice position combination - position /\ voice
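- A minimal sketch of such conjunction ('/\') features feeding a
max-ent (logistic regression) model; the feature names and toy
values are illustrative:

      from sklearn.feature_extraction import DictVectorizer
      from sklearn.linear_model import LogisticRegression

      def featurize(inst):
          # A conjoined feature is just the concatenation of two values.
          return {"path": inst["path"],
                  "head": inst["head"],
                  "ptype^pred": inst["ptype"] + "^" + inst["pred"],
                  "head^pred": inst["head"] + "^" + inst["pred"]}

      train = [{"path": "NP↑S↓VP↓VBD", "head": "he", "ptype": "NP",
                "pred": "eat", "label": "ARG0"},
               {"path": "NP↑VP", "head": "pasta", "ptype": "NP",
                "pred": "eat", "label": "ARG1"}]

      vec = DictVectorizer()
      X = vec.fit_transform(featurize(i) for i in train)
      clf = LogisticRegression().fit(X, [i["label"] for i in train])
      print(clf.predict(vec.transform(featurize(train[0]))))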
- Introduces a novel way of filtering arguments to be labeled
(sketched below). That is, starting
with the predicate, the system collects all of its immediate sisters
in the tree. If one of the sisters is a PP, then the system
collects all of the PP's immediate children as well. The process
then repeats for the parent of the predicate, and after that the
parent's parent. The process continues until the top of the tree is
reached. Using gold standard parses, 99.3% of the arguments are
captured. Using parses from the Collins parser, 88.9% of the
arguments are captured.
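- A minimal sketch of that filtering heuristic over an nltk tree;
pred_pos is assumed to be the position of the predicate's
preterminal:

      from nltk.tree import Tree

      def prune_candidates(tree, pred_pos):
          cands, pos = [], pred_pos
          while pos:                          # climb until the root
              parent = tree[pos[:-1]]
              for i, sister in enumerate(parent):
                  if pos[:-1] + (i,) == pos:  # skip the node we came from
                      continue
                  cands.append(sister)
                  # If a sister is a PP, also take its immediate children.
                  if isinstance(sister, Tree) and sister.label() == "PP":
                      cands.extend(list(sister))
              pos = pos[:-1]
          return cands

      t = Tree.fromstring("(S (NP (NNP He)) (VP (VBD ate) (NP (NN pasta)) (PP (IN with) (NP (NN joy)))))")
      for c in prune_candidates(t, (1, 0)):   # predicate = 'ate'
          print(c.label(), " ".join(c.leaves()))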
- Results
- Gold standard parses - Classification: 92.95%,
Identification & classification: 88.51 % (F-score)
- Automatic parses - Identification & classification:
76.21%
- Comparison made to Pradhan et al. 2004 (using gold standard
parses) -
Classification: 93.0%,
Identification & classification: 89.4% (F-score)
- Shallow Semantic Parsing using Support Vector Machines,
Sameer Pradhan, Wayne Ward, Kadri Hacioglu, James H. Martin,
Dan Jurafsky, In Proceedings of HLT-NAACL 2004
[PDF]
- Theme - Using features from Gildea & Jurafsky 2002,
a couple of features from Surdeanu et al. 2003, and
a large number of novel features, the authors build what was
at the time of publication the best performing shallow
semantic parsing system. Experiments are also done to
assess the value of the various features used, explore the
use of HMM rescoring of a labeled sentence, and evaluate the
performance of the system on different versions of the PropBank
corpus and on a new corpus based on data from the AQUAINT project
- Dataset - PropBank July 2002 release, as well as a handful
of experiments with the PropBank Feb 2004 release.
- Classifier - SVM w/polynomial kernel degree 2 (Yamcha + TinySVM)
- Features:
- Drawn from Gildea & Jurafsky 2002 and Surdeanu et al. 2003
- Predicate (G&J)
- Path (G&J)
- Phrase type (G&J)
- Position (G&J)
- Voice (G&J)
- Head word (G&J)
- Sub-categorization - phrase structure rule used to expand
the predicate's parent node (listed as being from
Gildea & Jurafsky, although I don't remember this
feature being in that paper)
- Named entities in constituent (Surdeanu et al. 2003)
- Head word POS (Surdeanu et al. 2003)
- Novel features
- Verb clusterings - one of 64 clusters of verbs, whereby the
clusters were created using a collection of verb-direct-object
relationships collected with Minipar (Lin 1998)
- Verb sense information - as tagged in the PropBank corpus
(an "oracle" feature)
- Partial path - path from the constituent being labeled to the
constituent that is the lowest common ancestor of the constituent
being labeled and the predicate (a sketch follows this
feature list)
- Head word of prepositional phrases -
For prepositional phrases, the head of the first noun phrase
inside the prepositional phrase. The preposition is preserved
by attaching it to the phrase type (It sounds like this feature
replaces the traditional head feature for PP, rather than
just adding in another feature to the mix)
- First and Last Word/POS in Constituent
- Ordinal constituent position - linear distance,
in intervening words, to the predicate.
- Constituent tree distance - distance to the predicate as
measured by tree arcs
- Constituent relative features - Head, POS, and phrase type
of the constituent's parent as well as left & right siblings
- Temporal cue words - keywords that attempt to flag temporal
expressions often missed by the named entity tagger
- Dynamic class context - the predicted class of the previous
two arguments labeled
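- A minimal sketch of the partial path variant, using the same
position conventions as the path sketch earlier in these notes:

      from nltk.tree import Tree

      def partial_path_feature(tree, const_pos, pred_pos):
          # Longest shared position prefix = lowest common ancestor.
          i = 0
          while (i < min(len(const_pos), len(pred_pos))
                 and const_pos[i] == pred_pos[i]):
              i += 1
          # Only the upward half of the path, stopping at the LCA.
          ups = [tree[const_pos[:j]].label()
                 for j in range(len(const_pos), i - 1, -1)]
          return "↑".join(ups)

      t = Tree.fromstring("(S (NP (NNP He)) (VP (VBD ate) (NP (NN pasta))))")
      print(partial_path_feature(t, (0,), (1, 0)))   # -> NP↑S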
- HMM rescoring
- Uses Platt's algorithm to get probabilities from the SVMs
(a sketch of this conversion follows below)
- Trigram language model trained over core arguments
- Two variations - one with an identical representation
for all predicates
and one with predicates being represented by a specific lemma
- Baseline - (P:.900,R:.861,F:.880),
Shared predicate representation - (P:.908,R:.863,F:.885),
Specific predicate lemma - (P:.905,R:.874,F:.889)
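- A minimal sketch of the Platt conversion; A and B are illustrative
placeholders, since in practice they are fit on held-out data:

      import math

      def platt_probability(margin, A=-1.0, B=0.0):
          # Platt: P(y=1 | f) = 1 / (1 + exp(A*f + B)) for SVM margin f.
          return 1.0 / (1.0 + math.exp(A * margin + B))

      # A point far on the positive side gets a high probability.
      print(platt_probability(2.0))   # ~0.88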
- Best Performance
- July 2002 data, manual parses - (all args)
Classification: 91.0%, ID+Classification (P:.889,R:.846,F:.867);
(core args) Classification: 93.9%, ID+Classification
(P:.905,R:.874,F:.889)
- July 2002 data, automatic parses - (all args)
Classification: 90.0%, ID+Classification (P:.840,R:.753,F:.794)
- Feb 2004 data, manual parses - (all args)
Classification: 93.0%, ID+Classification (P:.899,R:.890,F:.894)
- AQUAINT data - (all args) Classification: 83.8%,
ID+Classification (P:.652,R:.615,F:.633)
Machine Translation
- Scaling Phrase-Based Statistical Machine Translation
to Larger Corpora and Longer Phrases, Chris Callison-Burch, Colin Bannard,
and Josh Schroeder, ACL 2005
[PDF]
- A Hierarchical Phrase-Based Model for Statistical Machine
Translation, David Chiang (ACL 2005 best paper award)
[PDF]