Congratulations, Dr. Yang
Sunday, September 6th, 2009
Li Yang successfully defended his dissertation, Re-evaluating and Exploring the Contributions of Constituent Grammar to Semantic Role Labeling, on Sept. 4.
Committee: Steve Abney (Chair), George Michailidis, Drago Radev, Rich Thomason
Li will continue his work for Janya, a company in Buffalo, NY, that develops information extraction software.
Congratulations, Dr. Yang!
Abstract:
Since the seminal work of Gildea and Jurafsky (2000), semantic role labeling (SRL) researchers have been trying to determine the appropriate syntactic/semantic knowledge and machine learning algorithms to tackle the challenges in SRL. In search of the appropriate knowledge, SRL researchers shifted from constituency grammar to dependency grammar around 2007, because systems relying on constituency grammar-based features had stopped improving. However, the results from the CoNLL-2008 SRL systems, all of which utilized dependency grammar-based features, did not support the hypothesis that dependency grammar was more suitable for SRL. Therefore, determining the right syntactic/semantic knowledge for SRL remains an open question. It follows that finding the right syntactic/semantic knowledge to create features that generalize across the syntactic variations that a verb appears in and that involve argument movement or displacement remains a challenge as well.
The current dissertation continues the effort to discover the appropriate syntactic/semantic knowledge for SRL. Specifically, while seeking the proper features to solve the SRL problem in general, the present work focuses on tackling the syntactic variation challenge by integrating three types of knowledge that are less thoroughly explored in constituency grammar-based SRL systems: context dependence among the semantic roles of core arguments, syntactic structures involving argument movement or displacement, and dependency grammar relations. Integrating such knowledge leads to the following novel approach.
The system identifies the core and non-core semantic arguments of a verb. To classify a non-core argument, the system uses a set of generic features. For a core argument, the system relies on the preceding types of knowledge to extract the base argument configuration (BAC) feature, in which the core arguments’ positions overlap with those of an argument structure of the verb. As a result, BAC features generalize across the syntactic variations a verb appears in. Together with two levels of backoff features, which deal with unrealized core arguments and unknown verbs respectively, BAC features effectively solve the argument classification task and successfully handle the preceding challenge. However, the experimental results indicate that the overall performance is affected by the argument identification module. The immediate future work is to improve the identification module.




