alwaysaditi's picture
End of training
dc78b20 verified
since 1995, a few statistical parsing algorithms (magerman, 1995; collins, 1996 and 1997; charniak, 1997; rathnaparki, 1997) demonstrated a breakthrough in parsing accuracy, as measured against the university of pennsylvania treebank as a gold standard. yet, relatively few have embedded one of these algorithms in a task. chiba, (1999) was able to use such a parsing algorithm to reduce perplexity with the long term goal of improved speech recognition. in this paper, we report adapting a lexicalized, probabilistic context-free parser with head rules (lpcfg-hr) to information extraction. the technique was benchmarked in the seventh message understanding conference (muc-7) in 1998. several technical challenges confronted us and were solved: treebank on wall street journal adequately train the algorithm for new york times newswire, which includes dozens of newspapers? manually creating sourcespecific training data for syntax was not required. instead, our parsing algorithm, trained on the upenn treebank, was run on the new york times source to create unsupervised syntactic training which was constrained to be consistent with semantic annotation.this simple semantic annotation was the only source of task knowledge used to configure the model. we have demonstrated, at least for one problem, that a lexicalized, probabilistic context-free parser with head rules (lpcfghr) can be used effectively for information extraction. instead, our parsing algorithm, trained on the upenn treebank, was run on the new york times source to create unsupervised syntactic training which was constrained to be consistent with semantic annotation. while performance did not quite match the best previously reported results for any of these three tasks, we were pleased to observe that the scores were at or near state-of-the-art levels for all cases. since 1995, a few statistical parsing algorithms (magerman, 1995; collins, 1996 and 1997; charniak, 1997; rathnaparki, 1997) demonstrated a breakthrough in parsing accuracy, as measured against the university of pennsylvania treebank as a gold standard. we evaluated the new approach to information extraction on two of the tasks of the seventh message understanding conference (muc-7) and reported in (marsh, 1998). our system for muc-7 consisted of the sentential model described in this paper, coupled with a simple probability model for cross-sentence merging. for the following example, the template relation in figure 2 was to be generated: "donald m. goldstein, a historian at the university of pittsburgh who helped write..."