11 resources
-
Ehud Reiter, Anja Belz|Dec 10th, 2009|journalArticle
There is growing interest in using automatically computed corpus-based evaluation metrics to evaluate Natural Language Generation (NLG) systems, because these are often considerably cheaper than the human-based evaluations which have traditionally been used in NLG. We review previous work on NLG evaluation and on validation of automatic metrics in NLP, and then present the results of two studies of how well some metrics which are popular in other areas of NLP (notably BLEU and ROUGE)...
-
Cynthia Lee, Kelvin C.K. Wong, William K...|Feb 10th, 2009|journalArticle
-
Klaus Zechner, Derrick Higgins, Xiaoming...|Oct 1st, 2007|journalArticle
-
Anat Ben-Simon, Randy Elliot Bennett|Mar 10th, 2007|journalArticle
This study evaluated a “substantively driven” method for scoring NAEP writing assessments automatically. The study used variations of an existing commercial program, e-rater®, to compare the performance of three approaches to automated essay scoring: a brute-empirical approach in which variables are selected and weighted solely according to statistical criteria, a hybrid approach in which a fixed set of variables more closely tied to the characteristics of good writing was used but the...
-
Joseph P. Turian, Luke Shen, I. Dan Mela...|Jan 1st, 2006|conferencePaper
Evaluation of MT evaluation measures is limited by inconsistent human judgment data. Nonetheless, machine translation can be evaluated using the well-known measures precision, recall, and their average, the F-measure. The unigram-based F-measure has significantly higher correlation with human judgments than recently proposed alternatives. More importantly, this standard measure has an intuitive graphical interpretation, which can facilitate insight into how MT systems might be improved. The...
-
Alon Lavie, Kenji Sagae, Shyamsundar Jay...|Mar 10th, 2004|bookSection
-
Chin-Yew Lin, Franz Josef Och|Mar 10th, 2004|conferencePaper
-
Paul Deane, Kathleen M. Sheehan|Apr 10th, 2003|conferencePaper
-
George Doddington|Mar 10th, 2002|conferencePaper
-
Lawrence M. Rudner, T. Liang|Mar 10th, 2002|conferencePaper
-
Kishore Papineni, Salim Roukos, Todd War...|Mar 10th, 2001|conferencePaper