11 resources
-
Ehud Reiter, Anja Belz|Dec 22nd, 2009|journalArticle
There is growing interest in using automatically computed corpus-based evaluation metrics to evaluate Natural Language Generation (NLG) systems, because these are often considerably cheaper than the human-based evaluations which have traditionally been used in NLG. We review previous work on NLG evaluation and on validation of automatic metrics in NLP, and then present the results of two studies of how well some metrics which are popular in other areas of NLP (notably BLEU and ROUGE)...
-
Cynthia Lee, Kelvin C.K. Wong, William K...|Feb 22nd, 2009|journalArticle
-
Klaus Zechner, Derrick Higgins, Xiaoming...|Oct 1st, 2007|journalArticle
-
Anat Ben-Simon, Randy Elliot Bennett|Jan 22nd, 2007|journalArticle
This study evaluated a “substantively driven” method for scoring NAEP writing assessments automatically. The study used variations of an existing commercial program, e-rater®, to compare the performance of three approaches to automated essay scoring: a brute-empirical approach in which variables are selected and weighted solely according to statistical criteria, a hybrid approach in which a fixed set of variables more closely tied to the characteristics of good writing was used but the...
-
Joseph P. Turian, Luke Shen, I. Dan Mela...|Jan 1st, 2006|conferencePaper
Evaluation of MT evaluation measures is limited by inconsistent human judgment data. Nonetheless, machine translation can be evaluated using the well-known measures precision, recall, and their average, the F-measure. The unigram-based F-measure has significantly higher correlation with human judgments than recently proposed alternatives. More importantly, this standard measure has an intuitive graphical interpretation, which can facilitate insight into how MT systems might be improved. The...
-
Alon Lavie, Kenji Sagae, Shyamsundar Jay...|Jan 22nd, 2004|bookSection
-
Chin-Yew Lin, Franz Josef Och|Jan 22nd, 2004|conferencePaper
-
Paul Deane, Kathleen M. Sheehan|Apr 22nd, 2003|conferencePaper
-
George Doddington|Jan 22nd, 2002|conferencePaper
-
Lawrence M. Rudner, T. Liang|Jan 22nd, 2002|conferencePaper
-
Kishore Papineni, Salim Roukos, Todd War...|Jan 22nd, 2001|conferencePaper