The ‘Reproducible Research’ idea posits that publishing data and code, not just statistical summaries, makes for better and faster science. In particular, shared datasets and shared evaluation metrics lower barriers to entry and allow meaningful comparison of scientific hypotheses and engineering algorithms.
In this lecture, Mark Liberman will describe the origins and development of the ‘Common Task’ method in DARPA’s human language technology program, its broader influence on recent research and development practices, and its lessons for the future. Large, shared datasets and well-defined evaluation metrics allow the steady improvement of technologies a decade or more in advance of commercial viability. There are important opportunities to apply similar ideas in a wide variety of areas, from autism research to STEM education and writing instruction.
Mark Liberman is the Christopher H. Browne Professor of Linguistics at the University of Pennsylvania, with positions in the department of computer science and in the psychology graduate group. He is also founder and director of the Linguistic Data Consortium. Before coming to the University of Pennsylvania, he was head of the linguistics research department at AT&T Bell Laboratories.
Registration information coming soon.
If this lecture is videotaped, the recording will be posted here after production.