publications | Matthew T. Dunn

2017

SearchQA: A New Q&A Dataset Augmented with Context from a Search Engine

Matthew Dunn, Levent Sagun, Mike Higgins, and 3 more authors

2017

Early Predictability of Asylum Court Decisions

Matt Dunn, Levent Sagun, Hale Şirin, and 1 more author

In Proceedings of the 16th Edition of the International Conference on Artificial Intelligence and Law, 2017

2021

Actionable Conversational Quality Indicators for Improving Task-Oriented Dialog Systems

Michael Higgins, Dominic Widdows, Chris Brew, and 9 more authors

2021

@article{higgins2021actionable,
  title = {Actionable Conversational Quality Indicators for Improving Task-Oriented Dialog Systems},
  author = {Higgins, Michael and Widdows, Dominic and Brew, Chris and Christian, Gwen and Maurer, Andrew and Dunn, Matthew and Mathi, Sujit and Hazare, Akshay and Bonev, George and Hockey, Beth Ann and Howell, Kristen and Bradley, Joe},
  year = {2021},
  eprint = {2109.11064},
  archiveprefix = {arXiv},
  primaryclass = {cs.CL}
}

2022

Domain-specific knowledge distillation yields smaller and better models for conversational commerce

Kristen Howell, Jian Wang, Akshay Hazare, and 7 more authors

In Proceedings of the Fifth Workshop on e-Commerce and NLP (ECNLP 5), May 2022

We demonstrate that knowledge distillation can be used not only to reduce model size, but to simultaneously adapt a contextual language model to a specific domain. We use Multilingual BERT (mBERT; Devlin et al., 2019) as a starting point and follow the knowledge distillation approach of Sanh et al. (2019) to train a smaller multilingual BERT model that is adapted to the domain at hand. We show that for in-domain tasks, the domain-specific model shows on average 2.3% improvement in F1 score, relative to a model distilled on domain-general data. Whereas much previous work with BERT has fine-tuned the encoder weights during task training, we show that the model improvements from distillation on in-domain data persist even when the encoder weights are frozen during task training, allowing a single encoder to support classifiers for multiple tasks and languages.