Probing Language Models

Abstract

Using probing methods designed for language models, we compare the syntactic representations learned by recurrent and attention-based deep learning models. Surprisingly, we find that recurrent models capture part-of-speech (POS) tag and syntax tree information to a higher degree than attention-based models do. However, the conclusion is less firm for syntax tree information, since no control tasks exist yet for structural probes, and other NLP tasks may yield different results. Interesting directions for further work are therefore to probe for other syntactic and semantic tasks, and to design control tasks for them.
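
As a rough illustration of the probing setup described above (a minimal sketch: the pretrained model, layer choice, tag set, and toy data are assumptions for illustration, not the configuration used in this work), a linear POS probe can be trained on frozen hidden states:

    # Minimal POS-probing sketch. The model name and toy data below are
    # illustrative assumptions, not the setup used in the paper.
    import torch
    from transformers import AutoModel, AutoTokenizer
    from sklearn.linear_model import LogisticRegression

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModel.from_pretrained("bert-base-uncased")
    model.eval()

    def hidden_states(sentence):
        # One frozen vector per token; here the last layer, but any layer can be probed.
        inputs = tokenizer(sentence, return_tensors="pt")
        with torch.no_grad():
            out = model(**inputs)
        return out.last_hidden_state[0]  # shape: (seq_len, hidden_dim)

    # Hypothetical toy data: whitespace tokens paired with gold POS tags.
    sentences = [("the cat sat", ["DET", "NOUN", "VERB"])]
    X, y = [], []
    for text, tags in sentences:
        states = hidden_states(text)[1:-1]  # drop [CLS] and [SEP]
        for vec, tag in zip(states, tags):
            X.append(vec.numpy())
            y.append(tag)

    # The probe itself is a simple linear classifier over frozen representations.
    probe = LogisticRegression(max_iter=1000).fit(X, y)
    print(probe.score(X, y))  # probe accuracy (on training data here, for brevity)

The same recipe applies to a control task in which each word type is assigned a random label: the gap between probe accuracy and control-task accuracy (the probe's selectivity) indicates how much of the probe's performance reflects the representation rather than the capacity of the probe itself.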
