Clara Meister
Publications
Tokenization and the Noiseless Channel
Subword tokenization is a key part of most NLP pipelines. However, little is known about why some tokenizer and hyperparameter …
Vilém Zouhar, Clara Meister, Juan Gastaldi, Li Du, Mrinmaya Sachan, Ryan Cotterell
Cite
URL
On the Efficacy of Sampling Adapters
Sampling-based decoding strategies are widely employed for generating text from probabilistic models, yet standard ancestral sampling …
Clara Meister, Tiago Pimentel, Luca Malagutti, Ryan Cotterell
Cite
A Measure-theoretic Characterization of Tight Language Models
Language modeling, a central task in natural language processing, involves estimating a probability distribution over strings. In most …
Li Du, Lucas Torroba Hennigen, Tiago Pimentel, Clara Meister, Jason Eisner, Ryan Cotterell
Cite
URL
A Formal Perspective on Byte-Pair Encoding
Byte-Pair Encoding (BPE) is a popular algorithm used for tokenizing data in NLP, despite being devised initially as a compression …
Vilém Zouhar, Clara Meister, Juan Gastaldi, Li Du, Tim Vieira, Mrinmaya Sachan, Ryan Cotterell
Cite
URL
On the Usefulness of Embeddings, Clusters and Strings for Text Generation Evaluation
A good automatic evaluation metric for language generation ideally correlates highly with human judgements of text quality. Yet, there …
Tiago Pimentel, Clara Meister, Ryan Cotterell
Cite
URL
Mutual Information and Hallucinations in Abstractive Summarization
Despite significant progress in the quality of language generated from abstractive summarization models, these models still exhibit the …
Liam van der Poel, Ryan Cotterell, Clara Meister
PDF
Cite
URL
On the probability–quality paradox in language generation
Clara Isabel Meister, Gian Wiher, Tiago Pimentel, Ryan Cotterell
PDF
Cite
Estimating the Entropy of Linguistic Distributions
Aryaman Arora, Clara Isabel Meister, Ryan Cotterell
PDF
Cite
Analyzing Wrap-Up Effects through an Information-Theoretic Lens
Clara Meister, Tiago Pimentel, Thomas Hikaru Clark, Ryan Cotterell, Roger P. Levy
PDF
Cite
On Decoding Strategies for Neural Text Generators
When generating text from probabilistic models, the chosen decoding strategy has a profound effect on the resulting text. Yet …
Gian Wiher, Clara Meister, Ryan Cotterell
PDF
Cite
URL
Naturalistic Causal Probing for Morpho-Syntax
Afra Amini, Tiago Pimentel, Clara Meister, Ryan Cotterell
PDF
Cite
Locally Typical Sampling
Today’s probabilistic language generators fall short when it comes to producing coherent and fluent text, despite the fact that …
Clara Meister, Tiago Pimentel, Gian Wiher, Ryan Cotterell
PDF
Cite
Cluster-based Evaluation of Automatically Generated Text
While probabilistic language generators have improved dramatically over the last few years, the automatic evaluation metrics used to …
Tiago Pimentel\*, Clara Meister\*, Ryan Cotterell
PDF
Cite
URL
Revisiting the Uniform Information Density Hypothesis
Clara Meister, Tiago Pimentel, Patrick Haller, Lena Jäger, Ryan Cotterell, Roger Levy
PDF
Cite
Phone-level Uniform Information Density across and within Languages
Tiago Pimentel, Clara Meister, Elizabeth Salesky, Simone Teufel, Damián Blasi, Ryan Cotterell
PDF
Cite
On Homophony and Rényi Entropy
Tiago Pimentel, Clara Meister, Simone Teufel, Ryan Cotterell
PDF
Cite
Keyword2Text: A Plug-and-Play Method for Controlled Text Generation
Damian Pascual, Beni Egressy, Clara Meister, Ryan Cotterell, Roger Wattenhofer
PDF
Cite
Conditional Poisson Stochastic Beams
Clara Meister, Afra Amini, Tim Vieira, Ryan Cotterell
PDF
Cite
Language Model Evaluation Beyond Perplexity
We propose an alternate approach to quantifying how well language models learn natural language: we ask how well they match the …
Clara Meister, Ryan Cotterell
PDF
Cite
Is Sparse Attention more Interpretable?
Sparse attention has been claimed to increase model interpretability under the assumption that it highlights influential inputs. Yet …
Clara Meister, Stefan Lazov, Isabelle Augenstein, Ryan Cotterell
PDF
Cite
Determinantal Beam Search
Beam search is today’s go-to strategy for decoding neural sequence models. The algorithm can naturally be viewed as a subset …
Clara Meister, Martina Forster, Ryan Cotterell
PDF
Cite
A Cognitive Regularizer for Language Modeling
The uniform information density (UID) hypothesis, which posits that speakers prefer utterances that distribute information uniformly …
Jason Wei, Clara Meister, Ryan Cotterell
PDF
Cite
Testing Machine Translation via Referential Transparency
Machine translation software has seen rapid progress in recent years due to the advancement of deep neural networks. People routinely …
Pinjia He, Clara Meister, Zhendong Su
Cite
Source Document
Searching for Search Errors in Neural Morphological Inflection
Neural sequence-to-sequence models are currently the predominant choice for language generation tasks. Yet, on word-level tasks, exact …
Martina Forster, Clara Meister, Ryan Cotterell
PDF
Cite
Poster
If Beam Search is the Answer, What was the Question?
Clara Meister, Tim Vieira, Ryan Cotterell
PDF
Cite
Slides
Structure-Invariant Testing for Machine Translation
Machine translation software has increasingly been integrated into our daily lives. People routinely use machine translation for …
Pinjia He, Clara Meister, Zhendong Su
PDF
Cite
Machine Translation Testing via Pathological Invariance
Machine translation software has become heavily integrated into our daily lives due to the recent improvement in the performance of …
Shahij Gupta, Pinjia He, Clara Meister, Zhendong Su
PDF
Cite
Best-First Beam Search
Decoding for many NLP tasks requires a heuristic algorithm for approximating exact search since the full search space is often …
Clara Meister, Tim Vieira, Ryan Cotterell
PDF
Cite
Slides
Generalized Entropy Regularization or: There's Nothing Special about Label Smoothing
Prior work has explored directly regularizing the output distributions of probabilistic models to alleviate peaky (i.e. …
Clara Meister, Elizabeth Salesky, Ryan Cotterell
PDF
Cite
Slides