Jannis Born, Tien Huynh, et al.
NeurIPS 2021
Natural language text generation has improved markedly with the advent of pre-trained language models. Using such models to predict personal data entities in place of redacted spans in text could help generate synthetic datasets. To address privacy and ethical concerns with such datasets, we need to ensure that the masked-entity predictions are fair and controlled by application-specific constraints. We introduce new ways to inject hard constraints and knowledge into language models that address these concerns and also improve performance on this task.
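The core idea of a hard constraint on masked-entity prediction can be illustrated with a minimal sketch. This is not the paper's actual method; it assumes a model has already scored candidate fill-ins for a redacted span, and shows how an application-specific allowed set acts as a hard filter before selection.

```python
def constrained_fill(candidates, allowed):
    """Pick the highest-scoring candidate that satisfies a hard constraint.

    candidates: dict mapping candidate entity -> model score (assumed given).
    allowed: set of entities permitted by the application constraint.
    Returns None when no candidate satisfies the constraint.
    """
    feasible = {e: s for e, s in candidates.items() if e in allowed}
    if not feasible:
        return None  # constraint is unsatisfiable for this span
    return max(feasible, key=feasible.get)

# Hypothetical example: a redacted [NAME] span where the model prefers
# "Alice", but the constraint admits only a restricted placeholder set.
scores = {"Alice": 0.62, "Taylor": 0.25, "Bob": 0.13}
allowed = {"Taylor", "Jordan"}
print(constrained_fill(scores, allowed))  # -> Taylor
```

In practice the candidate scores would come from a pre-trained language model's fill-mask distribution; the point here is only that a hard constraint prunes the candidate set outright rather than re-weighting it.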
Elita Lobo, Yash Chandak, et al.
NeurIPS 2021
Jehanzeb Mirza, Leonid Karlinsky, et al.
NeurIPS 2023
Megh Thakkar, Quentin Fournier, et al.
ACL 2025