
“You are Beautiful, Body Image Stereotypes are Ugly!” BIStereo: A Benchmark to Measure Body Image Stereotypes in Language Models

Abstract

While a few high-quality bias benchmark datasets exist to address stereotypes in Language Models (LMs), body image stereotypes remain largely overlooked. To bridge this gap, we propose \suite{}, a suite to uncover an LM's biases towards people with certain physical appearance characteristics. \suite{} encompasses five dimensions of body image, namely \textit{skin complexion, body shape, height, attire,} and a \textit{miscellaneous category} covering factors such as \textit{hair texture and eye color}. Our dataset contains 14k sentence triplets designed to assess an LM's preference for certain body types. We also examine the sentiment LMs associate with sentences containing stereotypically desirable and undesirable body image descriptors, and we propose a metric that quantifies an LM's preference for certain body types over others. Additionally, we generate 472 tuples comprising a \textit{body image descriptor, gender, and a stereotypical attribute}, vetted by a diverse pool of annotators for the presence of physical appearance stereotypes. Using \suite{}, we assess the presence of body image biases in ten different language models, revealing significant biases in models such as Muril, Bernice, and XLMR towards certain body types among men and women. We further evaluate the LMs on downstream NLI and Analogy tasks aimed at uncovering stereotypical associations related to physical appearance. Our NLI experiments highlight notable patterns in the LMs that align with the well-documented human cognitive bias known as \textbf{\textit{the Halo Effect}}.
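As a rough illustration of the triplet-based preference evaluation described above, the sketch below scores the variants of a sentence triplet with a masked LM and reports which variant the model prefers. This is a minimal sketch, not the authors' released code: the model name, the example triplet, and the use of pseudo-log-likelihood as the preference score are all illustrative assumptions.

```python
# Sketch: scoring a BIStereo-style sentence triplet with a masked LM,
# assuming preference is read off pseudo-log-likelihood (an assumption,
# not necessarily the paper's metric).
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

MODEL_NAME = "xlm-roberta-base"  # assumption: any masked LM under evaluation
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForMaskedLM.from_pretrained(MODEL_NAME)
model.eval()

def pseudo_log_likelihood(sentence: str) -> float:
    """Sum of log-probabilities of each token when it is masked one at a time."""
    input_ids = tokenizer(sentence, return_tensors="pt")["input_ids"][0]
    total = 0.0
    for i in range(1, input_ids.size(0) - 1):  # skip special tokens
        masked = input_ids.clone()
        masked[i] = tokenizer.mask_token_id
        with torch.no_grad():
            logits = model(masked.unsqueeze(0)).logits[0, i]
        total += torch.log_softmax(logits, dim=-1)[input_ids[i]].item()
    return total

# Hypothetical triplet differing only in the body image descriptor.
triplet = [
    "She is a tall woman who leads the team.",
    "She is a short woman who leads the team.",
    "She is an average-height woman who leads the team.",
]
scores = {s: pseudo_log_likelihood(s) for s in triplet}
print(scores)
print("Model-preferred variant:", max(scores, key=scores.get))
```

Aggregating such per-triplet preferences across the 14k triplets would yield a dataset-level preference statistic; how BIStereo turns these scores into its bias metric is defined in the paper itself.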

Related