Sony has conducted groundbreaking research calling for a more comprehensive approach to measuring skin color in AI systems. While efforts have been made to address biases based on the lightness or darkness of people’s skin tones, Sony’s research highlights the need to consider red and yellow hues as well. The aim is to create more diverse and representative AI systems.
For years, researchers have been highlighting biases in AI systems related to skin color. A notable 2018 study by Joy Buolamwini and Timnit Gebru showed that commercial facial analysis systems were markedly less accurate for darker-skinned women. In response, companies have been working to test the accuracy of their systems across a range of skin tones.
However, Sony argues that the existing measurement scales primarily focus on the lightness or darkness of skin tone. According to Alice Xiang, Sony’s global head of AI Ethics, if products are evaluated in this narrow way, many biases will go undetected and unmitigated. Sony’s research aims to replace the existing scales with a more multidimensional approach that takes into account biases against various skin hues, such as those of East Asians, South Asians, Hispanics, Middle Eastern individuals, and others who don’t neatly fit into the light-to-dark spectrum.
To demonstrate the impact of this broader measurement, Sony’s research found that common image datasets overrepresent people with lighter, redder skin tones and underrepresent those with darker, yellower ones, which can make AI systems trained on them less accurate. For example, Sony found that Twitter’s image cropper and two image-generating algorithms favored redder skin tones, and that people with redder hues were mistakenly classified as “more smiley.”
Sony’s proposed solution is an automated approach based on the existing CIELAB color standard, which would replace the manual categorization used in the Monk scale. The Monk Skin Tone Scale, named after its creator Ellis Monk, is intentionally limited to 10 skin tones, offering diversity without the inconsistencies that come with a larger number of categories.
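The article does not spell out Sony’s exact computation, but CIELAB itself decomposes color into lightness (L*), a red-green axis (a*), and a yellow-blue axis (b*), so both tone and hue can be read off automatically from pixels. The following is a rough, hypothetical sketch of what such an automated measurement might look like, not Sony’s actual method, assuming the scikit-image library and a pre-cropped patch of skin pixels:

```python
import numpy as np
from skimage import color  # pip install scikit-image

def skin_color_descriptors(rgb_patch):
    """Summarize a skin patch by perceptual lightness and hue angle in CIELAB.

    rgb_patch: H x W x 3 array of sRGB values in [0, 1] (a crop of skin pixels).
    Returns (mean lightness L*, hue angle in degrees).
    """
    lab = color.rgb2lab(rgb_patch)      # convert sRGB -> CIELAB
    l_star = lab[..., 0].mean()         # lightness: 0 (dark) .. 100 (light)
    a_star = lab[..., 1].mean()         # green (-) .. red (+)
    b_star = lab[..., 2].mean()         # blue (-) .. yellow (+)
    # Hue angle separates redder (closer to 0 deg) from yellower (closer to 90 deg) skin,
    # the dimension Sony argues the light-to-dark scales miss.
    hue_angle = np.degrees(np.arctan2(b_star, a_star))
    return l_star, hue_angle

# Example: a uniform patch of a warm, medium-light tone (hypothetical values)
patch = np.full((16, 16, 3), [0.80, 0.60, 0.45])
print(skin_color_descriptors(patch))
```

Because both numbers come straight from the pixel values, a dataset or model output could in principle be audited along lightness and hue together, rather than binned by hand into a fixed set of tones.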
Monk defends his scale, emphasizing that it does take undertones and hue into account. He explains that considerable research went into determining which undertones to prioritize at which points along the scale.
While Sony’s approach offers a more multifaceted measurement of skin color, some argue for the simplicity and cognitive ease of the Monk scale. Regardless, major AI players such as Google and Amazon have expressed their interest in Sony’s research and are reviewing the paper.
In conclusion, Sony’s call for a more comprehensive measurement of skin color in AI systems aims to identify and mitigate biases related to red and yellow hues. By adopting an automated approach based on the CIELAB color standard, Sony hopes to improve the accuracy and diversity of AI systems. This research holds promise for making AI more inclusive and representative of diverse skin tones.