The success story isn't just about Silicon Valley geniuses in ponytails and hipster sandals; something else lies behind the enthusiasm for new technologies. TIME magazine has revealed that Kenyan contract workers earning less than two dollars an hour are behind ChatGPT's artificial intelligence. It is slavery 3.0, the kind that lets tech companies rake in billions of dollars.
OpenAI's ChatGPT is no exception. The genius machine that writes like a human works thanks to taggers, the new face of dirty work: invisible workers who catalog rape, pedophilia, suicide, violence, incest and hatred for nine hours a day to cleanse artificial intelligence of all the evil in the world.
To clean up the chatbot, OpenAI turned to Sama, a company whose African workforce tags violent content online. According to TIME magazine, Sama has employed more than 50,000 people to fulfill its contracts.
How ChatGPT was built
ChatGPT was brilliant from the start, but there was a problem. Mixed in with its flashes of genius were violent, sexist and racist remarks. The model had been trained on hundreds of billions of words taken from the web: that is why it was so good, and for the same reason it wrote words like "vaff***ulo," "neg*o" and so on. Before showing it to the world, the AI had to be "filtered," and to do that an additional safety mechanism, itself based on artificial intelligence, was needed.
OpenAI therefore borrowed Facebook's playbook, the company having tackled the problem earlier. The solution is conceptually simple: to teach an AI what to censor, you feed it labeled examples of violence, hate speech and sexual abuse. So, in November 2021, OpenAI sent thousands of text fragments to an outsourcing company in Kenya, Sama, which suddenly found itself facing all the evils of the internet: pedophilia, murder, suicide, torture, self-harm and incest.
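The mechanism can be sketched in miniature. The toy classifier below is an illustration only, not OpenAI's actual system: it learns to separate "flag" from "ok" text purely from human-labeled examples, using a naive Bayes word model. The sample data, labels and function names are all hypothetical.

```python
# Toy sketch of a safety filter trained on labeled examples
# (illustrative only; real systems are far larger and more complex).
import math
from collections import Counter

def train(examples):
    """examples: list of (text, label) pairs; returns per-label word counts."""
    counts = {}
    for text, label in examples:
        counts.setdefault(label, Counter()).update(text.lower().split())
    return counts

def classify(counts, text):
    """Pick the label whose word distribution best explains the text."""
    best_label, best_score = None, -math.inf
    for label, words in counts.items():
        total = sum(words.values())
        vocab = len(words)
        # Log-likelihood with add-one smoothing for unseen words.
        score = sum(
            math.log((words[w] + 1) / (total + vocab))
            for w in text.lower().split()
        )
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical labeled data; the real datasets contained the
# disturbing material the Kenyan taggers had to read.
data = [
    ("i will hurt you", "flag"),
    ("you deserve to die", "flag"),
    ("have a nice day", "ok"),
    ("thanks for the help", "ok"),
]
model = train(data)
print(classify(model, "i will hurt you badly"))  # prints "flag"
```

The point of the sketch is that the model learns nothing on its own: every "flag" it produces traces back to a human who had to read and label an example first.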
Why it’s hard to moderate AI
This is not the first time for Sama: it has already worked with Google, Meta and Microsoft. Officially it is an "ethical AI" company that has helped lift more than 50,000 people out of poverty in Kenya, Uganda and India. In reality, inside Sama, workers are paid between $1.32 and $2 an hour to label horrific content.
TIME magazine spoke with Sama employees who worked on the project. One employee, who read and tagged text for OpenAI, said he suffered from recurring visions after reading a graphic description of a man having sex with a dog in front of a child. "It's torture," he said. "You read statements like that all week."
"Our mission is to ensure that artificial general intelligence benefits all of humanity, and we work hard to build safe and useful AI systems that limit bias and harmful content," said an OpenAI spokesperson, who confirmed the partnership with Sama.
The work, then, is as necessary as it is cruel, and weighed down by labor exploitation. "Despite the critical role these data enrichment professionals play, a growing body of research reveals the precarious working conditions these workers face," says the Partnership on AI, a coalition of AI organizations to which OpenAI belongs. "This may be the result of efforts to hide AI's dependence on this large workforce while celebrating the efficiency of the technology."
How does Sama work?
OpenAI signed three contracts with Sama at the end of 2021, worth around $200,000. To keep pace, workers were divided into three teams by subject. Three employees said they had to read and tag between 150 and 250 text passages per nine-hour shift. Under the contracts, OpenAI was billed at an hourly rate of $12.50, yet the workers' wages at the end of the month came to around $170.
A labeler earns $1.32 an hour, rising to $1.44 if they exceed all their targets. The labelers responsible for checking the others' work manage to earn $2 an hour. This is possible because Kenya has no universal minimum wage.
An OpenAI spokesperson laid the blame on Sama, explaining that the company had set no productivity targets: "We take the mental health of our employees and of our contractors very seriously. Workers could decline any content without penalty, exposure to explicit material would be limited, and sensitive information would be handled by specially trained workers."
In February 2022, Sama started a new project for OpenAI: collecting sexual and violent images, some of them illegal, to deliver to the company. According to a billing document, Sama delivered a sample of 1,400 images to OpenAI. Some were classified "C4," OpenAI's internal label for child sexual abuse; others "C3," covering bestiality, rape and sexual slavery; and finally "V3," images with graphic details of death, violence or serious physical injury. OpenAI paid Sama $787.50 to collect the images, the document shows.
The problem came from the "C4" and "C3" images. Sama said in a statement that its agreement made no reference to illegal content; only after the work began did OpenAI send "additional instructions" referring to "some illegal categories," namely the C4 and C3 images involving child abuse and rape. "For this reason, Sama immediately ended the image classification pilot and gave notice that we would cancel all remaining work with OpenAI." Indeed, Sama delivered the last batch of labeled data in March, eight months before the contract was due to expire.
OpenAI confirmed that it received 1,400 images from Sama, which "included, but were not limited to, C4, C3, C2, V3, V2 and V1 images. We engaged Sama as part of our ongoing work to make AI systems safer. We never sought to collect illegal content, because it is not necessary for our filters, and we ask our employees to actively avoid it. There was a communication problem; we did not open or view the content in question, so we cannot confirm whether it contained images in the C4 category."
Sama nevertheless decided to terminate all its contracts with OpenAI, and in February 2022 it called the employees together to explain the decision. Most workers were moved to other, lower-paying workflows labeling content that was not explicit; others lost their jobs.
An unsolved problem
On January 10, Sama went further, announcing that it would drop all remaining work involving sensitive content. The company said it would not renew its $3.9 million content-moderation contract with Facebook, eliminating 200 jobs at its Nairobi office. "After numerous discussions with our global team, Sama made the strategic decision to exit all natural language processing and content moderation work to focus on computer vision data annotation solutions," the company explained in a statement.
The result of all this is that thousands of people, already traumatized by the violent content, have lost their jobs; to support their families, as one employee explained, even spending hour after unpaid hour wading through the worst of the web was better than nothing. And while Sama is pulling out, the need for data labeling for AI systems remains. "They're impressive, but ChatGPT and other generative models are not magic. They rely on massive supply chains of human labor and scraped data, much of which is unattributed and used without consent," AI ethicist Andrew Strait recently wrote. So if it is not these workers, it will be others next.