Authors sue OpenAI for copyright infringement, claim ChatGPT unlawfully 'ingested' their books

Name: ChatGPT company facing lawsuit over data privacy
Uploaded: 2023-06-29T16:23:36-04:00
Duration: 4 min 13 s
Description: Digits co-founder and CEO Jeff Seibert says AI companies should be more ‘transparent’ on their data usage and says the government should start ‘developing guidelines’ on how AI should be trained.

Authors Paul Tremblay and Mona Awad filed a class-action complaint in California federal court alleging OpenAI broke copyright law by training its software to "ingest" their books without permission.

ChatGPT, a large language model, is "trained" by copying massive amounts of text and extracting expressive information from it to form a compilation of input material known as the "training dataset," according to the complaint filed in U.S. District Court in San Francisco.

The lawsuit says neither Tremblay nor Awad, both writers who live in Massachusetts, consented to the use of their copyrighted books as training material for ChatGPT. Nonetheless, "their copyrighted materials were ingested and used to train ChatGPT."

Tremblay owns registered copyrights in several books, including "The Cabin at the End of the World." Awad owns registered copyrights in several books, including "13 Ways of Looking at a Fat Girl" and "Bunny."

OpenAI is facing a new copyright infringement claim in San Francisco court. (Nikolas Kokovlis/NurPhoto via Getty Images / Getty Images)

"Indeed, when ChatGPT is prompted, ChatGPT generates summaries of Plaintiffs’ copyrighted works — something only possible if ChatGPT was trained on Plaintiffs’ copyrighted works," the 17-page complaint says. "Defendants, by and through the use of ChatGPT, benefit commercial and profit richly from the use of Plaintiffs’ and Class members’ copyrighted materials."

The complaint cites a June 2018 paper in which OpenAI revealed it trained its GPT-1 tool on BookCorpus, a collection of "over 7,000 unique unpublished books from a variety of genres, including Adventure, Fantasy, and Romance."

"OpenAI confirmed why a dataset of books was so valuable: ‘Crucially, it contains long stretches of contiguous text, which allows the generative model to learn to condition on long-range information.’ Hundreds of large language models have been trained on BookCorpus, including those made by OpenAI, Google, Amazon, and others," the complaint notes.

Author Paul Tremblay arrives for the world premiere of Universal Pictures' "Knock at the Cabin" at Jazz at Lincoln Center's Frederick P. Rose Hall in New York City Jan. 30, 2023. He is suing OpenAI for copyright infringement. (Angela Weiss/AFP via Getty Images / Getty Images)

Andres Guadamuz, a reader in intellectual property law at the University of Sussex, told The Guardian the complaint represents the first against OpenAI regarding copyright law.

Joseph Saveri and Matthew Butterick, attorneys representing the authors, told the newspaper that books are ideal for training large language models because they contain "high-quality, well-edited, long-form prose," essentially forming "the gold standard of idea storage for our species."

Authors filed a lawsuit against OpenAI for alleged copyright infringement. (CFOTO/Future Publishing via Getty Images / Getty Images)

"Defendants breached their duties by negligently, carelessly, and recklessly collecting, maintaining and controlling Plaintiffs’ and Class members’ Infringed Works and engineering, designing, maintaining and controlling systems — including ChatGPT — which are trained on Plaintiffs’ and Class members’ Infringed Works without their authorization," the complaint says.

The lawsuit seeks an award of statutory and other damages.

Fox News Digital reached out to OpenAI for comment Wednesday but did not immediately hear back.
