Artificial intelligence is reshaping industries, and with it a societal battle has formed over the basic building block of AI itself: data. At the core of this conflict is the question of whether AI models should be permitted to learn from copyrighted text without consent. OpenAI, a major player in the AI race, firmly believes they should, and has sought legislative support for its proposal to treat AI training as “fair use”.

With tech firms and content creators locked in a debate over intellectual property rights, the outcome may determine the direction of AI innovation in the U.S. and beyond. OpenAI’s request that the federal government codify the ‘fair use’ doctrine for AI training is one of the boldest moves yet to shape the future of AI regulation. The proposal was put forward as part of the U.S. government’s ‘AI Action Plan’, initiated by the Trump Administration to redefine how the country approaches AI.

According to OpenAI, a copyright framework that lets AI models continue to learn from copyrighted material is essential to maintaining America’s leadership in AI development. OpenAI said,

“America has so many AI startups, attracts so much investment, and has made so many research breakthroughs largely because the fair use doctrine promotes AI development”

The company’s statement presents fair use as a driver of innovation, one that has allowed AI developers to train on large amounts of publicly available data.

OpenAI’s Stance on AI Training

OpenAI has repeatedly advocated a broad interpretation of fair use that would permit AI training. The company has often trained its models on huge datasets gathered from the web, apparently with little to no regard for the wishes of content owners, and this practice has generated considerable controversy.

Last year, OpenAI made essentially the same argument before the U.K. House of Lords, stating that

“limiting AI training to public domain content might yield an interesting experiment, but would not provide AI systems that meet the needs of today’s citizens”.

This position implies that OpenAI considers access to copyrighted material not merely beneficial but necessary for building AI that serves society’s present needs.

Content Owners’ Rejection & Xerox Controversy

Copyright holders have opposed OpenAI’s second push to codify fair use. Some content creators and organizations have filed lawsuits against OpenAI, alleging that their copyrighted material was used without consent. These lawsuits reveal a clash between the tech industry’s demand for free access to data and the rights of content creators who want to control how their work is used.

One of the earliest controversies over fair use followed the invention of the photocopier, and history is now repeating itself in the digital age. In the American Geophysical Union case, the spread of commercial xerographic photocopying machines prompted significant debate over fair use and copyright infringement: as libraries and institutions began making copies of materials, publishers grew concerned about possible losses in revenue.

When Xerox launched its first commercial photocopying machine, publishers and authors feared widespread copyright infringement. Decades later, the same debate continues, now over whether and how AI should be allowed to “read” and learn from copyrighted material.

At its heart, this is not simply a battle over copyright law or access to data. It is a fight over how to make progress while protecting those whom that progress may harm. For OpenAI and the rest of the AI industry, fair use is necessary to keep innovation alive. For creators, it threatens to open the floodgates to unregulated exploitation of their intellectual property. Lawmakers who take up the question risk either stalling the development of AI or narrowing the rights of artists, writers, and other creative professionals. The stakes are high, and the decisions made now will shape for years how AI is perceived, used, and governed in ethics and law.