DPIIT wants AI to pay

"AI is like electricity. Just as electricity transformed every major industry a century ago, AI is now poised to do the same." - Andrew Ng, AI Entrepreneur and Computer Scientist

Public data is fuel

Large Language Models (LLMs) are advancing rapidly thanks to continuous improvements in machine learning and access to vast amounts of internet data. AI firms argue that information freely available online should be usable for training models. Content producers counter that, under copyright law, using this material, even for training, should be subject to a license fee and explicit consent from the original content producer.

Core conflict

This fundamental disagreement has spurred a fierce debate between powerful AI hyperscalers and content producers, including news agencies, book publishers, and entertainment companies. Against this backdrop, the Department for Promotion of Industry and Internal Trade (DPIIT) has released a working paper. The proposal is seen as a welcome step towards a solution that supports content providers without putting India's domestic AI ecosystem at a disadvantage.

India’s new approach

The DPIIT paper suggests establishing a mandatory framework for data scraping. Under this system, AI developers would be allowed to scrape publicly available information, while a non-profit copyright society would collect payments from them. The payments would be based on the revenue AI developers earn from models trained on Indian content producers' data.

Mandated scraping is problematic

Some view the reasoning behind the proposed system as partially flawed: it rests on the belief that data processing is an inherent right for AI models, and it overrides existing copyright law. The approach could also hurt small publishers, who put significant effort into their work but would receive minimal royalties, while large media houses could profit substantially from the new structure.

Remuneration?

A uniform system for remuneration is urgently needed to prevent continuous litigation between publishers and AI firms. The government needs to seize the momentum provided by the working paper and the accompanying dissent from the tech industry to ensure that a collaborative, well-designed framework is enacted. A poorly designed system could seriously disrupt the media and internet landscape.

Summary

India's DPIIT has proposed a mandatory data scraping framework to ensure AI companies pay royalties to content creators for using their data. While the proposal is a welcome step, critics argue that it is flawed because it overrides existing copyright law and unfairly disadvantages small publishers, which highlights the urgent need for a fair, collaborative remuneration model.

Food for thought

If the development of large language models relies entirely on the existing corpus of human-generated work, what exactly constitutes innovation in AI, and where should the line be drawn between learning and theft?

AI concept to learn: Copyright Debate in AI Training

The copyright debate in AI training centers on whether using copyrighted text, images, audio, and video to train large models constitutes fair use or unauthorized reproduction. Creators argue that AI companies benefit from their work without permission or compensation, while developers claim that training uses data statistically, not as stored copies. Courts worldwide are examining issues like transformative use, market harm, and transparency obligations. Some propose licensing, compensation pools, or opt-out mechanisms. As generative AI expands, balancing innovation with creators’ rights has become one of the most important legal and ethical challenges in the AI ecosystem.

[The Billion Hopes Research Team shares the latest AI updates for learning and awareness. Various sources are used. All copyrights acknowledged. This is not professional, financial, personal or medical advice. Please consult domain experts before making decisions. Feedback welcome!]
