DPIIT wants AI to pay

"AI is like electricity. Just as electricity transformed every major industry a century ago, AI is now poised to do the same." - Andrew Ng, AI Entrepreneur and Computer Scientist

Public data is fuel

Large Language Models (LLMs) are advancing rapidly due to continuous improvements in machine learning and extensive access to vast amounts of internet data. AI firms argue that information freely available online should be usable for training models. However, copyright law dictates that using this material, even for training, should be subject to a license fee and explicit consent from the original content producer.

Core conflict

This fundamental disagreement has spurred a fierce debate between powerful AI hyperscalers and content producers, including news agencies, book publishers, and entertainment companies. Against this backdrop, the Department for Promotion of Industry and Internal Trade (DPIIT) has released a working paper. The proposal is seen as a welcome step towards a solution that supports content providers without putting India's domestic AI ecosystem at a disadvantage.

India’s new approach

The DPIIT paper suggests establishing a mandatory framework for data scraping. Under this system, AI developers could scrape publicly available information, while a non-profit copyright society would collect payments from them. The payments would be based on the revenue developers earn from models trained on Indian content producers' data.

Mandated scraping is problematic

Some view the reasoning behind the proposed system as partially flawed: it rests on the belief that data processing is an inherent right for AI models, and it overrides existing copyright law. The approach could also hurt small publishers, who put significant effort into their work but would receive minimal royalties, while large media houses could profit substantially from the new structure.
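
To see why a revenue-linked pool could leave small publishers with very little, consider a rough sketch. Everything in it is an assumption made for illustration: the article does not state a royalty rate or a distribution formula, so the 1% rate, the pro rata split by usage, the function name `split_royalty_pool`, and the publisher names and figures are all hypothetical.

```python
from fractions import Fraction


def split_royalty_pool(ai_revenue, royalty_rate, usage_by_publisher):
    """Split a revenue-linked royalty pool pro rata by usage (hypothetical rule)."""
    # The pool is a fixed fraction of the AI developer's revenue.
    pool = ai_revenue * royalty_rate
    total_usage = sum(usage_by_publisher.values())
    if total_usage == 0:
        # Nothing was used, so nobody is paid.
        return {name: Fraction(0) for name in usage_by_publisher}
    # Each publisher receives the pool in proportion to its share of usage.
    return {
        name: pool * Fraction(usage, total_usage)
        for name, usage in usage_by_publisher.items()
    }


# Hypothetical inputs: revenue in rupees, usage in arbitrary integer units.
payouts = split_royalty_pool(
    ai_revenue=1_000_000,
    royalty_rate=Fraction(1, 100),  # assumed 1% rate, not taken from the paper
    usage_by_publisher={"large_media_house": 9_900, "small_publisher": 100},
)
for name, amount in payouts.items():
    print(name, float(amount))
```

With these made-up numbers the small publisher receives 1% of the pool. Any real outcome would depend on the rate and the distribution rule the copyright society actually adopts.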

Remuneration?

A uniform system for remuneration is urgently needed to prevent continuous litigation between publishers and AI firms. The government needs to seize the momentum provided by the working paper and the accompanying dissent from the tech industry to ensure that a collaborative, carefully designed framework is enacted. A poorly designed system could seriously disrupt the media and internet landscape.

Summary

India's DPIIT has proposed a mandatory data scraping framework to ensure AI companies pay royalties to content creators for using their data. While a welcome step, critics argue the system is flawed by overriding existing copyright law and unfairly disadvantaging small publishers, highlighting the urgent need for a fair, collaborative remuneration model.

Food for thought

If the development of large language models relies entirely on the existing corpus of human-generated work, what exactly constitutes innovation in AI and where should the line be drawn between learning and theft?

AI concept to learn: Copyright Debate in AI Training

The copyright debate in AI training centers on whether using copyrighted text, images, audio, and video to train large models constitutes fair use or unauthorized reproduction. Creators argue that AI companies benefit from their work without permission or compensation, while developers claim that training uses data statistically, not as stored copies. Courts worldwide are examining issues like transformative use, market harm, and transparency obligations. Some propose licensing, compensation pools, or opt-out mechanisms. As generative AI expands, balancing innovation with creators’ rights has become one of the most important legal and ethical challenges in the AI ecosystem.

[The Billion Hopes Research Team shares the latest AI updates for learning and awareness. Various sources are used. All copyrights acknowledged. This is not professional, financial, personal or medical advice. Please consult domain experts before making decisions. Feedback welcome!]
