A platform enabling creators to monetize their content for use in LLM training
Wong said that Avail's Corpus tool challenges statements made by Mustafa Suleyman, the chief executive officer of Microsoft AI, during an interview at the recent Aspen Ideas Festival. “In trying to clarify what type of content is safeguarded by publishers, he went on to express: ‘Considering content that is already present on the open web, the consensus since the 1990s is that it falls under fair use. Individuals can duplicate it, recreate it, or replicate it. It has been considered freeware, so to speak; that has been the mutual understanding.’”
Had a tool like Corpus been available on the internet in the 1990s, Wong said, “I believe content creators would have been duly recognized and remunerated for their content. Currently, discussions are ongoing to determine whether copyright information for LLM training should be classified as fair use; however, the instant access to data should be acknowledged as valuable to both users and vendors, and this content should not be deemed as freeware.”
He added that the US Copyright Office has not stopped “LLM vendors from utilizing copyrighted data for training their models. Typically, the vendors claim that the use of copyrighted data falls under the legal principle of fair use, allowing individuals/companies to utilize limited portions of the work for non-commercial, educational, or transformative purposes.”
