The New York Times and Daily News have initiated legal action against OpenAI and its backer Microsoft, alleging that their copyrighted content was used to train ChatGPT. It has now come to light that OpenAI engineers inadvertently erased critical research related to the training data last week.
Lawyers for NY Times had potential evidence against OpenAI deleted
Kyle Wiggers of TechCrunch reports:
Earlier this autumn, OpenAI consented to furnish two virtual machines for the attorneys representing The Times and Daily News to search for their copyrighted materials within its AI training datasets. In a letter, the publishers’ lawyers indicated that they and their hired experts have dedicated over 150 hours since November 1 to comb through OpenAI’s training data.
However, on November 14, OpenAI personnel deleted all search data belonging to the publishers that was saved on one of the virtual machines, as stated in the letter filed late Wednesday in the U.S. District Court for the Southern District of New York.
This letter is now accessible online for public viewing.
It appears that after dedicating considerable hours to gather data from ChatGPT’s training set, the NY Times lawyers faced a setback when their research was wiped by OpenAI.
According to the letter, OpenAI managed to retrieve a significant portion of the data, but only in a form that is not permissible as evidence in court. As a result, it cannot be used against OpenAI in the ongoing case, and the costly, time-intensive process will have to start all over again.
Insights from DMN
The specific training datasets used by many AI companies remain largely obscure. Not all publishers can afford to engage in legal battles against tech giants, which makes it all the more damaging, and decidedly unprofessional, when OpenAI engineers inadvertently erase the research of those who do.
What are your thoughts on this situation? Share your opinions in the comments.