Generative AI's hunger for data and the legal issues it raises
The rapid evolution of generative AI is posing numerous legal challenges, chief among them how to classify this new tool from a legal perspective. Countless issues can arise from the use of artificial intelligence, especially when that use is not explicitly disclosed in the creation of new content.
One question is who actually created the work: the human author who instructed the AI to create it, or the AI itself, which operated exactly according to the instructions it received from humans each time.
Another extremely complex aspect is the source material from which AI draws to learn and train itself. The complexity is compounded by the lack of global harmonization of the law, even though AI is used worldwide. It is known that some database owners have explicitly denied access to certain AI systems to prevent them from incorporating their data and using it as an original product without any limits or protection for third parties. It is also well known that some authors, particularly those active in art and literature, have recognized their own work in the responses and derivative products generated by AI, which had ingested and reworked some of their works without being truly innovative. This often involves copying a literary style, a type of expression, or a unique human characteristic considered sufficiently distinctive to be protected.
This is the case with comedian and author Sarah Silverman, who sued OpenAI and Meta, alleging that they deliberately copied her memoir "The Bedwetter" by way of online libraries such as Bibliotik, Library Genesis, Z-Library, and similar sources that use a torrent system to accumulate readable books and make them available without payment and without respecting any intellectual property rights. The dispute comes at a particularly important moment for content creators, who are also at the center of a months-long protest over fair compensation for their creative efforts, a protest that has literally paralyzed Hollywood and many content streaming platforms. According to the complaint, Silverman's work was copied without credit, without the author's explicit consent, and without any form of compensation.
Basic errors or outright plagiarism pose immense risks for generative AI. What are the trends among companies in using such tools? It is not yet clear how the data entered into these tools is used, so companies would do well to limit their use of AI tools until they fully understand the consequences.
Some companies have expressly prohibited entering excerpts of contractual agreements, confidential documents, or other specific, identifying information about the company into these systems, out of fear of unintentionally disclosing their contents. Why would someone enter an agreement into an AI system? For example, to translate it quickly or to extract specific data (contractual terms, risk analysis, or other information). In doing so, however, they inevitably share content that must remain confidential with the company that operates the AI.
There are also cases, however, where skeptical database owners who had explicitly banned generative AI from their content reconsidered their stance just a few months later, entering into highly advantageous agreements with the very companies they had until recently described as potential disruptors of their assets. This is the case with Getty Images, which initially banned AI from its systems but has recently concluded an agreement with NVIDIA to jointly create an image generator based on its own data (and photographs). Getty has thus decided to compete with Shutterstock, which had entered into a similar partnership with OpenAI. At this point, it will become essential to understand the ethical and commercial rules by which an image can be considered innovative enough to be free from previous constraints and rights.
Until the rules on the dissemination and use of such content become clearer, it would be wise to be extremely cautious with sensitive content, since careless use could put far more at risk than necessary, in terms of both intellectual property rights and confidential or even sensitive information.