
Regulations can’t stop artificial intelligence companies: they continue to collect data from the internet

  • June 22, 2024

With the rise of artificial intelligence, companies entering this field need vast amounts of data to develop their own tools. The first place that comes to mind to find this data is, of course, the Internet. However, not all data on the Internet can be used to train artificial intelligence. Websites indicate whether data may be collected from them with a file called robots.txt.
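To illustrate how a well-behaved crawler would consult such a file, here is a minimal sketch using Python's standard `urllib.robotparser` module. The robots.txt content, the crawler names `ExampleAIBot` and `OtherBot`, and the `example.com` URLs are all made up for illustration; they are not taken from any real site or from the companies mentioned in this article.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: blocks one (made-up) AI crawler entirely and
# keeps every other crawler out of /private/ only.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text directly, without any network request."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_fetch(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True if the rules allow this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    rules = build_parser(ROBOTS_TXT)
    for agent, url in [
        ("ExampleAIBot", "https://example.com/articles/1"),
        ("OtherBot", "https://example.com/articles/1"),
        ("OtherBot", "https://example.com/private/report"),
    ]:
        print(agent, url, may_fetch(rules, agent, url))
```

The check is purely advisory: nothing in the file technically prevents a crawler from skipping it and requesting the page anyway.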

According to Reuters, many artificial intelligence developers choose to bypass the directives in this file and collect data from these sites anyway. Although Perplexity, which bills itself as a “free artificial intelligence search engine,” is one of the companies drawing the most criticism in this regard, it is not alone in this practice.

OpenAI, Anthropic…

According to reports, many artificial intelligence developers bypass robots.txt files and continue to take content from sites. While no names were given in the report, it was learned that OpenAI and Anthropic were among these companies. It also turned out that a server used by Perplexity was not following these guidelines. Perplexity CEO Aravind Srinivas had previously said the company is “not in a position to first circumvent the protocol and then lie about it.”

The robots.txt protocol, for its part, has been in use since the 1990s and is not legally binding. Perhaps creating a new, stricter and more detailed protocol in this area would help solve the problem.

Source: Web Tekno
