OpenAI releases webcrawler GPTBot, how to block it

OpenAI has launched a web crawler, GPTBot, to improve its artificial intelligence models.

“Web pages crawled with the GPTBot user agent may potentially be used to improve future models and are filtered to remove sources that require paywall access, are known to gather personally identifiable information (PII) or have text that violates our policies,” the company said in a post on its website. 

“Allowing GPTBot to access your site can help AI models become more accurate and improve their general capabilities and safety,” OpenAI wrote. 

A web crawler is a type of bot. 

Web crawlers are usually operated by search engines, which index the content of websites so that the sites can appear in search results, according to internet company Cloudflare.

They are called “web crawlers” because crawling is the term for automatically accessing a website and obtaining data using software.

OpenAI also provided instructions on disallowing the GPTBot from accessing a website – either partially or fully. 

Websites can block the crawler’s IP addresses or add a rule for GPTBot to the site’s robots.txt file, which tells web crawlers which parts of a site they may access.

“To allow GPTBot to access only parts of your site you can add the GPTBot token to your site’s robots.txt,” OpenAI explained.
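
As a rough illustration of how such rules behave, the sketch below feeds two small robots.txt bodies to the parser in Python's standard library and asks whether a crawler identifying itself as GPTBot may fetch a given path. The directory names and the helper name `gptbot_allowed` are placeholders chosen for this example, not taken from any real site, and how GPTBot itself interprets a robots.txt file is ultimately up to OpenAI.

```python
from urllib.robotparser import RobotFileParser

# Rules that keep GPTBot out of the entire site.
FULL_BLOCK = """\
User-agent: GPTBot
Disallow: /
"""

# Rules that open one directory to GPTBot and close another.
# The directory names are placeholders for illustration only.
PARTIAL_ACCESS = """\
User-agent: GPTBot
Allow: /directory-1/
Disallow: /directory-2/
"""


def gptbot_allowed(robots_txt: str, path: str) -> bool:
    """Return True if the given rules let the GPTBot user agent fetch `path`.

    This reflects how Python's standard-library parser reads the rules,
    which is an assumption about, not a guarantee of, the crawler's behavior.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("GPTBot", path)


if __name__ == "__main__":
    for path in ("/", "/directory-1/page.html", "/directory-2/page.html"):
        print(path, gptbot_allowed(FULL_BLOCK, path), gptbot_allowed(PARTIAL_ACCESS, path))
```

Under these assumptions, the first set of rules shuts GPTBot out of the whole site, while the second leaves /directory-1/ open and closes /directory-2/. Neither set mentions other crawlers, so their access is unchanged.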

“For OpenAI’s crawler, calls to websites will be made from the IP address block documented on the OpenAI website,” OpenAI concluded. 
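
For site operators who prefer to block by address, the check comes down to testing whether a request's IP falls inside a published range. The sketch below shows one way to do that with Python's standard `ipaddress` module. The range used here, 192.0.2.0/24, is a placeholder reserved for documentation and is not an OpenAI address block; the real ranges would have to be taken from OpenAI's website. The function and variable names are likewise hypothetical.

```python
import ipaddress

# Placeholder range reserved for documentation (RFC 5737).
# It is NOT an OpenAI range; substitute the blocks OpenAI publishes.
PLACEHOLDER_CRAWLER_RANGES = ["192.0.2.0/24"]


def is_from_crawler_range(ip: str, ranges=PLACEHOLDER_CRAWLER_RANGES) -> bool:
    """Return True if `ip` falls inside any of the given CIDR ranges."""
    address = ipaddress.ip_address(ip)
    return any(address in ipaddress.ip_network(cidr) for cidr in ranges)


if __name__ == "__main__":
    print(is_from_crawler_range("192.0.2.10"))    # inside the placeholder range
    print(is_from_crawler_range("198.51.100.7"))  # outside it
```

A server or firewall could apply the same test to incoming requests and refuse those that match, although in practice this kind of blocking is usually configured in the web server or firewall itself rather than in application code.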

Notably, AI companies, including OpenAI, previously signed an agreement with the White House to develop a watermarking system to let internet users know if something was generated by AI. However, the organizations have not pledged to stop using internet data for training.
