Training

Meta blocks Apple bots!

Meta has stopped Apple’s bots from using Instagram and Facebook data to train Apple’s AI models

Martin Crowley
August 30, 2024

Meta has blocked Apple’s web-crawler bots, Applebot and Applebot-Extended, from scraping data from its social media platforms, Instagram and Facebook, and using it to train Apple’s AI models.

Along with Meta, several major news publishers and other online platforms, including The New York Times, Craigslist, Tumblr, Financial Times, The Atlantic, USA Today, and Condé Nast, have also opted out of allowing Apple to use their data to advance its AI.

Applebot was first introduced in 2015 to scrape data from sites and improve Apple’s voice assistant, Siri, and its search feature, Spotlight. Three months ago, Apple launched Applebot-Extended, which was designed specifically to scrape data for AI training but gives publishers the option to opt out.

“With Applebot-Extended, web publishers can choose to opt out of their website content being used to train Apple’s foundation models powering generative AI features across Apple products, including Apple Intelligence, Services, and Developer Tools.” — Apple

To opt out, a publisher only has to update its site’s publicly accessible robots.txt file; Apple’s bots will then stop gathering data from the site for AI training.
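As a rough illustration of what such an opt-out might look like, the sketch below parses a hypothetical robots.txt with Python’s standard urllib.robotparser module and checks which bots it turns away. The sample rules, the example.com URL, and the is_allowed helper are assumptions made for illustration, not taken from Meta’s or any other publisher’s actual file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an example site (an assumption for illustration):
# it opts out of Applebot-Extended while leaving every other crawler alone.
ROBOTS_TXT = """\
User-agent: Applebot-Extended
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules let user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/news/story"  # placeholder URL
    for bot in ("Applebot-Extended", "Applebot"):
        verdict = "allowed" if is_allowed(ROBOTS_TXT, bot, page) else "blocked"
        print(f"{bot}: {verdict}")
```

With these sample rules, only the Applebot-Extended entry is disallowed, so the site could still appear in Siri and Spotlight results via the regular Applebot. Note that robots.txt is a request rather than an enforcement mechanism: it works only because crawlers choose to honor it.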

Three months after Apple released the tool, only 6-7% of high-traffic sites and 25% of news outlets have chosen to block these bots, meaning many either don’t mind Apple using their content to advance its AI or don’t know that the opt-out exists.