117
2day
6

Google Built Its Empire Scraping The Web. Now It’s Suing To Stop Others From Scraping Google

https://www.techdirt.com/2025/12/24/google-built-its-empire-scraping-the-web-now-its-suing-to-stop-others-from-scraping-google/
mesa - 2day

Google and OpenAI suck:

Google’s legal theory has another significant problem: the requirement that a TPM must “effectively control” access. Just last week, a court rejected Ziff Davis’s attempt to turn robots.txt into a 1201 violation when OpenAI allegedly ignored its crawling restrictions. The court’s reasoning is directly applicable here:

OpenAI slammed my small server into the ground until I put fail2ban on top. It was really bad, like thousands of requests per second bad.

18
apftwb @lemmy.world - 23hr

How does fail2ban prevent scraping? My understanding was that fail2ban works on failed login attempts.

1
mesa - 21hr

There are some premade scripts out there that make it do more. I have it hooked up to nginx and other such logs. It's commonly used for login attempts on web login portals too, not just ssh. It can work with any grep-able log file.

I just took two scripts other people have made, verified them on my mini PC, and set it loose. Within about 10 min it caught most scrapers and banned the IPs.
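
For anyone wondering what log-based banning boils down to, here is a rough Python sketch of the idea: read access-log lines, count hits per IP in a sliding time window, and flag any IP that goes over a threshold. This is not fail2ban's code or one of the scripts mentioned above. The function name `find_offenders`, the threshold, and the window are made up for illustration, and it assumes nginx's default "combined" log format. The actual banning (a firewall rule) is left out.

```python
import re
from collections import defaultdict, deque
from datetime import datetime, timedelta

# Start of an nginx "combined" access-log line: client IP, two fields, [timestamp].
LINE_RE = re.compile(r'^(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\]')


def find_offenders(lines, max_hits=100, window=timedelta(seconds=10)):
    """Return the set of IPs with more than max_hits requests inside any window.

    Hypothetical helper: the name and default limits are assumptions.
    """
    recent = defaultdict(deque)  # IP -> timestamps of its recent requests
    offenders = set()
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # skip lines that do not look like access-log entries
        ts = datetime.strptime(match.group("ts"), "%d/%b/%Y:%H:%M:%S %z")
        hits = recent[match.group("ip")]
        hits.append(ts)
        # Drop requests that have fallen out of the sliding window.
        while ts - hits[0] > window:
            hits.popleft()
        if len(hits) > max_hits:
            offenders.add(match.group("ip"))
    return offenders
```

You would feed it something like `open("/var/log/nginx/access.log")` (path assumed) and hand the resulting IPs to whatever does the blocking. fail2ban does the same kind of matching with regex filters and then runs a ban action for you.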

1
watson - 2day

Fuck Google

7