Consider installing a firewall on your website, such as Wordfence or BBQ Firewall. Both can be run simultaneously without issues. With Wordfence, you have the ability to block IP addresses, domain names, user agents, and more. The paid version even allows you to block entire countries and manages some of the IP address blocking for you. Be cautious of spammers sending unwanted content, especially through forms on your site. It’s advisable to never open a message with an attachment.
Use Cloudflare and set up the WAF rules.
It’s surprising how little proactive blocking seems to happen. You’d think ISPs would want to help curb this, especially for new sites.
How do you monitor that kind of traffic?
It’s in the HTTP logs on your server. You can tell they are bots because they request things that don’t exist, and they request them back to back dozens of times.
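If you want to pull that pattern out of the logs yourself, here is a rough Python sketch. It assumes your server writes the usual Common/Combined Log Format (the Apache and nginx default); the function name `suspicious_ips` and the threshold of 20 are made up for illustration, so tune them to your own traffic.

```python
import re
from collections import Counter

# Matches the start of a Common/Combined Log Format line:
#   ip ident user [timestamp] "METHOD path PROTOCOL" status ...
# (Assumption: your server uses this format; adjust the pattern if not.)
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "(\S+) (\S+)[^"]*" (\d{3})')


def suspicious_ips(lines, min_404s=20):
    """Return {ip: count} for clients with at least min_404s 404 responses."""
    misses = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match.group(4) == "404":
            misses[match.group(1)] += 1
    return {ip: count for ip, count in misses.items() if count >= min_404s}
```

You would feed it the lines of your access log (the path depends on your setup) and then eyeball the result before blocking anything, since a broken link on a popular page can also produce a burst of 404s.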
Cloudflare WAF + Wordfence and problem is solved.
Because it’s a big problem and would take a lot of effort for them to do, and do well without also blocking legit services.
To you, it’s malicious traffic. But, to someone else, there might be legit traffic coming from those IP blocks. So, they can’t do blanket bans like that across their entire server pool. Also, you can see the entirety of the unencrypted request coming to your server, but your ISP cannot.
But, you can block IP ranges from your server with tools like ModSecurity and Fail2Ban. There are lists of troublesome IPs that you can subscribe to.
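To show what matching against one of those lists boils down to, here is a small Python sketch using the standard `ipaddress` module. This is not how ModSecurity or Fail2Ban are implemented or configured; `load_blocklist` and `is_blocked` are hypothetical helper names, and the sketch assumes the list is plain text with one IP or CIDR range per line and optional `#` comments.

```python
import ipaddress


def load_blocklist(cidr_lines):
    """Parse lines like '203.0.113.0/24  # note' into network objects."""
    networks = []
    for raw in cidr_lines:
        entry = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if entry:
            # strict=False tolerates entries whose host bits are set
            networks.append(ipaddress.ip_network(entry, strict=False))
    return networks


def is_blocked(ip, networks):
    """Return True if ip falls inside any listed network."""
    address = ipaddress.ip_address(ip)
    return any(address in network for network in networks)
```

A linear scan like this is fine for a few thousand ranges; for the really long lists you would want something smarter, or just let the firewall handle it.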
Because IP addresses, particularly for personal devices, aren’t fixed.
Maintaining those long and ever-changing lists of IP addresses takes up a lot of space and time. It’s much better to block based on behavior / heuristics than an IP that is going to change again next week.
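As a rough idea of what "blocking on behavior" can look like, here is a minimal sliding-window sketch in Python. It only looks at one signal (too many requests from one client in a short window), and the class name `BurstDetector` and the default limits are assumptions for illustration, not anything a particular WAF uses.

```python
from collections import defaultdict, deque


class BurstDetector:
    """Flag clients that send more than max_requests within window_seconds."""

    def __init__(self, max_requests=30, window_seconds=10.0):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # client -> recent request timestamps

    def record(self, client, now):
        """Record one request at time `now`; return True if over the limit."""
        hits = self._hits[client]
        hits.append(now)
        # Forget requests that have fallen out of the window.
        while hits and now - hits[0] > self.window_seconds:
            hits.popleft()
        return len(hits) > self.max_requests
```

Because nothing here is keyed to a fixed list, a bot that hops to a new address next week gets caught the same way, and a client that calms down stops being flagged once its old requests age out.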
The answers you’re getting here are correct. One other thing I want to add is that not all bots are bad bots, so you don’t necessarily want them all to be blocked automatically.
If you care about SEO, you probably don’t want GoogleBot as well as the many other search engine crawlers getting blocked.
If you care about social media link previews (e.g. page title + thumbnail pop up in posts/messages), you wouldn’t want to block bots from Slack, Discord, Facebook, X, or any number of social media platforms.
There are many other utilities as well. Our tool, Cloudtrellis, scans sites for broken links, which means having to validate every single link on every page we scan, even if that’s an external link. So if someone at example.com writes a blog post that links to yoursite.com/blog/some-post, we’re going to send a HEAD request to validate that it at least returns a 200. If you block us, you’re doing a disservice to the person using our tool that operates example.com since they may get a false positive that your site is returning an error, when it actually isn’t.
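For anyone curious what that kind of check looks like, here is a generic Python sketch of a HEAD-based link check using only the standard library. It is not Cloudtrellis’s actual code; `head_status` and its `opener` parameter (there so the function can be exercised without touching the network) are names invented for this example.

```python
import urllib.error
import urllib.request


def head_status(url, timeout=10.0, opener=urllib.request.urlopen):
    """Send a HEAD request; return the HTTP status code, or None on a network error."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses are raised as exceptions; the code is still useful.
        return err.code
    except urllib.error.URLError:
        # DNS failure, refused connection, timeout, and so on.
        return None
```

If a firewall answers that request with a 403, the checker has no way to tell a blocked bot from a genuinely broken page, which is where the false positive comes from.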
The point is you want bad bots to get blocked. There are also many good bots that you (probably) don’t want to get blocked.
@Pierce
It’s true there are both good and bad bots. From personal experience, figuring out how best to manage bot traffic on a website can be tricky. When I set up my site, I noticed a spike in traffic from bots trying to find vulnerabilities, and separating the beneficial from harmful ones took some trial and error. Instead of blocking all bots, I used tools like Cloudflare for managing malicious requests, while allowing search engine and social media bots that help with visibility.
Bots are using IP addresses from cloud providers’ pools like everyone else.
It’s really easy to get a new IP instantly, IP spoofing exists, and not every case is cut and dried. Blocking IP addresses doesn’t accomplish much against someone who knows what they’re doing, and the drawbacks could be large for non-technical users in the event of a false positive.
- Blocking is a separate service.
- Moreover, an address that was a bot yesterday may not be one today.
- Who said you don’t need that traffic?
There’s a list. You can download it and install it onto your server if you manage your own server. This is one of the best reasons for managing your own server.
That would result in a lot of legitimate traffic being blocked by mistake; many IPs are shared, and API calls often come in high-volume clusters of requests.
The real answer is that there are IP ranges assigned to mobile device carriers, so they can’t and won’t ever be banned.
ISPs usually don’t block known bot IP addresses because many of them use dynamic IPs, which makes tracking a bit tricky. Plus, some bots have legitimate uses, so blocking them could lead to legal issues. ISPs might also have limited resources to manage all that bot traffic. As a website owner, you can use security plugins, web application firewalls (WAF), and CAPTCHAs to help reduce unwanted bot activity.
Why use those ISPs then?