Anti-bot discussion on the ?i=1 parameter

Haha, true, but those words are magical and cause them to fix something immediately (as SSH on free hosting via JavaScript would be a giant security vulnerability for every other website out there).

3 Likes

You forgot to mention the biggest advantage of aes.js, which I think is the most interesting part for iFastNet:
it uses client resources for validation :slight_smile:

And the biggest flaw is that it puts ?i=1 in the URL… Apart from the bad looks, I always remember the users who said they couldn’t log into the WP admin section because they were trying to visit example.com?i=1/wp-admin/


I definitely think that after that initial layer it would be good to have a second layer as well,
and that is “rate limiting”.

Everything that passes the first layer and has a cookie stored has a free hand to do whatever it wants, and that’s not right.

In order to protect server bandwidth (and, in the end, electricity and other costs), which is surely in the interest of the hosting, it wouldn’t be bad to have some FW rules…
if some IP requests the same file/URL 10 times in 5 seconds → put it on ice for 10 seconds…
if it tries the same thing again 3 times, then a daily ban (blacklist, etc.).
I would especially emphasize PHP files here, because in addition to bandwidth they also consume CPU/RAM. (A rough sketch of such escalating rules follows below.)
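
Something like this, just to illustrate the idea (the thresholds are the example numbers from above, and a real implementation would of course live in the firewall/web server, not in application code):

```javascript
// Sketch of escalating per-IP rules: 10 requests to the same URL within 5 s
// triggers a 10 s timeout; three such strikes trigger a daily ban.
// All numbers here are the hypothetical values from the post.
const WINDOW_MS = 5000;        // look-back window
const MAX_HITS = 10;           // allowed requests per window for the same IP+URL
const TIMEOUT_MS = 10000;      // "on ice" period
const DAILY_BAN_MS = 86400000; // 24 hours
const MAX_STRIKES = 3;

const hits = new Map();        // "ip|url" -> array of recent timestamps
const strikes = new Map();     // "ip"     -> strike count
const bannedUntil = new Map(); // "ip"     -> timestamp until which the IP is banned

function allowRequest(ip, url, now = Date.now()) {
  if ((bannedUntil.get(ip) || 0) > now) return false; // still banned

  const key = `${ip}|${url}`;
  const recent = (hits.get(key) || []).filter(t => now - t < WINDOW_MS);
  recent.push(now);
  hits.set(key, recent);

  if (recent.length <= MAX_HITS) return true;

  // Too many hits on the same URL: short timeout first, then escalate to a daily ban.
  const count = (strikes.get(ip) || 0) + 1;
  strikes.set(ip, count);
  bannedUntil.set(ip, now + (count >= MAX_STRIKES ? DAILY_BAN_MS : TIMEOUT_MS));
  return false;
}
```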


In my case, Cloudflare is very good at protection, and I have configured it in detail;
it is basically my main hosting, because I use “cache all” and 99% of requests are served from their servers.
In most cases when someone here on hosting is suspended or runs out of daily hits,
it is due to PHP files: even when you are using CF, those are called directly (no-cache, no-store), so CF cannot help much the way it can with static files.
In such cases, you don’t even have to be attacked by a bot;
it is enough for someone to find the PHP file on your website that consumes the most resources and visit it 100+ times, and that disables you for 1 day because you get suspended.

And the rate limit would help here again;
or, if nothing else, record the “violent” IPs and send them to the user in the form of a list in the control panel, so they can see that it is not the hosting’s fault and block some IP/ASN manually if necessary.
That’s right, it’s an additional complication, additional cost and hassle to implement,
but I believe the time spent on it would be worth it, because on the other side, other costs in the form of bandwidth, electricity, equipment, hardware lifetime, etc. would be reduced.

The more opportunities you have to exclude bad traffic, the more resources remain for real users,
and this is then reflected in the quality of service, the speed of websites, and so on.

5 Likes

The free hosting platform already does this. Rate limiting is already in effect.

I don’t know if rate limiting is applied before or after the cookie check, but I’ve been told it exists, and that it actually blocks plenty of bad traffic.

It’s actually good that everyone is suggesting “you should do rate limiting”, because it means it’s so unobtrusive that people don’t even realize it exists. It does mean it’s less effective, of course. But given that we already see some sites, notably image galleries, hit Apache process limits because there are too many requests per page, I think tightening the limits risks having too much impact on legitimate visitors on some sites.

I really don’t understand how some code in a browser would give anyone shell access to the server. SSH access is blocked in the server firewall and running system commands is disabled in PHP. So even without any website security in place, no unauthorized user would be able to use SSH.

Checking against a third party service isn’t really a good idea for general web traffic. Doing a request like that easily adds several hundred milliseconds to a request. You can mitigate the impact with cache, but if AbuseIPDB has issues and their APIs are responding more slowly, you’d risk slowing all websites hosted here to a crawl. So in most cases, you’ll want to use such services only when doing specific requests, such as in a registration form.
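
To make the mitigation concrete, a cached lookup with a hard timeout and a fail-open fallback could look roughly like this (a sketch only; reputation.example is a placeholder endpoint, not AbuseIPDB’s actual API):

```javascript
// Minimal sketch of a cached IP reputation check (assumes Node 18+ with global fetch).
const cache = new Map();             // ip -> { bad: boolean, expires: timestamp }
const CACHE_TTL_MS = 60 * 60 * 1000; // reuse verdicts for an hour
const LOOKUP_TIMEOUT_MS = 250;       // never let the lookup stall the request for long

async function ipLooksBad(ip) {
  const hit = cache.get(ip);
  if (hit && hit.expires > Date.now()) return hit.bad;

  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), LOOKUP_TIMEOUT_MS);
  try {
    const res = await fetch(`https://reputation.example/check?ip=${ip}`, {
      signal: controller.signal,
    });
    const { bad } = await res.json();
    cache.set(ip, { bad, expires: Date.now() + CACHE_TTL_MS });
    return bad;
  } catch {
    // Fail open: if the blacklist service is slow or down, let the traffic through
    // instead of slowing every hosted website to a crawl.
    return false;
  } finally {
    clearTimeout(timer);
  }
}
```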

Also, checking against blacklists isn’t exactly a foolproof method. All you do is check against a list of known bad IPs that have been reported to that blacklist. But many attackers can easily pick up new IPs, either from VPS providers or dynamic consumer IPs. A blacklist check cannot account for new traffic very well.

A layer 3/4 DDoS attack will just saturate your network bandwidth, and this security system won’t do anything against it. But that’s not the only kind of DDoS attack.

I’ve also experienced a DDoS attack where the backend application got hit so hard that the number of PHP processes skyrocketed until all CPU power and RAM was used up and things broke down. I’ve also had people DDoS a completely static site and still manage to break it by saturating the connection limit on the load balancer. Both of these attacks used up all available resources on something other than the raw network bandwidth, and were able to cause damage just the same.

In both of these cases actually, the attacks punched through Cloudflare. Firewalls and rate limits were all in effect too, but the attackers were able to avoid those by sending seemingly normal requests from a very large number of IP addresses.

Finding the real IP is easy. When a network packet comes in, it has an IP address in it, which it must have for the network connection to work.

Cloudflare does this for free too. Those “separate headers” all have the same data too, they just have different header names because different backend applications accept forwarded IP addresses in different ways. Cloudflare HTTP request headers · Cloudflare Fundamentals docs

Not necessarily. I’ve seen DNS hosting providers that will let you add subdomains to their platform with no issues. We actually had a subdomain of infinityfree.net hosted with Amazon’s Route53 DNS service. I really don’t see any reason why Cloudflare couldn’t support this too. This is just a policy decision on their end, not something with any technical necessity.

6 Likes

Hi Oxy,

I think the rate limit should come before the bot challenge, as the challenge itself does not protect against DDoS. The resource consideration is only valid if the crawler actually attempts to run the JavaScript. In a real-world DDoS attack, bots do not care what the target server replies with; they simply keep sending requests without waiting for the reply, so the challenge does nothing to mitigate that. A rate limit, on the other hand, effectively protects against DDoS, but at the expense of some resources to keep track of the rate on a per-IP basis.

I also think the server has to put in a comparable amount of effort, since it verifies the cookie header to see if it matches the expected answer. In a DDoS scenario, that means the server issuing a large number of challenges and keeping storage ready for the expected answers to check later, at least at a per-IP level.

From my observation, the server would issue the same challenge over and over again if the header does not get returned. Since no cookie headers are involved in the process, this is not Session-based, but rather IP+time(+UA?)-based challenge generation.

As long as the i param does not get incremented when retrying, the i will always stay at 1.

Hi Admin,

So it’s an iFastNet implementation, I see. It really depends on how the rate is limited; it could be a traffic rate or a number of requests per time frame. From what I can see, it’s somewhere around 250 kbps rather than hits. To clear things up, that is the speed limit as well; I don’t have information on a hits meter though.

I do see other rate limiters on the market that count only the rate of failed requests. That way, hitting too many 403s and 404s gets a certain IP banned quickly while still enforcing a reasonable rate limit on legitimate visitors.
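
As a rough sketch of the idea (the thresholds here are made up, and this is not how any particular product implements it):

```javascript
// Count only failed responses (403/404) per IP, in the spirit of fail2ban.
const FAIL_WINDOW_MS = 60000; // 1-minute window
const MAX_FAILURES = 20;      // more than this many 4xx errors in the window -> ban
const failures = new Map();   // ip -> array of failure timestamps
const banned = new Set();

function recordResponse(ip, statusCode, now = Date.now()) {
  if (statusCode !== 403 && statusCode !== 404) return; // successful requests are never counted

  const recent = (failures.get(ip) || []).filter(t => now - t < FAIL_WINDOW_MS);
  recent.push(now);
  failures.set(ip, recent);

  // Legitimate visitors rarely pile up this many errors this quickly.
  if (recent.length > MAX_FAILURES) banned.add(ip);
}
```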

An SSH-like terminal can be emulated with PHP and some fancy coding, but I’m not sure there are ways to get past the existing setup and make system calls. As far as I know, it cannot be done via PHP directly; maybe some sketchy glitching tricks might work, but that’s all speculative and there’s no PoC or obviously viable way of doing it yet.

Yes, I highly agree. It isn’t foolproof, and not all web devs know about AbuseIPDB anyway, but it’s something that can be worked with. Calling their API on the spot is definitely not a good idea, but syncing their bad IP list can help a bit; it’s not hugely effective, but it’s something.

This is a botnet thing, and I can tell it’s a pattern of Pet**bot from your text :rofl: (though I can tell it’s not the only source). As long as they originate from hosting companies with a whitelist or huge IP ranges, Cloudflare does not want to risk mis-banning a lot of others who share the same IPs, so it just lets those through. Kinda not what we signed up for Cloudflare for, but still, we had no choice.

Cheers!

6 Likes

It doesn’t, actually. The cookie that’s sent to the server just contains the calculated cryptographic value, and the cookie has a validity of decades. For the cookie to be valid for that long, the server can’t realistically keep track of every visitor of every domain, along with which parameters have been presented to that visitor, just so it knows which visitor should respond with which value.

So the actual parameters being used are quite static. They are rotated regularly, but at any point in time the values are the same for all visitors. That means they only have to be set and calculated once, and the server doesn’t need to recalculate them for every request.
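
For the curious, one way such a stateless check could be implemented is to derive the expected cookie value from a rotating server-side secret, so verification is just recomputing and comparing, with no per-visitor state. This is purely illustrative; the actual value used by aes.js may be derived in a completely different way:

```javascript
// Illustrative only: expected cookie = HMAC of the current rotation period.
// The value is the same for every visitor during a rotation window, so the
// server only needs the secret and the clock to verify it.
const crypto = require('crypto');

const SECRET = 'server-side-secret';         // hypothetical rotating secret
const ROTATION_MS = 7 * 24 * 60 * 60 * 1000; // rotate the expected value weekly (made up)

function expectedCookie(now = Date.now()) {
  const period = Math.floor(now / ROTATION_MS);
  return crypto.createHmac('sha256', SECRET).update(String(period)).digest('hex');
}

function cookieIsValid(cookieValue) {
  // Accept the current and previous period so a rotation doesn't invalidate everyone at once.
  return cookieValue === expectedCookie() ||
         cookieValue === expectedCookie(Date.now() - ROTATION_MS);
}
```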

I really mean it’s an IP-based rate limit, like you would expect. I don’t know the exact limits, time frames and so on, and I don’t think iFastNet wants those to be known, but it’s not a bandwidth limit. The point is to block bad traffic, not just slow them down.

True, you can block by failed requests, but that requires a lot of per-site tuning. Counting 404s is easy, but if one website owner makes some mistakes and has a lot of broken URLs on their page, you don’t want all visitors to be blocked because of it. It’s not a good solution to apply system wide.

There are multiple levels of protection to prevent web shells from being set up on our hosting. We don’t want those things on our servers or for them to be used by anyone. So that’s not what this security system is intended to protect against.

5 Likes

I think it’s something else; it is probably made to protect the server from poorly written code (loops) or from websites where the server has to serve too many different files
in a short time. It basically behaves like bandwidth throttling (that’s probably what it’s for).

That’s different from something requesting the same file/URL several times in a short time.
The emphasis there is on repeated requests for the same file/URL within a short window, which suggests “bad” traffic where a bot constantly hits some URL, either because it is stupid or on purpose, and that should be rejected with some rate limiting… Even returning a 403 is better in this case than letting it call some PHP script that pulls in another 80+ resources every time it hits the same URL.

I don’t know what to say :exploding_head:
I think it would be good to get rid of that ?i=1 from the address,
and certainly to make sure that when someone reaches i=3 it doesn’t take them to Google, because it’s not right that twice a year iFastNet effectively DDoSes Google’s servers due to problems with aes.js. Besides, it confuses users who suddenly see a Google page about cookies instead of a message from their hosting…
I understand how great it is to send most of the bad traffic to Google and reduce the load on your own servers, which “should” be the ones responding to that bad traffic.

5 Likes

I only referred to it to indicate that legitimate websites can have high rates of traffic, and that having strict rate limits will probably impact legitimate sites.

Using EP limits for rate limiting to protect sites would be a laughably stupid solution, given that it penalizes the entire account if the usage is too high. Any rate limiting solution should keep out bad traffic and let good traffic through, which is exactly the kind of distinction a process limit doesn’t make at all.

Tracking individual URLs for rate limiting can be useful, but comes with a lot of caveats and sharp edges. It can capture too little or too much depending on how the site is built.

If a site is using a single URL route to expose different features, like /index.php?url=/posts/123/comments or /api.php?function=GetPostComments&postId=123, you can have legitimate traffic making a large number of requests to the same path. This can be mitigated by also checking the query string, but that would make it trivial to bypass the rate limits by just adding bogus query parameters.

At the same time, an attacker could very quickly try to scan the URLs /user/1/settings, /user/2/settings, etc. which is definitely not regular traffic, but queries many different URLs.

The final thing to consider is the amount of bookkeeping that needs to be done. Rate limiting in web servers can be done very efficiently with in-memory hash buckets, but any such internal register has a limited size. Keeping track of just IP addresses is doable, but if you have to keep track of request counts on IP+URL, the number of counters quickly becomes very big, so either you need to dedicate a lot of memory to it, or evict items really quickly.
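
To illustrate the bookkeeping problem with a toy example: a fixed-size counter table with FIFO eviction holds up fine when keyed by IP, but keyed by IP+URL the same memory budget fills up much faster, so counters get evicted before they become useful. (Illustrative sketch only, not how our servers actually do it.)

```javascript
// Fixed-size counter table with FIFO eviction (a JS Map preserves insertion order).
const MAX_ENTRIES = 100000; // fixed memory budget for the whole register
const counters = new Map();

function bump(key) {
  if (!counters.has(key) && counters.size >= MAX_ENTRIES) {
    const oldest = counters.keys().next().value; // evict the oldest tracked key
    counters.delete(oldest);
  }
  counters.set(key, (counters.get(key) || 0) + 1);
  return counters.get(key);
}

// Per-IP keying: at most one entry per client.
bump('203.0.113.7');
// Per-IP+URL keying: every distinct URL a client touches becomes its own entry,
// so a scanner probing thousands of URLs inflates the table almost instantly.
bump('203.0.113.7|/user/1/settings');
bump('203.0.113.7|/user/2/settings');
```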

I personally really doubt how effective a URL based rate limiting system could really be, given that a lot of attack traffic doesn’t involve hitting the same URL repeatedly. Many bots will just fire off thousands of requests at different URLs with different parameters to try and poke for certain software with certain configuration.

Do you have any ideas on how we could get rid of the URL parameter?

The ?i=1 part is a counter to keep track of the number of attempts. That way, if the browser does the redirect but doesn’t send the cookie, it doesn’t get stuck in an infinite redirect loop where it will just keep refreshing the page until the user navigates away, DDoS-ing the server in the process.
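
In other words, something along these lines (a sketch of the presumed logic only; the real implementation lives in the web server, and the three-attempt cutoff is what has been observed in this thread):

```javascript
// Decide what to do with a request based on the test cookie and the ?i attempt counter.
const MAX_ATTEMPTS = 3;

function decideAction(hasTestCookie, url) {
  if (hasTestCookie) return { action: 'serve-site' }; // cookie came back: challenge passed

  const u = new URL(url, 'http://placeholder.example');
  const attempt = parseInt(u.searchParams.get('i') || '0', 10);
  if (attempt >= MAX_ATTEMPTS) {
    // Give up instead of looping forever; currently this is the Google cookies page.
    return { action: 'redirect-away' };
  }

  // Serve the aes.js challenge page; its reload carries i+1, so a client that
  // never stores the cookie can only loop a bounded number of times.
  u.searchParams.set('i', String(attempt + 1));
  return { action: 'serve-challenge', nextUrl: u.pathname + u.search };
}

console.log(decideAction(false, '/index.php'));     // first visit -> challenge with ?i=1
console.log(decideAction(false, '/index.php?i=3')); // third failure -> redirect away
console.log(decideAction(true, '/index.php'));      // cookie present -> serve the site
```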

Admittedly, redirecting to the Google cookies page is probably just being lazy. A page that would describe what happened (security system, browser validation, checks JS and cookies, etc.) would be more helpful, rather than a page about cookies with no information as to why.

5 Likes

Such information is charged additionally :joy:

CF has something similar but “doesn’t bother” with queries


Of course, that’s why there should be more layers as well as more rules in the FW,
which would repel at least the stupidest bots in the simplest and most painless way.

It is easy for me to configure a lot of things for my own website.
For example, I have a rate limit for all .php URLs on CF,
and of course I don’t use clean URLs (no extensions), so whatever you add to the URL to make it fake and unique, the chances are high that you’ll be blocked/rate-limited.

I have the impression that we are only reinventing the wheel here…
It would be ideal to have bot management that can detect and block bad bots using various techniques and that is fully automated (with rules changing on the fly depending on what is happening on the server in real time);
anything less than that is not smart enough and is easily circumvented, etc.

And with all that, we should also think about the advent of quantum computers and adapt the aes.js code so that it outsmarts them :smirk:

4 Likes

From what I understand from that article, it’s not actually that similar. Cloudflare’s solution appears to inject code into the web page after it has already been returned. It can be used to stop bots on subsequent requests, but the first request still goes through.

The testcookie solution is most similar to the Under Attack Mode solution from Cloudflare. Under Attack mode also checks for Javascript and cookies.

I don’t know what happens if you get an Under Attack mode page but cookies are disabled. Will you just keep looping the challenge page or will Cloudflare kick you out?

4 Likes

I don’t know; as far as I understand, it stays in a loop until you enable JS and cookies so that you are able to solve the challenge: https://community.cloudflare.com/t/understanding-under-attack-mode/358178

They probably track bots that are unable to solve the challenge even after x attempts and stop offering them a challenge at all for some time period, and I assume they are then shown the classic CF Ray ID (error) page.

4 Likes

Hi Admin and Oxy,

Just got back from a development project.

I think we have to acknowledge the difference between a hosting provider and a proxy provider. A proxy focuses on the network layer, while hosting can range from the application layer down to the network layer (excluding hardware for both for the time being). Cloudflare can do this without query parameters because their system is engineered to keep track of the connection even after the challenge and make sure it stays that way. For hosting, however, once the cookie header is set, everything is good to pass; or at least, not all traffic carrying the header is rescanned for malicious things.

It’s more about the fundamental thing: the current system relies on the client side to store the iteration, which can be forged. In a poorly coded PHP while-loop scenario, the URL will always come in without i=1 and without cookie storage, effectively DDoSing the server. While certain rate limits can block the IP, it’s easy to change the IP to bypass that as well, blocking innocent visitors in the process if an open Wi-Fi network is involved. As long as the iteration is kept on the client side, this isn’t going to improve. Cloudflare has a middle machine to keep track of the iterations so it doesn’t have an impact on the web browser, but that’s an additional cost from a hosting perspective, which is why I say there’s a difference between a proxy and a hosting provider in the first place.

lol :rofl: Or maybe you could redirect to something like error.infinityfree.com to show something about the error, or simply instruct users to turn on JS and have a “read more info” button that goes to Google?

Good doubt indeed, as it’s also not that effective. Bad actors have already evolved beyond that, sending distinct requests to bypass checks on request headers, user agents and even custom request methods. So URLs are just child’s play for them.

I do see framework-specific implementations on the market that can track routes based on the framework’s architecture and impose rate limits on them. For example, many routes under the WordPress admin go through wp-admin/index.php; those routes are detected from within WordPress function calls and tracked that way. But I doubt developing such a thing is cost-effective from a hosting management perspective, given the number of PHP applications that are supported, not to mention custom PHP scripts, which are the majority here.

They place either a localStorage or a cookie-based token so that subsequent requests inherit the previously passed challenge, but it also gets refreshed after a while. Fundamentally speaking, it’s still JavaScript and cookies for sure.

Solution: you’ll be prompted to turn on cookies, without a page refresh, if you’re in a browser (whether legitimate or emulated).

That time period is very long, or presumably doesn’t exist at all, but it’s good to know about. They cannot risk blocking legitimate visitors who come right after a DDoSer on the same IP, similar to the free Wi-Fi paradox.

Cheers!

4 Likes

This is what it will show (I saw that myself before)
image
(Original image source: https://community.cloudflare.com/t/the-web-always-shows-enable-cookies/433257)

You know InfinityFree is MOFH… It basically means Admin can’t do anything to change that

6 Likes

I tested this shortly after asking this question myself. And it turns out that you’ll just stay on the challenge page and aren’t redirected at all.

I wondered a bit why, but then I realized you could just test within Javascript whether cookies are enabled. You can read the document.cookie string after writing it, or check the navigator.cookieEnabled property. Checking that could be a nice and simple addition to the current security check to avoid the redirect to Google. Only browsers that do support cookies in Javascript but don’t send them to the server would be affected, which is a very small group.
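
A minimal sketch of that client-side check (the __cookietest cookie name and the message text are just examples):

```javascript
// Detect disabled cookies inside the challenge script itself,
// instead of redirecting the visitor to Google.
function cookiesEnabled() {
  if (navigator.cookieEnabled === false) return false; // quick check where supported

  document.cookie = '__cookietest=1; path=/';
  const ok = document.cookie.indexOf('__cookietest=') !== -1; // read back what we just wrote
  // Clean up the probe cookie by expiring it.
  document.cookie = '__cookietest=1; path=/; expires=Thu, 01 Jan 1970 00:00:00 GMT';
  return ok;
}

if (!cookiesEnabled()) {
  // Explain what the visitor must do (assumes the script runs after <body> exists).
  document.body.textContent =
    'This site performs a browser check that requires cookies. Please enable cookies and reload the page.';
}
```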

And maybe combine it with some request logging and banning, where you ban IPs that see the challenge page but keep trying for some reason. That doesn’t seem too complicated to implement and might allow you to get rid of the URL counter entirely.

The URL counter is only really relevant against “accidental” DoS attacks. If you want to keep hammering the challenge page with requests, you can just do so. Just have a loop that calls the same URL over and over again and don’t look at the results. No need to do anything with the Javascript code or the redirects if you don’t want to.

For the most part, IP-based rate limits are pretty effective. Static blocklists don’t work very well for the reasons you named, but you can’t just spoof your IP address. One public WiFi network will probably just have one IP address you talk out with, and while you can reset your home router, you can’t just send requests from thousands of IPs at the same time. Getting many servers at cloud hosting companies gives you a lot of different IPs, but those can be blocked by the recipient with few side effects. And actually having a botnet is such a high level of sophistication that there isn’t a lot you can do against that.

True, such functionality exists, and the client area also has it. But its usefulness is limited, because for this rate limit to work the PHP request must already have started, which makes it pretty worthless from a DoS prevention standpoint. It does work well against (light) brute force attacks and spam though.

9 Likes

At the beginning of 2018, I expressed my surprise in one post that aes.js was not minified (all in one line of code), and today everything is the same; in addition, it still contains a lot of comments.
I understand that the copyright-related ones were left in, but I don’t understand why the long ones that describe the code were not removed…

If they used minified code, the required traffic (bandwidth) would be roughly cut in half.

Screenshot 2023-10-14 195357

and Gzip/Brotli was not used either

Screenshot 2023-10-14 201341
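
If anyone wants to measure the potential savings themselves, Node’s built-in zlib makes it easy (assuming you have a local copy of aes.js):

```javascript
// Compare the raw size of aes.js against its gzip and Brotli compressed sizes.
const fs = require('fs');
const zlib = require('zlib');

const source = fs.readFileSync('aes.js');
const gzipped = zlib.gzipSync(source);
const brotli = zlib.brotliCompressSync(source);

console.log(`raw:    ${source.length} bytes`);
console.log(`gzip:   ${gzipped.length} bytes`);
console.log(`brotli: ${brotli.length} bytes`);
// Minifying first (stripping the comments mentioned above) shrinks the raw size
// further, and the compressed sizes along with it.
```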

8 Likes

I think this may actually be one of the things they could easily (is anything ever easy for iFN though? lol) and reasonably implement.

7 Likes

Hi Admin,

I was getting occupied by some other things this weekend, but I’m back to the discussion.

In this case, I would like to know how any user gets out of the IP ban if they use shared IPs like public Wi-Fi hotspots or coffee shop networks.

Yes, this part is what I was talking about: DDoSers can simply do this to eat up the bandwidth and just keep downloading the challenge page, which is still a fairly effective DDoS if you ask me.

VPN, mobile cellular network, or use another hotspot, duh~ While it’s possible to identify VPNs and simply block them, it would also mean that some visitors aren’t going to connect if they are in locations without regular internet, though this might not be a concern for most website owners.

Hi Oxy and wackyblackie,

The notion of “if it doesn’t break, don’t change it” :rofl:

Cheers!

5 Likes

I actually wanted to show how illogical it is:
on the one hand, there is an anti-bot system that should prevent unnecessary traffic,
and on the other hand, there is an unoptimized script (one that is served millions of times per day) that probably consumes/generates as much traffic as the bots do.

5 Likes

I intend to compile a list of suggestions from this topic and discuss them with iFastNet. Compressing/minifying the aes.js file won’t address the fundamental issues we’ve been discussing here, but it seems like a straightforward improvement to me.

It would have to be a short lived ban in combination with reasonable limits. All it needs to do is prevent someone from hammering the server, not replicate the “three strikes” system that currently exists. Just something to stop the repeating loop.

You could even limit it on IP+User Agent to reduce the impact somewhat.

It’s useful against one very specific type of DDoS. It’s indeed useless against all others. The usefulness is limited, but it’s definitely there.

A cellular network is 1 IP and every hotspot is 1 IP, but you may have to move around to access different hotspots. VPNs can provide multiple servers, but have limits on the number of connections you can use at any time. You could use multiple VPN providers though, depending on your system and its networking stack.

With some effort you could maybe get a dozen simultaneous IPs active. It’s simple enough to block them all. URLs, user agents, etc. can all be filled with unique values so you can have thousands or millions of unique entries with little effort.

10 Likes

Just one thought when I saw the other post mentioning social previews: is it possible to whitelist certain popular, well-known social services like Google, Facebook, Twitter (now X), WhatsApp, Signal and Telegram?

I think there’s a valid point for those to bypass the security check, at least from a website owner’s perspective. Otherwise, getting traffic and ranking in SEO would be quite tough here if websites can only rely on word of mouth.

3 Likes
