I only referred to it to indicate that legitimate websites can have high rates of traffic, and that having strict rate limits will probably impact legitimate sites.
Using EP (entry process) limits for rate limiting to protect sites would be a laughably stupid solution, given that it penalizes the entire account if usage is too high. Any rate limiting solution should keep bad traffic out and let good traffic through, which is a distinction a process limit doesn't make at all.
Tracking individual URLs for rate limiting can be useful, but comes with a lot of caveats and sharp edges. It can capture too little or too much depending on how the site is built.
If a site is using a single URL route to expose different features, like /index.php?url=/posts/123/comments or /api.php?function=GetPostComments&postId=123, you can have legitimate traffic making a large number of requests to the same path. This can be mitigated by also checking the query string, but that would make it trivial to bypass the rate limits by just adding bogus query parameters.
At the same time, an attacker could very quickly scan the URLs /user/1/settings, /user/2/settings, etc., which is definitely not regular traffic, but queries many different URLs.
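To make the trade-off concrete, here's a minimal hypothetical sketch in Go (not from any particular server) of the two obvious key choices, with the failure mode of each noted in comments:

```go
package main

import "net/http"

// rateLimitKey is a hypothetical helper showing the two obvious key choices
// for a per-client rate limiter, and why each fails on the examples above.
func rateLimitKey(r *http.Request, includeQuery bool) string {
	if includeQuery {
		// Distinguishes /api.php?function=GetPostComments&postId=123 from
		// other API calls, but an attacker can mint unlimited fresh keys
		// just by appending bogus parameters (&x=1, &x=2, ...).
		return r.RemoteAddr + "|" + r.URL.Path + "?" + r.URL.RawQuery
	}
	// Collapses all of /index.php into one bucket (penalizing legitimate
	// traffic), while a scan of /user/1/settings, /user/2/settings, ...
	// still spreads across many separate buckets.
	return r.RemoteAddr + "|" + r.URL.Path
}
```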
The final thing to consider is the amount of bookkeeping that needs to be done. Rate limiting in web servers can be done very efficiently with in-memory hash buckets, but any such internal registry has a limited size. Keeping track of just IP addresses is doable, but if you have to track request counts per IP+URL pair, the number of counters grows very quickly, so either you need to dedicate a lot of memory to it, or evict entries very aggressively.
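As a rough illustration of that memory problem, here's a simplified sketch (my own, not any server's actual implementation) of a fixed-capacity counter table that evicts the stalest entry when full; nginx's limit_req zones work on a similar fixed-size, evict-when-full principle:

```go
package main

import (
	"sync"
	"time"
)

type entry struct {
	count    int
	lastSeen time.Time
}

// boundedCounters is a fixed-capacity counter table. With IP-only keys it
// stays small; with IP+URL keys it fills fast and useful counters get
// evicted before they can catch anything.
type boundedCounters struct {
	mu   sync.Mutex
	max  int
	seen map[string]*entry
}

func newBoundedCounters(max int) *boundedCounters {
	return &boundedCounters{max: max, seen: make(map[string]*entry)}
}

// Incr bumps the counter for key, evicting the least-recently-seen entry
// when the table is full.
func (b *boundedCounters) Incr(key string) int {
	b.mu.Lock()
	defer b.mu.Unlock()
	if e, ok := b.seen[key]; ok {
		e.count++
		e.lastSeen = time.Now()
		return e.count
	}
	if len(b.seen) >= b.max {
		// Evict the stalest entry (linear scan for brevity; a real
		// implementation would use an LRU list or an expiry wheel).
		var oldestKey string
		var oldest time.Time
		first := true
		for k, e := range b.seen {
			if first || e.lastSeen.Before(oldest) {
				oldestKey, oldest, first = k, e.lastSeen, false
			}
		}
		delete(b.seen, oldestKey)
	}
	b.seen[key] = &entry{count: 1, lastSeen: time.Now()}
	return 1
}
```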
I personally really doubt how effective a URL-based rate limiting system could be, given that a lot of attack traffic doesn't involve hitting the same URL repeatedly. Many bots just fire off thousands of requests at different URLs with different parameters, probing for specific software in specific configurations.
Do you have any ideas on how we could get rid of the URL parameter?
The ?i=1 part is a counter to keep track of the number of attempts. That way, if the browser follows the redirect but doesn't send the cookie back, it doesn't get stuck in an infinite redirect loop, refreshing the page until the user navigates away and DDoS-ing the server in the process.
Admittedly, redirecting to the Google cookies page is probably just laziness. A page describing what happened (security system, browser validation, JS and cookie checks, etc.) would be more helpful than a generic page about cookies with no explanation of why the user ended up there.
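For what it's worth, the loop-breaking logic is simple enough to sketch. This is a hypothetical Go version (cookie name and limit invented), which also bails out with a short local explanation instead of sending the user to Google:

```go
package main

import (
	"fmt"
	"net/http"
	"strconv"
)

const maxRedirects = 3

// checkBrowser sketches the "?i=N" loop breaker described above: set a
// cookie, redirect back with an incremented counter, and stop with an
// explanation instead of redirecting forever if the cookie never returns.
func checkBrowser(w http.ResponseWriter, r *http.Request) {
	if _, err := r.Cookie("browser_ok"); err == nil {
		fmt.Fprintln(w, "cookie received, serving the real page")
		return
	}

	attempts, _ := strconv.Atoi(r.URL.Query().Get("i"))
	if attempts >= maxRedirects {
		// The browser keeps following redirects without sending the cookie
		// back; stop here rather than refresh forever and hammer the server.
		http.Error(w, "Please enable cookies: this site uses them to verify your browser.",
			http.StatusForbidden)
		return
	}

	http.SetCookie(w, &http.Cookie{Name: "browser_ok", Value: "1", Path: "/"})
	http.Redirect(w, r, fmt.Sprintf("%s?i=%d", r.URL.Path, attempts+1), http.StatusFound)
}
```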