Bingbot Unable to Crawl My MediaWiki Site Due to JavaScript Redirect - Need Help Disabling It

Kerywek · March 2, 2025, 1:29pm

I’m hosting a MediaWiki site on InfinityFree (domain: pygly.ct.ws), and I’ve encountered an issue where Bingbot (and likely other crawlers) cannot access my site’s content due to a JavaScript redirect. This is preventing my site from being indexed properly, and my Sitemap shows “0 URLs discovered” in Bing Webmaster Tools. I’d appreciate any help or suggestions from the community or support team to resolve this!

When Bingbot tries to crawl my site (e.g., https://pygly.ct.ws/index.php/Main_Page), it doesn’t see the actual wiki content. Instead, it gets redirected to a page with this HTML:

<html>
  <body>
    <script type="text/javascript" src="/aes.js"></script>
    <script>
      function toNumbers(d) { var e = []; d.replace(/(..)/g, function(d) { e.push(parseInt(d, 16)); }); return e; }
      function toHex() { for (var d = [], d = 1 == arguments.length && arguments[0].constructor == Array ? arguments[0] : arguments, e = "", f = 0; f < d.length; f++) e += (16 > d[f] ? "0" : "") + d[f].toString(16); return e.toLowerCase(); }
      var a = toNumbers("f655ba9d09a112d4968c63579db590b4"), b = toNumbers("98344c2eee86c3994890592585b49f80"), c = toNumbers("51aea0a9b6e9900c72b637f37ad9e3ac");
      document.cookie = "__test=" + toHex(slowAES.decrypt(c, 2, a, b)) + "; expires=Thu, 31-Dec-37 23:55:55 GMT; path=/";
      location.href = "https://pygly.ct.ws/index.php/%E9%A6%96%E9%A1%B5?i=1";
    </script>
    <noscript>
      This site requires Javascript to work, please enable Javascript in your browser or use a browser with Javascript support
    </noscript>
  </body>
</html>

This page sets a cookie (__test) and redirects to the actual content with a ?i=1 parameter. Since Bingbot doesn’t execute JavaScript, it can’t follow the redirect and gets stuck, unable to index my site.
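
For anyone who wants to see what the script feeds into the cookie, here is a minimal Python sketch that pulls the three hex strings and the redirect target out of a challenge page like the one above. The helper names (`extract_hex_inputs`, `extract_redirect`) and the `SAMPLE` string are made up for illustration. The sketch only decodes the inputs; it does not perform the AES decryption that `slowAES.decrypt` does in the browser.

```python
import re
from typing import List, Optional

# Abbreviated copy of the challenge page quoted above (illustrative sample).
SAMPLE = '''
var a = toNumbers("f655ba9d09a112d4968c63579db590b4"), b = toNumbers("98344c2eee86c3994890592585b49f80"), c = toNumbers("51aea0a9b6e9900c72b637f37ad9e3ac");
document.cookie = "__test=" + toHex(slowAES.decrypt(c, 2, a, b)) + "; expires=Thu, 31-Dec-37 23:55:55 GMT; path=/";
location.href = "https://pygly.ct.ws/index.php/%E9%A6%96%E9%A1%B5?i=1";
'''


def extract_hex_inputs(body: str) -> List[bytes]:
    """Decode every hex string passed to toNumbers() into raw bytes.

    bytes.fromhex() plays the role of the page's toNumbers() helper,
    and bytes.hex() plays the role of toHex().
    """
    pairs = re.findall(r'toNumbers\("((?:[0-9a-fA-F]{2})+)"\)', body)
    return [bytes.fromhex(p) for p in pairs]


def extract_redirect(body: str) -> Optional[str]:
    """Return the URL assigned to location.href, or None if there is none."""
    match = re.search(r'location\.href\s*=\s*"([^"]+)"', body)
    return match.group(1) if match else None
```

Run against the sample, `extract_hex_inputs(SAMPLE)` yields three 16-byte values, and `extract_redirect(SAMPLE)` returns the `?i=1` URL. A plain page such as test.html yields an empty list and `None`.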

Normal browser visits work fine—I can see my wiki pages without issues. But crawlers like Bingbot are blocked by this redirect.

What I’ve Tried So Far

To isolate the issue, I’ve done some troubleshooting:

1. Tested with a Static File
I uploaded a simple test.html file to my root directory with this content:

<html><body><h1>Test Page</h1></body></html>

Browser: Visiting https://pygly.ct.ws/test.html shows “Test Page” as expected.
Bingbot: Using curl to simulate Bingbot:

curl -A "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)" https://pygly.ct.ws/test.html

Returns the JavaScript redirect page instead of my test.html content.
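
The same check can be scripted. Below is a rough Python sketch, standard library only, that sends a request with the Bingbot User-Agent string from the curl command and reports whether the body looks like the challenge page. `build_request`, `looks_like_challenge`, and `check` are hypothetical helper names. The marker strings are taken from the HTML quoted earlier, so they are an assumption about what the challenge page always contains.

```python
import urllib.request

# User-Agent string copied from the curl command above.
BINGBOT_UA = "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"

# Strings seen in the challenge page quoted earlier (assumed to be stable).
CHALLENGE_MARKERS = ("slowAES.decrypt", "__test=")


def build_request(url: str, user_agent: str = BINGBOT_UA) -> urllib.request.Request:
    """Build a GET request that only overrides the User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def looks_like_challenge(body: str) -> bool:
    """Return True if every challenge marker appears in the response body."""
    return all(marker in body for marker in CHALLENGE_MARKERS)


def check(url: str, user_agent: str = BINGBOT_UA, timeout: float = 10.0) -> str:
    """Fetch the URL and classify the body as 'challenge' or 'content'."""
    with urllib.request.urlopen(build_request(url, user_agent), timeout=timeout) as resp:
        body = resp.read().decode("utf-8", errors="replace")
    return "challenge" if looks_like_challenge(body) else "content"
```

Calling `check("https://pygly.ct.ws/test.html")` performs the same kind of request as the curl command. Like curl, it neither runs JavaScript nor stores cookies, so it only shows what a client without those features receives.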

2. Checked My Configuration

I’m not using Cloudflare or any custom CAPTCHA/verification scripts.
My .htaccess file only handles HTTPS redirects and doesn’t include this redirect logic:

RewriteEngine On
RewriteCond %{HTTPS} !=on
RewriteRule ^(.*)$ https://%{HTTP_HOST}/$1 [R=301,L]
ErrorDocument 404 /404.php
RewriteCond %{HTTPS} off [OR]
RewriteCond %{HTTP_HOST} ^www\.pygly\.ct\.ws [NC]
RewriteRule ^(.*)$ https://pygly.ct.ws/$1 [R=301,L]

My MediaWiki LocalSettings.php also has no custom redirect code—just standard settings and extensions like AutoSitemap.

Conclusion
Since I haven’t added this JavaScript redirect myself, I suspect it’s an InfinityFree default security feature (e.g., anti-bot protection) that’s triggering for crawler requests.

What I Need Help With

I’d like Bingbot (and other legitimate crawlers like Googlebot) to access my site’s actual content without being blocked by this JavaScript redirect. Here’s what I’m hoping to get assistance on:

Disabling the Redirect

Is this JavaScript challenge a built-in InfinityFree feature? If so, can it be disabled for my account, or can I whitelist crawlers like Bingbot and Googlebot?
I don’t have direct server access (since it’s free hosting), so any solution would need to work within InfinityFree’s control panel or file manager.

Workaround Suggestions

Has anyone else faced this issue on InfinityFree? Are there workarounds (e.g., tweaking .htaccess or Sitemap URLs) that don’t require server-level changes?
I’ve considered adding ?i=1 to all Sitemap URLs and testing if Bingbot can crawl those—would that work?
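
To make the Sitemap idea concrete, here is a small Python sketch of the URL rewriting it would involve. The helper name `with_query_param` is hypothetical. The sketch only shows how to append `i=1` safely; it says nothing about whether the parameter gets a crawler past the challenge, since the page quoted earlier also sets a `__test` cookie before redirecting.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_query_param(url: str, key: str = "i", value: str = "1") -> str:
    """Return url with key=value in its query string, replacing any old value."""
    parts = urlsplit(url)
    # Keep existing parameters except a previous copy of `key`.
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True) if k != key]
    query.append((key, value))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Example with a URL from this post.
sitemap_urls = ["https://pygly.ct.ws/index.php/Main_Page"]
rewritten = [with_query_param(u) for u in sitemap_urls]
```

The function is idempotent, so running it over a Sitemap that already contains `?i=1` leaves those URLs unchanged.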

Why This Matters

This issue is killing my site’s SEO. Without proper crawling, my MediaWiki pages won’t appear in Bing search results, and I’m losing potential visitors. Any help or pointers would be greatly appreciated!

JavesPotato · March 2, 2025, 1:31pm

JavesPotato · March 2, 2025, 1:40pm

Did it happen in the verification process? I do not know anything about Bingbot, but you can try the instructions stated in this article:

wasik405 · March 2, 2025, 1:51pm

I have heard that the crawlers used by the webmaster tools do not support JavaScript and get blocked by the security system @JavesPotato mentioned. However, the main Bing crawler does support JavaScript, so your site could still be indexed and show up in Bing’s search results.

Simulating through cURL won’t work because of the security system.

This is not true. Every crawler that supports JavaScript & Cookies can access your site content. This includes major search engines, such as Google and Bing’s main crawlers.

Greenreader9 · March 2, 2025, 3:04pm

Google’s and Bing’s web crawlers do work and can index your website. However, the validation checkers and sitemap checks don’t. When Bing actually goes to index your site, it will work.

You can’t emulate Bing using cURL the way you tried because the crawler does more than just send a specific UA. The IP address, other headers, cookies, JS execution, etc. are all different.
