The server does not allow some search bots from Russia to pass through


The Yandex robot receives a 403 response code from the server.
There are no restrictions for the Yandex robot in .htaccess.

Please read this thread, the discussion here may help you:

This was my guess about Yandex being blocked by the security system:

I am afraid this is the conclusion that was reached:

Without that being an intentional act from InfinityFree’s side:


What others? I didn’t understand your answer. Are you blocking Russia?
I.e., what should I do? Give up on search engines?

Are you using Cloudflare?


Some sites hosted on InfinityFree seem to be indexed fine by Yandex, but others are not. The reason isn’t known with certainty.

Not intentionally, as far as I’m aware, especially since you are able to sign up for the services and host a website here just fine. As mentioned in the thread I linked you to, Yandex keeps changing their IP ranges, so keeping track of the changes and constantly updating the whitelist to prevent the security system from blocking their requests isn’t an easy task.

Use a different search engine? Most of them shouldn’t have an issue.

Perhaps you can utilize Cloudflare and see if Yandex is able to index the page then. We have a guide for that, if you want to try it out:


All this is interesting, but I have a free domain and hosting, and I am interested in visitors from Russia.
Why are you talking about IP permissions? After all, the bot’s name contains the word “yandex”. Isn’t that enough to identify it?

Like @ChrisPAR mentioned, Yandex keeps changing their IP ranges. In that case, InfinityFree’s security system could block Yandex requests due to not having the latest IP address in the whitelist. And it’s better if you buy a custom domain name and index it rather than indexing a free one (it’s just an opinion).


I have already said that it is not necessary to identify the bot by IP at all.
It’s even easier to use the bot’s name, which contains the word “yandex”.
If it doesn’t work on the free one, then why would it work on the paid one?
I’m not sure at all. The problem should probably be fixed, rather than offering that I pay.

That is not how this security system works, and validating a request by the bot’s name is not a good approach.

I am not saying it will work on paid domains; what I was trying to say is that using a custom domain name is more reliable and professional than using a free one.

Also, when you use a custom domain, you can use Cloudflare as your DNS handler, which can protect your site from various kinds of attacks and gives you more control over DNS records.


We only want to whitelist specific traffic that we can be sure is from actual search engine crawlers. Search engines that publish a list of known IP addresses used only by their crawlers let us allow those IPs specifically. IP address spoofing is pretty difficult in general, and basically impossible for web traffic as far as I know.
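
As a rough illustration of the IP-based approach described above, here is a minimal Python sketch. It is not InfinityFree’s actual implementation; the `CRAWLER_RANGES` list and the `is_allowlisted` helper are hypothetical, and the ranges are reserved documentation prefixes rather than real crawler addresses.

```python
import ipaddress

# Hypothetical crawler ranges for illustration only. These are reserved
# documentation prefixes (RFC 5737 / RFC 3849), not real search engine IPs.
CRAWLER_RANGES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
    ipaddress.ip_network("2001:db8::/32"),
]


def is_allowlisted(client_ip: str) -> bool:
    """Return True if client_ip falls inside any known crawler range."""
    try:
        address = ipaddress.ip_address(client_ip)
    except ValueError:
        return False  # not a valid IP address at all
    # An IPv4 address is never "in" an IPv6 network (and vice versa),
    # so mixed lists are safe to check this way.
    return any(address in network for network in CRAWLER_RANGES)
```

The catch with this approach is the one discussed in this thread: the list is only as good as its last update, so a crawler that moves to new addresses gets blocked until someone refreshes the ranges.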

User agents are freely configurable. So allowing any visitor with a Yandex user agent would make it trivial for malicious bots to impersonate the Yandex crawler and bypass our security measures.
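
To show why the user agent alone proves nothing, here is a small Python sketch. The `looks_like_yandex` check is a hypothetical name-based filter of the kind suggested earlier in the thread, and the user agent string is just an example of a crawler-style value that any client can send.

```python
import urllib.request

# Any client can put any string here; this crawler-style value is only an example.
FAKE_UA = "Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)"


def build_request(url: str, user_agent: str) -> urllib.request.Request:
    """Build (but do not send) a request with a caller-chosen User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def looks_like_yandex(user_agent: str) -> bool:
    """Hypothetical name-based check: trusts whatever the client claims."""
    return "yandex" in user_agent.lower()


# The request object now carries the spoofed header; nothing verifies it.
request = build_request("http://example.com/", FAKE_UA)
```

A filter like `looks_like_yandex` would accept this request from any machine, which is exactly the hole described above.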

Making it work is the easy part. Not punching a truck sized hole in our security in the process is what makes it difficult.

Why is any website in the world searchable in any search engine?

Free hosting is not premium hosting. We say right on the front page of our website that they are different services and they work differently.

If you signed up here because you want to see if premium hosting is a good fit for you, then stop what you’re doing immediately and actually try premium hosting. Please don’t discard premium hosting because a different product you don’t actually want turns out to be a product you don’t want.

If you want to have this problem fixed and actually intend to get premium hosting, which doesn’t have this issue to begin with, then just get premium hosting. Please don’t demand that we fix an issue with free hosting only for you to stop using free hosting as soon as we fix it.


I don’t believe that paid hosting doesn’t have this problem.
Your statements are not supported by anything.
This is my first time encountering hosting where search bots are prohibited.
I.e., why create a website if it is not indexed and does not show up in searches? Marvelous!

Your statements are not supported by anything either.

All we have established here is that Yandex crawling on free hosting has issues.

We don’t know anything about paid hosting. You conclude that because the free and paid hosting are provided by the same company, the services must function the same, therefore paid hosting has this issue too. I can guarantee you: they do not function the same, and paid hosting does not have this issue.

But it seems that you are just not willing to believe me. So the only way to know for sure if paid hosting has this issue is to test paid hosting.

Unfortunately, I don’t have a demo site on paid hosting and I don’t have a Yandex account, so I cannot prove that Yandex works.

But if I use an SEO analyzer like the one from RankMath, and I test a site on free hosting, the “Common keywords” section shows that it hits the browser validation challenge. If I do the same with rf.gd (which is actually hosted on a premium account), it shows results from the actual page.

This is because the security system that blocks bots is completely absent from premium hosting. So, assuming this issue is also what blocks Yandex, it means Yandex will just work on premium hosting.

Your site is indexed and searchable. You can find it in Google, Bing and Brave. Maybe others too. It’s just Yandex that has this issue because Yandex makes it a lot harder than everyone else to reliably detect their crawler.


The statements are not supported by anything. Indexing on your server is apparently completely prohibited.
Google, Bing and Brave bots do NOT visit my site. I keep statistics and know this for sure. There is only google-image
Making websites on your hosting is apparently useless
I think the discussion should be stopped, and you need to fix your restrictions
Let’s end this useless chatter

There have been plenty of websites hosted on InfinityFree that have been indexed and shown on Google. Some factors, such as the fact that you’re using a free domain, can affect how quickly your site shows up. Registering your site with Google Search Console manually can speed up that process. I believe Bing has something similar, though I do not have experience with Yandex on that matter.


Right, I made a misleading statement. I checked your site and it doesn’t appear to have been indexed yet.

Your site can be indexed by other search engines. That doesn’t mean it already has been. It can take months for a new site to appear in search engines, depending on traffic and backlinks. And your site is only a few weeks old.

It can help to sign up for Google Search Console, Bing Webmaster Tools, etc. to notify the respective search engines both that your site exists and that you care about it being searchable.

Other sites on free hosting are indexed and can be searched. Here are a number of sites also using a free.nf domain that can be found in Google:

If this doesn’t prove that free sites are being indexed and can be searched, then I give up.

Right back at you. You’re blaming us for issues that you assume exist (indexing doesn’t work at all, paid hosting has all the same limitations as free hosting), based on a flawed understanding of how things work (like search engine crawling). We can’t help you if you only want to blame us and don’t want to talk about actual solutions.

If you think our hosting sucks, then please don’t use it. We want only happy users. We try to make as many people happy as we can. But to the people for whom we can’t, I say best of luck, and hope you will be happy elsewhere.


Actually, what InfinityFree is doing follows the best practice of search engine management. This is Yandex’s fault for horrendously managing their transparency, on top of a very questionable background and reputation from back when their search engine was immensely abused. Blocking Yandex by default was a very common practice even back in the 2000s. As a user of this free hosting, I would for sure drop the hosting service entirely for good if this ever changes.

Notorious services like Yandex and Facebook maintain their ability to crawl data by forcing the site owner’s hand through bundled terms of service, such as when placing ads or getting a site indexed. Verification is often done via reverse DNS against the connecting machine, without disclosing the IP ranges of their service. This way these poorly managed search engines and platforms place the computational burden on the hosting side rather than taking responsibility for upholding transparency. It’s a rather disgusting move, to be honest.

While it is possible to verify each connection with a reverse DNS lookup, checking whether that specific connection is coming from a machine with a domain name under whatever they say at the time, web admins like myself won’t give a (insert word here) about that. Yandex is not a multinational company and holds almost 100% of its user base as a marketing hostage, hence it has no grounds for leveraging international network compliance. Web admins will simply isolate Yandex until it learns the hard way not to tell the Internet what to do. The same goes for any platform that tries to do the same: if they have no content, they have no value, and this right is always reserved by web admins.

Free websites can be indexed by search engines, especially reputable ones that genuinely serve the greater good. Most offer self-service site management, and we get our sites indexed almost right away.

Their official statement is “You can check the authenticity of a robot using reverse DNS lookup.”

How to check that a robot belongs to Yandex - Webmaster. Help
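
For anyone curious what that reverse DNS check looks like in practice, here is a hedged Python sketch of a forward-confirmed reverse lookup. The `is_verified_yandex` function and its helpers are hypothetical names, and the hostname suffixes are my recollection of what Yandex’s help page lists, so treat them as an assumption and check the official page before relying on them.

```python
import socket

# Assumed suffixes for Yandex crawler hostnames; verify against Yandex's help page.
YANDEX_SUFFIXES = (".yandex.ru", ".yandex.net", ".yandex.com")


def reverse_lookup(ip):
    """PTR lookup: IP address -> primary hostname."""
    return socket.gethostbyaddr(ip)[0]


def forward_lookup(hostname):
    """A/AAAA lookup: hostname -> set of IP address strings."""
    return {info[4][0] for info in socket.getaddrinfo(hostname, None)}


def is_verified_yandex(ip, reverse=reverse_lookup, forward=forward_lookup):
    """Forward-confirmed reverse DNS check for a connecting IP.

    The resolvers are injectable so the logic can be exercised offline.
    """
    try:
        hostname = reverse(ip).rstrip(".").lower()
        # Step 1: the PTR name must end in one of the expected domains.
        if not hostname.endswith(YANDEX_SUFFIXES):
            return False
        # Step 2: that name must resolve back to the same IP, otherwise
        # anyone controlling their own PTR record could fake step 1.
        return ip in forward(hostname)
    except OSError:
        return False  # lookup failed (socket.herror / socket.gaierror)
```

Note that this costs two DNS lookups per unknown visitor, which is the computational burden on the hosting side mentioned above. The string comparison in step 2 is also simplistic for IPv6, where the same address can be written in several ways.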


Exactly, Yandex is to blame.

Simply because Yandex uses the same IPs for crawling as the ones it rents out for customer infrastructure…
So besides legitimate Yandex bots, some of the traffic from these IP addresses is malicious!

I convinced myself of this by reading my logs several times: their 2 ASNs are used for brute force and vulnerability scanning, and probably as a tool of the Russian secret services.

So if it is blocked there is a reason!

This is certainly not the normal behavior of a search engine bot
(see path as well as how many variations for IP)



And until they publicly announce the list of IPs that can be whitelisted, some traffic can be expected to be blocked (but they don’t have it because everything is mixed up there).

And if these two ASNs were allowed on the hosting,
you could expect the same as in my screenshots:
a lot of suspensions due to mass requests (bad bots).


Lucky you. I had to deal with Yandex back in the days before Cloudflare was a thing, and that was a hell of a burden!


The site is registered in Google Search Console, but its bots don’t come either.
Thanks for clarifying. I sent them your message, but I don’t rely on Yandex anymore.