C.M. Posted September 26, 2021

In robots.txt:

### SOURCE AND UPDATES: https://akinix.com/forums/topic/163-how-to-block-ahrefs-semrush-serpstat-majestic-seo-megaindex-and-similar-bots-for-competitive-intelligence/

# Alexa: https://support.alexa.com/hc/en-us/articles/200450194-Alexa-s-Web-and-Site-Audit-Crawlers
User-agent: ia_archiver
Disallow: /

# archive.org
User-agent: archive.org_bot
Disallow: /

# Ahrefs: https://ahrefs.com/robot
User-agent: AhrefsBot
Disallow: /

# MOZ: https://moz.com/help/moz-procedures/crawlers/rogerbot
User-agent: rogerbot
Disallow: /

# MOZ: https://moz.com/help/moz-procedures/crawlers/dotbot
User-agent: dotbot
Disallow: /

# DataForSeo https://dataforseo.com/dataforseo-bot
User-agent: DataForSeoBot
Disallow: /

# Semrush: https://www.semrush.com/bot/
User-agent: SemrushBot
Disallow: /

# Semrush bot for site audit
User-agent: SiteAuditBot
Disallow: /

# Semrush bot for brand monitoring
User-agent: SemrushBot-BA
Disallow: /

# Semrush bot for site improvement
User-agent: SemrushBot-SI
Disallow: /

# Semrush bot for site wide analysis
User-agent: SemrushBot-SWA
Disallow: /

# Semrush bot for content tracking
User-agent: SemrushBot-CT
Disallow: /

# Semrush bot for backlink monitoring
User-agent: SemrushBot-BM
Disallow: /

# SplitSignal bot, Semrush's tool for SEO testing
User-agent: SplitSignalBot
Disallow: /

# Majestic: https://mj12bot.com/
User-agent: MJ12bot
Disallow: /

# SerpStat: https://serpstatbot.com/
User-agent: serpstatbot
Disallow: /

# MegaIndex: https://ru.megaindex.com/blog/seo-bot-detection
User-agent: MegaIndexBot
Disallow: /

# SEO-PowerSuite-bot: https://www.link-assistant.com/seo-workflow/site-audit.html
User-agent: SEO-PowerSuite-bot
Disallow: /

# OpenLinkProfiler bot for backlink analysis http://OpenLinkProfiler.org/bot
User-agent: spbot
Disallow: /

# https://www.linkdex.com/
User-agent: linkdexbot
Disallow: /

# https://www.seozoom.it/bot/
User-agent: ZoomBot
Disallow: /

# seplinkbot bot for backlink analysis http://seplinkbot.com/
User-agent: seplinkbot
Disallow: /

# Linkpad bot for backlink analysis https://www.linkpad.ru
User-agent: LinkpadBot
Disallow: /

# RankActive bot for rank tracking https://rankactive.com/resources/rankactive-linkbot
User-agent: RankActiveLinkBot
Disallow: /

# AccuRanker https://www.accuranker.com/
User-agent: AccuRanker
Disallow: /

# SiteAnalyzer bot for website analysis https://www.site-analyzer.com/
User-agent: SiteAnalyzerBot
Disallow: /

# Similarweb bot for market intelligence https://support.similarweb.com/hc/en-us/articles/18091514382749-Similarweb-Bot
User-agent: similarweb
Disallow: /

# BLEXBot for SEO web crawling and data collection https://help.seranking.com/hc/en-us/articles/17126130916636-BLEXBot-Crawler
User-agent: BLEXBot
Disallow: /

# NetcraftSurveyAgent, https://www.netcraft.com/tools/
User-agent: NetcraftSurveyAgent
Disallow: /

User-agent: *
Disallow:

Keep in mind, though, that some services ignore the directives in robots.txt and keep crawling your site anyway. In those cases you need more effective measures: blocking by User-Agent and by IP. For example, to ban the Web Archive crawler, deny access to the site for the IP ranges 207.241.224.0/20 and 208.70.24.0/21. In the .htaccess file for an Apache server, the rule looks like this:

<RequireAll>
Require all granted
# Ban web.archive.org crawler, more info: https://akinix.com/forums/topic/163-how-to-block-ahrefs-semrush-serpstat-majestic-seo-megaindex-and-similar-bots-for-competitive-intelligence/
Require not ip 207.241.224.0/20
Require not ip 208.70.24.0/21
</RequireAll>
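Before deploying rules like these, it can help to check locally what they actually match. Below is a minimal Python sketch using only the standard library. It assumes a shortened excerpt of the robots.txt above; the helper names `robots_allows` and `ip_is_blocked`, the `example.com` URL and the sample addresses are made up for illustration. Note that `urllib.robotparser` expects a bot's product token (for example `AhrefsBot`), not a full User-Agent header, and that a bot ignoring robots.txt will not be stopped by it whatever this check reports.

```python
import ipaddress
from urllib.robotparser import RobotFileParser

# Shortened excerpt of the robots.txt rules from the post (not the full list).
ROBOTS_TXT = """\
User-agent: AhrefsBot
Disallow: /

User-agent: archive.org_bot
Disallow: /

User-agent: *
Disallow:
"""

# IP ranges the post lists for the Web Archive crawler.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("207.241.224.0/20"),
    ipaddress.ip_network("208.70.24.0/21"),
]


def robots_allows(robots_txt: str, agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


def ip_is_blocked(address: str) -> bool:
    """Return True if `address` falls inside one of the blocked CIDR ranges."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in BLOCKED_NETWORKS)


if __name__ == "__main__":
    url = "https://example.com/some/page"  # hypothetical URL
    for agent in ("AhrefsBot", "archive.org_bot", "Googlebot"):
        print(agent, "allowed:", robots_allows(ROBOTS_TXT, agent, url))
    for address in ("207.241.230.10", "208.70.31.255", "8.8.8.8"):
        print(address, "blocked:", ip_is_blocked(address))
```

The IP check only mirrors the CIDR arithmetic behind the `Require not ip` lines; it does not talk to Apache, so the real .htaccess still needs to be tested on the server.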