#/**
# * robots.txt
# *
# * bot control for http://clez.net
# * refer to 'Robots Exclusion Standard RFC4'
# * the use of crawlers or other automated tests on this host is forbidden for
# * any path defined by 'Disallow' directives.
# *
# * @tabstop 4
# */

# concerns any client application
User-agent: *
# no image indexing on assets
Disallow: /img/
# prevent the world from finding out the meaning of life
Disallow: *.oliver
# - i don't have them anyway, but as stated above, it's forbidden to crawl :)
Disallow: /MSADC/
Disallow: /MSOffice/
Disallow: /_vti_bin/
Disallow: /_mem_bin/
Disallow: /c/
Disallow: /d/
Disallow: /scripts/
Disallow: /%7e
Disallow: %7e

# special interest rules
User-agent: Googlebot-Image
User-agent: Mediapartners-Google*
User-agent: WebZIP
User-agent: WebTrends
Disallow: /

# bots i just dislike
User-agent: EmailCollector
User-agent: EmailSiphon
User-agent: EmailWolf
User-agent: WebEMailExtrac.*
User-agent: autoemailspider
User-agent: RPT-HTTPClient
Disallow: /

# archive.org
User-agent: ia_archiver
Disallow: /

User-agent: *
Sitemap: http://clez.net/sitemap.xml