Here is a list of the harvesters and search engines that ignore robots.txt.
IP address User agent
24.4.92.95 Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0) RPT-HTTPClient/0.3-3
61.172.211.68 Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)
61.179.12.123 Mozilla/3.0 (compatible; Indy Library)
62.48.74.61 Mozilla/3.0 (compatible; Indy Library)
63.148.99.229 Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)
63.148.99.234
63.148.99.236
63.148.99.238
63.148.99.239
63.148.99.240
63.148.99.244
63.148.99.245
63.148.99.247 Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)
63.148.99.253 Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)
64.140.49.69
64.246.0.17 Mozilla/4.0 (compatible; MSIE 5.01; Windows NT)
65.54.164.63 msnbot/0.11 (+http://search.msn.com/msnbot.htm)
65.102.0.16 Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2)
65.102.10.112 Mozilla/5.0 (Macintosh; U; PPC Mac OS X; en-US; rv:1.0.1) Gecko/20020823 Netscape/7.0
66.79.165.40 Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.1.4322; .NET CLR 1.0.3705)
66.79.165.50 Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.1.4322; .NET CLR 1.0.3705)
66.135.35.75 Mozilla/4.0 (compatible; MSIE 5.0; Windows NT; DigExt; DTS Agent
66.147.154.3 http://www.almaden.ibm.com/cs/crawler
66.207.118.206
67.15.82.60 Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.1.4322; .NET CLR 1.0.3705)
80.237.202.35 WWWeasel Robot v1.00 (http://wwweasel.de)
80.237.202.146 WWWeasel Robot v1.00 (http://wwweasel.de)
80.237.204.58 User-Agent: Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)
81.169.180.237
84.131.255.115
210.192.120.74
210.192.120.82
211.99.203.196 Mozilla/3.0 (compatible; Indy Library)
211.99.203.197
211.99.203.198
211.99.203.199
211.99.203.200
211.99.203.201
211.99.213.13
211.99.213.16
211.99.213.17
211.99.213.18
211.99.213.19
211.99.213.20
211.99.213.21
211.99.213.22
211.157.8.42
211.157.8.44
211.157.8.48
211.152.14.91
211.152.14.93
211.152.14.94
211.152.14.95
211.152.14.96
211.152.14.97
211.152.14.98
211.157.8.43
211.157.36.1 Mozilla/3.0 (compatible; Indy Library)
211.157.36.2 Mozilla/3.0 (compatible; Indy Library)
211.157.36.3 Mozilla/3.0 (compatible; Indy Library)
211.157.36.4 Mozilla/3.0 (compatible; Indy Library)
211.157.36.5 Mozilla/3.0 (compatible; Indy Library)
211.157.36.6 Mozilla/3.0 (compatible; Indy Library)
211.157.36.7 Mozilla/3.0 (compatible; Indy Library)
211.157.36.8 Mozilla/3.0 (compatible; Indy Library)
211.157.36.9 Mozilla/3.0 (compatible; Indy Library)
213.239.194.170 www.adressenonline.de
213.168.102.30 Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)
216.88.158.142 Mozilla/4.0 compatible ZyBorg/1.0 (wn.zyborg@looksmart.net; http://www.WISEnutbot.com)
217.20.17.34 Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 4.0)
217.69.120.66 Mozilla(IE Compatible)
217.172.172.126 curl/7.11.2 (i686-pc-linux-gnu) libcurl/7.10.2 OpenSSL/0.9.6i ipv6 zlib/1.1.4
217.237.171.232
217.237.171.233
217.237.171.234
217.237.171.235
217.237.171.236
217.237.171.237
217.237.171.238
217.237.171.239
63.148.99.237
81.2.103.246 Mozilla/4.0 (compatible; MSIE 5.0; Windows 98; DigExt)
210.75.206.15 Mozilla/3.0 (compatible; Indy Library)
217.237.44.56
193.99.144.71
216.65.117.228 Mozilla/4.0 (compatible; MSIE 5.01; Windows NT)
63.148.99.250
62.216.255.10 Mozilla/4.0 (compatible; MSIE 5.0; Windows 98; DigExt)
217.79.182.86
211.157.8.46
211.157.8.41
38.118.42.34
38.118.42.38
213.61.218.29 PhpDig/1.6.2 (PHP; MySql)
70.84.132.74 Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0;)
38.118.25.60
84.131.246.221
70.84.196.98 Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0;)
38.118.42.35
24.28.84.204 Mozilla/4.0 (compatible; grub-client-2.6.0)
217.172.186.195 Mozilla/4.0 (compatible; MSIE 6.0; Windows 98; Win 9x4.90)
193.254.187.72
66.17.15.138 Schmozilla/v9.14 Platinum
66.17.15.164 Schmozilla/v9.14 Platinum
128.138.124.27
64.246.28.28 wbdbot
81.180.251.250 MSIE 6.0
139.18.13.203 findlinks/0.913 (+http://wortschatz.uni-leipzig.de/findlinks/)
128.138.177.64
65.19.150.219
217.20.113.110 Mozilla/4.78 (Windows NT 5.1; U) Opera 7.21
193.254.187.74
128.138.177.248
84.0.191.128 Java/1.4.1_04
128.138.124.132
220.194.54.27
66.17.15.154 Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.50215)
64.92.201.114
66.132.132.63 Mozilla/4.0 (compatible; MSIE 5.01; Windows NT)
209.249.80.241 Clushbot/3.62-Laomedon (+http://www.clush.com/bot.html)
80.142.230.58 libwww-perl/5.801
81.92.6.7
66.36.230.12 Mozilla/4.0 (compatible; MSIE 5.01; Windows NT)
217.52.206.31
67.15.175.114 wbdbot
72.3.248.68
84.60.119.146
64.151.111.116 Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)
66.90.103.23
67.52.241.195
207.195.243.144
62.3.32.53
72.5.115.26 Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.7.7) NimbleCrawler 1.11 obeys UserAgent NimbleCrawler For problems contact: crawler_at_dataalchemy.com
84.149.248.65 Mozilla/4.0 (compatible; Win32; IWCS 3.2.250)
80.28.207.180
200.141.124.147
84.149.223.130 Mozilla/4.0 (compatible; Win32; IWCS 3.2.257)
80.77.86.240 Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)
70.112.77.82
129.175.81.73
216.255.189.226 Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)
68.87.64.117
81.168.228.218
196.40.43.218
62.156.146.2
83.14.66.206
218.189.215.178
82.234.138.4
217.76.144.121
219.170.4.85
220.5.80.193
69.42.74.75 Mozilla/4.0 (compatible; MSIE 6.0)
81.223.254.34 Mozilla/4.0 (compatible; MSIE 5.01; Windows NT)
82.165.176.62 EmeraldShield.com Web Spider (http://www.emeraldshield.com/webbot.aspx)
66.79.162.109
145.253.94.146
202.75.128.24
203.81.31.212
131.107.65.41 MSR-ISRCCrawler
87.106.223.68 Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)
213.217.54.99 Mozilla/4.0 (compatible; MSIE 5.0; Windows NT)
118.152.33.237 Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)
199.126.151.229 Java/1.6.0_11
207.210.81.156 Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 2.0.50727; .NET CLR 1.1.4322)
69.84.207.62 Mozilla/4.0 (compatible; MSIE 7.0;Windows NT 5.1;.NET CLR 1.1.4322;.NET CLR 2.0.50727;.NET CLR 3.0.04506.30)
74.52.45.250 Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; en) Opera 8.50
24.132.117.221 Java/1.6.0_12
84.171.66.206 Java/1.6.0_13
89.178.176.177 Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.50727)
192.129.3.46 MSIE 6.0
72.51.37.147 Mozilla/4.0 (compatible; MSIE 5.01; Windows NT)
89.122.224.52 Java/1.6.0_04
64.122.154.52 Java1.4.0_01
88.198.53.43
174.133.177.66 Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; en) Opera 8.50
192.114.71.13
66.36.230.78 Mozilla/4.0 (compatible; MSIE 5.01; Windows NT)
82.83.242.34 Mozilla/5.0 (Windows; U; Windows NT 6.0; de; rv:1.9.1.2) Gecko/20090729 Firefox/3.5.2
75.129.133.227 Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)
65.208.151.118
78.46.39.55
82.195.246.199 Microsoft URL Control - 6.00.8862
 
As the list shows, harvesters hunting for addresses use user agents that are indistinguishable from those of
ordinary website visitors. The list also includes genuine search engines that were too curious.
Most of the harvesters come from the Chinese region. The spam itself also mostly comes from Asia or America.
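
Because the user agent is no help in telling these clients apart, the IP address is the more useful key. The following is only a minimal sketch of that idea, not a method taken from this page: it parses a few entries copied from the list above, groups them into /24 networks, and checks whether a given address is listed. The names `SAMPLE`, `parse_entries`, `networks_by_count` and `is_listed` are hypothetical, and the /24 grouping is an assumption about how the adjacent addresses belong together.

```python
import ipaddress
from collections import Counter

# A few entries copied from the list above: IP address, then an optional user agent.
SAMPLE = """\
211.157.36.1 Mozilla/3.0 (compatible; Indy Library)
211.157.36.2 Mozilla/3.0 (compatible; Indy Library)
211.99.203.196 Mozilla/3.0 (compatible; Indy Library)
211.99.203.197
63.148.99.229 Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)
63.148.99.234
"""


def parse_entries(text):
    """Split each non-empty line into (ip, user_agent); user_agent may be ''."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        ip_text, _, agent = line.partition(" ")
        entries.append((ipaddress.ip_address(ip_text), agent.strip()))
    return entries


def networks_by_count(entries, prefix=24):
    """Count how many listed addresses fall into each network of the given prefix length."""
    counts = Counter()
    for ip, _agent in entries:
        # strict=False lets us pass a host address and get its enclosing network.
        network = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
        counts[network] += 1
    return counts


def is_listed(ip_text, entries):
    """Return True if the exact address appears in the parsed list."""
    ip = ipaddress.ip_address(ip_text)
    return any(ip == listed for listed, _agent in entries)


if __name__ == "__main__":
    entries = parse_entries(SAMPLE)
    for network, count in networks_by_count(entries).most_common():
        print(network, count)
```

Grouping by network makes the clusters in the list easier to see, but whether a whole network should be blocked is a judgment call, since ordinary visitors may share the same range.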
© Michael Raab