New Bookstore! Save 20% on books with discount code: LAUNCH
Web Dev + WordPress + Security

Yahoo! Once Again Caught Disobeying Robots.txt Rules

Hmmm.. Let’s see here. Google can do it. MSN/Live can do it. Even Ask can do it. So why oh why can’t Yahoo’s grubby Slurp crawler manage to adhere to robots.txt crawl directives? Just when I thought Yahoo! finally figured it out, I discover more Slurp tracks in my Blackhole trap for bad spiders:

IP:   74.6.17.163
Host: 163.17.6.74.in-addr.arpa llf520189.crawl.yahoo.net

[2008-07-30 (Wed) 16:51:59] "GET /blackhole/ HTTP/1.0"
Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)

OrgName:    Inktomi Corporation
OrgID:      INKT
Address:    701 First Ave
City:       Sunnyvale
StateProv:  CA
PostalCode: 94089
Country:    US

NetRange:   74.6.0.0 - 74.6.255.255
CIDR:       74.6.0.0/16

NetName:    INKTOMI-BLK-6
NetHandle:  NET-74-6-0-0-1
Parent:     NET-74-0-0-0-0
NetType:    Direct Allocation
NameServer: NS1.YAHOO.COM
NameServer: NS2.YAHOO.COM
NameServer: NS3.YAHOO.COM
NameServer: NS4.YAHOO.COM
NameServer: NS5.YAHOO.COM

RegDate:    2006-02-13
Updated:    2007-03-09

RAbuseHandle: NETWO857-ARIN
RAbuseName:   Network Abuse
RAbusePhone:  +1-408-349-3300
RAbuseEmail:  network-abuse@cc.yahoo-inc.com

OrgAbuseHandle: NETWO857-ARIN
OrgAbuseName:   Network Abuse
OrgAbusePhone:  +1-408-349-3300
OrgAbuseEmail:  network-abuse@cc.yahoo-inc.com

OrgTechHandle: NA258-ARIN
OrgTechName:   Netblock Admin
OrgTechPhone:  +1-408-349-3300
OrgTechEmail:  netblockadmin@yahoo-inc.com

# ARIN WHOIS database, last updated 2008-07-29 19:10

I enjoy the control provided by the robots.txt protocol and look forward to the day when all of the major search engines get it right.

Jeff Starr
About the Author
Jeff Starr = Web Developer. Book Author. Secretly Important.
USP Pro: Unlimited front-end forms for user-submitted posts and more.

2 responses to “Yahoo! Once Again Caught Disobeying Robots.txt Rules”

  1. This just shows that Google’s nearest competitor is just oh so far behind in the game.

  2. Jeff Starr
    Jeff Starr 2008/08/18 4:11 pm

    So true. I mean, how difficult is it for a computer to obey the operational rules it’s been given? I honestly can’t think of any excuse for Slurp’s recent deviancy into explicitly forbidden territory. Given that computers do not make “mistakes,” such illicit crawl behavior seems intentional. In either case, it’s not looking good for Yahoo.

Comments are closed for this post. Something to add? Let me know.
Welcome
Perishable Press is operated by Jeff Starr, a professional web developer and book author with two decades of experience. Here you will find posts about web development, WordPress, security, and more »
WP Themes In Depth: Build and sell awesome WordPress themes.
Thoughts
Take a screenshot with Firefox (no extension required). Open Developer Tools Settings and enable the “Take a screenshot” button. Then click the button :)
Take a screenshot with Chrome (no extension required). Open DevTools, type Cmd + Shift + P, then type screenshot.
After 10 years working on my 2010 iMac, my upgrade finally arrived. Shiny new iMac shipped from Ireland :)
Too much caffeine weirds me out. But I love the taste of coffee. So once in a while I enjoy a small cup of decaf. Hits the spot.
Chris Coyier is a truly awesome person. One of the finest people I've ever worked with. Just #gottasayit
Excel won't open CSV file because SYLK format? Open it with text editor and add an apostrophe ' at the beginning of the file, save changes, done.
Displaying too many social media buttons and links all over the place imho makes you look desperate and frankly kinda sad.
Newsletter
Get news, updates, deals & tips via email.
Email kept private. Easy unsubscribe anytime.