Robots.txt Disallow entries now must have leading slash
- Status
- Closed
- Subject
- Robots.txt Disallow entries now must have leading slash
- Version
- 1.8.x
1.9.x
2.x
- Category
- Error
- Feature
- All / Undefined
Installer (profiles, upgrades and server-related issues)
Search engine optimization (SEO)
- Resolution status
- Fixed or Solved
- Submitted by
- John Hadjisky
- Lastmod by
- Marc Laporte
- Rating
- Description
Although the RFC (for example, http://www.robotstxt.org/wc/norobots-rfc.html) doesn't explicitly require a leading slash (/) before the page name, I have found that, as of late October 2005, many bots, including Googlebot, have started requiring one.
For example, before the change,
Disallow: tiki-pagehistory.php
would prevent well-behaved 'bots from trying to index tiki-pagehistory.php. However, after the change, I had to have:
Disallow: /tiki-pagehistory.php
in robots.txt, or else all my page history would be indexed! I verified this using my server log, and also by running Google searches against my site for phrases that appear only in page history. I have every reason to believe this is a problem for all other TikiWiki-based sites.
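As an illustration of why a strict parser might skip the slashless rule, here is a minimal sketch using Python's standard-library `urllib.robotparser`, which matches rules by plain prefix comparison against the URL path. This is not Googlebot's actual code; the helper name `is_blocked` and the user agent `ExampleBot` are made up for the example.

```python
from urllib.robotparser import RobotFileParser


def is_blocked(disallow_value: str, url_path: str) -> bool:
    """Return True if a one-rule robots.txt blocks url_path for any bot."""
    parser = RobotFileParser()
    # Build a tiny robots.txt in memory instead of fetching one.
    parser.parse(["User-agent: *", f"Disallow: {disallow_value}"])
    # "ExampleBot" is a hypothetical user agent; it falls under "*".
    return not parser.can_fetch("ExampleBot", url_path)


# The requested path always starts with "/", so a rule without the
# leading slash never matches as a prefix.
without_slash = is_blocked("tiki-pagehistory.php", "/tiki-pagehistory.php")
with_slash = is_blocked("/tiki-pagehistory.php", "/tiki-pagehistory.php")
print("blocked without slash:", without_slash)
print("blocked with slash:", with_slash)
```

With this parser, only the rule written with the leading slash should report the page as blocked, which is consistent with the behaviour described above.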
Others have noticed this. There is discussion in the forums at:
- Googlebot ignoring robots.txt — includes a graph of my bandwidth usage before and after.
- Adding <meta name="robots" content="noindex,nofollow"> to History pages
- Yahoo search indexes Print pages instead of Read pages
- Files
- Solution
Putting a leading slash before all page references in robots.txt solved the problem. See http://ihuck.com/robots.txt (a TikiWiki site) and compare it with, for example, http://dupli.tikiwiki.org/robots.txt
- Change the robots.txt in the CVS so that there is a leading slash before all page references (there is already a leading slash before all vdir references)
- Post an article on tw.o urging existing users to change their robots.txt
- Change the robots.txt on all sites *.tikiwiki.org that use the standard robots.txt, for example, dupli.tw.o, doc.tw.o, probably others.
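For existing sites, the edit could be automated. The following is a rough sketch, not part of TikiWiki: the function name `add_leading_slashes` and the regular expression are assumptions made for illustration. It adds a leading slash to any Allow/Disallow value that lacks one and leaves every other line alone.

```python
import re

# Captures the directive prefix ("Disallow: ") and its value, if any.
RULE = re.compile(r"^(\s*(?:dis)?allow\s*:\s*)(\S.*)?$", re.IGNORECASE)


def add_leading_slashes(robots_txt: str) -> str:
    """Prefix Allow/Disallow values with "/" when it is missing."""
    fixed = []
    for line in robots_txt.splitlines():
        match = RULE.match(line)
        if match and match.group(2):
            value = match.group(2)
            # Leave values that already start with "/", a wildcard,
            # or a comment marker untouched.
            if not value.startswith(("/", "*", "#")):
                line = f"{match.group(1)}/{value}"
        fixed.append(line)
    return "\n".join(fixed) + "\n"


sample = (
    "User-agent: *\n"
    "Disallow: tiki-pagehistory.php\n"
    "Disallow: /img/\n"
    "Disallow:\n"
)
print(add_leading_slashes(sample), end="")
```

Entries that already have a slash and empty `Disallow:` lines pass through unchanged, so running it on an already-fixed file should be harmless.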
I have many years of web development experience, and more than two years of experience with PHP and TikiWiki, but almost no CVS experience. I'm happy to learn CVS and implement this solution, but first I am hoping for some feedback from the community on whether I have overlooked any reason not to make these changes. Thanks. Assign this back to me and I'll start working on the changes (except for the third change, which a *.tw.o admin will need to do).
- Importance
- 5
- Priority
- 25
- Ticket ID
- 442
- Created
- Wednesday 21 December, 2005 00:51:59 UTC
by Unknown
- LastModif
- Saturday 06 July, 2024 10:21:44 UTC