drive4ucomua (drive4ucomua) wrote,
drive4ucomua
drive4ucomua

Learn About Robots.txt with Interactive Examples

One of the things that excites me most about the development of the web is the growth in learning resources. When I went to college in 1998, it was exciting enough to be able to search journals, get access to thousands of dollars-worth of textbooks, and download open source software. These days, technologies lik Khan Academy iTunesU Treehouse n Codecademy ake that to another level.

I've been particularly excited by the possibilities for interactive learning we see coming out of places like Codecademy. It's obviously most suited to learning things that look like programming languages - where computers are naturally good at interpreting the "answer" - which got me thinking about what bits of online marketing look like that.

The kinds of things that computers are designed to interpret in our marketing world are:

  • Search queries particularly those that look more like programming constructs than natural language queries such as [site:distilled.net -inurl:www]
  • The on-site part o setting up analytics setting custom variables and events, adding virtual pageviews, modifying e-commerce tracking, and the like
  • Robots.txt syntax nd rules
  • HTML onstructs like links, meta page information, alt attributes, etc.
  • Skills lik Excel formulae hat many of us find a critical part of our day-to-day job

I've been gradually building out codecademy-style interactive learning environments for all of these things fo DistilledU, our online training platform, but most of them are only available to paying members. I thought it would make a nice start to 2013 to pull one of these modules out from behind the paywall and give it away to the SEOmoz community. I picked th robots.txt one ecause our in-app feedback is showing that it's one of the ones from which people learned the most.

Also, despite years of experience, I discovered some things I didn't know as I wrote this module (particularly about precedence of different rules and the interaction of wildcards with explicit rules). I'm hoping that it'll be useful to many of you as well - beginners and experts alike.

Interactive guide to Robots.txt

Robots.txt is a plain-text file found in the root of a domain (e.g. www.example.com/robots.txt). It is a widely-acknowledged standard and allows webmasters to control all kinds of automated consumption of their site, not just by search engines.

In addition to reading about th protocol, robots.txt is one of the more accessible areas of SEO since you can access any site's robots.txt. Once you have completed this module, you will find value in making sure you understand the robots.txt files of some large sites (for exampl Google n Amazon).

For each of the following sections, modify the text in the textareas and see them go green when you get the right answer.

Subscribe
  • Post a new comment

    Error

    Anonymous comments are disabled in this journal

    default userpic

    Your IP address will be recorded 

  • 0 comments