Schema.org is a structured-data markup vocabulary that you can add to your website’s code. With schema.org markup, you can tell Google which image on your site is your logo, where your reviews and videos are, what type of company you are, where you are located, and much more. Over the last year, Google has hinted that schema.org markup may eventually help your website rank better in Google search. Most recently, Google’s John Mueller said in a Google Hangout on Sept. 11 (at the 21:40 mark) that “over time, I think it [structured markup] is something that might go into the rankings as well.”
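To make the idea concrete, here is a minimal sketch of schema.org markup in the JSON-LD format, generated with Python. The company name, URLs, and location are made-up placeholders, and the helper function organization_jsonld is hypothetical; the property names (logo, address, addressLocality, addressCountry) come from the schema.org Organization and PostalAddress types.

```python
import json


def organization_jsonld(name, url, logo_url, locality, country):
    """Build a JSON-LD <script> tag describing a company (hypothetical helper)."""
    data = {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        "logo": logo_url,  # tells search engines which image is the logo
        "address": {
            "@type": "PostalAddress",
            "addressLocality": locality,
            "addressCountry": country,
        },
    }
    body = json.dumps(data, indent=2)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


# Placeholder values, for illustration only.
snippet = organization_jsonld(
    name="Example Co",
    url="https://example.com",
    logo_url="https://example.com/logo.png",
    locality="Springfield",
    country="US",
)
print(snippet)
```

The printed script tag would typically be pasted into the page's HTML. Reviews and videos have their own schema.org types and can be described the same way.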
To keep undesirable content out of the search indexes, webmasters can instruct spiders not to crawl certain files or directories through the standard robots.txt file in the root directory of the domain. Additionally, a page can be explicitly excluded from a search engine's database by using a meta tag specific to robots (usually <meta name="robots" content="noindex">). When a search engine visits a site, the robots.txt file in the root directory is the first file crawled. The file is then parsed, and it tells the robot which pages are not to be crawled. Because a search engine crawler may keep a cached copy of this file, it may on occasion crawl pages that a webmaster does not want crawled. Pages typically prevented from being crawled include login-specific pages such as shopping carts and user-specific content such as results from internal searches. In March 2007, Google warned webmasters that they should prevent indexing of internal search results because those pages are considered search spam.
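As a rough illustration of how a well-behaved crawler applies these rules, the sketch below parses a small robots.txt with Python's standard urllib.robotparser module and checks a few paths. The rules, the example.com domain, the ExampleBot user agent, and the is_allowed helper are all assumptions made up for this example.

```python
from urllib.robotparser import RobotFileParser

# A made-up robots.txt that blocks the shopping cart and internal search results.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cart/
Disallow: /search
"""

parser = RobotFileParser()
# parse() accepts the file's lines directly, so no network request is needed here.
parser.parse(ROBOTS_TXT.splitlines())


def is_allowed(path, agent="ExampleBot"):
    """Return True if the given user agent may crawl the path (hypothetical helper)."""
    return parser.can_fetch(agent, "https://example.com" + path)


for path in ("/products/widget", "/cart/checkout", "/search?q=shoes"):
    print(path, "->", "allowed" if is_allowed(path) else "blocked")
```

Note that robots.txt only asks crawlers not to fetch a page. Keeping an already discoverable page out of the index is the job of the robots meta tag mentioned above.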
Thanks for the great post. I am confused about idea #1, the one about Wikipedia dead links…it seems like you didn’t finish explaining what to do with the link once you’ve found it. You said to put the dead link into Ahrefs, and that this turned up a bunch of links for you to contact…but then what? What do you contact them about, and how do you get your page used as the link? I’m obviously not getting something 🙁