Yahoo! Slurp is Yahoo!'s web-indexing robot. The Yahoo! Slurp crawler collects documents from the Web to build a searchable index for search services that use the Yahoo! search engine. These documents are discovered and crawled because other web pages link to them.
As part of the crawling effort, the Yahoo! Slurp crawler will take robots.txt standards into account to ensure we do not crawl and index those pages that you would not like to have returned via Yahoo! Search Technology. If a page is disallowed to be crawled by robots.txt standards, it is neither considered for inclusion nor placed in the search engine's database.
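As a rough illustration of the robots.txt check described above, the following sketch uses Python's standard `urllib.robotparser` module to test whether a URL would be open to the crawler. The robots.txt content, the example.com URLs, and the helper name `slurp_may_fetch` are hypothetical, and the user-agent token `Slurp` is an assumption; this is not Yahoo!'s own implementation.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the paths are invented for illustration,
# and the "Slurp" user-agent token is an assumption.
ROBOTS_TXT = """\
User-agent: Slurp
Disallow: /private/

User-agent: *
Disallow:
"""


def slurp_may_fetch(robots_txt: str, url: str, agent: str = "Slurp") -> bool:
    """Return True if the given robots.txt rules allow `agent` to fetch `url`."""
    parser = RobotFileParser()
    # parse() takes the file's lines directly, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for url in ("http://example.com/index.html",
                "http://example.com/private/report.html"):
        print(url, slurp_may_fetch(ROBOTS_TXT, url))
```

A site owner could run a check like this against their own robots.txt before publishing it, to confirm that only the intended paths are blocked.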
Yahoo! Slurp follows HREF links. It does not follow SRC links. This means that Yahoo! Slurp does not retrieve or index individual frames referred to by SRC links.
Yahoo! Slurp has support for frames and makes an effort to crawl complex URLs such as those generated by forms, content generation systems, and dynamic page generation software.
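To make the HREF versus SRC distinction concrete, here is a minimal sketch that separates the two kinds of links on a page using Python's standard `html.parser` module. The class name `LinkSorter` and the sample markup are hypothetical; the sketch only shows which links a crawler that follows HREF alone would pick up, and it does not reproduce Yahoo! Slurp's actual logic.

```python
from html.parser import HTMLParser


class LinkSorter(HTMLParser):
    """Collect HREF and SRC attribute values into separate lists."""

    def __init__(self):
        super().__init__()
        self.href_links = []  # links an HREF-following crawler would follow
        self.src_links = []   # links such a crawler would skip

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases attribute names, so "HREF" arrives as "href".
        for name, value in attrs:
            if value is None:
                continue
            if name == "href":
                self.href_links.append(value)
            elif name == "src":
                self.src_links.append(value)


# Hypothetical page fragment used only for illustration.
SAMPLE = (
    '<a HREF="/about.html">About</a>'
    '<frame SRC="/nav.html">'
    '<img src="/logo.gif">'
)

sorter = LinkSorter()
sorter.feed(SAMPLE)
print("followed:", sorter.href_links)
print("skipped:", sorter.src_links)
```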
How can I prevent Yahoo! Slurp from following links from a particular page or archiving a copy of a page?
Yahoo! Slurp obeys the noindex meta-tag. If you place:
<META NAME="robots" CONTENT="noindex">
in the head of your web document, Yahoo! Slurp will retrieve the document, but it will not index the document or place it in the search engine's database.
| <META NAME="robots" CONTENT="noindex"> | Yahoo! Slurp will retrieve the document, but it will not index the document. |
| <META NAME="robots" CONTENT="nofollow"> | Yahoo! Slurp will not follow any links that are present on the page to other documents. |
| <META NAME="robots" CONTENT="noarchive"> | Yahoo! maintains a cache of all the documents that we fetch, to permit our users to access the content that we indexed (in the event that the original host of the content is inaccessible, or the content has changed). If you do not wish us to archive a document from your site, you can place this tag in the head of the document, and Yahoo! will not provide an archive copy (cache) for the document. |
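The three directives above can be read out of a page with a few lines of code. The sketch below, again using Python's standard `html.parser`, collects the values of any robots meta tag so that you can verify what a page tells crawlers. The names `RobotsMetaParser` and `robots_directives` are hypothetical, and the comma-separated form `noindex, nofollow` in the sample is an assumption about how several directives may be combined in one tag.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <META NAME="robots" CONTENT="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute names are already lowercased; values keep their case.
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() != "robots":
            return
        for token in attr.get("content", "").split(","):
            token = token.strip().lower()
            if token:
                self.directives.add(token)


def robots_directives(html: str) -> set:
    """Return the set of robots directives declared in the given HTML."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical page head used only for illustration.
PAGE = '<html><head><META NAME="robots" CONTENT="noindex, nofollow"></head></html>'

found = robots_directives(PAGE)
print("index this page:", "noindex" not in found)
print("follow its links:", "nofollow" not in found)
print("offer a cached copy:", "noarchive" not in found)
```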
Yahoo! Slurp indexes not only the title and meta tags, but also the full text of web pages. Including quality content in the page is therefore as important as including keywords in the title and meta tags.
When Yahoo! finds a page containing the searched keyword, it lists the page in its search engine results pages (SERPs), and the page's position depends on its content. A page that contains the keyword may still be left out, however, either because its content is poor or because Yahoo! could not find the page.
How can I attract Yahoo! Slurp to crawl my site?
There are three ways to attract the Yahoo! crawler to your site:
1. Get links from sites that Yahoo! Slurp already crawls regularly. Once you have them, Yahoo! Slurp visits and crawls your site regularly as well. Regular visits are a good sign and help a lot in getting a good ranking.
2. According to Yahoo!, browsing a site with the Yahoo! Companion toolbar triggers a visit from Yahoo! Slurp.
3. Use Site Match, the infamous paid-inclusion (PFI/PPC) program. This type of submission guarantees inclusion in the Yahoo! index, so there is no problem using it.