How to be Googled

Some strategies for ensuring websites are well represented in Google (and other search engines).
Category Managers need to set time aside to ensure their areas of the site are well represented in Google: regularly test the keywords their audiences use and the keywords they want to market, then tune pages and sites for better representation where possible. In areas with legacy content (non Rx, non jsp), link checking should also be routine. Although we are generally well represented in Google, considerable work is needed in several areas, especially Science and Nature Online, to enable Google to see the huge range of content we have.
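As a sketch of how such routine link checking might be automated (the script is illustrative, not an existing tool, and the URLs you feed it would be our own pages):

```python
# Minimal link checker: fetch a page, extract its hyperlinks, and
# report any that return an HTTP error. Illustrative sketch only.
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen
from urllib.error import URLError, HTTPError

class LinkExtractor(HTMLParser):
    """Collect the href of every <a> tag on a page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def check_links(page_url):
    """Return (link, error) pairs for every broken link on page_url."""
    html = urlopen(page_url).read().decode("utf-8", errors="replace")
    parser = LinkExtractor()
    parser.feed(html)
    broken = []
    for href in parser.links:
        target = urljoin(page_url, href)
        if not target.startswith("http"):
            continue  # skip mailto:, javascript:, anchors, etc.
        try:
            urlopen(target, timeout=10)
        except (HTTPError, URLError) as err:
            broken.append((target, str(err)))
    return broken
```

Run over a list of section home pages once a month, this would give Category Managers a simple broken-link report for their areas.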
The main (but not the only) relevance algorithm in Google, and now perhaps Yahoo, is based on links: a page gets a better ranking if more pages link to it. This matters most for pages that respond to popular keywords like "dinosaur", "museum" and "news". It is therefore important that our main pages are well interlinked on the site (which, in general, they already are), and that we encourage external sites (of sufficient quality, of course) to link to us.
Google dislikes long dynamic URLs, repetition of keywords, broken links and recursion (all explained further below).
For more specialised keywords (the "long tail" of interest that is one of the defining aspects of "Web 2.0"), relevance by links remains important, but for most topics it becomes very important to tune the relevant pages for Google. Again, no precise recipe is available, but there are several important aspects, teased out below.
Tools for testing:
Yahoo site explorer: https://siteexplorer.search.yahoo.com/ - shows that the NHM site has 88,000 inlinks (referring links) recorded.
It is also known that words used in hyperlinks to a page influence rankings of the target page considerably.
Sites and pages
There is a distinction to be made here between "sites" and "pages". A common mistake is to label all pages in a given site with the same general topic keyword. This is not useful. Only the home page of the site needs this general word, and just having the word in that one place improves that page's ranking and there is no dilution. The rule is: for pages within a site, keep your key words local and accurate to the page content.
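A quick way to apply that rule is to check, per page, that its own local keyword actually appears in the page title and body. The following is an assumed sketch (the keyword and HTML are examples, not real site content):

```python
# Sketch: check that a page's local keyword appears in its <title>
# and its body text. Keyword and markup below are illustrative.
from html.parser import HTMLParser

class PageText(HTMLParser):
    """Separate the <title> text from the rest of the page text."""
    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""
        self.body_text = []

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data
        else:
            self.body_text.append(data)

def keyword_coverage(html, keyword):
    """Report whether the keyword appears in the title and the body."""
    p = PageText()
    p.feed(html)
    kw = keyword.lower()
    return {
        "in_title": kw in p.title.lower(),
        "in_body": kw in " ".join(p.body_text).lower(),
    }
```

A page deep in a site should pass this check for its own specific keyword, not for the site's general topic word.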
Improving the ranking of pages in Dynamic Sites
Google has trouble with many dynamic (database driven) sites. One problem with such sites is that they can have so many ways to link pages together that too much indexing occurs. Also, if not well designed, recursion can occur (infinite looping of links that look slightly different to Google). Google has built-in mechanisms to avoid this, which means that unless the site is carefully programmed, Google will often give up and not index it. I have noticed that Google does not like too many "parameters" in a dynamic URL.
Google will also ignore sites with dynamic "session IDs" as these make the URLs transient. Google only cares about relatively permanent URLs.
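The two URL problems above can be audited automatically. A possible sketch (the parameter names and the threshold of three are assumptions, not documented Google limits):

```python
# Sketch: flag URLs a crawler is likely to skip, i.e. those with
# many query parameters or a session ID. Threshold and parameter
# names are assumptions, not documented limits.
from urllib.parse import urlparse, parse_qs

# Common session-ID parameter names (assumed list, extend as needed).
SESSION_PARAMS = {"sessionid", "jsessionid", "sid", "phpsessid"}

def url_warnings(url, max_params=3):
    """Return a list of warnings for a single dynamic URL."""
    params = parse_qs(urlparse(url).query)
    warnings = []
    if len(params) > max_params:
        warnings.append(f"{len(params)} parameters (over {max_params})")
    if SESSION_PARAMS & {k.lower() for k in params}:
        warnings.append("contains a session ID")
    return warnings
```

Running this over the URLs in a site's own link structure would highlight the pages Google is least likely to reach.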
Tactic for dynamic sites: provide simple browse pages with simple URLs to the detail pages of all dynamic sites we want in Google.
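One way such a browse page might be generated (the records and the URL pattern here are hypothetical, standing in for whatever the dynamic site actually uses):

```python
# Sketch: emit a simple static browse page linking to the detail
# pages of a dynamic site, so a crawler can reach them via plain,
# stable URLs. Records and URL pattern are hypothetical examples.
from html import escape

def browse_page(records, url_pattern="/specimen.jsp?id={id}"):
    """Return HTML listing one link per record."""
    items = "\n".join(
        '<li><a href="{0}">{1}</a></li>'.format(
            escape(url_pattern.format(id=r["id"])),
            escape(r["title"]),
        )
        for r in records
    )
    return "<html><body><ul>\n{0}\n</ul></body></html>".format(items)
```

Regenerating this page whenever the database changes keeps the crawl path current, and the link text doubles as keyword-rich anchor text for the detail pages.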
Returning to the Site/Page distinction: sometimes the index or home page of a site or subsection contains all the relevant information, so if this page is returned for the appropriate keyword, other pages do not need "tuning".
If you want to market a page, think about its topic: what are the key words or phrases? Tune the page (as above) to those, then track it once a month for several months in the Google rankings.
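The monthly tracking can be as simple as a log file. As a minimal sketch (ranks would be entered by hand after a manual Google search; the file name is an example):

```python
# Sketch: keep a monthly log of observed Google rank per keyword.
# Ranks are entered by hand after a manual search; no scraping.
import csv
from datetime import date

def record_rank(path, keyword, rank, when=None):
    """Append one (date, keyword, rank) observation to a CSV log."""
    when = when or date.today().isoformat()
    with open(path, "a", newline="") as f:
        csv.writer(f).writerow([when, keyword, rank])

def rank_history(path, keyword):
    """Return the (date, rank) series for one keyword."""
    with open(path, newline="") as f:
        return [(d, int(r)) for d, k, r in csv.reader(f) if k == keyword]
```

A falling rank number over several months shows the tuning is working; a rising one flags the page for another look.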
Using a mixture of the techniques described above for both static and dynamic content, everything we want to expose to Google can be exposed properly. Quite a bit of this work has already been done and, generally, we are well represented in Google.