Search engine optimization
Search engine optimization (SEO) is a set of methods aimed at improving the ranking of a website in search engine listings. The term also refers to an industry of consultants that carry out optimization projects on behalf of clients' sites.
Overview
Using search engines, visitors can find sites in a variety of ways: via paid-for advertisements in the search engine results pages (SERPs), via third parties who are listed in the search engines, or via "organic" listings, i.e. the unpaid results the search engines present to users. SEO is primarily concerned with improving the visibility of a site in the organic search results.
High rankings in the organic search results can provide targeted traffic for a site. Obtaining that traffic by other means can be expensive. For moderately competitive terms, the cost per click of pay-per-click or banner advertising can reach several dollars or more. Given those costs, it often makes sense for site owners to optimize their sites for organic search.
Not all sites have identical goals in mind when they optimize for search engines. Some sites seek any and all traffic, and may be optimized to rank highly for common search phrases. This can be a poor marketing strategy for a business because it can generate a large volume of low-quality inquiries that cost money to handle, yet result in little business. The "shotgun approach" to search optimization can work well for a site that has broad interest, such as a periodical, a directory, or a site that displays advertising with a CPM revenue model.
Other sites target a specific population, with particular needs or interests. Many businesses try to optimize their sites for large numbers of highly specific keywords that indicate a prospective customer who is ready to buy their product. Focusing on targeted traffic can generate more high-quality sales leads, and fewer time-wasting inquiries.
Factors that Google considers
The following are some of the considerations for search included in Google patents:[1]
- Age of site
- Length of time domain has been registered
- Age of content
- Regularity with which new content is added
- Age of link and reputation of linking site
- Standard on-site factors
- Negative scoring for on-site factors (for example, a dampening for sites with extensive keyword meta tags indicative of having been SEO-ed)
- Uniqueness of content
- Related terms used in content (the terms the search engine associates as being related to the main content of the page)
- Google PageRank (used only in Google's algorithm)
- External links, the anchor text in those external links and in the sites/pages containing those links
- Citations and research sources (indicating the content is of research quality)
- Stem-related terms in the search engine's database (finance/financing)
- Incoming backlinks and anchor text of incoming backlinks
- Negative scoring for some incoming backlinks (perhaps those coming from low value pages, reciprocated backlinks, etc.)
- Rate of acquisition of backlinks: too many too fast could indicate "unnatural" link buying activity
- Text surrounding outward links and incoming backlinks. A link following the words "Sponsored Links" could be ignored
- Use of "rel=nofollow" to suggest that the search engine should ignore the link
- Depth of document in site
- Metrics collected from other sources, such as monitoring how frequently users hit the back button when SERPs send them to a particular page
- Metrics collected from sources like the Google Toolbar, the Google AdWords and Google AdSense programs, etc.
- Metrics collected in data-sharing arrangements with third parties (like providers of statistical programs used to monitor site traffic)
- Rate of removal of incoming links to the site
- Use of sub-domains, use of keywords in sub-domains and volume of content on sub-domains… and negative scoring for such activity
- Semantic connections of hosted documents
- Rate of document addition or change
- IP of hosting service and the number/quality of other sites hosted on that IP
- Other affiliations of linking site with the linked site (do they share an IP? have a common postal address on the "contact us" page?)
- Technical matters like use of 301 to redirect moved pages, showing a 404 server header rather than a 200 server header for pages that don't exist, proper use of robots.txt
- Hosting uptime
- Whether the site serves different content to different categories of users (cloaking)
- Broken outgoing links not rectified promptly
- Unsafe or illegal content
- Quality of HTML coding, presence of coding errors
- Actual click through rates observed by the search engines for listings displayed on their SERPs
- Hand ranking by humans of the most frequently accessed SERPs
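Several of the link-related factors above (anchor text, and the use of rel=nofollow) depend on data that can be read directly from a page's markup. The following is a minimal sketch of that extraction step only, using Python's standard html.parser module; the class name LinkCollector and the helper collect_links are hypothetical names chosen for illustration, and nothing here reflects how any search engine actually weighs these signals.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects (href, anchor_text, nofollow) for each <a> element."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._current = None  # [href, text_parts, nofollow] while inside an <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            attr = dict(attrs)
            # rel may hold several space-separated tokens, or be absent
            rel_tokens = (attr.get("rel") or "").lower().split()
            self._current = [attr.get("href"), [], "nofollow" in rel_tokens]

    def handle_data(self, data):
        # Text between <a> and </a> is the anchor text
        if self._current is not None:
            self._current[1].append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._current is not None:
            href, parts, nofollow = self._current
            self.links.append((href, "".join(parts).strip(), nofollow))
            self._current = None


def collect_links(html):
    """Return a list of (href, anchor_text, nofollow) tuples found in html."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links
```

Calling collect_links on a page's HTML gives one tuple per link, with the third element set to True when the link carries rel=nofollow.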
The relationship between SEO and the search engines
In the early 2000s, search engines and SEO firms attempted to establish an unofficial "truce." There are several tiers of SEO firms, and the more reputable companies employ content-based optimizations which meet with the search engines' (reluctant) approval. These techniques include improvements to site navigation and copywriting, designed to make websites more intelligible to search engine algorithms.
Getting discovered by search engines
New sites no longer need to be submitted to search engines to be listed. A simple link from an established site will get the search engines to visit the new site and spider its contents. It is rarely more than a few days from the acquisition of the link to all the main search engine spiders visiting and indexing the new site.
Naturally, this means that it is good practice to have some means (such as a site map, or plain hypertext links) so that once a spider finds part of a site, it can navigate to the rest. Otherwise, individual, isolated, dead-end pages must be found one-by-one from outside the site; any pages that are not linked to from outside can only be found by links internal to the site.
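The point about isolated pages can be illustrated with a small reachability check. This is a sketch under simplifying assumptions: the site is represented as a plain dictionary mapping each page to the pages it links to, and the function names reachable_pages and orphan_pages are hypothetical. A real spider would also have to fetch and parse each page.

```python
from collections import deque


def reachable_pages(links, start):
    """Return the set of pages a spider can reach from start by following links."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, ()):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


def orphan_pages(links, start):
    """Return pages known to the site map that cannot be reached from start."""
    all_pages = set(links)
    for targets in links.values():
        all_pages.update(targets)
    return all_pages - reachable_pages(links, start)


# Hypothetical site: "/old-promo" links out, but nothing links to it.
site = {
    "/": ["/about", "/products"],
    "/about": [],
    "/products": ["/products/widget"],
    "/old-promo": ["/"],
}
print(orphan_pages(site, "/"))
```

Any page reported by orphan_pages would have to be discovered from outside the site, which is the situation the paragraph above advises avoiding.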
White hat versus black hat techniques
SEO techniques can be classified into two broad categories: techniques that search engines recommend as part of good design, and techniques of which search engines do not approve. The search engines attempt to minimize the effect of the latter. Industry commentators have classified these methods, and the practitioners who employ them, as either white hat SEO or black hat SEO.[2] White hats tend to produce results that last a long time, whereas black hats anticipate that their sites may eventually be banned, either temporarily or permanently, once the search engines discover what they are doing.[3]
An SEO technique is considered white hat if it conforms to the search engines' guidelines and involves no deception. As the search engine guidelines[4][5][6] are not written as a series of rules or commandments, this is an important distinction to note. White hat SEO is not just about following guidelines, but is about ensuring that the content a search engine indexes and subsequently ranks is the same content a user will see. White hat advice is generally summed up as creating content for users, not for search engines, and then making that content easily accessible to the spiders, rather than attempting to trick the algorithm from its intended purpose. White hat SEO is in many ways similar to web development that promotes accessibility,[7] although the two are not identical.
Black hat SEO attempts to improve rankings in ways that the search engines disapprove of, or that involve deception. One black hat technique uses hidden text, either colored to match the background, placed in an invisible div, or positioned off screen. Another method serves a different page depending on whether the request comes from a human visitor or a search engine, a technique known as cloaking.
Search engines may penalize sites they discover using black hat methods, either by reducing their rankings or eliminating their listings from their databases altogether. Such penalties can be applied either automatically by the search engines' algorithms, or by a manual site review. One example was the February 2006 Google removal of both BMW Germany and Ricoh Germany for use of deceptive practices.[8] Both companies, however, quickly apologized, fixed the offending pages, and were restored to Google's list.[9]
Ethical methods
So-called "ethical" (also known as "white hat") methods of SEO involve following the search engines' guidelines as to what is and what isn't acceptable. Their advice generally is to create content for the user, not the search engines; to make that content easily accessible to their spiders; and to not try to game their system. Often webmasters make critical mistakes when designing or setting up their web sites, and "poison" them so that they will not rank well. Ethical SEO attempts to discover and correct mistakes, such as machine-unreadable menus, broken links, temporary redirects, or a generally poor navigation structure that places pages too many clicks from the home page.
Because search engines are text-centric, many of the same methods that are useful for web accessibility are also advantageous for SEO. Methods are available for optimizing graphical content, even Flash animation (by placing a paragraph or division within, and at the end of the enclosing OBJECT tag), so that search engines can interpret the information.
Some methods considered ethical by the search engines:
- Using a robots.txt file to grant permissions to spiders to access, or avoid, specific files and directories in the site
- Using a short and relevant page title to name each page
- Using a reasonably sized description meta tag without excessive use of keywords, exclamation marks or off topic comments
- Keeping the page accessible via links from other pages on the site and, preferably, from a sitemap
- Developing links via natural methods: Google doesn't elaborate on this somewhat vague guideline, but buying a link from an off-topic page purely because it has a high PageRank is probably not considered acceptable. Dropping an email to a fellow webmaster telling him about a great article you've just posted, and requesting a link, is most likely acceptable.
- Using a clean CSS layout with the content near the top of the source
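The robots.txt item in the list above can be checked with Python's standard urllib.robotparser module. The sketch below is illustrative only: the rules in ROBOTS_TXT, the spider name "ExampleBot", the example.com URLs, and the helper can_crawl are all assumptions made for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: every spider may crawl the site except /private/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def can_crawl(robots_txt, user_agent, url):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


print(can_crawl(ROBOTS_TXT, "ExampleBot", "http://example.com/index.html"))
print(can_crawl(ROBOTS_TXT, "ExampleBot", "http://example.com/private/report.html"))
```

Running a check like this before publishing a robots.txt file is one way to confirm that it excludes only the directories intended.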
Unethical methods
As search engines operate in a highly automated way, it is often possible for webmasters to use methods and tactics not approved by search engines to gain better ranking. These methods often go unnoticed unless an employee from the search engine manually visits the site and notices the activity, or a change in ranking algorithm causes the site to lose the advantage thus gained. Sometimes a company will employ an SEO consultant to evaluate competitors' sites, and report "unethical" optimization methods to the search engines.
So-called "unethical" methods may include:
Keyword spamming (or keyword stuffing) involves the insertion of hidden, random text on a webpage to raise the keyword density, or ratio of keywords to other words on the page. Text can be hidden from the visitor's screen in many different ways. A popular technique is text colored to blend with the background. Using CSS "Z" positioning to place text "behind" an image, and therefore out of view of the visitor, is also common. Other ways include using CSS absolute positioning to place the text far from the page center and, again, out of the visitor's view, while it remains plain text that any search engine would pick up in a crawl of the page. As of 2005, invisible text is a bad idea because the top search engines can apparently detect it.
Abusing NOSCRIPT tags is another way to place hidden content within a page so that the search engines will index it, but the visitor won't see the content. NOSCRIPT tags are also a valid optimization method for displaying an alternative representation of JavaScript content, such as dynamic methods. The NOSCRIPT tag is not unethical in itself, only when misused.
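Keyword density is simple to measure. The sketch below takes density to be the share of all words on the page that match a given keyword, which is one reading of the ratio described above; the function name keyword_density and the tokenizing rule (lowercase runs of letters, digits and apostrophes) are assumptions made for illustration.

```python
import re


def keyword_density(text, keyword):
    """Fraction of words in text that equal keyword (case-insensitive)."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return 0.0
    return words.count(keyword.lower()) / len(words)


page_text = "Cheap widgets. Buy cheap widgets here, cheap!"
print(round(keyword_density(page_text, "cheap"), 3))
```

In this sample, "cheap" accounts for three of the seven words. An unusually high value for a single term is the kind of pattern that stuffing produces.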
The inserted text sometimes includes words that are frequently searched (such as "sex") even if those terms bear little connection to the content of the page. The goal in these cases is plainly to increase traffic at all costs whether that traffic is relevant or not. Once traffic comes to the page, the unethical webmaster may hope to monetize the traffic by displaying ads.
Spamdexing is the promotion of irrelevant, chiefly commercial, pages through abuse of the search algorithms. Many search engine administrators consider any form of search engine optimization used to improve a website's page rank as spamdexing. However, over time a widespread consensus has developed in the industry as to what are and are not acceptable means of boosting one's search engine placement and resultant traffic.
Cloaking refers to any of several means to serve up a different page to the search-engine spider than will be seen by human users. It can be an attempt to mislead search engines regarding the content on a particular web site. It should be noted, however, that cloaking can also be used to ethically increase accessibility of a site to users with disabilities, or to provide human users with content that search engines aren't able to process or parse. It is also used to deliver content based on a user's location; Google themselves use IP delivery, a form of cloaking, to deliver results.
Link spam is the placing or solicitation of links randomly on other sites, placing a desired keyword into the hyperlinked text of the inbound link. Guest books, forums, blogs and any site that accepts visitors' comments are particular targets and are often victims of drive-by spamming, where automated software creates nonsense posts with links that are usually irrelevant and unwanted.
The following techniques are also widely acknowledged as being spam, or "black hat":
- Mirror sites
- Doorway pages
- Link farms
Some SEOs argue that the terms ethical and unethical should not be applied to the work they do. They maintain that, on the principle of basic freedom, everybody should be free to post whatever they choose on a site they own, as long as they stay within the law. The responsibility to block search engines' access to that content is not one the webmaster should automatically assume. These SEOs point out that search engines typically visit sites uninvited and help themselves to the entire content of the site. If the search engine then applies software to "digest" that content and uses it in its search results (often monetized with its own advertising), then pinning an "unethical" label on the webmaster is neither fair nor accurate. The flip side is that when a webmaster submits a site to a search engine, he is actually inviting the search engine over. Nowadays, however, the invitation is unnecessary, as search engine spiders are aggressive in finding links to new pages and in crawling that new content, often within hours or minutes, unless they have specifically been excluded by a webmaster-prepared robots.txt file or a robots exclusion meta tag.
References
- ↑ The Google search patent [1]
- ↑ Andrew Goodman. "Search Engine Showdown: Black hats vs. White hats at SES". SearchEngineWatch. http://searchenginewatch.com/showPage.html?page=3483941. Retrieved May 9, 2007.
- ↑ Jill Whalen (November 16, 2004). "Black Hat/White Hat Search Engine Optimization". searchengineguide.com. http://www.searchengineguide.com/whalen/2004/1116_jw1.html. Retrieved May 9, 2007.
- ↑ "Google's Guidelines on Site Design". google.com. http://www.google.com/webmasters/guidelines.html. Retrieved April 18, 2007.
- ↑ "Guidelines for Successful Indexing". bing.com. http://onlinehelp.microsoft.com/en-us/bing/hh204434.aspx. Retrieved September 7, 2011.
- ↑ "What's an SEO? Does Google recommend working with companies that offer to make my site Google-friendly?". google.com. http://www.google.com/webmasters/seo.html. Retrieved April 18, 2007.
- ↑ Andy Hagans (November 8, 2005). "High Accessibility Is Effective Search Engine Optimization". A List Apart. http://alistapart.com/articles/accessibilityseo. Retrieved May 9, 2007.
- ↑ Matt Cutts (February 4, 2006). "Ramping up on international webspam". mattcutts.com/blog. http://www.mattcutts.com/blog/ramping-up-on-international-webspam/. Retrieved May 9, 2007.
- ↑ Matt Cutts (February 7, 2006). "Recent reinclusions". mattcutts.com/blog. http://www.mattcutts.com/blog/recent-reinclusions/. Retrieved May 9, 2007.