Some people think of search engine optimisation as something mysterious, even voodoo. In fact, SEO can be easy to master once you understand the concepts. There are a number of things you must do to your website for it to be seen as search engine friendly.
In this post I will try to explain what goes into “on-site SEO” and identify some basic elements of what makes the code of your site friendly to the search engines.
The basic checks, in no particular order, include:
It is widely accepted that pages which have their target keyword(s) within their title tag often get a rankings boost. The title tag is used by the search engines to help determine what the page content is about. Making sure the title tag contains relevant keywords helps to tie the page content to the user’s search query, and helps the search engine to rank your page.
It is also important for the keywords to be in a prominent position at the beginning of the title tag. When the user performs a search, any matching keywords will be bolded in the SERPs, which is thought to help improve click-through rates (CTR).
It is important that the title tag is the correct length. Roughly 54 characters, or as many characters as will fit into a width of 512 pixels, seems to be commonly accepted as the “best practice” length. There are two main reasons for this. Firstly, if the title is too long, the effectiveness of the keyword targeting is diluted. Secondly, there is a limit to the number of characters Google will display in the SERPs; if the title is too long, the search engines will display an ellipsis (“…”) to indicate that the title has been cut short.
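The title checks above can be sketched as a small Python helper. This is only an illustration: the function name `check_title` and the sample title are made up, the 54-character limit is the guideline quoted above, and the 512-pixel width is not measured because that depends on the font used in the SERPs.

```python
# Guideline length quoted in this post; pixel width is NOT checked here.
MAX_TITLE_CHARS = 54


def check_title(title: str, keyword: str, max_chars: int = MAX_TITLE_CHARS) -> dict:
    """Report the title length and where the target keyword first appears."""
    position = title.lower().find(keyword.lower())
    return {
        "length": len(title),
        "too_long": len(title) > max_chars,
        "keyword_present": position != -1,
        "keyword_at_start": position == 0,
    }


# Hypothetical title, used purely as sample input.
report = check_title("Web Development Derby | The Web Design Group", "web development")
print(report)
```

A title that fails `keyword_at_start` is not necessarily wrong, but it is worth a second look.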
Moz have a Title Tag Preview Tool available here: http://moz.com/blog/new-title-tag-guidelines-preview-tool that can help gauge how your title tags will display in search.
Having keywords in the H1 heading of the page was once considered one of the most important factors of on-site SEO, and was associated with high rankings for many years. A lot of SEO specialists still believe that it plays a part in correctly optimising a page for the search engines.
Having a keyword-rich URL structure is also one of the basic SEO factors you should be implementing where possible. For new pages I would recommend adding a keyword to the URL; for existing pages it can be difficult to edit URLs in some instances.
It would seem fairly obvious that your pages should contain the keywords you are hoping to rank for within the physical copy on the page, but sometimes this is overlooked, especially if a copywriter has written the content or the company is unwilling to alter their copy.
It is vital that the keywords are present in the page content.
Keyword density has been the subject of debate in the SEO community for many years. The central question is “does it really have an impact on rankings?” In my opinion it is still important, because it is a good indicator of the quality of the page copy, how relevant it is to a search query and how spammy it is.
Keyword density is the number of times a keyword or phrase appears within the page copy, expressed as a percentage of the total word count.
If the percentage is too high, the copy may not read very well and the search engines may determine that the keywords have been forced into the content in an attempt to game the system and improve the rankings. In my experience the ideal percentage is around 3% – 5%.
Here is a useful tool, which can help you determine the density of your keywords: http://tools.seobook.com/general/keyword-density/.
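If you would rather check density yourself, the definition above can be sketched in a few lines of Python. The function name `keyword_density` and the sample copy are my own, and the way words are split is an assumption; other tools may count words and phrases slightly differently, so treat the result as a rough guide.

```python
import re


def keyword_density(copy: str, phrase: str) -> float:
    """Occurrences of `phrase` as a percentage of the total word count."""
    words = re.findall(r"[\w'-]+", copy.lower())
    target = re.findall(r"[\w'-]+", phrase.lower())
    if not words or not target:
        return 0.0
    size = len(target)
    # Count every position where the phrase matches word for word.
    hits = sum(
        1 for i in range(len(words) - size + 1) if words[i:i + size] == target
    )
    return 100.0 * hits / len(words)


# Made-up sample copy, for illustration only.
sample = "We build websites. Our web design team loves web design."
print(round(keyword_density(sample, "web design"), 1))
```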
The number of words on a webpage is important. Google has started to penalise pages it sees as being “thin” in content, that is, pages with very little substance. There is no “correct” number of words that should be included on a page, and it is a common misconception that pages should have a certain number.
Use as many words as are required to explain a topic; trying to hit a mythical number of words per page may result in content which “drones on”. Always write copy with the users in mind: it should be easy to skim through and not too long (yes, I know this post is going to be quite long).
The image “Alt” attribute is used by web browsers to display a short description of an image should it fail to load. This “alternate text” is used to meet accessibility guidelines. However, it has long been recognised that the Alt attribute also helps the search engines determine the topic of a page, and as a result the SEO industry has been using it for many years. Alt attributes are still an important factor in on-site SEO, and keywords should be used where possible, in a natural way, to help explain the subject of the image.
The W3C guidelines for the use of the alt attribute are available here for further reading: http://www.w3.org/TR/WCAG20-TECHS/H37.html.
The Meta Keywords tag was removed as a ranking factor from the Google search engine in September 2009; however, it is still used by other search engines to help determine the topic of a web page. In my opinion it should still be used to cater for these search engines, and it won’t harm your rankings in Google.
Search queries are generally made up of multiple words, and this is especially true for “long-tail” keywords. Keyword proximity refers to the distance, within the copy on the page, between the individual keywords that make up the search query.
For example, take the query “web development derby” and the sentence “The Web Design Group are a web design and development agency based in Derby”.
The proximity of the keywords “web” and “development” is two, because there are two words between their closest occurrences (“design” and “and”). The proximity of “development” and “Derby” is three.
The lower the proximity, the more relevant the copy appears to the search engines. Proximity is something to consider when writing copy for your web pages.
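The proximity count from the example can be reproduced with a short Python sketch. The function name `keyword_proximity` is my own, and taking the closest pair of occurrences is an assumption about how ties between repeated words should be handled.

```python
import re
from typing import Optional


def keyword_proximity(copy: str, first: str, second: str) -> Optional[int]:
    """Smallest number of words sitting between `first` and `second`."""
    words = re.findall(r"\w+", copy.lower())
    first_positions = [i for i, w in enumerate(words) if w == first.lower()]
    second_positions = [i for i, w in enumerate(words) if w == second.lower()]
    gaps = [
        abs(a - b) - 1
        for a in first_positions
        for b in second_positions
        if a != b
    ]
    # None means at least one of the keywords is missing from the copy.
    return min(gaps) if gaps else None


sentence = "The Web Design Group are a web design and development agency based in Derby"
print(keyword_proximity(sentence, "web", "development"))    # 2
print(keyword_proximity(sentence, "development", "Derby"))  # 3
```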
Proofreading the copy on your website is essential, as it helps prevent spelling and grammatical errors.
Google has started to clamp down on poor spelling and grammar in its Panda algorithm. By proofreading, you are making sure your content doesn’t get marked down by the search engines.
It is important to give your pages unique Meta Descriptions and Title Tags, to avoid duplication penalties from Google.
Since Google released its Panda algorithm, having unique page titles has become increasingly important. Duplicated Meta Descriptions and Title Tags are reported in Google Webmaster Tools as HTML errors.
Outbound links from one site to another are seen as a “vote” for the destination site. The originating website passes “link juice” (ranking power) through these links to the destination site.
It is commonly accepted that linking out to high quality sources within your content can help your website become a trusted resource.
If you are confident that the site you are linking to is of sufficient quality and is RELEVANT to your own site, then you shouldn’t need to worry about adding the NoFollow attribute to the outbound link.
However, if you aren’t sure about the quality of the site you are referencing in your content, it is advisable to add the NoFollow attribute (rel="nofollow") to the anchor tag.
Moz dives deeper into the positive reasons for outbound linking in this article: http://moz.com/blog/5-reasons-you-should-link-out-to-others-from-your-website.
Linking internally is important for SEO, in a similar way to building links to your most important pages from external websites. Linking related sub pages to the most important pages on your site has been recognised as having benefits for SEO.
Google reports on the number of internal links to your pages in Webmaster Tools, and has confirmed this as a ranking signal in this post: https://support.google.com/webmasters/answer/138752?hl=en.
404 errors, or missing pages, are bad for SEO. They are reported by the search engines when they try to crawl a link to a page which no longer exists. This is really common for websites which have been redesigned or where old content has been taken down. Google reports these within the “crawl errors” section in Webmaster Tools. It is highly important that these are fixed, both for usability and for SEO.
The most common way to fix these is to implement a 301 permanent redirect to the new version of the content. This tells the search engines that the old page content has been moved and that the new URL should be indexed instead.
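How you set up a 301 depends on your server or CMS, but the idea can be sketched as a minimal Python WSGI application. Everything named here is hypothetical: the `REDIRECTS` mapping, the paths in it and the `app` function are made up for illustration, and a real site would serve its normal pages rather than returning a 404 for everything else.

```python
# Hypothetical mapping of retired URLs to their replacements.
REDIRECTS = {
    "/old-page.php": "/new-page/",
}


def app(environ, start_response):
    """Answer moved pages with a 301 and anything else with a 404."""
    path = environ.get("PATH_INFO", "/")
    if path in REDIRECTS:
        # The Location header tells browsers and spiders where the content now lives.
        start_response("301 Moved Permanently", [("Location", REDIRECTS[path])])
        return [b""]
    start_response("404 Not Found", [("Content-Type", "text/plain; charset=utf-8")])
    return [b"Page not found"]
```

To try it locally you could pass `app` to `wsgiref.simple_server.make_server` from the Python standard library.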
The canonical link tag is a great way to help control duplicate content on your website. It tells Google which version of a URL should be indexed.
In this instance there may be two versions of /index.php in the search engine index, which may be treated as duplicate content.
To avoid this, the canonical link should look like this:
<link rel="canonical" href="http://yourdomain.com/index.php" />
The canonical link is especially useful for ecommerce websites where product category pages may be paginated. Read more about canonical URLs in this post: https://support.google.com/webmasters/answer/139066?hl=en.
The robots.txt file is extremely important for SEO. It provides instructions to the search engine robots on how to crawl and index (or not, as the case may be) certain areas of your website. There are certain commands which are used to control how the site pages and directories are indexed.
“User-agent:” – Refers to the specific search engine spider. Most commonly it simply uses the “*” wildcard to apply to all spiders.
“Allow” – This command is used to allow access to certain areas of the website. It is most commonly used to allow access to the entire site at the root level by using “Allow: /”.
“Disallow” – This command is used to block access to certain areas of the site which you don’t want the spiders crawling, such as administration areas, CMS directories etc.
The sitemap.xml file is used to help the search engines discover and crawl the individual pages on your website. All the pages should be listed within this file, which should then be submitted to the search engines. Google and Bing let you submit the “sitemap” directly via their Webmaster Tools control panels.
There are online sitemap generators which are really useful for creating your sitemap. Most CMS systems have a script built in which will generate the sitemap; in the case of WordPress, there are various SEO plugins available which will also generate a “sitemap.xml” file.
It sounds obvious, doesn’t it? Pages that are blocked to the search engines cannot be indexed, and the content on those pages won’t appear to users making a search. This is extremely bad for SEO!
Pages can be blocked to the search engines in numerous different ways, so it is important that you check each of these to ensure your pages are indexable.
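If your site has no generator to hand, a basic sitemap is simple enough to build yourself. The Python sketch below uses only the standard library; the function name `build_sitemap` and the example URLs are made up, and it writes only the `<loc>` element for each page, leaving out optional fields such as the last modified date.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return sitemap.xml content with one <url><loc> entry per page."""
    urlset = ET.Element("urlset", {"xmlns": SITEMAP_NS})
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical pages, used purely as sample input.
PAGES = [
    "http://yourdomain.com/",
    "http://yourdomain.com/services/",
]
print(build_sitemap(PAGES))
```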
The Meta Robots tag:
<meta name="robots" content="noindex, nofollow">
If it is set to “noindex”, the page will be kept out of the search engines’ index; “nofollow” tells the spiders not to follow the links on the page. Either setting, or both, will severely hinder your webpage’s chances of ranking.
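One way to check a page for this tag is to parse its HTML, as in the Python sketch below. The class name `RobotsMetaParser` and the function `robots_directives` are my own, and the sketch assumes you already have the page source as a string; it only looks at the meta tag, not at HTTP headers.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives from any <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "meta" and (attributes.get("name") or "").lower() == "robots":
            content = attributes.get("content") or ""
            for part in content.split(","):
                if part.strip():
                    self.directives.add(part.strip().lower())


def robots_directives(html: str) -> set:
    """Return the set of robots directives found in the page source."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


page = '<head><meta name="robots" content="noindex, nofollow"></head>'
blocked = "noindex" in robots_directives(page)
print(blocked)
```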
The Robots.txt File:
User-agent: *
Disallow: /page.php
Disallow: /directory/
By using the disallow command in the robots.txt file you are telling the search engines not to crawl a certain area of your website. Make sure this is set up correctly and is not blocking any areas which the search engines need in order to fully index your site.
Choosing whether or not to use the “www.” in your website address is important, but you shouldn’t serve your site on both. The reason for this is that search engines may treat the two as separate sites.
The www and non-www versions of a URL would be seen as duplicate copies of each other, and duplicate content is seen as bad in SEO.
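A quick way to test which URLs a set of rules blocks is Python’s built-in `urllib.robotparser` module. In the sketch below the rules mirror the example above, while the helper name `is_crawlable` and the test URLs are made up; real search engine spiders may interpret some rules differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# The same rules as the robots.txt example in this post.
ROBOTS_TXT = """\
User-agent: *
Disallow: /page.php
Disallow: /directory/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_crawlable(url: str, user_agent: str = "*") -> bool:
    """True if the rules above allow `user_agent` to fetch `url`."""
    return parser.can_fetch(user_agent, url)


for url in ("http://yourdomain.com/index.php", "http://yourdomain.com/page.php"):
    print(url, is_crawlable(url))
```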
Every page on your site would also be seen as a duplicate.
Depending on the configuration of the server which hosts your website, there are numerous different methods of setting up a redirect from one version to the other.
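Whichever method your server uses, the logic behind it is the same and can be sketched in Python. The names `PREFERRED_HOST` and `canonical_redirect` are made up, and choosing the non-www version as the preferred one is just an assumption for the example; the sketch only works out where a request should be sent, not how the server issues the redirect.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit

# Assumption for this example: the non-www version is the preferred one.
PREFERRED_HOST = "yourdomain.com"


def _strip_www(host: str) -> str:
    """Drop a leading 'www.' so both versions of a host can be compared."""
    return host[4:] if host.startswith("www.") else host


def canonical_redirect(url: str) -> Optional[str]:
    """Return the URL to redirect to, or None if no redirect is needed."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host == PREFERRED_HOST or _strip_www(host) != _strip_www(PREFERRED_HOST):
        return None  # already on the preferred host, or a different site
    return urlunsplit(
        (parts.scheme, PREFERRED_HOST, parts.path, parts.query, parts.fragment)
    )


print(canonical_redirect("http://www.yourdomain.com/index.php"))
```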