Search Engine Optimization Overview
Some programs in the College of Education recently asked for help with search optimization, so I did some research on it. Most of the techniques involved were already employed by the College in the name of usability or accessibility. Most of the tricks to search engine optimization are also just good web publishing techniques.
References
If you aren't attending this talk, skip right to better sources for your information:
- Google Webmaster Guidelines: http://www.google.com/support/webmasters/bin/answer.py?answer=35769
- Google Webmaster Tools: http://www.google.com/webmasters/
- SEO Logic Guide: http://www.seologic.com/guide/
- Matt Cutts SEO: http://www.mattcutts.com/blog/type/googleseo/ and Videos: http://www.mattcutts.com/blog/type/movies/
- Google Public Search: https://services.google.com/publicservice/login
Outline
- Tools
  - Google Toolbar
  - Google Webmaster Central
- Get your content indexed
  - have a robots file
  - clean URLs
  - avoid JavaScript-generated content
  - avoid content buried behind forms
  - provide alt text for images
  - make sure all pages are linked to
- Optimize your words
  - title
  - keyword and description meta tags
  - h1 and content at top of pages
  - proximity
  - provincialism, colloquialism, and localized jargon
- PageRank
  - Get others to link to your site
  - Get in directories
  - Resolve hostname redirects
  - Spam and penalties
Tools: Google Toolbar
http://www.google.com/tools/firefox/toolbar/index.html
The toolbar shows a page's rank within Google's algorithms. Google says "Wondering whether a new website is worth your time? Use the Toolbar's PageRank™ display to tell you how Google's algorithms assess the importance of the page you're viewing." A higher PageRank helps your pages appear higher in results, but only when the search words also match content on your site. PageRank is based on how many sites link to yours and how important those linking sites are. http://en.wikipedia.org/wiki/PageRank
The PageRank in the toolbar is handy for seeing problems with web pages.
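To make "how many sites link to yours and how important those sites are" concrete, here is a minimal sketch of the simplified textbook PageRank iteration. It is not Google's production algorithm; the damping factor of 0.85, the iteration count, and the tiny example link graph are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by power iteration.

    links maps each page to the list of pages it links to.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page starts each round with a small baseline share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its rank evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank evenly.
                for other in pages:
                    new_rank[other] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical link graph: three pages link to "home", which links to one of them.
example_links = {
    "home": ["courses"],
    "courses": ["home"],
    "faculty": ["home"],
    "news": ["home"],
}
ranks = pagerank(example_links)
```

In this sketch "home" ends up with the highest score because it has the most inbound links, and "courses" outranks "faculty" and "news" because its single inbound link comes from the important "home" page.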
Tools: Google Webmaster Central
You sign up and prove that you control a site by placing a file within it. After this, Google Webmaster Tools gives you some analysis tools for your site. You may add multiple sites to manage.
http://www.google.com/webmasters/
Features:
- Web Crawl: when Google last indexed your site, plus a list of pages that had HTTP errors or were not found, URLs restricted by robots.txt, etc. The listing for each URL says when the error occurred, but not which page linked to it.
- Robots.txt analysis: simply tests whether your robots.txt file allows Google.
- Top Search Queries: searches that most often returned your pages.
- Top Search Queries Clicks: searches that actually brought traffic to your site.
- Average Top Position: The highest position a page on your site has risen to for that phrase.
- Summary of your site's PageRank
- Page Analysis by type and encoding.
- Submit a Google Sitemap: http://www.google.com/support/webmasters/bin/answer.py?answer=40318
  - for new sites, new pages, and pages that normally wouldn't be linked to, such as some dynamic pages
  - allows you to put importance on pages. Worth the trouble?
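As a rough illustration of the sitemap feature, the sketch below builds a small sitemap in the sitemaps.org XML format, with a priority value standing in for "importance". The URLs and priorities are hypothetical, and the function name is my own; check Google's sitemap documentation for the full set of supported fields.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Return sitemap XML for a list of (url, priority) pairs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, priority in entries:
        node = ET.SubElement(urlset, "url")
        ET.SubElement(node, "loc").text = url
        # priority is a hint from 0.0 to 1.0, relative to your own pages.
        ET.SubElement(node, "priority").text = f"{priority:.1f}"
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical URLs, for illustration only.
sitemap_xml = build_sitemap([
    ("http://www.example.edu/", 1.0),
    ("http://www.example.edu/courses/detail?id=42", 0.5),
])
```

The second entry is the kind of dynamic page that might not be reachable through ordinary links, which is where a sitemap is most likely to be worth the trouble.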
You can download the search query stats as a CSV file to benchmark your site over time.
Get Indexed: Robots.txt
Robots.txt files won't encourage robots to index your site, but they can be used to discourage indexing of particular directories or types of files. http://en.wikipedia.org/wiki/Robots.txt
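If you want to check what a robots.txt file actually blocks before publishing it, Python's standard library includes a parser. The sketch below is one way to do that; the robots.txt rules and the paths are hypothetical, and "Googlebot" is used only as an example user agent.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /drafts/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_allowed(path, user_agent="Googlebot"):
    """Return True if the rules above let user_agent fetch path."""
    return parser.can_fetch(user_agent, path)
```

With these rules, `is_allowed("/courses/index.html")` is true and `is_allowed("/private/notes.html")` is false.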
Get Indexed: Clean URLs
Dynamic URLs with many parameters are less likely to be followed by search indexers. They also present usability problems and expose platform information about your site.
Don't tie your URLs to a technology (.asp, .aspx, .cfm, etc.). If you are using ASP or CFM pages, simply map .html to that technology in the directories that use it. Otherwise the URLs will change every time you change technologies: existing links to the page will break and have to be updated, and the newer the URL for a given page is, the lower its PageRank will be.
Use URLs that can be transmitted easily via print, phone, or email. Use a single case, avoid underscores and spaces, keep them as short as possible, and leave out parameters such as ?id=x&type=7.
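These rules are easy to check mechanically. The following is a minimal sketch of such a check; the function name and the wording of the messages are my own, the example URLs are hypothetical, and it does not try to judge length since "as short as possible" has no fixed threshold.

```python
from urllib.parse import urlsplit


def url_problems(url):
    """List the ways a URL breaks the guidelines above (empty list if none)."""
    parts = urlsplit(url)
    problems = []
    if parts.query:
        problems.append("has query parameters")
    if parts.path != parts.path.lower():
        problems.append("mixes upper and lower case")
    if "_" in parts.path:
        problems.append("contains an underscore")
    if " " in parts.path or "%20" in parts.path:
        problems.append("contains a space")
    return problems
```

For example, `url_problems("http://www.example.edu/Course_List.cfm?id=x&type=7")` reports the query parameters, the mixed case, and the underscore, while `url_problems("http://www.example.edu/courses/teacher-education")` returns an empty list.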
Use a single URL for each content page. When the same dynamic content can have multiple URLs, it creates a problem for log analysis as well as for Google. Google search results may return two listings of the same page with different URLs.
- http://www.port80software.com/support/articles/nextgenerationurls
- http://www.searchtools.com/robots/goodurls.html
- http://alistapart.com/articles/urls/
Get Indexed: Don't hide content from search engines.
JavaScript-generated content will not be indexed by search engines, e.g. the key concepts on http://globalizationandeducation.ed.uiuc.edu/
Content behind forms will not be indexed, e.g. https://www-s.continuinged.uiuc.edu/ao/registration/courses.cfm. Make sure engines and users have a way to browse to content. Your phone support people should not need to tell users to click this form and select that option to get to a course description or faculty directory entry; they should be able to tell them the URL over the phone or send it in an email.
Images, videos, Flash, etc. will not be indexed. Make sure you have alternative text that can be indexed. Google will index alt text, but cannot index text that appears inside images.
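A quick way to find images without alternative text is to scan your HTML for img tags that lack an alt attribute. The sketch below uses Python's standard HTML parser; the class and function names are my own. Note that it treats an empty alt attribute as missing, so it will also flag purely decorative images where an empty alt is intentional.

```python
from html.parser import HTMLParser


class MissingAltFinder(HTMLParser):
    """Collect the src of every <img> that has no alt text."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        # A valueless or absent alt attribute both count as missing here.
        if not (attributes.get("alt") or "").strip():
            self.missing.append(attributes.get("src", "(no src)"))


def images_missing_alt(html):
    """Return the src values of images in html that have no alt text."""
    finder = MissingAltFinder()
    finder.feed(html)
    finder.close()
    return finder.missing
```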
Also make sure all your pages are crawlable from links on your home page or subsequent pages. If you must have pages on your site that are not linked to, use a Google sitemap. This will not help with other search engines though.
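If you have a list of your pages and the links each one contains (from a link checker or your own crawl), finding pages that cannot be reached from the home page is a simple graph search. This is a sketch under that assumption; the function name and the input format, a dictionary mapping each page to the pages it links to, are hypothetical.

```python
from collections import deque


def unreachable_pages(links, home):
    """Return pages that cannot be reached by following links from home.

    links maps each page to the list of pages it links to.
    """
    seen = {home}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return sorted(set(links) - seen)
```

Any page this returns is a candidate for either a new link from an existing page or an entry in a sitemap.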
Optimize Your Words: Make High Importance Content Count
Content in some parts of your pages gets a higher priority in search indexing.
- Make sure your title has important keywords and phrases in it.
- Use keyword and description meta tags. These fall in and out of favor with different search engines, so you should get them right when you first create a page.
- Content in H1, H2, etc. tags has higher priority in some search engines, so use them correctly.
- Proximity. Search result rankings on Google are partially based on the proximity of words being searched for in your pages. Avoid a page layout that breaks up sequential content.
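One way to audit the first three items in the list above is to pull the title, first-level heading, and meta tags out of a page and see which of them lack a phrase you care about. The sketch below does this with Python's standard HTML parser; the class and function names are my own, and which of these spots a given search engine actually weighs is outside what the code can tell you.

```python
from html.parser import HTMLParser


class KeyTextExtractor(HTMLParser):
    """Collect text from <title>, <h1>, and the keyword/description meta tags."""

    def __init__(self):
        super().__init__()
        self.found = {"title": "", "h1": "", "description": "", "keywords": ""}
        self._current = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag in ("title", "h1"):
            self._current = tag
        elif tag == "meta":
            name = (attributes.get("name") or "").lower()
            if name in ("description", "keywords"):
                self.found[name] = attributes.get("content") or ""

    def handle_endtag(self, tag):
        if tag == self._current:
            self._current = None

    def handle_data(self, data):
        if self._current:
            self.found[self._current] += data


def missing_phrase(html, phrase):
    """Return the high-priority spots where phrase does not appear."""
    extractor = KeyTextExtractor()
    extractor.feed(html)
    extractor.close()
    return [spot for spot, text in extractor.found.items()
            if phrase.lower() not in text.lower()]
```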
Optimize Your Words: Avoid Jargon
Make sure common phrases someone might search for are included in your main pages and in your level 1 header tags. For example, if you think people would search for "teacher certification", "teacher training" or "teacher education" to get to your content, make sure these phrases are in your site, even if your department uses different phrasing for the same concepts. Avoid phrases that only have meaning locally, such as "public engagement", "academic outreach" or "teleport." Such phrases may only confuse users anyway. If your organization has a name that doesn't match what the public would expect of its content, work around it in page titles and header elements and put the name in footers and sidebar credits instead. Or change your unit's name.
Use the Google search tools to see what people are searching on to get to your site. Some of this info will also be in your log files if you are tracking referring urls and can be aggregated with log analysis software.
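If your logs record referring URLs, the search phrase can often be read straight out of the referrer's query string. The sketch below assumes the phrase is carried in a `q` parameter, as in Google search URLs; other engines may use a different parameter name, which is why it is an argument. The function name is my own.

```python
from urllib.parse import urlsplit, parse_qs


def search_phrase(referrer, param="q"):
    """Return the search phrase in a referring URL, or None if there is none."""
    values = parse_qs(urlsplit(referrer).query).get(param)
    return values[0] if values else None
```

For a referrer of `http://www.google.com/search?q=teacher+certification&hl=en` this returns "teacher certification"; for a referrer with no such parameter it returns None.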
PageRank and Site Credibility
What is Google PageRank and what affects it? Credible links to your site. Redirects. Hostname longevity.
PageRank: Links
Get other sites, especially important ones, to link to your site.
Keep your content within your domain. This will improve the PageRank of your domain. This also helps if you are using Google Public Search, which searches a single domain.
Find the appropriate locations within the following directories and list your web site. A good way to find categories for your site is to search the given directory by relevant keywords. More directory links to your site will help your search optimization as well as help those browsing directories find your content.
- Open Directory/Dmoz Submission (www.dmoz.org)
- Yahoo Directory.
- Other Directories (http://www.seologic.com/all-search-engines/directories.php)
PageRank: Spam and Google Penalties