LSI Latent semantic indexing


by Jonathon Blakeley – Jan 30 2011
Google have recently changed their search algorithm, and individual keywords are becoming less and less important. This is a response to unscrupulous webmasters placing too much emphasis on keywords. As a consequence, LSI (latent semantic indexing), also called latent semantic analysis (LSA), has become more and more important. Queries or searches performed with LSA return results that are related in meaning, not just results that contain the exact search terms.

What is LSI?

LSI is about semantic relationships (semantic = relating to meaning in language or logic). If my customer has a website about surfboards, the word surfboard forms links to related words, e.g. tube, nose-riding, wax, long boards, fins, rockers, etc. So putting surfboards on every page as a keyword is a bit disingenuous, to say the least. Google now penalise this kind of SEO sloppiness.

Latent semantic indexing (LSI) is a mathematical technique that is applied to text. It uses singular value decomposition (SVD) to spot patterns and relationships between the terms in a body of text. LSI measures the distance between semantically similar concepts on a page, across the web site and the external Internet. Each page must reflect its content in its meta tags or risk being viewed as irrelevant by the search engines.

  • Variety is the spice of SEO. Pages of related keywords tend to rank higher. Increasing the number of relationships improves the ranking.
  • LSA analyses a page and then proceeds to scan the entire web site looking for any other relevant and related content.
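
To make the SVD step a little more concrete, here is a minimal sketch of textbook LSI in Python with NumPy. It is not a description of how Google or any other search engine actually ranks pages. The five tiny "pages", the choice of k = 2 concepts and the helper names (cosine, lsi_scores) are all made up for illustration.

```python
import numpy as np

# Toy corpus (invented for illustration): three surfing pages, two cottage pages.
docs = [
    "surfboard wax fins",
    "longboard fins wax",
    "surfboard longboard wax",
    "holiday cottages cornwall",
    "cornwall self-catering cottages",
]

vocab = sorted({word for doc in docs for word in doc.split()})
term_index = {term: i for i, term in enumerate(vocab)}

# Term-document matrix: rows are terms, columns are pages, cells are raw counts.
A = np.zeros((len(vocab), len(docs)))
for j, doc in enumerate(docs):
    for word in doc.split():
        A[term_index[word], j] += 1.0

# SVD of the matrix, keeping only the k strongest "concepts" (k is an assumption).
U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 2
U_k = U[:, :k]

# Project every page into the k-dimensional concept space.
doc_vecs = A.T @ U_k  # shape: (number of pages, k)


def cosine(a, b):
    """Cosine similarity, returning 0.0 if either vector has zero length."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


def lsi_scores(query):
    """Score each page against the query in concept space (hypothetical helper)."""
    q = np.zeros(len(vocab))
    for word in query.split():
        if word in term_index:  # words outside the vocabulary are ignored
            q[term_index[word]] += 1.0
    q_vec = q @ U_k  # project the query the same way as the pages
    return [cosine(q_vec, d) for d in doc_vecs]


for doc, score in zip(docs, lsi_scores("longboard")):
    print(f"{score:5.2f}  {doc}")
```

In this toy corpus the first page never mentions "longboard", yet it scores about as highly as the pages that do, because it shares wax and fins with them. The two cottage pages score close to zero. That is the "related words" effect described above, in miniature.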

What does this mean in practice?

Let’s say a website is about holiday cottages in Cornwall. If the phrase ‘holiday cottages’ does not appear by the second or third sentence, the page loses points and will rank poorly. Similarly, using related words like Cornish or self-catering will help the page do well, particularly if they appear in paragraph one.

Using LSI, Google can measure, without human observation, the value of the content and the expertise of the content creators. LSI also allows Google to spot dangerous content, adult material or criminal content.

LSI Timeline

  1. Mid-1960s – Factor analysis technique first described and tested (H. Borko and M. Bernick)
  2. 1988 – Seminal paper on LSI technique published (Deerwester et al.)
  3. 1989 – Original patent granted (Deerwester et al.)
  4. 1992 – First use of LSI to assign articles to reviewers[12] (Dumais and Nielsen)
  5. 1994 – Patent granted for the cross-lingual application of LSI (Landauer et al.)
  6. 1995 – First use of LSI for grading essays (Foltz, et al., Landauer et al.)
  7. 1999 – First implementation of LSI technology for intelligence community for analyzing unstructured text (SAIC).
  8. 2002 – LSI-based product offering to intelligence-based government agencies (SAIC)
  9. 2005 – First vertical-specific application – publishing – EDB (EBSCO, Content Analyst Company)

What does LSI mean to your business?

Well, LSI can have a major effect on the ranking of your web pages. It provides search engines with a quick method of measuring the value of the content. It means that quality, well-researched content is even more important, and that the content should also be reflected semantically in the meta tags of that page and of related web pages.
