MSN Live Search Engine: Patents, White Papers & Technology

White papers and patents from MSN Search, with references to publications at the MSN Research website.

MSN Sandbox

Microsoft Research

Algorithms and Search Technology

Understanding Inverse Document Frequency
Paper from Microsoft Research, Cambridge, authored by Stephen Robertson. (PDF Document)

Query Expansion for Short Queries by Mining User Logs
Paper examines short vs. long user search queries and discusses expansion techniques, including tf*idf (term frequency times inverse document frequency).
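
For readers unfamiliar with tf*idf, the weighting can be sketched in a few lines of Python. This is a minimal illustration of one common textbook variant (relative term frequency multiplied by the natural log of N/df), not the formulation used in the paper; the function name `tf_idf` and the sample documents are assumptions made for the example.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({
            # Relative term frequency times log-scaled inverse document frequency.
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return weights

# Hypothetical sample documents, already split into tokens.
docs = [
    "msn search engine".split(),
    "search engine patents".split(),
    "block level link analysis".split(),
]
scores = tf_idf(docs)
```

In this sample, "msn" appears in only one document and so scores higher in the first document than "search", which appears in two. A term present in every document would score zero.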

New Ways to Search the Web
Brief, introductory explanation of VIPS (Vision Based Page Segmentation) and Block Level Analysis.

VIPS / Block Level Link Analysis

Vision Based Page Segmentation
Paper at Microsoft Research, published in 2003, available in printable MS Word and PDF formats.

Block Level Link Analysis
An excellent paper from MSN. In a nutshell, where a link sits on the page is a factor to be considered. Some people in SEO have long taken this into account, but it is seldom, if ever, mentioned publicly, for obvious reasons.
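
The idea that link position matters can be illustrated with a rough Python sketch. It is only loosely in the spirit of block-level link analysis and is not the paper's algorithm: the function name `weighted_links`, the page names, and the block importance values are all made up for the example. Each block's importance is split evenly among the links it contains, and a page-to-page link weight is the sum of those shares.

```python
from collections import defaultdict

def weighted_links(pages):
    """pages: {page: [(block_importance, [target_pages])]} -> {(src, dst): weight}."""
    weights = defaultdict(float)
    for src, blocks in pages.items():
        for importance, targets in blocks:
            if not targets:
                continue  # a block without links contributes nothing
            # Split the block's importance evenly across the links it contains.
            share = importance / len(targets)
            for dst in targets:
                weights[(src, dst)] += share
    return dict(weights)

# Hypothetical page with two blocks; importance values are assumed, not measured.
pages = {
    "home": [
        (0.7, ["article"]),                      # main content block
        (0.3, ["about", "contact", "article"]),  # footer navigation block
    ],
}
link_weights = weighted_links(pages)
```

Here the link to "article" ends up with most of the weight because it sits alone in the main content block, while the footer links share a small remainder.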

Block-Based Web Search
Examines several approaches to page segmentation.


System and method for query refinement to enable improved searching based on identifying and utilizing popular concepts related to users' queries
Patent No. 7,136,845
Granted November 14, 2006
Inventor: Chandrasekar, et al.
Assignee: Microsoft Corp.


Learning to Cluster Web Search Results
Algorithm that groups search results into relevant categories when a common word carries different meanings in different contexts.

Instance-based Schema Matching for Web Databases by Domain-specific Query Probing
References are made to interpreting semantic similarities and synonyms.

Probabilistic Model for Contextual Retrieval

Web-page Classification through Summarization
Reference is made to topical identification.

From Latent Semantics to Spatial Hypertext - An Integrated Approach

The PowerRank Link Analysis Algorithm


Organic Cyberspace
Microsoft researcher examines the dynamics of communication in online communities.