Google Patents, Applications and Publications
Patents granted to Google, Inc. as assignee, as well as pending
patent applications that have been filed with and published by
the U.S. Patent Office.
Patent Applications
Systems and methods for analyzing boilerplate
U.S. Patent Application: 20080040316
Date Published: February 14, 2008
Date Filed: March 31, 2004
Inventor: Lawrence; Stephen R.
Assignee: Google Inc.
Abstract
Systems and methods for analyzing boilerplate are described.
In one described system, an indexer identifies a common element
in a plurality of related articles. The indexer then classifies
the common element as boilerplate. For example, the indexer may
identify a copyright notice appearing in a plurality of related
articles. The copyright notice in these articles is considered
boilerplate.
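As a rough illustration of the idea in this abstract, the sketch below treats each article as a list of text lines and marks as boilerplate any line that appears in at least a given share of the related articles. The function name `find_boilerplate`, the line-level granularity, and the `min_share` threshold are assumptions made for this example; the abstract does not specify them.

```python
from collections import Counter

def find_boilerplate(articles, min_share=0.8):
    """Return the lines that occur in at least min_share of the articles."""
    counts = Counter()
    for text in articles:
        # Count each distinct non-empty line once per article.
        lines = {line.strip() for line in text.splitlines() if line.strip()}
        counts.update(lines)
    needed = min_share * len(articles)
    return {line for line, n in counts.items() if n >= needed}

# Hypothetical related articles sharing a copyright notice.
articles = [
    "Story one\nCopyright 2004 Example Corp.",
    "Story two\nCopyright 2004 Example Corp.",
    "Story three\nCopyright 2004 Example Corp.",
]
boilerplate = find_boilerplate(articles)
```

Here only the copyright line is shared by all three articles, so it is the only line classified as boilerplate.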
Information retrieval based on historical data
(Gone MIA at the patent office, copy on this site. Loads slowly
now, will be breaking it up into separate pages.)
Document Scoring Based on Document Inception Date
U.S. Patent Application: 20070094254
Date published: April 26, 2007
Application No.: 10/676,651
Date Filed: November 20, 2006
Assignee: Google Inc.
Inventors: Cutts; Matt; (Los Altos, CA) ; Dean; Jeffrey; (Palo
Alto, CA) ; Haahr; Paul; (San Francisco, CA) ; Henzinger; Monika;
(Corseaux, CH) ; Lawrence; Steve; (Mountain View, CA) ; Pfleger;
Karl; (Mountain View, CA) ; Tong; Simon; (Mountain View, CA)
Abstract
A system may determine a document inception date associated
with a document, generate a score for the document based, at least
in part, on the document inception date, and rank the document
with regard to at least one other document based, at least in
part, on the score.
Systems and methods for determining document freshness
United States Patent Application: 20050144193
Date published: June 30, 2005
Date filed: June 30, 2004
Assignee: Google, Inc.
Inventor: Henzinger, Monika
Abstract
A system determines a freshness of a first document. The system
determines whether a freshness attribute is associated with the
first document. The system identifies, based on the determination,
a set of second documents that each contain a link to the first
document. The system assigns a freshness score to the first document
based on a freshness attribute associated with each document of
the set of second documents or the freshness attribute associated
with the first document.
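One way to picture the freshness logic in this abstract is sketched below. It assumes the "freshness attribute" is a last-modified date and that, when the document has none of its own, the most recent date among the linking documents is used. Both choices, and the name `freshness`, are assumptions for illustration only.

```python
from datetime import date

def freshness(doc_id, last_modified, inlinks):
    """Return a freshness date for doc_id, or None if nothing is known.

    last_modified: dict doc_id -> date (the assumed freshness attribute)
    inlinks: dict doc_id -> list of doc_ids that link to it
    """
    own = last_modified.get(doc_id)
    if own is not None:
        return own
    # No attribute on the document itself: fall back to the linking documents.
    linked = [last_modified[src] for src in inlinks.get(doc_id, [])
              if src in last_modified]
    return max(linked) if linked else None

# Hypothetical data: "target" has no date of its own but is linked from a and b.
last_modified = {"a": date(2004, 5, 1), "b": date(2004, 6, 15)}
inlinks = {"target": ["a", "b"], "a": ["b"]}
```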
Document Scoring Based on Query Analysis
U.S. Patent Application: 20070088692
Date published: April 19, 2007
Date filed: November 22, 2006
Serial No.: 562617
Assignee: Google, Inc.
Inventors: Dean; Jeffrey; (Palo Alto, CA) ; Haahr; Paul; (San Francisco,
CA) ; Henzinger; Monika; (Corseaux, CH) ; Lawrence; Steve; (Mountain
View, CA) ; Pfleger; Karl; (Mountain View, CA) ; Sercinoglu; Olcan;
(Mountain View, CA) ; Tong; Simon; (Mountain View, CA)
Abstract
A system may determine an extent to which a document is selected
when the document is included in a set of search results, generate
a score for the document based, at least in part, on the extent
to which the document is selected when the document is included
in a set of search results, and rank the document with regard
to at least one other document based, at least in part, on the
score.
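The "extent to which a document is selected" can be read as a click-through rate. The sketch below assumes that reading, with hypothetical click and impression counts, and ranks documents by that rate alone. A real system would presumably combine it with other signals.

```python
def selection_rate(clicks, impressions):
    """Fraction of result-page appearances in which the document was selected."""
    return clicks / impressions if impressions else 0.0

def rank_by_selection(stats):
    """stats: dict doc_id -> (clicks, impressions). Returns doc_ids, best first."""
    return sorted(stats, key=lambda d: selection_rate(*stats[d]), reverse=True)

# Hypothetical counts; doc3 has never been shown, so its rate is 0.0.
stats = {"doc1": (30, 1000), "doc2": (120, 1000), "doc3": (0, 0)}
ranking = rank_by_selection(stats)
```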
Document Scoring Based on Traffic Associated with a Document
U.S. Patent Application: 20070088693
Date published: April 19, 2007
Date filed: November 30, 2006
Serial No.: 565026
Assignee: Google, Inc.
Inventor: Lawrence; Steve; (Mountain View, CA)
Abstract
A system determines an extent to which advertisements are
presented or updated within a document, a quality of an advertiser
associated with an advertisement provided within the document,
whether an advertisement in the document relates to an advertising
document that has more than a threshold amount of traffic, and/or
an extent to which an advertisement provided within the document
generates user traffic to an advertising document related to the
advertisement. The system generates a score for the document based,
at least in part, on the extent to which advertisements are presented
or updated within the document, the quality of the advertiser
associated with the advertisement provided within the document,
whether the advertisement relates to an advertising document that
has more than the threshold amount of traffic, and/or the extent
to which the advertisement generates user traffic to the advertising
document. The system ranks the document with regard to at least
one other document based, at least in part, on the score.
Presentation of search results based on document structure
U.S. Patent Application: 20060074907
Date published: April 6, 2006
Date filed: September 27, 2004
Serial No.: 949708
Abstract
A system identifies a document relating to a search term,
where the document includes a set of structural elements. The
system determines a distribution of occurrences of the search
term in the document, identifies one of the structural elements
based on the distribution of occurrences of the search term in
the document, and presents information associated with the identified
structural element.
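A minimal sketch of this idea, assuming the structural elements are titled sections and that the element with the most occurrences of the search term is the one to present. The abstract does not say how the distribution is evaluated, so the counting rule and the name `best_section` are assumptions.

```python
def best_section(sections, term):
    """sections: list of (title, text) pairs.

    Return the title of the section with the most occurrences of term,
    or None if the term does not occur at all.
    """
    if not sections:
        return None
    term = term.lower()
    counts = [(text.lower().count(term), title) for title, text in sections]
    hits, title = max(counts, key=lambda c: c[0])
    return title if hits > 0 else None

# Hypothetical document split into three sections.
sections = [
    ("Intro", "Search engines index pages."),
    ("Snippets", "A snippet shows context. Each snippet is short."),
    ("Summary", "Snippet length varies."),
]
```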
Document Scoring Based on Link-Based Criteria
U.S. Patent Application: 20070094255
Published: April 26, 2007
Filed: November 30, 2006
Assignee: Google, Inc.
Inventors: Acharya; Anurag; (Campbell, CA) ; Cutts; Matt; (Los
Altos, CA) ; Dean; Jeffrey; (Palo Alto, CA) ; Haahr; Paul; (San
Francisco, CA) ; Henzinger; Monika; (Corseaux, CH) ; Lawrence; Steve;
(Mountain View, CA) ; Pfleger; Karl; (Mountain View, CA) ; Tong;
Simon; (Mountain View, CA)
Abstract:
A system may determine time-varying behavior of links pointing
to a document, generate a score for the document based, at least
in part, on the time-varying behavior of the links pointing to
the document, and rank the document with regard to at least one
other document based, at least in part, on the score.
Multi-stage query processing system and method for use with tokenspace repository
U.S. Patent Application: 20060036593
Published: February 16, 2006
Filed: August 13, 2004
Inventors: Dean; Jeffrey Adgate; (Palo Alto, CA) ; Haahr; Paul
G.; (San Francisco, CA) ; Sercinoglu; Olcan; (Mountain View, CA)
; Singhal; Amitabh K.; (Palo Alto, CA)
Abstract:
A multi-stage query processing system and method enables multi-stage
query scoring, including "snippet" generation, through incremental
document reconstruction facilitated by a multi-tiered mapping
scheme. At one or more stages of a multi-stage query processing
system a set of relevancy scores are used to select a subset of
documents for presentation as an ordered list to a user. The set
of relevancy scores can be derived in part from one or more sets
of relevancy scores determined in prior stages of the multi-stage
query processing system. In some embodiments, the multi-stage
query processing system is capable of executing one or more passes
on a user query, and using information from each pass to expand
the user query for use in a subsequent pass to improve the relevancy
of documents in the ordered list.
Variable length snippet generation
U.S. Patent Application: 20050278314
Date Published: December 15, 2005
Filing Date: June 9, 2004
Inventors: Buchheit, Paul; (Mountain View, CA)
Abstract:
A method and system are disclosed that provide a variable
length snippet when returning snippets in response to a search
request. Under conditions where the search query matches a document
with a high degree of certainty, a shorter snippet is provided
than when the document does not match the search query with a
high level of certainty. The snippet length is also based
on an estimate of how likely a user is to recognize the document.
For example, shorter snippets are provided if a user has recently
viewed a document, but longer snippets are provided if a user
has not recently viewed the document.
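The rule described here can be sketched as follows. The confidence threshold and the two lengths (80 and 240 characters) are made-up values for illustration, as are the function names.

```python
def snippet_length(match_confidence, recently_viewed,
                   short=80, long=240, threshold=0.9):
    """Pick a snippet length in characters.

    A confident match, or a document the user has seen recently,
    gets the short length; everything else gets the long one.
    """
    if match_confidence >= threshold or recently_viewed:
        return short
    return long

def make_snippet(text, match_confidence, recently_viewed=False):
    """Truncate text to the chosen length, adding "..." if it was cut."""
    n = snippet_length(match_confidence, recently_viewed)
    return text if len(text) <= n else text[:n].rstrip() + "..."
```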
Google Patents
System and method for selectively searching partitions of a database
U.S. Patent: 7,254,580
Date granted: August 7, 2007
Application No.: 10/676,651
Date Filed: September 30, 2003
Assignee: Google Inc.
Inventors: Gharachorloo; Kourosh (Menlo Park, CA), Chang; Fay Wen
(Mountain View, CA), Wallach; Deborah Anne (Emerald Hills, CA),
Ghemawat; Sanjay (Mountain View, CA), Dean; Jeffrey (Menlo Park,
CA) (Mountain View, CA)
Abstract:
When a search query is received, a plurality of partition indexes
are searched using the set of search terms in the search query.
Each partition index corresponds to a partition of a document index.
The search of each respective partition index identifies a subset
of a plurality of document index sub-partitions corresponding to
the respective partition index. Next, the search query is executed
by only those document index sub-partitions identified by the subsets,
thereby identifying documents that satisfy the search query. By
using the partition index to reduce the number of document index
sub-partitions searched while executing a search query, the execution
of the search query is made more efficient.
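The pruning step in this abstract can be sketched with plain dictionaries. For simplicity the example uses a single partition index that maps each term to the sub-partitions containing it, and treats the query as an AND of its terms. The data layout and function names are assumptions, not the patent's design.

```python
def build_partition_index(sub_partitions):
    """Map each term to the ids of the sub-partitions that contain it."""
    index = {}
    for sp_id, postings in sub_partitions.items():
        for term in postings:
            index.setdefault(term, set()).add(sp_id)
    return index

def search(query_terms, partition_index, sub_partitions):
    """Return (matching doc ids, ids of the sub-partitions actually searched)."""
    if not query_terms:
        return set(), set()
    # Only sub-partitions containing every query term can hold a match.
    candidates = set(sub_partitions)
    for term in query_terms:
        candidates &= partition_index.get(term, set())
    results = set()
    for sp_id in candidates:
        postings = sub_partitions[sp_id]
        results |= set.intersection(*(postings[t] for t in query_terms))
    return results, candidates

# Hypothetical sub-partitions: term -> set of document ids.
sub_partitions = {
    "p0": {"google": {1, 2}, "patent": {2}},
    "p1": {"google": {3}, "search": {3, 4}},
    "p2": {"patent": {5}, "search": {5}},
}
partition_index = build_partition_index(sub_partitions)
```

For the query "google patent", only sub-partition p0 contains both terms, so p1 and p2 are never searched.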
Link based clustering of hyperlinked documents
U.S. Patent: 7,213,198
Date granted: May 1, 2007
Assignee: Google, Inc.
Filed: August 10, 2000
Inventor: Georges R. Harik
Abstract:
Techniques for grouping hyperlinked documents are provided.
Links near or in the neighborhood of the hyperlinked documents
are analyzed in order to group the hyperlinked documents by topic.
For example, links that are search results can be grouped by identifying
other hyperlinked documents that have multiple forward links to
the search results. The search results can then be grouped according
to the forward links of the other hyperlinked documents.
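The example in this abstract can be sketched as below: any page with forward links to at least two of the search results defines a group made of those results. The `min_links` threshold, the function name, and the `.example` hostnames are all hypothetical.

```python
def group_by_hubs(results, forward_links, min_links=2):
    """Return a list of groups (sets) of results that share a linking page.

    forward_links: dict page -> list of pages it links to
    """
    results = set(results)
    groups = []
    for page, links in forward_links.items():
        linked = results & set(links)
        # Keep pages that link to several results; skip duplicate groups.
        if len(linked) >= min_links and linked not in groups:
            groups.append(linked)
    return groups

# Hypothetical results for an ambiguous query, and pages that link to them.
results = ["jaguar-cars.example", "jaguar-dealers.example",
           "jaguar-cat.example", "big-cats.example"]
forward_links = {
    "auto-directory.example": ["jaguar-cars.example", "jaguar-dealers.example"],
    "zoo-links.example": ["jaguar-cat.example", "big-cats.example"],
    "random.example": ["jaguar-cars.example"],
}
groups = group_by_hubs(results, forward_links)
```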
Ranking search results by reranking the results based on local inter-connectivity
Granted to Krishna Bharat with Google.com as assignee. Often referred
to as the LocalRank patent.
Patent number: 6,526,440
Filing date: Jan 30, 2001
Issue date: Feb 25, 2003
Inventor: Krishna Bharat
Assignee: Google, Inc.
Abstract:
A search engine for searching a corpus improves the relevancy
of the results by refining a standard relevancy score based on
the interconnectivity of the initially returned set of documents.
The search engine obtains an initial set of relevant documents
by matching a user's search terms to an index of a corpus. A re-ranking
component in the search engine then refines the initially returned
document rankings so that documents that are frequently cited
in the initial set of relevant documents are preferred over documents
that are less frequently cited within the initial set.
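A minimal sketch of this re-ranking step follows. It counts how often each document is cited by other documents in the initial result set, then scales the original score by one plus that count. The multiplication is an assumed combination chosen for simplicity; the abstract does not give the actual formula.

```python
def rerank(initial_scores, links):
    """Re-rank an initial result set by citations from within that set.

    initial_scores: dict doc -> relevancy score
    links: dict doc -> set of docs it cites
    """
    docs = set(initial_scores)
    # Count citations that come from other documents in the initial set.
    local_cites = {d: 0 for d in docs}
    for src in docs:
        for dst in links.get(src, ()):
            if dst in docs and dst != src:
                local_cites[dst] += 1
    # Assumed combination: original score times (1 + local citation count).
    new = {d: initial_scores[d] * (1 + local_cites[d]) for d in docs}
    return sorted(initial_scores, key=lambda d: new[d], reverse=True)

# Hypothetical initial results; "x" is cited but is not in the result set.
initial_scores = {"a": 1.0, "b": 0.9, "c": 0.8}
links = {"a": {"c"}, "b": {"c", "x"}, "c": {"a"}}
ranking = rerank(initial_scores, links)
```

Document c starts with the lowest score but is cited by both a and b, so it moves to the top.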
Google Patents Related to Duplicate Content
Detecting query-specific duplicate documents
U.S. Patent 6,615,209
Date granted: September 2, 2003
Application date: October 6, 2000
Inventors: Gomes; Benedict (Berkeley, CA), Smith; Benjamin Thomas
(Mountain View, CA)
This patent, applied for on Oct. 6, 2000 and granted to Google on
Sept. 2, 2003 by the U.S. Patent Office, uses query-relevant
information for similarity comparisons, in some cases relying on
snippets extracted from the documents rather than the entire documents
themselves.
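The snippet-based comparison can be sketched as follows. Each document is reduced to the words surrounding the query terms, and two documents count as duplicates for that query when those snippets are identical. The fixed word window and the exact-match test are simplifying assumptions; the patent's own similarity measure is not described here.

```python
def query_snippet(text, query, window=3):
    """Return the words within `window` positions of any query term."""
    words = text.lower().split()
    terms = set(query.lower().split())
    keep = set()
    for i, w in enumerate(words):
        if w in terms:
            keep.update(range(max(0, i - window), min(len(words), i + window + 1)))
    return " ".join(words[i] for i in sorted(keep))

def dedupe_for_query(docs, query):
    """Keep the first document for each distinct query-relevant snippet."""
    seen, kept = set(), []
    for doc_id, text in docs:
        snip = query_snippet(text, query)
        if snip not in seen:
            seen.add(snip)
            kept.append(doc_id)
    return kept

# Hypothetical documents: d1 and d2 differ only in text far from the query term.
docs = [
    ("d1", "Site A header. The quick brown fox jumps over the lazy dog today."),
    ("d2", "Different banner text here. The quick brown fox jumps over the lazy dog today."),
    ("d3", "A fox was seen near the river bank yesterday."),
]
```

For the query "fox", d1 and d2 yield the same snippet even though the full documents differ, so d2 is dropped.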
Detecting duplicate and near-duplicate files
Authored by Wm. Pugh and Monika Henzinger
More from William Pugh on this work:
Detecting duplicate and near-duplicate files