APPLICATIONS
What Can LSI Do For Me Today?
Throughout this document, we have been presenting LSI in its role
as a search tool for unstructured data. Given the shortcomings in
current search technologies, this is undoubtedly a critical application
of semantic indexing, and one with very promising results. However,
there are many applications of LSI that go beyond traditional information
retrieval, and many more that extend the notion of what a search
engine is, and how we can best use it. To illustrate this, here
are just a few examples of the areas where exciting work is happening
(or should be happening) with LSI:
-
Relevance Feedback
Most conventional search engines work best with a small
set of keywords, and their recall declines quickly as the
number of search terms grows. Because LSI shows the reverse
behavior (the more it knows about a document, the better it
is at finding similar ones), a latent semantic search engine
can let a user build a 'shopping cart' of useful results
and then search for further results that most closely
match the stored ones. This lets the user search iteratively,
providing feedback that guides the search engine towards a useful
result.
-
Archivist's Assistant
In introducing LSI we contrasted it with more traditional approaches
to structuring data, including human-generated taxonomies. Given
LSI's strength at partially structuring unstructured data, the
two techniques can be used in tandem. This is potentially a
very powerful combination - it would allow archivists to use
their time much more efficiently, enhancing, labeling and correcting
LSI-generated categories rather than having to index every document
from scratch. In the next section, we will look at a data visualization
approach that could be used in conjunction with LSI to create
a sophisticated, interactive application for archivist use.
-
Automated Writing Assessment
By comparing student writing against a large data set of stored
essays on a given topic, LSI tools can analyze submitted assignments
and highlight content areas that the student essay didn't cover.
This can be used as a kind of automated grading system, where
the assignment is compared to a pool of essays of known quality,
and given the closest matching grade. We believe a more appropriate
use of the technology is as a feedback tool that guides the student
in revising the essay and suggests directions for further study.
{ More info and demo: http://www-psych.nmsu.edu/essay/ }
-
Textual Coherence
LSI can look at the semantic relationships within a text to
calculate the degree of topical coherence between its constituent
parts. This kind of coherence correlates well with readability
and comprehension, which suggests that LSI might be a useful
feedback tool in writing instruction (along the lines of existing
readability metrics).
{ source: http://www.knowledge-technologies.com/papers/abs-dp2.foltz.html }
-
Information Filtering
LSI is potentially a powerful customizable technology for filtering
spam (unsolicited electronic mail). By training a latent semantic
algorithm on your mailbox and known spam messages, and adjusting
a user-determined threshold, it might be possible to flag junk
mail much more efficiently than with current keyword-based approaches.
The same may apply to common viruses that spread through Microsoft
Outlook, which tend to share a basic structure.
LSI could also be used to filter newsgroup and bulletin board
messages.
{ source: http://www-psych.nmsu.edu/~pfoltz/cois/filtering-cois.html }
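Several of the applications listed above come down to the same operation: placing texts in a reduced LSI space and comparing them there. The Python sketch below illustrates two of them on a toy collection, relevance feedback (ranking the remaining documents against the average of a 'shopping cart' of saved results) and textual coherence (averaging the similarity of adjacent sections). It is only an illustration under stated assumptions: it uses raw term counts with no weighting or stop list, a plain NumPy SVD, and the cart centroid as the feedback query. The names DOCS, build_lsi, fold_in, more_like_cart and coherence are hypothetical and do not come from any particular LSI package.

```python
import numpy as np

# Toy collection (an assumption for illustration): three documents
# about pets followed by three about financial markets.
DOCS = [
    "cat dog pet",
    "dog pet food",
    "cat pet food",
    "stock market trade",
    "market trade price",
    "stock price trade",
]

def build_lsi(docs, k=2):
    """Build a small LSI space from raw term counts using a truncated SVD."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    A = np.zeros((len(vocab), len(docs)))      # term-by-document matrix
    for j, d in enumerate(docs):
        for w in d.lower().split():
            A[index[w], j] += 1.0
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    U_k = U[:, :k]                             # keep the k strongest dimensions
    doc_vecs = Vt[:k, :].T * s[:k]             # one k-dimensional row per document
    return index, U_k, doc_vecs

def fold_in(text, index, U_k):
    """Project new text into the same space; unknown words are ignored."""
    q = np.zeros(U_k.shape[0])
    for w in text.lower().split():
        if w in index:
            q[index[w]] += 1.0
    return q @ U_k

def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0

def more_like_cart(cart, doc_vecs):
    """Relevance feedback: rank documents outside the cart by their
    similarity to the centroid of the documents saved in the cart."""
    centroid = doc_vecs[cart].mean(axis=0)
    rest = [j for j in range(len(doc_vecs)) if j not in cart]
    return sorted(rest, key=lambda j: cosine(centroid, doc_vecs[j]), reverse=True)

def coherence(sections, index, U_k):
    """Textual coherence: mean cosine similarity of adjacent sections."""
    vecs = [fold_in(s, index, U_k) for s in sections]
    sims = [cosine(a, b) for a, b in zip(vecs, vecs[1:])]
    return sum(sims) / len(sims) if sims else 0.0

if __name__ == "__main__":
    index, U_k, doc_vecs = build_lsi(DOCS, k=2)
    print("ranked against cart [0]:", more_like_cart([0], doc_vecs))
    print("on-topic coherence:", coherence(["cat dog", "pet food"], index, U_k))
    print("off-topic coherence:", coherence(["cat dog", "stock price"], index, U_k))
```

The same cosine comparison could in principle serve the filtering case as well, by flagging a message whose similarity to a centroid of known spam exceeds a user-chosen threshold. A real system would need a far larger collection, term weighting, and many more than two dimensions.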