.../data/salience/opinions

The salience/opinions directory contains files that aid in opinion extraction for entities. Users can customize the following files:

*.ptn

Pattern files for recognizing opinions; see below.

opinions.imp

File of pattern macros for use in opinion extraction; see below.

*.dat

Data files containing verbs found in opinions; see below.

Background on entity opinions

Opinions are implied or direct quotations attributed to a speaker about a topic. They consist of the following parts:

  • Speaker - An entity who is believed to hold the opinion. This is usually, but not always, a person.
  • Topic - An entity or theme involved in the opinion.
  • Sentiment - The sentiment of this opinion towards the topic. Note that because opinions are often very short pieces of text, the accuracy of the sentiment value will be lower than when Salience analyzes an entire document for sentiment.
  • Quotation - The text Salience interpreted as being an opinion.
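
The four parts listed above can be pictured as a simple record. The sketch below is illustrative only: the class name, field names, and the numeric sentiment value are assumptions made for this example and are not the Salience API or its actual return types.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Opinion:
    """Hypothetical container mirroring the four parts of an opinion."""
    speaker: str      # entity believed to hold the opinion (usually a person)
    topic: str        # entity or theme involved in the opinion
    sentiment: float  # assumed numeric score; the real scale is not specified here
    quotation: str    # text interpreted as being an opinion


# Example values are made up for illustration, not produced by Salience.
example = Opinion(
    speaker="George",
    topic="Massachusetts",
    sentiment=0.6,
    quotation="Massachusetts is a lovely state in the autumn.",
)
```

Because the quotation is often short, a sentiment field like the one above should be read as a rough indicator rather than a precise measurement.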

Implied and Direct Quotations

A direct quotation is one marked with quotation marks: George said, "Massachusetts is a lovely state in the autumn." The quotation may span multiple sentences.

An implied quotation attributes an idea to a person without quoting them directly: George feels that Massachusetts is prettiest in the autumn. Salience recognizes both forms.
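
To make the distinction concrete, here is a toy classifier for the two example sentences. It is not how Salience recognizes quotations (that is done by the pattern files); the regular expressions and the short verb list are assumptions chosen only to illustrate the difference between the two forms.

```python
import re

# A direct quotation is marked with straight or curly quotation marks.
DIRECT = re.compile(r'["\u201c]([^"\u201d]+)["\u201d]')

# Hypothetical attribution verbs followed by "that"; not taken from the *.dat files.
IMPLIED = re.compile(r"\b(feels|thinks|believes|says|said)\s+that\b", re.IGNORECASE)


def quotation_kind(sentence):
    """Return 'direct', 'implied', or None for a single sentence."""
    if DIRECT.search(sentence):
        return "direct"
    if IMPLIED.search(sentence):
        return "implied"
    return None


direct_kind = quotation_kind(
    'George said, "Massachusetts is a lovely state in the autumn."'
)
implied_kind = quotation_kind(
    "George feels that Massachusetts is prettiest in the autumn."
)
```

A sentence that matches neither check returns None in this sketch. Real text needs far richer linguistic patterns than two regular expressions.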

Opinion Topics

If a quotation contains multiple topics, one Opinion is returned for each. If a quotation does not contain a topic, it is not returned at all.
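
The rule above amounts to one result per topic, and no result when there is no topic. A minimal sketch, assuming the topics have already been identified and using plain tuples as a stand-in for whatever Salience actually returns (the function name is hypothetical):

```python
from typing import List, Tuple


def opinions_for_quotation(
    speaker: str, quotation: str, topics: List[str]
) -> List[Tuple[str, str, str]]:
    """Return one (speaker, topic, quotation) tuple per topic.

    An empty topic list yields an empty result, mirroring the rule that
    a quotation without a topic is not returned at all.
    """
    return [(speaker, topic, quotation) for topic in topics]


# Made-up inputs for illustration only.
two_topics = opinions_for_quotation(
    "George",
    "Massachusetts and Vermont are lovely in the autumn.",
    ["Massachusetts", "Vermont"],
)
no_topics = opinions_for_quotation("George", "It is lovely.", [])
```

Here two_topics holds two entries that share the same speaker and quotation, and no_topics is empty.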

Customizing opinion extraction in user/salience/opinions

Modifying the opinion patterns is not recommended because the pattern code is linguistically complex.

The pattern files cross_sentence.ptn, quotation.ptn and implied_quotation.ptn provide the linguistic patterns for different contexts, and make use of the other data files.