The salience/opinions directory contains files that aid in opinion extraction for entities. Users can customize the following files:
- Pattern files for recognition of opinions (see below).
- File of pattern macros for use in opinion extraction (see below).
- Data files containing verbs found in opinions (see below).
Opinions are implied or direct quotations attributed to a speaker about a topic. They consist of the following parts:
- Speaker - An entity who is believed to hold the opinion. This is usually, but not always, a person.
- Topic - An entity or theme involved in the opinion.
- Sentiment - The sentiment of this opinion towards the topic. Note that because opinions are often very short pieces of text, the accuracy of the sentiment value will be lower than when Salience analyzes an entire document for sentiment.
- Quotation - The text Salience interpreted as being an opinion.
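The four parts listed above can be pictured as a simple record. The sketch below is illustrative only: the `Opinion` class and its field names are hypothetical and are not the Salience API, and the sentiment is assumed here to be a numeric score.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Opinion:
    """Hypothetical record mirroring the four parts of an opinion."""
    speaker: str      # entity believed to hold the opinion (usually a person)
    topic: str        # entity or theme involved in the opinion
    sentiment: float  # assumed numeric score; less reliable on short text
    quotation: str    # text interpreted as being the opinion


example = Opinion(
    speaker="George",
    topic="Massachusetts",
    sentiment=0.6,  # made-up value, for illustration only
    quotation="Massachusetts is a lovely state in the autumn.",
)
print(example.speaker, "on", example.topic)
```

Making the record immutable (`frozen=True`) is a design choice for the sketch: an extracted opinion is a result to read, not something to edit in place.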
A direct quotation is one marked with quotation marks: George said, "Massachusetts is a lovely state in the autumn." The quotation may span multiple sentences.
An implied quotation attributes an idea to a person without quoting their words directly: George feels that Massachusetts is prettiest in the autumn. Salience recognizes both forms.
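The difference between the two forms can be shown with a toy heuristic. This is not how Salience detects opinions (it relies on the linguistic pattern files); the function `quotation_kind` is hypothetical and assumes the sentence is already known to carry an opinion.

```python
import re

# Matches text enclosed in straight or curly double quotation marks.
DIRECT_QUOTE = re.compile(r'["\u201c]([^"\u201d]+)["\u201d]')


def quotation_kind(sentence: str) -> str:
    """Toy heuristic: 'direct' if the sentence contains quoted text, else 'implied'."""
    return "direct" if DIRECT_QUOTE.search(sentence) else "implied"


print(quotation_kind('George said, "Massachusetts is a lovely state in the autumn."'))
print(quotation_kind("George feels that Massachusetts is prettiest in the autumn."))
```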
If a quotation contains multiple topics, one Opinion is returned for each. If a quotation does not contain a topic, it is not returned at all.
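The one-opinion-per-topic rule can be sketched as follows. The helper `opinions_for` and its tuple layout are hypothetical, chosen for illustration, and the topics are assumed to have been identified already.

```python
from typing import List, Tuple


def opinions_for(speaker: str, quotation: str, topics: List[str]) -> List[Tuple[str, str, str]]:
    """Return one (speaker, topic, quotation) tuple per topic.

    An empty topic list yields an empty result, mirroring the rule that a
    quotation without a topic is not returned at all.
    """
    return [(speaker, topic, quotation) for topic in topics]


quote = "Massachusetts is a lovely state in the autumn."
print(len(opinions_for("George", quote, ["Massachusetts", "autumn"])))
print(opinions_for("George", "It was fine.", []))
```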
Modifications to the opinion patterns are not recommended: the code is linguistically complex.
The pattern files, including implied_quotation.ptn, provide the linguistic patterns for different contexts and make use of the other data files.