Salience

.../data/chunker

This directory contains files that support the chunker. Files that users can override or customize are described in more detail in the per-file sections below.

auxiliary.dat: A list of verbs that are commonly prefixed into longer verb phrases, e.g. "planned to go", "need to think", "gets to come".
copulas.dat: Externalization of linking verbs considered by the chunker.
describers.dat: Used by sentencetype.dat in identifying sentence types.
discourse.dat: Data file containing words that weight sentiment phrases following them in situations of mixed sentiment.
instructive_modal.dat: Used by sentencetype.dat in identifying sentence types.
intensifiers.dat: Data file of words that are a direct multiplier of adjacent sentiment-bearing phrases.
negationbreaks.dat: Externalization of patterns used by the chunker to impact negation of chunks.
negations.dat: Externalization of the negators considered by the chunker.

These files may be customized within a chunker section of a user directory; however, this is not recommended.

auxiliary.dat

Certain verbs commonly occur in longer verb phrases. This file lists verbs that fall into this category to ensure the two verbs are part of the same chunk.

copulas.dat

This file contains a list of verbs which the chunker uses as linking verbs. Note that this file does not contain all forms of individual verbs. For example, the verb "to be" in English is conjugated as "I am, you are, he/she/it is, we are, they are"; in this file you see the forms "are" and "be".

describers.dat

This file is used by sentencetype.dat in the identification of different sentence types, such as imperative sentences. If sentencetype.dat is overridden in a user directory, this file may also be needed.

discourse.dat

This file is used to specify words that weight the sentiment of phrases following them. For example:

The restaurant was nice and clean, but the food was awful.

The word "but" signals a change in sentiment. Its weight in discourse.dat gives slightly higher weighting to the sentiment phrases that follow it at the end of the sentence, based on the observation that these phrases tend to convey the sentiment the writer actually intends.
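The weighting described here can be pictured with a small Python sketch. It is only an illustration under stated assumptions: the weight of 1.25 for "but", the per-word sentiment scores, and the function name `weighted_sentiment` are all hypothetical, and this page does not describe the actual layout of discourse.dat or how Salience combines phrase scores.

```python
# Hypothetical discourse weight; not taken from the shipped discourse.dat.
DISCOURSE_WEIGHTS = {"but": 1.25}


def weighted_sentiment(tokens, sentiment, discourse=DISCOURSE_WEIGHTS):
    """Sum word scores, scaling every score that follows a discourse word."""
    total = 0.0
    weight = 1.0
    for token in tokens:
        key = token.lower().strip(".,")
        if key in discourse:
            # The weight applies to sentiment phrases after this word.
            weight = discourse[key]
            continue
        total += sentiment.get(key, 0.0) * weight
    return total


# Hypothetical sentiment scores, chosen only for this example.
sentiment = {"nice": 0.5, "clean": 0.25, "awful": -1.0}
sentence = "The restaurant was nice and clean, but the food was awful."
print(weighted_sentiment(sentence.split(), sentiment))
```

In this sketch the negative phrase after "but" counts for more than it would unweighted, so the sentence as a whole leans further negative.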

instructive_modal.dat

This file is used by sentencetype.dat in the identification of different sentence types, such as imperative sentences. If sentencetype.dat is overridden in a user directory, this file may also be needed.

intensifiers.dat

This file contains a list of words that modify the sentiment score of the next token only. The file format is:

<word> <tab> <intensifier-multiplier-amount>

Note: If a word is both an intensifier and a sentiment phrase, then it WILL NOT contribute its sentiment score to the document.
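As a rough illustration of the format and the note above, the following Python sketch parses tab-separated intensifier entries and applies each multiplier to the next token only. The sample entries, the multiplier values, the sentiment scores, and the helper names `parse_intensifiers` and `score_tokens` are assumptions made for this example; they do not come from the shipped file or from the Salience API.

```python
def parse_intensifiers(text):
    """Parse '<word><tab><multiplier>' lines into a dict."""
    table = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        word, amount = line.split("\t")
        table[word.lower()] = float(amount)
    return table


def score_tokens(tokens, intensifiers, sentiment):
    """Sum sentiment scores, scaling the token right after an intensifier."""
    total = 0.0
    multiplier = 1.0
    for token in tokens:
        key = token.lower()
        if key in intensifiers:
            # An intensifier contributes no sentiment of its own,
            # even if it also appears in the sentiment table.
            multiplier = intensifiers[key]
            continue
        total += sentiment.get(key, 0.0) * multiplier
        multiplier = 1.0  # the multiplier covers the next token only
    return total


# Hypothetical entries, not copied from intensifiers.dat.
SAMPLE = "very\t1.5\nslightly\t0.5\n"
intensifiers = parse_intensifiers(SAMPLE)
sentiment = {"good": 0.5, "very": 0.25}
print(score_tokens(["very", "good"], intensifiers, sentiment))
```

Here "very" is listed both as an intensifier and as a sentiment word, so the sketch ignores its own score and uses it only to scale "good".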

negationbreaks.dat

This file contains a list of words which will stop negation in the middle of a chunk.
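One way to picture a negation break is the Python sketch below, in which a negator inverts the scores that follow it until a break word is reached. The break word "but", the negator entry, the sentiment scores, and the function name `score_chunk` are hypothetical; this page does not list the actual contents of negationbreaks.dat or describe how the chunker applies them.

```python
NEGATIONS = {"not": -1.0}   # hypothetical negator entry
NEGATION_BREAKS = {"but"}   # hypothetical break word


def score_chunk(tokens, sentiment, breaks=NEGATION_BREAKS):
    """Sum word scores; a negator inverts scores until a break word appears."""
    total = 0.0
    factor = 1.0
    for token in tokens:
        key = token.lower()
        if key in NEGATIONS:
            factor = NEGATIONS[key]   # start negating what follows
        elif key in breaks:
            factor = 1.0              # negation stops mid-chunk
        else:
            total += sentiment.get(key, 0.0) * factor
    return total


# Hypothetical sentiment scores, chosen only for this example.
sentiment = {"cheap": -0.5, "sturdy": 0.5}
print(score_chunk("not cheap but sturdy".split(), sentiment))
```

Without the break word, "sturdy" would also be inverted and the two scores would cancel out.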

negations.dat

This file contains a list of words that can negate (or invert) sentiment. The file format is:

<word-or-construction> <tab> <negation-multiplier-amount>

In all cases in the out-of-the-box default negations file, the inversion is an exact mirror. For example, the sentiment of "I enjoy watching baseball." is inverted when the following negation is encountered: "I never enjoy watching baseball."

Note: Entries without a negation multiplier amount are automatically assigned a value of -1.0.
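The format and the default described above can be sketched in Python as follows. The sample entries, the score for "enjoy", and the helper name `parse_negations` are assumptions for illustration only; the sketch is not Salience's own parser.

```python
DEFAULT_NEGATION = -1.0  # per the note above, used when no amount is given


def parse_negations(text):
    """Parse '<word-or-construction><tab><multiplier>' lines into a dict."""
    table = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        fields = line.split("\t")
        phrase = fields[0].strip().lower()
        amount = fields[1].strip() if len(fields) > 1 else ""
        table[phrase] = float(amount) if amount else DEFAULT_NEGATION
    return table


# Hypothetical entries: one with an explicit amount, one without.
SAMPLE = "never\t-1.0\nnot\n"
negations = parse_negations(SAMPLE)

enjoy_score = 0.5  # hypothetical sentiment score for "enjoy"
# With a multiplier of -1.0 the inverted score is an exact mirror.
print(enjoy_score * negations["never"])
```

An entry with a multiplier other than -1.0 would invert the sentiment without mirroring it exactly, which the default file does not do according to the text above.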

Updated 5 months ago
