Salience

.../data/themes

The themes directory contains the files that control theme extraction at all levels: documents, entities, and collections.

The following files may be customized by users in the themes section of a user directory. Each file is described in more detail below.

  • rules.ptn: Part-of-speech patterns that define theme extraction
  • stopwords.dat: A list of words that should not be considered for themes
  • normalization.dat: Rules for relating themes together

rules.ptn

This file controls the part-of-speech (POS) rules that determine whether a combination of words is a theme. It uses the Pattern File format.

stopwords.dat

This file is used to eliminate phrases that would match the POS rules in rules.ptn but are too common to be useful as themes, such as last week.

The file is a single-column .dat file. It can contain both single words and multi-word phrases.

Single words will act as a stop on any phrase containing them:

hello will stop any phrase that contains the word hello

Phrases will act as a stop on that particular phrase:

next week will stop next week; it will not stop sometimes next week

NOTE: stopwords.dat is case insensitive.
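
The matching rules above can be sketched in a few lines of Python. This is only an illustration of the documented behavior, not Salience's implementation: the function name is_stopped is hypothetical, and splitting on whitespace is an assumption that stands in for whatever tokenization Salience actually uses.

```python
from typing import Iterable


def is_stopped(candidate: str, stopwords: Iterable[str]) -> bool:
    """Return True if a candidate theme would be stopped by the entries.

    Hypothetical helper that mirrors the documented rules:
    - a single-word entry stops any phrase containing that word
    - a multi-word entry stops only that exact phrase
    - matching is case-insensitive
    """
    # Assumption: lowercase and split on whitespace to compare words.
    phrase = " ".join(candidate.lower().split())
    words = set(phrase.split())
    for entry in stopwords:
        entry = " ".join(entry.lower().split())
        if not entry:
            continue  # skip blank lines
        if " " in entry:
            # Multi-word entry: only an exact phrase match is stopped.
            if entry == phrase:
                return True
        elif entry in words:
            # Single-word entry: any phrase containing the word is stopped.
            return True
    return False


entries = ["hello", "next week"]
print(is_stopped("Hello world", entries))          # True
print(is_stopped("next week", entries))            # True
print(is_stopped("sometimes next week", entries))  # False
```

The third call returns False because next week is a phrase entry and only stops the exact phrase, as described above.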

normalization.dat

NOTE: Salience does NOT ship with a normalization.dat by default.

If you create a normalization.dat, you can normalize several different themes into a single theme. This is useful when you want to roll related themes up into one category. For example, you could normalize poor sound, great sound, and good quality speakers into "audio quality".

To enable theme normalization, create a normalization.dat under /data/user/themes with each entry in the format:

  • [theme][normalized_form]

NOTE: theme can be either the unstemmed or the stemmed form.
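
To make the roll-up idea concrete, here is a minimal Python sketch of what normalization achieves, using the example themes from this section. It does not read normalization.dat or call any Salience API: the mapping is written as a plain dictionary, the function name roll_up is hypothetical, and the exact-match lookup is an assumption (it does no stemming).

```python
from collections import Counter

# Illustrative mapping of theme -> normalized form, taken from the
# example above. In Salience this would come from normalization.dat.
normalization = {
    "poor sound": "audio quality",
    "great sound": "audio quality",
    "good quality speakers": "audio quality",
}


def roll_up(themes, mapping):
    """Count themes after replacing each one with its normalized form.

    Hypothetical helper: themes without a mapping entry are kept as-is.
    """
    counts = Counter()
    for theme in themes:
        counts[mapping.get(theme, theme)] += 1
    return counts


extracted = ["poor sound", "great sound", "battery life", "good quality speakers"]
rolled = roll_up(extracted, normalization)
print(rolled["audio quality"])  # 3
print(rolled["battery life"])   # 1
```

Three distinct themes collapse into one "audio quality" count, while the unmapped theme passes through unchanged.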

Updated 5 months ago
