Basic Language Support

Salience offers an expanded list of languages with "basic" support, consisting of entities, queries, and document-level sentiment. Basic languages are supported via a local server application that Salience manages. The server must run on the same machine as Salience, but it does not need any upload or download access to your LAN or the wider internet.

The following functions are supported (the others will return an "LXA_UNAVAILABLE_IN_CURRENT_SESSION" error):

lxaPrepareText
lxaPrepareTextFromFile
lxaSetSalienceOption
lxaGetSentiment
lxaGetNamedEntities
lxaGetQueryDefinedTopics

The following languages are currently under basic support:

Russian (RU)
Arabic (AR)
Turkish (TR)
Hebrew (HE)
Polish (PL)
Thai (TH)
Vietnamese (VI)
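
As a quick guard before initializing a session, an application can check a language code against the list above. The sketch below is plain POSIX shell; is_basic_language is a hypothetical helper name and is not part of Salience.

```shell
# Hypothetical helper (not part of Salience): succeeds if the given
# language code is one of the basic-support languages listed above.
is_basic_language() {
    # Normalize to upper case so "ru" and "RU" are treated alike.
    code=$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')
    case "$code" in
        RU|AR|TR|HE|PL|TH|VI) return 0 ;;
        *) return 1 ;;
    esac
}

if is_basic_language "th"; then
    echo "th: basic support"
fi
```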

Installation of Basic Languages

After installing Salience on Windows, run a basic language installer to begin using that language. A [language code]-Basic directory will be created under your Salience installation; initialize a session with this directory to begin processing text in that language.

On Linux, navigate to your installation directory (lxainstall) and untar a basic language archive. You must additionally set the REPUSTATE_HOME environment variable to point to $lxainstall/salience/BasicLanguages. We recommend setting this variable in your bash profile.

Here is a recommended set of .bash_profile entries for a Salience installation that includes Basic Languages support. Add the following:

export lxainstall=/home/devuser/lexalytics/salience-6.1.1.1479

export REPUSTATE_HOME=$lxainstall/salience/BasicLanguages

export LD_LIBRARY_PATH=$lxainstall/salience/lib

export JAVA_HOME=$HOME/jdk1.6.0_24

export JAVA_OPTS="-Djava.library.path=$LD_LIBRARY_PATH"

export PATH=$JAVA_HOME/bin:$PATH:$HOME/bin

Refresh your environment:

source .bash_profile
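
A quick sanity check after refreshing the environment can catch a mistyped path early. The sketch below assumes the variable names from the profile above, each holding a single directory; check_dir_var is a hypothetical helper, not a Salience tool.

```shell
# Hypothetical helper: report whether the named environment variable is
# set and points at an existing directory.
check_dir_var() {
    name=$1
    # Indirect lookup of the variable whose name is in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
        echo "$name is not set"
        return 1
    fi
    if [ ! -d "$value" ]; then
        echo "$name points to a missing directory: $value"
        return 1
    fi
    echo "$name OK: $value"
}

# Keep checking the remaining variables even if one fails.
status=0
for var in lxainstall REPUSTATE_HOME LD_LIBRARY_PATH JAVA_HOME; do
    check_dir_var "$var" || status=1
done
```

After the loop, status is 0 only if every variable passed the check.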

Configuring Basic Languages

Although usually not necessary, several options are available to configure the server used to support basic languages. By default, the server is started the first time a basic language is used and shut down when the last session using one is closed. See here for recommendations on special situations such as processing multiple basic languages.

Repustate Initialization

When you start a Salience session that uses basic languages, the Repustate server must also start. This can take some time, in some cases up to a minute, and during startup you may see messages like the following:

Warning: Repustate health request failed: curl error: 7 : Couldn't connect to server
2020/06/22 14:42:53 Failed to load Deep Search config (file not found). Deep Search is disabled.
2020/06/22 14:42:54 Loading data models for ar ...
2020/06/22 14:42:54 Loading data models for ru ...
Warning: Repustate health request failed: curl error: 7 : Couldn't connect to server
...
2020/06/22 14:43:20 Loading custom entity extraction models ...
2020/06/22 14:43:20 Serving [::]:9000 with pid 19998

You can ignore these messages as long as you eventually see a line like the last one above, which indicates that Repustate is ready to go.
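
If you capture the server's output, the ready state can be detected by matching that final "Serving" line. The sketch below assumes the log format shown above stays stable across versions; is_ready_line is a hypothetical helper.

```shell
# Hypothetical helper: succeeds if the line looks like the final
# "Serving <address>:<port> with pid <pid>" message shown above.
is_ready_line() {
    printf '%s\n' "$1" | grep -Eq 'Serving .*:[0-9]+ with pid [0-9]+$'
}

# Sample lines copied from the startup log above.
while IFS= read -r line; do
    if is_ready_line "$line"; then
        echo "ready: $line"
    fi
done <<'EOF'
2020/06/22 14:42:54 Loading data models for ru ...
Warning: Repustate health request failed: curl error: 7 : Couldn't connect to server
2020/06/22 14:43:20 Serving [::]:9000 with pid 19998
EOF
```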

Basic Language Options

Basic Language support is provided via a third-party library. These options allow you to configure that library.

Option Name                         Default           Description
Basic Port                          9000              Which port to run the Repustate server on
Basic IP Address                    http://localhost  Scheme and host (e.g., http://10.12.5.121)
Manual Basic Management             false             When true, Salience will not start up/kill Repustate but rely on the user application to do so
Basic Server Options                none              Repustate server command line options (e.g. -langs, -skip_ds, etc.)
Basic Server Version                v4                Repustate server version (used in URLs, mostly needed for entities)
Basic Server Startup Timeout        150               Seconds to wait for Repustate to become healthy after starting process
Basic Server Request Timeout        30                Seconds to wait for a single Repustate request
Basic Server Max Request Attempts   5                 How many times to retry a single request before restarting server
Basic Server Request Attempt Delay  3000              Milliseconds to wait after each failed request
Basic Server Close Delay            1                 Seconds to wait for Repustate server to close before restarting
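
To get a feel for how the request settings combine, the following sketch estimates the worst-case time spent on one request before the server is restarted. It assumes every attempt runs to the full request timeout and that the delay is applied after each failed attempt; the actual retry logic inside Salience may differ. The variable names are illustrative only.

```shell
# Defaults taken from the options listed above.
max_attempts=5        # Basic Server Max Request Attempts
request_timeout=30    # Basic Server Request Timeout, in seconds
attempt_delay_ms=3000 # Basic Server Request Attempt Delay, in milliseconds

# Assumed worst case: each attempt times out, then waits the full delay.
worst_case=$(( max_attempts * request_timeout + max_attempts * attempt_delay_ms / 1000 ))
echo "worst case before restart: ${worst_case}s"
```

Under these assumptions the defaults give 5 x 30 + 5 x 3 = 165 seconds.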

Basic Port

A port number to run the basic languages server on. This does not require outside access: Salience communicates directly with the basic languages server on this machine via this port.

Manual Basic Management

By default, Salience launches the basic languages server when you first use it and closes it after the last Salience session is closed. Even if you use multiple processes, the server will start and stop correctly. However, starting the server can take up to a minute, depending on the number of languages configured, so an externally managed server can give you better performance.
To manage the lifespan of the Basic Languages server yourself, set this option to true. Note that in this case it is your responsibility to ensure the server is always running when a Salience session is trying to process Basic Languages text.
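
With manual management, an application will typically want to confirm that the server is reachable before opening a Salience session. The sketch below polls the configured port until a TCP connection succeeds or a timeout expires. It assumes bash is installed for the /dev/tcp probe; port_open and wait_for_server are hypothetical helpers, and an open port only shows that something is listening, not that every language model has finished loading.

```shell
# Hypothetical probe: succeeds if a TCP connection to host:port can be
# opened. Relies on bash's /dev/tcp redirection.
port_open() {
    bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Hypothetical helper: poll once per second until the port accepts a
# connection or the timeout (in seconds) runs out. The default of 150
# mirrors the Basic Server Startup Timeout default.
wait_for_server() {
    host=$1
    port=$2
    timeout=${3:-150}
    elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        if port_open "$host" "$port"; then
            return 0
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 1
}

# Example call (not executed here):
#   wait_for_server localhost 9000 150 || echo "basic languages server not reachable"
```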