.../data/salience/relationships

The salience/relationships directory contains files that aid in relationship extraction for entities. Users can customize the following files:

*.ptn

Pattern files for recognition of relationships, see below.

grammar.imp

File of pattern macros for use in relationship extraction, see below.

Related pages:
Salience pattern syntax
Description of sample relationship pattern

Customizing relationship extraction in user/salience/relationships

.PTN files: Pattern (.ptn) files hold human-designed patterns. The first line of a .ptn file is the name of the relationship. Each subsequent line is a separate pattern. The patterns are based on the Salience pattern syntax. You can use any of the constructs defined on the Pattern Syntax page when forming relationship patterns, plus a few additional features:
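The file layout described above (relationship name on the first line, one pattern per following line) can be sketched as a small loader. This is only an illustration, not part of Salience: the `parse_ptn` function, the `PatternFile` type, the relationship name `Employment`, and the choice to skip blank lines are all assumptions.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PatternFile:
    relationship: str      # name taken from the first line
    patterns: List[str]    # one pattern per subsequent line


def parse_ptn(text: str) -> PatternFile:
    """Split .ptn content into the relationship name and its patterns."""
    lines = text.splitlines()
    if not lines or not lines[0].strip():
        raise ValueError("a .ptn file must start with the relationship name")
    # Assumption: blank lines carry no pattern and can be skipped.
    patterns = [line.strip() for line in lines[1:] if line.strip()]
    return PatternFile(relationship=lines[0].strip(), patterns=patterns)


# Hypothetical file content: the relationship name "Employment" is made up.
sample = "Employment\n{Person} (='works') (='for') {Company}\n"
parsed = parse_ptn(sample)
```

The additional pattern features are described next.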

Entity tags: Since relationships exist between entities, you need a way of expressing which entities can go where. To do so, place the entity type you'd like to match in curly braces. For example, the pattern {Person} (='works') (='for') {Company} will create a relationship between people and companies. Entity tags cannot be combined with other pattern symbols, such as ? or *, and you cannot place entities inside parentheses.

You can place @n at the end of an entity tag to enforce an order on the entities. By default, entities are returned from left to right. With that default, "Bob sued Mary" and "Bob was sued by Mary" would both yield the relationship <Lawsuit, Bob, Mary>, even though the sentences communicate very different things. You can use the following construct to ensure that the plaintiff always comes before the defendant:

{Person@1} (='sued') {Person@2}
{Person@2} (='was') (='sued') (='by') {Person@1}

When using @n, n must be unique for every entity, and the values must run from 1 to the number of entities. While not required, it's strongly recommended that you always use the @n construct to keep your results consistent and easy to process later on. You can order the entities however makes sense for you, but our recommended best practice is to order the entities as you would in the simplest active sentence you can think of that expresses the relationship.
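The effect of @n can be illustrated with a short simulation. This is a sketch of the ordering rule only, not of how Salience matches text: the `TAG` regular expression, the helper names, and the `Person` entity type in the sample patterns are assumptions made for the example.

```python
import re
from typing import List, Tuple

# Hypothetical tag matcher: an entity type, then an optional @n order index.
TAG = re.compile(r"\{(?P<type>[A-Za-z]+)(?:@(?P<order>\d+))?[^}]*\}")


def tag_orders(pattern: str) -> List[int]:
    """Return the order index of each entity tag, reading left to right."""
    matches = list(TAG.finditer(pattern))
    # Tags without @n fall back to their left-to-right position.
    orders = [int(m.group("order")) if m.group("order") else i
              for i, m in enumerate(matches, start=1)]
    # The @n values must be unique and run from 1 to the number of entities.
    if sorted(orders) != list(range(1, len(orders) + 1)):
        raise ValueError(f"invalid @n numbering in: {pattern}")
    return orders


def ordered_entities(pattern: str, matched: List[str]) -> Tuple[str, ...]:
    """Move entities matched left to right into their @n positions."""
    result = [""] * len(matched)
    for order, entity in zip(tag_orders(pattern), matched):
        result[order - 1] = entity
    return tuple(result)


active = "{Person@1} (='sued') {Person@2}"
passive = "{Person@2} (='was') (='sued') (='by') {Person@1}"

# "Mary sued Bob" and "Bob was sued by Mary" now report Mary first.
from_active = ordered_entities(active, ["Mary", "Bob"])
from_passive = ordered_entities(passive, ["Bob", "Mary"])
```

Both calls place the plaintiff first, which is what makes downstream processing of the results consistent.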

Entity Restrictions: You can further limit which entities can match an entity tag by adding a restriction. A restriction is placed after a comma within the entity tag, e.g. {Person@2,next}. Only one restriction can be used per entity tag. Restrictions are as follows:

  • next: This entity must be the next occurrence of its type since the previous matched entity. Not valid on the first entity.
  • next of: This entity must be the next occurrence of the named type since the previous matched entity. next of * means it must be the next entity of any type. You can use semicolons to separate a list of multiple entity types: e.g. {Company,next of Company;Organization}
  • token: Used with comparison operators (=,!=,>,<,<=,>=) to specify the number of words allowed in a matching entity.
  • end: Requires that this entity span the last tokens in the sentence.
  • start: Requires that the sentence containing this relationship start on this entity.
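One reading of the next and next of restrictions can be sketched as follows. The code models a sentence as a list of typed entities in reading order and checks whether a candidate is the first qualifying entity after the previously matched one. The `Entity` type, the `satisfies_next` helper, and the sample entities are hypothetical, and the interpretation of a semicolon-separated list as "first entity of any listed type" is an assumption.

```python
from typing import List, NamedTuple


class Entity(NamedTuple):
    type: str
    text: str


def satisfies_next(entities: List[Entity], prev: int, candidate: int,
                   restriction: str) -> bool:
    """Check a `next` or `next of ...` restriction for entities[candidate].

    `entities` lists the sentence's entities in reading order and `prev`
    is the index of the previously matched entity.
    """
    if restriction == "next":
        # Plain `next` refers to the candidate's own type.
        allowed = {entities[candidate].type}
    elif restriction.startswith("next of "):
        allowed = set(restriction[len("next of "):].split(";"))
    else:
        raise ValueError(f"unsupported restriction: {restriction}")
    for i in range(prev + 1, len(entities)):
        if "*" in allowed or entities[i].type in allowed:
            # The first qualifying entity after `prev` must be the candidate.
            return i == candidate
    return False


sentence = [
    Entity("Person", "Bob"),
    Entity("Organization", "the Guild"),
    Entity("Company", "Acme"),
]

# Acme is the next Company after Bob, but an Organization comes first.
plain_next = satisfies_next(sentence, 0, 2, "next")
next_of_list = satisfies_next(sentence, 0, 2, "next of Company;Organization")
next_of_any = satisfies_next(sentence, 0, 1, "next of *")
```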

Captures: Sometimes information about a relationship is included in text that isn't an entity. For example, Salience ships with a quotation relationship that connects people with quotes. Knowing how the idea was expressed is a useful piece of information: did the person say it, predict it, or apologize for it? The capture feature of the pattern syntax can be used to extract additional information like this. By putting a segment of the pattern in parentheses and starting it with '?<capture_name>', you can extract additional information from the sentence. The text that matches the part of the pattern in parentheses will be reported with the relationship. Captured patterns cannot include entities that are part of the pattern.
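Captures behave much like named groups in regular expressions. The sketch below uses Python's `re` module as an analogy only: it is not Salience pattern syntax (Python spells named groups `(?P<name>...)`), and the `QUOTE` expression, the group names, and the sample sentence are assumptions.

```python
import re
from typing import Dict, Optional

# Plain-regex stand-in for a quotation pattern; not Salience pattern syntax.
QUOTE = re.compile(
    r"(?P<speaker>[A-Z][a-z]+) "
    r"(?P<verb>said|predicted|apologized) that "
    r"(?P<quote>.+)"
)


def extract_quote(sentence: str) -> Optional[Dict[str, str]]:
    """Return the named captures for a matching sentence, else None."""
    match = QUOTE.match(sentence)
    return match.groupdict() if match else None


result = extract_quote("Alice predicted that sales would double")
```

Here the `verb` group plays the role of a capture: it records how the idea was expressed alongside the speaker and the quote.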

Metaentities

You can also use the relationship syntax to define a compound entity made up of other entities by editing 'metaentities.ptn' in the /salience/entity directory. English Salience ships with an example pattern that joins strings of people's names with a shared last name into a family entity:
"John and Mary Smith announced a new baby" produces a Family Entity of "John and Mary Smith".

The syntax is the metaentity label followed by the relationship pattern rule. You can use all the same syntax as other relationships.

The entire string matched by the relationship pattern will be returned as the entity mention. The individual entities matched by the relationship will appear in the entity's "related entities". By default, they will appear as a list under the heading "members". However, you can also assign a label to each role using the @label syntax. For example:
Lawsuit {Person@Plaintiff} V. {Person@Defendant}

This creates a metaentity with the label "Lawsuit" and the span "Smith V. Jones", plus two related entities: one labeled "Plaintiff" pointing to "Smith" and one labeled "Defendant" pointing to "Jones".
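The shape of a metaentity result can be sketched as a small data structure. This is an illustration of the description above, not the actual Salience output format: the `MetaEntity` class, the `build_metaentity` helper, and the way the family example is split into members are assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class MetaEntity:
    label: str                 # metaentity label, e.g. "Lawsuit"
    span: str                  # entire string matched by the pattern
    related: Dict[str, List[str]] = field(default_factory=dict)


def build_metaentity(label: str, span: str,
                     parts: List[Tuple[Optional[str], str]]) -> MetaEntity:
    """Group matched entities by role label, defaulting to "members"."""
    meta = MetaEntity(label=label, span=span)
    for role, text in parts:
        meta.related.setdefault(role or "members", []).append(text)
    return meta


# Roles assigned with the @label syntax.
lawsuit = build_metaentity("Lawsuit", "Smith V. Jones",
                           [("Plaintiff", "Smith"), ("Defendant", "Jones")])

# No labels given, so the entities land under "members".
family = build_metaentity("Family", "John and Mary Smith",
                          [(None, "John"), (None, "Mary Smith")])
```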