Entity Attributes

What's an entity attribute?

An entity attribute is a useful bit of metadata that gets attached to an entity during text analytics processing. Salience has always had several kinds of built-in entity metadata:

  • The entity type: Person, Place, Company, Product, or whatever type is defined by a CDL file or user entity list.
  • The normalized name of the entity, e.g., "Lexalytics" or "LXA" might get normalized as "Lexalytics, Inc."
  • A "label", which is a catch-all container for whatever information a customer might want to attach to an entity by way of a CDL file or user entity list. Typically a label describes a sub-type of an entity (e.g., an "Organization" is considered to be a sub-type of "Company"), but it can hold any kind of auxiliary information, such as a link to a Person's Wikipedia entry. To make use of entity attributes, however, the entity label should be reserved for identifying a sub-type.

Entity attributes expand the kind of metadata that can be carried by entities. Rather than having just a single entity "label" field that can hold only a single item of metadata with an unknown role, there can now be arbitrarily many entity attributes, each with its own distinct name and each holding a different kind of metadata. For example, the entity for a pharmaceutical drug might have an attribute for its manufacturer and another for its generic name.
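To make the contrast with the single label field concrete, here is a minimal Python sketch of how an entity carrying attributes could be modeled. This is an illustration only, not the Salience API: the Entity class, its field names, the attribute names "manufacturer" and "generic_name", and the drug and company names are all invented for the example.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Entity:
    """Hypothetical stand-in for an entity result; not a Salience type."""
    entity_type: str        # e.g. "Person", "Company", "Product"
    normalized_name: str    # the normalized form of the entity's name
    label: str = ""         # single catch-all field, best reserved for the sub-type
    # Any number of named attributes, each holding its own kind of metadata.
    attributes: Dict[str, str] = field(default_factory=dict)


# Invented example data: a drug entity with two attributes.
drug = Entity(
    entity_type="Product",
    normalized_name="Brandex",
    label="Drug",
    attributes={"manufacturer": "Acme Pharma", "generic_name": "genericol"},
)

print(drug.attributes["generic_name"])
```

The label still holds one value with no declared role, while each attribute is addressed by name, so a consumer can ask for the manufacturer or the generic name directly.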

Why use it?

Entity attributes are a convenient way of providing customers with metadata they might not otherwise have, such as the generic names for all pharmaceuticals. They also make it easy to filter text analytics results, for example to find only those mentions of drugs manufactured by a particular pharmaceutical company.
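The filtering use case might look something like the following Python sketch. It assumes the entity results have already been converted into plain dictionaries with an "attributes" mapping. That shape, the filter_by_attribute helper, and all of the names in the sample data are assumptions made for illustration, not part of Salience.

```python
def filter_by_attribute(entities, name, value):
    """Return the entities whose attribute `name` equals `value`.

    Entities that lack the attribute are skipped.
    """
    return [e for e in entities if e.get("attributes", {}).get(name) == value]


# Invented sample results; a real application would build these from its own output.
entities = [
    {"name": "Brandex", "type": "Product",
     "attributes": {"manufacturer": "Acme Pharma", "generic_name": "genericol"}},
    {"name": "Otherol", "type": "Product",
     "attributes": {"manufacturer": "Other Labs"}},
    {"name": "Acme Pharma", "type": "Company", "attributes": {}},
]

acme_drugs = filter_by_attribute(entities, "manufacturer", "Acme Pharma")
print([e["name"] for e in acme_drugs])  # prints ['Brandex']
```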

How to use it

Entity attributes and their values are defined through ATR lists stored in the data/salience/entities or data/user/salience/entities subdirectories. Entity attributes are provided in some Industry Packs (such as Healthcare).

Limitations

Since entity attributes are defined through configuration files, they are best suited to attributes that don't change (such as generic drug names) rather than dynamic attributes (such as end-of-day stock prices).