# Language Models Data Stores

Tisane language models are stored in directories. They can be divided into:

1. **Language-specific data** that describes a particular language.
2. **Crosslingual data** used by all languages (for example, semantic connections between concepts).


### Language-Specific Data

Language-specific data stores are named according to the following convention: `(language_code)-(data_store_name)`

* Language code: Based on ISO-639-1 language code standard, optionally including dialects.
* Data store name: Structures stored.


Examples:

* `en-phrase`: English phrasal patterns
* `fr-nondic`: French nondictionary entity heuristics
* `zh_CN-phrase`: Chinese (Simplified) phrasal patterns


### Crosslingual Data Stores

These data stores used by *all* languages:

* `family`
* `role`
* `pragma`


**Important:** All data stores for a language must reside in the *same* directory.

### Partial Distribution

In order to conserve space or out of other considerations, it is possible to exclude languages or components from deployment.

## Providing Selected Languages Only

To include only specific languages, identify the appropriate language codes (e.g., `en`, `de`, `zh_CN`) and include the corresponding language-specific data stores along with the three shared data stores (`family`, `role`, `pragma`).

## Providing Partial Functionality

Stores `xx-famlex` and `xx-famphrase` are used for translation only, and can be excluded from distribution if Tisane is not used for translation.

## spellchecking

Spellchecking data is stored under `xx-spell` stores. If omitted, spellchecking will not work.