The Linguist's Shoebox

Integrated data management and analysis for the field linguist

Tip

Semantic categories are useful for analyzing, managing, and publishing lexical data.

Categorizing data items is a fundamental part of analysis: it means determining whether or not things are related within the scope under scrutiny. Well-organized categories are also an essential part of effective data management. They also help with publishing, since researchers can produce a series of separate, topically oriented volumes about vernacular generic terms or semantic domains (e.g., birds, fish, plants). The Multi-Dictionary Formatter (MDF) provides data fields for emic and etic semantic categories:

\th Thesaurus. The vernacular generic term that the people themselves use to classify the lexeme. It overlaps with the Gen (i.e., generic) lexical function. Because it is an emic category, it does not necessarily correlate with western taxonomies or with the semantic domains. For example, masy 'fish' has a broader semantic range in Selaru than fish does in English, because it also includes sea mammals and crustaceans. Set up a data link from the \th field to the \lx field. This will ensure consistency when entering data and let you jump to the related lexical record.
\sd Semantic domain. Researchers should continually refine the categories (e.g., kin, cut, color) as they grow in their understanding of the language and culture. A semantic category can indicate how the lexeme behaves in certain grammatical constructions. To ensure consistency, set up a range set for the \sd field.
\is Index of semantics. An etic checklist can supplement a system of emic categories, because it covers the general topics of language and culture comprehensively. It can also help in cross-referencing and comparing data collected by other researchers. Set up a range set or a data link for the \is field.
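
The consistency checks that a range set and a data link provide inside Shoebox can also be approximated outside the program, for example when reviewing an exported database. The following Python sketch is only an illustration under stated assumptions: the sample records, the placeholder headwords, the range set values, and the function names parse_records and check are all hypothetical, and only the field markers (\lx, \ge, \th, \sd) and the gloss of masy come from the description above. It flags \sd values that are not in the range set and \th values that do not match any \lx headword.

```python
import re

# Hypothetical sample records in Standard Format. Only "masy" 'fish' comes
# from the tip above; the other headwords and all category values are
# invented placeholders for illustration.
SAMPLE = r"""
\lx masy
\ge fish
\sd fauna

\lx placeholder1
\ge (hypothetical entry)
\th masy
\sd fuana

\lx placeholder2
\ge (hypothetical entry)
\th masi
\sd fauna
"""

# Hypothetical range set for the \sd field.
SD_RANGE_SET = {"fauna", "kin", "cut", "color"}

# A field line is a backslash, a marker, and an optional value.
FIELD = re.compile(r"^\\(\w+)\s*(.*)$")


def parse_records(text):
    """Split Standard Format text into records (lists of (marker, value)).

    A new record starts at each lx field.
    """
    records = []
    for line in text.splitlines():
        match = FIELD.match(line.strip())
        if not match:
            continue  # skip blank or non-field lines
        marker, value = match.group(1), match.group(2).strip()
        if marker == "lx" or not records:
            records.append([])
        records[-1].append((marker, value))
    return records


def check(records, sd_range):
    """Return (headword, marker, value) for each questionable category."""
    headwords = {v for rec in records for k, v in rec if k == "lx"}
    problems = []
    for rec in records:
        lx = next((v for k, v in rec if k == "lx"), "?")
        for marker, value in rec:
            if marker == "sd" and value not in sd_range:
                problems.append((lx, "sd", value))  # not in the range set
            elif marker == "th" and value not in headwords:
                problems.append((lx, "th", value))  # no matching lx record
    return problems


records = parse_records(SAMPLE)
problems = check(records, SD_RANGE_SET)
for lx, marker, value in problems:
    print(f"{lx}: \\{marker} value {value!r} not recognized")
```

In this sample, the misspelled domain "fuana" and the thesaurus term "masi", which has no matching \lx record, are the two values reported. Within Shoebox itself, the range set and data link described above are meant to catch such inconsistencies at data entry.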

For more information, read Making Dictionaries.
