The Linguist's Shoebox

Integrated data management and analysis for the field linguist


Lines in the list of primary sorting characters correspond to sections in a dictionary.

In Shoebox language encodings, it is important to define the sort order correctly. Many Shoebox users are not sure where to enter "special characters", especially those that carry diacritics (e.g., tone marks). Here is a practical principle that applies the researcher's understanding of the language to the details of setting up the software: characters that belong in the same section of a dictionary should be entered on the same line in the list of primary characters (e.g., C c Ç ç). In general, different phonemes should be entered on different lines (e.g., N n vs. Ng ng).

Here is a way to check that Shoebox has been set up correctly to sort a lexical database: export it using the Multi-Dictionary Formatter (MDF). In the heading above each section of a dictionary or gloss index (finderlist), MDF prints the first uppercase-lowercase pair from the corresponding primary sorting line (e.g., C c). If a writing system does not have case distinctions, MDF simply prints the first character.
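The principle above can be illustrated with a small Python sketch. It is not how Shoebox or MDF is implemented; it only models the idea that each primary sorting line is one dictionary section. The sample sort order, the name PRIMARY_LINES, and all function names are hypothetical, and the heading rule is simplified to look only at the first two items on a line.

```python
# Hypothetical primary sort order: one line per dictionary section.
# Invented for illustration; not taken from any real language encoding.
PRIMARY_LINES = [
    "A a",
    "C c Ç ç",   # Ç shares a section with C
    "N n",
    "Ng ng",     # the digraph is a separate phoneme, so it gets its own line
    "O o",
]


def parse_lines(lines):
    """Map each character or multigraph to the index of its line."""
    section_of = {}
    for index, line in enumerate(lines):
        for item in line.split():
            section_of[item] = index
    return section_of


def section_index(word, section_of):
    """Return the line index of the longest item that starts the word."""
    longest = max(len(item) for item in section_of)
    for length in range(longest, 0, -1):
        prefix = word[:length]
        if prefix in section_of:
            return section_of[prefix]
    return None  # the word starts with a character missing from the list


def heading(line):
    """Simplified heading rule: an upper/lower pair if the first two
    items differ only by case, otherwise just the first item."""
    items = line.split()
    if len(items) >= 2 and items[0] != items[1] and items[0].lower() == items[1]:
        return f"{items[0]} {items[1]}"
    return items[0]


def group_by_section(words, lines):
    """Assign words to sections, keeping sections in line order.
    Words inside a section stay in input order (no secondary sorting)."""
    section_of = parse_lines(lines)
    grouped = {heading(line): [] for line in lines}
    for word in words:
        index = section_index(word, section_of)
        if index is not None:
            grouped[heading(lines[index])].append(word)
    return {head: ws for head, ws in grouped.items() if ws}


if __name__ == "__main__":
    sample = ["çay", "nga", "cat", "Nora", "ngoma", "ana"]
    for head, ws in group_by_section(sample, PRIMARY_LINES).items():
        print(head, ws)
```

In this sketch, "çay" and "cat" land under the same heading because Ç and C share a line, while "nga" and "ngoma" fall under Ng ng and not N n because the longest matching item is tried first. Moving Ç to its own line, or merging Ng into the N line, changes the sections accordingly, which is the same effect one would look for in the exported dictionary headings.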

