Tuesday, January 16, 2007

How to Make a Dictionary, Session 9, Tuesday 2006-12-12


Introduction to a Field Linguist's Toolbox
Guest Lecturer: Sascha Griffiths


The linguistic organization SIL (see www.sil.org) documents little-known languages all over the world with the help of a software package called Toolbox.
Toolbox was developed to help linguists manage the data from their fieldwork on unfamiliar languages. With Toolbox, new vocabulary and information on grammar, morphology, syntax and phonology can be recorded and used to create new dictionaries.

The name Toolbox is derived from the word "shoebox", which refers to the older method of collecting language data: before portable computers were available, linguists had to carry their notes on a language around in ordinary shoeboxes. During their fieldwork, they wrote down the information they obtained in interviews with native speakers on cards, which they collected in these boxes.
Now that laptops and suitable software are available and easy to transport, the old shoeboxes have been replaced by computer toolboxes that provide modern (dictionary) databases.

Toolbox is a computer program that allows us to enter and review (new) lexical entries easily. The main screen consists of two windows: the left one shows the general dictionary microstructure, and the right one shows the specific dictionary entry.



Concordance
An important aspect a linguist has to consider is the concordance of the new terms (new lexical entries) entered into Toolbox.
This means, for instance, that the linguist has to count the number of times a new word occurs.
If it appears very often, it is likely to be an important lexical or functional word. It may be part of the basic vocabulary of the language (fundamental vocabulary), or it may be essential to grammar or syntax (for instance, it may act as an auxiliary that is needed to form a tense, mood or aspect).
Apart from the frequency of a word, the linguist also has to consider its environment or context. Therefore, it is important to know where a specific word "normally" appears.
Does it prefer certain positions, or are there even specific conditions that have to be met before the word can appear?
The answers to these questions can tell us a lot about the usage of words, their importance and their relation to larger contexts in general.
Since spoken (and written) language consists of words combined according to specific grammatical rules and usage restrictions, unknown languages can be observed, described and finally explored with the concordance methods described above.
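
The two questions above (how often does a word occur, and in which environment?) can be illustrated with a small sketch. This is not how Toolbox itself computes a concordance; the function names, the simple tokenization rule and the sample sentence are assumptions made for illustration only.

```python
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase word tokens (a deliberately simple rule)."""
    return re.findall(r"[^\W\d_]+", text.lower())


def frequencies(tokens):
    """Count how many times each word form occurs."""
    return Counter(tokens)


def concordance(tokens, keyword, width=2):
    """Return (left context, keyword, right context) for each occurrence."""
    hits = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = tokens[max(0, i - width):i]
            right = tokens[i + 1:i + 1 + width]
            hits.append((" ".join(left), tok, " ".join(right)))
    return hits


# Invented sample text, not real field data.
sample = "the dog has eaten. the cat has slept. the dog sleeps."
tokens = tokenize(sample)

print(frequencies(tokens).most_common(3))
for left, kw, right in concordance(tokens, "has"):
    print(f"{left:>15} [{kw}] {right}")
```

In this toy sample, "has" always stands between a noun and a verb form, which is the kind of regularity that would suggest an auxiliary.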

The recorded data can easily be exported from Toolbox. Once the data has been entered into the database system, it is relatively easy to create a dictionary database.
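
As a rough illustration of what reusing such data can look like: Toolbox databases are plain text files in which each field starts with a backslash marker. The sketch below assumes the marker names \lx (lexeme), \ps (part of speech) and \ge (English gloss), which are common in Toolbox dictionary projects but depend on the individual project setup. The function name and the two sample records are invented for this example.

```python
def parse_records(text, record_marker="lx"):
    """Group backslash-marked fields into records; a new record starts at \\lx."""
    records = []
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("\\"):
            continue  # skip blank lines and anything without a marker
        marker, _, value = line[1:].partition(" ")
        if marker == record_marker:
            current = {}
            records.append(current)
        if current is not None:
            # A marker may occur more than once per record, so keep a list.
            current.setdefault(marker, []).append(value.strip())
    return records


# Invented sample data; the marker names are an assumption (see above).
SAMPLE = r"""
\lx run
\ps v
\ge move fast on foot

\lx runner
\ps n
\ge person who runs
"""

for record in parse_records(SAMPLE):
    print(record["lx"][0], record["ps"][0], record["ge"][0], sep=" | ")
```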



Inflection and Compounding

Inflection
A word consists of a stem and an inflection (a stem is either a root or a derived stem!).
The inflection is related to the external structure, that is, to the syntax of the phrase or utterance. The inflection a word takes has to fit the environment of the word; it has to agree with the context. Even if an inflection is missing completely, this absence carries morphological information: a stem plus a zero inflection can signal, for example, the singular or the indefinite form of a word!

In Latin and in German the inflection system is more complex than in English. English nouns do not carry different case forms for subject and object. In a simple declarative sentence, the noun before the verb has to be the subject, while nouns that follow the verb take the function of the object(s).
In Latin and German the sentence structure is less rigid. Objects can be distinguished from the subject by their inflected form and can therefore also appear at the beginning of the sentence. In English this is not possible without changing the meaning or the construction (e.g. active vs. passive) of the utterance.

Example:

German

Ich sehe den Mann. (Accusative)

English

I see the man. (No case inflection on the noun; the subject has to be in first position)

German

Den Mann sehe ich. (Possible word-order variant)

English

*The man I see. (Not acceptable as a neutral declarative sentence in English)


Whereas German often marks case only on the article (determiner), Latin marks case on the noun itself and has a complex noun inflection with 6 cases (Nominative, Genitive, Dative, Accusative, Ablative, Vocative).

For example:

ara, arae (Nom., sing., pl.)
arae, ararum (Gen., sing., pl.)
arae, aris (Dat., sing., pl.)
aram, aras (Acc., sing., pl.)
ara, aris (Abl., sing., pl.)
ara, arae (Voc., sing., pl.)
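
The paradigm of "ara" can also be written as a small lookup table. The following sketch is only an illustration: the function names are invented, vowel length is ignored, and the vocative forms (identical to the nominative in this declension) are included so that all six cases named above are covered.

```python
# Endings of the Latin first declension as (singular, plural), vowel length ignored.
FIRST_DECLENSION_ENDINGS = {
    "Nom.": ("a", "ae"),
    "Gen.": ("ae", "arum"),
    "Dat.": ("ae", "is"),
    "Acc.": ("am", "as"),
    "Abl.": ("a", "is"),
    "Voc.": ("a", "ae"),
}


def decline(stem):
    """Build the full paradigm for a first-declension stem such as 'ar'."""
    return {
        case: (stem + sg, stem + pl)
        for case, (sg, pl) in FIRST_DECLENSION_ENDINGS.items()
    }


def possible_cases(form, stem):
    """List every (case, number) reading that matches a given word form."""
    readings = []
    for case, (sg, pl) in decline(stem).items():
        if form == sg:
            readings.append((case, "sing."))
        if form == pl:
            readings.append((case, "pl."))
    return readings


for case, (sg, pl) in decline("ar").items():
    print(f"{sg}, {pl} ({case}, sing., pl.)")

print(possible_cases("aram", "ar"))
print(possible_cases("arae", "ar"))
```

A form like "aram" has exactly one reading, whereas "arae" is ambiguous between several cases, so the ending alone does not always identify the function of the noun.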


With the help of derivation, words can even change their part of speech:

Example:
to run (verb) → runner (noun)

In a few cases, even zero derivation (the absence of a suffix) can change the part of speech of a word:

Example:
to run (verb) → a run (noun)



Compounding

Compounds normally have a binary structure (2 items that can be identified by drawing an internal tree structure).
Very long compounds, however, can consist of more than two items, each of which can be divided further.
A compound stem can contain a derived stem, which in turn contains a root.
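
The nesting described above (a compound stem containing a derived stem, which contains a root) can be sketched as a small tree structure. The class names and the English example "dishwasher" are assumptions chosen for illustration; they are not taken from the lecture.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Root:
    form: str


@dataclass
class Derived:
    base: "Stem"   # the stem the affix attaches to
    affix: str


@dataclass
class Compound:
    left: "Stem"   # binary division: exactly two parts
    right: "Stem"


Stem = Union[Root, Derived, Compound]


def bracket(stem):
    """Show the internal tree structure as a bracketed string."""
    if isinstance(stem, Root):
        return stem.form
    if isinstance(stem, Derived):
        return f"[{bracket(stem.base)} -{stem.affix}]"
    return f"[{bracket(stem.left)} + {bracket(stem.right)}]"


def roots(stem):
    """Collect all roots contained in a stem, from left to right."""
    if isinstance(stem, Root):
        return [stem.form]
    if isinstance(stem, Derived):
        return roots(stem.base)
    return roots(stem.left) + roots(stem.right)


# dish + (wash + -er): a compound whose second part is a derived stem.
dishwasher = Compound(Root("dish"), Derived(Root("wash"), "er"))
# A longer compound is still binary at every level of the tree.
dishwasher_repair = Compound(dishwasher, Root("repair"))

print(bracket(dishwasher))
print(bracket(dishwasher_repair))
print(roots(dishwasher_repair))
```

Keeping every compound node binary means that a long compound is represented as a compound inside a compound rather than as a flat list of items.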


In summary, there are three basic ways of creating new words in a particular language:
1.) Creating words by inventing new roots.
2.) Creating words by derivation from already existing linguistic material.
3.) Creating words by compounding two or more already existing terms.
