Structure of the TELL database Version 1.0

TELL Version 1.0 consists of the following information:

25,000 distinct lexemes (dictionary headwords), in Turkish orthography. These are taken more or less intact from the 2nd and 3rd editions of the Oxford Turkish-English dictionaries, with duplicate entries collapsed. Many of the words are morphologically complex (suffixed, compounded, or both).
4,900 place names, in Turkish orthography. These are taken from an atlas of Istanbul and a telephone area code directory of Turkey. Many of the place names are morphologically complex.
Transcribed pronunciations of the 17,500 dictionary lexemes and place names known to a 63-year-old native speaker of Standard Istanbul Turkish. Each word is transcribed phonemically. To reveal morphophonemic alternations, each word was elicited and transcribed in several morphological contexts. Nominals were elicited in five contexts: the nominative ("citation") form, which has no suffix, and before the accusative, professional, 1sg possessive, and 1sg predicative suffixes. Verbs were elicited in three contexts: the long infinitive ("citation"), the aorist, and the causative.
11,500 etymologies. TELL workers searched the literature for the etymological source languages of the words in the database. Native vs. nonnative status is encoded, and for loans, the source language (or language family, in the case of some Romance loans of ambiguous origin) is identified. Note that identifying the source of a loanword does not necessarily indicate the route of transmission; e.g., many Arabic loans came into Turkish through Farsi.
17,500 morphological roots (in Turkish orthography). The dictionary entries were run through Prof. Kemal Oflazer's morphological analyzer, and the results were proofread by a native speaker. For each dictionary entry recognized by the analyzer, TELL supplies the morphological root of the word.
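
The components listed above can be pictured as fields of a single per-word record. The following Python sketch is only an illustration under stated assumptions: the class name TellEntry, every field name, and the context labels are hypothetical and are not taken from the TELL distribution, whose actual file layout is not described here. The example word uses ordinary Turkish orthography as a stand-in for a phonemic transcription.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# Elicitation contexts named in the description; the labels are hypothetical.
NOMINAL_CONTEXTS: Tuple[str, ...] = (
    "nominative", "accusative", "professional",
    "1sg_possessive", "1sg_predicative",
)
VERB_CONTEXTS: Tuple[str, ...] = ("infinitive", "aorist", "causative")


@dataclass
class TellEntry:
    """Hypothetical record layout; not the actual TELL file format."""
    orthography: str                       # headword or place name, Turkish orthography
    is_place_name: bool = False
    category: Optional[str] = None         # "nominal" or "verb", if transcribed
    transcriptions: Dict[str, str] = field(default_factory=dict)  # context -> form
    native: Optional[bool] = None          # None when no etymology is recorded
    source_language: Optional[str] = None  # for loans only
    root: Optional[str] = None             # None if the analyzer did not recognize the word

    def expected_contexts(self) -> Tuple[str, ...]:
        """Contexts in which a word of this category would be elicited."""
        if self.category == "nominal":
            return NOMINAL_CONTEXTS
        if self.category == "verb":
            return VERB_CONTEXTS
        return ()

    def missing_contexts(self) -> List[str]:
        """Expected contexts that have no transcription yet."""
        return [c for c in self.expected_contexts() if c not in self.transcriptions]


# Illustrative entry, not copied from TELL. Orthographic forms stand in
# for phonemic transcriptions; "kitap" ~ "kitabı" shows a p ~ b alternation.
example = TellEntry(
    orthography="kitap",
    category="nominal",
    transcriptions={"nominative": "kitap", "accusative": "kitabı"},
    native=False,
    source_language="Arabic",
    root="kitap",
)

print(example.missing_contexts())
```

Keeping the transcriptions in a mapping keyed by context, rather than in five or three fixed fields, lets one record type serve both nominals and verbs and leaves room for words that have no transcription, etymology, or root at all, which the differing counts above suggest is common.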
