Building a Lexicon

Currently in my constructed-language work for The Curse of Steel, I’m selecting word roots from my script-generated list of all the legal possibilities.

I’m not being particularly systematic here. I started with the roots for several names I had already settled on during early development, along with roots from an earlier word-list that I built before I started getting my computer to help out with all this. (Along the way, I discovered that I had broken some of my own rules about legal word-root formation. Time to make minor tweaks to the word-lists!)

With that finished, I’ve been grabbing words from a variety of sources: color terms, the numerals from one to ten, and so on. I’ve even pulled down my copy of the Silmarillion and started paging through the appendices for ideas – that’s kind of a ready-made list of vocabulary prompts for any naming language! Not that I’m slavishly imitating any one source, but if my final lexicon ends up sounding vaguely Indo-European and vaguely like Sindarin, I suppose I can be accused of stealing from the best.

So far I’ve got about 80 word-roots. The list follows, taken straight from my growing spreadsheet. A couple of notes first.

You’ll notice the word roots incorporate some numerals and special characters. Each of those stands for a single phoneme that would normally be written with more than one character. That way, when I pull the roots over to be processed by another Perl script, I won’t have to fuss too much with parsing those out. If you know anything about PIE (Proto-Indo-European) phonology, you’ll probably recognize that I’m using a similar set of three “laryngeal” consonants, which will disappear from the daughter languages but give rise to a variety of vowel colorations. Other special characters represent aspirated or labialized consonants (e.g., marking the differences among phonemes we might pronounce as g-, gh-, or gw-).
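A minimal sketch of why the one-character-per-phoneme encoding keeps parsing simple, written in Python for illustration rather than Perl. The function name and the category labels are hypothetical. It assumes the digits 1 to 3 are the three laryngeals and treats every other non-letter symbol as an aspirated or labialized consonant without guessing which symbol is which.

```python
def classify_phonemes(root):
    """Split a root into one-character phoneme symbols and label each one."""
    labeled = []
    for symbol in root:  # one character per phoneme, so no multi-character parsing
        if symbol in "123":
            kind = "laryngeal"          # assumption: digits encode the laryngeals
        elif symbol in "aeiou":
            kind = "vowel"              # assumption: ordinary vowel letters
        elif symbol.isalpha():
            kind = "plain consonant"
        else:
            kind = "special consonant"  # aspirated or labialized; exact value not specified
        labeled.append((symbol, kind))
    return labeled


for symbol, kind in classify_phonemes("me2r@"):
    print(symbol, kind)
```

Because every phoneme is exactly one character, splitting a root is just iterating over the string; a notation with digraphs such as gh or gw would need a tokenizer first.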

Meanwhile, every word root has a “weight” attached. This is something I built into the script that generates the word roots, to enforce some assumptions about which phonemes are most common.
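How the weight is computed, and how it is used afterwards, isn't spelled out here. One plausible use, sketched below in Python as an assumption rather than a description of the actual script, is a weighted random draw in which higher-weight roots come up more often. The `pick_roots` helper is hypothetical; the root and weight pairs are copied from the list below.

```python
import random

# A few (root, weight) pairs copied from the list below.
ROOTS = [("1es", 640), ("ken", 630), ("kel", 489), ("ke3r", 315), ("de!", 140)]


def pick_roots(roots, k, seed=None):
    """Draw k roots at random (with replacement), favoring higher weights."""
    rng = random.Random(seed)  # a seed makes the draw repeatable
    names = [root for root, _ in roots]
    weights = [weight for _, weight in roots]
    return rng.choices(names, weights=weights, k=k)


print(pick_roots(ROOTS, k=5, seed=1))
```

With these numbers, "1es" (640) should turn up roughly four to five times as often as "de!" (140) over many draws.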

| Ur-Language Root | Weight | Part of Speech | Meaning | Notes |
| --- | --- | --- | --- | --- |
| re2n | 567 | Adverb | Particle for future aspect of verbs | |
| we2 | 489 | Adverb | Particle to indicate negation of verbs | |
| 2sper | 352 | Adverb | “away” | |
| te2n | 440 | Conjunction | “and” | |
| rey | 540 | Noun | “chieftain, noble, king” | |
| d2en | 440 | Noun | “man,” also numeral “ten” | |
| we@ | 420 | Noun | “water” | |
| ke2m | 392 | Noun | “hand,” also numeral “five” | |
| kest | 392 | Noun | “head” | |
| @e2n | 378 | Noun | “tree” | |
| 2eng | 378 | Noun | “iron” | Probably borrowed from another language group |
| %en | 360 | Noun | “girl, woman” | |
| 1kwes | 313 | Noun | “lake, pond, pool” | |
| me2r@ | 302 | Noun | “fate, doom” | |
| $2er | 252 | Noun | “home, dwelling” | |
| ke3lm | 196 | Noun | “hill, knoll, rock” | |
| ye1 | 480 | Numeral | “one” | |
| kens1 | 403 | Numeral | “seven” | |
| tre1s | 403 | Numeral | “three” | |
| 2tes | 392 | Numeral | “two” | |
| semt1 | 358 | Numeral | “six” | |
| we2rs | 352 | Numeral | “four” | |
| let3 | 244 | Numeral | “eight” | |
| pen@3 | 189 | Numeral | “nine” | |
| weyt | N/A | Verb | “to know, to see (visions)” | Not a legal ur-language root, probably borrowed from another language group |
| 1es | 640 | Verb | “to be” (indicating a state of being) | |
| ken | 630 | Verb | “to think, to engage in spiritual activity” | |
| ret | 630 | Verb | “to guard, to protect” | |
| wer | 630 | Verb | “to die” | |
| ne2r | 567 | Verb | “to be glorious, to be brilliant” | |
| tren | 567 | Verb | “to be stiff, to be taut, to be mighty” | |
| mew | 560 | Verb | “to partition” | |
| re@ | 540 | Verb | “to hit, to strike” | |
| kres | 504 | Verb | “to mix up, to confuse” | |
| me2r | 504 | Verb | “to crowd, to form a crowd” | |
| kel | 489 | Verb | “to be cold, to be chilly” | |
| nek2 | 441 | Verb | “to strip away, to expose” | |
| pret | 441 | Verb | “to exchange” | |
| terk | 441 | Verb | “to break” | |
| t2er | 440 | Verb | “to crash, to smite” | |
| dren2 | 396 | Verb | “to lengthen, to be long” | |
| gre1n | 388 | Verb | “to sanctify, to make a treaty” | |
| 1@em | 384 | Verb | “to stand” | |
| $er | 360 | Verb | “to turn” | |
| me3r | 360 | Verb | “to be large, to be great” | |
| kre2s | 352 | Verb | “to be black” | |
| ke3 | 350 | Verb | “to bend” | |
| 2lew | 342 | Verb | “to flow (like water)” | |
| kelt | 342 | Verb | “to hammer, to work with metal” | |
| welk | 342 | Verb | “to tear” | |
| teym | 336 | Verb | “to encircle, to finish (a circle)” | |
| de3n | 315 | Verb | “to give, to receive a gift, to be guest-friends” | |
| dre3 | 315 | Verb | “to have sacred power” | |
| ke3r | 315 | Verb | “to run” | |
| kre2w | 308 | Verb | “to make a harsh sound, to croak” | |
| sen2@ | 302 | Verb | “to be old, to be ancient” | |
| 2el@ | 293 | Verb | “to be white” | |
| 2ewg | 293 | Verb | “to hear” | |
| te$ | 280 | Verb | “to be wild, to be free” | |
| ske2t | 274 | Verb | “to hate” | |
| te2lm | 274 | Verb | “to spread” | |
| #e2n | 252 | Verb | “to go, to walk” | |
| ke3rs | 252 | Verb | “to stand tall, to tower” | |
| wer# | 252 | Verb | “to threaten” | |
| de3w | 244 | Verb | “to be dark (in color)” | |
| k3el | 244 | Verb | “to be whole, to be unmarred” | |
| kwe3 | 244 | Verb | “to be loyal” | |
| le3k | 244 | Verb | “to burn, to set aflame” | |
| we3k | 244 | Verb | “to speak, to call” | |
| kle2w | 240 | Verb | “to cut, to slice” | |
| g2els | 235 | Verb | “to be green” | |
| ske2@ | 235 | Verb | “to darken” | |
| de1# | 224 | Verb | “to take” | |
| @er# | 216 | Verb | “to bite” | |
| 1rew# | 201 | Verb | “to be red” | |
| te2$ | 196 | Verb | “to hurt, to harm” | |
| 3re$ | 180 | Verb | “to straighten, to direct” | |
| $2ey | 168 | Verb | “to be blue” | |
| $eyt | 168 | Verb | “to be white” | |
| de! | 140 | Verb | “to divide” | |

I think I’ll probably generate a few dozen more roots, then copy them into a separate spreadsheet where I’ll build actual words. Most of the roots will make perfectly good words without modification, but I’ll also apply some of the word morphology rules I’ve worked out to derive more words. I imagine I’ll have as many as 200-250 words by the time I’m done, enough to form the basis for a decent naming language. Then to build Perl scripts to apply the sound-change rules.
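To give a feel for what a sound-change script does, here is a minimal Python sketch (not Perl, and not the actual rules, which aren't given here). The three rules are invented for illustration and loosely follow the textbook PIE pattern: the second laryngeal colors an adjacent e to a, the third colors it to o, and all three laryngeals then drop. The names `RULES` and `apply_sound_changes` are hypothetical.

```python
import re

# Invented rules for illustration only; each is (regex pattern, replacement).
# Order matters: vowel coloring has to happen before the laryngeals are deleted.
RULES = [
    (r"e2|2e", "a"),  # laryngeal 2 colors an adjacent e to a
    (r"e3|3e", "o"),  # laryngeal 3 colors an adjacent e to o
    (r"[123]", ""),   # any laryngeal still left simply disappears
]


def apply_sound_changes(root, rules=RULES):
    """Apply each (pattern, replacement) rule to the root, in order."""
    word = root
    for pattern, replacement in rules:
        word = re.sub(pattern, replacement, word)
    return word


for root in ["me2r", "ke3r", "2eng", "ye1"]:
    print(root, "->", apply_sound_changes(root))
```

Under these made-up rules, me2r comes out as mar and ke3r as kor. Swapping in a different ordered rule list per daughter language would give three related but distinct outputs from the same root.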

Once that’s done – no doubt with a certain amount of tweaking to suit my aesthetic tastes – I’ll have a system by which I can quickly create and record new words as I write the story. In three different, but clearly related, languages!

Lots of work up front, to save a lot of work and frustration later. That’s what computers are for, right?
