#but basically there are CV syllables and VC syllables
Text
worked on my conlang a lot today :] lost track of time a little bit .. i got around to drawing the base of all the glyphs so now i just need 2 make all the variants for them
#i already finished 1... so 17 more to go lol#and by 17 i mean. 17*18. since there are 9 vowels + the up and down variations#if anybody is curious its a syllabary. rather than an alphabet.. it does sort of work similarly to an alphabet tho so idk if it counts as a#like. True syllabary#but basically there are CV syllables and VC syllables#its written on a line and CV syllables go above the line and VC syllables go below the line#and each consonant has like a base glyph and then there are 9 vowels. which are umm#the way im representing the vowels is with 1-3 lines either perfectly horizontal tilted up or tilted down#grouped together on the base consonant.#that doesnt rly make sense i cn upload image 2 explain if anybody wants...#i havent actually decided which glyph represents which sounds yet LMAO. im just working on getting them done so i cn decide based on how#they look ...and once i actually have them done then ill need t umm. go through and just write them a bunch#to simplify them. since famously writing tends to get simplified over time ...#ill have t rly work out the language first. ITS ALSO COMPLICATED BC. so the way language is in my pretend world#is the like. well they arent rly human. but the people in the world couldnt talk before they were taught by the gods#so all languages have a shared like..root language. obv it changes and branches out over time which gives me an exciting opportunity to do#lots and lots of languages without having to come up with new roots ... this is rly fun TO ME.#the thang with it tho is that idt they learned how to write immediately after they learned speech.. i think that mightve come later#so the syllabary im working on rn is rly just the likee. semi modern WRITTEN form of the original language .. not rly modern its still#likee. ancient#AAAA this reminds me i wanna make a calendar system....#ill have t work on that as well. i wish i had that one expensive ass game that lets u likee. fuck around with le solar system#2 see the effect it has on yr planet... bc mine has 2 moons#that r on opposite sides of the earth at all times#they arent fr moons theyre the creation gods you see. but they appear as moons#i also need to flesh out more of the gods.. bc it won't just be the creation gods there will be a pantheon#but im thinking abt it a lot bc likee. so the creation gods Are the moons you know. so for the other ones i cant decide if theyre moons of#other planets or if they just Are the planets... OR if theyre just floating about out there yk. i suppose rly they could be whatever they#want. yk.. and obviously they all have multiple forms likeee. yk...#also maybe the creation gods arent themselves the moons maybe they just live there Much 2 think about.
Text
...And now I'm thinking about the time travel conlang and how I might want that to work.
I looked up the phrase just to see if I could get inspiration from other time travel–related languages and found Mpiua Tiostouea, the language of all time. It's got some neat concepts, though it was designed to have an... interesting... phonology and I'd definitely make some different choices—which is good! It means I won't be copying ideas when I make my own conlang.
A conlang for time travellers needs to be able to express some complex and seemingly self-contradictory tenses. For instance, I might tell you this sentence:
"After I go to my date with the time worm, I'll text you how it went."
Except today is Thursday, and my date with the time worm, which I'm going to tomorrow, is Wednesday (yesterday), and I plan on jumping again afterwards, but I'm not sure in which direction or how long it'll take me to get around to texting you, and at any rate you only experience time forwards and will certainly receive the text in the next few subjective and objective days.
...Also, while I, the person talking to you, am going to be going to the date and sending the text, I'm not dating the time worm—the date is between myself from three years into the future (as opposed to an alternate version of myself whom I never have been and never will be), I'm spying on it, and also the time worm experiences all of time simultaneously in every universe and thus has no time clones or past/future selves.
...And the groupchat has like three versions of you in it.
A properly time travel–inclusive language should encode all of these things efficiently through the use of creative agreements, pronouns, and tenses.
It should also be inclusive towards people who experience time in reverse. Not those who've lived backwards all their lives—they can learn any language just fine, the same way everyone else does—but people who've found themselves temporarily moving the wrong way through time, despite having learned the language forwards. I think this can be settled by having two acceptable word orderings—one the reflection of the other—and employing asymmetrical particles that indicate important things like proper nouns and sentences, and maybe having a necessarily asymmetrical syllable structure.
Like CV. Every syllable necessarily has one consonant followed by one vowel, unless you're experiencing time backwards relative to your conversation partner, in which case all their speech will sound to you like every syllable is VC, and the same from you to them. That ought to work and to be simple enough that anyone, with any native language from anywhere across time, can pick it up with relative ease.
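The CV/VC mirror claim is easy to sanity-check. The sketch below is only an illustration: the word `tamiko` and the five-vowel set are made up for the example, and reversing a string of letters is a crude stand-in for hearing speech backwards.

```python
def skeleton(word, vowels="aeiou"):
    """Map each letter to C or V (hypothetical five-vowel inventory)."""
    return "".join("V" if ch in vowels else "C" for ch in word)


def heard_backwards(word):
    """Crude stand-in for how a time-reversed listener hears the word."""
    return word[::-1]


word = "tamiko"  # invented strictly-CV word, not from any real lexicon
forward = skeleton(word)                    # CVCVCV
backward = skeleton(heard_backwards(word))  # VCVCVC
print(forward, backward)
```

Because a strictly CV word can never come out as CV when reversed, a listener can tell from the syllable shape alone which way the speaker is travelling.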
Then we get to pronouns. Mpiua Tiostouea has an impressive set of seven grammatical persons, numbered 1–7. I understand and respect the reasoning behind such a choice (and a dedicated grammatical person for antimemes is pretty darn cool), but I'd rather go in the opposite direction:
1st person: I, the one talking to you.
1.5th person: Me, but a different instance of me than the one talking to you.
2nd person: You, the one listening to me.
2.5th person: A different instance of you than the one listening to me.
3rd person: That guy, the one I'm pointing to.
3.5th person: That guy, but an instance of them that's not right here.
4th person: The time worm, which experiences all of time and the multiverse simultaneously.
...Which coincidentally is also seven grammatical persons.
Due to the need to stress subjective and objective time experience for multiple entities, basically everything that can take agreement will agree with the person and gender of whatever it can agree with—most crucially, verbs, which might include tense markings that have to agree with any number of people:
"I'm having a party with these guys last week, do you want to come?"
Where I'm going to the party in the future and inviting you to come along in your subjective future (while acknowledging you may have already been), but some of the people I'm gesturing to have already been to the party and others have yet to go. Also one of them is the time worm. I think this party might be where we met... will meet.. whichever. Both.
Now, when I say gender, I don't mean male vs. female. Time travellers can come from any timeline. Some of them have only one acknowledged gender. Others have three. A few have as many as sixteen, or even more. Some of them plot gender on a four-dimensional spectrum encoded in the phonology of their gender pronouns. The only way to please everyone's idea of what gender trappings deserve encodement is to encode them all equally—that is to say, not at all.
Besides, we're all time travellers here. I don't need to specify how you identify with each word. I want to know if this is you, or your future self, or your evil alternate universe self. That's the kind of gender I'm concerned with.
Which means you can have a mixed-gender group (the three versions of you in the groupchat) that needs to be referred to with... essentially, it'd be something like you (2sg) and you (2.5pl), where you (2.5pl) is gendered both for your past self and for your alternate universe self, which are two different genders.
I think this ought to be my next conlang project. It's been way too long since I really got into one—right now, Yvelse is my only conlang that's not either dead or been in cold storage for the past year+.
Note
SO I went through the Project Opal tag and WOW. Great worldbuilding, I can picture it. How do you come up with names and words in the language? I focus on the real world with my writing so not much is left up to me to decide.
I’m glad you asked! Which I’m realizing is a phrase I use a lot!
Loredump time~!
And also
Linguistics time~!
So! The language spoken by people in the Vandeth Desert is called Vandeth. You asked about names and words, so I’ll talk about names and words.
I knew I wanted to use a constructed language (or conlang, for short) for the Vandeth people. After a previous project proved extremely time-consuming and not at all worth it, I decided to create Vandeth using a top-down method. I started by just making up words, then seeing what they had in common, finding rules they follow. Every word I made after that would follow those rules. And when I needed grammar rules, I made those up, and continued following them.
Some of the most important things are vowel inventory, consonant inventory, and phonotactics (what sequences of sounds are allowed to go together). Vandeth uses the standard /a/, /e/, /i/, /o/, and /u/ vowels, and that’s it. (They’re pronounced consistently, unlike in English.) I won’t write out all the consonants, but at this point I’m no longer adding any new ones.
Now, the phonotactics. This is mainly about syllable shapes. In Vandeth, the most common syllable shape is CV. After that is CVC. A very rare syllable shape is VC. Even rarer are CVV and CCVV. I also try to have a good balance of how often certain sounds appear and where. Hard, sharp sounds are more common, while soft, round sounds are rarer. What makes a sound hard, sharp, round, or soft is kinda vague. It’s a bit of a kiki/bouba situation. But to me, a word like “luvimo” doesn’t sound like Vandeth at all, but “shivaki” does.
But how do I even come up with new words? Well, I first look at the words I have and consider whether the new word can be derived from any of those. At one point, I wanted a word for “gossip”. I looked at the words I had, and I noticed blai, “stain” and saksa, a verb stem meaning “to talk”. In Vandeth, words go after the word they describe, and when a word is derived from two others the words swip-swap. So the word for “gossip” ends up being blaisaksa. As another example, the word geital is a combination of gi, “two”, and keital; it used to be “gikeital” before it was shortened to be easier to say. The reason for this is that a geital is the same length as a keital but twice the length. The word keital itself actually comes from the verb stem kei, “to wear”, and the noun tal, “shadow”.
And what about names? Well, usually it’s just a word. Or it should be. I gave a throwaway character (an infant) the name Kimi, which in Vandeth means “pearl”. It’s kind of a cutesy name. Most often I just pick sounds that are Vandeth-y. It’s really important to me that Vandeth names (Vennem, Kalami, Mela) sound distinct from Delgane names (Lynn, Elvi, William, basically any English name) and names from other languages (Sóf, Markhi, Lili).
Don’t even get me started about grammar. There’s lots of linguistics and affixes involved, and admittedly, I haven’t made a whole lot of full sentences so the grammar is actually not super fleshed out. There’s enough for deriving words, though. Maybe I should just start translating random things, or have people send things to be translated via asks. Hmm… Anyway.
#stars. i’m such a nerd#but i love it and i love my work#this is super fun to me and i’m really glad there’s other people who enjoy it#asks#answered asks#cb answering stuff#project opal#lore dump#lore#wip lore#worldbuilding#conlang#constructed language
Text
Conlang part 4: syllable structure
welp we're doing all the consonants and all the vowels, so we'll need to figure out syllables. following are the possible rules i'm allowing for basic placement for each individual syllable:
(C)V(C): vowel required, up to one consonant on either side
(C)(C)V(C): up to two consonants before, up to one after
(C)V(C)(C): up to one consonant before, up to two after
(C)VC(C): up to one consonant before, 1-2 after (but NOT 0 after)
(C)CV(C): 1-2 consonants before, up to 1 after
(C)CVC(C): 1-2 consonants before, 1-2 consonants after
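To see which concrete syllable shapes each of these templates licenses, the optional slots can be expanded mechanically. This is a rough sketch; the function name `expand` and the sorting order are choices made for the example, not part of the original post.

```python
import re
from itertools import product

TEMPLATES = [
    "(C)V(C)", "(C)(C)V(C)", "(C)V(C)(C)",
    "(C)VC(C)", "(C)CV(C)", "(C)CVC(C)",
]


def expand(template):
    """Return every concrete shape a template allows, shortest first."""
    # "(C)" is an optional consonant; bare "C" and "V" are required.
    tokens = re.findall(r"\(C\)|C|V", template)
    choices = [("", "C") if tok == "(C)" else (tok,) for tok in tokens]
    shapes = {"".join(combo) for combo in product(*choices)}
    return sorted(shapes, key=lambda s: (len(s), s))


for template in TEMPLATES:
    print(template, "->", expand(template))

# Union of all shapes allowed by any of the six templates.
all_shapes = sorted({s for t in TEMPLATES for s in expand(t)},
                    key=lambda s: (len(s), s))
print(all_shapes)
```

Expanding the templates makes overlaps visible: for instance, everything `(C)VC(C)` allows is also allowed by `(C)V(C)(C)`, so which template a language picks only matters for what it excludes.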
@ignisuada
Text
tempted to make a clang that's like
Parthia hosting a row of embassies between the Chinese and Roman empires. For whatever reason the Chinese and Roman families mix, and a blend of 1st century Latin, late Old Chinese/Early Middle Chinese, and late Parthian emerges
Syntax would be SVO
Cultures would supply vocabulary appropriate to their source, e.g. ? < *sɨ for silk from MC, or slɨ from OC; ? < *ɔɫeʊ̃ˑ for olive oil from Latin, ? < something like *muɾɣ for chicken from Parthian
Political and work related vocabulary might lean IE while household vocabulary might lean Sinitic if just to provide a semantic guideline for vocabulary. Initially the populations would be trilingual but over time the pidgin would pick up Parthian phonetics, filling out a lot of the common vocabulary, and I guess I'm just imagining that the Romans and the Parthians would already have something of a common diplomatic pidgin to work out of due to Graeco-Persian relations and Sinitic is left out of that, but, idk, maybe the diplomats have a lot of daughters to marry off.
Phonology like
p t tS k <p t č k>
b d dZ g <b d j g>
f s S x <f s š x>
z Z R <z ž ǧ>
m n J N <m n ny ng>
w l j r <v l y r>
i ɨ u <i ë u>
e ə o <e ä o>
a <a>
Length matters, somewhat controlled by syllable weight though.
Stress would land on the last syllable.
Consonant clusters would be simplified by epenthesis. Syllables want to be CV > VC, CVC > CVV >>> others, basically. Whether to infix or prefix the epenthetic vowel would be according to sonority; əndar < ntar but tərē < trej
A glottal stop would be semi-phonemic, starting V initial words and responsible for seemingly short final syllables, mostly from Xan.
Latīnə short non-low vowels would generally be represented by the central vowels due to mismatches.
Morphology generally Parthian
Nouns would only show inflection in the plural. The developing split ergativity of Middle Persian and Parthian would be suppressed for SVO, but -ān still only being an oblique plural lends towards generalizing a long vowel nominative from Latin. Though the Latin by this point is already pretty fairly leveled case-wise, so it might actually just come from an accusative. Final s from Latin would generally be lost, which would imply a merger of neuter and feminine nouns, so generalized from -ā, which has the benefit of looking like the oblique plural -ān inherited from Persian.
Verbs would be leveled to just a stem, which functions as a participle, and an infinitive, the stem suffixed with -an. These would be combined periphrastically with forms of budan and estadan for passive, past, and perfective stems.
The four short romans took the women as wives
pay? čahār twanC rōmānōs capere pay? zan čīyōn uxores
pe čahār tuan rōmānōh kap pe zan čīğōn ɨksəre
Pe čahār-ā tuan-ā rōmān-ā kap ašt pe zanān čē-ğōn ëksär-ān
the women had been taken
Pe zanā kap eštad bud ašt
DEF woman-PL take PERF PASS PAST
I am taking it/will take the silk
Ängō kapan pe sëlū.
idk
Note
Could you explain the difference between “CVVC” and “VCCV” ? I get them mixed up a lot, and if it’s not too much can you explain it in a way a dyslexic person could understand. Because I might have it.
Hi, thanks for asking! I’m assuming you’re talking about English UTAUs, because both those formats can be used for multiple languages.
CVVC stands for consonant - vowel - vowel - consonant and VCCV stands for vowel - consonant - consonant - vowel.
I only know the basics about using each really, but here’s my logic. (Don’t mind the colors I made my UTAU editor fall themed)
Here’s a line in English CVVC. It starts with “da” (CV) and goes into “a m” (VC) The other syllables follow that pattern, too. Most CVVC voicebanks have more than that though, like “s k” (CC) in the picture.
Here’s the same line in VCCV. Some of the vowels are different (“naI” vs “nI”) and there’s some more consonant/vowel blends (skI).
In English, the two recording styles are pretty similar in terms of how they’re used.
I hope some part of this helped!
Note
Hi there Mr. Peterson! I'm curious about the different kinds of scripts out there, such as alphabets like in Latin languages, syllabaries like Japanese, and one I recently learned about, alphasyllabaries/abugidas like Devanagari. Are there any other types of scripts, and if so, how are they different from each other?
It depends if you mean naturally occurring scripts or possible scripts. Because you can do just about anything you could ever imagine—including a 2D script/language that takes full advantage of the 2D space. Basically, if you break down a script and figure out how it represents the language in question, you can imagine a script that differs based on every single variable. For example, glyphs in a script can represent…
a sound
a consonant
a feature
a max CV syllable
a max VC syllable
a max CVC syllable
a word
a sentence
some piece of any of the above
For example, you could imagine a script that represented things above the level of a feature, but below the level of a sound. Something like:
*^ = p
*& = t
*# = k
*$ = pʰ
*@ = tʰ
*£ = kʰ
%^ = b
%& = d
%# = g
%$ = bʰ
%@ = dʰ
%£ = gʰ
Above, * stands for something that’s voiceless and a stop, % stands for something that’s voiced and a stop, ^, &, and # are for things that are unaspirated and labial, coronal, and velar, respectively, and $, @, and £ are for things that are aspirated and labial, coronal, and velar, respectively. If you had another that was, say, !, for something that’s voiceless and a fricative, then, !& would be [s], while !@ would be [sʰ].
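The two-glyph scheme above can be read as a small lookup: the first glyph gives voicing and manner, the second gives place and aspiration. Here is a minimal sketch of a decoder; the table and function names are invented for the example, and only the one fricative the answer mentions (coronal [s]) is filled in.

```python
# First glyph: voicing + manner. Second glyph: place + aspiration.
SECOND = {
    "^": ("labial", False), "&": ("coronal", False), "#": ("velar", False),
    "$": ("labial", True),  "@": ("coronal", True),  "£": ("velar", True),
}

# Base (unaspirated) sound for each first glyph at each place.
BASE = {
    ("*", "labial"): "p", ("*", "coronal"): "t", ("*", "velar"): "k",
    ("%", "labial"): "b", ("%", "coronal"): "d", ("%", "velar"): "g",
    ("!", "coronal"): "s",  # the only fricative given in the text
}


def decode(pair):
    """Turn a two-glyph sequence such as '*^' into the sound it spells."""
    first, second = pair
    place, aspirated = SECOND[second]
    return BASE[(first, place)] + ("ʰ" if aspirated else "")


for pair in ["*^", "%@", "!&", "!@"]:
    print(pair, "=", decode(pair))
```

Combinations the text does not define, such as `!^`, raise a `KeyError` here rather than guessing a sound.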
So yeah, you can do whatever. The writing systems that exist represent a much narrower range of what is possible, but that’s because not everything that’s possible is convenient—and, as with language itself, writing systems exist because they’re the most convenient way of doing whatever it is people thought of doing.
If you want to see probably every writing system in existence, you should head over to Omniglot.com: A site dedicated to writing systems, both natural and created. It’s a great place to lose a few hours!
Text
Linguistics: Identifying Syllables
Identifying syllables may seem easy (and we usually do it instinctively) but it really is not. A syllable is a sound unit that is larger than a single phonetic segment, and smaller than the word. But identifying how many syllables a word has, or how they are divided up, can be complicated – the word fire could be said to have one or two syllables, for example; and in a word like master, should the second syllable start with the s, t, or e?
In phonology, the usual approach is to look at how the specific language combines sounds to produce typical sequences. Syllables are made up of consonants and vowels (C & V). Vowels, in phonology, are defined as sound units that can occur on their own, or at the centre of a sound sequence. Consonants can't occur on their own, and are at the edge/s of a sound sequence, not in the centre.
In English, typical consonant/vowel sequences are CV (see), VC (is), CVC (hat), CCVC (stop), and CVCC (pots). The consonants & vowels must be thought of as speech sounds, not as spellings. The word see is transcribed as /si:/, and has only one vowel in terms of sounds.
By looking at a language in this way, the range of syllable types can be identified, and different languages can also be compared. Some languages, such as Hawai'ian, use only V and CV syllables. Others, such as English, can have several consonants before & after the vowel.
English can have up to 3 consonants before the vowel (e.g. strap & sprig). It can have up to 4 consonants after the vowel (e.g. glimpsed [mpst] and twelfths [lfθs]).
Not all consonant/vowel combinations are possible within a certain language. For example, English can combine /s + t + r/ in words like string and strength. But we do not combine /ʃ + t + r/ to begin a word with shtr (accent variations in words like strength aside). German, on the other hand, does use that combination, for words like Strang (cord) and Strand (beach). (The s is pronounced as /ʃ/.)
So the syllable is an important abstract unit that helps to explain how vowels & consonants are organized within a language's sound system.
When we make a “slip of the tongue”, we mix up & swap around parts of two words. When this happens, it usually is based on syllabic structure, which suggests that we understand syllables on a basic subconscious level. For example, initial consonants are often swapped around – weak and feeble could become feak and weeble. But it would be unlikely for initial and final consonants to be swapped as well – weak and feeble would not be accidentally said as leak and keeble. So a slip of the tongue is probably a slip of the phonological part of the brain.
Note
How many glyphs do you need to make for a syllabary?
That’s a hard question to answer. I assume this is a font question since I am the fontbastard. It depends on the language and the usage for the most part. The easiest way to do it is with ligatures like crazy.
Some applications, like MS word, have a lot of autocorrecting and polishing components, so if you use those, it tends to be a good idea to cover the additional characters so if it autocorrects it, it does not do so to a character not in the font — i.e.; changing an “a” to an “A” when you do not have an “A.”
Then you have to ask what your syllable units are: V, CV, CVC, VC?
Then you have to count up all of your consonants and vowels and start working out the math.
So let’s say you have a full number and punctuation group. That’s 32 glyphs.
And then you have the basic latin letters. That’s 21 consonants and 5 vowels.
Then you have all 21 consonants paired with all 5 vowels. Another 105.
So with just V, CV, and some punctuation and numbers, we’re to 163.
And we probably have to copy and code for uppercase just to keep the applications from screwing it up: 294.
Then you say you want to have CVC and VC. That’s another 105 for the VC and 2205 for the CVC. 210 and 4410 with uppercase.
So V, CV could be 163 to 294.
V, CV, CVC, VC would be 2473 to 4914.
And that is just with the basic latin and not going for CCV, CCVC, VCC... etc.
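The running totals above can be reproduced in a few lines. This is just a sketch of the same arithmetic, assuming (as the totals imply) that the 32 number and punctuation glyphs are never duplicated for uppercase; the variable and function names are made up for the example.

```python
CONSONANTS = 21
VOWELS = 5
PUNCT_AND_DIGITS = 32  # assumed to have no uppercase copies

letters = CONSONANTS + VOWELS          # 26 single letters
cv = CONSONANTS * VOWELS               # 105 CV glyphs
vc = VOWELS * CONSONANTS               # 105 VC glyphs
cvc = CONSONANTS * VOWELS * CONSONANTS # 2205 CVC glyphs


def total(cased_glyphs, uppercase):
    """Glyph count for a set of cased glyphs plus the shared punctuation."""
    return PUNCT_AND_DIGITS + cased_glyphs * (2 if uppercase else 1)


basic = letters + cv      # V, CV, and bare consonants
full = basic + vc + cvc   # adds VC and CVC

print(total(basic, False), total(basic, True))  # 163 294
print(total(full, False), total(full, True))    # 2473 4914
```

Swapping in your own consonant and vowel counts gives a quick first estimate before any glyphs are drawn.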
Doing an abugida is often enough the same thing, but it is a lot easier because you can use copy/paste and conditional elements in some font applications (such as auto-generating ligatures, so all you have to do is go into each and adjust the diacritic position). As many glyphs, but the pipeline is a lot more fluid. Mind-numbing, but fluid.
So yeah, if you told me your exact syllable structure, vowels, and consonants, I could give a closer estimate to your needs. But it is a large number.
And if you want IIMF formatting, multiply it by 8.
Text
Old Anitami, Part One: Orthography, Phonology, and Phonotactics
For once, I followed through. Let’s get started.
Actually, an initial note - I know that many of you are reading this through machine translation. That should be fine for the most part, but my examples are all written in modern Anitami, and often rely on particular constructions whose grammatical nuances won’t carry over. I want this to be accessible to as many people as possible, so I won’t categorically advise everyone to turn off the translator, but if you can read even a little Anitami I strongly recommend giving it a try. If anyone wants to volunteer to translate this guide into another language, please let me know! I’d love to collaborate, especially if it’s a language I speak.
1. Orthography
I wasn’t sure about devoting a whole section to orthography, since it’s almost identical to modern Anitami, and the differences mostly have to do with phonetic shifts, which I’ll discuss below. Still, there are a couple of special characters to keep in mind.
First, the voiced dental fricative is represented by <ʋ>, or, more rarely, <ɹ̟>. Most browsers outside of Anitam and Tapa don’t support those characters, so I’ll transliterate them to th or dh (depending on context). Second, Old Anitami has a velar nasal (think the –ng in sing), which reduces to –n in the modern language. It’s represented by an <ŋ>, but I’m using the more commonly supported ñ. Finally, the Anitami vowels <ae>, <ai>, and <ei> are often diphthongs in other writing systems, but they’re their own letters in ours. Old Anitami doesn’t have that many true diphthongs, and these aren’t them.
This is also as good a place as any for a historical note: most of the surviving Old Anitami manuscripts we have were written by greens, and written down by yellows. (The second thing is more variable in older texts – ‘scribe’ wasn’t a separate occupation for a long time, and it was initially fairly unclear which caste it was going to be). Obviously, our understanding of the language is mainly limited to green and to an extent blue (courtly) dialects. There’s a lot of interesting research being done in vernacular reconstruction, but it hasn’t hit the grammar books yet.
More relevant to today’s topic, all the evidence we have suggests that greens and yellows, then as now, had different accents. This means that almost everything we have from the classic period was transcribed by someone who would have heard it differently from the poet or author who wrote and recited it (early Anitami literature: basically inseparable from performance). Reconstructing pronunciation is a pain at the best of times, and our analysis of e.g. vowel sounds is still fairly speculative.
2. Phonology
A caveat: I don’t like phonology. I can mostly regurgitate what I know, but I’m less equipped to go into detail about this than I am about almost any other aspect of Old Anitami. If you have a question about how something’s pronounced in a specific word, just ask, I don’t want to get into all the edge cases.
And now that I’ve established that I don’t actually know what I’m talking about and nobody else does either, let’s move forward! Old Anitami has 14 consonants: c, f, h, k, l, m, n, ñ, p, s, t, th, w, and y. With a couple of exceptions (in addition to the above mentioned), these are all pronounced as in the modern language. <c> is, of course, identical to <k>, though there’s some evidence that waaaaay back when it was a velar fricative. <th> in Old Anitami is always voiced, it only loses its voicing much later in word-final positions or through assimilation to voiceless consonants. Anitami-speakers will also notice the absence of <b> and <d>.
There are seven vowels: a, i, e, o, ei, ae, and ai. Old Anitami lost /u:/ at some point in its divergence from Proto-Anitami-Tapap and got it back from Tapap much later. <ei>, <ai>, and <ae> are always long by nature, the other vowels may be either short or long. The long vowels have all shifted a little. <ei> was initially pronounced as /ɛi/ (roughly ehh); it has since merged into <ai> (pronounced /aɪ/, rhymes with eye, then as now). <ae> in Old Anitami was pronounced /æː/ (more like ahh), now /eɪ/ (ay).
The rules governing stresses and vowel shifts within words can get fairly complicated, and it’ll make more sense to bring them up as they become relevant, but a handful are relatively basic and consistent. Vowels in open syllables are necessarily long unless they’re between two consonants, vowels in closed syllables (that is, those ending in a consonant) are short unless they’re long by nature. <a> likes to be long, <i> and <e> don’t and generally won’t unless they’re word-final. Sometimes, long vowels are forced to reduce by phonetic circumstances; when this happens <ai> usually becomes <i> and <ae> becomes <a>.
Stress is lexical! It defaults to the penultimate syllable, unless that syllable is short, in which case it’ll move to the final syllable, unless it’s also short, in which case the stress bounces back and the penultimate syllable lengthens (in poetry), or the stress stays on the antepenultimate syllable – of course, there are exceptions.
3. Phonotactics
Or, how syllables are made. (A note on phonotactic conventions: V stands for vowel, C stands for consonant). Old Anitami has a really restricted range of possible syllables: V, CV, VC, CVC, and, much more rarely CVCC. (There are like three cases of CCVC syllables in the entire language). VC is also pretty rare – syllables strongly prefer to have onsets (initial consonants), and VC syllables can’t ever be stressed. Codas (the consonant or consonants at the end of a syllable) are almost never stops or fricatives. The only exceptions to this rule are in CVCC syllables, where the first consonant of the coda must be a nasal or an approximant, and the final consonant must be a stop or a fricative. By far the most common case is a sibilant (<s> etc.) following a nasal (<n>, <m> etc.). There are no restrictions on which sorts of consonants can be onsets.
Syllable divisions generally obey the maximal onset principle – that is, consonants between vowels are assigned to the following vowel. Ki-san-ta-mi, not Kis-an-tami, Pa-sen-de, etc. All syllables must have a vowel. Sometimes two consecutive CVC syllables will give rise to consonant clusters that wouldn’t be permitted in a coda if broken up into a CVCC and a VC syllable. This is technically allowed but the language doesn’t like it, it’s common to insert a vowel in between them, and doing so sounds more natural.
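The division rule can be sketched as a tiny syllabifier. This is a simplification under stated assumptions: it uses the transliterated letters from this guide (with <ae>, <ai>, <ei>, and <th> treated as single units), gives every vowel at most one onset consonant, and ignores the rare CCVC cases, vowel length, and stress. The function names are invented for the example.

```python
# Digraphs are listed first so they are matched before single letters.
VOWELS = ("ae", "ai", "ei", "a", "e", "i", "o")
CONSONANTS = ("th", "c", "f", "h", "k", "l", "m",
              "n", "ñ", "p", "s", "t", "w", "y")


def tokenize(word):
    """Split a transliterated word into letters, keeping digraphs whole."""
    tokens, i = [], 0
    while i < len(word):
        for unit in VOWELS + CONSONANTS:
            if word.startswith(unit, i):
                tokens.append(unit)
                i += len(unit)
                break
        else:
            raise ValueError(f"unexpected letter at position {i}: {word[i]!r}")
    return tokens


def syllabify(word):
    """Assign one consonant before each vowel to that vowel's syllable."""
    tokens = tokenize(word.lower())
    starts = []
    for i, tok in enumerate(tokens):
        if tok in VOWELS:
            has_onset = i > 0 and tokens[i - 1] not in VOWELS
            starts.append(i - 1 if has_onset else i)
    if not starts:
        raise ValueError("every syllable needs a vowel")
    starts[0] = 0  # anything word-initial belongs to the first syllable
    bounds = starts + [len(tokens)]
    return ["".join(tokens[a:b]) for a, b in zip(bounds, bounds[1:])]


print("-".join(syllabify("Kisantami")))  # ki-san-ta-mi
```

Any consonant left over between two vowels after the onset is taken ends up as the coda of the preceding syllable, which is how `san` gets its final n.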
And that’s enough to start learning words! Tune in next week for Noun Classes, Or: More Arbitrary Categories For People And Things, which we’re obviously all lacking in our daily lives.
Next: Noun Classes >>
Text
rambles on triconsonantal roots, statistics, phonotactics
this was originally a short explanation of why triconsonantal roots can be expected from vocabularies as well as why indigenous japanese root lexemes tending towards CVCV is not a coincidence but it got away from me and i don't have a point it's just the first points might be interesting for conlangers if only because it might give an idea on what you need to create a naturalistic vocab from scratch and some math tricks you can play to get your self something sorry i think i started writing this at 8 now it's like 11 and im sleepy
Word frequency follows a statistical pattern called Zipf's law. Basically an inverse power law. The long and short of it - the most frequent words do most of the work. Some of this is because math and Zipf's shows up a lot, part of it is because no one wants to do more work than they have too, so this ends up forcing some things to do more of the work for both speakers and listeners, and other reasons. But because the most frequent words do a disproportionate amount of the work, it happens that around 850 lexemes or so most of a language's work is being done. It also happens that most languages vocabularies more or less stop at around 15000 words. Let's say an average language has around 25 segments. Let's ignore phonotactics for second and assume any sound can follow any sound. 25*25 = 625 25^3 = 15625 It happens that triconsonantal roots have the capacity to express a more-or-less "complete" set of lexemes. Of course, phonotactics exist, limiting the number of potential syllables. On top of that, for the same reasoning about reusing things being easier, it's easier if you reuse segment pairs, letting them exist together in a family of related meanings and thus letting brains do less word fetching. Now most languages don't like consonants next to each other. There's also some reasons in the physics of how we make sounds giving us a preference for open syllables cross-linguistically (CV). So what then? Well, the root of 25 is 5. So 25*25*25=25*5*5*25=25*5*25*5 Of course phonotactics still exist; likewise five of those "average" segments were also vowels. Real languages might instead prefer something like 19*5*20*5 = 9,500; this is reasonably fluent - most language courses consider around 8000 lexemes proficient. Add a few affixes to that and you get right back up to the 15000 ish lexemes needed to have a "complete" language. Of course you're reusing a lot of it because of the principle of least effort (why have to separate roots for similar ideas?) 
and some words will still sound awkward or too much like something detrimental to communication (you don't want "don't" sounding like "do"!), and there are likely diachronic reasons for things as well, so you'll have more affixes and that will balance the missing possible roots.
So basically the triconsonantal root exists because three segments carrying a large amount of information are expected to satisfy the needs of almost every language before any information is given to vowels. So a language with extreme ablauting should be expected to gravitate towards triconsonantal roots. Reversing that, if you wanted to posit a gestural language with no underlying vowels or something, a reasonably rich consonant inventory of 25 or so would meet your information needs with an average of 3 consonants a root. I actually used this info in my model lang years ago, and revisiting the conlang spurred me to make this post.
Our physiological preference for open syllables means that most languages with information-heavy vowels are going to like lexemes that are around 3~4 segments long, which, hey Japanese, I'm looking at you. I don't know if the reason languages tend to have around 20 consonants with 5 vowels is due to how close the sizes of vocabularies are to 25^3 or what, but whatever the reason the variables on either side are what they are, word structure comes from the relationship between them. The fact that most languages do not have exactly that is probably because it's an artifact of other data. It just so happens that the mechanical reasons things are sonorant or not mean there's a lot more at one end of the hierarchy than the other, and because the shape of the attack (the sound wave) between sounds has a sharper contrast going CV than VC (also air release issues and other things), the patterns created just happen to land in that neighborhood, and we're starting from there to create the 15000 or so words we need for our experiences.
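The back-of-the-envelope counts above can be checked with a few lines of Python. This is only a sketch of the arithmetic: `root_capacity` is a made-up helper name, and it assumes every slot in a root is filled independently (i.e. phonotactics are ignored, as the post does at first).

```python
from math import prod


def root_capacity(slot_sizes):
    """Count distinct roots when each slot is filled independently.

    slot_sizes lists how many segments may appear in each position,
    e.g. [25, 5, 25, 5] for a CVCV root with 25 consonants and 5 vowels.
    """
    return prod(slot_sizes)


print(root_capacity([25, 25]))        # 625   (two free segments)
print(root_capacity([25, 25, 25]))    # 15625 (three free segments)
print(root_capacity([25, 5, 25, 5]))  # 15625 (CVCV, same capacity)
print(root_capacity([19, 5, 20, 5]))  # 9500  (the "real language" estimate)
```

Swapping in your own conlang's inventory sizes per slot gives a quick ceiling on how many roots a given shape can hold before affixes.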
And now I invite you to look into English phonotactics and realize just how much this language does that carries no information, and how fundamentally weird it is. Most languages maintain something around the 20*5 thing I'm talking about. When one thing is high, usually the other is correspondingly lower. Arabic has a large consonant inventory but 3 vowels (well, 6, but information-wise the chroneme can be factored out as if it were another vowel or semivowel - you could make an argument that as far as segment/information ratios are concerned, long vowels are diphthongs consisting of a vowel and a mora; but then you get into an issue of pharyngealization being more of a suprasegmental feature than a unique property of the 3rd series). Caucasian languages have famously small vowel inventories on average, but generally have shifted the features of the vowels to the consonants, and even then there's probably some tradition interacting with underlying information and actualization. Indonesian has about 18 consonants in native words, and has 6 vowels.
Outside of that kind of compensation, Mandarin has a moderately large sum of vowels and consonants but has a highly restricted syllable that drives the number of possible syllables really far down. Japanese has a slightly high consonant to vowel ratio, but it has rules that forbid certain morae around semivowels, and as far as I can tell the "actually silent vowel" thing is a prescribed feature and actual human Japanese has some moraic consonants with underlying vowels and whatever, so 22/5 is about right. Spain Spanish has basically the perfect ratio of 20/5, with some dialects losing out on consonants and others picking up some vowels, but basically the sometimes convoluted word structure from Latin's inflectedness can afford Spanish a little loss of the 20/5 ratio.
Then there's English, with something like 25/14 and, on top of that, root structures that look like CCCVCCC in addition to a set of words that wants to look like CCVCCVCV, and honestly just what the fuck. I-mutation and some segmentation issues ("strengths" is not monomorphemic) did a little of that, and I guess the 14 or so could be charitably interpreted as something like 5*2 + diphthongs with a broken symmetry because of the GVS, but that's still high and weird. Hawaiian gets away with almost the opposite feature - 8/5 - but makes up for it with a lot of vowels in hiatus (right next to each other). That includes long vowels, which depending on how you count morae could mean 8/10 or 8/6 as far as information bearing goes.
To be fair to English, though, this is kind of true of most languages in its family. All the Germanic languages have lots of consonants against lots of vowels, although for the most part this is explainable with historical reasons (and with the fact that some vowels don't (usually) occur monomorphemically), and like I said, English broke its symmetries. IE languages in general have a weird relationship to this, such as the Slavic languages (although they're considerably closer to typical if you treat palatalization separately) or most of the Romance languages (apart from Spanish, but also Spanish kind of). Weird root structure is something all of us inherited from PIE, though, and that's a consequence of vowel reduction and morphemes fusing. Latin was comparatively agglutinative, sticking parts of words together to get other words, which is part of the word structure issue, but most of these tread the thin line in the sand between decomposable and not for speakers. Indeed, Albanian took in a huge amount of Latin words - and chopped off basically everything carrying redundant or no information. So English does have diachronic reasons for looking like it does, it's just - why haven't we pulled an Albanian yet?
Why haven't we seen a vowel collapse instead of just chain shift after chain shift? Granted, we're seeing slight consonant reduction and there are some things like /nju.kju.lr/ over /nju.kli.jr/, but we're still just weird. I mean, we aren't - as much as people make fun of English, the arguably weirdest thing about it is that it only marks the 3rd person singular present for agreement, but that's inverted for some speakers and disappearing for others and has a good, recent diachronic explanation. As far as I can tell, English shows no signs of its CV sum reducing. Although thinking about it, Old English had an inventory closer to that of Spanish today, just with some extra vowels because Germanic pedigrees. So the sum is about as old as the 3rd person -s. But it's not going anywhere? idk ranted waaaay too much sorry
Text
Session 5 - Linguistics 101
So, you’ve made half a dozen settlements across numerous continents. You’ve decided what types of climates these towns have and what sorts of trade they may excel in, but what do you call them?
You could just use English words, especially if your conworld is inhabited by future Earth people. Expect names like “New London” or “New Japan”. Perhaps you are using English as the “common language” for your fantasy setting, and just want to make up towns like “Wheatfield” or “Riverrun”. Those work, and if they fit the feel of your setting, great! Stick with them. But if you want to make them feel a little more unique?
We need to create a language.
Now, I’m not talking about a fully functional language like Klingon or Sindarin. You don’t need to know the precise grammar to be able to name things in that language. All you need is a few words. So, where do we begin?
1: Select sounds.
Every language needs sounds, otherwise it’d be... silent. And there are an awful lot more sounds than in English. Most writers will just pick the same sounds as in English, and maybe that rough, throaty sound in the German word “Bach” or the Scottish word “Loch”. If you want your readers to be able to read a word with ease, it’s probably best to base it in English for now, lest you end up in a sea of asterisks and a downwards spiral as you force them to learn a completely new language for a single book. So, what else can we add in, or better yet, what can we take out?
Well, here is every consonant that humans can make:
...which is an awful lot. Sure, you may fall in love with the sound “dɮ”, but are your readers going to?
Let’s cut that down a bit.
This is what English uses. Much simpler.
For constructing languages - conlangs - it’s better to think about features rather than individual sounds. It’d be strange to see a language that distinguishes the voicing in p and b, but not in t and d. Similarly, it’d be strange to see a language with no alveolar stops - t & d - but with alveolar fricatives - s & z. So, let’s take out an entire feature.
For my conlang, I want to make the language of the Barbarians, the native people to the temperate bands to the north. I want their people to be called something that can be misheard as “Barbarian”, so at the very least, I’ll want the sounds b, r, n, and some vowels in there too. I’m thinking of having them be very loud and emphatic speakers, so I’ll make all of their sounds voiced. So far, I’ll have:
Which is looking good to me. I also want it to have some trilled sounds, like the r in the Spanish word perro. So I’ll add that in. Just for funsies, I want to also add in the throaty r sound, like in the French word rendez-vous. Sounds good to me.
But, now I have three R-like sounds in my language. That’s too much for me, so maybe I’ll take out the regular r from English. And I think I’ve messed with English enough, so I’ll leave it there, for a grand total of:
And I’ll just add the basic 5 vowels of English - a, e, i, o, u.
But how do the sounds fit together? I can’t just pick up a handful and be done with it, otherwise I’d end up with trainwrecks like gzdilwlv, and not even I will be able to pronounce that.
2: Phonotactics
Every language has what is called phonotactics, which is basically a fancy term for how different strings of consonants and vowels make up a syllable. Some languages, like Japanese, will only allow CV, or one Consonant and one Vowel. They’ll sometimes allow an n at the end of a syllable too. So we get words like “nani” or “jikan”. You can see just by the way that the sounds fit together that they are a different language to, say, German, with words like “Fremdschämen”, which has a lot more consonants in there. Take the English word “strengths”. That is one syllable long, and is CCCVCCCC. Yes, English has some god-awful words in it.
Perhaps we’ll want something in the middle. Not quite as simplistic as Japanese, but not quite as complex as English. Let’s go with CCVC. This means that at the beginning of a syllable, we’ll allow a maximum of two consonants, and at the end, we’ll only allow one consonant. This could be something like “zdel”. I like it.
Now, this doesn’t mean that every syllable has to have all of those consonants, just like how not every English syllable ends with a cluster of four consonants. That is just our maximum. I’ll say that, at minimum, it has to have the vowel, meaning we could have CV, VC, CVC, CCV, etc. I found a great page over at Zompist that will let you automatically generate words, given the sounds and phonotactics. Try fiddling around with it until you get something that sounds like what you imagined.
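If you’d rather tinker offline, the same idea can be sketched in a few lines of Python. This is not the Zompist generator, just a rough stand-in: the consonant list is a guess assembled from the all-voiced idea and the sample words in this post, and names like `make_word` are made up for illustration.

```python
import random

# Hypothetical inventory: voiced consonants plus the basic five vowels.
CONSONANTS = list("bdgvzðmnrjwl")
VOWELS = list("aeiou")

# Every syllable shape allowed by a CCVC maximum with a mandatory vowel.
SHAPES = ["V", "CV", "VC", "CVC", "CCV", "CCVC"]


def make_syllable(rng):
    """Pick a shape at random, then fill each C or V slot at random."""
    shape = rng.choice(SHAPES)
    return "".join(
        rng.choice(CONSONANTS if slot == "C" else VOWELS) for slot in shape
    )


def make_word(rng, max_syllables=2):
    """Join between one and max_syllables random syllables into a word."""
    count = rng.randint(1, max_syllables)
    return "".join(make_syllable(rng) for _ in range(count))


if __name__ == "__main__":
    rng = random.Random()  # pass a seed here for repeatable output
    print([make_word(rng) for _ in range(10)])
```

It makes no attempt to ban ugly clusters, so expect to throw away most of what it produces and keep the handful of words that sound right.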
3: Using the words
So, finally getting back to where we started: we need some town names. The first thing here is to start off with some simple elements that we can combine to make multiple town names. This is another way to make a fictional country feel more together, and distinguish it from other places. If I told you I live in Hertfordshire, that -shire ending basically says that I live in England. You’d be hard pressed to find a town named -shire in rural China.
So let’s get ourselves a list, and I’ll fill it out with some words from this Barbarian language.
We’ll need some nouns:
River ( daðo )
Hill ( njave )
Town ( nareh )
Forest ( benmog )
Coast ( rozjo )
Farm ( wehe )
And some adjectives:
Big ( gjob )
Small ( rin )
New ( nojziv )
High ( vmogren )
Green ( ðivab )
Red ( dzorun )
Long ( rjugna )
And then, put them together to make some new town names! Perhaps we enter “Nojzivwehe” (NewFarm) to gather supplies, before we get a call over to “Gedzewehe” (Gedze’s Farm) to clear out the Goblins. The King might reward us at “Dzorunareh” (RedTown), the capital. But we have to stop off at the forest of “Nojzibenmog” on the way. The key here is to have a few bits that get repeated over and over again. Maybe not in every town, but in enough that the readers get a feel of what this country sounds like. And those Barbarian towns will sound a lot different to the Elven towns I have made, with names like “Sotheqiquih” and “Jevaiphaunith”.
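Gluing the elements together is easy to automate once the word list exists. The sketch below uses the nouns and adjectives from this post; the function name `town_name` and the rule of dropping a doubled letter at the join are my own assumptions, chosen because they reproduce “Nojzivwehe” and “Dzorunareh”. They do not reproduce “Nojzibenmog”, which drops the v by ear.

```python
# Word list taken from the Barbarian vocabulary above.
NOUNS = {
    "river": "daðo", "hill": "njave", "town": "nareh",
    "forest": "benmog", "coast": "rozjo", "farm": "wehe",
}
ADJECTIVES = {
    "big": "gjob", "small": "rin", "new": "nojziv", "high": "vmogren",
    "green": "ðivab", "red": "dzorun", "long": "rjugna",
}


def town_name(adjective, noun):
    """Join adjective + noun, merging a doubled letter at the seam."""
    first = ADJECTIVES[adjective]
    second = NOUNS[noun]
    if first[-1] == second[0]:
        second = second[1:]  # assumed rule: "dzorun" + "nareh" keeps one n
    return (first + second).capitalize()


if __name__ == "__main__":
    for adjective in ADJECTIVES:
        for noun in NOUNS:
            print(adjective, noun, "->", town_name(adjective, noun))
```

Printing every combination gives a pool of 42 candidate names to pick from, and the repeated elements are what make them feel like one country.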
Text
The evolution of numerals can be traced from Sarasvati (or Indus) numerals to Brahmi to modern international. Brahmi was used in India over 2000 years ago. What is called “Hindu” in the table refers to the standard Indian numerals after the symbol for zero came to be commonly used. The derivation of Brahmi numerals from the 4500 year old Sarasvati numerals is sketched in the paper by renowned Mr Subhas Kak at Department of Electronics Louisiana State University Baton Rouge, LA,
Brahmi
It is one of the most influential writing systems; all modern Indian scripts and several hundred scripts found in Southeast and East Asia are derived from Brahmi.
Rather than representing individual consonant (C) and vowel (V) sounds, its basic writing units represent syllables of various kinds (e.g. CV, CCV, CCCV, CVC, VC). Scripts which operate on this basis are normally classified as syllabic, but because the V and C component of Brahmi symbols are clearly distinguishable, it is classified as an alpha-syllabic writing system.
The Brahmi writing system is the modern name given to the oldest script used in India, during the final centuries BCE and the early centuries CE. Like its contemporary in what is now Afghanistan and Pakistan, Kharosthi, Brahmi was an abugida.
The best-known Brahmi inscriptions are the rock-cut edicts of Ashoka in north-central India, dated to the 3rd century BCE. Inscriptions in Tamil-Brahmi, a Southern Brahmic alphabet found on pottery in South India and Sri Lanka, may even predate the Ashoka edicts.
The Gupta script of the 5th century is sometimes called “Late Brahmi”. From the 6th century onward, the Brahmi script diversified into numerous local variants, grouped as the Brahmic family of scripts.
The script was deciphered in 1837 by James Prinsep, an archaeologist, philologist, and official of the British East India Company. Scholars, such as F. Raymond Allchin, take Brahmi as a purely indigenous development, perhaps with the Bronze Age Indus script as its predecessor.
G. R. Hunter, in his book “The Script of Harappa and Mohenjodaro and Its Connection with Other Scripts” (1934), details the derivation of the Brahmi alphabets from the Indus Script, the match being considerably higher than that of Aramaic. Even though there is a lack of intervening evidence for writing during the millennium and a half between the collapse of the Indus Valley Civilization ca. 1900 BCE and the first appearance of Brahmi in the mid-4th century BCE, the Indus hypothesis is slowly gaining momentum because of the sheer differences between how Semitic alphabets work and how Brahmi works for an Indo-Aryan language.
While the contemporary and perhaps somewhat older Kharosthi script is speculated to be a derivation of the Aramaic script, the genesis of the Brahmi script is less straightforward. An origin in the Imperial Aramaic script has nevertheless been proposed by most scholars since the publications by Albrecht Weber (1856) and Georg Buhler’s On the origin of the Indian Brahma alphabet (1895).
Like Kharosthi, Brahmi was used to write the early dialects of Prakrit. Surviving records of the script are mostly restricted to inscriptions on buildings and graves as well as liturgical texts. Sanskrit was not written until many centuries later, and as a result, Brahmi is not a perfect match for Sanskrit; several Sanskrit sounds cannot be written in Brahmi.
DEVELOPMENT OF THE BRAHMI SCRIPT
Most examples of Brahmi found in North and Central India represent Prakrit language. The Ashokan Inscriptions already show some slight regional variations on the Brahmi script. In South India, particularly in Tamil-Nadu, Brahmi inscriptions represent Tamil, a language belonging to the Dravidian language family, with no linguistic affiliation to the Indo-Aryan languages such as Sanskrit or Prakrit.
Some Tamil examples come from inscribed potsherds found at Uraiyur (South India) dating to the 1st century BCE or the 1st century CE. In Arikamedu (South India) there is also evidence of an early form of Tamil in Brahmi inscriptions, dated to the early centuries CE. At this stage, different Brahmi characters specially adapted to suit Tamil phonetics were already in use. Examples of Tamil have not been identified among the earliest securely dated examples of Brahmi found at Anuradhapura in Sri Lanka, where the language represented is Prakrit.
By the 2nd century BCE, the Brahmi script becomes more popular with variations.
http://www.ancient.eu/Brahmi_Script/
Dr Subhas Kak, PhD Electronics in Baton Rouge, LA