notes-conLangsCommonSyllablesAndPhones

common syllables across languages

"Since words to call Mom or Dad are the most important thing for children, those words use some of the most common syllables spoken by babies (ba/pa and ma)." -- http://laowaichinese.net/cognate-coincidences.htm#comment-7590

baby babble

http://www.utexas.edu/features/2005/babble/

http://books.google.com/books?id=vDcUAgAAQBAJ&pg=PA386&lpg=PA386&dq=%22frequent+syllables%22+arabic&source=bl&ots=RHMGN4xVrT&sig=2xaKjcJCYESOipt7qkAaJV1aEDo&hl=en&sa=X&ei=SInuU6WqB5bYoASJ0oGwBw&ved=0CB8Q6AEwAA#v=onepage&q=%22frequent%20syllables%22%20arabic&f=false

common monosyllable shapes in Arabic children: CVC; CV;, CV;C

common monosyllable shapes in English children: CVC, CVV, CV

common monosyllable shapes in French children: CV, CCCV, CVCC

-- http://books.google.com/books?id=vDcUAgAAQBAJ&pg=PA386&lpg=PA386&dq=%22frequent+syllables%22+arabic&source=bl&ots=RHMGN4xVrT&sig=2xaKjcJCYESOipt7qkAaJV1aEDo&hl=en&sa=X&ei=SInuU6WqB5bYoASJ0oGwBw&ved=0CB8Q6AEwAA#v=onepage&q=%22frequent%20syllables%22%20arabic&f=false

common syllables in popular languages

Most frequent syllables in English

http://www.sttmedia.com/syllablefrequency-english

http://wiki.answers.com/Q/Most_common_syllables_of_US

Most frequent syllables in Mandarin Chinese

http://linguistics.stackexchange.com/questions/324/what-are-the-most-commonly-used-chinese-syllables

http://www.cs.bath.ac.uk/~mdv/courses/CM30082/projects.bho/2007-8/Tait-D-Dissertation-2007-8.pdf (Appendix 1 and Appendix 2)

http://technology.chtsai.org/syllable/

http://lingua.mtsu.edu/chinese-computing/phonology/

http://lingomi.com/blog/2011/04/whats-the-most-common-tone-combination-in-mandarin/

http://forum.wordreference.com/showthread.php?t=2298809

http://www-personal.umich.edu/~duanmu/10ChineseSyllable.pdf

http://laowaichinese.net/pinyin-chart.htm

Most frequent syllables in Spanish

http://www.sttmedia.com/syllablefrequency-spanish

Most frequent syllables in Hindi

http://www.sttmedia.com/syllablefrequency-hindi http://home.iitk.ac.in/~prasant/HindiCorpus/gram.html

related: http://books.google.com/books?id=WMd9QM9ixRUC&pg=PA209&lpg=PA209&dq=%22frequent+syllable%22+hindi&source=bl&ots=x7jhSVgB7w&sig=C_UHtLWRg_Tiezva5oCBL2WBuW8&hl=en&sa=X&ei=KovuU9rOLc7zoASz24DQCg&ved=0CDMQ6AEwAg#v=onepage&q=%22frequent%20syllable%22%20hindi&f=false http://ltrc.iiit.ac.in/MachineTrans/publications/technicalReports/tr022/camera-187.pdf http://speech.tifr.res.in/chief/publ/13tts_ilsl12_Isca_ssw8_pp291_296.pdf

Most frequent syllables in Arabic

todo (i looked a little but couldn't find it)

phonology: http://en.wikipedia.org/wiki/Arabic_phonology

Across English and Chinese

todo

should this be an intersection, or an intersection plus the union of a smaller number of the most common ones in each? i'm thinking the latter
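a rough sketch of the "latter" option (intersection, plus the union of each language's few most common syllables). the function name `shared_plus_top`, the `top_n` parameter, and the toy rankings are all made up for illustration; no real frequency data is used here.

```python
def shared_plus_top(ranked_by_lang, top_n):
    """Syllables found in every language, plus each language's top_n most frequent."""
    all_sets = [set(ranked) for ranked in ranked_by_lang.values()]
    shared = set.intersection(*all_sets)
    top = set()
    for ranked in ranked_by_lang.values():
        # Each list is assumed to be sorted most-frequent first.
        top.update(ranked[:top_n])
    return shared | top


# Hypothetical toy rankings, NOT real syllable frequencies.
toy = {
    "english": ["the", "ma", "ba", "to", "li"],
    "chinese": ["de", "shi", "ma", "ba", "li"],
}
result = shared_plus_top(toy, top_n=1)
print(sorted(result))
```

with `top_n=0` this reduces to the plain intersection, so the same function covers both options.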

note: what i really care about is a list of all syllables that can be easily pronounced by both English and Chinese speakers. This should provide a subset of that list, but there may be syllables in one language that are pronounceable by speakers of the other even if they are not common in the other.

Across English, Chinese, Spanish

todo

(this is the intersection of the top few languages by number of native speakers, total speakers, and by GDP)

note: since there are now a bunch of languages, instead of taking an intersection, maybe 2/3?

Across English, Chinese, Spanish, Hindi, Arabic

todo

note: since there are now a bunch of languages, instead of taking an intersection, maybe 4/5?
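the "2/3" and "4/5" ideas above are both "keep whatever shows up in at least k of the n languages". a minimal sketch, assuming each language's common syllables are already available as a set; `in_at_least` and the toy sets are hypothetical names and data, not real lists.

```python
from collections import Counter


def in_at_least(sets_by_lang, k):
    """Return the items that occur in at least k of the given languages."""
    counts = Counter()
    for items in sets_by_lang.values():
        counts.update(set(items))  # each language votes once per item
    return {item for item, c in counts.items() if c >= k}


# Hypothetical toy data, NOT real syllable lists.
toy3 = {
    "english": {"ma", "ba", "to"},
    "chinese": {"ma", "ba", "shi"},
    "spanish": {"ma", "to", "la"},
}
two_of_three = in_at_least(toy3, 2)
three_of_three = in_at_least(toy3, 3)
print(sorted(two_of_three), sorted(three_of_three))
```

setting k equal to the number of languages gives the strict intersection; k=4 with five languages would be the 4/5 rule.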

sonority and Optimality Theory

somewhat related

pronounceable data

IPA

how large is the subset of IPA needed to encode all phonemes for English, Spanish, and Mandarin Chinese?

https://en.wikipedia.org/wiki/International_Phonetic_Alphabet_chart_for_English_dialects seems to have the following IPA letters:

ɐ̟ ɤ̯̈ a æ b ç d ð e e ə f h i j k l m n ŋ o ɵ ø ɔ œ p r s t u v w x z ʒ θ y(?however i cant find this one in the IPA chart?)

(37 total)

https://en.wikipedia.org/wiki/Help:IPA/English has:

g ɒ a æ b d ð e ə f h i j k l m n ŋ o ɔ p r s t u v w x z ʒ θ ʃ ʔ ɑ ɪ ʊ ɛ ʊ ʌ

taking their union:

g ɐ ɤ a æ b ç d ð e ə f h i j k l m n ŋ o ɵ ø ɔ œ p r s t u v w x y z ʒ θ ʃ ʔ ɑ ɪ ʊ ɛ ʌ

(38 total; i guess this is the first list plus g; oops this left out a few though)

on https://en.wikipedia.org/wiki/Help:IPA/Spanish we see:

ɡ a b d ð e f i j k l m n ŋ o p r s t u v w x z β θ ɣ ʝ ʎ ɲ ɾ ʃ

on https://en.wikipedia.org/wiki/Help:IPA/Mandarin we see:

ɕ ɻ a e ɛ f i j k l m n ŋ o p s t u w x y ʂ ɥ ə ɤ ʊ ɚ ɹ

taking the union of these we have:

ɐ ɣ a æ b ç d ð e ə f g h i j k l m n ŋ o ɵ ø ɔ œ p r s t u v w x y z ʒ β θ

i think my program might be missing some things; for example it seems to have left out: ɪ ɒ ɑ ʃ ʔ ʊ ɛ ʌ ɣ ʝ ɟ ʝ ʎ ɲ ɾ ʃ ɕ ʈ ʂ ɻ ɥ ɹ ɤ ɚ

so (as a rough approximation), the real union is:

ɐ ɣ a æ b ç d ð e ə f g h i j k l m n ŋ o ɵ ø ɔ œ p r s t u v w x y z ʒ β θ ɪ ɒ ɑ ʃ ʔ ʊ ɛ ʌ ɣ ʝ ɟ ʝ ʎ ɲ ɾ ʃ ɕ ʈ ʂ ɻ ɥ ɹ̩ ɤ ɚ

which is about 61 IPA characters.

the intersection of English and Spanish is: ʃ a b d ð e f g i j k l m n ŋ o p r s t u v w x z θ

the intersection of English and Chinese is: ɤ ʊ a e ɛ ə f i j k l m n ŋ o p s t u w x y

the intersection of Spanish and Chinese is: a e f i j k l m n ŋ o p s t u w x

things in exactly 2 of English, Spanish, Chinese (14 total): ɤ ʃ ʊ b d ð ə ɛ g r v y z θ

the intersection of English and Spanish and Chinese is (17 total): a e f i j k l m n ŋ o p s t u w x

so things in at least 2 of English, Spanish, Chinese (31 total): a e f i j k l m n ŋ o p s t u w x ɤ ʃ ʊ b d ð ə ɛ g r v y z θ

things in at least 2 of English, Spanish, Chinese for which IPA uses non-Latin letters (8): ŋ ɤ ʃ ʊ ð ə ɛ θ

other IPA symbols in English only (13): ɐ æ ç h ɵ ø ɔ œ ʒ ʔ ɑ ɪ ʌ

other IPA symbols in Spanish only (6): β ɣ ʝ ʎ ɲ ɾ

other IPA symbols in Chinese only (6): ɕ ɻ ʂ ɥ ɚ ɹ

Roman letters which are not an IPA symbol in any of English, Spanish, or Chinese: c q
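the tallies in this section can be rechecked with plain set operations. this is only a sketch: it uses the English union list and the Help:IPA/Spanish and Help:IPA/Mandarin lists as transcribed above, treats script ɡ (U+0261) and plain g as the same symbol (my assumption), and the helper name `symbols` is made up. the counts depend entirely on how the lists were transcribed, so they may not match tallies made by hand.

```python
import string


def symbols(text):
    # Assumption: script g (U+0261) and ASCII g count as one symbol.
    return set(text.replace("\u0261", "g").split())


english = symbols("g ɐ ɤ a æ b ç d ð e ə f h i j k l m n ŋ o ɵ ø ɔ œ p r s t "
                  "u v w x y z ʒ θ ʃ ʔ ɑ ɪ ʊ ɛ ʌ")
spanish = symbols("ɡ a b d ð e f i j k l m n ŋ o p r s t u v w x z β θ ɣ ʝ ʎ ɲ ɾ ʃ")
chinese = symbols("ɕ ɻ a e ɛ f i j k l m n ŋ o p s t u w x y ʂ ɥ ə ɤ ʊ ɚ ɹ")

all_three = english & spanish & chinese
at_least_two = (english & spanish) | (english & chinese) | (spanish & chinese)
exactly_two = at_least_two - all_three

english_only = english - spanish - chinese
spanish_only = spanish - english - chinese
chinese_only = chinese - english - spanish

# Symbols shared by 2+ languages that are not plain ASCII letters.
non_latin_shared = {s for s in at_least_two if not s.isascii()}
# Roman letters that none of the three lists uses as a symbol.
unused_latin = set(string.ascii_lowercase) - (english | spanish | chinese)

for name, group in [("all three", all_three), ("exactly two", exactly_two),
                    ("english only", english_only), ("unused latin", unused_latin)]:
    print(name, len(group), " ".join(sorted(group)))
```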

misc

So if we wanted these 61, plus 12 numerals (duodecimal), plus 32 symbols (like ASCII), we'd have 61+12+32 = 105 so far. I might suggest that we start with that and then augment to either 128 or 144, by adding in more symbols for common or fundamental words/concepts. So we have room for 23-39 other guys. I thought of a few candidates off the top of my head and already came up with more than 50. Otoh my keyboard only has 47 keys for letters, numbers, and symbols (not counting shifted items, and movement or shift keys), so maybe 144 is way too many: 47 keys*2 (for shifting) is only 94 (presumably my keyboard is that way because there are 94 printable ASCII characters). So, if we want to stick with the same number of printable characters as ASCII, and if we are keeping all of the punctuation and numerals, and upper-case characters (although we'd probably type the characters unshifted), then the only room we have left to change is the lowercase characters (of which there are 26). If we are adding in 2 more numerals, for duodecimal, then we have 24 spots left. I guess at least some of those should be for new letters, so that we have at least as many letters total as we have phonemes in English, Spanish, Chinese (English has about 37, right? suggesting that we should add at least 11 letters?), but we want to use some of that space for logograms/shorthand symbols too.

We should also note that, although there are 94 printable ASCII characters and about the same number of characters on a keyboard, it's kind of hard to move your hand quickly and reliably to any character on the keyboard, much less to chord them. To see how many simultaneous inputs a human can easily manage with hands, look at console-game-style controllers. The Steam controller is perhaps an upper bound. The Steam controller has:

So a total of 15 buttons. (however, note that the Steam Controller excels in providing a variety of configurable combination input methods to quickly access many more than 15 things, including: Action Sets (multiple controller mappings that can be swapped into by pressing a button), Mode Shifts (not sure about this but i think this is how you would program key combo type things, eg so 'X+A' could have some meaning different from 'A'), and Touch Menus (which allows you to select between a small number of hotkeys using an onscreen menu controlled by one of the trackpads)).

So i guess we should be looking at intersections, not unions, of IPA phonemes for English, Spanish, Chinese, and we should also be looking for intersections on their high-frequency word lists.

So since there were 8 non-Latin symbols in at least 2 of English, Spanish, Chinese's IPA letter notation for phonemes (not counting IPA modifiers), let's add in those. So now we have 16 slots left (26 lowercase letters - 2 numerals - 8 IPA symbols). So we could use these 16 for logograms/macros/shorthand for common or important words/concepts.
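the slot-counting in this section is just arithmetic, so here it is written out as a sketch to make it easy to rerun with other numbers. every variable name is my own; the figures are the ones used in the text above.

```python
# Option 1: start from the rough union of IPA symbols.
union_symbols = 61      # approximate size of the English/Spanish/Chinese union
numerals = 12           # duodecimal digits
ascii_symbols = 32      # punctuation-like symbols, as in ASCII
base = union_symbols + numerals + ascii_symbols
room_if_128 = 128 - base
room_if_144 = 144 - base

# Keyboard sanity check: unshifted plus shifted keys.
keys = 47
typeable = keys * 2

# Option 2: keep ASCII's size and only repurpose the lowercase letters.
lowercase = 26
extra_numerals = 2      # two more digits for duodecimal
shared_non_latin = 8    # non-Latin IPA symbols found in at least 2 languages
slots_left = lowercase - extra_numerals - shared_non_latin

print(base, room_if_128, room_if_144, typeable, slots_left)
```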

misc

"English has a total of 36-37 phonemes in most dialects -- 24 consonants and 12 or 13 vowels -- but has to make do with a measly 26 letters to represent those distinctive sounds graphically." Thomas Wier, [1]

" Language with the shortest alphabet: Rotokas (12 letters). Approx. 4300 people speak this East Papuan language. They live primarily in the Bougainville Province of Papua New Guinea.

The language with the fewest sounds (phonemes): Rotokas (11 phonemes)

The most common consonant sounds in the world's languages: /p/, /t/, /k/, /m/, /n/ " -- [2]


most common phonemes

note: technically a phoneme is a (usually language-specific) equivalence class of 'phones', so what i am really asking is the most common phones.

https://www.google.com/search?q=most+common+phonemes

http://web.phonetik.uni-frankfurt.de/upsid_info.html

"The most common vowel system consists of the five vowels /i/, /e/, /a/, /o/, /u/. The most common consonants are /p/, /t/, /k/, /m/, /n/.[18] Relatively few languages lack any of these consonants, although it does happen..." -- https://en.wikipedia.org/wiki/Phoneme

"The vowels [a], [e], [i], [o], and [u] are all very common. You can find at least three of those in most languages, and many of them have this exact sequence as their vowel inventory — usually [a], [i], and either [u] or [o]. Among consonants, the nasal consonants (mainly [m] and [n])are also extremely common, and so are the voiceless stops [p],[t], and [k]." -- [3]

"The most common vowel system consists of the five vowels /i/, /e/, /a/, /o/, /u/. The most common consonants are /p/, /t/, /k/, /m/, /n/....I took the ten phonemes it mentioned and searched for them in the UCLA Phonological Segment Inventory Database http://web.phonetik.uni-frankfurt.de/upsid.html , which has phoneme information on 451 languages. Here are the percentages of languages in UPSID using each of the ten phonemes described as "most common" on wikipedia.

Here are the results:

    UPSID m: 425 languages (94 %)
    UPSID k: 403 languages (89 %)
    UPSID i: 393 languages (87 %)
    UPSID a: 392 languages (87 %)
    UPSID j: 378 languages (84 %)
    UPSID p: 375 languages (83 %)
    UPSID u: 369 languages (82 %)
    UPSID n: 237 languages (53 %)
    UPSID e: 186 languages (41 %)
    UPSID o: 181 languages (40 %)" -- [4]

" Hey I was wondering the same thing, decided to make one based on the UPSID database.

here are the first 32:

(columns: % of lang, sound, description, Index of /S)

    94.24% m voiced bilabial nasal S0621.html
    89.36% k voiceless velar plosive S0573.html
    87.14% i high front unrounded vowel S0532.html
    86.92% a low central unrounded vowel S0300.html
    83.81% j voiced palatal approximant S0568.html
    83.15% p voiceless bilabial plosive S0721.html
    81.82% u high back rounded vowel S0853.html
    73.61% w voiced labial-velar approximant S0892.html
    63.64% b voiced bilabial plosive S0349.html
    61.86% h voiceless glottal fricative S0473.html
    56.10% g voiced velar plosive S0438.html
    52.55% N voiced velar nasal S0222.html
    47.89% ? voiceless glottal plosive S0173.html
    44.79% n voiced alveolar nasal S0635.html
    43.46% s voiceless alveolar sibilant fricative S0784.html
    41.69% tS voiceless palato-alveolar sibilant affricate S0827.html
    41.46% S voiceless palato-alveolar sibilant fricative S0265.html
    41.24% E lower mid front unrounded vowel S0196.html
    40.13% t voiceless alveolar plosive S0797.html
    40.13% "o mid back rounded vowel S0066.html
    39.91% f voiceless labiodental fricative S0433.html
    38.58% l voiced alveolar lateral approximant S0598.html
    37.47% "e mid front unrounded vowel S0023.html
    35.92% O lower mid back rounded vowel S0238.html
    35.48% "n voiced dental/alveolar nasal S0053.html
    33.70% "t voiceless dental/alveolar plosive S0102.html
    31.26% nj voiced palatal nasal S0671.html
    30.16% "l voiced dental/alveolar lateral approximant S0045.html
    29.93% "s voiceless dental/alveolar sibilant fricative S0094.html
    29.05% o higher mid back rounded vowel S0691.html
    27.49% e higher mid front unrounded vowel S0419.html
    26.61% d voiced alveolar plosive S0373.html

can find the rest in text files I published on my blog post: Most Common Phonemes, Least Common Phonems, of All languages of the world. http://weyounet.info/2014/01/most-common-phonemes-least-common-phonems-of-all-languages-of-the-world/ " -- [5]
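if i ever want to sort or filter rows like the ones quoted above, something like this sketch could split them into fields. it assumes every row has the shape "percentage, symbol, description, S####.html" separated by whitespace; the regex, the name `parse_rows`, and the three-row sample are mine, and the sample is copied from the quote rather than fetched from UPSID.

```python
import re

# percentage, symbol (no spaces), description (lazy), then the page name
ROW = re.compile(r'(\d+\.\d+)%\s+(\S+)\s+(.+?)\s+(S\d{4}\.html)')


def parse_rows(text):
    """Return (percent, symbol, description, page) tuples found in text."""
    return [(float(pct), sym, desc, page)
            for pct, sym, desc, page in ROW.findall(text)]


sample = ('94.24% m voiced bilabial nasal S0621.html '
          '89.36% k voiceless velar plosive S0573.html '
          '40.13% "o mid back rounded vowel S0066.html')
rows = parse_rows(sample)
for row in rows:
    print(row)
```

the symbol column is matched as "any run of non-space characters" so that UPSID's ASCII notations like `"o` and `tS` survive intact.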

" Collin Wheeler Collin Wheeler, Linguist, Guitarist, Conlanger, Gamer, Game Developer, Youtuber Answered Dec 11, 2017

Here is a link to a list of all phonemes (well, maybe not all of them). You are able to sort be most - least frequent (and vice-versa), and can filter out consonants or vowels, and search for specific phonemes, etc.

http://phoible.org/parameters " -- [6]

" André Müller, doing his PhD? in linguistics about language contact in Burma Answered Jun 7, 2016 · Author has 552 answers and 1.6m answer views

There’s a linguistic database out there that can answer your question and give some numbers, too. It’s called PHOIBLE Online and has information on 2155 languages. A colleague of mine is one of the creators. You can play around with it a bit, and then you can find out…

The most common consonants: /m, k, j, p, w, n, s, t, b, l, h, ɡ, ŋ, d/ - these sounds occur in more than 50% of all languages, in this order, e.g. /m/ in 95% of all languages, /d/ only in 54%.

The most common vowels: /i, a, u, o, e/ - these are the vowels occuring in more than half of all languages, in this order. So indeed /i, a, u/ are the most frequent (well over 80%). " -- [7]

" Michael Campbell polyglot, author, phonologist Answered May 28, 2016

The unvoiced stops are almost all present /p/, /t/, /k/. There are very few exceptions, a commonly cited one being Arabic which lacks /p/, but does have the voiced counterpart.

As for vowels, it is safe to say that /a/, /i/, /u/ are present in almost all of the world’s languages. Again, like Arabic there are only a handful that are missing one of these three. Mind you that the phonetic versions of the vowels may vary drastically from the perceived phonemic cardinal positions. For example, the Japanese /u/ certainly is not [u], but rather [ɯ] which is quite different from our perception of a cardinal /u/. Two Formosan languages that I speak, Bunun and Thaw, only have /a/, /i/, /u/.

Another response here gave you five vowels, however these are not as prevalent as the core three given above. Again, the emergence of /e/ and /o/ phonetically fall on a spectrum of sounds that can differ quite a lot from English. For example, phonetically speaking both Finnish and Japanese [e] and [o] occur halfway between English’s [e] / [ɛ] and [o] / [ɔ] respectively. In the Thaw language I mentioned above, both [e] and [o] are produced phonetically when they occur next to any /q/ or /r/, so, since they are conditional, they are not phonemic.

Liquids and nasals are not as stable and constant as the six phonemes above across all languages. A good reference point to start with would be the WALS database. Here is the link to the page with small vowel inventories, and you can explore more features starting from here. (Feature 2A: Vowel Quality Inventories). Then you can lookup the vowel inventories of any of these languages listed here as having small inventories. " -- [8]

" If we do compare phonemes, there are some that are very common like the cardinal vowels (Is the standard five-vowel inventory (a, e, i, o, u and its variants) the most common one?) and the consonants T and S. " -- [9]

this guy's answer is too long to quote but may be useful. He talks about common articulatory classes rather than phones: https://www.quora.com/What-phonemes-are-common-to-all-human-languages/answer/David-Rosson

" All languages have some kind of ‘a’. (one of /a,ɑ,æ,ɐ/)

Aside from that, all the other sounds have at least one language somewhere (often a linguistic curiosity) lacking it:

    No /p~b/ in Iroquois or Aleut
    No /t~d/ in Hawaiian or Nǁng
    No /k~g/ in Samoan or Xavante
    No /s/ in most Australian languages
    No /m/ or /n/ in Lushootseed or Rotokas
    No /w~v~β/, /l/ or any r-sound in Maxakali
    No /j/ in Georgian
    No vowels other than /a/ and /ə/ in Arrernte

You have to dig pretty damn deep to find languages that don’t have some kind of equivalent to all of /p~b/, /t~d/, /k~g/, /m/, /n/, /a/, /i/, /o~u~ɯ/, and at least 2 out of /w~v~β~ʋ/, /j/, /l~ɬ/, /r~ɾ~ʁ~ʀ~ɹ~ɻ~ɽ~ɺ/, but you can always find something.

If you use ridiculously wide definitions for sounds, you can get a sorta universal inventory that looks something like this:

    /p~b~t~d/
    /k~g~c~ʔ~kp/
    /m~w~v~β~mb/
    /n~l~ɬ~j~r~ɾ~ʁ~ʀ~ɹ~ɻ~ɽ~ɺ/
    /a~ɑ~æ~ɐ/
    /i~ɪ~e~ɛ~ə~ɨ~u~ʊ~o~ɔ/

" -- [10]

" It's really a scale rather than "uncommon" versus "common," but the absolutely most common sounds would probably be

    the stops [p], [t], [k] more on common stops
    the nasals [m] and [n] more on nasals
    the vowels [i], [a], [u] more on occurrences of vowels" -- [11]

https://www.reddit.com/r/conlangs/comments/26wdqe/most_common_phonemes_in_you_language/

" skookybird 13 points · 6 years ago

/a/ is the most common vowel, found in almost all languages. /i, u/ are close.

All languages have stops, most common being /t, k, p/ in that order. Very few languages lack any of those three, none(?) lack all three.

(Lifted from my introductory text.)

level 1 Aksalon 6 points · 6 years ago

The term I think you're looking for is markedness. Very rare phonemes are called marked, while very common phonemes are called unmarked.

I don't know if there are any absolutely universal phonemes. /t/ is an example of a nearly universal one, but there are languages that lack it (e.g. Hawaiian). The stops /p, t, k/ are all quite unmarked. /m, n/ as well. Many languages also have the five vowel system /i, e, a, o, u/, meaning that those are also pretty unmarked.

Markedness is a gradient scale, so there are any number of phonemes you could consider to be relatively unmarked. It depends on what your point of reference is.

" -- [12]

" level 1 Wierdmin 1 point · 6 years ago

/t/

The answer is always /t/.

And /a/. " -- [13]

" level 1 hellohelicopter 0 points · 6 years ago

Wikipedia's got a small overview for consonants and a long one for vowels. Basically most (but not all) languages have [p,t,k,m,n] and some iteration of [i,a,u] and frequently [e,o] as well, but it's very hard to draw up universals since at least one language will usually break the rule.

" -- [14]

" All languages have at least one vowel, usually more (in some analyses, Abkhaz has only a single vowel /a/, but it is likely that /I/ should also be given phonemic status). Common, but certainly not universal, schemes are the 3-vowel system i-a-u, the 5-vowel system i-e-a-o-u and the 7-vowel system i-e-E-a-O-o-u.

I'm not sure stops are universal. When a language distinguishes between voiced and unvoiced stops (many don't), /b/, /t/, /d/ and /k/ are the most universal ones (/p/ and /g/ are more likely to be absent, cf. Classical Arabic).

Most languages have nasals, but there are some that don't. I would think, though, that an awful lot of languages have at least /m/ and /n/.

>Are there any phonemes we consider normal in Germanic and Romance languages >but that are barely present in any other language groups?

Again hard to say. I think the labiodentals /f/, /v/ are not terribly common. Many languages also do not distinguish /l/ and /r/. " -- [15]

" The guinness book of records says that virtually all languages have the the vowel a. " -- [16]

summary of section Most Common Phonemes

from wikipedia, we have:

a,i,u,e,o p,t,k,m,n

the PHOIBLE db would continue the consonants with: jwlsb (or another commentator gave a different order: j,w,s,b,l)

PHOIBLE might then continue with nu (ŋ, which is actually before e and o in PHOIBLE) and then ɡ (unicode LATIN SMALL LETTER SCRIPT G).

the weyounet guy's top 16, that he got from UPSID, is:

    94.24% m voiced bilabial nasal S0621.html
    89.36% k voiceless velar plosive S0573.html
    87.14% i high front unrounded vowel S0532.html
    86.92% a low central unrounded vowel S0300.html
    83.81% j voiced palatal approximant S0568.html
    83.15% p voiceless bilabial plosive S0721.html
    81.82% u high back rounded vowel S0853.html
    73.61% w voiced labial-velar approximant S0892.html
    63.64% b voiced bilabial plosive S0349.html
    61.86% h voiceless glottal fricative S0473.html
    56.10% g voiced velar plosive S0438.html
    52.55% N voiced velar nasal S0222.html
    47.89% ? voiceless glottal plosive S0173.html
    44.79% n voiced alveolar nasal S0635.html
    43.46% s voiceless alveolar sibilant fricative S0784.html
    41.69% tS voiceless palato-alveolar sibilant affricate S0827.html

someone else noted:

    UPSID m: 425 languages (94 %)
    UPSID k: 403 languages (89 %)
    UPSID i: 393 languages (87 %)
    UPSID a: 392 languages (87 %)
    UPSID j: 378 languages (84 %)
    UPSID p: 375 languages (83 %)
    UPSID u: 369 languages (82 %)
    UPSID n: 237 languages (53 %)
    UPSID e: 186 languages (41 %)
    UPSID o: 181 languages (40 %)

http://web.phonetik.uni-frankfurt.de/upsid_info.html says (i abbreviated/deleted the least common items in this table):

    consonant:     m     k     j     p     w     b     h     g     N     ?     n     s     tS    S     t
    in languages:  425   403   378   375   332   287   279   253   237   216   202   196   188   187   181
    frequency:     94.2  89.4  83.8  83.2  73.6  63.6  61.9  56.1  52.6  47.9  44.8  43.5  41.7  41.5  40.1

(the table continues with f l "n "t nj)

    vowel:         i     a     u     E     "o    "e    O     o     e
    in languages:  393   392   369   186   181
    frequency:     87.1  86.9  81.8  41.2  40.1
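the frequency figures look like they are just the language count divided by the 451 languages in UPSID (the 451 comes from the quote earlier in this section). a small sketch to check that on a few rows; `share` and `UPSID_LANGUAGES` are names i made up, and the counts are copied from the tables above.

```python
UPSID_LANGUAGES = 451  # size of the database, as stated in the quote above


def share(count, total=UPSID_LANGUAGES):
    """Percentage of languages that have a segment, rounded to 2 places."""
    return round(100 * count / total, 2)


counts = {"m": 425, "k": 403, "i": 393, "a": 392}
shares = {segment: share(n) for segment, n in counts.items()}
for segment, pct in shares.items():
    print(segment, pct)
```

these four come out the same as the two-decimal percentages in the weyounet list, which suggests the one-decimal table is the same data rounded further.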

and from the section IPA, the intersection of English and Spanish and Chinese is (17 total): a e f i j k l m n ŋ o p s t u w x

so, the following seem to be (semi-)agreed upon as common:

a,i,u,e,o p,t,k,m,n

and then: jw

and then maybe: b

so taking 12 (because that's a nice number base too): a,e,i,j,k,m,n,o,p,t,u,w

unfortunately that's not enough to express even the most common syllables in English [17] [18]

what if we throw in the other things above 50% in both UPSID and PHOIBLE? in PHOIBLE we have: b l s nu g h and in UPSID we have b h g nu . So they both agree on b,h,g,nu, even if not on l,s. nu is the '-ing' sound by the way, i think, which is called 'velar nasal' in wikipedia [19], and UPSID writes it like N, i think.

so now we have 16: a,b,e,g,h,i,j,k,m,n,NU/ING,o,p,t,u,w

but neither English nor Spanish represents NU/ING with its own single letter, and [20] points out that many languages mostly just use that sound at the end of a word. So let's take that one out.

so now we have 15: a,b,e,g,h,i,j,k,m,n,o,p,t,u,w

this gets us the first 6 most frequent letters in English (etaoin, next would be srdlcf) and the first 3 in Spanish (eao, next would be srdlcv).
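a sketch of the check i did by eye here: walk a letter-frequency ordering and stop at the first letter the inventory lacks. the two orderings are just the ones written above (not independently verified), and `leading_covered` is a made-up helper name.

```python
def leading_covered(letters_by_frequency, inventory):
    """Longest prefix of the frequency ordering that the inventory can spell."""
    covered = ""
    for letter in letters_by_frequency:
        if letter not in inventory:
            break
        covered += letter
    return covered


inventory_15 = set("abeghijkmnoptuw")

english_order = "etaoinsrdlcf"   # ordering as given in these notes
spanish_order = "eaosrdlcv"      # ordering as given in these notes

english_prefix = leading_covered(english_order, inventory_15)
spanish_prefix = leading_covered(spanish_order, inventory_15)
print(english_prefix, spanish_prefix)
```

in both orderings the first missing letter is 's', which is one more argument for adding it.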

Recalling that PHOIBLE liked ls too, and UPSID liked s, that suggests that we should add s. PHOIBLE and UPSID both have 'l' relatively high so maybe throw that in too. Also UPSID doesn't have English in it so it's hard for me to find English phonemes in that list.

PHOIBLE SPA has 'r' pretty low on its list: https://phoible.org/inventories/view/160 as does British English https://phoible.org/inventories/view/2180 and American English https://phoible.org/inventories/view/2176 . 'd' is pretty high up there though, although it's only at 26% in UPSID [21].

note that if we wanted to add a few more, both 's' and 'r' as letters are very common in English and Spanish text (as phonemes, 'r' is not as common, surprisingly to me; however maybe that's because it's just how other more common phonemes are spelled; https://www.dyslexia-reading-well.com/44-phonemes-in-english.html).

But 16 is kind of nice because a keyboard that used thumb touching each finger on each hand, chording between hands only, would have 16 keys (4*4) (note: a stenotype keyboard has about 22 letter keys; i guess that's two for each finger and then 2 more for each thumb on the bottom).

So the 16 letters would be:

a,b,e,g,h,i,j,k,m,n,o,p,s,t,u,w

The 22 letters on a steno keyboard are:

STKPWHRAO*EUFRPBLGTSDZ

So the ones we discussed above which are found on the steno keyboard are:

lrd

Steno also has '*' and 'f' and 'z'. '*' is not a Roman letter and 'f' and 'z' are not as common. So that would suggest 19 letters:

a,b,d,e,g,h,i,j,k,l,m,n,o,p,r,s,t,u,w

which letters are missing from the normal alphabet?

c,f,q,v,x,y,z
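the steps from 15 letters to 19 can be replayed as set operations, which also gives the missing-letter list for free. a sketch only; the variable names are mine, and the steno key string is the one quoted above.

```python
import string

inventory_15 = set("abeghijkmnoptuw")
inventory_16 = inventory_15 | {"s"}

steno_keys = "STKPWHRAO*EUFRPBLGTSDZ"
# Drop '*' and fold case; repeated keys collapse because this is a set.
steno_letters = {c.lower() for c in steno_keys if c.isalpha()}

# Letters discussed above as possible additions, kept only if steno has them.
candidates = {"l", "r", "d"}
inventory_19 = inventory_16 | (candidates & steno_letters)

missing = sorted(set(string.ascii_lowercase) - inventory_19)
print(len(inventory_19), "".join(sorted(inventory_19)))
print("missing:", ",".join(missing))
```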

if we were thinking letters instead of phonemes, can we remove more? 'j' and 'k' are infrequently used letters in both English and Spanish. The others are all fairly frequent. But if we take out 'k', since we are already lacking 'c', it would be hard to indicate the 'k' sound. Also the 'j' sound is very common and the 'g' sound is somewhat common. So maybe leave those in.