proj-oot-ootNaturalLanguageNotes1

what do people talk about in human languages?

not so much algorithm specification, but rather, just passing what we would call data.

but this data is often in rather specific forms:

---

what is the typical bit-proportion of overhead from http://en.m.wikipedia.org/wiki/Agreement_(linguistics) etc (i guess http://en.m.wikipedia.org/wiki/Inflection is the general term?) in natural language? and how much of this is redundancy?

http://en.m.wikipedia.org/wiki/Synthetic_language

---

i recall that in college i was taught that children first learn words for 'basic level' categories. The evidence for this was surprisingly hard for me to Google for, so here it is:

https://www.google.com/search?client=ubuntu&hs=Hpj&channel=fs&biw=1200&bih=664&q=%22basic+level%22&oq=%22basic+level%22&gs_l=serp.3..0l10.1187146.1189379.0.1189492.13.8.0.0.0.0.405.766.3-1j1.2.0....0...1c.1.64.serp..11.2.764.XppE33qFlk8

https://www.google.com/search?client=ubuntu&hs=yej&channel=fs&q=superordinate+basic+subordinate+development+word&oq=superordinate+basic+subordinate+development+word&gs_l=serp.3...5730.6373.0.6445.5.4.0.0.0.0.0.0..0.0....0...1c.1.64.serp..5.0.0._-ry2ZoK6YI

THE MAIN ISSUE I WAS HAVING WAS THAT I WAS SEARCHING FOR 'BASE LEVEL' BUT THE CORRECT KEYWORD IS 'BASIC LEVEL'.

the keywords seem to be "superordinate, basic, subordinate" and maybe "categorization", probably conjoined with a query for 'word' or 'vocabulary' or 'lexicon' or 'language' and also with 'psychology' or 'developmental' or 'children' (or maybe 'learning' but this seems to give 'how to learn' how-to results rather than research, at least in a general google search rather than google scholar).

eg "Most of the early object words refer to basic-level categories, as opposed to superordinate or subordinate level (Anglin, 1977; Rescorla, 1980)." -- Early Child Development in the French Tradition

eg "...basic level examples were the most easily named by all the children.." -- https://www.researchgate.net/publication/289065846_Constraints_on_the_Representation_of_Word_Meaning_Evidence_From_Autistic_and_Mentally_Retarded_Children

" Basic level categories

The most inclusive level at which:

Used for everyday reference

Brown 1958, 1965, Berlin et al., 1972, 1973

Folk biology:

    Unique beginner: plant, animal
    Life form: tree, bush, flower
    Generic name: pine, oak, maple, elm
    Specific name: Ponderosa pine, white pine
    Varietal name: Western Ponderosa pine

No overlap between levels

Level 3 is basic

Corresponds to genus

Folk biological categories correspond accurately to scientific biological categories only at the basic level

Evidence Basic Level is Special

    People almost exclusively use basic-level names in free-naming tasks
    Children learn basic-level concepts sooner than other levels
    Basic-level is much more common in adult discourse than names for superordinate categories
    Different cultures tend to use the same basic-level categories, at least for living things

" -- web.stanford.edu/class/linguist62n/lexsem.ppt

(mb should also lookup keywords hyponym hypernym)

a paper that creates a readability level metric for texts based on ratio of 'base level' words: Measuring Text Readability by Lexical Relations Retrieved from WordNet by Shu-yen Lin, Cheng-chao Su, Yu-da Lai, Li-chin Yang, Shu-kai Hsieh

(but since WordNet doesn't tag 'base level', they have to use morphological features:) " ((Preliminary experiments to justify their automated classification of words into 'base level':))

A basic level word is assumed to have these features: (1) It is relatively short (containing less letters than their hypernyms/hyponyms in average); (2) Its direct hyponyms have more synsets than its direct hypernyms; (3) It is morphologically simple. Notice that some entries in WordNet [2] contain more than one word. We assume that an item composed of two or more words is NOT a basic level word. A lexical entry composed of two or more words is defined as a COMPOUND in this study. The first word of a compound may or may not be a noun, and there may or may not be spaces or hyphens between the component words of a compound. ... basic level words have the highest compound ratios. In comparison with their hypernyms and hyponyms, they are much more frequently used to form compound words. ... Our data pose a challenge to Prototype Theory in that a subordinate word of a basic level word may act as a basic level word itself. The word ‘card’, a hyponym of ‘paper’, is of this type. With its high compound ratio of 25%, ‘card’ may also be deemed to be a basic level word. This fact raises another question as to whether a superordinate word may also act as a basic level word itself.

Many of the basic level words in our list have three or more levels of hyponym. It seems that what is cognitively basic may not be low in the ontological tree. A closer look at the distribution of the compounds across the hyponymous levels reveals another interesting pattern. Basic level words have the ability to permeate through two to three levels of hyponyms in forming compounds. By contrast, words at the superordinate levels do not have such ability, and their compounds mostly occur at the direct hyponymous level

... ((the criteria for their automated classification of words into 'base level':))

Based on the results of the two preliminary experiments, we assume that basic level words have at least the following two characteristics: (1) They have great ability to form compounded hyponyms; (2) Their word length is shorter than the average word length of their direct hyponyms. These characteristics can be further simplified as the Filter Condition to pick out basic level words:

(1) Compound ratio of full hyponym >= 25%; (2) Average word length of direct hyponym minus target word length >= 4

... the information for each noun we need includes: (1) Length of the target word, i.e. how many letters the word contains; (2) Compound ratio of the target word, i.e. how many hyponyms of the word are compounds formed by the word. Note that here the hyponyms refer to the full hyponyms, so all the words in every hyponymous synset were counted; (3) Average word length of the direct hyponyms

...

paper is just the first step to measure readability by lexical relations retrieved from WordNet [2]. Twenty-five percent of the twenty basic level words defined by Rosch et al. [1] are NOT identified by our Filter Condition (e.g. ‘truck’, ‘shirt’, ‘socks’). Among the identified basic level words in the three selected texts, some look rather dubious to us (e.g. ‘barometer’, ‘technology’).

((a question for future research:)) (3) Are basic level words frequent words in general? Can we use frequency to substitute for ‘basichood’ if the two criteria have approximately the same indexing power? We like to extend this question and ask whether the ontological relations between the lexical units in WordNet are correlated with word frequency.

" -- http://www.aclweb.org/anthology/O08-1

not too useful but a random intro: http://www.wisegeek.com/what-is-a-basic-level-category.htm http://cogling.wikia.com/wiki/Levels_of_categorization http://cogling.wikia.com/wiki/Basic_level_category
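a rough sketch of the Filter Condition quoted from the readability paper above (compound ratio of full hyponyms >= 25%, and average direct-hyponym length minus target length >= 4). Everything here is an assumption for illustration: the toy hyponym table is invented (the paper reads these relations from WordNet), the names `TOY_HYPONYMS`, `compound_ratio`, `length_gap` and `is_basic_level` are hypothetical, and "compound formed by the word" is approximated by a plain substring test.

```python
# Toy hyponym table, invented for illustration (the paper uses WordNet).
# Each word maps to its direct hyponyms.
TOY_HYPONYMS = {
    "furniture": ["chair", "table", "bed"],
    "chair": ["armchair", "rocking chair", "wheelchair", "highchair"],
    "table": ["coffee table", "worktable"],
    "bed": ["bunk bed", "daybed"],
}


def full_hyponyms(word, table):
    """All hyponyms below `word`, at any depth."""
    result = []
    for child in table.get(word, []):
        result.append(child)
        result.extend(full_hyponyms(child, table))
    return result


def letter_count(entry):
    """Number of letters, ignoring spaces and hyphens."""
    return sum(ch.isalpha() for ch in entry)


def compound_ratio(word, table):
    """Share of full hyponyms that contain `word` (substring approximation)."""
    hyponyms = full_hyponyms(word, table)
    if not hyponyms:
        return 0.0
    compounds = [h for h in hyponyms if h != word and word in h]
    return len(compounds) / len(hyponyms)


def length_gap(word, table):
    """Average letter count of the direct hyponyms minus that of `word`."""
    direct = table.get(word, [])
    if not direct:
        return 0.0
    average = sum(letter_count(h) for h in direct) / len(direct)
    return average - letter_count(word)


def is_basic_level(word, table, min_ratio=0.25, min_gap=4):
    """Apply both thresholds of the quoted Filter Condition."""
    return (compound_ratio(word, table) >= min_ratio
            and length_gap(word, table) >= min_gap)


if __name__ == "__main__":
    for word in TOY_HYPONYMS:
        print(word, is_basic_level(word, TOY_HYPONYMS))
```

on this toy table 'chair' and 'table' pass and 'furniture' fails, but 'bed' also fails (length gap 3.5), which is the same kind of miss the authors report for 'truck', 'shirt' and 'socks'.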

Similar and Different: The Differentiation of Basic-Level: abstract: " Categories in the middle level of a taxonomic hierarchy tend to be highly differentiated in that they have both high levels of within-category similarity and low levels of between-category similarity. Research on similarity reveals a distinction between pairs of categories that are seen as dissimilar because they have few commonalities and pairs that are seen as dissimilar because they have many psychologically relevant alignable differences. The authors suggest that the low between-category similarity proposed for neighboring basic-level categories is actually a matter of having many psychologically relevant differences. In contrast, the low between-category similarity of superordinates is a result of their having few commonalities. The authors evaluate this claim in 4 experiments using a variety of natural stimuli and converging measures. The data support the importance of alignable differences for distinguishing between pairs of basic-level categories. "

"The level of specificity/generality of a concept within a conceptual hierarchy at which most people tend naturally to categorize it, usually neither the most specific nor the most general available category but the one with the most attributes distinctive of the concept in question." -- http://www.oxfordreference.com/view/10.1093/oi/authority.20110803095450465

---

my summary of the above may be found at notes-cog-basicLevelCategories

---

the 'subject' in a sentence is like a thread or process or 'actor'

---

the use of assertions instead of imperatives in natural language, and therefore of logic programming, is like a fancy db instead of a simple struct data structure; you can query it in forms other than you put the data in; like Facebook graphql but better (although graph ql seems more like views).

maybe focus on this more. This would be like oop except different 'objects' communicate via inference. But how to modularize? Remember need for neglect syndrome/paraconsistent reasoning. Also, multiple results with probabilities?
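a minimal sketch of the 'assertions as a queryable db' idea above, assuming facts are stored as (subject, relation, object) triples. The facts, the `query` helper and the `grandparents` rule are all made up for illustration; a real logic-programming system would do far more (unification, recursion, probabilities).

```python
# Asserted facts, stored without committing to any one access path.
FACTS = {
    ("alice", "parent_of", "bob"),
    ("bob", "parent_of", "carol"),
    ("alice", "likes", "tea"),
}


def query(facts, subject=None, relation=None, obj=None):
    """Return all facts matching the pattern; None acts as a wildcard."""
    pattern = (subject, relation, obj)
    return sorted(
        fact for fact in facts
        if all(p is None or p == f for p, f in zip(pattern, fact))
    )


def grandparents(facts):
    """One inference rule: parent_of(a, b) and parent_of(b, c) => grandparent(a, c)."""
    return sorted(
        (a, c)
        for (a, r1, b) in facts if r1 == "parent_of"
        for (b2, r2, c) in facts if r2 == "parent_of" and b2 == b
    )


if __name__ == "__main__":
    print(query(FACTS, subject="alice"))                    # everything about alice
    print(query(FACTS, relation="parent_of", obj="carol"))  # who is carol's parent?
    print(grandparents(FACTS))                              # derived, never asserted
```

the same three assertions answer 'what do we know about alice', 'who is carol's parent' and 'who is a grandparent of whom', none of which had to be anticipated when the data was put in; a plain struct keyed on the subject would only support the first.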

---

The Moon alphabet

http://www.omniglot.com/writing/moon.htm

apparently has some common words/abbreviations:

---

roles of words/phrases:

also,

-- [1]

"

whym 6 days ago [-]

This article does a great job at presenting a gist of the Japanese sentence structure. Nevertheless, it makes me want to point out that it's not the whole story. If you take into account topics such as modality and conjugation, some of the information you add to a verb is placed after the verb and cannot be freely reordered.

Japanese verbs are "greater" than English verbs in the sense that you conjugate/suffixate a verb to express negation, conjunctions, conditional forms etc, making it longer and longer: https://en.wikipedia.org/wiki/Japanese_verb_conjugation

In contrast, English has a relatively simple set of inflections of verbs. Many of those Japanese verb forms and suffixated long verbs are translated into multi-word phrases. Compare:

Anata wa kyou nemuru. (You sleep today.) -- verb is in normal form ("nemuru")

Anata wa kinou nemurenakatta. (You were not able to sleep yesterday.) -- verb is in continuative form ("nemuru"→"nemu") + possibility suffix ("reru"→"re") + negation suffix ("nai"→"naka") + past suffix ("ta"→"tta")

reply

klodolph 6 days ago [-]

The name for what you're talking about is "morphological typology". Languages occupy a spectrum from analytic (words stay the same) to synthetic (words change). English is usually categorized as analytic, since we only have a few ways to change words: plural -s, past tense -ed, etc., and English has been getting more analytic over time. Other European languages are more synthetic (fusional), like French, German, Spanish, etc. Japanese, Finnish, Hungarian are very synthetic (agglutinative). Chinese (all varieties) is on the opposite end of the spectrum, and it's much more analytic than English.

reply "
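a small sketch of the suffix chain described in the first comment above. The morpheme pairs are copied from that comment; the function names `agglutinate` and `gloss` are hypothetical, and real Japanese conjugation has more rules than simple concatenation.

```python
# Each step is (dictionary form, form used inside the chain, role),
# taken from the quoted comment.
NEMURENAKATTA = [
    ("nemuru", "nemu", "verb 'sleep', continuative form"),
    ("reru", "re", "possibility suffix"),
    ("nai", "naka", "negation suffix"),
    ("ta", "tta", "past suffix"),
]


def agglutinate(steps):
    """Build one long word by concatenating the bound forms in order."""
    return "".join(bound for _full, bound, _role in steps)


def gloss(steps):
    """Morpheme-by-morpheme breakdown, joined with hyphens."""
    return "-".join(bound for _full, bound, _role in steps)


if __name__ == "__main__":
    print(agglutinate(NEMURENAKATTA))  # one Japanese word
    print(gloss(NEMURENAKATTA))        # its four morphemes
    # The English translation from the comment spreads the same
    # information over separate words instead of suffixes:
    print("were not able to sleep".split())
```

the order of the suffixes is fixed, which is the commenter's point: the information added after the verb cannot be freely reordered.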

---

https://en.m.wikipedia.org/wiki/Pidgin#Common_traits

(easier to read at the above hyperlink than here, because a bunch of technical linguistic terms are hyperlinked to their descriptions)

" A pidgin[1][2][3] ..., or pidgin language, is a grammatically simplified means of communication that develops between two or more groups that do not have a language in common

Common traits

Pidgins are usually less morphologically complex but more syntactically rigid than other languages, usually have fewer morphosyntactic irregularities than other languages, and often consist of:[citation needed]

    Uncomplicated clausal structure (e.g., no embedded clauses, etc.)
    Reduction or elimination of syllable codas
    Reduction of consonant clusters or breaking them with epenthesis
    Basic vowels, such as [a, e, i, o, u]
    No tones, such as those found in West African, Asian and many North American Indigenous languages
    Use of separate words to indicate tense, usually preceding the verb
    Use of reduplication to represent plurals, superlatives, and other parts of speech that represent the concept being increased
    A lack of morphophonemic variation

"

---

'for example' might be a fundamental word, because it's useful in teaching a language

---

https://thelinguistblogger.wordpress.com/2008/07/20/what-makes-a-language-difficult/

excerpts, paraphrased:

difficult: conjugations, number agreement, cases, honorifics, noun genders or declensions

easy: Words are spelled very, very similarly to the way they are spoken

easy: straight forward writing system

hard: endings and the beginnings of words change depending on who is talking, who they are referring to, whether or not they are asking a question, etc

hard: so many consonants that just about any learner will have to tackle at least a few sounds that either seem impossible to produce or identical to anyone but a native

easy: simple grammar

https://www.fluentu.com/blog/easiest-language-to-learn/

quotes and paraphrases below

1. How closely it’s related to the languages you already know.

2. How complex its system of sounds is.

3. How complicated its grammar is.

Languages with Painless Pronunciation

in terms of phonology (the system of speech sounds in a language), not all languages are made equal: Some have dozens of different consonants and vowels, and some have only a few.

easy:

4. Spanish

In Spanish, a always sounds more or less like a (even with an accent mark),

5. Japanese

Japanese has historically gotten bad PR among language learners, but its pronunciation is actually remarkably simple. Of its 19 consonants, only a couple are rare among world languages, and its five vowels are remarkably similar to those in Spanish.

6. Italian

It’s got a few more vowels than its cousin down in Spain, but Italian’s big advantage is that most of its consonants and vowels are among the most common sounds found in world languages.

To see what’s out there for you beyond these three, you could start with this list of world languages ordered by number of phonemes (distinct speech sounds) to get an idea of which languages are more phonologically difficult than others. [2]

Keep in mind that most of the extremes (languages with very many or very few phonemes) are very old, very isolated languages that might not be easy or practical to learn, but you can still use the tool to compare whether Greek or Russian is your best choice.

Goodbye Grammar Book: Languages with Simple Structures

I’ve always shied away from German for this reason. Its four noun cases, infinite list of adjective declensions and word order rules are enough to send me running to the nearest biergarten.

easy:

7. Mandarin Chinese

This is probably the first time you’ve seen Chinese on a list of easy languages, right? That’s a shame, because structurally speaking, it’s a cinch. Almost every word of Mandarin has one and just one meaning. It also generally follows a subject-verb-object word order, common to most of the world’s larger languages, so no new tricky syntax for most learners.

8. Afrikaans

We mentioned Dutch above, but Afrikaans is like a grammatically boiled-down version of its parent language. Whereas Dutch demands verb conjugations like those in English—for instance, I am, you are, it is—Afrikaans doesn’t bother you with the details. In South Africa it’s ek is (I am), jy is (you are), sy is (she is). What could be easier?

9. Malay

The language known regionally as Indonesian or Malaysian totals around 270 million total speakers, making it both one of the largest and fastest-growing world languages. Even better, it has no grammatical categories for gender, number or tense. Basically, you learn one form of a word, and you can use it just about whenever you want.

10. Esperanto

This language was invented by some linguists who were also great global citizens, and even though it’s “made up,” its 2 million speakers, several hundred thousand Wikipedia articles, and organizations worldwide would argue that it still counts. Esperanto was designed with you in mind: Minimal grammar, easy rules and as a bonus, lots of things that resemble many other world languages.

.... (about english)

has some difficult sounds like interdental th, some phrasal verbs that admittedly make no sense and a spelling system that makes even less sense. But in general, English doesn’t have a lot of inflections, so there’s no messy grammar and most, though not all, of its sounds will be familiar to speakers of other languages ....

https://www.economist.com/blogs/economist-explains/2013/08/economist-explains-19 " Some languages proliferate endings on verbs and nouns, like Latin and Russian. Such inflection can be hard for learners who are not used to it. Several years ago, two scholars found that smaller languages (those with less contact with other languages) tended to have more inflection than big ones. By contrast, creole languages—which arise between groups that do not share a common language—are thought by scholars to be systematically simpler than other languages, even after they become “normal” languages with native speakers. They typically lack heavy inflection.

But inflection is only one element of “hardness”. Some languages have simple sound systems (such as the Polynesian languages). Others have a wide variety of sounds, including rare ones that outsiders find hard to learn (like the languages of the Caucasus). Some languages (like English) lack or mostly lack grammatical gender. Some have dozens of genders (also known as “noun classes”) that must be learned for each noun. Languages can have rigidly fixed or flexible word order. They can put verbs before objects or even objects before subjects. Yet it is not clear how to rank the relative difficulty of exotic consonants, dozens of genders or heavy inflection. Another recent approach sought to go around the problem by finding languages that had the most unusual features, skirting the question of whether those features were “hard”. Comparing 21 feature parameters across hundreds of languages, they ranked 239 languages. Chalcatongo Mixtec, spoken in Mexico, was the weirdest. English came in place number 33. Basque, Hungarian, Hindi and Cantonese ranked as among the most “normal”. The researchers did not find any larger similarities between “weird” and “normal” languages. (For example, they do not claim that smaller or bigger languages tend to be “weirder”.) But again, the caveat is that this only compares which languages are unusual in a global context, not which are hard.

So the two most robust findings seem to be that smaller languages are more heavily inflected, and that languages farther from your own in the linguistic family tree will be harder for you to learn. If you want a challenge, a good bet is to pick a tiny language from halfway around the world. "

" Most of the features that help make other languages easy can be found in Esperanto, including:

    The many cognates English shares with French.
    Spanish’s consistent spelling rules.
    Mandarin’s lack of verb conjugations and noun gender."

"

However, Afrikaans has one of the simplest grammar systems of any language out there. There are also very few exceptions to the rules and words are largely pronounced as they are spelled (unlike English).

Here are some copy-pasted bits from Wikipedia regarding it's grammar:

In Afrikaans grammar, there is no distinction between the infinitive and present forms of verbs, with the exception of the verbs 'to be' and 'to have':

In addition, verbs do not conjugate differently depending on the subject.

Only a handful of Afrikaans verbs have a preterite, namely the auxiliary wees ("to be"), the modal verbs, and the verb dink ("to think"). The preterite of mag ("may") is rare in contemporary Afrikaans.

All other verbs use the perfect tense (hê + past participle) for the past. Therefore there is no distinction in Afrikaans between I drank and I have drunk. (Also in colloquial German, and to some extent Dutch, the past tense is often replaced with the perfect.) "

" Let's think about verbs. In French there are three types of verbs: ones where the infinitive ends in -er, ones that end in -re and ones that end in -ir. Each type has it's own set of endings in the present tense. I won't bore you by listing them, but in order to use them correctly you need to know the correct endings for 6 persons (I, you, he etc . . .) for the three different types. So that's 6x3=18 endings you need to know (OK, some of them are repeats, but even so you need to know which ones). That's just the regular verbs. The irregular verbs are which are generally the most frequently used (to be, to make, to have...), so you've got to learn them. Each one. Each person of each one. Individually. And some of them are so far from the regular patterns as to look almost like a different language.

Now lets remember the rule for Esperanto. To make a present tense verb in Esperanto you take the stem and add -as. That's it. No complex rules. No irregularities. One sentence of explanation required. "
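to make the '6x3=18 endings versus one rule' comparison concrete, here is a sketch. Assumptions not in the quote: the French table lists the standard regular present-tense endings (with the -ir row following the finir pattern), Esperanto infinitives end in -i, and the function names are hypothetical. Irregular verbs are not handled at all.

```python
# Regular French present-tense endings, in the order:
# je, tu, il/elle, nous, vous, ils/elles.
FRENCH_PRESENT_ENDINGS = {
    "er": ["e", "es", "e", "ons", "ez", "ent"],
    "ir": ["is", "is", "it", "issons", "issez", "issent"],
    "re": ["s", "s", "", "ons", "ez", "ent"],
}


def french_present(infinitive, person):
    """Conjugate a regular French verb; person is an index 0..5."""
    stem, group = infinitive[:-2], infinitive[-2:]
    return stem + FRENCH_PRESENT_ENDINGS[group][person]


def esperanto_present(infinitive):
    """Drop the infinitive -i and add -as, for every person and every verb."""
    return infinitive[:-1] + "as"


if __name__ == "__main__":
    total = sum(len(row) for row in FRENCH_PRESENT_ENDINGS.values())
    print(total)                          # table slots to memorize
    print(french_present("parler", 3))    # nous ...
    print(french_present("finir", 0))     # je ...
    print(french_present("vendre", 2))    # il/elle ...
    print(esperanto_present("paroli"))    # same form for all persons
```

the French side needs a lookup table indexed by verb group and person, while the Esperanto side is a one-line string operation with no person argument at all.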

" It's often said that the first thing you forget when you don't use a language is its irregularities. Since Esperanto has so few of them, you tend to not forget as much "

" Esperanto really is phonetic. Spanish is pretty good at this as well, but even there their are two ways to write the sound we write in English as th, and there are two pronunciations for c depending on the following vowel "

---

speech acts used by S#, a generic game theory game playing algorithm:

[3]

---

what is the list of common words that has standardized shorthand abbreviations?

other ideas:

polarity, positive / thesis (polarity-1), negative / antithesis (polarity-not1), neutral / synthesis (polarity-0), active, passive, hidden, non-hidden, superlative marker (most, least, best, worst), replace / instead of, swap / switch, change / mutation, function, relation, set, list, mapping / dictionary, sequence / ordered list, ordered / unordered, unique / non-unique, with respect to, in / out / within / item within a list, isomorphic, homomorphic, injective / 1-1 / invertible, surjective, possible, impossible, modality, instance of / instantiation / example / concrete, why / motivation / cause / reason / because / goal

problem / solution, equals, less than / greaterthan, similar / different, map / foreach, between / among, generic modifier clause or word introduction, the form of a syllogism (given A, B, therefore C), that query monad stuff in F# and C#, if, repeat, copy, apply / use, together / with, query / interrogative mode, towards / to, what / type / category / classification, go / move, take, demand, promise, propose, me, you, it, him / her / them, strong, fast, smart, good / correct / desired / goal, evil / incorrect / undesired / antigoal, how / method, who / person, interrogative blank marker, attachment, why, quantity / how many, style / character / quality, nb / note, item / thing / object, super / sub / part / whole (a part of something else), within / encompassing (item on a list), subclass / superclass, be / exist / declarative mode, where / at / location, when / time, because / why, possession, is, past tense, future tense, hypothetical tense, imperative mode, interrogative mode, some / exists, maybe, yes, no, I don't know, example, ie, and, or, xor, not, all. What are the other most common words?

for an example of combining these: before = greaterthan-genericmodifier-time
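a tiny sketch of how such combinations could be represented, assuming a compound is just an ordered sequence of primitives joined by hyphens. The primitive set below is a small subset of the list above with spaces removed, and `compound` and `LEXICON` are hypothetical names; only the 'before' entry comes from these notes.

```python
# A few primitives from the brainstormed list above (subset, spaces removed).
PRIMITIVES = {"greaterthan", "lessthan", "genericmodifier", "time", "location"}


def compound(*parts):
    """Join known primitives into one compound word; reject unknown parts."""
    unknown = [p for p in parts if p not in PRIMITIVES]
    if unknown:
        raise ValueError(f"not a primitive: {unknown}")
    return "-".join(parts)


# Derived vocabulary, defined only in terms of primitives.
LEXICON = {
    "before": compound("greaterthan", "genericmodifier", "time"),
}

if __name__ == "__main__":
    print(LEXICON["before"])
```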

---

“We are born knowing there are causal relationships in the world, that wholes can be made of parts, and that the world consists of places and objects that persist in space and time,”

---

some words that tend to be taught early in language classes:

---

---

some conlangs that might be of interest:

---

"While it is true that virtually all languages reflect certain basic universals of word choice (e.g., all have words for sun, moon, speak, mother, father, laugh, I, you, one, two, water, blood, black, white, hot, cold, etc.)..." [4]

---

" All in all, neither logical languages such as Loglan nor interlanguages such as Esperanto, are designed specifically to achieve the purpose of cognitive exactness and conciseness of communication which is the goal of Ithkuil. Actually, Ithkuil might more readily be compared with the analytical language of John Wilkins of the Royal Society of London, published in 1668, in which he divided the realm of human conception into forty categories, each containing a hierarchy of subcategories and sub-subcategories, each in turn systematically represented in the phonological structure of an individual word. While unworkable in terms of specifics, Wilkins’ underlying principles are similar in a simplistic way to some of the abstract derivational principles employed in Ithkuil lexico-morphology and lexico-semantics. Another comparable predecessor in a simplistic sense is the musical language, Solresol, created by Jean François Sudre and published in 1866. "

" He divided the universe in forty categories or classes, these being further subdivided into differences, which was then subdivided into species. He assigned to each class a monosyllable of two letters; to each difference, a consonant; to each species, a vowel. For example: de, which means an element; deb, the first of the elements, fire; deba, a part of the element fire, a flame. In a similar language invented by Letellier (1850) a means animal; ab, mammal; abo, carnivore; aboj, feline; aboje, cat; abi, herbivore; abiv, horse; etc. In the language of Bonifacio Sotos Ochando (1845) imaba means building; imaca, harem; imafe, hospital; imafo, pesthouse; imari, house; imaru, country house; imedo, coloumn; imede, pillar; imego, floor; imela, ceiling; imogo, window; bire, bookbinder; birer, bookbinding. (This last list belongs to a book printed in Buenos Aires in 1886, the 'Curso de Lengua Universal', by Dr. Pedro Mata.)

The words of the analytical language created by John Wilkins are not mere arbitrary symbols; each letter in them has a meaning, like those from the Holy Writ had for the Cabbalists. Mauthner points out that children would be able to learn this language without knowing it be artificial; afterwards, at school, they would discover it being an universal code and a secret encyclopaedia.

Once we have defined Wilkins' procedure, it is time to examine a problem which could be impossible or at least difficult to postpone: the value of this four-level table which is the base of the language. Let us consider the eighth category, the category of stones. Wilkins divides them into common (silica, gravel, schist), modics (marble, amber, coral), precious (pearl, opal), transparent (amethyst, sapphire) and insolubles (chalk, arsenic). Almost as surprising as the eighth, is the ninth category. This one reveals to us that metals can be imperfect (cinnabar, mercury), artificial (bronze, brass), recremental (filings, rust) and natural (gold, tin, copper). Beauty belongs to the sixteenth category; it is a living brood fish, an oblong one. "

[5] https://en.wikipedia.org/wiki/An_Essay_towards_a_Real_Character,_and_a_Philosophical_Language
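in the schemes quoted above, each added letter narrows the category, so a word's prefixes spell out its ancestry. A minimal sketch, using only the example words given in the quote; the function name `ancestry` is hypothetical and the dictionaries are nowhere near a full lexicon.

```python
# Example words copied from the quoted passage.
LETELLIER = {
    "a": "animal",
    "ab": "mammal",
    "abo": "carnivore",
    "aboj": "feline",
    "aboje": "cat",
    "abi": "herbivore",
    "abiv": "horse",
}

WILKINS = {
    "de": "element",
    "deb": "fire",
    "deba": "flame",
}


def ancestry(word, lexicon):
    """Meanings of every prefix of `word` that is itself a word, general to specific."""
    return [
        lexicon[word[:n]]
        for n in range(1, len(word) + 1)
        if word[:n] in lexicon
    ]


if __name__ == "__main__":
    print(ancestry("aboje", LETELLIER))
    print(ancestry("abiv", LETELLIER))
    print(ancestry("deba", WILKINS))
```

the taxonomy is recoverable from spelling alone, with no separate hypernym table; the flip side, as the quoted critique of the stone and metal categories suggests, is that any flaw in the classification is baked into every word.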

https://en.wikipedia.org/wiki/Oligosynthetic_language

"An oligosynthetic language (from the Greek ὀλίγος, meaning "few" or "little") is any language using very few morphemes, perhaps only a hundred, which combine synthetically to form statements"

https://en.wikipedia.org/wiki/Philosophical_language

i think i already noted this one: https://en.wikipedia.org/wiki/Semantic_primes

" For example, I noticed how elegant and efficient the three-letter root structure of Semitic languages like Arabic and Hebrew were as a means of building words compared to European languages. I noticed how the perfective versus imperfective verbal aspect of Slavonic languages like Russian were able to convey certain verbal distinctions easily which languages like English had to use whole phrases to convey. In other cases, I found certain languages that grammaticalized thoughts that most other languages did not (such as the “4th person” distinction of certain American Indian languages). " [6]

" I think most philosophers who write about language are not trained in linguistics and are therefore laughably (or dangerously) naïve in their understanding of language and the conclusions they reach about it (e.g., Wittgenstein, Russell) (one exception to this is the philosopher John Searle, whose book Mind, Language and Society provided me with ideas which constitute the bases for the Context and Illocution categories of Ithkuil -- see Sections 3.6 and 5.1 of the Ithkuil grammar). " [7]

" In this issue we will also talk about several other conlangs, such as the minimalistic “language of good” called Toki Pona and “logical language” Lojban. It seems that Toki Pona with its supersimplification approach is an exact opposite of Ithkuil. Have you ever heard of it? What is your opinion then? And what do you think about Lojban? I looked at Toki Pona for the first time about six months ago, after reading about it in other conlang discussion groups. While I appreciate the idea of simplification as a way of fostering initial communication and goodwill, it is definitely 180 degrees away from the purpose and scope of Ithkuil. For me personally, I do not see how anyone could ultimately be satisfied with either creating or using a language like Toki Pona. It represents a sad compromise in the level of communication between people compared to the level at which language COULD be used for communication. As for Lojban, I studied symbolic logic (the predicate calculus) at university. I knew then that symbolic logic was NOT how I wanted Ithkuil to function. Languages such as Loglan and Lojban which are based on the predicate calculus are very interesting and precise symbol-manipulation systems, but they do not address the levels of language which represent the interrelationships between concepts that Ithkuil represents (the ability to designate exact interrelationships and derivations of secondary concepts from primary concepts), nor the semantic vagueness and cognitive intentionality problem which I discuss at length in Sections 0.3 through 0.5 of the Introduction section of the Ithkuil website.

" [8]

" In 2007, I introduced the long-promised revision of Ithkuil called Ilaksh. Its creation was motivated by the numerous emails and other comments I received asking for an easier-to-pronounce version of the language. At the same time, I used the opportunity to make some adjustments and changes to the grammar. " [9]

---

[10] has words:

door teleport correct incorrect unlocked locked help sound guide inside outside key friend for pillar

---

(copied to notes/cog/language/languageMisc)

i guess one thing about human language is the whole verb/noun thing.

Nouns, noun phrases, and adjectives fit in well with the general mathematical and programming language formalisms that we have. Conjunctions, pronouns, and prepositions are not quite as clear but presumably can be handled simply enough.

Verbs are something different (the relationship between nouns and adjectives is presumably the same as between verbs and adverbs, so we probably don't have to worry about adverbs either). Determiners are different too, but not as important as verbs. It seems to me that verbs, and especially the fact that they are so common and even mandatory, imply a set of assumptions that human thought makes. What are these?

One of them seems to be that most nouns can be thought of as agents, often via anthropomorphization. An agent implies a linear timeline. The timeline is partitioned into segments during which the agent is doing different actions (furthermore, the default assumption seems to be that at each moment in time, the agent is doing just one thing; although this is just the default, because it is accepted that an agent can be doing multiple things at once).

Furthermore, the concepts of owning, wanting, trying, using, believing, and liking, as well as emotional/experiential state, are applied to agents (i don't think this is a comprehensive list btw, it's just the ones i thought of first). That is, agents can 'have' things; they can want things; they can attempt actions; they can use things to attempt actions; they can hold beliefs; they can have preferences; and they can be happy/sad/etc and be in pain/hungry/full/etc. I mention these because, in math or computer programming, we don't generally apply those concepts to arbitrary nouns.
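
The agent assumptions in the last two paragraphs can be sketched as a data structure. This is only an illustration in Python; the names (`Agent`, `Segment`, `actions_at`) and the choice of fields are my own assumptions, not anything from a linguistics source, and the field list is no more comprehensive than the list above.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    """One stretch of an agent's timeline, spent on one action."""
    start: float
    end: float
    action: str


@dataclass
class Agent:
    """Hypothetical model of a noun that is being treated as an agent."""
    name: str
    timeline: list = field(default_factory=list)  # Segments, in the order added
    owns: set = field(default_factory=set)        # things the agent 'has'
    wants: set = field(default_factory=set)       # things the agent wants
    beliefs: set = field(default_factory=set)     # things the agent believes
    mood: str = "neutral"                         # emotional/experiential state

    def do(self, action, start, end):
        """Record that the agent performs `action` during [start, end)."""
        self.timeline.append(Segment(start, end, action))

    def actions_at(self, t):
        """Return every action in progress at time t.

        One action per moment is the default case, but overlapping
        segments are allowed, so this returns a list.
        """
        return [s.action for s in self.timeline if s.start <= t < s.end]


cat = Agent("cat")
cat.wants.add("fish")
cat.do("sleep", 0, 5)
cat.do("eat", 5, 6)
cat.do("purr", 5, 6)  # doing two things at once is permitted
```

Here `cat.actions_at(2)` gives the single default action and `cat.actions_at(5)` gives two overlapping ones. Note that nothing like `wants` or `mood` would normally be attached to an arbitrary value in math or in a program.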

On the other hand, one thing in math that is like a verb but is absent from human language is the function (another is the group action). Functions are verb-like, but they don't have an exact analog in language; a function can have arbitrarily many inputs, whereas a verb has a single subject. For example, "add(2,2) = 4" does not easily/directly translate into human language's subject/verb structure.

---

i guess the previous note, on the assumptions implied by human language verb/noun structure, is one reason why single-dispatch OOP is good; the distinguished subject in human language maps to the distinguished "receiver" in single-dispatch OOP.
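
A minimal sketch of that mapping, in Python (chosen only for illustration; `Dog`, `chase`, and `describe` are made-up names). A method call has a distinguished receiver that reads like a grammatical subject, a plain function like `add` has no distinguished argument, and `functools.singledispatch` selects an implementation from the type of the first argument alone.

```python
from functools import singledispatch


class Dog:
    def __init__(self, name):
        self.name = name

    def chase(self, target):
        # The receiver (self) plays the role of the grammatical subject:
        # "Rex chases the ball" <-> rex.chase("the ball")
        return f"{self.name} chases {target}"


def add(x, y):
    # A plain function: neither argument is "the subject".
    return x + y


@singledispatch
def describe(subject, obj):
    # Fallback, used when no implementation is registered for type(subject).
    return f"something acts on {obj}"


@describe.register
def _(subject: Dog, obj):
    # Chosen purely from the type of the first argument; obj is ignored
    # for dispatch purposes.
    return f"a dog acts on {obj}"


rex = Dog("Rex")
sentence = rex.chase("the ball")
total = add(2, 2)
```

`rex.chase("the ball")` reads almost like the English sentence, whereas `add(2, 2)` has to be rephrased ("two plus two", "the sum of two and two") to find a subject at all.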

---

more on linguistic universals

https://en.wikipedia.org/wiki/Greenberg%27s_linguistic_universals

---

more on common primitive words:

https://en.wikipedia.org/wiki/Natural_semantic_metalanguage

https://en.wikipedia.org/wiki/Swadesh_list

" The ranked Swadesh-100 list, with Swadesh numbers and relative stability, is as follows (Holman et al., Appendix. Asterisked words appear on the 40-word list):

    22 *louse (42.8)
    12 *two (39.8)
    75 *water (37.4)
    39 *ear (37.2)
    61 *die (36.3)
    1 *I (35.9)
    53 *liver (35.7)
    40 *eye (35.4)
    48 *hand (34.9)
    58 *hear (33.8)
    23 *tree (33.6)
    19 *fish (33.4)
    100 *name (32.4)
    77 *stone (32.1)
    43 *tooth (30.7)
    51 *breasts (30.7)
    2 *you (30.6)
    85 *path (30.2)
    31 *bone (30.1)
    44 *tongue (30.1)
    28 *skin (29.6)
    92 *night (29.6)
    25 *leaf (29.4)
    76 rain (29.3)
    62 kill (29.2)
    30 *blood (29.0)
    34 *horn (28.8)
    18 *person (28.7)
    47 *knee (28.0)
    11 *one (27.4)
    41 *nose (27.3)
    95 *full (26.9)
    66 *come (26.8)
    74 *star (26.6)
    86 *mountain (26.2)
    82 *fire (25.7)
    3 *we (25.4)
    54 *drink (25.0)
    57 *see (24.7)
    27 bark (24.5)
    96 *new (24.3)
    21 *dog (24.2)
    72 *sun (24.2)
    64 fly (24.1)
    32 grease (23.4)
    73 moon (23.4)
    70 give (23.3)
    52 heart (23.2)
    36 feather (23.1)
    90 white (22.7)
    89 yellow (22.5)
    20 bird (21.8)
    38 head (21.7)
    79 earth (21.7)
    46 foot (21.6)
    91 black (21.6)
    42 mouth (21.5)
    88 green (21.1)
    60 sleep (21.0)
    7 what (20.7)
    26 root (20.5)
    45 claw (20.5)
    56 bite (20.5)
    83 ash (20.3)
    87 red (20.2)
    55 eat (20.0)
    33 egg (19.8)
    6 who (19.0)
    99 dry (18.9)
    37 hair (18.6)
    81 smoke (18.5)
    8 not (18.3)
    4 this (18.2)
    24 seed (18.2)
    16 woman (17.9)
    98 round (17.9)
    14 long (17.4)
    69 stand (17.1)
    97 good (16.9)
    17 man (16.7)
    94 cold (16.6)
    29 flesh (16.4)
    50 neck (16.0)
    71 say (16.0)
    84 burn (15.5)
    35 tail (14.9)
    78 sand (14.9)
    5 that (14.7)
    65 walk (14.4)
    68 sit (14.3)
    10 many (14.2)
    9 all (14.1)
    59 know (14.1)
    80 cloud (13.9)
    63 swim (13.6)
    49 belly (13.5)
    13 big (13.4)
    93 hot (11.6)
    67 lie (11.2)
    15 small (6.3)

"

www.ucd.ie/artspgs/langworld/universals.rtf

https://www.jstor.org/stable/410101?seq=1#page_scan_tab_contents

https://en.wikibooks.org/wiki/Conlang/Advanced/Grammar/Government/Linguistic_universals

https://linguistics.stackexchange.com/questions/11413/does-any-linguist-honestly-believe-that-nouns-and-verbs-are-not-universals

"One has to be careful how the words Noun and Verb are understood, if one wants a good answer. Semanticists talk about Entities and Events" [11]

"

 Most languages do have words -- i.e, they can put word-level constituents together into larger ones.

But there are polysynthetic languages, like Eskimo languages and Salishan languages. In a polysynthetic language there is usually a very simple root system with dozens of derivations and inflections that get added on to form even the simplest utterance. And these roots are very general, and have lots of metaphoric and idiomatic associations -- like any other language -- so their possible forms get big very fast. Both in the sense of getting longer and longer, and in the sense of there being almost infinite extensibility of words.

It says something about Skagit (Puget Salish; Northern Lushootseed) that over 3/4 of the nouns in the language start with the overt nominalizer s- --even the words for 'man' and 'woman'. They're built on CVC roots, as is most of the rest of the language. So in Skagit, while there are nouny things and verby things and you can tell the difference in a given sentence, they don't split up the descriptive labor the way we're used to, and the roots are normally neither verby nor nouny, but can be made to refer to either. Plus there isn't a really well-defined boundary between words and sentences."

---

looks like a good read on linguistic universals:

https://www.uio.no/studier/emner/hf/ikos/EXFAC03-AAS/h05/larestoff/linguistics/Chapter%203.(H05).pdf

---

" an inventory of the elements of human nature compiled by the American anthropologist George P. Murdock during a study of cultural universals:

    Age-grading, athletic sports, bodily adornment, calendar, cleanliness training, community organizations, cooking, cooperative labor, cosmology, courtship, dancing, decorative art, divination, division of labor, dream interpretation, education, eschatology, ethics, ethnobotany, etiquette, faith healing, family feasting, firemaking, folklore, food taboos, funeral rites, games, gestures, gift giving, government, greetings, hairstyles, hospitality, housing, hygiene, incest taboos, inheritance rules, joking, kin groups, kinship nomenclature, language, law, luck superstitions, magic, marriage, mealtimes, medicine, obstetrics, penal sanctions, personal names, population policy, postnatal care, pregnancy usages, property rights, propitiation of supernatural beings, puberty customs, religious rituals, residence rules, sexual restrictions, soul concepts, status differentiation, surgery, toolmaking, trade, visiting, weaving, and weather control.
    " [http://www.rfreitas.com/Astro/Xenopsychology.htm]

---

what would it take for Oot to be conveniently programmable on a smartphone? well, the first issue is that smartphone keyboards are too annoying to type on. i think the root of the problem is that 26 letters are too many to fit on a keyboard that one thumb can reach while holding the phone in one hand. Maybe 16 letters would be better. Also, reduce from 10 digits to 6 (base 6 instead of base 10).
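
To get a feel for what base 6 numerals would look like, here is a small conversion sketch in Python. The function names `to_base6` and `from_base6` are made up for this note, and nothing here is a decided Oot feature.

```python
def to_base6(n):
    """Return the base-6 digit string for a non-negative integer."""
    if n < 0:
        raise ValueError("only non-negative integers are handled in this sketch")
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        n, remainder = divmod(n, 6)  # peel off the lowest base-6 digit
        digits.append(str(remainder))
    return "".join(reversed(digits))


def from_base6(s):
    """Parse a base-6 digit string back into an integer."""
    return int(s, 6)  # the built-in int accepts any base from 2 to 36


ten = to_base6(10)       # "14"  (1*6 + 4)
hundred = to_base6(100)  # "244" (2*36 + 4*6 + 4)
```

The cost of the smaller keypad is longer numerals: a number needs roughly log(10)/log(6), or about 1.29 times, as many digits in base 6 as in base 10.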

here are the world's most common sounds:

http://web.phonetik.uni-frankfurt.de/upsid_info.html

from my conLangs.txt, the intersection of IPA phonemes in English, Spanish, and Chinese is:

(17 total): a e f i j k l m n nu o p s t u w x

---