Table of Contents for Programming Languages: a survey

Natural languages, knowledge representation, logic, semantics, and ontology

This part covers other ways of representing information outside of programming languages.

It may seem odd to put information about non-programming languages and philosophy in a book about programming languages, but perhaps they can serve as inspiration.

Introductions / basic definitions

Syntactical structure of logic

In mathematics, a symbol is the type of thing out of which strings are built. A set of formal symbols is referred to as an alphabet.

An expression is a finite string of symbols that is syntactically valid.

Terms are expressions that represent objects from the domain of discourse [1].

well-formed formula, abbreviated 'wff' or 'formula', is string that is a syntactically valid and that can semantically stand on its own, except that it may have variables.

An atomic formula is a wff that has no strict subwffs [2]. Typically, an atomic formula is built by applying a predicate to a term (or more generally, applying a relation to a set of terms).

Non-atomic formulas are built from atomic formulas by applying logical connectives nor quantifiers.

A sentence is a wff with no free variables. Typically, in logic, sentences are interpreted as propositions, that is, things for which it makes sense to ask "is it true?" well-formed formula.

Some but not all of the text in this section consists of quotes from the indicated links.

Syntactical structure of natural languages

In linguistics, a grapheme is roughly equivalent to what in computing is called a character.

A morpheme is a component of a word, for example 'jump' or 'ed'. A morpheme which cannot stand alone as a word itself is called a bound morpheme. I don't understand the distinction between affix and bound morpheme.

A lexeme is the set of forms taken by a single word; for example, 'run', 'runs', 'ran' and 'running' are forms of the same lexeme, conventionally written as RUN. The term 'lexical' means relating to words. A lexicon is a language's inventory of lexemes and bound morphemes.

A word is the smallest element that may be uttered in isolation, for example 'jumped'. A role that a word plays in its parent sentence/phrase/clause is a part-of-speech.

A phrases is a single word or group of words that functions as a constituent (unit) in the syntactical structure of the sentence; for example the noun phrase 'the orange bird'. A role that a phrase plays in its parent sentence/phrase/clause is a "phrasal category".

A clause is the smallest grammatical unit that can express a complete proposition. A typical clause consists of a subject and a verb phrase. A phrase appears within a clause (although it is also possible for a phrase to be a clause or to contain a clause within it). Many clauses can stand alone as sentences (those that cannot are called subordinate clauses).

Clause(s) go together into sentences.

Various items from the above can be placed in various grammatical categories such as tense, gender, mood.

(tangentially, a phoneme is a concept for spoken language that corresponds to a grapheme in written language (i do not mean that each phoneme has a corresponding grapheme, just that the role of phonemes within the conceptual structure of linguistics is similar to the role of graphemes). Note that a character has different visual representations in different fonts; an individual font's representation of a given character is called a 'glyph'. Different glyphs which represent the same graphemes are called allographs (and different sounds ('phones') that represent the same phoneme are called allophones) (see also [3]).)

Some but not all of the text in this section consists of quotes from the indicated links.

Semiotics

Semiotics is the study of things like signs, symbols, interpretations, and meaning.

Saussure's theory of semiotics has two things involved in the sign relation, the signifier, and its meaning (the signified). The 'signifier' is something that we perceive that makes us think of something, for instance, the string of letters "c a t". The meaning of a sign (the 'signified') is the concept that appears in our mind when we perceive the sign; this is distinct from the object out in the world, if any, that this concept represents (the 'referent'). Saussure used the term 'sign' to encompass the pair of the signifier and the signified. "According to him, language is made up of signs and every sign has two sides (like a coin or a sheet of paper, both sides of which are inseparable)" -- [4].

Pierce's theory (see also [5] [6]) is triadic; the 'sign' is (roughly) something that makes us think of something; the 'object' is (roughly) the thing to which the sign refers, and the 'interpretant' is the thing that is perceiving the sign and interpreting it as referring to the object. Pierce, however, attempted to define this in a more general way that did not rely upon the concept of human thought; he said, "a sign is something, A, which brings something, B, its interpretant sign determined or created by it, into the same sort of correspondence with something, C, its object, as that in which itself stands to C." In other words, when we see the word 'cat', (part of) our mind is itself turned into a sign for cats. Watch out, I may be totally misunderstanding this.

Some but not all of the text in this section consists of quotes from the indicated links.

Vocabulary differences in linguistics, logic, programming language implementation, and semiotics

There are some words that are have conflicting meanings in these domains, and also some different words that mean similar or analogous things.

In computer science, a word usually means an architecture-specific fixed-size group of digits (usually binary bits) that are processed as a unit by the hardware. Computer science's 'word' is related to the linguistic definition, in that a computer's hardware often cannot efficiently store and process data smaller than its word size.

The logic and computer science term syntax (see also [7]) means the set of rules for correctly structured documents without regard to any interpretation or meaning given to them. In computer science, 'grammar' refers to a set of production rules for strings. In linguistics, 'grammar' is a set of structural rules governing composition, and 'syntax' is the subset of grammar governing linguistic structure above the word level [8] (morphology is linguistic structure at the word level and below]).

In programming languages, 'lexeme' is sometimes used to mean a string of characters which forms a syntactic unit. Note that this conflicts with the linguistic definition of lexeme; instead, it roughly corresponds with a linguistic word, although it could also correspond to a (bound) morpheme.

In programming languages, 'lexemes' are categorized. A token is a structure representating a lexeme that explicitly indicates its categorization for the purpose of parsing. The categories of tokens are roughly equivalent to linguistic parts-of-speech.

Note that a programming language 'token' is not the same as a token-ring token, a security token, etc.

In computers, 'text' is a type of value, usually used interchangably with 'string'; in literary theory, it can mean any object that can be "read".

In mathematics and often in programming languages, 'composition' is shorthand for 'function composition', which is when one a function is defined as the result of applying a second function to the output of a third function. In general use, 'composition' can also mean 'a work of art' or it can mean 'what something is made of' or 'the way in which something is made up'.

In philosophy, a token is an instance of a type.

A logical 'term' corresponds to a linguistic noun phrase, and a logical sentence corresponds to a linguistic sentence (although these are not synonyms; i just mean that their role within the larger theories are analogous).

Some but not all of the text in this section consists of quotes from the indicated links.

Core languages and universality in natural language and semantics

https://en.wikipedia.org/wiki/Semantic_primes#List_of_semantic_primes

https://en.wikipedia.org/wiki/Cultural_universal

https://en.wikipedia.org/wiki/Natural_semantic_metalanguage#Semantic_primitives

https://en.wikipedia.org/wiki/Thematic_relation#Major_thematic_relations

[9], sections "Use easy words" and "Other basic/simplified/controlled languages"

Chinese characters in order of frequency, with English translations : http://www.zein.se/patrick/3000char.html
Chart of 800 Chinese characters common to Chinese, Japanese, Korean ("Draft Chart of Most Commonly-Used 800 Chinese Characters among the three countries" without English translation; see http://web.archive.org/web/20140819001044/http://www.korea.net/NewsFocus/Culture/view?articleId=110141 ): http://web.archive.org/web/20140819001044/http://www.korea.net/koreanet/fileDown?fileUrl=/upload/content/file/1374058738374.pdf
International Ideographs Core (IICORE), a core of 9810 common Chinese characters for partial implementation of Chinese ideographs on devices with limited capability

Yerkish, a constructed language for communication with non-human primates

https://en.wikipedia.org/wiki/Energy_Systems_Language

https://en.wikipedia.org/wiki/IConji

https://en.wikipedia.org/wiki/Blissymbols

https://en.wikipedia.org/wiki/Unicode_symbols#Symbol_block_list

http://talkdifferent.com/en/talkdifferent/

reactions gifs:

http://jezebel.com/5664134/the-comprehensive-gif-glossary

Brown's animal taxonomy

Brown's fish, snake, bird, wug (worm+bug), mammal:

Brown's plant taxonomy

Brown's tree, grass, bush, vine, herbaceous plant (and 'grerb' for the union of 'herbaceous plant' and 'grass'):

"For plants, (Brown thinks) the potentially universal life-forms are tree, grass, bush, vine, and herbaceous plant (Brown uses "herbaceous plant" to refer to the covert category of "small, soft stem" plants...). Sometimes "herbaceous plant" and "grass" form a joint category, which Brown dubs a "grerb". -- The Development of Cognitive Anthropology By Roy G. D'Andrade

Links that mention Brown

possible sources of other similar work:

The Leopard's Spots: Essays on Language, Cognition and Culture By Gerrit Dimmendaal
Psycholinguistics: Psychology, linguistics, and the study of natural language By Joseph F. Kess
Folk-taxonomies in Early English By Earl R. Anderson including chapters: The Study of Color Terms across ..., Reconstruction of Color Taxonomi..., Basic Color Terms in Germanic an..., Basic Color Terms in Middle and ..., The Seasons of the Year, Geometric Shapes, The Five Senses, The FolkPsychology? of Mind and Soul, Plant Lifeforms, Animal Lifeforms, Taxonomies and the Problem of Un...,
TREES, GRERBS, WUGS, SNURMS AND QUAMMALS notes that Brown has also worked on the cross-cultural semantics of color, anatomy, orientation, and time.

More 'universal' noun taxonomies

Constructed noun taxonomies

http://mw.lojban.org/papri/semantic_categorization_of_gismu
- discussion of lojban: https://news.ycombinator.com/item?id=9629466
- how the vocabulary of base words was chosen:
  - https://lojban.github.io/cll/4/14/
  - http://wiw.org/~jkominek/lojban/9105/msg00030.html

Kortmann's adverbial relations

list of 16 adverbial subordinators, taken from Helen Eaton's 1000 most frequently used items [10]:

time:

simultaneity overlap 'when',
simultaneity duration 'while',
simultaneity co-extensiveness 'as long as'
anteriority 'after'
immediate anteriority 'as soon as'
terminus a quo 'since'
posteriority 'before'
terminus ad quem 'until'

modal:

similarity '(just) as, like'
proportion 'the... the'" special type of comparison: expression of proportionality or equivalence of tendency or degree between p and q; (C13) The faster he drove, the paler he grew.(C14) As he grew disheartened, (so) his work deteriorated"

ccc:

cause/reason 'because'
condition 'if'
concession 'although'
result 'so that'
purpose 'in order to' other:
place 'where'

later(?) he made a list of 32 [11]:

all of the above, plus:

time:

contingency 'in cases when p, q', 'whenever p, q'

modal:

Manner: German 'indem p, q' eg "My father stiffened, straightening his shoulders". " p and q relate to the same event and p specifies how the event is performed; different from Instrument, p and q cannot necessarily be converted into a Purpose relation, i.e. it is not the case that 'p in order to q'; "answers 'how'-questions; closely related to Similarity, Comparison, Comment/Accord and especially to Instrument; " eg German Er verbrachte den Nachmittag, indem er Gedichte las. 'he spent the afternoon reading poems'
Comment/Accord: 'as p, q' "this interclausal relation is very difficult to capture in a few words: p expresses the speaker's comment on the content of the matrix clause, typically with the aim of affirming the truth (and thus reliability) of q; e.g. p may identify the source of the speaker's information, or express agreement with somebody else's opinion; (C6) As (it) appears from her essay, she has read widely in English literature. (C7) As you said, George has no children. (C8) German Wie in den Nachrichten gemeldet, hat sich die Regierung zu einer Sondersitzung versammelt.'as reported in the news, the government assembled for a special meeting'"
Comparison: 'q, as if p', 'q, as though p' eg She treats me as if I am a stranger; He looks as if he's getting better
Instrument/Means: 'by p, q' eg By twisting her body sideways, she freed herself

ccc:

Negative Condition 'unless p, q', 'if not p, q',
Concessive condition: 'even if p, q'
Contrast: 'q, whereas p'
Negative Purpose: 'lest p, q', 'q, in order that not p'
Degree/Extent: 'q, to the extent that p'
Exception/Restriction: 'except that p, q', 'q, only that p'

other:

Substitution: 'instead of p, q'
Preference: 'rather than p, q'
Concomitance (CONCOM): 'q, wobei p' "alternative terms: accompanying circumstance, accompaniment - two separate situations p and q stand side by side; there is no indication of a specific semantic relation holding between p and q; all that needs to be secured is the unity of space and time; (D9) German Ich nahm das Buch vom Regal, wobei mir die Vase herunterfiel. 'I was taking the book from the shelf when the vase fell down' (D10) The man came in, holding his hat in one hand."
Negative Concomitance (NEG_COM): 'without p, q', German 'ohne daß p, q' eg He just pushes on with these things, without taking any advice; German Er ging an mir vorbei, ohne daß er mich grüßte. 'he went past me without greeting me' '
Addition (ADDI): 'besides p, q', 'in addition to p, q' eg Besides missing my bus, I forgot to bring my books.

[12] also contains potentially useful notes on the semantic constraints upon each of these (pages thru xiv-xix; pdf pages 14-19)

USL's three primitive control structures (some details, such as related to ordering and comprehensiveness, omitted): these are 3 ways for a parent function 'parent' to call two child functions A and B as subroutines:

composition/join/dependent: parent calls A with inputs, then feeds the output of A to B as input, then the parent returns the output of B
include/independent/class partition: parent passes some of its inputs as inputs to A, and passes others of its inputs as inputs to B, and then the parent returns the union of A's outputs and B's outputs
or/alternative/set partition: parent uses a third subfunction, "partition", to decide whether it wants to use A or B. It passes all its inputs to whichever one it chose (or possibly both of them), and then returns the outputs from the chosen subroutine.

Note that these appear to correspond to the programming language primitives of compositions, projections, and conditionals.

FMaps and TMaps: FMaps relate to dynamics, TMaps related to statics (eg containment of one object by another). "For example, in an FMap, an output variable of any function is fully traceable to all other functions that use the state that variable refers to." "FMaps are used for defining functions and their relationships to other functions using the types of objects in the TMap(s). Each function on an FMap has one or more objects as its input and one or more objects as its output. Each object resides in an object map¹ (OMap¹) and is a member of a type from a TMap. TMaps are used for defining types and their relationships to other types. Every type on a TMap owns a set of inherited primitive operations for its allowed FMap primitive functional relationships."

a "function is a hybrid consisting of a traditional mathematical construct, i.e., an operation (mapping) and a linguistic construct, i.e., an assignment of particular variables to inputs and outputs."

Six axioms:

1: "a given parent controls the invocation of the set of children on its immediate, and only its immediate lower level". Some implications: "it cannot invoke itself, its parent, any of its descendants other than its immediate offspring, any other offspring of its own parent, another parent’s offspring, or an offspring that invokes its parent; the children of each parent must collectively perform no more and no less than the parent’s requirements; e.g., if a function from a lower level is removed and its ancestor still maintains its same mapping, the function at the lower level is extraneous (extraneous functions proliferate test cases and complicate interfaces). "
2: "a given parent controls the responsibility for elements of only its own output space (codomain)". "Some implications are a parent loses control (cannot ensure correct outputs) when any of its offspring stop before completion, go into an endless loop or do not return required information back to the parent; the decomposition stopping point can be determined and the bottom is reached when each function has been defined in terms of other functions on a defined type; the functions’ behavior one level from the bottom can be defined by understanding the behavior of each function at the bottom level and how it relates to other functions on that level; one can define each next highest level function in the same manner until the top node is reached; the behavior of the top node is ultimately determined by the behavior of the collective set of bottom nodes; there may be more than one formulation for a particular function, it is only necessary that the mapping be identical"
3: "a given parent controls the output access rights (ability to alter the values of variables) to each set of variables whose values define the elements of the output space for each immediate, and only each immediate, lower level child. Axiom 3 is concerned with where the required range element (as produced by an offspring) is delivered as dictated by its parent. The parent can assign to its offspring the right to alter the values of the output variables of the parent’s own function that the offspring replaces. Implications are: each range variable of the parent that an offspring replaces, must appear as a range variable of the function of at least one of its offspring; tracing of outputs can be traced for each and every performance pass (i.e., instance by instance); the output variables at the parent are a subset of the output variables of the collective children. "
4: "a given parent controls the input access rights (ability to obtain the values of variables) to each set of variables whose values define the elements of the input space for each immediate, and only each immediate lower level child. Axiom 4 is concerned with the way the parent controls access to its domain elements; specifically the parent can grant its children the right to access its domain elements for reference only. Implications are: the parent does not have the ability to alter its domain elements; each domain variable of the parent must appear as a domain variable in at least one of its children; inputs can be traced for each and every performance pass"
5: " a given parent controls the rejection of invalid elements of its own, and only its own, input set (domain). Axiom 5 requires that the parent must ensure the rejection of inputs received that are not in the domain of the parent. A parent, in performing its corresponding function, is responsible for determining if such an element has been received, and, if so, it must ensure its rejection. "
6: "a given parent controls the ordering of each tree for the immediate, and only the immediate, lower level. Axiom 6 requires the parent to control the order (including priority) based on e.g., time, events, importance, and computational needs of the invocation of its children and their descendants. Implications are: total order relationships; if two processes are scheduled to execute concurrently, the priority of each process determines precedence at the time of execution; the priority of a process is higher than the priority of any process on its most immediate lower level; if two processes have the same parent, all processes in the control tree of the process with the highest priority are of a higher priority than all the processes in the control tree with the lower priority; a process cannot interrupt itself; a process cannot interrupt its parent. "

" Some implications of both axioms 3 and 4 are: the variables of the output set of a function cannot be the variables of the input set of that same function. If f(y, x) = y could exist, access to y would not be controlled by the parent at the next immediate higher level; the variables of the output set of one function can be the variables of the input set of another function only if the variables belong to functions on the same level. If f1(x) = y and f2(y) = g, both functions exist at the same level. "

" Other implications (derived theorems) of the axioms are: every object has a unique parent, is under control; and has a unique priority; communication of children is controlled by the parent, and dependent functions exist at the same level; the priority of an object is always higher than its dependents and totally ordered with respect to other objects at its own level. Relative timing between objects (including functions) is therefore preserved; maximum completion or delay time for a process is related to a given interrupt structure. Absolute timing can therefore be established (i.e., it can be determined if there is enough time to do the job); the relationships of each variable are predetermined, instance by instance, thus eliminating conflicts; each system has the property of single reference/single assignment. SOOs can therefore be defined independent of execution order; the nodal family (a parent and its children) does not know about (is independent of) its invokers or users; concurrent patterns can be automatically detected; every system is event driven (every input is an event; every output is an event; every function is event driven); and can be used to define discrete or continuous phenomenon; each object, and changes to it, is traceable; each object can be safely reconfigured; every system can ultimately be defined in terms of three primitive control structures, each of which is derived from the six axioms—a universal semantics, therefore, exists for defining systems. "

Conlangs based on lexical classifications or ontologies or combinations of relatively small numbers of primitives

Related notes from the cognitive development of young children

" In the early 1990s, Wynn (1990, 1992) first reported that children learn the meanings of cardinal number words one at a time and in order. Wynn showed this using the “Give-N” or “Give-a-number” task, in which she asked children to give her a certain number of items (e.g., “Give me one fish”; “Give me three fish,” etc.). She found that children’s performance moved through a predictable series of levels. At the earliest (“pre-number-knower”) level, children do not distinguish among the different number words. Pre-number knowers might give one object for every number requested, or they might give a handful of objects for every number, but they show no sign of knowing the exact meaning of any number word. At the next level (called the “one-knower” level), children know that “one” means 1. On the Give-N task, one-knowers give exactly one object when asked for “one,” and they give two or more objects when asked for any other number. After this comes the “two-knower” level, where children give one object for “one,” and two objects for “two,” but do not reliably produce larger sets. This is followed by a “three-knower” level and (although Wynn didn’t find it because she never asked children for four objects) a “four-knower” level. After the four-knower level, children seem to learn the meanings of the higher cardinal number words in a different way-inferring their meanings from their place in the counting list rather than learning them individually as they did with the small numbers (Carey, 2009). Children who have done this (i.e., who have figured out how the counting system represents cardinal numbers) are called “Cardinal-principle knowers.” " -- Barbara W. Sarnecka. On the relation between grammatical number and cardinal numbers in development

unsorted

https://en.wikipedia.org/wiki/Grammatical_category

https://en.wikipedia.org/wiki/Jakobson%27s_functions_of_language

https://en.wikipedia.org/wiki/Attempto_Controlled_English

todo: move some content from [14] to here

Natural language

To be

"In English, 'to be' can have different functions:

    It talks about Identity: The cat is my only pet, The cat is Garfield
    It talks about belonging to a class, or a group: The cat is an animal
    It can talk about properties: The cat is furry
    It can be an auxiliary verb: The cat is sleeping, The cat is bitten by the dog
    It can talk about existence: There is a cat
    It can talk about location: The cat is here" -- http://simple.wikipedia.org/wiki/E_Prime

Lists of lexical categories / parts of speech

https://en.wikipedia.org/wiki/Syntactic_category#Lexical_categories_vs._phrasal_categories :

Lexical categories:

adjective (A) (open)
adposition (preposition, postposition, circumposition) (P)
adverb (Adv) (open)
coordinate conjunction (C)
determiner (D) (articles, quantifiers, demonstrative adjectives, and possessive adjectives)
interjection (I) (open)
noun (N) (open)
particle (Par)
pronoun (Pr)
subordinate conjunction (S)
verb (V) (open)

In the above, classes which are usually open according to the table at [15] are listed with '(open)'; all others are listed as 'usually closed' in that table.

The list of the nine parts-of-speech listed at https://en.wikipedia.org/wiki/Part_of_speech#Parts_of_speech_in_English correspond to this list, except that some categories on the above list are more general (eg 'determiner' is a generalization of 'article'; 'adposition' is a generalization of preposition), and also the above list breaks conjunctions into coordinate and subordinate, and has an additional category "particle". Definitions for many of the items on the above list may be found there.

The (union of the) lists of Dionysius Thrax and Priscian in https://en.wikipedia.org/wiki/Part_of_speech#History roughly correspond to the above, except that they have the additional category 'participle' (a part of speech sharing features of the verb and the noun), and they do not have adjectives (which were grouped with nouns) or particles.

The list found inside the picture at https://en.wikipedia.org/wiki/Part_of_speech#Functional_classification roughly corresponds to the above list, with a phrasal category for each lexical category, except that the phrasal category 'clause' has in addition been added, and pronouns and adverbs have been left out (presumably they are found within noun and verb phrases, respectively), and particles have also been left out, and interjections don't have a phrasal category.

https://en.wikipedia.org/wiki/Part_of_speech#Functional_classification contains a list of common lexical categories defined by function. This list corresponds to the above list, except that coordinate and subordinate conjunctions are combined into one category, and that this list contains or breaks out the new categories auxiliary verbs, clitics, coverbs, measure words or classifiers, preverbs, contractions, cardinal numbers, all of which are listed under 'usually closed classes'.

You may be wondering what some of these classes are. I assume that you probably already know what nouns, verbs, pronouns, adjectives, and prepositions are. Here are the others from the above list:

adpositions are a generalization of preposition
adverb: a modifier to a verb, just like adjectives modify nouns. For example, "quickly" ("He ran quickly.")
coordinate conjunction: "conjunctions that join, or coordinate, two or more items (such as words, main clauses, or sentences) of equal syntactic importance". For example, 'and' ("I saw Bob and Mary").
determiner: Occurs with a noun (or noun phrase) and expresses what that noun refers to.
- articles: eg the, a, an. "Articles specify grammatical definiteness of the noun, in some languages extending to volume or numerical scope." -- [16]
- demonstratives: eg this, that, these, those
- quantifiers: eg all, some, several, many, few, a lot, no
- possessive determiners: eg my, your, his, her, its, our, their (but not 'mine' and 'ours' that are used as possessive pronouns and not as determiners)
interjection: eg oh, hi
particle: "a function word that must be associated with another word or phrase to impart meaning, i.e., does not have its own lexical definition...Particles are typically words that encode grammatical categories (such as negation, mood, tense, or case), clitics, or fillers or (oral) discourse markers such as well, um, etc. Particles are never inflected." -- [17]. All of the examples given in https://en.wikipedia.org/wiki/Grammatical_particle#English are words that can be considered either in other classes (such as the negator 'not', which could also be considered an adverb) or as an integral 'part of' other words, such as the infinitive 'to', which could also be considered as "an integral part of the infinitive form of the verb"
subordinate conjunction: conjunctions that join an independent clause and a dependent clause (a clause of inferior rank), and also introduce adverb clauses. For example, "He went home after he had dined".

And here are the others from the other lists (excluding contractions, because i assume you already know what contractions are)

clause: this is a phrase type, not a lexical category as everything else in this list (ie a claus is not a part of speech; it is not a type of word; rather, it is a type of phrase). A typical clause consists of a subject and a predicate.
participle: "A participle is a form of a verb that is used in a sentence to modify a noun, noun phrase, verb or verb phrase, and thus plays a role similar to that of an adjective or adverb". Eg "The man sitting over there is my uncle." (adjective-like); "He shot the man, killing him." (adverb-like); "The chicken was eating." ('eating' is a 'past participles' that here joins with 'was' to form the passive voice version of the verb 'to eat')
auxiliary verb: "An auxiliary verb is a verb that adds functional or grammatical meaning to the clause in which it appears—for example, to express tense, aspect, modality, voice, emphasis, etc. Auxiliary verbs usually accompany a main verb. The main verb provides the main semantic content of the clause.[1] An example is the verb have in the sentence I have finished my dinner. "
clitic: "a morpheme that has syntactic characteristics of a word, but depends phonologically on another word or phrase....For example, the English possessive ’s is a clitic in the phrase the king of England's horse: It looks like a suffix, but its position at the end of "the king of England" rather than on "king" is like that of a separate word."
coverb: "a word or prefix that in some way resembles a verb or operates together with a verb." That Wikipedia page gives no English examples.
measure word or classifier: "accompanies a noun in certain grammatical contexts, and generally reflects some kind of conceptual classification of nouns, based principally on features of their referents." i dont think these exist in English but an example is given in the translation of "three rivers" to Mandarin Chinese: "三条河 (三條河) sān tiáo hé, literally "three [long-wavy-classifier] river"
preverb: "not widely accepted in linguistics...certain elements prefixed to verbs"
cardinal number: this is just a number word, for example "three"; the 'cardinal' is just to distinguish it from 'ordinal number', for example 'third'. Cardinal numbers are often used as determiners (quantifier determiners), eg "Ten people danced."

Here are common examples of some of these classes (i give a citation only when the source claims these are the most common):

adposition: on, in, to, by, for, with, at, of, from, as, according to [18] and[19])
coordinate conjunction: for, and, nor, but, or, yet, so
subordinate conjunctions: after, although, as, as far as, as if, as long as, as soon as, as though, because, before, even if, even though, every time, if, in order that, since, so, so that, than, though, unless, until, when, whenever, where, whereas, wherever, while
determiner
- articles: the, a, an
- demonstratives: this, that, these, those
- possessive determiners: my, your, his, her, its, our, their (but not 'mine' and 'ours' that are used as possessive pronouns and not as determiners)
- quantifiers: all, some, several, many, few, a lot, no
numbers:
- https://en.wikipedia.org/wiki/Benford%27s_law
- S.N. Dorogovtsev, J.F.F. Mendes, J.G. Oliveira. Frequency of occurrence of numbers in the World Wide Web
todo: others?

Other classes/subclasses sometimes mentioned:

conjuncts (mentioned in http://www.thestar.com.my/Lifestyle/Features/2010/08/27/Connecting-with-connectives/ ); eg 'first' in "First, he dug a hole".
copula (mentioned in http://www.thestar.com.my/Lifestyle/Features/2010/08/27/Connecting-with-connectives/ ); special case of verb where the object is describing a property or equational identity of the subject eg "Anne is a seamstress", "She is called Anne", "This lemon is yellow", "This food tastes bad", "This food has turned bad".

Subject, object, indirect object, etc

Is there a list of things like that? I dont know of one, but i havent really looked. Possibly related/toread:

https://en.wikipedia.org/wiki/Argument_%28linguistics%29 https://en.wikipedia.org/wiki/Complement_%28linguistics%29 https://en.wikipedia.org/wiki/Object_%28grammar%29 https://en.wikipedia.org/wiki/Object_%28grammar%29#Types_of_objects https://en.wikipedia.org/wiki/Predicate_%28grammar%29 https://en.wikipedia.org/wiki/Subject_%28grammar%29 https://en.wikipedia.org/wiki/Topic%E2%80%93comment

Other grammatical categories and lists of other grammatical categories

Ontologies and the organization of knowledge

The word 'ontology' has two related but distinct meanings. One, the older meaning is the philosophical topic concerned with the nature of existence; it concerns questions such as, what sorts of things exist? The second, more recent meaning is the computer science and information science topic concerned with the semantic structure of relationships between concepts; it concerns questions such as, how shall we organize knowledge?

These are related because, for example, a philosophical proposal for a division of things that exist into fundamental types can inspire a similar type structure for the organization of conceptual knowledge.

Here we present material related to both definitions.

Philosophical categories

"The Categories (Latin: Categoriae) introduces Aristotle's 10-fold classification of that which exists: substance, quantity, quality, relation, place, time, situation, condition, action, and passion."

" The list given by the schoolmen and generally adopted by modern logicians is based on the original fivefold classification given by Aristotle (Topics, a iv. 101 b 17-25): definition (horos), genus (genos), differentia (diaphora), property (idion), accident (sumbebekos). The scholastic classification, obtained from Boëthius's Latin version of Porphyry's Isagoge, modified Aristotle's by substituting species (eidos) for definition. Both classifications are of universals, concepts or general terms, proper names of course being excluded. There is, however, a radical difference between the two systems. The standpoint of the Aristotelian classification is the predication of one universal concerning another. The Porphyrian, by introducing species, deals with the predication of universals concerning individuals (for species is necessarily predicated of the individual), and thus created difficulties from which the Aristotelian is free (see below).

The Aristotelian classification may be briefly explained:

" The definition of anything is the statement of its essence (Arist. τὸ τί ᾖν εἶναι), i.e. that which makes it what it is: e.g. a triangle is a three-sided rectilineal figure.

Genus is that part of the essence which is also predicable of other things different from them in kind. A triangle is a rectilineal figure; i.e. in fixing the genus of a thing, we subsume it under a higher universal, of which it is a species.

Differentia is that part of the essence which distinguishes one species from another. As compared with quadrilaterals, hexagons and so on, all of which are rectilineal figures, a triangle is differentiated as having three sides.

A property is an attribute which is common to all the members of a class, but is not part of its essence (i.e. need not be given in its definition). The fact that the interior angles of all triangles are equal to two right angles is not part of the definition, but is universally true.

An accident is an attribute which may or may not belong to a subject. The color of the human hair is an accident, for it belongs in no way to the essence of humanity. "

This classification, though it is of high value in the clearing up of our conceptions of the essential contrasted with the accidental, the relation of genus, differentia and definition and so forth, is of more significance in connection with abstract sciences, especially mathematics, than for the physical sciences. It is superior on the whole to the Porphyrian scheme, which has grave defects. As has been said, it classifies universals as predicates of individuals and thus involves the difficulties which gave rise to the controversy between realism and nominalism. How are we to distinguish species from genus? Napoleon was a Frenchman, a man, an animal. In the second place how do we distinguish property and accident? Many so-called accidents are predicable necessarily of any particular persons. This difficulty gave rise to the distinction of separable and inseparable accidents, which is one of considerable difficulty.

" -- https://en.wikipedia.org/wiki/Predicable

https://en.wikipedia.org/wiki/Categories_%28Aristotle%29

Computer ontologies

[Self-notes-cog-ai-reasoning-commonsense]
https://en.wikipedia.org/wiki/Classification_scheme
https://en.wikipedia.org/wiki/Dublin_Core
https://www.wikidata.org/wiki/Special:ListDatatypes
https://www.wikidata.org/wiki/Wikidata:Glossary
http://schema.org/
http://schema.org/docs/full.html

Some relationships between components of ontologies

from fig. 2.1 of An Introduction to Ontologies and Ontology Engineering by Catherine Roussey, Francois Pinet, Myoung Ah Kang, and Oscar Corcho:

that figure in words (the vocab used here is itself of interest):

A Property has a name (which is of type (isa) Term) Concepts may have Properties Concepts may have a logical definition Concepts may have a textual (informal) definition Instances are examples of a Concept; one Concept may be instantiated in multiple Instances A Concept has a label (which is of type (isa) Term) An Instance has an ID (which is of type (isa) Term) A Relation has a name (which is of type (isa) Term) A Semantic Relation is a subtype of a Relation A Semantic Relation has arguments (which are of type (isa) Concept) An Instance Relation is a subtype of a Relation An Instance Relation has arguments (which are of type (isa) Instance) An Instance Relation is an instance of a Semantic Relation A Terminological Relation is a subtype of a Relation A Terminological Relation has arguments (which are of type (isa) Term)

todo transcribe that figure into words

Logic and semantics

https://en.wikipedia.org/wiki/Organon

https://en.wikipedia.org/wiki/Problem_of_future_contingents

https://en.wikipedia.org/wiki/Square_of_opposition

Kantian logic

Union sum types, xor, Kantian Disjunctive judgements, Kantian category of Community

Kant's notion of Disjunctive judgements relates to the logical operation of xor, and his corresponding category of Community relates to union sum types, that is, to types which represent a list of other types, and you know that a value whose type is the union sum type must be an instance of exactly one of those types on the list, but you don't know which one.

In other words, the n-ary xor of the predicates isInstance(x, t_i), where i ranges over the types on the list, and where x is any value whose type is the union sum type of that list, is True.

(in some languages, the equivalence between this and union sum types is only true if the types on the list don't overlap, eg there is no value that isInstance of more than one of those types; but i am talking about union sum types in which the cases are defined with respect to the sum type, for example data declarations in Haskell, in which you define the constructors for a datatype as mutually exclusive cases of that datatype; for example, defining a list as either (a) an empty list, or (b) cons(something, another list); in this situation the cases are mutually exclusive)

I may be totally misinterpreting this.

Unsorted

the 8 parts of speech recognized in English (noun, verb, adjective, adverb, pronoun, preposition, conjunction, and interjection)
https://en.wikipedia.org/wiki/Word_class#Functional_classification
https://en.wikipedia.org/wiki/Closed_word_classes , noting "Different languages have different word classes as open class and closed class – for example, in English, pronouns are closed class and verbs are open class (see for example the contentious topic of gender-neutral pronouns in English and how common verbing is), while in Japanese, pronouns are open class, while verbs are closed class – to form a new verb, one suffixes 〜する (-suru, "to do") to a noun – for example, "to exercise" is 運動する – "to do exercise"."
https://en.wikipedia.org/wiki/Adposition#Semantic_classification

http://en.m.wikipedia.org/wiki/Synthetic_language vs. https://en.wikipedia.org/wiki/Analytic_language (this distinction is only about inflectional morphemes; see also https://en.wikipedia.org/wiki/Isolating_language , which takes into account derivational morphemes as well as inflectional morphemes)

https://en.wikipedia.org/wiki/Template:Computable_knowledge

proj-plbook-plPartNaturalLanguage

Natural languages, knowledge representation, logic, semantics, and ontology

Introductions / basic definitions

Syntactical structure of logic

Syntactical structure of natural languages

Semiotics

Vocabulary differences in linguistics, logic, programming language implementation, and semiotics

Core languages and universality in natural language and semantics

Brown's animal taxonomy

Brown's plant taxonomy

Links that mention Brown

More 'universal' noun taxonomies

Constructed noun taxonomies

Kortmann's adverbial relations

Random interesting searches

Words with low age of acquisition

Universal Systems Language

Conlangs based on lexical classifications or ontologies or combinations of relatively small numbers of primitives

Related notes from the cognitive development of young children

unsorted

Natural language

To be

Lists of lexical categories / parts of speech

Subject, object, indirect object, etc

Other grammatical categories and lists of other grammatical categories

Ontologies and the organization of knowledge

Philosophical categories

Computer ontologies

Some relationships between components of ontologies

Logic and semantics

Kantian logic

Union sum types, xor, Kantian Disjunctive judgements, Kantian category of Community

Unsorted