# books-someStuffYouMightLikeToKnow

## not written yet...!!

NOTE: THESE ARE JUST NOTES, PROBABLY FULL OF ERRORS!!

if you see an error, please email me to let me know, thanks

## Introduction

### A priori vs. a posteriori

"A priori" knowledge is that which we can be sure of without any empirical evidence, where empirical evidence means evidence from our senses. So, "a priori" knowledge is stuff that you could figure out if you were just a mind floating in a void unconnected to a body. "A posteriori" or "empirical" knowledge is stuff that you would need empirical evidence for to believe.

### Theory

In many fields, some people spend some time working on "theory", and consider this separate from "applied" work or "experimental" work. There are various definitions of "theory", but one is that you are working on theory when your goal is to understand something. This is as opposed to application, where your goal is to solve some problem, to produce something or to produce a method to do something, and to experiment, where your goal is to collect empirical data.

If theory is defined as to understand something, it explains some of the curious properties of theoretical work. When you work on theory, you don't know what you are going to discover, if you will discover anything at all, or how long it would take, because in order to predict these things you'd have to understand the domain that you are working on, but if you already understood it, you wouldn't have to do theory in the first place.

(thanks to R.F. for this idea)

Another aspect of "theory" is its connection with a priori reasoning. If a scientist or engineer says they are doing work in theory, generally they mean that they are spending much of their time deriving a set of logical consequences from assumptions. The assumptions might not be a priori, but given the assumptions, the consequences are. Either or both of the choice of assumptions, or the results sought from derivation, are usually dependent on empirical evidence or contingent circumstances, however. So a scientist might work on theory for awhile and then say, "From experimental evidence, we see that X and Y and Z. From a priori reasoning, I can show that if W and X and Y are true, then Z is implied; and further, Q is implied. Neither W and Q have yet been tested experimentally, but both could be". The fact that Z and Q are implied by W and X and Y might be surprising or at least non-obvious, and the derivation of Z from X and Y might take a lot of work, which is why this is an activity in itself.

I don't mean to say that, in producing such a theory, a theorist would actually start by picking out beforehand a W and an X and a Y and a Z and tries to find a derivation of Z from X and Y and W. This sometimes happens, but often, a theorist would just start making random somewhat plausible assumptions and playing around with them formally to see what turns up, being attentive for things that are already empirically known facts and also for testable hypotheses. After they find some interesting-seeming structure, they might try to change things around until they get some assumptions and/or conclusions that are more firmly grounded in experiment, and/or that can be feasibly tested. Let me repeat that for emphasis; the theorists that i know do not think carefully about exactly what they want to prove from exactly which assumptions and then try really hard to prove it; they start by (1) observing unexplained but quantified (or at least formally phrased) regularities in data, (2) haphazardly making up and just as quickly discarding plausible formal assumptions and fooling around with them to see if derivations starting from those have any interesting structure. Only after they discover a promising start do they go back and think about exactly what the assumptions and conclusions should be and work hard at adjusting their derivations to find a proof for those.

Note that there is another, more concrete definition of 'a theory' used in mathematical logic (namely, the set of consequences of a given set of axioms).

### Fields of study

There are various ways of defining fields of study.

One way is by topic. Chemistry studies chemicals, biology studies living things, physics studies the laws underlying the physical world, math studies numbers, literature studies texts, history studies things that happened in the past involving humans, etc.

Another is by discipline, that is, what methods each field denies itself when producing knowledge. Math does not allow anything which is not proven. The sciences do not allow anything without empirical evidence. The experimental sciences do not allow anything which has not been confirmed by experiment. Etc.

Another is by community. Chemistry is the community of researchers in chemistry departments, their students, etc.

The terms field, topic, discipline are used interchangably but this sometimes leads to arguments. For example, "interdisciplinary" topics seem like a good thing from the "topic" and "community" perspective (there are topics which are 'in between' the well known fields, so if you want to study them you end up working with people employed in multiple fields), but seem like a bad thing from the 'discipline' perspective (imagine for example an interdisciplinary branch of mathematical physics which, like physicists, accepts as results derivations which are not sufficiently rigorous to be mathematical proofs, and which, like mathematicians, accepts as results "not even wrong" theories consisting of interesting arrangements of formal machinery which cannot make empirical predictions distinct from preexisting theories. Such a community can be said to have no discipline!).

### Types of topics

One sort of type of topic is logic. Logic is the investigation of universally certain, formal reasoning, where 'formal reasoning' means that we reason only on the basis of the form (e.g. the grammatical form) of statements, not on their content, and 'certain' means that, insofar as we are reasoning logically, we only want to reason using methods such that, if we could follow the method without error, then we could be absolutely certain of our conclusions, as opposed to merely 'beyond a reasonable doubt', and 'universally certain' means that in any situation in which a reasoning being could possibly exist, the methods of logic would still yield certain results.

For example, the statement "if A is true, and if it is true that A implies B, then B must be true" is of universally certain truth, and formal. Furthermore, the statement says that a certain method of reasoning, that is, to assume that B is true if you already know that A is true and that A implies B, is valid. So, this statement is both logically true itself, and also of particular interest to logic because it asserts the validity of a reasoning method.

If we expand our view from universally certain, formal reasoning to all universally certain, formal truth, we get the topic of rigorous mathematics, construed as truth about which one can write formal proofs. This is more a discipline than a topic; the discipline is that you may only use logical proofs.

If we expand our view to all universally certain truth, we get rigorous philosophy. At this point we admit patterns of reasoning such as, "I think, therefore I am," which has universally certain validity which seems to stem from their content rather than from their form.

Now we expand to universal truth. This covers much of the wide-ranging subject of philosophy. However there are parts of philosophy which are concerned with issues particular to humans, rather than applicable to any reasoning being.

Next we'll turn our attention to the experimental sciences. Relax universality to things which merely seem to be true everywhere in existence, but which seem to be contingently or empirically true, as opposed to necessarily true. By empirically we mean that if you were a mind floating around without a body and without any senses, you would not be able to convince yourself of these things; more precisely, you need empirical evidence from your senses in order to decide if these things seem to be true. We will also add the requirement, the requirement of experimental science: that we only consider falsifiable hypotheses, that is, hypotheses for which some experiment can be imagined which could in principal falsify the hypothesis.

The study of falsifiable hypotheses for truths which are thought to hold throughout existence is physics.

Part of physics is chemistry.

Biology has some content that is applicable to anywhere in this universe (for example, the theory of evolution), but for the most part it studies truths which are only applicable to specific types of life out of all of the theoretically possible types, namely those types which are known to exist or to have existed; and since we've only explored close to home, right now that restricts us to this solar system, and as of this writing, even to this planet.

Neuroscience is part of biology.

Geology is also for the most part confined to this planet.

At this point we turn to non-experimental sciences, that is, topics for which experiments cannot feasibly be done.

Economics is a mixture of mathematics, with an in-principal experimental science, with a non-experimental science (since no one really has the time or the money or the ethical license to do most of the experiments).

There are also 'hard sciences' and 'soft sciences'. Hard sciences are (my definition, not everyone thinks of it this way) those in which the objects of study can in principal be directly measured in a non-controversial fashion. Soft sciences are the others, those in which there is considerable disagreement on just what the objects of study really are and how to assess them. For example, in geology they study rocks. This is a hard science. You can weigh a rock, you can do chemistry on it to see what it's made of. It's clear what rocks are (a collection of molecules stuck together) and how they could be broken into well-defined parts (molecules).

By contrast, in psychology they study people's minds. People still argue about what a mind is. There are various things that are thought to be parts of a mind (thoughts, emotions, perceptions, memory, motivation, consciousness, personality) but it's arguable whether any such list is the best way to think of the composition of a mind, and no one can say if such a list is comprehensive. Because it's arguable what is being studied, assessment methods are also controversial; does X experiment really measure motiviation, or personality? Should we ask people about what they are thinking and feeling, or is that too subjective? How do we do experiments on emotions?

History is often considered not even a science because you can't actually observe the past, even in principal. It's topic is also even more provincial than biology; it only considers humans. It is also soft; although whether or not someone was in a certain place on a certain day is pretty objective, history also concerns itself with subjective issues like which countries were most powerful, and questions of why, such as, why did country X, which was so powerful in year Y, become less powerful afterwards?

Now we turn from investigation of truth to the creation of useful methods. We have engineering, which is the rearrangement of nature and the creation of artifacts using technology. Note that most branches of engineering are paired with a theoretical side, usually involving physics or mathematics or both, as well as an applied side; so we have things like the mathematical theory of computation and of algorithms twinned with the search for useful computing constructs, and we have the theory of linear signals and systems, and control theory, and information theory, twinned with the applications to electrical engineering. Like the hard sciences, the stuff being studied usually admits non-controversial definitions and measurements, and success or failure of work is often even less controversial than in the hard sciences. However, unlike science, the topic is not just what is actually true but the much wider domain of what we could hypothetically do that would be useful.

Finally, we have the humanities. Here we are still provincially restricting our attention to humans. Here we are again like the soft sciences in that the nature of what is being studied, and how to assess it, and how to assess success or failure of work, is itself controversial. However, despite this methodological difficulty the humanities are still worked in because the topics at hand are so important to us.

There are plenty of interdisciplinary areas too. There is cognitive studies, which includes bits of psychology, neuroscience, a.i., philosophy of mind, philosophy of rationality, metalogic, linguistics, and other stuff.

There is the design of programming languages which, in a manner similar to architecture, is part mathematical theory, part philosophy, part humanities, and part engineering (and possibly part art but i don't want to make anyone angry).

The definitions given in this section (and in the rest of this book) are not very carefully thought out. They are 'naive' or 'folk' definitions, as the philosophers say, and probably often wrong. Also i don't mean to indicate that the topics i mention here are the only or the most important ones.

## ontology

Ontology is the topic concerning the nature of being.

One popular question within ontology is, what are the classes of things that exist? (incidentally, if you like to think about and collect lists of classifications, you might want to look at my collection: systemsOfClassification).

A related thing which shares the same name is what i term a 'computer ontology' (which everyone else just calls an 'ontology'), which is a computer-readable formalization of a taxonomy of concepts, along with some computer-readable semantic information relating the concepts to each other and/or defining the concepts.

### Aristotle's

substance:

Primary substances are individual objects, e.g. Alice.

Secondary substances are the categories of individual elements, e.g. Human.

A species is the most specific sort of secondary substance (e.g. a human). A genus is a secondary substance which is more general than a species (e.g. a human is a kind of animal; so 'animal' is a species in relation to 'living substance' but a genus in relation to 'human'). A primary substance stands in an isa relationship to its species and genuses (genera). All secondary substances are either species or genera of some primary substance(s).

"Aristotle's account in Categories can, with some oversimplification, be expressed as follows. The primary substances are individual objects, and they can be contrasted with everything else—secondary substances and all other predicables—because they are not predicable of or attributable to anything else. Thus, Fido is a primary substance, and dog—the secondary substance—can be predicated of him. Fat, brown, and taller than Rover are also predicable of him, but in a rather different way from that in which dog is. Aristotle distinguishes between two kinds of predicables, namely those which are ‘said of’ objects and those which are ‘in’ objects. The interpretation of these expression is, as usually with Aristotelian cruxes, very controversial, but a useful way of looking at it is as follows. Dog is said of Fido because it characterizes him as a whole. Fat and the others are described as being in because they pick out a constituent feature that could be said to be, in a logical though not a physical sense, part of, or in him. Fido the individual is not attributable to any further thing at all....If substance did not exist it would be impossible for things in any of the other categories to exist. There could be no instances of properties if there were no substances to possess them." -- http://plato.stanford.edu/entries/substance/

Aristotle was big on analysis by considering what the definition of things was. Each species has a definition that distinguishes it from other things in its genus. However individuals (primary substances) don't have a logical definition.

Aristotle thought that definitions were just ways to differentiate a species from other species in its genera; therefore if there are genera which are not themselves species to any higher genera, then such genera can't be defined. Aristotle calls these things categories. He identifies ten categories:

"each [individual term] signifies either substance or quantity or qualification or a relative or where or when or being in a position or having or doing or being affected. To give a rough idea, ideas of substance are man, horse; of quantity: four foot, five foot; of qualification; white, grammatical; of a relative: double, half, larger; of where: in the Lyceum, in the market-place; of when: yesterday, last year; of being in a position: is-lying, is-sitting; of having: has-shoes-on, has-armour-on; of doing: cutting, burning; of being-affected: being-cut, being-burnt. (1b25 - 2a4)" -- http://plato.stanford.edu/entries/substance/

## qualia

"Qualia is a term used in philosophy to refer to individual instances of subjective, conscious experience....Examples of qualia are the pain of a headache, the taste of wine, or the perceived redness of an evening sky." -- http://en.wikipedia.org/wiki/Qualia

Some arguments for the existence of qualia as an ontologically separate sort of thing from the facts about the physics and biology of our brains:

"Mary the colour scientist knows all the physical facts about colour, including every physical fact about the experience of colour in other people, from the behavior a particular colour is likely to elicit to the specific sequence of neurological firings that register that a colour has been seen. However, she has been confined from birth to a room that is black and white, and is only allowed to observe the outside world through a black and white monitor. When she is allowed to leave the room, it must be admitted that she learns something about the colour red the first time she sees it — specifically, she learns what it is like to see that colour." -- Frank Jackson

"The inverted spectrum thought experiment, originally developed by John Locke,[7] invites us to imagine that we wake up one morning and find that for some unknown reason all the colors in the world have been inverted. Furthermore, we discover that no physical changes have occurred in our brains or bodies that would explain this phenomenon." -- http://en.wikipedia.org/wiki/Qualia#Arguments_for_the_existence_of_qualia

### zombies

Zombies are the idea of people who behave the same as everyone else, but who have no conscious experience.

## phenomenology

"Phenomenology is the study of structures of consciousness as experienced from the first-person point of view....The Oxford English Dictionary presents the following definition: “Phenomenology. a. The science of phenomena as distinct from being (ontology).... In its root meaning, then, phenomenology is the study of phenomena: literally, appearances as opposed to reality. This ancient distinction launched philosophy as we emerged from Plato's cave...How did philosophy move from a root concept of phenomena to the discipline of phenomenology?

Originally, in the 18th century, “phenomenology” meant the theory of appearances fundamental to empirical knowledge, especially sensory appearances.... By 1889 Franz Brentano used the term to characterize what he called “descriptive psychology”. From there Edmund Husserl took up the term for his new science of consciousness, and the rest is history. "

## natural kinds

Is there a sense in which a partitioning of the world into objects which puts all of the molecules comprising all of my cells, and nothing else, is put into one object, is a better division than one which puts most of the molecules in my torso in the same partition with a bunch of surrounding air molecules? Is there a sense in which a categorization of the world with a category for 'tree' is better than one in which there are categories like 'thing with bark and leaves and a prime number of branches under which is sitting a young girl with a book' and others like that, but no category for 'tree'?

## Occam's razor

"Entities must not be multiplied beyond necessity"

e.g. a simpler theory is better.

### Justifications

http://en.wikipedia.org/wiki/Solomonoff%27s_theory_of_inductive_inference

http://en.wikipedia.org/wiki/Occam%27s_razor#Mathematical

## kant

### Kant's book Critique of Pure Reason, and his philosophy of Trancendental idealism

The "Pure Reason" in the Critique of Pure Reason means that Kant will only use a priori reasoning. The Critique bit relates to the fact that the main point is just to define the limits of the topic what we can metaphysically know.

However, this does not reduce us to only empirical knowledge, nor does it reduce us to only logical and mathematical knowledge. We can also know things because of regularities in the necessary structure of any possible experience. For example, the fact that experiences must exist in time.

Kant defines 'idealism' as "the claim that there are none other than thinking beings; the other things that we believe we perceive in intuition are only representations in thinking beings, to which in fact no object existing outside these beings corresponds" -- Kant, Prolegomena to Any Future Metaphysics. He disagrees with this claim. So, his theory of "trancendental idealism" is NOT an idealist theory. That's why he added the word "trancendental" before idealism, to distinguish it (although imo the fact that the theory's name contains the word 'idealism' is very confusing and he should have called it something else). I think the word "transcendental" is intended to refer to the connection between our internal experience and the external world (things-in-themselves). In practice Kant's theory shares many similarities to idealism because, although formally he admits the existence of objects-in-themselves (e.g. independent of our minds), we can't know almost anything about them because almost everything we know either comes through our sense data, or is purely formal logical knowledge, or is a consequence of knowledge about the structure of the nature of experience itself, which is closely related to the structure of mind. Even today there are critics of Kant who call him an idealist or subjectivist, a charge which annoyed Kant greatly; in the Prolegomena To Any Future Metaphysics he goes on and on about how people think he's an idealist when actually he's the opposite.

I haven't read the Critique of Pure Reason, only the Prolegomena To Any Future Metaphysics, which Kant intended to be a summary of the Critique because the Critique was so long. However my feeling was that the Prolegomena, because it didn't progress methodically explaining every step, was probably even harder to read than the Critique would have been.

The Prolegomena To Any Future Metaphysics is so named because, since seeks to define the limits to what we can metaphysically know, in a sense it comes prior to actual contentful metaphysical discussion. A lot of the book is Kant trying to say what I just said, and some of it is Kant defining terms such as 'experience' and 'intuition', and some of it (mostly the Judgements and Categories and related parts) is Kant talking about what he thinks we can know a priori about the structure (some writers say 'anatomy') of the mind and experience, and some of it is Kant going thru various metaphysical questions and saying how his theory of Trancendental Idealism answers them.

The last bit is imo quite deflationary and the answer almost always turns out to be "we can't ever answer this question because it concerns things-in-themselves", which is similar to how nowadays scientific thought answers questions about God with, "That hypothesis is non-falsifiable and therefore we must remain agnostic", or with "By definition science doesn't deal with supernatural entities, so it cannot address that question". However in most cases Kant then goes on to reinterpret the language of the question to refer, not to things-in-themselves, but rather to questions concerning the nature of possible experiences, and then answers the rewritten questions. I find these rewritten questions to be uninteresting; afaict Kant's real answer to most of the questions is simply "we can't know" and he should have stopped there and avoided confusion (a friend told me that he once heard a theory that Kant tried to make the Critique deliberately obscure so that he wouldn't be punished by the Church for denying the possibility of a rational argument for the existence of God, but i've never found any evidence for that).

The whole book is very confusingly written, uses long phrases with complicated words where shorter ones would have done, and is very redundant. Kant could have used a good editor. However, from the perspective of cognitive studies, the definitions and the parts about the Judgements and Categories are relevant, as they are a sketch towards what in today's A.I. would be called a computer 'ontology' and a 'cognitive architecture' which is underappreciated in today's A.I. community.

### Categories

(all tables have four sections, arranged spatially as:

     1
2       3
4

even though i'll write then in columns)

"Logical table of judgements:

1. According to quantity Universal Particular Singular

2. According to quality Affirmative Negative Infinite

3. According to relation Categorical Hypothetical Disjunctive

4. According to modality Problematic Assertoric Apodictic "

"Trancendental table of concepts of the understanding:

1. According to quantity Unity (measure) Plurality (magnitude) Totality (the whole)

2. According to quality Reality Negation Limitation

3. According to relation Substance (Inherence and Substance) Cause (Causality and Dependence) Community

4. According to modality Possibility Existence Necessity "

The entries in these two tables correspond to each other. The nature of the correspondence is that in the former table we have a purely logical notion. In the latter table, we have regularity in the structure of experience, or of mind, which arises from the logical notion and the fact that our minds work rationally (i think Kant sort of defines 'mind' to be 'rational mind', and experience to be involved with the operation of a rational mind, so he rules out by definition the possibility of experiences of illogic).

However, there is some argument about the whether Kant actually meant the items in the table to line up with one another. E.g. in "Kant's Anatomy of the Intelligent Mind", opines "Since singular judgement is listed third under the moments of quantity in the table of categories, many assume that Kant traced unity not to singular but to universal judgement. But it is difficult to make sense of the notion that unity corresponds to universal logical form and totality to singular. Kant's trichotomous ordering of the categories under each heading seems to be due not primarily to the sequence of the corresponding functions of judgment in the table of judgements, but rather to the fact that the third member of each categorial trichtomy must be understood as resulting from the combination of the first two..."

For an example of correspondence, note that Relation:Hypothetical corresponds to Relation:Cause. Hypothetical is Kant's word for the logical connective of implication. Clearly the mental structure of the concept of causation is 'the same shape as' (in the sense of homo/isomorphism) the structure of the concept of logical implication.

What some of the others mean:

Judgements: This is a linguistic/logical issue. Here we're talking about the logical/grammatical form of assertions. That is, we're classifying assertions as to their form.

Judgements of Quantity: Universal involves 'all men', for example the statement "all men are mortal". Particular involves 'some men'. Singular involves 'This man' or 'The man'. Note that this terminology does not match modern usage; today we would map kant's universal->universal quantification, kant's particular -> existential quantification, kant's singular -> particular or singular (no quantification).

Judgements of Quality: affirmative is like 'F is G'. negative is like "it is not the case that F is G"; infinite is like "F is non-G". i think. sometimes i get negative and infinite confused. About infinite judgements, Kant says unhelpfully that, "Although I have taken something away from the possibilities of what the soul might be like, I have not thereby said what it is or clarified the concept of the soul, there are still an infinite number of possible ways the soul could be. The content of an infinite judgment is purely limitative of our knowledge rather than amplitative of it." Norman Kemp's interpretation ( http://www.gutenberg.org/files/43572/43572-0.txt ) is that the point is that infinite judgements are like affirmative ones and unlike negative judgements in that they have content attached to the notion of what F is (the content is the entire infinity of everything except for G); but unlike affirmative judgements, and like negative ones, the effect of infinite judgements is only to limit possibilities, not to actually tell you what something is (because even though G has been eliminated, there are still an infinity of possibilities left for what F could be, so you still don't know anything specific about its structure).

Judgements of Relation: Categorical are judgements like "P". Hypothetical are like "If P then Q”. Disjunctives are like "Fs are either Gs or Hs or …" "where where each partition of the total domain is mutually exclusive and the total set of partitions is exhaustive", quoting http://plato.stanford.edu/entries/kant-judgment/ (e.g. disjunctions are like http://en.wikipedia.org/wiki/Sum_type in programming languages).

Judgements of Modality: Problematic/Assertoric/Apodictic are the modalities of possible/actual/necessary.

Unhelpfully, Kant refuses to define the categories, at least within the Critique: "I purposely omit the definitions of the categories in this treatise...to give them here would only bide from our view the main aim of our investigation, at the same time raising doubts and objections, the consideration of which, without injustice to our main purpose, may be very well postponed till another opportunity. Meanwhile, it ought to be sufficiently clear, from the little we have already said on this subject, that the formation of a complete vocabulary of pure conceptions, accompanied by all the requisite explanations, is not only a possible, but an easy undertaking."

Category of quantity: Unity is a single thing. Plurality is some unspecified aggregation of things. Totality is "plurality contemplated as unity". Note that the concept of a number belongs to the category of totality, not plurality; when we say "3 sheep" we are contemplating the group of three sheep as a single thing.

Category of quality: i don't have much to say about this, except that to me perhaps this is just the obvious semantics for the corresponding judgements?

Category of relation: Substance (inherence and subsistence): We have entities which have properties (substance and accident). Cause is cause and effect. Community is a bunch of things that jointly affect/simultaneously/reciprocally one another. An example Kant gives is "a body the parts of which mutually attract and repel each other". The connection to disjunction is that if you have a sum type (e.g. the days of the week), then if you say "day X is Monday", you simultaineously exclude the possibilities that "day X is Tuesday", "day X is Wedesneday", etc; and if you say simultaneously "day X is not Tuesday", "day X is not Wednesday", "day X is not Thursday", "day X is not Friday", "day X is not Saturday", "day X is not Sunday", you imply that day X must be Monday. So, the relation of logical implication between the seven propositions (of the form "day X is (some particular day)") follows a pattern where one things can simultaneously lead to many things, and many things can all simultaneously lead to one thing, rather than the one-to-one chaining pattern in ordinary implication.

Interestingly, under some presentations of constructive mathematics (e.g. http://plato.stanford.edu/entries/mathematics-constructive/ ) it seems to be exactly this category of community which is considered problematic in non-constructive proofs. Some e.g. Feferman seem to propose that perhaps this is sort of reasoning is only a problem when we are dealing with infinite communities of dubious reality, e.g. the reals, a not when we deal with e.g. finite communities. Kant's focus on the surpringly simultaneous pattern of the logic of sum types seems to shed light on why this may be disturbing; nonconstructive proofs involving reals involve deductions with a pattern of simultaineous logical interaction between all uncountably many reals, in which the presence of every other real is required. If you have any doubt as the reality of every single real (for example, if you doubt the reality of indefinable reals) then a conclusion from this method of deduction isn't justified, and furthermore the simultaneous influence of uncountably many things on your conclusion is startling in any case.

Category of modality: i don't have much to say about this, except that to me perhaps this is just the obvious semantics for the corresponding judgements?

## Aristotle

Aristotle invented the logical syllogism, as well as the aristotelian hylomorphic theory as well as the aristotelian categories covered above (todo).

## physics

physics heuristics:

• see an i? look for the e....
• got something nonlinear? do a power series expansion to first order (every now and then, to second order)
• to first order, exp(x) \approx 1 + x
• umm.. i don't like the real numbers? i like countables?

-- n-th roots of unity: they look scary but if you choose to think of {e^i x} as abstract mumbo-jumbo which is shorthand for cos x + i sin x, and if you think of that as abstract mumbo-jumbo which is shorthand for a position on a circle, then their algebraic form is actually pretty intuitive. Remember, the right point on the circle is labeled 0 (and 1), the left point is -1, the top is i and the bottom is -i.

so: what's the square roots of 1? 1 and -1. What are the fourth roots of 1? Well, in reality, just 1 and -1. But nooo, we want -1 to have roots too. So now we have i*i = -1. So that means, i*i*i*i = 1. And, (-i)^4 = 1, also. But this can easily be visualized by thinking that exponentiation represents multiplication of an angle around the circle; so, since i represents 90 degrees, i^2 represents 180 degrees (-1), and i^3 represents 270 degrees (-i). So, the fourth roots of unity are just those angles A such that if you start at the right point and then go around by A degrees, and you do that four times, then you get back to where you started.

The mathematical reason that exponentiation represents multiplication of an angle is that an angle is represented by the label at the corresponding point at that angle on the circle; so, for example, 90 degrees is labeled i. Now you represent that label with cos x + i sin x; in this case, x must be pi/2 (+ 2\pi n, of course) for the result to be i. And you represent that with e^(i x). Now, if you take (e^ix)^k, going up to the exponent steps down the exponentiation (it's like, if you're a god on earth, you're just an ordinary guy in heaven), turning it into multiplication. So you get e^(ixk); or, e^(i x'), where x' = xk. Now you can see that xk is going to be used as the amount of radians in the new angle. So the old amount (x) got multiplied by k.

So that's the use of this whole contrivance. It's like operator overloading in obejct-oriented programming. Sometimes you want some term (a container object, in OOP terms) to hole an angular value, but you don't want it to work the normal way, where if you add two of these things, then the angular values inside them add. Nooo, you want it so that if you MULTIPLY two of these guys, then the values inside get added. Now, exponentiation is like iterated multiplication, so if you raise one of these guys to 2, then that doubles the angular value that it is holding.

So that's all this stuff is. In physics, imaginary quantities aren't ever measured directly, only real quantities are measured (get it? real?). So if you see a term like ix, you gotta wonder. Well, in physics, those ix's don't actually get used either (in my experience). Whenever you see an ix, it's usually because, somewhere down the line, it's going to be the argument of an exp function (exp(x) = e^x). Now, exp(x) is perfectly normal when x is real, but when x is imaginary, it's just a container object for angles, which turns multiplication on the outside into addition on the inside. And recall that complex numbers, a + ib, when used as the argument of an exp, separate into a real argument and an imaginary one: e^(a + ib) = (e^a)(e^{ib}). The first one is perfectly sensible (recall that it, too, turns multiplication on the outside into addition on the inside), and the second one can be thought of as a container for angles which turns multiplication on the outside into addition on the inside.

i guess a simpler way to say this is: e^x is a container for angles which turns multiplication on the outside into addition on the inside. complex numbers, at least when they are used as the arguments to exps, are just another way to represent angles.

one thing that makes it confusing is that the angle representation for inputs and outputs is different. by this i mean, consider e^{i x} = y. I'll say that "ix" is the "input", and "$y$" is the "output". Let's see what we get for an input of i \pi/2:

e^{i \pi/2} = cos \pi/2 + i sin \pi/2 = i

So, we see that the output $i$ represents the angle \pi/2.

But the input corresponding to this was $i \pi/2$. So, on the input side, $\pi/2$ is represented by $i \pi/2$; but on the output side, $\pi/2$ is represented by $i$.

so, when you see an i, you can think of it as: "something which, when given as an argument to an exp, represents an angle of 1 radian; and, when it is the output of an exp, represents a 90 degree angle".

and the point of using this sort of representation instead of just using radians directly is that, (what? why not just use e^pi and its ilk? is it b/c you want to have the canceling out of increments of 2 \pi done automatically? is it b/c you want to have the complex plane as the output, for convenience?)

Feynman apparently called it a "jewel". I call its use an obstruction (in physics, not in pure mathematics). Now, even I will admit that it makes derivations more concise, but it obscures pictorial intuition.

--

topics:

intro to calculus (no integration methods i.e. partial integration)

div, grad, curl with "intuitive" pictorial pseudo-proofs

a condensation of some physics mechanics: start from noether's law EM: derive maxwell's eqs from experimental laws relativity: use the triangle derivation

LTI systems (convolution etc)

linear algebra (from a modern algebra pov)

some algebraic logic

formal logic, godel

backprop

finding maxima and minima with derivatives

mean, variance, estimators, bias, variance of estimators

Taylor series

NP-completeness

the artifice of real numbers (instead of integers), and continuity the artifice of imaginary numbers, and euler's EQ

category theory (once i learn it)

set theory, infinite ordinals and cardinals, surreal numbers

game theory

(diff

 algebraic) (geometry topology)

?: other VP-ish stuff? e.g. chu spaces?

scaling laws

wave equations

helmholtz's theorem

modern algebra

some analysis

compound interest, some diff eqs

entropy, information, mutual information, KL divergence

pictures of even simple things like linearity, distributivity, monotonicity, etc

---

## Physics

• key-path integral (i.e. we're using feynman's path-integral formulation)
• indifference to location
• indifference to time
• indifference to speed
• the speed of light is the speed limit
• ?? what else? what's needed to derive electromechanics as well as mechanics?

derive form of Lagrangian using noether's method

take limit of massive, slow things to recover classical mechanics and electromechanics

derive special relativity

use special relativity to derive magnetism from electricity. exhibit maxwell's eqns.

general relativity: don't know yet

take limit of not-very-dense things to recover classical picture

statistical mechanics, quantum field theory: don't know yet

be careful to use explicit phase notation in place of Eulerian notation -- no 'i's allowed! ("unimaginative physics")

do every step. use mathematical document structuring (theorems, lemmas, etc, with cross-referencing by eqn numbers).

draw pictures for every other page (or somesuch). the pictures should be readable by themselves. this will allow people who don't know math to get some physical intuition.

note places where stuff sounds mystical:

the fields generated by each particle (electric, gravitational, etc) extends throughout the whole universe (although, not at the same time!)

in the macro world, we are used to things being real or not real (true or not true). at the quantum level, we have to get used to things being "sorta real". if something has a 50% chance of being true, it is "half real". (do the different sorta real possibilities interact? or do they just sum without interacting?)

when we perform a measurement, there is a wavefunction collapse. this gives the connotation of a guy who is sneakily doing things until you point right at him and yell at him, at which point his hubris "collapses" and he stays in one place. but i think a better picture is that there is Mystery in objects that we do not know much about -- and what you can do is to dive into the Mystery, to let the Mystery expand out of the object and into you. now you are part of the mystery, and it is part of you. you learn things about the object, and it is no longer mysterious to you -- but you yourself, as well as the object, are now mysterious to others.

cotter once told me that chi is both a substance and a concept at the same time. perhaps there are things in physics which are also physical entities and yet concepts at the same time.

--

## Quantum physics and information

### Quantum computation

Quantum computers only work if you don't/can't observe the intermediate internal state of the computation!

Quantum erasure: if part A of the system observes part B, and then 'forgets' what it saw, then the quantum computer works; but if part A communicates what it saw to the observer, quantum computer does not work. However, 'forgetting' is not like what we are used to in the macro world, and does NOT include mixing up information themodynamically (i think?); you have to 'cancel out' information to forget.

unitary operators

## Formal logic

First let's cover what's available in formal logic. By 'formal', we mean all of 'rigorous', 'symbolic', and also 'concerned with form, not content'.

By 'logic', we mean the form of correct reasoning with absolute certainty. Philosophically, there is currently some argument whether the 'correctness' of logic stems from a property of reality, from universal properties of the human mind, or from a cultural protocol, but we won't further discuss that here.

Sometimes people talk about multiple 'logics'. Sometimes these 'logics' are meant to be alternative formal models of the universal, correct reasoning process, other times the the term is used to denote systems whose formalism shares a superficial similarity to formalisms of "correct reasoning", but which are not being put forward themselves as such a model. Some philosophers might get annoyed by this use and insist that there may be many formal systems intended to model parts of logical reasoning, but there is only one "logic", meaning there is only one form of correct reasoning with absolute certainty. Perhaps other philosophers might disagree with that.

My phrase "with absolute certainty" needs clarification. There may be "probabalistic logic", which tells you how to correctly reason with probabalities; in such a case, "absolute certainty" refers not the probability attached to each conclusion, but to the correctness of attaching that probability to that conclusion. For example, if i ask you what is the chance that an ideal fair coin lands heads-up, you may answer "50%", and be absolutely certain in this answer.

if you pick up a book on 'formal logic' or 'mathematical logic' expecting it to tell you a system for correct reasoning, you may be surprised. Most of the content under these topics is proving theorems ABOUT formal reasoning systems. For example, one property often desired of a formal reasoning system is consistency; a consistent system is one that will never conclude both "A" and "not A" at the same time, where A is some proposition [for example, "I'm wearing socks" and "I'm not wearing socks"]. A book on mathematical logic might present a proof that some formal logic system is consistent. If you picked up a book on mathematical logic hoping to be told a formal reasoning system and advice on how to use it, you might be surprised to instead find a formal characterization OF a formal reasoning system, and a bunch of theorems about things like consistency.

Other popular properties of a formal logic system are 'soundness' [unless you feed it false assumptions, the system will never conclude anything that is false], and completeness [the system can conclude anything which is true]. Both soundness and completeness are with reference to some formalized model of the sort of things in the world that the formal logic is talking about.

### Syllogisms

A syllogism is an argument of the form "All men are mortal. Socrates is a man. Therefore, Socrates is mortal."

### Predicate logic

Predicate logic is one formal logical system. In predicate logic, you have variables which can each take on the boolean values "True" or "False". You then have the unary operator "not", and the binary operators "and" and "or". You put these together to make propositions, for example, "[A and [not B]] or C". If you pick either "true" or "false" for each of the variables [called an "assignment to the variables"], then the "truth value" of the proposition as a whole can be calculated. For example, if A is False and B is False and C is False, then "[A and [not B]] or C" is false.

We see then that each proposition induces a function from assignments of the variables to boolean values.

Starting with "not", "and" and "or" [actually you only need "not" and "and"], you can define other commonly used logical connectives, such as "implies" [a implies B is "B or not[A]"], "bidirectionally implies" ["[a and b] or [not[a] and not[b]]"], "nand" ["not[a and b]"], etc.

In addition to variables, we allow boolean constants, so "True" is a valid proposition, as is "A or True". "True" always evaluates to True, "False" always evaluates to False, and when found in a more complex proposition, they are treated as if a variable was there and that variable had been assigned the constant.

Here's our inference rule. If one predicate logic proposition P is known to be true, and if there's aren't any assignment to the variables that make that proposition P true which don't also makes proposition Q true [we can check this by exhaustively enumerating all possible assignments to all the variables found in either P or Q, and then seeing which ones make P true, then seeing if all of these also make Q true], then we can infer [or "deduce"] that proposition Q is true.

Note that some predicates always evaluate to True for all assignments to the variables. For example "A or not[A]" is True no matter if A is True or A is False. Our inference rule always allows us to conclude such propositions, because the proposition "True" is known to be True regardless of the variable assignment, so if any proposition Q is found to also evaluate to True no matter what the variable assignment, then we may always conclude Q. Such propositions Q are called "tautologies".

Note also that the system of predicate logic has the property, sometimes called 'vacous implication' or 'material implication' (although these terms are confusing because they can also refer to the behavior of the 'implies' operator rather than the system itself) that if you mistakenly assumed as true some proposition P which is false on all assignments to the variables, then the inference rule allows you to infer ANY proposition Q. This is because the inference rule to infer Q from P only requires that there aren't any assignments to the variables which make P true without also making Q true. In the case under discussion, this requirement is satisfied because there aren't any assignments to the variables that make P true at all.

From any set of axioms (assumed propositions), we can deduce various other propositions.

Sometimes you see people use a symbol that looks like

 - as follows: "a - b" means "from axiom(s) a, we can deduce b". Both "a" and "b" can either be a single proposition or a whole set of propositions. Often lowercase letters are used for single propositions, and capital letters are used if we mean a set of propositions. You can also read "a - b" as "b can be derived from a", "b follows from a". If we have multiple things on the left of the - separated by commas, that means that you might need all of them in order to derive the thing on the right. E.g "a,b - c" means "if you know that a and b are both true, then you can derive the truth of c".

We call the set of all propositions that can be deduced from given axioms "the theory (under predicate logic) of those axioms".

The tautologies are the theory resulting from no axioms.

As you probably expect, predicate logic is consistent, sound, and complete. A book on mathematical logic, for example Enderton, will have the proofs.

### First-order logic

Now onto our next logic, first-order logic. First order is like predicate logic except that in addition to variables which take boolean values, we have other variables that refer to objects from some other domain. For example if the domain is natural numbers, we might have a variable "x", and an assignment to this variable might be "x = 3". We also have functions and relations over these other objects. Functions are things which take as input one or more domain objects, and return a domain object. For instance, addition takes two numbers and returns a number. Relations are things which take one or more domain objects and return a boolean, for example the "less than or equal" relation returns True if you give it "3" and "4" as inputs. In many treatments we also have the equality relation between domain objects.

We also have two "quantifier" symbol, namely "forall" (spoken "for all") and "exists" (spoken "there exists"). Forall is written like an upside-down A, and "exists" is written like a backwards E. "Forall x" means "the following is true for any choice of x", and "exists x" means "the following is true for at least one choice of x", where x is a domain object variable. For example, if the domain objects are the natural numbers, the following is true: "exists x (forall y (x <= y))". In English that is read, "there exists some natural number x such that, for all natural numbers y, x is less than or equal to y", which can be paraphrased as "there exists some natural number x such that x is less than or equal to every natural number" (this is true because we can take "x = 0", and 0 is less than or equal to every natural number).

Sometimes you'll see a period to the right of a forall or an exists, after the variable it's using; this is just a separator. For example, "exists x . (forall y . (x <= y))".

One fact about "forall" and "exists" is that "forall x . P" is equivalent to "not(exists x . (not P))", and "exists x . P" is equivalent to "not(forall x . (not P))". This makes sense in English; for example, if something is true for every man, then there isn't any man for which it isn't true; and if there is some man that fits some description, then it must not be the case that given any man, that man does not fit that description.

Instead of "propositions" we'll just call the things in first-order logic "formulas".

There are various axioms and inference rules that i won't cover here.

First-order logic is consistent, sound, and complete. A book on mathematical logic, for example Enderton, will have the proofs.

### Properties of first-order logic when combined with axioms

First-order logic alone can't say very much; it can only deduce tautologies (things true for any domain, by virtue of logic alone). In practice we usually use it in conjuction with a set of domain axioms, that is, statements about a particular domain of interest.

One simple domain is Peano arithmetic, which formalizes the concept of the natural numbers, as well as addition, multiplication, and <=. There is a generally accepted proof that Peano arithmetic is consistent, although some people have a problem with it because it uses something called 'transfinite induction', which i won't discuss further.

So, is it complete, that is, are there true statements about the natural numbers that cannot be proved by using first-order logic and Peano arithmetic? No, it is not complete; there are true statements about the natural numbers which cannot be proved with this method.

So we just need more axioms, right? Or maybe we need something stronger than first-order logic?

No, neither suffice. Shockingly, Godel's first incompleteness theorem proves that ANY computable system which is as powerful as first-order logic plus the Peano axioms ("as powerful as" means that the new system allows you to deduce any statement that first-order logic with Peano would allow you to deduce) is either inconsistent or incomplete. (by "computable", i mean that you can write a computer program that executes the deduction process in finite time and memory).

This puts a serious dent in the idea that rigorous thought consists of axioms and proofs from those axioms (not that I have any better alternative to offer). To quote Wikipedia, "There is no algorithmic procedure that can correctly decide whether arbitrary mathematical propositions are true or false".

If you combine that with the hypothesis that, at least at the level required to simulate the operation of the human brain, physics may be algorithmically computable, you would conclude that neither humans nor humans aided by computers can correctly decide whether arbitrary mathematical propositions are true or false.

When one looks at Godel's proof, it's basically a formal variant of the Liar Paradox (if i tell you "I am a liar", is that statement true or is it false?). Godel manages to create a first-order logic formula that means "I cannot be proved by first-order logic with these axioms" (formally proving the existence of such a formula is the hard part). Now, if the theory of the given axioms contains this formula, then the theory is inconsistent (because the formula could be proved, but the truth of the formula implies that it cannot be proved). If the theory of the given axioms do not contain this formula, then the theory is incomplete (because that formula is true but it cannot be proved from those axioms).

One might say, "well that's just a trick, if it's only those sorts of formulas which can't be proved, that's pretty inconsequential". And indeed, first-order logic is very useful for proving all sorts of things from axiom sets that are as powerful or more powerful than Peano, so this criticism may be justified, although it's hard to be sure without a precise characterization of just exactly what can't be proven by this system. You may also say, "if that's the only sort of thing that can't be proved, then we can imagine a logic system with axioms FOLP2 which has axioms or rules of inference that allow one to prove that sort of thing. And then, since Godel's first theorem also applies to FOLP2, we create an FOLP3 that adds in the forbidden yet true formulas identified by Godel's first theorem on FOLP2. And iterate. Wouldn't that solve everything?". I asked my logic teacher something like that once, and he said there were some complications having to do with the difficulty of discovering or specifying transfinite ordinals, and referred me to a paper "Turing in the land of O(z)" which i haven't had a chance to read yet, regarding among other things Turing's thesis, "Systems of Logic Based on Ordinals", which appears to be about exactly this according to http://en.wikipedia.org/wiki/Systems_of_Logic_Based_on_Ordinals : "Gödel showed for that any formal system S powerful enough to represent arithmetic, there is a theorem G which is true but the system is unable to prove. G could be added as an additional axiom to the system in place of a proof. However this would create a new system S' with its own unprovable true theorem G', and so on. Turing's thesis considers iterating the process to infinity, creating a system with an infinite set of axioms."

### Second-order logic and beyond

You may be wondering what the 'first-order' in first-order logic is for. It's because there is another logic called "second-order logic", which is like first-order logic except that you can quantify over (apply 'forall' and 'exists' to variables of type) functions and predicates, not just domain objects. And there are other "higher-order logics" above that.

### Infinitary logic

I think I've heard that various mathematical paradoxes (such as some non-standard models of arithmetic, and Godel's second incompleteness theorem) go away if you accept https://en.wikipedia.org/wiki/Infinitary_logic , that is, logic that permits infinitely long statements and/or infinitely long proofs. See also http://plato.stanford.edu/entries/logic-infinitary/ .

### Misc terms and notations

Inference rule specification: when defining a logic, often you'll see a weird notation for inference rule specification:

A B

    C

that means, if you have A and B, you may conclude C. This isn't just written "A and B implies C" because "and" and "implies" are operations that appear WITHIN a logical formula, and we are trying to describe an algorithm or procedure for deducing formulas, e.g. by specifying inference rules, we are answering the question: if you currently have such-and-such formulas in your bag of proved formulas, which other new formulas are you justified in adding to that bag? I guess people find it easier to keep these two things separate if you use very different notations.

the implies operator is often written as an arrow, ->. "and" is often written as a caret, ^. "or" is often written as an upside-down caret (which i can't type in ASCII).

Horn clauses: a horn clause is a logical formula of the form "(A and B and ... and D) implies Z"

sequent: a sequent is an extension to the "a

 - b" notation given above. In a sequent, you can have multiple things on the left AND on the right, separated by commas: "a, b - c, d". The meaning of the commas on the right differ from the left; in a sequent, " -" means If ALL OF the things on the left are true, then you may infer that at least ONE OF the things on the right are true.

modal logic: this involves the introduction of other symbols called "modal operators". Sometimes these come in pairs that relate to each other similarly to "forall" and "exists". For example, "necessary vs. possible" (note that something whose negation is not possible must be necessary), and "obligatory vs. permissible" (note that if you are not permitted to abtain from some action, then that action is obligatory).

other theories of arithmetic: Peano isn't the only one. See Wikipedia category Formal theories of arithmetic.

### Some non-classical logical systems

#### Intuitionistic logic

Classical logic without the law of the excluded middle, that is, without (P or not(P))

The use is that this eliminates the ability to form a nonconstructive proof.

#### Relevance logics

Various added restrictions to classical logic to prevent "irrelevant" theorems, that is, theorems whose conclusions intuitively seem like they their premises are irrelevant. For example, "If the moon is made of green cheese, then either it is raining in Ecuador now or it is not.".

#### Description logics

Many description logics are fragments of first-order logic whose expressiveness is greater than predicate logic. Some have disjoint expressivness than first-order logic, though. Some are much more computationally tractable than predicate logic.

They are designed to be good at expressing ontological knowledge, in the sense of computer ontologies.

## Recursion theory

Recursion theory is an old-fashioned name for computability theory. It deals with questions like "what is computable?". It is the branch of mathematics that sprang up around Church and Turing's initial work. Although it deals with computation, it has a logic-y flavor, rather than a computer-sciencey-flavor.

A question is "computable" if you can write a computer program that answers the question (within finite time).

The definition of "if you can write a computer program that answers the question" is rigorously defined in recursion theory. There are multiple, provably equivalent ways of defining this, based on how you define the "computer" and what kind of "programs" it takes; such definitions are called "models of computation". One model of computation is the Turing machine. Another is Church's lambda calculus. Anything which is provably equivalent to the Turing machine is called "Turing-equivalent". The Church-Turing hypothesis is that the formal notion of Turing-equivalence is equivalent to the informal notion of a procedure which can be precisely specified and, for any given input, either accomplished with finite resources, or which does not terminate (note: this informal phrasing is my own and may be wrong).

The "finite resources" part is confusing because, since the amount of resources needed may be different for different inputs, in general there is no upper bound on the amount of resources needed. For example, an ideal Turing machine can execute programs that use an arbitrarily large amount of memory. We say that real "computers" are "Turing machines" but technically speaking they are not, because they have finite memory. However, they are extremely similar in spirit; a Turing machine can be said to be an ideal computer with infinite memory.

Another model of computation is Herbrand-Godel's notion of "general recursion", which first defines something called "primitive recursion" and then extends it to "general recursion" (now you see why this field is named "recursion theory", because that's what some of them called computation back in the beginning.

Incidentally "primitive recursion" can generate almost any practically useful function (an example of a function it can't do is a function called Ackerman's function), but it is somewhat difficult to write programs in the style of primitive recursion (for example, you can't have 'while' loops), so most general-purpose programming languages would not fit the primitive recursive constraints.

Here's an example of something that isn't computable. The "halting problem". Given a computer program, tell me if this program will hang (get caught in an infinite loop) or terminate.

The proof that the halting problem isn't computable is that, if there were an algorithm that, given any program, would say if that program hangs, then you could write a wrapper around that algorithm that said, if the input program would hang, then terminate; but if the input program would terminate, then go into an infinite loop. Now run this program, and as input give it itself. If it computes that it will terminate, then it hangs; and if it computes that it will hang, then it terminates.

A set is computable if, given any potential element, you can compute whether or not that element is in the set.

A synonym of computable is "decidable". This sounds better when the question being answered is a yes/no question. So, the halting problem isn't decidable. A set is decidable if you can decide (using some computer program, within finite time) whether any given element belongs.

If something isn't decidable, it may still be "recursively enumerable". A set is recursively enumerable if there is some computer program which prints out every element of the set one by one (although the set may be infinitely large, as long as any given element will be printed at some finite point in time, it's recursively enumerable).

A synonym of "recursively enumerable" is "semidecidable". If someone asks you if element e is in set S, and S is not decidable, but it is semidecidable, then one thing you can do is run the program that prints out all of the elements of S. If e ever comes up, then you can answer, "yes, it's in the set", but if e never comes up, then you'll be waiting forever. Implementing this strategy in a computer program, you get a program which terminates if the input is in set S, and does not terminate if the input is not in set S.

A computer program which never hangs, no matter what input is given, is called a 'total function'. A computer program which might hang on at least one possible input is called a 'partial function'. This terminology makes sense if whenever the program hangs on some input, you say, "the output corresponding to that input is undefined". The picture is that for total functions, every input has a defined output, but for a partial functions, some inputs have no defined output; partial functions only give you defined outputs for part of the space of inputs.

### Some other topics in recursion theory

If you have an 'oracle', that is, a magic box that will evaluate an uncomputable function (such as the halting problem) for you, now what else can you compute, and what is uncomputable (technically speaking, the questions and answers are usually taken to be phrased in terms of the oracle corresponding to an uncomputable (non-recursive) set of natural numbers, and for any natural number you can ask the oracle, "Is this number in your set?")? The "now what else can you compute" is called "relative computability". These sort of questions induce a hierarchy of sets called Turing degrees. A lot of work in recursion theory is results on the structure of Turing degrees. Depending on your philosophical beliefs you might find this work irrelevant.

There are attempts to redo the rest of mathematics while only allowing computable sets. These go by names like as "computable mathematics", "recursive mathematics", "computable model theory", "recursive analysis", etc.

## Different types of infinity

Usually these are called "transfinite numbers". According to https://en.wikipedia.org/wiki/Transfinite_number, the term "transfinite" was coined by Cantor, who distinguished between "absolutely infinite" and merely transfinite, and sometimes reserved the term 'infinite' for 'absolutely infinite'. But nowadays most people call the transfinite numbers infinite, so we have two words that mean the same thing (transfinite and infinite).

### Cardinals

There's the cardinals or 'cardinal numbers'. These are probably what you are thinking of when you think of a 'number'. Intuitively, a cardinal number is telling you how many of something there is. Two sets are defined to have the same cardinality if and only if there is a bijection between them (this property is called 'equinumerosity'). Now the cardinal numbers are the equivalence classes w/r/t equinumerosity. So each finite integer is a cardinal number. The absolute value notation (two vertical bars surrounding something) is often used to mean "the cardinality of"; e.g.

 X can be read "the cardinality of X", but this is ambiguous with absolute value; sometimes "card(X)" is used instead.

#### Countable cardinality

The infinite cardinal corresponding to the cardinality of the natural numbers (that is, the size of the set of all natural numbers) is called aleph-naught or aleph-null, where 'aleph' is the name of a special symbol (see https://en.wikipedia.org/wiki/Aleph_number#Aleph-naught for a picture).

There are other distinct sets with the same cardinality as the natural numbers. For example, the integers, and the rationals (see below). The term 'countable set' is defined to be any set with the same cardinality as some subset of the natural numbers (some authors use 'countable' to mean only the countably infinite sets, that is, to exclude finite sets). This makes sense because if a set S is countable, that means that there is a bijection between S and some subset of the naturals, which means that there is a surjective function from the naturals to S, which means that you can 'count' starting with 1, 2, 3, .., and at each step you can produce an element of S, and eventually you will reach any given element of S. In other words, you can order the elements of S in such a way so that you can count them, and if you had infinite time to count up to the aleph-naught, then you would hit all of the elements of S. Another way of saying this is that any countable set can be indexed by the natural numbers.

Because the notion of bijection is used to define cardinality, there are some unintuitive results. The cardinality of the positive integers is the same as the cardinality of the naturals, even though the integers have 0 and the naturals don't. You can see this by constructing a bijection; 1->0, 2->1, 3->2, .... Notice that all the naturals appear on the left, and all the integers appear on the right. The cardinality of all of the integers is also the same; 1->0, 2->1, 3->-1, 4->2, 5->-2, etc.

Similarly, the cardinality of the odd natural numbers is the same as the cardinality of all of the natural numbers. You can see this by constructing a bijection; 1<->3, 2<->5, 3<->7, ...; you can count through the naturals on the left and hit all of the odds, or you can count through the odds on the right and hit all of the naturals.

Similarly, the cardinality of all pairs of naturals is the same as the cardinality of all of the naturals. In fact, for any natural number k, the cardinality of the set of all k-tuples of naturals is the same as the cardinality of all of the naturals. The bijection is to represent, for example, the sequence [1,8,2,2,4] as 1^1 * 8^3 + 2^5 + 2^7 + 4^11. That is, we create the number whose factorization encodes the sequence by setting the multiplicity of each prime factor to an item in the sequence (the https://en.wikipedia.org/wiki/Fundamental_theorem_of_arithmetic tells us that each natural number has a unique factorization into prime factors). This bijection is sometimes called https://en.wikipedia.org/wiki/G%C3%B6del_numbering_for_sequences .

Because rationals are equivalence classes over pairs of natural numbers, we can encode the rationals as pairs and then observe that they are countable. Since the natural numbers can be embedded within the rationals, there must be infinitely many rationals, therefore their cardinality is the same as the naturals.

#### Uncountable cardinality

If math made any sense then there would be only one kind of cardinal infinity, but it doesn't. A set that is not countable is called uncountable (and an uncountable set is always infinite). Cantor proved that there exist cardinal numbers bigger than aleph-naught; the theorem is called the 'diagonal theorem' and the technique he invented and used is called 'diagonalization'.

Cantor's diagonal proof in particular shows that the cardinality of the real numbers is strictly greater than the cardinality of the natural numbers; that is, it shows that there does not exist any bijection between the natural numbers and the reals. In other words, there is no way to order the reals so that you can count them one-by-one in order and hit all of them; there is no way to index all of the reals by natural numbers.

(One version of) the proof goes like this. Proof by contradiction: assume that the conclusion is false, that is, that you can index all of the reals by natural numbers. That means that you have an (infinitely long) table with two columns; in each row, the left column has a natural number, and the right column has a real number. Going down the left columns, the natural number appear in order, 1, 2, 3, ... . Going down the right column, let's define a notation for the reals that appear; call them r_1, r_2, r_3, ... . Now, we will construct a real number R that does not appear anywhere in the right-hand column. We will construct R via specifying each digit in its decimal expansion. To the left of the decimal point, R is zero. To the right of the decimal point, each digit of R depends on a different row.

We set the first digit to the right of the decimal point of R to be equal to (the first digit to the right of the decimal point of the decimal expansion of r_1) + 1, except that if (the first digit to the right of the decimal point of the decimal expansion of r_1) is 9, then we wrap back to 0.

We set the second digit to the right of the decimal point of R to be equal to (the second digit to the right of the decimal point of the decimal expansion of r_2) + 1, except that if (the first digit to the right of the decimal point of the decimal expansion of r_1) is 9, then we wrap back to 0.

And so on.

(by the way, the next step of the proof as i describe it here is actually wrong; there's a problem because of things like .99999... repeating equals 1. i won't go into the fix, but see here if you care: https://en.wikipedia.org/wiki/Cantor%27s_diagonal_argument#Real_numbers )

Now, it remains to be shown that R is not equal to r_i for any i. It's not, because if it were, then the decimal expansion of R would be identical to r_i in every digit. But, by construction, it differs at digit i after the decimal point. The figure at the top of https://en.wikipedia.org/wiki/Cantor's_diagonal_argument helps:

Diagonalization is in general good for showing that some set of things is not all-encompassing, and it is also used to show that the halting problem is uncomputable, and therefore that there exists mathematically definable functions for which there exists no algorithm to compute them. The general idea of diagonalization is to show that the type of object which is not contained in some domain can be broken into at least as many parts as there elements in the domain which does not contain it, and then for any putative bijection between elements of the domain and uncontainable objects, to construct an object which is not in the bijection by making it different from each of its peers in at least one part.

todo most reals are irrational. most reals are undefinable. most reals are uncomputable.

#### Power sets

Another cardinality-related use of diagonalization shows that power sets can be used to increase cardinality. More generally, one can start with any set S and take the 'power set' P(S) of that set, that is, the set of all subsets of that set. A diagonalization argument called "Cantor's theorem" can be used to prove that the cardinality of the power set of any set is strictly greater than the cardinality of the original set.

The proof goes like this. Proof by contradiction: assume that there is a bijection between S and P(S). This implies there is a surjective function f from S to P(S). We will show that no function f from S to P(S) can be surjective. It suffices to construct, for any function f from S to P(S), an element p of P(S) which is not in the image of f.

Recall that elements of P(S) are subsets of S. Therefore, p is a subset of S, and for any element x in S, f(x) is a subset of S. Take p to be the following subset of S: the set of every element x in S such that x is not in f(x).

It remains to be shown that p is not in the image of f, that is, there is no element y in S such that f(y) = p. For any element y in S, either y is in f(y), or it's not. If y is in f(y), then by the definition of p, y is not in p. In this case, because y is in f(y) and y is not in p, p must not be equal to f(y). On the other hand, if y is not in f(y), then by the definition of p, y is in p. In this case, because y is not in f(y) and y is in p, p must not be equal to f(y). So in either case, for any y, p is not f(y).

#### The beth numbers

Beth-naught (see https://en.wikipedia.org/wiki/Beth_number for a picture of the Beth symbol), is defined as equal to Aleph-naught. Beth-one is equal to the powerset of Beth-naught. Beth-two is equal to the powerset of Beth-one. And so on. By Cantor's theorem, the Beth numbers form a strictly increasing sequence.

Another name for Beth-one is "the cardinality of the continuum".

Beth-one is also the cardinality of the set of all continuous functions from reals to reals. But the set of all functions, not necessarily continuous, from reals to reals is of cardinality Beth-two.

#### The aleph numbers

The aleph numbers are used to represent the cardinals. Like the Beth numbers, they are written alpha-i, where i is some ordinal.

not sure about this part: Aleph-one is be defined as the next larger well-ordered cardinal after aleph-naught. Aleph-two is defined as the next larger well-ordered cardinal after aleph-one. Etc.

An alternate characterization of aleph-one is the cardinality of the set of all countable ordinal numbers (see below). From the ZF axioms (without AC), it can be shown that this implies that there are no cardinals which are simultaneously larger than aleph-naught but smaller than aleph-one.

### Ordinals

Just as cardinals are the equivalence classes w/r/t the equinumerous relation, ordinals are the equivalence classes w/r/t the order isomorphic relation.

The order isomorphic relation is strictly stronger than the equinumerous relation. Two sets are defined to be order isomorphic if there is a bijection f between them such that x <= y iff f(x) <= f(y).

todo the basic order types at the beginning of https://en.wikipedia.org/wiki/Ordinality

The order type of the natural numbers is w, written as lowercase omega (and pronounced omega).

maybe see https://sites.google.com/site/largenumbers/home/a-1/infinite_numbers , but i have found at least one error in that page (the power set of aleph null is indeed the cardinality of the real numbers; the question he is probably thinking of, the continuum hypotheses, is whether or not there are any other cardinals which are greater than aleph null but less than the powerset of aleph null, or in other words, is aleph one is equal to the powerset of aleph null)

### Paradoxes of naive set theory

In naive set theory (see https://en.wikipedia.org/wiki/Na%C3%AFve_set_theory ), the set of all sets is a set, and you can construct sets by starting with the set of all sets and then imposing a filter condition to create a subset.

Consider the set of all sets which are not members of themselves. Is this set a member of itself?

If it is, then it shouldn't be, by it's definition as the set of all sets which are NOT members of themselves.

But if it isn't, then it should be.

There is no largest cardinal number.

And therefore, you can't have a set S of all cardinal numbers. For if you did, then, following https://en.wikipedia.org/wiki/Cantor%27s_paradox#Statements_and_proofs , let T be the union of the elements of S. Then every element of S is a subset of T, and hence is of cardinality less than or equal to the cardinality of T. Cantor's diagonalization theorem then implies that every element of S is of cardinality strictly less than the cardinality of 2T.

Let S be the set of all ordinals. S is an ordinal itself. Therefore we can construct its successor S + 1, and S < S + 1. But since S is the set of all ordinals, (S + 1) is an ordinal, so (S + 1) must be in S. But then we have S + 1 < S. S + 1 < S and S < S + 1 cannot both be true, so we have a contradiction.

(note: this only works as written if ordinals are defined the von Neumann way. Otherwise, see https://en.wikipedia.org/wiki/Burali-Forti_paradox#Stated_more_generally for a more general statement)

#### How these are handled

Generally by something similar to stratification, e.g. stratifying the conceptual universe into levels of reference and disallowing the construction of things that refer to other objects on their same level.

In traditional ZFC set theory (see below), we construct axioms for sets that do not allow you to construct a set of all sets. However, we still say that there is a 'class' of all sets. Every set is also a class, but not every class is a set. A class that is not also a set is called a 'proper class'. The class of all sets is a proper class. You can do the same sorts of things with classes that you can do with sets, for example, there is a predicate 'memberOf' that tells you whether some object is a member of a class.

In my opinion this isn't a very intuitive way to handle things, but i haven't ever had time to try to come up with anything else.

#### What are sets anyhow?

It seems to me that people tell you that the notion of 'set' is both (a) a formalization of the notion of grouping or collecting or aggregating things together, and (b) a formalization of the notion of a type of thing.

For an example of (a), if you own five chickens, you might consider the set of your five chickens. For an example of (b), you might consider the set of all humans.

I don't think either of these is correct. If 'set' were a formalization of (b), then you would be able to construct the set of all sets. If 'set' were a formalization of (a), then we would expect that anytime we find an object that behaves like a grouping, it should be a set. But proper classes behave like groupings. So if 'set' were a formalization of (a), then every class would also be a set.

Therefore, i think that the concept of 'set' as defined in ZFC axiomatic set theory is neither a formalization of (a) nor of (b). Sets are a particular subtype of the notion of a type of thing; and sets are a particular grouping of some things; and they are a formalization of the notion of grouping where the domain is restricted to certain things; but they are not a formalization of the very notion of a type of thing, nor are they a formalization of the very notion of groupings of things.

### Standard axioms, models, and transfinite numbers

#### ZFC

The current standard foundational axioms are ZFC set theory (ZFC stands for Zermelo–Fraenkel? plus the axiom of Choice)

Today it's generally accepted, but there was and still is some controversy over the axiom of choice. Often you see results that use the axiom of choice note that it was used.

Troublingly, it is not known if ZFC is consistent.

#### The axiom of choice

Sometimes abbreviated AC.

AC is independent of ZF, that is, of Zermelo–Fraenkel? set theory.

If the axiom of choice is assumed, then:

• the class of cardinal numbers is totally ordered
• the next larger well-ordered cardinal is equal to the next larger cardinal

If the axiom of choice is not true, then there may be:

• cardinals larger than aleph-zero but incomparable with aleph-one

Arguments for:

• todo

Arguments against:

A weaker version of the axiom of choice is the https://en.wikipedia.org/wiki/Axiom_of_countable_choice , which is also independent of ZF. The axiom of countable choice is sufficient to imply that Aleph-naught is smaller than any other cardinal.

If the axiom of countable choice is NOT assumed, then it is possible that Dedekind-infinite sets (sets for which a proper subset of the set may be put into bijection with the whole set) may not be equivalent to infinite sets as defined as "a set that cannot be put in bijection with any finite ordinal". The axiom of equivalence of Dedekind-infinite sets with infinite sets is strictly weaker than the axiom of countable choice. See https://en.wikipedia.org/wiki/Dedekind-infinite_set .

#### The continuum hypothesis

Is aleph-one equal to beth-one? That is, are there any well-ordered cardinalities in between the cardinality of the naturals and the cardinality of the reals?

Godel and Cohen established that this is independent of the ZFC axioms.

Arguments for:

• todo

Arguments against:

A weaker version of the continuum hypothesis is https://en.wikipedia.org/wiki/Martin%27s_axiom .

A stronger version of the continuum hypothesis is the generalized continuum hypothesis (GCH), which states that for all i, Aleph-i is equal to Beth-i. The generalized continuum hypothesis is independent of ZFC, but it does imply AC. GCH is true in the constructive universe. "GCH implies that every cardinality n is smaller than some Aleph number, and thus can be ordered". See https://en.wikipedia.org/wiki/Continuum_hypothesis#The_generalized_continuum_hypothesis .

#### The constructable universe

https://en.wikipedia.org/wiki/Constructible_universe

#### The Von Neumann universe

https://en.wikipedia.org/wiki/Von_Neumann_universe

#### Grothendieck universe

https://en.wikipedia.org/wiki/Grothendieck_universe

## Alternative mathematical foundations

Many alternative mathematical foundations are motivated by a philosophical conviction that classical mathematics allows one to talk about impossible objects, and/or that it makes unwarranted assumptions. An alternative foundation often restricts the expressiveness of the language or the power of logic to prevent this. For example, in constructivist mathematics, you can't talk about objects that you can't construct and/or you can't form non-constructive proofs. In intuitionistic logic you can't use the law of the excluded middle (that is, you can't conclude that not(not(x)) -> x) (i think this turns out to be related to constructivism). In computable mathematics you only talk about computable sets.

Reverse mathematics is the enterprise of starting with some non-standard approach to the foundations of mathematics, and seeing which theorems of classical mathematics still hold in your non-standard approach. More generally, reverse mathematics seeks to characterize, for various interesting theorems, which axioms and constructions are minimally sufficient for proving that theorem. See also http://en.wikipedia.org/wiki/Reverse_mathematics , http://plato.stanford.edu/entries/hilbert-program/#1.3 , http://www.math.psu.edu/simpson/talks/banff0812/talk.pdf

### Motivation by simplicity

If this seems silly to you, consider how much time and effort in classical mathematics is devoted to understanding things like the relationship between various kinds of infinity and how these relationships change based on the presence or absence of axioms which no one can agree on whether they are false or true, and which can be proven to be unnecessary for any use except for talking about infinite things.

### An aside: finitism

Mathematics has a problem, or at least an annoying issue. Imagine that one day, people started thinking about the natural numbers; simple enough to imagine, and to visualize. Now add the rationals; still simple enough to imagine, and to visualize. And then one day they start thinking about 2-D. And then we start thinking about distances between points in 2-D. This is all very fundamental stuff that has a clear and certain connection to reality, wouldn't you agree?

What is the distance between the points (1,1) and (0,0)? This is the same as the length of the third side of the triangle defined by vertices (0,0), (0,1), (1,1). By the Pythagorean theorem, since this is a right triangle, and the length of the other sides are each 1, we have x * x = 2, where x is the distance between (0,0) and (1,1).

So the distance between (0,0) and (1,1) is provably irrational (see http://en.wikipedia.org/wiki/Square_root_of_2#Proofs_of_irrationality ).

So, okay, we want our concept of the number line to include any possible distance, right? So we'd better add sqrt(2) in there, and all of the other square roots. And if we calculate distance in 3-D, we'll find that we have to add in cube roots too, and so on up to n-th roots (and of course we want it to be closed under all basic arithmeticial operations). So now our number line consists of the simple algebraic numbers (not sure if that is a frequently used term, I saw one or two people on Google calling it that), that is, all numbers that can be obtained from the integers using a finite number of integer additions, subtractions, multiplications, divisions, and taking nth roots (where n is a positive integer).

Well gee, let's take a closer look at that thing we wrote above, "x * x = 2". Y'know, you can visualize this by graphing a function "f(x) = x*x", and seeing when it hits 2. Actually, you could also just displace it, and graph the function "f(x) = x*x - 2", and see when it hits zero. In fact you could make up lots of functions, using just the arithmetical operations of addition, subtraction, multiplication, and exponentiation, and see when they hit zero. If we restrict the second argument of the exponent to integers, then these functions are polynomials, and the places where they hit zero are their roots.

If we consider all of the roots of all of the polynomials (but only for those polynomials that "have roots", that is, when if we draw the function we see it intersect the horizontal axis somewhere), some of them are not "simple algebraic numbers" as defined above. This is proven by Galois theory. It seems sensible to want all of the roots of all of the polynomials which have roots in our number line, because the root of some function in our graph can be clearly visualized and so seems to have a clear and certain connection to reality. Now we have the "real algebraic numbers" (note: this is not yet the algebraic closure over the rationals, because we do not yet have any complex numbers).

Note: Wikipedia tells me that "The set of real algebraic numbers is linearly ordered, countable, densely ordered, and without first or last element, so is order-isomorphic to the set of rational numbers.". So the real algebraic numbers are still pretty comprehensible.

Great, now let's consider circles. What's the circumference of a circle of radius 1? 2*pi. It can be proved that this is not an algebraic number (see http://en.wikipedia.org/wiki/Lindemann–Weierstrass_theorem#Transcendence_of_e_an ). We call real numbers that are not algebraic numbers "real transcendentals". So we have at least one transcendental that has a clear and certain connection to reality.

OK, these numbers are getting pretty annoying to write down. We have all these numbers that are at some precise spot on the number line, but that we can only talk about it terms of some weird definition. Can't we represent pi in terms of numbers and operations that came before it? Well, no, we can't, that's the whole point. But wait.. we can approximate it. The Gregory–Leibniz? series (see http://en.wikipedia.org/wiki/Leibniz_formula_for_%CF%80 ) approximates pi by adding and subtracting rationals. If you could somehow do this forever, you'd reach pi. In fact, if you desire an approximation to pi which is no farther away from pi than any given distance that you choose, you can find one in finite time. Maybe we can just think in terms of the rational numbers, and in order to talk about pi, we can invent a new concept, "limit", and we can just say that pi is the "limit" of this process of approximating it.

Even if we did that, it would still be annoying to write all these numbers down, so put that thought on hold for a minute. If we draw numbers on the number line, we can easily see that this one is to the right of that other one and to the left of this third one, and that it is farther away from the that one and closer to this one, etc. Wouldn't it be nice to have a notation that is more like that picture, and less like "here's a number that is the root of 'x^2 - 2'; here's a number which is '0.5*the circumference of a circle of radius 1'"?

So we invent decimal notation. Great, pi = 3.1415... Now i can easily see and remember that sqrt(2) < pi < 4, and i can easily keep track of how far apart they are. This seems pretty concrete, right? Seems to be connected to reality? Of course, you can't actually write out pi this way. But you can get closer and closer... and if you could go on like that forever, you'd reach pi. So, great, for all practical purposes the decimals can represent everything. In fact, let's just redefine everything in terms of decimals and limits.

It turns out that if you add limits to a number line to produce a new number line (called the "completion"), then you get the real numbers whether you start with the rationals or the decimals (see http://en.wikipedia.org/wiki/Construction_of_the_real_numbers ). So, it seems pretty sensible to have our number line be the reals.

The theory of the reals is closely related to http://en.wikipedia.org/wiki/Second-order_arithmetic and http://en.wikipedia.org/wiki/Mathematical_analysis (the word "analysis" is applied to other 'realish' things too such as http://en.wikipedia.org/wiki/Analytic_function ).

Right? That's connnected to reality, right?

Well, maybe not. Once you have the reals you get all sorts of non-intuitive results. You have uncountable cardinality; an infinity of an infinity. In fact, because the cardinality of finite strings over a finite alphabet is countable, and because any actual written description of a real number is of this form, this means that most real numbers cannot even be described. Consideration of basic properties of cardinality, given that uncountable cardinality exists, leads to controversies such as the axiom of choice and the continuum hypothesis. The reals lead to all sorts of counterintuitive results (todo, give some examples). Today, the most popular set theory devised to deal with them (ZFC) hasn't been proved consistent and past versions had to be revised a few times due to all sorts of paradoxical inconsistencies, giving rise to the fear that if it looked good but turned out to be wrong before, and if we can't prove it's consistent now, then maybe it's still wrong.

Maybe things are just complicated. Or, maybe we made a mistake when we decided to think about infinite objects as if they were real, and we should just throw out all that. Maybe since something infinite is impossible, it's no wonder that we get all sorts of paradoxes when we analyze them. This viewpoint is finitism.

But dang it, it's easier to write things using decimal notation, and easier to prove things using limits, and it's easier to THINK about trancendentals as limits of decimal expansions. Maybe, really, all there are are finite things, but infinities and limits and such can be a convenient notational shorthand for verbose statements about finite things. Maybe infinities are just notational plumbing for thoughts about finite things, and it'll be complicated to work out this plumbing, but once we do, it'll be done and we can go back to studying finite things, and we'll have a guarantee that we can use the language of infinity, while being confident that philosophical controversies about the nature of infinity won't change mathematical results about finite things (see http://en.wikipedia.org/wiki/Conservative_extension ).

So that's where we were sometime in the early 1900s. To skip ahead and tell you where we are now, this programme was first incarnated as "Hilbert's programme", which was shown to be unachievable; but some people are still working on a number of distinct, distant variants of it; as one might suspect, even understanding what those variants are and what the differences between them are, is very complicated; however, if one of them eventually succeeds and takes over in the distant future, maybe it'll make mathematics simpler for future mathematicians by not having to constantly deal with all these paradoxes caused by infinities.

### back to: Motivation by simplicity

In the classical approach, difficult-to-understand concepts end up leaking into "ordinary" math and forcing every serious student of mathematics to come to grips with them. In some sense, this means that in order to do mathematics the conventional way, you have to spend a lot of time understanding theorems about "impossible" things.

You can see how one might conclude that these concepts might be merely artifacts of our way of talking about math, and that given this doubt, it would be better to restructure mathematical discourse so that you don't have to think about them very much unless you want to. In other words, I am proposing the without taking a position on the correctness of alternative mathematical foundations, one might want them for the practical reason of simplifying the learning of mathematics (since mathematics is too large to be learned by any one person in one lifetime, and hence professional mathematicians are always learning, simplying the learning of mathematics should be useful not only to undergraduates but also professional mathematicians).

On the other hand, if the object is only to save the mathematician from having to waste time dealing with technical artifacts in the course of ordinary mathematics, then this is often frustrated in that the actual mechanics of various alternative mathematics often turn out to be more complicated than classical mathematics. E.g. often there is a non-constructive proof which is shorter than any known constructive proof of the same proposition. This is to be expected, because alternative foundations are often the same as classical foundations, but with some restrictions on reasoning. If one's motivation for alternative mathematical foundations, however, is a firm belief that classical foundations are unjustified, then the 'easier' proof from classical mathematics is just wrong, so there is no choice but to take the more difficult alternative rounte.

But continuing down the path of alternative foundations as a pedagogical tool, this statement from Wikipedia is relevant; "More formally, Hilbert believed that it is possible to show that any theorem about finite mathematical objects that can be obtained using ideal infinite objects can be also obtained without them. Therefore allowing infinite mathematical objects would not cause a problem regarding finite objects. This led to Hilbert's program of proving consistency of set theory using finitistic means as this would imply that adding ideal mathematical objects is conservative over the finitistic part. Hilbert's views are also associated with formalist philosophy of mathematics. Hilbert's goal of proving the consistency of set theory or even arithmetic through finitistic means turned out to be an impossible task due to Kurt Gödel's incompleteness theorems. However, by Harvey Friedman's grand conjecture most mathematical results should be provable using finitistic means." More information on Friedman's "grand conjecture" can be found at http://en.wikipedia.org/wiki/Grand_conjecture#Friedman.27s_grand_conjecture .

From here on in, we'll just use the phrase "crazy things" to refer to those objects and styles of proof that a restricted alternative foundation wants to rule out.

In other words, Hilbert hoped to find a subset of mathematics which talked about only non-crazy things and which could be verified without talking about crazy things and which was sufficient to do all ordinary mathematics. Godel proved that this wasn't possible; that any reasonable candidate for doing ordinary mathematics cannot be formally verified without talking about crazy things.

Even if ALL important mathematics cannot be done without talking about crazy things, there is still the hope that MOST of it can, and that a strict line can be drawn between the crazy parts and the other parts. That is, there is a hope that for pedagogical purposes, all concepts and theorems can be classified and clearly labeled in all literature as "definitely non-crazy" or "possibly crazy", and most students (including most professional mathematician) can learn the non-crazy stuff, and skip the possibly crazy stuff, unless they are in the mood and have some extra time for craziness.

### Some alternative mathematical foundations

todo

#### Zermelo set theory

https://en.wikipedia.org/wiki/Zermelo_set_theory

According to https://en.wikipedia.org/wiki/Zermelo%E2%80%93Fraenkel_set_theory , one of the objections to Zermelo's original set theory axiomitization was that it couldn't prove the existence of the cardinal Aleph-w, or of the set {Z, P(Z), P(P(Z)), ...}. Perhaps those with finitist leanings should investigate Zermelo's original theory. According to https://en.wikipedia.org/wiki/Zermelo%E2%80%93Fraenkel_set_theory#Criticisms , all of mainstream mathematics can be done in Zermelo's theory plus AC.

#### New Foundations

According to https://en.wikipedia.org/wiki/Zermelo%E2%80%93Fraenkel_set_theory#Criticisms , the axioms of New Foundations admit the existence of a universal set.

#### univalent foundations from homotopy type theory

http://homotopytypetheory.org/book/

#### Limit Computable Mathematics

Like intuitionism except that the excluded middle is allowed for \Delta_0^2 formulas.

#### Arithmetics

not really foundational, in a sense, since in logic we expect to be able to go to higher orders

todo talk about Peano, PRA, EFA, etc

http://mathoverflow.net/questions/36580/is-all-ordinary-mathematics-contained-in-high-school-mathematics

"Finite automata can decide Presburger arithmetic formulae" -- http://cstheory.stackexchange.com/questions/14811/what-is-the-enlightenment-im-supposed-to-attain-after-studying-finite-automata/14818#14818

#### Other notions of well-behavedness

Working in the continuum, https://en.wikipedia.org/wiki/Smooth_function s seem to be a logic endpoint of avoiding monstrosities such as the everywhere continuous but nowhere differentiable https://en.wikipedia.org/wiki/Weierstrass_function .

But relating this to countable stuff, it would seem that e.g. a piecewise linear function is a simple object, but this may not be smooth. So we have some trouble bridging the continuous and discrete worlds in a sensible manner.

Similarly, sets of measure zero seem offensive in the setting of the continuum; but if one is thinking discretely, then it's the sets NOT of measure zero which are problematic.

Perhaps there is some sort of thing like a homomorphism connecting discrete and continuous mathematics that formalizes this sort of thing?

#### Various constructions of the reals

ok these arent really alternative foundations in the same sense but it's past 3am and i just want to write down a note or two before i go to sleep.

one can construct the real starting with the naturals, then passing to the integers, then to the rationals, then to the algebraic numbers, and then to the definable trancendentals, then we say geez can't we find a representation that would let us easily and intuitively compute which of two of these numbers is bigger, and then we get the decimal expansion, and then if we allow our decimal expansions to be of countably infinite length (or taking the limits of other things), we get the reals.

one can also construct the reals starting with via the surreal route. After finite 'generation' steps we get the dyadic rationals, and then if we allow countably infinite 'generation' steps, we get the reals.

Note that before we get to the reals, the surreals cannot give us even simple rationals like 1/3.

This might be taken as an argument for the reality of the reals, since the surreal construction seems so sensible, as does the other progression. On the other hand, it might be taken as an argument against the reality of the reals, because it takes two constructions that are so obviously different and makes them behave the same.

Another point against the reality of the reals is that there are so many reals that not only are most of them not computable, most of them are not even definable.

#### Suspicions of power sets

"The predicativist will object that the power-set operation is not meaningful. And even the classical mathematician will admit that there are some obscure features of the power-set operation. The independence of the continuum hypothesis illustrates one such obscurity." -- http://plato.stanford.edu/entries/logic-higher-order/

http://plato.stanford.edu/entries/set-theory-constructive/#ResPow

https://en.wikipedia.org/wiki/Controversy_over_Cantor%27s_theory

#### todo...

i guess what really bothers me is indefinables, that is, classes for which you would need infinite amounts of information in order to be able to distinguish each of their members. but i guess that's the same as saying that i don't like the cardinality of the continuum, Beth-2?

if one doesn't like the idea of considering an aggregate as a thing, even for the purpose of proving properties about all of its members or about the existence of some member, just because one can't reach, with finite information, every member of that aggregate, then i probably don't like higher-order logic, either. but that seems problematic, because then how do you do metamathematics, e.g. proofs about the properties of predicates that might exist over specific kinds of universes?

i have a vague intuition that what i want is to say something like 'predicates are real things only in respect to this finite system that we have for talking and reasoning about them'. Perhaps i need to learn more about Kleene realizers.

also, should learn what the definitions of formalist and realism are w/r/t math. i guess one opinion is that e.g. uncountable cardinalities are 'real' in the sense that they are specific mechanisms in our formal system and that those mechanisms have real logical and computational consequences as to how the rest of our system works, but they do not denote actual numbers in the same way that finite numbers, and probably countable cardinalities, do. so e.g. independence results are GREAT, because they show where the consequences of the part of our system that we consider to be right for intuitive reasons peter out; beyond this point, the system does not say whether certain propositions about these weird objects are true or false, and that's fine, because we have no reason for believing that those objects are real objects and not just artifacts of our formal system, anyhow; and it's great because then we don't have to bother with that stuff.

i take the same attitude towards complex and even non-positive numbers and i suppose towards the decimal expansions of trancendentals. Those things aren't really numbers in the sense of "here are two sheep", but they are clever ways of stuffing some useful logical computation into our notation that let's us not have to hold so much in our mind as we do math, thereby allowing us to go further, faster, than we otherwise could given the pathetically small number of things we can hold in our mind simultaneously.

i'm guessing that's the prototypical formalist point of view but i'm not sure. it's certainly not realist.

see http://plato.stanford.edu/entries/formalism-mathematics/ . A quick skim of that convinces me that my position is probably Hilbertian formalism. See http://plato.stanford.edu/entries/hilbert-program/ .

One nuance of my position is that i don't really seem to have a problem with countable transfinite ordinals. For example, if you took an infinite sheet of graph paper and considered all of its grid points, these are points of the form of pairs (x,y) where x and y are integers. If you want to have a total ordering on these pairs, you need w^2 (i think), which is a transfinite ordinal bigger than mere w. But as we see from this example, w^2 isn't so crazy.

Just as various proofs that people give that seem intuitively reasonable formally require induction up to w, various proofs that people give that seem intuitively reasonable formally require transfinite induction on transfinite but countable ordinals. This makes me think that i'm cool with transfinite induction on countable ordinals, and maybe even some form of countably infinitary logic.

It is unclear to me whether this feeling of mine extends to all countable ordinals or only to some of them. I should take a closer look at Feferman's predicativity programme which appears to be based on going up to the Feferman–Schutte? ordinal only. Feferman has talked about his programme as 'Hilbert's Program Relativized', which seems to be what i'm aiming at here. I am not positive that Feferman's ideas haven't evolved over time, and am not sure if all these things i say about 'feferman's programme' refer to the same programme.

mb should read http://math.stanford.edu/~feferman/papers/DraftIntroFEM.pdf which is relatively recent (2012)

#### feferman: conceptions of the continuum

" Conceptions of the Continuum • There is no unique concept of the continuum but rather several related ones. (Feferman 2009) • To clear the way as to whether CH is a genuine mathematical problem one should avoid the tendency to conflate these concepts, especially those that we use in describing physical reality. • (i)The Euclidean continuum, (ii) The Hilbertian continuum, (iii) The Dedekind real line, (iv) The Cauchy-Cantor real line, (v) The set 2 N , (vi) the set of all subsets of N, S(N). " -- http://math.stanford.edu/~feferman/papers/Conceptual_Structuralism.pdf

" Conceptions of the Continuum (Cont’d) • Not included are physical conceptions of the continuum , since our only way of expressing them is through one of the conceptions via geometry or the real numbers. • Which continuum is CH about? Their identity as to cardinality assumes impredicative set theory. • Set theory erases the conceptual distinction between sets and sequences . • CH as a proposition about subsets of S(N) and possible functions (one-one sets of ordered pairs "

" I shall be relatively brief about each of thes e since I have expanded on them in Feferman ( 2009a ). (i) The Euclidean continuum . Th e conception is semi - structural. Straight lines are conceived of as lying in the plane or in space. They are perfectly straight and have no breadth. Points are perfectly fine; they are conceived of as the objectification of pure locations. A point may lie on a line, but the line does not consist of the set of its points. The idea of arbitrary subset of a line or of the removal of some points from a line is foreign to Euclidean geometry. (ii) The Hilbertian continuum . The conception is structural: on e has points, lines, planes, incidence, betweenness and equidistance relations. Continuity is expressed set - theoretically following Dedekind. (iii) The Dedekind real line . This is a hybrid (arithmetical/geometric) notion that can be expressed in struc tural terms relative to the rationals. Continuity is expressed by the Dedekind cut condition. (iv) The Cauchy - Cantor real line . This is again a hybrid structural notion relative to Q. Continuity is expressed by the Cauchy convergence condition, used by Cantor to “construct” the reals.

17 (v) The set 2 N , i.e. the set of infinite sequences of 0’s and 1’s, i.e. all paths in the full binary tree. (vi) The set of all subsets of N, S(N). Not included here are physical conceptions of the continuum, since o ur only way of expressing them is through one of the mathematical conceptions via geometry or the real numbers. Euclidean geometry aside, s o far as CH is concerned, set theory erases the distinctions between the notions in (ii) - (vi) by identifying sequenc es with sets and by considering only the cardinality of the various continua. But there is also a basic conceptual difference between 2 N and S(N), with the elements of 2 N pictured as the result of an unending sequence of choices, 0 or 1 (left or right) wh ile the elements of S(N) are the result of sprinkling 1s through the natural numbers in some simultaneous manner and then attaching 0s to the rest . Of course, with each set is associated a sequence as its characteristic function and vice versa, but that r equires a bit of reasoning ⎯ the equivalence of these notions is not conceptually immediate. " -- http://logic.harvard.edu/EFI_Feferman_IsCHdefinite.pdf

http://mathoverflow.net/questions/125626/conceptual-structuralism-and-continuum-hypothesis

#### feferman: conceptions of sets

" Conceptions of Sets • Sets are supposed to be definite totalities, determined solely by which objects are in the membership relation (∈) to them, and independently of how they may be defined, if at all. • A is a definite totality iff the logical operation of quantifying over A, (∀x∈A) P(x), has a determinate truth value for each definite property P(x) of elements of A. •A⊆B means (∀x∈A) (x∈B) •Extensionality: A⊆B∧B⊆A→A? = B.

The Structure of “all” Sets • (V,∈), where V is the universe of “all” sets. •V itself is not a definite totality, so unbounded quantification over V is not justified on this conception. Indeed, it is essentially indefinite. •If the operation S( . ) is conceived to lead from sets to sets, that justifies the power set axiom Pow. • At most, this conception justifies KPω+Pow+AC, with classical logic only for bounded statements (Feferman, t.a.) " -- http://math.stanford.edu/~feferman/papers/Conceptual_Structuralism.pdf

#### feferman: power sets

" The Status of CH • But--I believe--the assumption of S(N), S(S(N)) as definite totalities is philosophically justified only on platonistic grounds. • From the point of view of conceptual structuralism, the conception of the totality of arbitrary subsets of any given set is essentially indefinite (or inherently vague). • For, any effort to make it definite violates the idea of what it is supposed to be. "

#### feferman on definiteness

" A proposed logical framework for what’s definite (and what’s not) According to the finitists , the natural numbers form an inc ompleted or indefinite totality and the general operations of universal and existential quantification ov er the natural numbers are not to be admitted in reasoning about them except in bounded form. The intuitionists share the finitists’ general point of view about the natural numbers but regard 23 unbounded quantification over them as meaningful though not whe n combined with classical logic. According to the predicativists, the natural numbers form a definite totality, but not the collection of “all” sets of natural numbers. Nevertheless systems of predicative analysis have been formulated using quantificatio n with respect to variables for such sets within classical logic. Finally, we have the case of set theory, in which according to practically all points of view ⎯ including that of the mathematical practitioners from the very beginning to the present as well as both the philosophical defenders and critics ⎯ sets in general form an indefinite totality, otherwise the universe of all sets would be a set and thus give rise to Russell’s Paradox. All of this poses questions about the proper formulation of axiomatic systems in which some of the domains involved are conceived to form indefinite totalities, while others, or parts of them, are considered to be definite totalities. There has been much interesting philosophical discussion in recent years about the proble ms raised by absolute gen erality (cf. Rayo and Uzquiano 2006 ) both in its unrestricted sense and more specifically as applied to set theory. " -- http://logic.harvard.edu/EFI_Feferman_IsCHdefinite.pdf

" My main concern here is rather to see what precisely can be said about certain ways of construing in formal terms the distinction between definite and indefinite concepts. One way of saying of a statement φ that it is definite is that it is true or false; on a deflationary account of truth that’s the same as saying that the Law of Excluded Middle (LEM) holds of φ , i.e. one has φ ∨ ¬ φ . Since LEM is rejected in intuitionistic logic as a basic principle, that su ggests the slogan, “What’s definite is the domain of classical logic, what’s not is that of intuitionistic logic.” In the case of predicativity, this would lead us to consider systems in which quantification over the natural numbers is governed by classic al logic while only intuitionistic logic may be used to treat quantification over sets of natural numbers or sets more generally. And in the case of set theory, where every set is conceived to be a definite totality, we would have classical logic for boun ded quantification while intuitionistic logic is to be used for unbounded quantification. The formal study of systems of the latter kind goes back at least to a Wolf(1974 ) and 24 Friedman (1980 ) and has most recently been carried on in the paper, Feferman (20 10b ). 12 The general pattern of the studies there is that we start with a system T formulated in fully classical logic, then consider an associated system SI - T formulated in a mixed, semi - intuitionistic logic, and ask whether there is any essential loss in proof - theoretical strength when passing from T to SI - T. In the cases that are studied, it turns out that there is no such loss, and moreover, there can be an advantage in going to such an SI - T; namely, we can beef it up to a semi - constructive system SC - T without changing the proof - theoretical strength, by the adjunction of certain principles that go beyond what is admitted in SI - T, because they happened to be verified in a certain kind of functional interpretation of the intuitionistic logic. For example , if one starts with S equal to the classical system KP ω of admissible set theory (including the Axiom of Infinity), the associated system SI - KP ω has the same axioms as KP ω , and is based on intuitionistic logic plus the law of excluded middle for all bound ed (i.e. Δ 0 ) formulas. But the beefed up system SC - KP ω proves the Full Axiom of Choice Scheme for sets, ∀ x ∈ a ∃ y φ (x,y) → ∃ r [Fun(r) ∧ ∀ x ∈ a φ (x, r(x))], where φ (x, y,...) is an arbitrary formula of the language of set theory. That in turn implie s the Full Collection Axiom Scheme, ∀ x ∈ a ∃ y φ (x,y) → ∃ b ∀ x ∈ a ∃ y ∈ b φ (x,y) for φ an arbitrary formula, while this holds only for ∑ 1 formulas in SI - KP ω . In addition, SC - KP ω contains some other non - intuitionistic principles that can, nevertheless, be gi ven a semi - constructive interpretation. The main result in Feferman ( 2010b ) is that all of these systems are of the same proof - theoretical strength, KP ω ≡ SI - KP ω ≡ SC - KP ω . " -- http://logic.harvard.edu/EFI_Feferman_IsCHdefinite.pdf

#### Descriptive Set Theory (DST)

"Descriptive Set Theory (DST, the study of definable sets of reals and of other topological spaces)" -- http://logic.harvard.edu/EFI_Feferman_IsCHdefinite.pdf

### Ways to choose alternative formalisms

"

5. Revised Hilbert Programs

Even if no finitary consistency proof of arithmetic can be given, the question of finding consistency proofs is nevertheless of value: the methods used in such proofs, although they must go beyond Hilbert's original sense of finitism, might provide genuine insight into the constructive content of arithmetic and stronger theories. What Gödel's result showed was that there can be no absolute consistency proof of all of mathematics; hence work in proof theory after Gödel concentrated on relative results, both: relative to the system for which a consistency proof was given, and relative to the proof methods used.

Reductive proof theory in this sense has followed two traditions: the first, mainly carried out by proof theorists following Gentzen and Schütte, has pursued a program of what is called ordinal analysis, and is exemplified by Gentzen's first consistency proof of PA by induction up to ε0. ε0 is a certain transfinite (though countable) ordinal, however, “induction up to ε0” in the sense used here is not a genuinely transfinite procedure. Ordinal analysis does not operate with infinite ordinal numbers, but rather with ordinal notation systems which themselves can be formalized in very weak (essentially, finitary) systems. An ordinal analysis of a system T is given if: (a) one can produce an ordinal notation system which mimics the ordinals less than some ordinal αT so that (b) it can be proved finitarily that the formalization TI(αT) of the principle of induction up to αT implies the consistency of T (i.e., S ⊢ TI(αT)→Con T) and (c) T proves TI(β) for all β < αT (S is a theory formalizing finitary metamathematics and is generally a weak sub-theory of T). To have any foundational significance it is also required that one can give a constructive argument for transfinite induction up to αT. As mentioned above, this was done for by Gentzen and Takeuti for ε0, the proof theoretic ordinal of PA, but becomes more difficult and of progressively questionable philosophical significance for stronger theories.

A philosophically more satisfactory continuation of Hilbert's Program in proof theoretic terms has been suggested by Kreisel (1983; 1968) and Feferman (Feferman, 1988; Feferman, 1993a). This work proceeds from a wider conception of Hilbert's Program as an attempt to justify ideal mathematics by restricted means. In this conception, the aim of Hilbert's proof theory was to show that, at least as far as a certain class of real propositions is concerned, ideal mathematics does not go beyond real mathematics. A finitary consistency proof of the kind envisaged by Hilbert would have accomplished this: if ideal mathematics proves a real proposition, then this proposition is already provable by real (i.e., finitary) methods. In a sense this reduces ideal mathematics to real mathematics. A proof-theoretic reduction of a theory T to a theory S shows that, as far as a certain class of propositions is concerned, if T proves a proposition, then S proves it too, and the proof of this fact is itself finitary. Hilbert's proof theoretic program can then be seen to be a search for a proof theoretic reduction of all of mathematics to finitary mathematics; in a relativized program one looks for reductions of theories weaker than all of classical mathematics to theories often stronger than finitary mathematics. Proof theorists have obtained a number of such results, including reductions of theories which on their face require a significant amount of ideal mathematics for their justification (e.g., subsystems of analysis) to finitary systems. (Feferman, 1993b) has used such results in combination with other results that show that most, if not all, of scientifically applicable mathematics can be carried out in systems for which such reductions are available to argue against the indispensability argument in the philosophy of mathematics. The philosophical significance of such proof theoretic reductions is currently the subject of debate (Hofweber, 2000; Feferman, 2000).

The program of so-called reverse mathematics developed by, in particular, Friedman and Simpson, is another continuation of Hilbert's program. In the face of Gödel's results showing that not all of classical mathematics can be reduced to the finitary, they seek to answer the question: how much of classical mathematics can be so reduced? Reverse mathematics seeks to give a precise answer to this question by investigating which theorems of classical mathematics are provable in weak subsystems of analysis which are reducible to finitary mathematics (in the sense discussed in the preceding paragraph). A typical result is that the Hahn-Banach theorem of functional analysis is provable in a theory known as WKL0 (for “weak König lemma”); WKL0 is conservative over PRA for Π02 sentences (i.e., sentences of the form ∀x∃yA(x, y). (See Simpson 1988 for an overview and Simpson 1999 for a technical treatment.) " -- http://plato.stanford.edu/entries/hilbert-program/#1.3

http://en.wikipedia.org/wiki/Reverse_mathematics#The_big_five_subsystems_of_second_order_arithmetic notes five subsystems, ascending in power, of second-order arithmetic (see also http://en.wikipedia.org/wiki/Second-order_arithmetic#Subsystems_of_second-order_arithmetic ). They are somewhat independent, but sometimes people implicitly refer to one to mean that one, plus the ones beneath it, which is what i will do (i think; i am copying some statements from elsewhere so perhaps i'll make a mistake). Perhaps we should choose one of those.

RCA_0 is computable mathematics, which seems philosophically pretty nice.

WKL_0 also seems feasible, "The subsystem WKL0 consists of RCA0 plus a weak form of König's lemma, namely the statement that every infinite subtree of the full binary tree (the tree of all finite sequences of 0's and 1's) has an infinite path. "

Both RCA_0 and WKL_0 have proof-theoretic ordinals w^w. This is the same as the ordinal of Primitive Recursive Arithmetic (PRA). The next one up is ACA_0, with ordinal \epsilon_0. "ACA0 allows us to form the set of natural numbers satisfying an arbitrary arithmetical formula (one with no bound set variables, although possibly containing set parameters). Actually, it suffices to add to RCA0 the comprehension scheme for Σ1 formulas in order to obtain full arithmetical comprehension.".

ACA_0 (Arithmetical comprehension axiom) is the last predicative one. ACA_0 is necessary for some elementary results that you hear about in required undergraduate analysis courses, such as Bolzano–Weierstrass? and Ascoli's theorem. ACA_0 is "conservative over Peano arithmetic for arithmetical sentences". ACA_0's ordinal \epsilon_0 is the same as Peano Arithmetic's ordinal.

The consistency of ACA_0 can be proved in the next one, ATR_0 (Arithmetical transfinite recursion). "The system ATR0 adds to ACA0 an axiom which states, informally, that any arithmetical functional (meaning any arithmetical formula with a free number variable n and a free class variable X, seen as the operator taking X to the set of n satisfying the formula) can be iterated transfinitely along any countable well ordering starting with any set." ATR_0's ordinal is \gamma_0, "the supremum of that of predicative systems.", see http://en.wikipedia.org/wiki/Feferman%E2%80%93Sch%C3%BCtte_ordinal . ATR_0 is "conservative over Feferman's system IR for Π_1^1 sentences". ATR_0 is needed to prove some theorems that i haven't heard of yet. \gamma_0 is the same ordinal as for "Martin-Löf type theory with arbitrarily many finite level universes".

Beyond that we have a system with ordinal http://en.wikipedia.org/wiki/Psi0%28Omega_omega%29 .

I wonder what ordinal goes with Coq's logic.

So, knowing nothing else but this, it seems to me that ACA_0 is a good place to start. It sounds like many people think that e_0 can be philosophically justified, and it's predicative. Maybe you'd want to go one more step, to ATR_0, to prove ACA_0.

I should learn more about the third, Kreisel-Feferman programme; see also http://math.stanford.edu/~feferman/book98.html , Feferman, Solomon, 1993a, “What rests on what? The proof-theoretic analysis of mathematics”, in Philosophy of Mathematics. Proceedings of the Fifteenth International Wittgenstein-Symposium, Part 1, Johannes Czermak, ed., Vienna: Hölder-Pichler-Tempsky, 147-171, Feferman, Solomon, 1993b, “Why a little bit goes a long way: Logical foundations of scientifically applicable mathematics”, PSA 1992, 2: 442-455. Reprinted in Feferman (1998, Ch. 14, 284-298).

... in http://web.archive.org/web/20120206233919/http://math.stanford.edu/~feferman/papers/psa1992.pdf , Feferman talks about a system W which is at least as powerful at ACA_0.

Some advocate using classical logic for quantification over natural numbers, but intuitionistic logic for quantification over sets of natural numbers: http://web.archive.org/web/20111219144534/http://math.stanford.edu/~feferman/papers/whatsdef.pdf (semi-intuitionism, semi-constructivism)

In http://web.archive.org/web/20111130031429/http://math.stanford.edu/~feferman/papers/newaxioms.pdf , Feferman advocates using classical mathematics practically, but from a philosophical point of view notes that all mathematics that he's found to be applicable to science, with the exception of certain speculative approaches to the foundations of quantum field theory, are conservative extensions of PA (i think, not sure if i understood).

Feferman presents another way of looking at finitism in http://web.archive.org/web/20111219144522/http://math.stanford.edu/~feferman/papers/UnfoldFA.pdf , whose power is equivalent to PA or PRA depending on some choices.

Feferman is working on a book which appears to be his current-best-proposal for metamathematics on mathematical foundations: math.stanford.edu/~feferman/papers/DraftIntroFEM?.pdf‎

http://www.ltg.unibe.ch/research/Foundations%20of%20Explicit%20Mathematics

An interesting "natural" theorem that is unprovable using just Peano arithmetic: http://en.wikipedia.org/wiki/Goodstein%27s_theorem

I am unclear about the meaning of impredicative. As soon as one moved beyond computable functions, one has uncomputable functions, so i assume that somewhere around here you get the halting function. Can you only construct halting function-like things in theories beyond \Gamma_0? Or is the circularity needed to construct the halting function a different thing than whatever makes people say that theories with proof-theoretic ordinals beyond \Gamma_0 are impredicative? TODO TOASK

I am also unclear about diagonalization. Are any of the is the "big five" subsystems of second order logic mentioned above powerful enough to prove the existence of an uncountable cardinal? (according to http://mathforum.org/kb/message.jspa?messageID=7918208 , yes, even RCA_0 is)

### Some theories, their completeness, and what is needed to prove their consistency

http://en.wikipedia.org/wiki/G%C3%B6del%27s_first_incompleteness_theorem#Second_incompleteness_theorem shows that no consistent theory at least as powerful as first-order logic plus Peano arithmetic can be complete.

http://en.wikipedia.org/wiki/G%C3%B6del%27s_second_incompleteness_theorem#Second_incompleteness_theorem shows that no consistent theory at least as powerful as first-order logic plus Peano arithmetic can prove its own consistency (I think).

http://en.wikipedia.org/wiki/Gentzen%27s_consistency_proof proves the consistency of Peano Arithmetic (PA). It requires a theory which is neither stronger nor weaker than Peano arithmetic, namely Primitive Recursive Arithmetic (PRA) plus the axiom of quantifier-free transfinite induction up to \epsilon_0.

### Questions

#### Omega rule

http://mathoverflow.net/questions/25430/can-goodsteins-theorem-been-proven-with-first-order-pa-constructive-omega-rul

http://m-phi.blogspot.com/2011/03/completeness-of-pa-with-omega-rule.html says that Peano Arithmetic "closed" under the omega-rule is complete. I don't understand how that squares with Godel. Is it inconsistent? Does this "closing" operation not lead to a real theory? TODO TOASK

"One idea proposed by Stephen Read and Alan Weir (among others) is to enrich the proof-system with non-recursive rules (specifically, the omega-rule), so as to avoid Gödel’s result. However, it strikes me as unclear what the consequences of adding infinitary rules will be for inferentialist semantics." -- http://inferential.wordpress.com/2008/02/11/carnap-and-gentzen/#comment-316 " Carnap adds an ω-rule, but it is one of his consequence rules, not his derivation rules. They are in good company I guess."

so omega rules are exempt from Godel because they allow infinitely long propositions? See also http://philosophy.stackexchange.com/questions/7201/what-happens-when-we-drop-the-condition-that-proofs-are-finite-strings-of-infere http://www.hss.cmu.edu/philosophy/techreports/34_Ignjatovic.pdf

## Type theory and programming languages

Type theory started out in mathematical logic as a response to Russell's paradox.

Russell's paradox is: "Consider the set of all sets that do not contain themselves. Does this set contain itself?". If it does contain itself, then by it's own definition, it should not contain itself. If it does not contain itself, then by it's own definition, it should contain itself. Therefore such a set ("the set of all sets that do not contain themselves") cannot exist.

In order to avoid this and similar paradoxes, type theory was created, which was a theory that roughly speaking stratifies objects into levels ("types"), and requires that the definition of an object on a given level only refer to lower levels. Then you say that any putative object that doesn't follow the rules is not allowed to be introduced into formal discussion (i.e. such an object is considered to be just a trick of English language that doesn't really exist). For example, the "set of all sets that do not contain themselves" would break the rules and so isn't allowed to be discussed. Now you can avoid the paradox.

There are many different type theories.

The way that type theories tends to work is to present some fundamental types, and also some operations which you are allowed to apply to any type that you already have; these operations produce new types. By this means, starting with the fundamental types and iteratively applying the operations, you can generate lots of types. In addition to the language of types, there is usually an object language by which objects can be constructed, and some correspondence between the two. For any given term in the object language, in order to see if it is admissible as a 'real object', you try to find a corresponding type which can be legally constructed in the type theory. If you can find such a type, then the term is called 'well-typed'. If there is no such type, then the term is called 'ill-typed'. For instance, the term "the set of all sets that do not contain themselves" would be ill-typed in most type theories (however, often there is a term of the form "the ___ of all sets that do not contain themselves"; usually ___ is taken to be 'class', the idea being that you cannot have a SET of all sets that do not contain themselves, because that quantifies over things of the same level as itself, but you can have a collection of all sets that do not contain themselves, provided that this collection is not itself a set).

(many type theories get more detailed than just "levels", e.g. in many programming languages you can give a function a type which indicates that it can only be called on strings, not on booleans.)

#### The Curry-Howard isomorphism

We said that type theories typically present some fundamental types, and also some operations which you are allowed to apply to any type that you already have; these operations produce new types.

This is similar to the way that axiomatic systems of logic typically work; you have some fundamental propositions (axioms or assumptions), and also some inference rules that you are allowed to apply to any propositions that you already know to be true, to generate new propositions from the ones you already know to be true.

So say you take a given type theory. Can you make a logic whose inference rules work exactly analogously to the type theory's operations? If you do, will that logic make any sense and do anything interesting?

#### The Curry-Howard isomorphism

(i'm not sure that i'm understanding this section correctly, as Curry-Howard is usually described as a correspondence between PROGRAMS and proofs, not type theory and proofs; but the proposition proved by a program is the type of the program, so maybe these are essentially the same as what i say below?)

In fact in many cases the answer is yes, and the logic that you end up with often turns out to be a standard logic that was already known.

So in many cases, for a given type theory, the application of particular type theory operations to types can be formally matched up with particular inference rules in some logic. This means that the resulting types are matched up with proofs.

The Curry-Howard isomorphism proved that such a correspondence exists between the types in simply typed lambda calculus, and intuitionistic logic. Later, other such isomorphisms were shown, matching other type systems with other logics, and the term 'Curry-Howard isomorphism' is often used not just for the particular correspondence between simply typed lambda calculus and intuitionistic logic, but also for the general idea of such correspondences.

I'm not sure about this, but i think that the logic that is matched with a given type system using a Curry-Howard isomorphism does not specify the procedure used for type inference or type checking in a programming language using that type system; proofs in the logic correspond only to the existence of a particular type; further thought may be required to determine which type a term in a programming language has. Furthermore, finding the existence of a type corresponds to proving a particular theorem using that logic; the tactics used in proving that theorem may be more intelligent than using brute force to enumerate all possible proofs. For example, if i write "lambda x: x x", that might be ill-typed in many type systems, but i don't see how that is immediately apparent from the rules of a typical type theory; you have to prove a negative, that is, that there is no sequence of the type operations which arrives at that object term. I'm guessing that the algorithm for finding out that that is ill-typed must be separately discovered even after you know the rules of the logic corresponding to the type theory.

It may be interesting to take as one's foundation for mathematics a logic and also the type theory corresponding to that logic.

To what extent the philosophical import of a given type system corresponds to the philosophical import of its corresponding logic is something that I'm unclear about. To date, i am aware of no such correspondence. However, even without any philosophical meaning, the correspondence is certainly technically useful in that one can translate theorems from metalogic into type theory.

#### Dependent types

todo inductive types e.g. the type of lists from Vempty and Vcons

### Some type theories (and their logics)

#### Simply typed lambda calculus

Intuitionistic propositional logic ?

#### Martin-Lof type theory

Intuitionistic first-order logic ?

(note: "first-order logic" == "predicate logic")

#### System F

Second-order typed lambda calculus

Like simply typed lambda calculus, but you can quantify over types.

Polymorphic types in programming languages.

Fragment of second-order intuitionistic logic with no existential quantification.

#### System F_w

(is this a superset of System F?)

Adds type operators, i.e. functions from types to types

#### System F_<

System F plus subtyping.

#### \lambda HOPL

higher-order predicate logic as a type system

#### Calculus of constructions

Intuitionistic logic + System F_w

impredicative

an embedding of higher-order predicate logic; not isomorphic; there are terms in CIC that do not have a meaning in higher-order predicate logic, and there are propositions which are not provable in higher-order predicate logic, but which have corresponding types in CIC. i don't understand this. see http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.35.3947

Coq: a theorem prover that can represent part of Calculus of Constructions, with some restrictions on inference to make things more tractable?

## Model theory

A model of a theory is a collection of objects that obeys the theory. Often for a theory there is the informal notion of a 'standard model', that is, the model that everyone at first assumes that the theory is supposed to be about. Often, one might be surprised to discover that a theory that you thought was about its standard model also describeds various non-standard models. For example, the standard model of Peano arithmetic is the natural numbers; but there are also non-standard models which contain additional 'numbers' greater than any natural number (see http://en.wikipedia.org/wiki/Non-standard_model_of_arithmetic ).

Model theory studies the models of theories. Model theory appears to contain more than that these days; see http://en.wikipedia.org/wiki/Model_theory

## Math

More than most fields, mathematicians define themselves by their discipline rather than their topic. A proposition is not considered to be mathematically known until it has been proven. Since people employed as academics are judged by the production of new knowledge, mathematicians see their business as proving things. Therefore, mathematics has come to be seen not as the study of numbers, but as the study of things which can be proven. This leads to the curious result that a person who spent their time learning all about numbers, on reading facts about numbers asserted by others, and on thinking about numbers, but not on proving anything or on reading others' proofs, would not be considered a mathematician by many mathematicians, even though most other human beings would call em a mathematician (and according to most dictionary definitions, such a person would be a mathematician).

It should be noted that the identification of mathematics with things that can be proven, where "proven" is identified with "an irrefutable argument", would be objected to by some philosophers, who believe that irrefutable arguments exist about certain philosophical topics which are not in the form of math-style proofs (perhaps "I think therefore I am" is one, or more to the point "You think, therefore you are"; philosophically speaking, it is clear to anyone able to understand that sentence that they think, therefore it follows that they exist; but by the standards of a mathematical proof, no justification has been provided for the premise "you think", and so "you are" has only been proven contingent on the acceptance of the axiom "you think").

## Universal algebra

todo

relations

functions

arity

infix notation for binary functions

boolean functions

sets, characteristic functions (mb this is not exactly universal algebra)

injective, surjective, bijective

image

inverse

invertible

inverse image

monotone functions

equivalence relation

equivalence classes

total order

partial order

hasse diagram

preorder

graph

DAG

tree

isomorphism

automorphism

homomorphism

quotient

endomorphism

embedding

inclusion maps

cover

direct products

semilattices

lattices

galois connection

## Algebra

groups

cyclic groups

the symmetric groups

fields

tarski/dedekind's high school algebra axioms (and the tarski high school algebra problem)

linear algebra (and modules)

example of noncommutative yet associative operations: rotations, string concatenation

example of nonassociative operation: subtraction

example of commutative yet nonassociative operation: "Consider the set of all points on a line (or a plane, or 3-dimensional space --- it won't matter for the example) and let the operation send each pair (P,Q) of points to the midpoint of the segment PQ" -- http://math.stackexchange.com/questions/327422/is-all-algebraic-commutative-operation-always-associative

## Algebraic topology

### Homotopy

"Homotopy of two continuous mappings f,g: X -> Y. A formalization of the intuitive idea of deformability of one mapping into another." -- http://www.encyclopediaofmath.org/index.php/Homotopy

"In topology, two continuous functions from one topological space to another are called homotopic if one can be "continuously deformed" into the other, such a deformation being called a homotopy between the two functions.

"The two dashed paths shown above are homotopic relative to their endpoints. The animation represents one possible homotopy." -- http://en.wikipedia.org/wiki/Homotopy

"More exactly, two mappings and are called homotopic (denoted by ) if there exists a family of continuous mappings f_t : X -> Y, continuously depending on a parameter t \in [0,1], such that f_0 = f, and f_1 = g." -- http://www.encyclopediaofmath.org/index.php/Homotopy

### Topological equivalence

"Consider a world where objects are made of elastic rubber. Two objects are considered equivalent if they can be deformed into each other without tearing the material. If such a transformation between X and Y exists, we say they are topologically equivalent..." -- http://www.ayasdi.com/_downloads/Topological_Analysis_of_Population_Activity_in_Visual_Cortex.pdf

### Betti numbers

"...it is evident that a possible reason for two objects not to be equivalent is that they differ in the number of holes. Thus, simply counting holes can provide a signature for the object at hand. Holes can exist in different dimensions. A one-dimensional hole is exposed when a one-dimensional loop (a closed curve) on the object cannot be deformed into a single point without tearing the loop. If two such loops can be deformed into one another they define the same hole, which should be counted only once. Analogous definitions can be invoked in higher dimensions. For example, a two-dimensional hole is revealed when a closed two-dimensional oriented surface on the object cannot be deformed into a single point.

This notion of counting holes of different dimensions is formalized by the definition of Betti numbers. The Betti numbers of an object X can be arranged in a sequence, b ( X )=( b 0 , b 1 , b 2 , I ), where b 0 represents the number of connected components, b 1 represents the number of one- dimensional holes, b 2 the number of two-dimensional holes, and so forth. An important property of Betti sequences is that if two objects are topologically equiv- alent (they can be deformed into each other) they share the same Betti sequence. One must note, as we will shortly illustrate, that the reverse is not always true: two objects can be different but have the same Betti sequence. " -- http://www.ayasdi.com/_downloads/Topological_Analysis_of_Population_Activity_in_Visual_Cortex.pdf

## computability theory

total function vs (non-total) partial function

computable vs enumerable vs not enumerable

primitive recursion, mu-recursion/turing machines/lambda calc/etc

universality, interpreters

various machine models, polynomial time emulators

halting problem

## basic proof techniques

modus ponens/syllogism

direct proof

proof by cases

proof by contrapositive

proof by cancellation

proof by symmetry

how to prove the nonexistence of something: show that every possibility has a set of properties that together preclude them from being the target contradiction diagonalization

(algebra)

## godel's incompleteness

didn't we already do this above?

## transfinite induction

over well-ordered sets

complete/strong induction

and extending it even to uncountables

## boolean logic

and, or, not nand/nor

refer to http://boole.stanford.edu/cs353/handouts/book3.pdf for Post's theorem: (Post) A necessary and sufficient condition that a set of operations form a basis for the nonzeroary Boolean operations is that it contain operations that are respectively nonaffine, nonmonotone, nonstrict, noncostrict, and nonselfdual

todo

## analysis

constructions of the real numbers

continuity

what else?

derivatives

integrals

todo

## multivariable calc

change-of-variables, hessian

## polynomials

roots closed form solution thru quartics non-closed form solution for some quintics

taylor series expansion

## exponentiation

logs

the approximation for log(1+x) for small x

properties of e

## complex numbers

intution behind e^(i*pi) = -1

## probability

pdf

todo

can't really summarize here because involves a lot of thinking about the translation between the real world and math (word problems)

## statistics

todo

can't really summarize here because involves a lot of thinking about the translation between the real world and math (word problems)

## machine learning

supervised learning

unsupervised learning

clustering

dimensionality reduction

semisupervised learning

todo

## neuro

some parts of the brain and what they might do

neurons and spikes

todo

## history

summarize

http://www.waitbutwhy.com/2013/08/putting-time-in-perspective.html

and

http://www.hyperhistory.com/online_n2/History_n2/a.html

and a history of ideas

## futurology

demographics

economics

sigmoids

tech: things take longer than you think, but other things you didn't think of pop up (find that quote)

what else? i dont know much futurology

https://en.wikipedia.org/wiki/Portal:Contents/Outline_of_knowledge

https://en.wikipedia.org/wiki/Portal:Contents

https://en.wikipedia.org/wiki/Category:Outlines