proj-oot-ootNotes17

---

one thing to do is to look at frequent words and patterns in existing code corpora in order to see what is common, so that we can consider thinking about these things as fundamental, and also, more pedestrianly, we can optimize for making them easy to read and write in oot:

text mining on Java source code. Most frequent lexemes: "The most commonly occurring word is the + operator (305,685) followed by the scoping block (295,726) and the = operator (124,813). If we exclude operators and scoping blocks from our analysis, the most frequent words are public (124,399), if (119,787), and int (108,709). The most common identifier (the programming language equivalent of a lexical word in natural language as discussed in Section 3.2), is String. It is the ninth most frequently occurring word overall with 71,504 occurrences. This pseudo-primitive type in Java is a special case of a non-primitive that has nearly achieved primitive status in the language and may well do so in either a future version of Java or a derivative language it spawns. The next three most frequent lexical words are length (19,312), Object (18,506), and IOException (11,322)." -- http://flosshub.org/sites/flosshub.org/files/21st-delorey.pdf
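a minimal sketch of this kind of lexeme counting, for trying it on other corpora. Assumptions: it counts tokens in Python source (via the standard tokenize module) rather than Java, and the function name lexeme_counts and the sample string are made up for illustration; it is not the method used in the quoted study.

```python
import io
import tokenize
from collections import Counter

# Token types that carry no lexical content of their own.
_SKIP = {tokenize.COMMENT, tokenize.NL, tokenize.NEWLINE,
         tokenize.INDENT, tokenize.DEDENT, tokenize.ENDMARKER}

def lexeme_counts(source: str) -> Counter:
    """Count how often each lexeme occurs in one Python source string."""
    counts = Counter()
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type not in _SKIP:
            counts[tok.string] += 1
    return counts

# Hypothetical three-line sample; a real corpus would sum Counters over files.
sample = "x = 1\nif x:\n    x = x + 1\n"
counts = lexeme_counts(sample)
for lexeme, n in counts.most_common(3):
    print(lexeme, n)
```

Counters add with +, so summing lexeme_counts over every file in a corpus gives corpus-wide frequencies.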

" Top Idioms

Figure 6 shows the top idioms mined in the Library data set, ranked by the number of files in the test sets where each idiom has appeared in. The reader will observe their immediate usefulness. Some idioms capture how to retrieve or instantiate an object. For example, in Figure 6, the idiom 6a captures the instantiation of a message channel in RabbitMQ, 6q retrieves a handle for the Hadoop file system, 6e builds a SearchSourceBuilder in Elasticsearch and 6l retrieves a URL using JSoup. Other idioms capture important transactional properties of code: idiom 6h demonstrates proper use of the memory-hungry RevWalk object in JGit and 6i is a transaction idiom in Neo4J. Other idioms capture common error handling, such as 6d for Neo4J and 6p for a Hibernate transaction. Finally, some idioms capture common operations, such as closing a connection in Netty (6m), traversing through the database nodes (6n), visiting all AST nodes in a JavaScript file in Rhino (6k) and computing the distance between two locations (6g) in Android. The reader may observe that these idioms provide a meaningful set of coding patterns for each library, capturing semantically consistent actions that a developer is likely to need when using these libraries. In Figure 7 we present a small set of general Java idioms mined across all data sets by Haggis. These idioms represent frequently used patterns that could be included by default in tools such as Eclipse’s SnipMatch [43] and IntelliJ’s live templates [23]. These include idioms for defining constants (Figure 7c), creating loggers (Figure 7b) and iterating through an iterable (Figure 7a).

Figure 6: Top cross-project idioms for Library projects (Figure 4). Here we include idioms that appear in the test set files. We rank them by the number of distinct files they appear in and restrict into presenting idioms that contain at least one library-specific (i.e. API-specific) identifier. The special notation $(TypeName) denotes the presence of a variable whose name is undefined. $BODY$ denotes a user-defined code block of one or more statements, $name a freely defined (variable) name, $methodInvoc a single method invocation statement and $ifstatement a single if statement. All the idioms have been automatically identified by Haggis.

(a) channel=connection.createChannel();

(b) Elements $name=$(Element).select($StringLit);

(c) Transaction tx=ConnectionFactory.getDatabase().beginTx();

(d) catch (Exception e){ $(Transaction).failure(); }

(e) SearchSourceBuilder builder=getQueryTranslator().build($(ContentIndexQuery));

(f) LocationManager $name = (LocationManager)getSystemService(Context.LOCATION_SERVICE);

(g) Location.distanceBetween($(Location).getLatitude(), $(Location).getLongitude(), $...);

(h) try { $BODY$ } finally { $(RevWalk).release(); }

(i) try { Node $name=$methodInvoc(); $BODY$ } finally { $(Transaction).finish(); }

(j) ConnectionFactory factory = new ConnectionFactory(); $methodInvoc(); Connection connection = factory.newConnection();

(k) while ($(ModelNode) != null){ if ($(ModelNode) == limit) break; $ifstatement $(ModelNode)=$(ModelNode).getParentModelNode(); }

(l) Document doc=Jsoup.connect(URL).userAgent("Mozilla").header("Accept","text/html").get();

(m) if ($(Connection) != null){ try { $(Connection).close(); } catch (Exception ignore){} }

(n) Traverser traverser=$(Node).traverse(); for (Node $name : traverser){ $BODY$ }

(o) Toast.makeText(this, $stringLit, Toast.LENGTH_SHORT).show()

(p) try { Session session=HibernateUtil.currentSession(); $BODY$ } catch (HibernateException e){ throw new DaoException(e); }

(q) FileSystem $name=FileSystem.get($(Path).toUri(),conf);

(r) (token=$(XContentParser).nextToken()) != XContentParser.Token.END_OBJECT

Figure 7: Sample language-specific idioms. $StringLit denotes a user-defined string literal, $name a (variable) name, $methodInvoc a method invocation statement, $ifstatement an if statement and $BODY$ a code block.

(a) Iterate through the elements of an Iterator:

for (Iterator iter=$methodInvoc; iter.hasNext(); ) {$BODY$}

(b) Creating a logger for a class:

private final static Log $name=LogFactory.getLog($type.class);

(c) Defining a constant String:

public static final String $name = $StringLit;

(d) Looping through lines from a BufferedReader:

while (($(String) = $(BufferedReader).readLine()) != null) {$BODY$}

-- http://homepages.inf.ed.ac.uk/csutton/publications/idioms.pdf

One interesting observation is that 50% of Java methods are 3 lines or less. Manually inspecting these methods we find accessors (setters and getters) or empty methods (e.g. constructors).

-- http://homepages.inf.ed.ac.uk/csutton/publications/msr2013.pdf

Table 2: The attribute catalogue

Name: Formal definition

Returns void: The return descriptor is V.

No parameters: The list of parameter descriptors is empty.

Field reader: GETFIELD or GETSTATIC instruction.

Field writer: PUTFIELD or PUTSTATIC instruction.

Contains loop: Jump instructions that allow for instructions to be executed more than once in the same method invocation.

Creates object: NEW instruction.

Throws exception: ATHROW instruction.

Type manipulator: INSTANCEOF or CHECKCAST instruction.

Local assignment: One of the STORE instructions (for instance, ISTORE).

Same name call: Calls a method of the same name.

The name get is interesting because it is by far the most common one; nearly a third of all Java methods in the corpus are get-methods.

Lexicon Entries.

ACCEPT. Methods named accept very seldom read state. Furthermore, they rarely throw exceptions, call methods of the same name, create objects, manipulate state, use local variables, have no parameters, perform type-checking or contain loops. The name accept has a precise use. A similar name is visit . Generalisations of accept are handle and initialize . Somewhat related names are set , end , is and insert .

ACTION. Methods named action never call methods of the same name. Furthermore, they very often read state. Finally, they often return void, and rarely throw exceptions, have no parameters or contain loops. The name action has a precise use. Similar names are remove and add.

ADD. Among the most common method names. Methods named add often read state. Similar names are remove and action .

CHECK. Methods named check very often throw exceptions. Furthermore, they often create objects and contain loops, and rarely call methods of the same name. Unfortunately, check is an imprecise name for a method.

CLEAR. Methods named clear very often have no parameters. Furthermore, they often return void, call methods of the same name and manipulate state, and rarely create objects, use local variables or perform type-checking. A generalisation of clear is reset . A somewhat related name is close .

CLOSE. Methods named close often return void, call methods of the same name, manipulate state, read state and have no parameters, and rarely create objects or perform type-checking. A generalisation of close is validate . A somewhat related name is clear .

CREATE. Among the most common method names. Methods named create very often create objects. Furthermore, they rarely call methods of the same name, read state or contain loops.

DO. Methods named do often throw exceptions and perform type-checking, and rarely call methods of the same name. Unfortunately, do is an imprecise name for a method.

DUMP. Methods named dump never throw exceptions. Furthermore, they very often create objects and use local variables, and very seldom read state. Finally, they often call methods of the same name and contain loops, and rarely manipulate state. The name dump has a precise use.

END. Methods named end often return void, and rarely create objects, use local variables, read state or contain loops. Generalisations of end are handle and initialize . A specialisation of end is insert . Somewhat related names are accept , set , visit and write .

EQUALS. Methods named equals never return void, throw exceptions, create objects, manipulate state or have no parameters. Furthermore, they very often call methods of the same name and perform type-checking. Finally, they often use local variables and read state. The name equals has a precise use.

FIND. Methods named find very often use local variables and contain loops. Furthermore, they often perform type-checking, and rarely return void.

GENERATE. Methods named generate often create objects, use local variables and contain loops, and rarely call methods of the same name. Unfortunately, generate is an imprecise name for a method.

GET. The most common method name. Methods named get often read state and have no parameters, and rarely return void, call methods of the same name, manipulate state, use local variables or contain loops. A similar name is has . Specialisations of get are is and size . A somewhat related name is hash .

HANDLE. Methods named handle often read state, and rarely call methods of the same name. A similar name is initialize . Specialisations of handle are accept , set , visit , end and insert .

HAS. Methods named has often have no parameters, and rarely return void, throw exceptions, create objects, manipulate state, use local variables or perform type-checking. The name has has a precise use. A similar name is get . Specialisations of has are is and size . A somewhat related name is hash .

HASH. Methods named hash always have no parameters, and never return void, throw exceptions, create objects or perform type-checking. Furthermore, they very often call methods of the same name. Finally, they often read state, and rarely manipulate state or use local variables. The name hash has a precise use. Somewhat related names are has , is , get and size .

INIT. Methods named init very often manipulate state. Furthermore, they often return void, create objects and have no parameters, and rarely call methods of the same name.

INITIALIZE. Methods named initialize often return void and manipulate state, and rarely call methods of the same name or read state. A similar name is handle . Specialisations of initialize are accept , set , visit , end and insert .

INSERT. Methods named insert often throw exceptions, and rarely create objects, read state, have no parameters or contain loops. Generalisations of insert are handle , end and initialize . Somewhat related names are accept , set , visit and write .

IS. The third most common method name. Methods named is often have no parameters, and rarely return void, throw exceptions, call methods of the same name, create objects, manipulate state, use local variables, perform type-checking or contain loops. The name is has a precise use. Generalisations of is are has and get . Somewhat related names are accept , visit , hash and size .

LOAD. Methods named load very often use local variables. Furthermore, they often throw exceptions, create objects, manipulate state, perform type-checking and contain loops. Unfortunately, load is an imprecise name for a method.

MAKE. Methods named make very often create objects. Furthermore, they rarely return void, throw exceptions, call methods of the same name or contain loops.

NEW. Methods named new never contain loops. Furthermore, they very seldom use local variables. Finally, they often call methods of the same name and create objects, and rarely return void, manipulate state or read state.

NEXT. Methods named next very often manipulate state and read state. Furthermore, they often throw exceptions and have no parameters, and rarely return void.

PARSE. Among the most common method names. Methods named parse very often call methods of the same name, read state and perform type-checking. Furthermore, they rarely use local variables. The name parse has a precise use.

PRINT. Methods named print often call methods of the same name and contain loops, and rarely throw exceptions or manipulate state.

PROCESS. Methods named process very often use local variables and contain loops. Furthermore, they often throw exceptions, create objects, read state and perform type-checking, and rarely call methods of the same name. Unfortunately, process is an imprecise name for a method.

READ. Methods named read often throw exceptions, call methods of the same name, create objects, manipulate state, use local variables and contain loops. Unfortunately, read is an imprecise name for a method.

REMOVE. Among the most common method names. Methods named remove often throw exceptions. Similar names are add and action .

RESET. Methods named reset very often manipulate state. Furthermore, they often return void and have no parameters, and rarely create objects, use local variables or perform type-checking. A specialisation of reset is clear .

RUN. Among the most common method names. Methods named run very often read state. Furthermore, they often have no parameters, and rarely call methods of the same name.

SET. The second most common method name. Methods named set very often manipulate state, and very seldom use local variables or read state. Furthermore, they often return void, and rarely call methods of the same name, create objects, have no parameters, perform type-checking or contain loops. The name set has a precise use. Generalisations of set are handle and initialize . Somewhat related names are accept , visit , end and insert .

SIZE. Methods named size always have no parameters, and never return void, create objects, manipulate state, perform type-checking or contain loops. Furthermore, they very seldom use local variables. Finally, they rarely read state. The name size has a precise use. Generalisations of size are has and get . Somewhat related names are is and hash .

START. Methods named start often return void, manipulate state and read state.

TO. Among the most common method names. Methods named to very often call methods of the same name and create objects. Furthermore, they often have no parameters, and rarely return void, throw exceptions, manipulate state or perform type-checking.

UPDATE. Methods named update often return void and read state.

VALIDATE. Methods named validate very often throw exceptions. Furthermore, they often create objects and have no parameters, and rarely manipulate state. A specialisation of validate is close .

VISIT. Methods named visit rarely throw exceptions, use local variables, read state or have no parameters. A similar name is accept . Generalisations of visit are handle and initialize . Somewhat related names are set , end , is and insert .

WRITE. Among the most common method names. Methods named write often return void and call methods of the same name, and rarely have no parameters. Somewhat related names are end and insert .

-- The Programmer’s Lexicon, Volume I: The Verbs

---

(at least) 3 ways to loop: while, jump, hof search function (but is that the same as while?). also collection-oriented looping is not universal for control, but does most of it. also for long-lasting things where you want to just 'loop forever and respond to events until i decide to terminate', instead of having that loop in your program, you could register with a manager that calls you upon each iteration (but the manager still has to loop) (is this really that different from a while loop?).
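a small sketch of three of these forms side by side, in Python. Assumptions: Python has no jump/goto, so that form is left out, and the names loop_while, until and Manager are invented for this note. It also makes the parenthetical concrete: both the hof and the manager still contain an ordinary loop inside.

```python
def loop_while(n):
    """Explicit while loop: count n down to zero, return the step count."""
    steps = 0
    while n > 0:
        n -= 1
        steps += 1
    return steps

def until(done, step, state):
    """Higher-order loop: apply step until done(state) holds."""
    while not done(state):          # the loop has only moved inside the hof
        state = step(state)
    return state

class Manager:
    """Event-loop style: handlers register, the manager owns the loop."""
    def __init__(self):
        self.handlers = []

    def register(self, handler):
        self.handlers.append(handler)

    def run(self, events):
        for event in events:
            # A handler returns False to deregister itself (i.e. terminate).
            self.handlers = [h for h in self.handlers if h(event) is not False]
            if not self.handlers:
                break

seen = []

def collect_until_quit(event):
    if event == "quit":
        return False
    seen.append(event)

manager = Manager()
manager.register(collect_until_quit)
manager.run(["a", "b", "quit", "c"])

collatz_end = until(lambda n: n == 1,
                    lambda n: n // 2 if n % 2 == 0 else 3 * n + 1,
                    6)
```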

---

examples of things that cannot directly be 'inlined' in some languages:

---

fleshing out the idea of a general, fundamental Search operator (generalization of a fixpoint operator) a little:

the search operator takes parameters in two stages, that is, it is a higher-order function that takes two arguments, each of which is a 'package' of functions and parameters (or it could just take a bunch of arguments, with no 'packaging'). First, it takes a group of parameters that specify a search strategy. This includes functions that say how to initialize the search's internal state, how to choose the next search position given some internal state (perhaps the previous search position and its score), and when to terminate the search. For example, by giving different functions for these inputs, you can create a depth-first search, a breadth-first search, an A* search, a fixpoint operator (terminate upon idempotency of "next search position"), or a search that quits when it hits a plateau (a near-fixpoint) even if the objective function is still moving some tiny amount.

Second, it takes another group of parameters to choose the objective function, and to set the items to be searched through, and to choose the initial location of the search.

note that this is equivalent to an OOP system where there is an abstract base class (the Search operator), and concrete subclasses (breadth-first search, depth-first search, A*, fixpoint search, a search that quits when it hits a plateau). The class defines/satisfies an interface that has one method (do_search or the like).
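a rough sketch of the two-stage operator in Python, to check that one loop really can cover these cases. Everything here is an assumption made for illustration: the names search, frontier, iterate, at_goal, at_fixpoint and at_plateau, the shape of the strategy functions (init, advance, stop), and the tiny example graph. A* is left out; it would need a priority queue ordered by score.

```python
from collections import deque

def search(init, advance, stop):
    """Stage 1: fix the strategy. Returns a function that takes the problem."""
    def run(objective, neighbors, start):
        """Stage 2: objective function, search space, initial location."""
        state = init(start)
        prev = None
        while True:
            pos, state = advance(state, neighbors)
            if pos is None:                  # strategy ran out of positions
                return None
            if stop(prev, pos, objective(pos)):
                return pos
            prev = pos
    return run

def frontier(pop):
    """Graph strategies: 'pop' decides which pending position comes next."""
    def init(start):
        return deque([start]), {start}
    def advance(state, neighbors):
        pending, seen = state
        if not pending:
            return None, state
        pos = pop(pending)
        for n in neighbors(pos):
            if n not in seen:
                seen.add(n)
                pending.append(n)
        return pos, state
    return init, advance

def iterate():
    """Fixpoint-style strategy: the next position is the single successor."""
    def init(start):
        return start
    def advance(state, neighbors):
        nxt = next(iter(neighbors(state)))
        return nxt, nxt
    return init, advance

def at_goal(prev, pos, score):
    return bool(score)

def at_fixpoint(prev, pos, score):
    return prev == pos

def at_plateau(eps):
    return lambda prev, pos, score: prev is not None and abs(pos - prev) < eps

depth_first = search(*frontier(deque.pop), at_goal)
breadth_first = search(*frontier(deque.popleft), at_goal)
fixpoint = search(*iterate(), at_fixpoint)
plateau = search(*iterate(), at_plateau(1e-9))

graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
is_d = lambda node: node == "d"
found_dfs = depth_first(is_d, graph.__getitem__, "a")
found_bfs = breadth_first(is_d, graph.__getitem__, "a")

# The objective is unused by these two stop rules, so a dummy is passed.
halved = fixpoint(lambda x: 0, lambda x: [x // 2], 40)
root2 = plateau(lambda x: 0, lambda x: [(x + 2 / x) / 2], 1.0)
```

depth-first and breadth-first differ only in which end of the pending queue is popped, and fixpoint and plateau differ only in the stop rule, which is the point of splitting the strategy out as its own stage.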

---

the example of a general fundamental Search operator shows us that, when an OOP base class's purpose is just to be Called through one primary method call, it corresponds to a higher-order function.
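a small illustration of that correspondence in Python. The class and function names (Search, FirstMatch, BestScore, first_match, run_with_function) are hypothetical; do_search is the method name suggested above. The only claim is that a one-method object and a plain function are interchangeable at the call site.

```python
from abc import ABC, abstractmethod

class Search(ABC):
    """Base class whose whole purpose is one primary method."""
    @abstractmethod
    def do_search(self, items, objective):
        ...

class FirstMatch(Search):
    def do_search(self, items, objective):
        return next((x for x in items if objective(x)), None)

class BestScore(Search):
    def do_search(self, items, objective):
        return max(items, key=objective, default=None)

# The same strategy written as a plain function.
def first_match(items, objective):
    return next((x for x in items if objective(x)), None)

def run_with_object(strategy, items, objective):
    return strategy.do_search(items, objective)

def run_with_function(strategy, items, objective):
    return strategy(items, objective)

items = [3, 8, 5, 12]
is_even = lambda x: x % 2 == 0

via_object = run_with_object(FirstMatch(), items, is_even)
via_function = run_with_function(first_match, items, is_even)
# A bound method is already a function value, so the object version
# can be handed to the function-taking caller unchanged.
via_bound_method = run_with_function(BestScore().do_search, items, lambda x: x)
```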

---

was talking about goal-oriented programming with my friend DR. I explained my idea that you could specify a goal in terms of preconditions and postconditions (eg defining a sort) and the compiler could find a subroutine to satisfy them. And then you could add time and space complexity requirements, eg "cannot require more than O(n^2) space". And then you could add time and space complexity hints, eg "i am going to read this data structure a lot but rarely write to it". DR pointed out that means giving priorities. I said the trouble with priorities is that in a formal mathematical sense if you say "maximize x at the expense of y" you might end up with some solution that is EXTREMELY costly in y for a tiny gain in x, which is usually not what humans mean when they say to another human "prioritize a over b"; DR noted this doesn't mean priority is not useful here, it just means that we are looking to explore this fuzzier definition of priority. Also i mentioned Alan Kay's phrase policy-oriented programming (or something like that). DR then pointed out that so far in this convo we have three concepts to think about for goal-oriented programming:

---

another interesting goal for a simple language is to think of what would be desired for a post-apocalyptic scenario. I think this is unlikely (even conditional upon an apocalypse, which is already unlikely), but imagine if a few individuals have working computers but only a few of them, and so much has been lost that no one has the complete toolchain, eg whatever is required to cross-compile gcc onto a new architecture; or (even less likely) imagine if gcc is available, but not the gcc source code; and (even less likely) imagine that there is no comprehensive C specification or even documentation around; in such a situation, programming language implementations would have to be re-implemented by ordinary programmers (not compiler specialists) based on what they remember about the language (they can maybe refer to some code samples from a few personal projects they happened to have on their personal machine at the time of the apocalypse). Assume that communication is initially spotty enough that there are multiple re-implementors who do not know of each other until much later. Most likely these different re-implementors would misremember different things about the language definition and we'd get a family of mutually incompatible, C-like new languages.

Contrast with e.g. BASIC; i bet everyone would remember BASIC pretty well and the result would be a family of real BASIC dialects, not just vaguely related new languages.

If you substitute 'oot' for 'C' here, we would want Oot to be easier to remember than C; more like BASIC.

As noted above, i think this is unlikely to happen in the real world, even if there were an apocalypse, but it's a good thought experiment to push the language to be 'simple' and to think of whether a language 'fits in your head'.

---

http://stackoverflow.com/questions/10858787/what-are-the-uses-for-tags-in-go

---

http://www.geeksforgeeks.org/write-a-function-to-reverse-the-nodes-of-a-linked-list/

---

assertion of fact (opposite: query fact (and then match pattern to multiassign results)) (you could assert an equation rather than a special value assignment, but you could assert a value assignment too) vs command vs assignment (which is like pronouns) and also (although mb not separate from the above) RESTful interactions like GET address, SET (PUT) document=value; CRUD; VERBs applied to NOUNs, possibly with other arguments too (eg the value being assigned to a document in PUT)

---

evaluation strategy relates to variable substitution, but also to time, as variable substitution is an analog of time (the sequence of computation) within the timeless realm of purity (referential transparency)

---

jcrites 2 days ago

The article is discussing documentation for the AWS Flow Framework specifically. The Flow Framework is a Java framework built on top of the SWF API, and it provides a completely different programming model than the SWF API.

The C# example being discussed, as well as the Java example at the end, are examples of using that SWF API directly. The SWF API is indeed simpler for trivial examples. Flow is a power tool that handles complex workflows better than any alternative I've seen, but the framework itself is complex and incurs cost to set up and use. The documentation could do a better job of explaining this, and of providing Java API examples.

The Flow framework provides something that's a mix of Java code and a domain-specific language for building SWF applications that's expressed as Java code. (For an analogy, consider EasyMock.) It's hard to explain Flow concisely, but if I had to try I'd say, "You write code that looks like it's procedural and runs on one machine, and Flow converts that into a distributed workflow running across a fleet of machines, all of which may be stateless". Flow threads the state into and out of SWF for you, making it possible to express distributed workflows at a higher level of abstraction. Some examples are in the AWS Flow Framework Recipes: https://aws.amazon.com/code/2535278400103493

To achieve this, Flow uses fancy code weaving & AOP techniques. This makes it more complicated to set up, develop, and test. Flow pays off however once your workflow is more complex than a simple linear workflow. You could, for example, process data with a distributed map/reduce pattern in a few lines of code in Flow. (Source: built production systems on SWF with and without Flow)

---

" In other words, state machines should not be specified as tuples that connect two states (S1, A, S2) as they traditionally are, they are rather tuples of the form (Sk, Ak1, Ak2,…) that specify all the actions enabled, given a state Sk, with the resulting state being computed after an action has been applied to the system, and the model has processed the updates. "
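a rough sketch of what that could look like in Python. The countdown example and all names (ENABLED, ACTIONS, state_of, present) are made up for this note and are not from the quoted source. The table lists, per state Sk, only the actions enabled in it; the resulting state is not written down anywhere but is computed from the model after the update.

```python
# (Sk, Ak1, Ak2, ...): for each state, the set of enabled actions.
ENABLED = {
    "ready":    {"start"},
    "counting": {"tick", "abort"},
    "launched": set(),
    "aborted":  set(),
}

# Each action proposes an update to the model; it does not name a target state.
ACTIONS = {
    "start": lambda model: {"started": True},
    "tick":  lambda model: {"counter": model["counter"] - 1},
    "abort": lambda model: {"aborted": True},
}

def state_of(model):
    """Derive the control state from the model's data."""
    if model["aborted"]:
        return "aborted"
    if not model["started"]:
        return "ready"
    return "launched" if model["counter"] == 0 else "counting"

def present(model, action):
    """Apply an enabled action, let the model absorb it, then compute the state."""
    state = state_of(model)
    if action not in ENABLED[state]:
        raise ValueError(f"action {action!r} is not enabled in state {state!r}")
    new_model = {**model, **ACTIONS[action](model)}
    return new_model, state_of(new_model)

model = {"started": False, "aborted": False, "counter": 2}
trace = []
for action in ["start", "tick", "tick"]:
    model, state = present(model, action)
    trace.append(state)
```

note that "tick" leads to "counting" or to "launched" depending on the counter, which a fixed (S1, A, S2) tuple could not express without one tuple per counter value.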

---

prewett 8 hours ago

I think part of the problem is that MVC is pretty heavyweight. Most UI doesn't need that kind of flexibility, but when you want it, you want it. So you need a way to make it simple most of the time and still have access to the details.

In web development it is probably complicated by the fact that, in my opinion, declarative positioning and sizing of elements is a pipe dream. It looks simple until you try to actually implement it, and HTML/CSS has only a rudimentary implementation. (As far as I know, Motif and Apple's constraints are the only UI toolkits that have a solid implementation) Given what we want to do with the web these days, I think we would be better off with programming the web page declaratively. Something like what Qt does. I've never found an easier way to write a UI than Qt.

---

term rewriting vs. lambda calculus:

http://stackoverflow.com/questions/24330902/how-does-term-rewriting-based-evaluation-work :

" How does term-rewriting based evaluation work?

The Pure programming language is apparently based on term rewriting, instead of the lambda-calculus that traditionally underlies similar-looking languages. ...

The matching of patterns, and substitution into output expressions, superficially looks a bit like syntax-rules to me (or even the humble #define), but the main feature of that is obviously that it happens before rather than during evaluation, whereas Pure is fully dynamic and there is no obvious phase separation in its evaluation system (and in fact otherwise Lisp macro systems have always made a big noise about how they are not different from function application). Being able to manipulate symbolic expression values is cool'n'all, but also seems like an artifact of the dynamic type system rather than something core to the evaluation strategy (pretty sure you could overload operators in Scheme to work on symbolic values; in fact you can even do it in C++ with expression templates).

So what is the mechanical/operational difference between term rewriting (as used by Pure) and traditional function application, as the underlying model of evaluation, when substitution happens in both?

1 Answer

Term rewriting doesn't have to look anything like function application, but languages like Pure emphasise this style because a) beta-reduction is simple to define as a rewrite rule and b) functional programming is a well-understood paradigm.

A counter-example would be a blackboard or tuple-space paradigm, which term-rewriting is also well-suited for.

One practical difference between beta-reduction and full term-rewriting is that rewrite rules can operate on the definition of an expression, rather than just its value. This includes pattern-matching on reducible expressions:

-- Functional style
map f nil = nil
map f (cons x xs) = cons (f x) (map f xs)

-- Compose f and g before mapping, to prevent traversing xs twice
result = map (compose f g) xs

-- Term-rewriting style: spot double-maps before they're reduced
map f (map g xs) = map (compose f g) xs
map f nil = nil
map f (cons x xs) = cons (f x) (map f xs)

-- All double maps are now automatically fused
result = map f (map g xs)

Notice that we can do this with LISP macros (or C++ templates), since they are a term-rewriting system, but this style blurs LISP's crisp distinction between macros and functions.

CPP's #define isn't equivalent, since it's not safe or hygienic (syntactically-valid programs can become invalid after pre-processing).

...

Another practical consideration is that rewrite rules must be confluent if we want deterministic results, ie. we get the same result regardless of which order we apply the rules in. No algorithm can check this for us (it's undecidable in general) and the search space is far too large for individual tests to tell us much. Instead we must convince ourselves that our system is confluent by some formal or informal proof; one way would be to follow systems which are already known to be confluent.

For example, beta-reduction is known to be confluent (via the Church-Rosser Theorem), so if we write all of our rules in the style of beta-reductions then we can be confident that our rules are confluent. Of course, that's exactly what functional programming languages do!

"
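to make the double-map rule from the quoted answer concrete, here is a toy rewriter in Python. Assumptions: terms are nested tuples such as ("map", f, xs) with plain strings as symbols, and the names fuse_maps and rewrite are invented here. This is not how Pure works internally; it only shows a rule matching on the unevaluated shape of an expression.

```python
def fuse_maps(term):
    """Rule: map f (map g xs) -> map (compose f g) xs. None if no match."""
    if (isinstance(term, tuple) and len(term) == 3 and term[0] == "map"
            and isinstance(term[2], tuple) and len(term[2]) == 3
            and term[2][0] == "map"):
        _, f, (_, g, xs) = term
        return ("map", ("compose", f, g), xs)
    return None

def rewrite(term, rules):
    """Rewrite subterms first, then apply rules at the root until none match."""
    if isinstance(term, tuple):
        term = tuple(rewrite(sub, rules) for sub in term)
    for rule in rules:
        out = rule(term)
        if out is not None:
            return rewrite(out, rules)
    return term

double = ("map", "f", ("map", "g", "xs"))
triple = ("map", "f", ("map", "g", ("map", "h", "xs")))

fused_double = rewrite(double, [fuse_maps])
fused_triple = rewrite(triple, [fuse_maps])
```

with a single rule there is nothing to conflict with; once several rules can match the same term, the order in which rewrite tries them starts to matter, which is the confluence concern raised in the quote.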

---

on runtime bounds-testing, i assume?:

" It was clear when Hoare stated on his speech in 1980:

"Many years later we asked our customers whether they wished us to provide an option to switch off these checks in the interests of efficiency on production runs. Unanimously, they urged us not to - they already knew how frequently subscript errors occur on production runs where failure to detect them could be disastrous. I note with fear and horror that even in 1980, language designers and users have not learned this lesson. In any respectable branch of engineering, failure to observe such elementary precautions would have long been against the law." "

---

munificent 4 hours ago

> Is there a reason why one of these hasn't emerged/been adopted by the community?

Personally, I believe package management is one of those things that really does need an official blessed solution. Otherwise, you have a nasty bootstrapping problem: if there are ten competing package managers, how do you install them, and how do package developers know which one to put their packages in?

Collection types have the same problem. You basically need to put some collections in a blessed core library, otherwise it's virtually impossible to reliably share code. Any function that wants to return a list ends up having to pick one of N list implementations and which ever one they pick means their library is hard for users of the other N-1 lists to consume.

The Go team hasn't blessed a package manager, I think, because it's not that relevant to them: they mostly live within Google's own infrastructure which obviates the need for something like version management. They probably don't feel the pain acutely and/or might not have the expertise to design one that would work well outside Google.

---

on Parrot's M0 and Lorito:

If you were truly interested in M0, any decent search engine, or even a trawl through one of several Perl 6 and Parrot Links pages, would have taken you to an article written in 2011 by one of the designers and developers of Lorito and M0. That article is Less Magic, Less C, A Faster Parrot, which says:

" The current stage of Lorito is M0, the "zero magic" layer of implementing a handful of operations which provide the language semantics of C without dragging along the C execution model. In other words, it's a language powerful enough to do everything we use C for without actually being C. It offers access to raw memory, basic mathematical operations, and Turing-complete branching while not relying on the C stack and C calling conventions.

This was the core of both the M0 design and Lorito itself. ... the Squeak Slang approach (or the Forth approach or...) that M0 intended " -- http://www.perlmonks.org/?node_id=1048142

http://www.modernperlbooks.com/mt/2011/07/less-magic-less-c-a-faster-parrot.html

(note that the person who wrote that, Chromatic, says "Update: M0 is dead, Parrot is effectively doomed, and the author believes that Rakudo is irrelevant. This post is now a historical curiosity.")

Chromatic repeats here that M0 is dead: "After YAPC 2011, I did spend a little time working on a prototype of a smaller, faster core for Parrot, but that went nowhere." (the link talks about M0 and Lorito)

---

" PIR is a mostly terrible language in which to write a compiler. It's better than C in many ways. That's not high praise in the 21st century. " -- http://www.modernperlbooks.com/mt/2013/02/goodnight-parrot.html

---

http://pmthium.com/2014/10/apw2014/ isn't relevant to me, but i read it anyways, and i learned that (not surprisingly) Perl6 has a bunch of stuff that is the opposite of the 'simplicity' and straightforwardness that i want for Oot. Eg some things auto-flatten lists.

---

what do i mean/what is my strategy for 'simple'? Some notes:

---

some ways of thinking about programming languages, and about how the brain might work:

how does the brain or a programming language implement:

    custom program representation
    loading in the custom program
    control flow
    primitive atomic data types
    composite data structure types
    primitive operations
    modules
    execution model
    routing
    memory/state
    short term (heap)
    medium term (main memory)
    long term (disk)
    medium and long term memory allocation and freeing
    concurrency
    not stepping on each other on shared resources including IO and memory
    ipc
    scheduling (lending/sharing computational resources from inactive processes)
    nondeterminism
    IPC
    sync
    what else?
    IO
    scaling up of system resources

---

from https://github.com/perl6/nqp/blob/master/docs/ops.markdown:

what are these? they look cool:

https://www.google.com/search?client=ubuntu&channel=fs&q=take+last+next+redo+succeed+proceed+warn&ie=utf-8&oe=utf-8

mb look at http://doc.perl6.org/language/control

---

mb not related at all but just in case:

http://eli.thegreenplace.net/2015/calling-back-into-python-from-llvmlite-jited-code/

also should probably check out llvmlite "A lightweight LLVM python binding for writing JIT compilers."

---

http://www.drdobbs.com/architecture-and-design/the-rebol-ios-distributed-filesystem/184405152 http://www.rebol.com/ios-intro.html

---

impl Print for u32 {
    fn print(&self) {
        println!("{}", self);
    }
    fn copy(&self) -> Self {
        *self
    }
}

---

" A selection of database primitives... Set operations (for RID lists): Intersection, Difference, Union. Sorting: Merge Sort. Hash operations: Integer Hashing, String Hashing, Hash Table Management ... " -- http://dsg.uwaterloo.ca/seminars/notes/2014-15/Lehner.pdf (no need to read that)
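
A minimal sketch of the three set operations named in the quote, assuming a RID list is a sorted Python list of unique integer row IDs (that representation is my assumption, not something the slides are quoted as saying). The function names are made up for illustration.

```python
def intersect(a, b):
    """RIDs present in both sorted, duplicate-free lists (merge-style walk)."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out


def difference(a, b):
    """RIDs present in a but not in b."""
    out, j = [], 0
    for x in a:
        # Skip entries of b that are smaller than the current RID.
        while j < len(b) and b[j] < x:
            j += 1
        if j == len(b) or b[j] != x:
            out.append(x)
    return out


def union(a, b):
    """RIDs present in either list, still sorted and duplicate-free."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            out.append(a[i])
            i += 1
        else:
            out.append(b[j])
            j += 1
    out.extend(a[i:])
    out.extend(b[j:])
    return out
```

All three are single passes over sorted input, which is presumably why they get treated as primitives.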

---

Transaction-Oriented Architecture (shared-everything) vs. Data-Oriented Architecture (mixed shared-everything & shared-nothing)

-- http://dsg.uwaterloo.ca/seminars/notes/2014-15/Lehner.pdf (no need to read that)

---

some db ops:

CRUD scan

---

" RDMA has three modes of communication, from fastest to slowest these are:

    One-sided RDMA (CPU bypass) which provides read, write, and two atomic operations fetch_and_add, and compare_and_swap.
    An MPI interface with SEND/RECV verbs, and
    An IP emulation mode that enables socket-based code to be used unmodified"
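
To pin down what the two one-sided atomic operations mean, here is a local model of their semantics only. This is not RDMA: a lock stands in for the NIC, the class name `RemoteWord` is hypothetical, and the 64-bit word size is an assumption on my part.

```python
import threading

MASK64 = 0xFFFFFFFFFFFFFFFF  # assumed word size for the atomics


class RemoteWord:
    """Hypothetical stand-in for one word of remote memory."""

    def __init__(self, value=0):
        self._value = value & MASK64
        self._lock = threading.Lock()

    def read(self):
        with self._lock:
            return self._value

    def fetch_and_add(self, delta):
        """Atomically add delta and return the value seen before the add."""
        with self._lock:
            old = self._value
            self._value = (old + delta) & MASK64
            return old

    def compare_and_swap(self, expected, new):
        """Atomically store new only if the word equals expected; return the old value."""
        with self._lock:
            old = self._value
            if old == expected:
                self._value = new & MASK64
            return old
```

In both cases the caller gets the old value back, and for compare_and_swap that is how it learns whether the swap happened.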

---

" Next up I want to highlight hardware transactional memory (HTM) instruction support, available in the x86 instruction set architecture since Haswell as the “Transactional Synchronization Extensions.” It comes in two flavours, a backwards-compatible Hardware Lock Elision (HLE) instruction set, and a more flexible forward-looking Restricted Transactional Memory (RTM) instruction set.

Finally, as we saw yesterday, new persistent memory support is coming to give more control over flushing data from volatile cache into persistent memory. "

---

http://atomthreads.com/

"Atomthreads is a free, lightweight, portable, real-time scheduler for embedded systems."

---

http://journal.stuffwithstuff.com/2015/02/01/what-color-is-your-function/ this general pattern applies to at least two things:

---

colin_mccabe 777 days ago

I think a big part of the issue is that in Java and C++, you really need generics a lot more than in Go. Without templates, you would not have any easy way of doing maps and lists in C++. There are no builtin types for those like in Go. The way the type system works in those languages also makes things very difficult if you don't have generics.

Think about the sorting example I wrote earlier: https://news.ycombinator.com/item?id=7080562 If you were writing it in Java, is sort.Sort an interface or an abstract base class? Well, you can only "extend" one class (single inheritance only), so you would probably want Sort to be a Java interface. That means that you would always have to implement all three functions, not just one as I did, since Java interfaces cannot have default implementations. The comments indicate that most posters didn't even consider the idea that you could reuse the StringSlice functions. That method of easy composition simply doesn't exist in Java.

In general, generics get used a lot as a band-aid to avoid multiple inheritance in C++ and Java. You can't (or shouldn't, in C++) have your Foo inherit from both a (non-abstract) Bar and Baz. But you can certainly template on them. In C++, this kind of thing is called "traits" and Alexandrescu wrote a whole book about it. It's also why std::string is actually std::basic_string<char, std::char_traits<char>, std::allocator<char> >. In Go, you don't need all this... you just implement as many interfaces as you like and you're done.

---

" Cyclone implements three kinds of reference (following C terminology these are called pointers):

---

if a function has an inner function that is returned as a closure, then that inner function can access and modify the values of variables in the outer function. Does this mean that those variables that might be modified from the inner function must be prefixed with the '&' sigil? No, because & is only needed for 'non-local' modification, and "non-local" is delineated by lexical scope & threading; so because it's an inner function, it's within the lexical scope, and therefore 'local'. But if this inner function is passed to another thread, then the variable must indeed have a '&', because it is being accessed across threads (alternately we could just disallow that sort of thing and force communication over an explicit channel for that).
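
For comparison, a Python analogue of the closure case (Python, not Oot; the names are illustrative). Python makes the opposite choice to the one sketched above: it demands a marker (`nonlocal`) even though the variable is within lexical scope.

```python
def make_counter():
    count = 0  # variable owned by the outer function

    def increment():
        # Python requires this declaration before rebinding an enclosing
        # variable; under the rule proposed above, Oot would need no '&'
        # here because the access is lexically local.
        nonlocal count
        count += 1
        return count

    return increment
```

Each call to make_counter() yields a closure with its own private `count`, so the modification stays confined to that lexical scope.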

---

from emacs Info mode:

space for next page, delete for prev page, p for previous chapter, n for next chapter (and what for up and down?)

---

maybe .= for 'is'?

---

http://spacecraft.ssl.umd.edu/akins_laws.html

---

 pron 1 day ago

That also depends on the namespaces used by the language. E.g. auto-completion in Clojure works a lot better than in JS, because every symbol has a unique resolution, known statically (or as unique as in many typed languages). True, you get less helpful suggestions, but OTOH, the error messages can be much clearer than in the typed case. The former can leave error reporting to the DSL, while the latter is restricted to the cryptic types of the host language's compiler. I'm not saying this can't be resolved with some clever compiler plugins, but if we're comparing options available now, the advantage is not so clear-cut (I do favor types, but only as long as they don't get too complex).

---

"

 You mean coding the stuff people are doing in LLVM and GCC in ML on CompCert or similar system? No, it's significantly easier to do that than get current architectures right in C-like languages. FOSS just doesn't do it for most part. Rust was exception: did theirs in Ocaml.

After FOSS compiler types build it, users can get the reproducible source and build the tool. Then that builds the other apps from source. See how easy that is?

Note: Wirth et al built a safe language, simple ASM, CPU, OS, apps, and all with a few people in a few years. The ASM-3GL-Compiler build is WAY easier than you think. It bootstraps faster one after. .... That's a large problem. The compiler part is smaller with lots of work in CompSci, FOSS, and private sector (eg books). There's tools with source available on net for imperative and functional languages that are safer, too. Ignored almost entirely by safety or security oriented projects in compilers and general FOSS in favor of harder-to-analyze, less secure stuff. However, they'll happily bring up fad-driven stuff like Thompson attack or reproducible builds as The Solution.

Here's the actual solution. You start with a simple, non-optimizing toolchain designed and documented for easy understanding and implementation. It has extensive test suite. User worried about subversion implements that in tooling of their choice on own machine. Wirth simplified it with P-code interpreter that was easy to implement with compiler and apps targeting it. Once first compiler is done, it compiles the HLL source of its own code. Now, you can use it to compile a high-performance compiler's source or add optimizations to it. Most of this work is done so it's a matter of FOSS compiler types or project teams just integrating and using it. ...

In 70's-80's, people designed, assured, and pentested guards with great results. Firewalls were a watered down version that came later with features but not assurance. Push guards on firewall proponents, even developers, then you'll just get ignored. They will work on whatever is making rounds on favorite IT or INFOSEC sites, though.

Compiler and OS people. They usually write their stuff in a monolithic style in C despite decades of bad results that way. Showing even one person (Edison), three (Lilith/Oberon), or handful (MINIX 3) can do entire system safer with less people and time will not change this. Showing them ML or something with C compilation for portability will not change this. They systematically reject this while doing whatever is their tradition or becomes in the vogue.

... " -- nickpsecurity

 nickpsecurity 202 days ago

This is an old problem solved dozens of ways that mainstream just refuses to deal with. The requirement is even standard for proprietary products going for DO-178B certification. I believe they do quite manual confirmation but automated exists. The solution is called certified compilation: the verifiable transformation of source into binaries. You break the process into a series of steps which each can be verified with the CST's/AST's handed from one to the next. You can implement the steps yourself or validate someone else's, even easier if it's a safe[r] language. Examples each using different methods are VLISP [1], FLINT [2], and CompCert C [3].

Running Debian through CompCert while putting more work into CompCert for portability and optimization is the easiest solution with long-term benefits. Performance will go up steadily. Bug count will go down steadily because that's what SML/Ocaml does. Code will be more readable. Repeat for most trusted tools to drive assurance up across the board.

If they don't want to do that, then the result will be something along lines of just having a bunch of people compile and sign the distro publishing signatures, etc. You will trust that they trusted whatever they all looked at. And anyone who's studied GCC's source, etc will know that basically means they all saw the same code. They'd have to understand it all to have known if there was a weakness introduced. They won't, use of C/C++ makes that harder, plenty of rope to hang one's self in any common action, and it's why those of us doing subversion-resistant development use languages like ML's or Oberon. FOSS needs to similarly transition toward safe, comprehensible tools that aren't backdoor generators just by architecture & language used.

Otherwise, all this talk of preventing subversion is just talk: they're going to get in. And if not subversion, the endless stream of 0-day's from the language and architectural choices will continue to do the job. A re-implementation of the TCB's of our systems is long overdue.

[1] http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.51.2824&rep=rep1&type=pdf

[2] http://flint.cs.yale.edu/flint/software.html

[3] http://compcert.inria.fr/

...

Here's what subversion resistant development takes: modular software with sensible interfaces; ability to understand code for human review (closer to algorithm the better); ability to understand compiler passes in isolation; ability to implement toolchain in language of choosing. There are existing flows like this as I illustrated. So, you use them and leverage diverse audience to check results. I mean, would you rather implement CompCert passes by hand or GCC even without optimizations? See the difference? ;)

Now, I did have a method to solve problem you're addressing. You implement an assembler first. Then a macro assembler with macro's for HLL primitives. You can use that immediately to implement certified compiler. Alternately, you can pick up Oberon report or Scheme book to implement that to get a true HLL plus compiler. Then you implement the certified compiler with it. Comprehension, code complexity, and trust are kept manageable by building layer by layer. This, for productivity not security, is how Wirth and Carl first built Lilith then Oberon. Same method will work again and good that ML/Scheme/Oberon folks already gave us doc's plus code to use. Let's use them.

jeffreyrogers 201 days ago

Yep, building up like that would work. Oberon if I recall correctly is pretty simple too (maybe ~20k LOC?) so that would actually be possible by a small team.

nickpsecurity 201 days ago

It has many times. Nice, LISP-style example that was recently on HN:

https://speakerdeck.com/nineties/creating-a-language-using-only-assembly-language

Note: LISP/Scheme interpreters and processors with plenty of detail (including source) can be found with Google. Many implemented before 1990. Will run on cheap FPGA's or process nodes. Can take it all the way to hardware. ;)

The macro ASM can be built on something like P-code: an idealized, low-level machine easy to deploy on CISC and RISC architectures. A good example of how to bridge ASM and HLL's is Hyde's High Level Assembly:

http://www.plantation-productions.com/Webster/

The HLL, for non-LISP audience, can be Oberon with aid of Wirth's Compiler Construction book among other papers:

http://www.ethoberon.ethz.ch/WirthPubl/CBEAll.pdf

So, many possibilities. People just gotta use them. Build a LISP w/ macros to build everything else still seems to be easiest strategy. Esp as one can reuse code from textbooks unlikely to be subverted. Wirth's next best.

nickpsecurity 149 days ago

That's not going to happen because the two toolchains won't be anything alike. You won't be able to make them alike either. Worse, that people often quote Thompson's attack shows that INFOSEC teaches subversion very poorly: it's the least likely attack to affect you and you really need to counter the others. What you have to do is make the software correct, make it secure, and ensure no subversion in lifecycle. That's called high assurance or robustness system design. Links below show you how to do that.

High assurance software design - nice intro http://web.cecs.pdx.edu/~hook/cs491sp08/AssuranceSp08.ppt

FOSS tools for high assurance http://www.dwheeler.com/essays/high-assurance-floss.html

Certified compilation (HW w/ need certified "synthesis") http://compcert.inria.fr/

Original work on subversion http://csrc.nist.gov/publications/history/myer80.pdf

Example of it in action http://www.cisr.us/downloads/theses/02thesis_anderson.pdf

Example of high assurance hardware (AAMP7G version is secure but no public paper...) http://www.csl.sri.com/papers/wift95/wift95.pdf

---

http://stackoverflow.com/questions/367115/is-there-a-python-equivalent-to-perl-pi-e

http://everythingsysadmin.com/perl2python.html

http://www.softpanorama.org/Scripting/Perlorama/perl_in_command_line.shtml

http://programmers.stackexchange.com/questions/65150/is-there-any-good-reason-for-someone-who-knows-python-to-learn-perl

---

mb resurrect list-context sigils (@ vs $) from perl5 so that when a list is prefixed with '$' it is treated as usual (a value that happens to be a list), whereas when it is in @ form it is implicitly mapped over (note: this isn't quite how Perl5 does it). But didn't Perl6 get rid of this? Or did they?
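
A rough Python model of the idea, just to see what the two forms would do at a call site. `Each` and `call` are hypothetical names: `Each(xs)` plays the role of `@xs`, and a bare list plays the role of `$xs`. This is neither Oot nor Perl.

```python
class Each:
    """Hypothetical stand-in for the '@' form: a list that is implicitly mapped over."""

    def __init__(self, items):
        self.items = list(items)


def call(f, arg):
    """Apply f to each element of an Each, or once to any other value."""
    if isinstance(arg, Each):
        return [f(x) for x in arg.items]
    return f(arg)
```

So `call(len, ["ab", "cde"])` measures the list itself, while `call(len, Each(["ab", "cde"]))` measures each string.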

---

" Common practice in network monitoring and in QoS technologies is to identify a flow of packets by the 5-tuple {source address, dest address, source port, dest port, protocol #}. " -- https://www.ietf.org/mail-archive/web/ipv6/current/msg11559.html
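
A small sketch of the 5-tuple as a flow key, assuming packets have already been parsed into plain tuples in the same field order (the `FlowKey` type and `count_packets_per_flow` helper are made up for illustration):

```python
from collections import Counter
from typing import NamedTuple


class FlowKey(NamedTuple):
    src_addr: str
    dst_addr: str
    src_port: int
    dst_port: int
    protocol: int  # IP protocol number, e.g. 6 for TCP, 17 for UDP


def count_packets_per_flow(packets):
    """Group already-parsed packets by their 5-tuple and count them."""
    return Counter(FlowKey(*p) for p in packets)
```

Note that the two directions of one connection hash to different keys, because source and destination are swapped.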

---

"This is relatively trivial at line speed in IPv4 since these things are at fixed locations in the header. But in IPv6, the protocol number is at the end of a linked list of "next headers." .. From today's perspective, the IPv6 header design is complete crap. Maybe it was optimized for software forwarding on in-order CPUs, but that's distant history now. "

-- https://www.ietf.org/mail-archive/web/ipv6/current/msg11559.html and https://www.ietf.org/mail-archive/web/ipv6/current/msg11681.html
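
The complaint can be seen in a few lines. This is a sketch, not a full parser: it does no bounds checking, and on the IPv6 side it only skips the hop-by-hop, routing, destination-options and fragment headers (a simplification of mine); anything else, including AH and ESP, is returned as if it were the upper-layer protocol. The function names are my own.

```python
# Extension headers whose first two bytes are (next header, length).
EXT_VARIABLE = {0, 43, 60}  # hop-by-hop, routing, destination options
EXT_FRAGMENT = 44           # fragment header, always 8 bytes long


def ipv4_protocol(packet: bytes) -> int:
    """IPv4: the protocol number is always byte 9 of the header."""
    return packet[9]


def ipv6_upper_protocol(packet: bytes) -> int:
    """IPv6: follow the chain of 'next header' fields to its end."""
    next_header = packet[6]  # Next Header field of the fixed 40-byte header
    offset = 40
    while True:
        if next_header in EXT_VARIABLE:
            next_header, ext_len = packet[offset], packet[offset + 1]
            # Length is counted in 8-byte units, not including the first 8 bytes.
            offset += (ext_len + 1) * 8
        elif next_header == EXT_FRAGMENT:
            next_header = packet[offset]
            offset += 8
        else:
            return next_header
```

The IPv4 lookup is one indexed load. The IPv6 lookup is a data-dependent loop, which is the part that is awkward to do at line speed.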

---

IP packets have a 'protocol number'. It is a 1-byte field. TCP and UDP and ICMP are some of the choices.

On top of TCP/IP are various other protocols, like telnet, SMTP, HTTP, SSH, etc. They tend to be associated with 'well-known ports'. The first 1024 ports (0-1023) are reserved as 'system ports' or 'well known ports', assigned by IANA. From 1024 to 49151 are 'user ports' or 'registered ports' which are also assigned by IANA, presumably much more easily ([1] says about the system ports, "The requirements for new assignments in this range are stricter than for other registrations.[2]"). There are currently some games registered in both the system ports (eg Doom at 666) and the user ports. From 49152-65535 are the 'ephemeral' or dynamic or private ports.
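
The three ranges as a lookup, using the IANA boundaries (0-1023, 1024-49151, 49152-65535). The function name and return labels are my own.

```python
def port_range(port: int) -> str:
    """Classify a TCP/UDP port number into its IANA range."""
    if not 0 <= port <= 65535:
        raise ValueError(f"not a valid port: {port}")
    if port <= 1023:
        return "system"   # a.k.a. well-known ports
    if port <= 49151:
        return "user"     # a.k.a. registered ports
    return "dynamic"      # a.k.a. private or ephemeral ports
```

For example, `port_range(666)` (Doom) gives "system".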

what are the common IP protocols?

http://www.cisco.com/c/en/us/td/docs/security/asa/asa90/configuration/guide/asa_90_cli_config.pdf lists ('Possible ASA protocol literal values'):

ahp, eigrp, esp, gre, icmp, igmp, igrp, ip, ipinip, ipsec, nos, ospf, pcp, snp, tcp, and udp

http://www.networking-forum.com/viewtopic.php?f=46&t=15498 lists:

    TCP
    UDP
    mb DCCP, SCTP, RSVP
    ICMP - Used by pings, and for other network-level information.
    IGMP - Used with multicasting to join/leave groups.
    RSVP - Used to reserve bandwidth for a flow. Not very common, but discussed in almost all QOS texts.
    GRE - Used for tunneling.

https://technet.microsoft.com/en-us/library/cc959827.aspx?f=255&MSPPError=-2147217396 lists:

    1 Internet Control Message Protocol (ICMP)
    6 Transmission Control Protocol (TCP)
    17 User Datagram Protocol (UDP)
    47 General Routing Encapsulation (PPTP data over GRE)
    51 Authentication Header (AH) IPSec
    50 Encapsulation Security Payload (ESP) IPSec
    8 Exterior Gateway Protocol (EGP)
    3 Gateway-Gateway Protocol (GGP)
    20 Host Monitoring Protocol (HMP)
    88 Internet Group Management Protocol (IGMP) (sic; IANA assigns 2 to IGMP and 88 to EIGRP)
    66 MIT Remote Virtual Disk (RVD)
    89 OSPF Open Shortest Path First
    12 PARC Universal Packet Protocol (PUP)
    27 Reliable Datagram Protocol (RDP)
    46 Reservation Protocol (RSVP) QoS

http://networkengineering.stackexchange.com/a/16212 (answer to http://networkengineering.stackexchange.com/questions/16191/raw-ip-communication ) says:

"

List of common IP protocols:

Nearly every application I can think of uses one of the following IPv4 protocols (in rough order of frequency people see on the wire):

    TCP (IP Protocol 6)
    UDP (IP Protocol 17)
    ICMP (IP Protocol 1)
    OSPF (IP Protocol 89)
    EIGRP (IP Protocol 88)
    GRE (IP Protocol 47)
    ESP (IP Protocol 50)
    AH (IP Protocol 51)
    PIM (IP Protocol 103)
    IGMP (IP Protocol 2)
    VRRP (IP Protocol 112)
    SCTP (IP Protocol 132)

That's it... statistically, every other IP protocol ranks in noise... and if it wasn't obvious enough, every one of those protocols performs at least one of the functions of an IP Transport protocol. "
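
The quoted list as a lookup table, in case it is handy to have the numbers in code. The enum and helper names are my own; the values are the ones given in the quote.

```python
from enum import IntEnum


class IPProto(IntEnum):
    """IP protocol numbers from the list quoted above."""
    ICMP = 1
    IGMP = 2
    TCP = 6
    UDP = 17
    GRE = 47
    ESP = 50
    AH = 51
    EIGRP = 88
    OSPF = 89
    PIM = 103
    VRRP = 112
    SCTP = 132


def protocol_name(number: int) -> str:
    """Name for a protocol number, or a placeholder for anything not in the list."""
    try:
        return IPProto(number).name
    except ValueError:
        return f"unknown({number})"
```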

http://etherape.sourceforge.net/introduction.html lists (this is more than just IP):

ETH_II, 802.2, 802.3, IP, IPv6, ARP, X25L3, REVARP, ATALK, AARP, IPX, VINES, TRAIN, LOOP, VLAN, ICMP, IGMP, GGP, IPIP, TCP, EGP, PUP, UDP, IDP, TP, ROUTING, RSVP, GRE, ESP, AH, EON, VINES, EIGRP, OSPF, ENCAP, PIM, IPCOMP, VRRP

http://teleco-network.blogspot.com/2012_11_01_archive.html lists:

AH, ARP/RARP, ATMP, BGMP, BGP-4, COPS, DCAP, DHCP, DHCPv6, DNS, DVMRP, EGP, EIGRP, ESP, FANP, Finger, FTP, GOPHER, GRE, HSRP, HTTP, ICMP, ICMPv6, ICP, ICPv2, IDRP, IGMP, IGRP, IMAP4, IMPP, IP, IPv6, IPDC, IRC, L2F, L2TP, LDAP, LDP, MARS, MDTP, Megaco (ASCII + ASN.1), Mobile IP, MZAP, NARP, Nat, NetBIOS/IP, NHRP, NTP, OSPF, PIM, POP3, PPTP, Radius, RIP2, RIPng for IPv6, RSVP, RTSP, RUDP, SCSP, SCTP, SDCP, SLP, SMPP, SSH, SMTP, SNMP, SOCKS, TACACS+, TCP, TELNET, TFTP, TRIP, UDP, Van Jacobson, VRRP, WCCP, XOT, X-Window.

what are the common ports/application layer protocols?

[2] lists some well-known examples of system port applications:

    21: File Transfer Protocol (FTP)
    22: Secure Shell (SSH)
    23: Telnet remote login service
    25: Simple Mail Transfer Protocol (SMTP)
    53: Domain Name System (DNS) service
    80: Hypertext Transfer Protocol (HTTP) used in the World Wide Web
    110: Post Office Protocol (POP3)
    119: Network News Transfer Protocol (NNTP)
    123: Network Time Protocol (NTP)
    143: Internet Message Access Protocol (IMAP)
    161: Simple Network Management Protocol (SNMP)
    194: Internet Relay Chat (IRC)
    443: HTTP Secure (HTTPS)

http://brainmeta.com/forum/index.php?showtopic=9800 lists:

    20 FTP data (File Transfer Protocol)
    21 FTP (File Transfer Protocol)
    22 SSH (Secure Shell)
    23 Telnet
    25 SMTP (Simple Mail Transfer Protocol)
    43 whois
    53 DNS (Domain Name Service)
    68 DHCP (Dynamic Host Configuration Protocol)
    79 Finger
    80 HTTP (HyperText Transfer Protocol)
    110 POP3 (Post Office Protocol, version 3)
    115 SFTP (Secure File Transfer Protocol)
    119 NNTP (Network News Transfer Protocol)
    123 NTP (Network Time Protocol)
    137 NetBIOS-ns
    138 NetBIOS-dgm
    139 NetBIOS
    143 IMAP (Internet Message Access Protocol)
    161 SNMP (Simple Network Management Protocol)
    194 IRC (Internet Relay Chat)
    220 IMAP3 (Internet Message Access Protocol 3)
    389 LDAP (Lightweight Directory Access Protocol)
    443 SSL (Secure Socket Layer)
    445 SMB (NetBIOS over TCP)
    666 Doom
    993 SIMAP (Secure Internet Message Access Protocol)
    995 SPOP (Secure Post Office Protocol)

https://en.wikibooks.org/wiki/Network_Plus_Certification/Technologies/Common_Protocols some protocols by layer, including some non-TCP-based application layer protocols (eg RTP which is usually over UDP):

    Application: DNS, TFTP, TLS/SSL, FTP, HTTP, IMAP4, POP3, SIP, SMTP, SNMP, SSH, Telnet, RTP
    Transport: TCP, UDP
    Internet: IP (IPv4, IPv6), ICMP, IGMP
    Link: ARP

http://www.answersthatwork.com/Download_Area/ATW_Library/Networking/Network__2-List_of_Common_TCPIP_port_numbers.pdf lists:

ftp ssh telnet smtp wins_replication whois dns dhcp finger http x.400 pop3 sftp nntp ntp rpc_locator_service netbios_name_service_wins imap4 snmp BGP SRS LDAP SSL SMB gmail_outgoing ldap_over_ssl ssl_imap gmail_pop3

http://www.linuxnix.com/important-port-numbers-linux-system-administrator/ lists:

    20 – FTP Data (For transferring FTP data)
    21 – FTP Control (For starting FTP connection)
    22 – SSH (For secure remote administration which uses SSL to encrypt the transmission)
    23 – Telnet (For insecure remote administration)
    25 – SMTP (Mail Transfer Agent for e-mail server such as SEND mail)
    53 – DNS (Special service which uses both TCP and UDP)
    67 – Bootp
    68 – DHCP
    69 – TFTP (Trivial file transfer protocol uses udp protocol for connection less transmission of data)
    80 – HTTP/WWW (apache)
    88 – Kerberos
    110 – POP3 (Mail delivery Agent)
    123 – NTP (Network time protocol used for time syncing uses UDP protocol)
    137 – NetBIOS (nmbd)
    139 – SMB-Samba (smbd)
    143 – IMAP
    161 – SNMP (For network monitoring)
    389 – LDAP (For centralized administration)
    443 – HTTPS (HTTP+SSL for secure web access)
    514 – Syslogd (udp port)
    636 – ldaps (both tcp and udp)
    873 – rsync
    989 – FTPS-data
    990 – FTPS
    993 – IMAPS

http://web.mit.edu/rhel-doc/4/RH-DOCS/rhel-sg-en-4/ch-ports.html lists:

http://www.techexams.net/forums/network/2235-tcp-ip-port-assignments-print.html lists:

 FTP 21, SSH 22, Telnet 23, SMTP 25, DNS 53, TFTP 69, HTTP 80, POP3 110, NNTP 119, NTP 123, IMAP4 143, SNMP 161, HTTPS 443....also gopher finger 

http://www.sei.cmu.edu/reports/12tr006.pdf says: " The top expected services requested by clients on a typical network are web (ports 80 and 443), DNS (53), and SMTP (25) "

http://www.internet-computer-security.com/Firewall/Protocols/Ports-Protocols-IP-Addresses.html lists:

    20 FTP (for File Transfer Protocol) – Data Port
    21 FTP (File Transfer Protocol) – Command Port
    22 SSH (Secure Shell) - Used for secure remote access
    23 Telnet – Used for insecure remote access, data sent in clear text
    25 SMTP (Simple Mail Transport Protocol) – Used to send email
    53 DNS (Domain Name Service) – Used to resolve DNS names to public IP addresses
    68 DHCP (Dynamic Host Configuration Protocol) – Used to assign IP addresses to clients
    80 HTTP (Hypertext Transfer Protocol) - Used to browse the web
    110 POP3 (Post Office Protocol, version 3) - Used to retrieve email from a server
    115 SFTP (Secure File Transfer Protocol) - Secure file transfer
    119 NNTP (Network News Transfer Protocol) – For transferring news articles between news servers
    123 NTP (Network Time Protocol) For synchronising system time with a time server on the public network.
    161 SNMP (Simple Network Management Protocol) For receiving system management alerts
    163 IMAP (Internet Message Access Protocol 4) For retrieving emails (sic; IMAP's assigned port is 143)
    389 LDAP (Lightweight Directory Access Protocol) Querying directory services such as Active Directory
    443 SSL (Secure Socket Layer) Using a secure web connection
    445 SMB (Server Message Block) For shared access to files and printers

http://www.linuxsecurity.com/resource_files/firewalls/firewall-seen.html lists "common incoming TCP/UDP probes against my firewall":

1 tcpmux Indicates someone searching for SGI Irix machines. Irix is the only major vendor that has implemented tcpmux, and it is enabled by default on Irix machines. Irix machines ship with several default passwordless accounts, such as lp, guest, uucp, nuucp, demos, tutor, diag, EZsetup, OutOfBox, and 4Dgifts. Many administrators forget to close these accounts after installation. Therefore, hackers scan the Internet looking first for tcpmux, then these accounts. [CA-95.15]

7 Echo You will see lots of these from people looking for fraggle amplifiers sent to addresses of x.x.x.0 and x.x.x.255.

A common DoS attack is an echo-loop, where the attacker forges a UDP packet from one machine and sends it to the other, then both machines bounce packets off each other as fast as they can (see also chargen). [CA-96.01]

Another common thing seen is TCP connections to this port by DoubleClick. They use a product called "Resonate Global Dispatch" that connects to this port on DNS servers in order to locate the closest one.

Harvest/squid caches will send UDP echoes from port 3130. To quote: If the cache is configured with source_ping on, it also bounces a HIT reply off the original host's UDP echo port. It can generate a lot of these packets.

11 sysstat This is a UNIX service that will list all the running processes on a machine and who started them. This gives an intruder a huge amount of information that might be used to compromise the machine, such as indicating programs with known vulnerabilities or user accounts. It is similar to the contents that can be displayed with the UNIX "ps" command. ICMP doesn't have ports; if you see something that says "ICMP port 11", you probably want ICMP type=11.

19 chargen This is a service that simply spits out characters. The UDP version will respond with a packet containing garbage characters whenever a UDP packet is received. On a TCP connection, it spits out a stream of garbage characters until the connection is closed. Hackers can take advantage of IP spoofing for denial of service attacks. Forging UDP packets between two chargen servers, or a chargen and echo can overload links as the two servers attempt to infinitely bounce the traffic back and forth. Likewise, the "fraggle" DoS