proj-oot-ootDataNotes3

"

lkrubner 727 days ago

I love this:

"Key := Val Updates an existing Key The key must be present in the old map. If Key is not in the old map Erlang will complain loudly. Using two operations instead of one has several advantages: A spelling mistake cannot accidentally introduce a new key"

Lord knows how many times I accidentally created a new key with a spelling typo, and then later when I wanted to retrieve the data, I was like "What the hell is wrong? I know I put data in that key, so why is nothing coming back? I guess I have discovered a rare bug in the compiler that no one has ever discovered before." "

---

 roncohen 823 days ago | parent

They should do the world a favor and include a datetime type.

samatman 823 days ago

Indeed, this is a known terrible mistake, easily avoided.

The lack of the string "UUID" in the RFC is also cause for concern.


MichaelGG? 823 days ago

They have a tag for date strings, or you can use seconds from epoch, as an integer _or floating-point_. So if you actually want to represent time with proper fractional seconds, you're stuck representing them as strings. Hardly concise.


Someone 823 days ago

Whats terribly wrong with http://tools.ietf.org/html/rfc7049#section-2.4.1?


drdaeman 823 days ago

The minor issues are missing timezone and precision information.

But, most importantly, use of integers for datetime values hides type-level semantics. It's just integers and you, the end user, and not the deserializer, is responsible for handling the types.

I think it's quite inconvenient to do tons of `data["since"] = parse_datetime(data["since"])` all the time, for every model out there.


---

samatman 840 days ago

parent on: ECMA-404: The JSON Data Interchange Format [pdf]

To summarize the top HN complaints: No date format, no comments, needlessly strict commas.

Is it too late for edn? https://github.com/edn-format/edn

Symbols that aren't "strings" are kinda neat too, and you get downright attached to arbitrary key-value mappings once you have them.

---

samatman 823 days ago

parent on: RFC 7049 - Concise Binary Object Representation (C...

Indeed, this is a known terrible mistake, easily avoided.

The lack of the string "UUID" in the RFC is also cause for concern.

---

can we unify truthiness (bools; True/False) and Maybe?

---

grex: graph regex

can generalize matching on a character to matching on an opaque 'node' by letting the grex contain arbitrary code (a predicate) at each location. The arbitrary code is stateless. Its input is a node, and its output is 'accept' or 'reject'. If the graph is in the shape of a string (linear) then the running time of the grex IN TERMS OF THE RUNNING TIME OF THE PROVIDED CODE (eg if the runtime of the provided code is taken to be a single step, or at least O(1)) is still O(m*n).

if the graph is not linear, how to generalize? One thing to do is to say that the grex follows each edge (if directed, only those edges in the correct direction) from the current node. Edge direction is just one kind of 'edge type', so to generalize this, it follows all edges of a given type. Generalizing further, we follow all edges that meet some predicate. Generalizing further, this predicate may change at each location, so we embed edge predicates like te node predicates above. So one way to do grexs is to provide an 'edge matching' construct for each 'space' in between the node matches. These are implicit in string regexes because there is only one type of 'space' (edge) to match. Like the node, the most general thing is to pass the edge to a provided predicate subroutine. But here we come to additional complexity. What do we do if an edge is rejected? I guess we should backtrack and try another edge at the same node. So there is additional nondeterminism here. I guess this is okay because it seems appropriate that the special case of a linear graph should lead to simpler behavior. Is the runtime still O(m*n)? I bet it is, but i'm not sure. The reason i bet it is is that i think of the first example that comes to mind for something that might be exponential in runtime, and it's a regex given by [1] for a regex which has an exponential blowup in space when you convert it from an NFA to a DFA: (a

b)*a(ab)(ab)(ab) (exponential, 2^k, in the number of (ab)s at the end; here we have k=3). You can see that this sort of thing may backtrack a lot if each node it checks has two children, b/c if one descendent tree isn't suitable it will want to backtrack up and try the other child of each ancestor. But for each node to have two children, the size of the graph itself must be exponential, so it's still O(m*n) in this case. To get exponential behavior in a non-exponential graph, we'd have to backtrack on something greater than the choices given by all of the choices of which edge comes out of the child.

Maybe if the graph has cycles then that could happen? Imagine a graph with two pairs of nodes where each node in each pair has an outgoing edge to each node in the other pair, a, b, c, d: a->c, a->d, b->c, b->d, c->a, c->b, d->a, d->b. Label all edges going to the first node in each pair (a or c) with '0' and label the other edges, the ones going to b or d, with '1'. Now if the node predicate just accepts any node and you write the grex as for the edges only as: (0

1)*0(01)(01)(01). What will happen? Well, in the best case, the edge ordering is such that all 0s are chosen first, and the NFA accepts without ever considering more than 1 edge. But even if the 1s are ordered first, the 0 bottleneck (after the *) will choose the 0 edge, and there will be no backtracking. The thing is, to force backtracking, you need a situation where the searcher is thinking like, 'This sequence doesn't match; but if only i had chosen a different child at the grandfather of the current node, maybe i would have found a different chain of descendents that would match'. But in this example graph, all sequences of 0,1 are available from any position.

So what about the graph where one of the pairs forces a '1', eg a, b, d: a-1>d, b-1>d, d-0>a, d-1>b. We left out the a->c and b->c edges, which are the only ones labeled '1' from the a,b pair of nodes; since there are now no incoming nodes to 'c', it is unreachable (assuming the initial state is a or b or some node with epsilon edges that goes to a and b), so we removed c. In this graph, the initial (0

1)* just 'populates' all of the nodes with walkers; then the 'bottleneck' 0 just 'prunes' those walkers on a or b, leaving the walker on d to survive and successfully transition to a with remaining grex (01)(01)(01). There is only one path out of a, (a->d) so the walker follows it, with remaining grex (01)(01). But now the walker bifurcates, going both to a and b. I suppose if the sequence at the end was longer than 3 then every other (01) there would be a bifurcation. If the walker then hit an impossible condition at the end (eg if the grex was (01)*0(01)(01)(01)2; there are no '2' labels in this graph), it would then 'backtrack', the possibility of which in my visualization here is already represented 'nondeterministically without backtracking' by the 2^(k/2) walkers present by the end (is there some more complicated construction that eliminates some of those walkers based on recognizing the finite, cyclic nature of the graph? i conjecture that this is np-complete (in other words, that computing edge label grexs on possibly-cyclic graphs is NP-complete).

so i guess my initial hunch may be wrong, and that there can be an exponential complexity blowup, if the graph is cyclic. So i reduce my original conjecture to: grexs have runtime O(m*n) on DAGs. Again, even if grexs have NP-complete performance on cyclic grexs, this seems appropriate as we transition from a linear graph to a cyclic one.

otoh my analysis here may be wrong. In implementations of ordinary regexs by backtracking, (a

aa)*b causes exponential blowup [2]