Thread overview
[OT] Just curious: What's this algorithm's name?
Apr 04, 2012
Nick Sabalausky
Apr 04, 2012
H. S. Teoh
Apr 04, 2012
Nick Sabalausky
Apr 04, 2012
H. S. Teoh
Apr 05, 2012
Nick Sabalausky
Apr 04, 2012
bls
April 04, 2012
Nothing important, just something I'm curious about:

At least a couple times recently I've come across a certain pattern for dealing with directed graphs (ones that may have cycles). It seems general enough that I'm surprised I don't seem to recognize it (but then again, it has been years since I've formally studied algorithms - I only have vague recollections of names like "Prim's"). A brief search didn't seem to turn up anything that jumped out at me as being a match.

Here's what it's for:

Suppose you have a directed graph, which may have cycles, and you want to compute something for each node. But the final result for each node is dependent on the final results for each of the nodes it points to. (It may sound like this would just cause an endless "feedback loop" ie infinite recursion whenever there's a cycle, but there are applications of this where that isn't a problem. Ie, in cases where the computation stops recursing when it gets back to itself.)

A basic example: You have a graph. Various strings can be computed for each node (based on the data stored in the node). Additionally, for each node, all N edges are numbered from 1 to N. For each node "X", you want to compute a single set of unique strings: All the strings computed from node X, plus all the strings computed from all the nodes reachable from node X using *only* even-numbered edges (ie, "edge % 2 == 0").

The way that's done:

There's two main phases:

1. "Spontaneously generate" a partial result for each node X. This partial result is *only* the strings generated from node X itself. It does not attempt to include any strings from the other nodes reachable from X. Additionally, while you're at each node X, you generate a "propagation graph": Ie, make note of which edges are even-numbered and will therefore need to be traversed.

2. "Propagate" the results. For each node X, add in (propagate) any new strings that come directly from the nodes you noted in the "propagation graph" (ie, the edges you already identified as matching "edge % 2 == 0"). Repeat this step 2 until you can go through the entire graph without propagating any new strings.

Granted, that specific example can be simplified somewhat since "follow only even-numbered edges" just happens to be trivial without actually creating a propagation graph. But anyways, that's the basic idea.

There's at least two uses of this algorithm when building LALR parse tables for LALR parsing: For one, it's used to generate all the lookahead information (which is where I got much of the terminology). And I've recently realized that, unless I'm mistaken, it also appears to be needed when determining what terminals can be the "first" terminal in a nonterminal's derivation (which is, in turn, more necessary information for computing the lookahead information).

Even though I don't remember ever encountering this particular algorithm anywhere besides generating LALR parse tables, I have a feeling there are other practical applications of it.

So anyone know offhand what this graph-processing algorithm is called? Or is it just too basic a usage of graphs for it to even have a name?


April 04, 2012
On Wed, Apr 04, 2012 at 01:33:21AM -0400, Nick Sabalausky wrote: [...]
> Suppose you have a directed graph, which may have cycles, and you want to compute something for each node. But the final result for each node is dependent on the final results for each of the nodes it points to. (It may sound like this would just cause an endless "feedback loop" ie infinite recursion whenever there's a cycle, but there are applications of this where that isn't a problem. Ie, in cases where the computation stops recursing when it gets back to itself.)

Sounds like the word you want is "closure".


T

-- 
One disk to rule them all, One disk to find them. One disk to bring them all and in the darkness grind them. In the Land of Redmond where the shadows lie. -- The Silicon Valley Tarot
April 04, 2012
"H. S. Teoh" <hsteoh@quickfur.ath.cx> wrote in message news:mailman.1351.1333549896.4860.digitalmars-d@puremagic.com...
> On Wed, Apr 04, 2012 at 01:33:21AM -0400, Nick Sabalausky wrote: [...]
>> Suppose you have a directed graph, which may have cycles, and you want to compute something for each node. But the final result for each node is dependent on the final results for each of the nodes it points to. (It may sound like this would just cause an endless "feedback loop" ie infinite recursion whenever there's a cycle, but there are applications of this where that isn't a problem. Ie, in cases where the computation stops recursing when it gets back to itself.)
>
> Sounds like the word you want is "closure".
>

Now that you mention it, that *does* seem to make sense: (Warning: Little bit of LALR internals here) The LALR theory materials I had read didn't refer to what I just desribed (ie, basically the lookahead generation) as "closure". But before that, when actually generating the state graph itself, it did use the term "closure" for going from what it called the "kernel" items in a state to the entire state with *all* applicable items included. The algorithm looked a little different, but now that I think about it, those "kernel" items *are* essentially the "spontaneously generated" elements (and computing them involves looking at another graph - the AST of the grammar's BNF-like description). And then computing what it called the "closure" of the state is essentially the same as the "step 2" I described, traversing the graph of the BNF over and over to find new elements until there's no new elements found.

It would indeed appear that the term "closure" the books used for state graph generation is also applicable for the lookahead generation algorithm I described in the OP.

I think another part of what made me not realize to call it "closure" is that somehow I had the impression of "closure" being a much more general, even nebulous term. For example, I knew I'd also heard it used in the context of database theory. But, looking that up now, that *is* the same damn algorithm, too.

So I guess a /facepalm is in order here ;)

Thanks!


April 04, 2012
On Wed, Apr 04, 2012 at 05:03:39PM -0400, Nick Sabalausky wrote:
> "H. S. Teoh" <hsteoh@quickfur.ath.cx> wrote in message news:mailman.1351.1333549896.4860.digitalmars-d@puremagic.com...
[...]
> > Sounds like the word you want is "closure".
> >
> 
> Now that you mention it, that *does* seem to make sense: (Warning:
> Little bit of LALR internals here)
[...]

I'm familiar with LALR internals. :-)


[...]
> It would indeed appear that the term "closure" the books used for state graph generation is also applicable for the lookahead generation algorithm I described in the OP.
> 
> I think another part of what made me not realize to call it "closure" is that somehow I had the impression of "closure" being a much more general, even nebulous term. For example, I knew I'd also heard it used in the context of database theory. But, looking that up now, that *is* the same damn algorithm, too.
> 
> So I guess a /facepalm is in order here ;)
[...]

Well, the term itself *is* quite general. Mathematically speaking, for something to be "closed" means that every combination of operations on a set results in something that is already in the set. It's closed in the sense that no operation will get you "outside the set". So what's a closure? Given a set that *isn't* closed (i.e., some operations may take you "outside the set", so to speak), the "closure" of the set is a larger set that *is* closed.

OK, an example is in order. Take the set A={0,1,2} and the operation +. A is not closed, because although 0+1 and 0+2 are in A, 1+2 is not in A. To form a closure, then, we take the elements produced by + that aren't already in A, and add them to A. So for example, we add 3 to A, now 1+2 is closed. But then 2+2 is not in the set. No problem, we just add that to the set too. But now the newly added elements 3 and 4 are themselves not closed under +, because 1+4 isn't in the set, for example. So we have to add *those* elements as well. (I hope you can see the parallel here with the algo in your OP -- you're trying to incrementally compute the closure.)

Taken to its logical conclusion, we find eventually that in order for A to be closed under the operation +, we have to add *all* positive numbers to it. So the closure of A is actually the entire set of natural numbers.

Now, this example isn't exactly the best one, because the closure is infinite, so an incremental algorithm to compute it would never terminate. However, not all closures are infinite. Closure under grammar rule applications, like in your OP, is finite. Basically you keep applying the rules until it doesn't yield anything more that you don't already have. Then you can stop, because you've found the closure.


T

-- 
Prosperity breeds contempt, and poverty breeds consent. -- Suck.com
April 04, 2012
On 04/03/2012 10:33 PM, Nick Sabalausky wrote:
> Suppose you have a directed graph, which may have cycles, and you want to
> compute something for each node. But the final result for each node is
> dependent on the final results for each of the nodes it points to. (It may
> sound like this would just cause an endless "feedback loop" ie infinite
> recursion whenever there's a cycle, but there are applications of this where
> that isn't a problem. Ie, in cases where the computation stops recursing
> when it gets back to itself.)



Had to find my R. Sedgewick and N. Wirth  Algorithm books..  But: these books do not even give a hint. Nada.

However the problem you describe smells like something for which the the "Chain of Responsibility Pattern" is made for.

Though  this pattern requires a break condition 'cause you wont an endless search of an responsible
( Finally We are programmers, not politicians )

I have a hard time to image how your Graph Node might look.

SOOOooooo... Here you go
April 05, 2012
"H. S. Teoh" <hsteoh@quickfur.ath.cx> wrote in message news:mailman.1358.1333576694.4860.digitalmars-d@puremagic.com...
> On Wed, Apr 04, 2012 at 05:03:39PM -0400, Nick Sabalausky wrote:
>> "H. S. Teoh" <hsteoh@quickfur.ath.cx> wrote in message news:mailman.1351.1333549896.4860.digitalmars-d@puremagic.com...
> [...]
>> > Sounds like the word you want is "closure".
>> >
>>
>> Now that you mention it, that *does* seem to make sense: (Warning:
>> Little bit of LALR internals here)
> [...]
>
> I'm familiar with LALR internals. :-)
>
>
> [...]
>> It would indeed appear that the term "closure" the books used for state graph generation is also applicable for the lookahead generation algorithm I described in the OP.
>>
>> I think another part of what made me not realize to call it "closure" is that somehow I had the impression of "closure" being a much more general, even nebulous term. For example, I knew I'd also heard it used in the context of database theory. But, looking that up now, that *is* the same damn algorithm, too.
>>
>> So I guess a /facepalm is in order here ;)
> [...]
>
> Well, the term itself *is* quite general. Mathematically speaking, for something to be "closed" means that every combination of operations on a set results in something that is already in the set. It's closed in the sense that no operation will get you "outside the set". So what's a closure? Given a set that *isn't* closed (i.e., some operations may take you "outside the set", so to speak), the "closure" of the set is a larger set that *is* closed.
>
> OK, an example is in order. Take the set A={0,1,2} and the operation +. A is not closed, because although 0+1 and 0+2 are in A, 1+2 is not in A. To form a closure, then, we take the elements produced by + that aren't already in A, and add them to A. So for example, we add 3 to A, now 1+2 is closed. But then 2+2 is not in the set. No problem, we just add that to the set too. But now the newly added elements 3 and 4 are themselves not closed under +, because 1+4 isn't in the set, for example. So we have to add *those* elements as well. (I hope you can see the parallel here with the algo in your OP -- you're trying to incrementally compute the closure.)
>
> Taken to its logical conclusion, we find eventually that in order for A to be closed under the operation +, we have to add *all* positive numbers to it. So the closure of A is actually the entire set of natural numbers.
>
> Now, this example isn't exactly the best one, because the closure is infinite, so an incremental algorithm to compute it would never terminate. However, not all closures are infinite. Closure under grammar rule applications, like in your OP, is finite. Basically you keep applying the rules until it doesn't yield anything more that you don't already have. Then you can stop, because you've found the closure.
>

Yea, that makes sense. I've been starting to think of it like: In both everyday english and in math/CS, "closure" is essentially "Everything's taken care of, no loose ends". Not a good technical definition, of course, but a way to help remember an internalize it.