May 11, 2006
Walter Bright wrote:
>>
>> Thanks.  Assuming I did this, would it be technically legal to destroy this list after the iteration ends but before opApply returns? Otherwise, I'm not entirely certain how to go about processing the removals.
> 
> I'm a bit reluctant to say yes on that, worrying that I'm forgetting something. What you can do is not use foreach, but just a for loop.

I think I could do that for Thread, though doing so may require an iterator implementation (threads are now stored in a linked list).  I have fewer options for ThreadGroup, as it's implemented using an AA, but perhaps I can come up with something.  However, using foreach offers a tremendous advantage that other options don't: I can use a 'synchronized' block inside opApply and get thread safety without any user involvement.  Better yet, on platforms with reentrant locks (i.e. all platforms D currently runs on) I can synchronize on the same object in container method calls and pay for only a single lock/unlock for the entire loop, regardless of what the user does to the container.  The alternative with traditional loops would be to use the iterator as an RAII mechanism for this lock, but that seems a disaster waiting to happen.  Alternatively, I could follow the traditional approach of not locking within the container at all, but that seems a tad odd for code specifically intended for multithreaded programming.

To redirect discussion a bit, what about something akin to iterator categories in C++?  For example, it seems likely that a compiler might try to optimize loops involving a random_access_iterator, but far less likely that this would occur for other iterator types (forward_iterator, input_iterator, etc.).  Perhaps there's a way to indicate to the compiler what sort of sequence is being represented, as a way to suggest (or restrict) possible optimizations that may be performed?  Thus I may be overjoyed to have the compiler figure out something clever to do with my BitArray iteration, and may even be willing to provide hints so that it can do so, but I wouldn't necessarily want or expect this to occur for my SqlCursor object.  This would also allow me to restrict optimizations in instances where dynamic manipulation of the sequence is more important to me than iteration performance.

This idea just came to me so I don't have any brilliant suggestions for how best to implement it, but one obvious method would be to have separate opApply signatures for each optimization category.  However, it would be nice if there were some way to represent in code how the built-in types should be handled as well.  Herb Sutter's Concur project attempts to solve this by including an additional qualifier in the loops themselves:

    for_each( c.depth_first(), f ); // sequential
    for_each( c.depth_first(), f, parallel ); // fully parallel
    for_each( c.depth_first(), f, ordered ); // ordered parallel

And while I don't entirely like that the user must specify the optimization/parallelization level to use, I do like that the choice isn't completely taken out of the hands of the programmer.  Does the general idea seem reasonable?


Sean
May 11, 2006
Sean Kelly wrote:
> Chris Miller wrote:
> 
>> A new linked list module has been released at http://www.dprogramming.com/list.php
>> You may be wondering why another linked list; check out the FAQ to see!
> 
> 
> Very cool.  One thing... does this work:
> 
>     foreach( Person p; per.each )
>     {
>         if( p.age > 50 )
>             p.listRemove();
>     }
> 
> ie. can you remove elements within a foreach?
> 
> 
> Sean


How about:

    alias void delegate() vvd;  // delegate taking no arguments
    vvd[] outs;                 // removals deferred until after the loop

    foreach( Person p; per.each )
    {
        if( p.age > 50 )
            outs ~= &p.listRemove;
    }

    foreach( act; outs )
        act();
May 11, 2006
You know, this whole sequence optimization business seems to be linked to mutability: if no mutating operations are performed then optimize away!  Unfortunately, I haven't seen a high-level construct for method classification that can't be violated by the user.  And while this may be irritating in some cases, it could lead to nasty bugs in the face of such optimizations.  In some respects, it would be nice if each operation in the language could be labeled as mutating or non-mutating and then functions could be analyzed for atomicity during compilation, so a function would be considered atomic if it does not perform a mutating operation on non-local data.  But this may be overly restrictive.  For example, the body of a foreach loop must be able to perform some mutating operations, but different categories would impose different restrictions on optimization:

    int sum = 0;
    foreach( int val; values ) {
        sum += val;
    }

Here, the code could be heavily parallelized because the result doesn't depend on the sequence in which operations are performed.  However:

    foreach( int val; values ) {
        printf( "val: %d\n", val );
    }

Here, the code could be optimized somewhat but must still be executed sequentially.  By comparison:

    void fn() {
        values ~= 5;
    }

    foreach( int val; values ) {
        if( val > 10 ) fn();
    }

No optimizations may be performed here because the iteration sequence is being modified during processing.  To make matters worse, this could occur through an untraceable sequence of pointers and such that the compiler can't make sense of.  What would be wonderful is if there were a way to distinguish these three categories of code somehow.  I know, that's the million dollar question, but I harbor a secret hope that if it's asked frequently enough then perhaps someone will think of an answer.


Sean
May 12, 2006
On Thu, 11 May 2006 09:52:10 -0700, Walter Bright wrote:

> Sean Kelly wrote:
>> Very cool.  One thing... does this work:
>> 
>>     foreach( Person p; per.each )
>>     {
>>         if( p.age > 50 )
>>             p.listRemove();
>>     }
>> 
>> ie. can you remove elements within a foreach?
> 
> It's undefined behavior. foreach is entitled to assume that the aggregate is a loop invariant, although the contents of the aggregate's elements can change. This precludes things like resizing an array inside a foreach, etc.

This bit my bum yesterday too. I had code inside the foreach that added to the list, only to discover that even though the elements were added, they were ignored inside the same foreach loop.

--------- example.d -----------
import std.stdio;
import std.string;
void tester(char[] x)
{
    writefln("Adding '%s'", x);
    vList ~= x;
}

char[][] vList;

void main()
{
    tester("abc");
    tester("def");
    tester("ghi");

version(FOR)
{    for(int i; i < vList.length; i++)
    {
        char[] p = vList[i];
        writefln("Found '%s'", p);
        if (std.string.find(p, "e") != -1)
        {
            tester("orang");
        }
    }

}

version(EACH)
{   foreach(char[] p; vList)
    {
        writefln("Found '%s'", p);
        if (std.string.find(p, "e") != -1)
        {
            tester("orang");
        }
    }

}

}
---------- end of example.d ----------


c:\temp>build example -full -exec -version=FOR
Path and Version : y:\util\build.exe v2.10(1556)
  built on Thu Apr  6 13:13:32 2006
Adding 'abc'
Adding 'def'
Adding 'ghi'
Found 'abc'
Found 'def'
Adding 'orang'
Found 'ghi'
Found 'orang'

c:\temp>build example -full -exec -version=EACH
Path and Version : y:\util\build.exe v2.10(1556)
  built on Thu Apr  6 13:13:32 2006
Adding 'abc'
Adding 'def'
Adding 'ghi'
Found 'abc'
Found 'def'
Adding 'orang'
Found 'ghi'


-- 
Derek
(skype: derek.j.parnell)
Melbourne, Australia
"Down with mediocracy!"
12/05/2006 10:08:16 AM
May 12, 2006
Sean Kelly wrote:
> You know, this whole sequence optimization business seems to be linked to mutability: if no mutating operations are performed then optimize away!  Unfortunately, I haven't seen a high-level construct for method classification that can't be violated by the user.  And while this may be irritating in some cases, it could lead to nasty bugs in the face of such optimizations.  In some respects, it would be nice if each operation in the language could be labeled as mutating or non-mutating and then functions could be analyzed for atomicity during compilation, so a function would be considered atomic if it does not perform a mutating operation on non-local data.  But this may be overly restrictive.  For example, the body of a foreach loop must be able to perform some mutating operations, but different categories would impose different restrictions on optimization:
> 
>     int sum = 0;
>     foreach( int val; values ) {
>         sum += val;
>     }
> 
> Here, the code could be heavily parallelized because the result doesn't depend on the sequence in which operations are performed.  However:
> 
>     foreach( int val; values ) {
>         printf( "val: %d\n", val );
>     }
> 
> Here, the code could be optimized somewhat but must still be executed sequentially.  By comparison:
> 
>     void fn() {
>         values ~= 5;
>     }
> 
>     foreach( int val; values ) {
>         if( val > 10 ) fn();
>     }
> 
> No optimizations may be performed here because the iteration sequence is being modified during processing.  To make matters worse, this could occur through an untraceable sequence of pointers and such that the compiler can't make sense of.  What would be wonderful is if there were a way to distinguish these three categories of code somehow.  I know, that's the million dollar question, but I harbor a secret hope that if it's asked frequently enough then perhaps someone will think of an answer.
> 
> 
> Sean

The simplest answer is to trust the user to supply some unreserved words as hints to the compiler about what the loop's behavior is.

-- 
Regards,
James Dunne
May 12, 2006
James Dunne wrote:
> 
> The simplest answer is to trust the user to supply some unreserved words as hints to the compiler about what the loop's behavior is.

Yeah.  I wondered briefly if 'volatile' might suit, but I need to deepen my understanding of optimizer techniques before going any further.  One pretty basic question I had was whether a volatile statement essentially amounts to a manually specified basic block, or whether the restriction on moving loads and stores across it doesn't apply to ordinary basic blocks in quite the same way.


Sean
May 12, 2006
Sean Kelly wrote:
> You know, this whole sequence optimization business seems to be linked to mutability: if no mutating operations are performed then optimize away!  Unfortunately, I haven't seen a high-level construct for method classification that can't be violated by the user.  And while this may be irritating in some cases, it could lead to nasty bugs in the face of such optimizations.  In some respects, it would be nice if each operation in the language could be labeled as mutating or non-mutating and then functions could be analyzed for atomicity during compilation, so a function would be considered atomic if it does not perform a mutating operation on non-local data.  But this may be overly restrictive.  For example, the body of a foreach loop must be able to perform some mutating operations, but different categories would impose different restrictions on optimization:
> 
>     int sum = 0;
>     foreach( int val; values ) {
>         sum += val;
>     }
> 
> Here, the code could be heavily parallelized because the result doesn't depend on the sequence in which operations are performed.  However:
> 
>     foreach( int val; values ) {
>         printf( "val: %d\n", val );
>     }
> 
> Here, the code could be optimized somewhat but must still be executed sequentially.  By comparison:
> 
>     void fn() {
>         values ~= 5;
>     }
> 
>     foreach( int val; values ) {
>         if( val > 10 ) fn();
>     }
> 
> No optimizations may be performed here because the iteration sequence is being modified during processing.  To make matters worse, this could occur through an untraceable sequence of pointers and such that the compiler can't make sense of.  What would be wonderful is if there were a way to distinguish these three categories of code somehow.  I know, that's the million dollar question, but I harbor a secret hope that if it's asked frequently enough then perhaps someone will think of an answer.
> 
> 
> Sean

Maybe I'm taking this out of context because I just jumped into this thread, but...

I think a D compiler (because foreach is built-in) could safely optimize those three w/o a lot of magic right now.

For #1, neither sum nor val are references and there isn't a call outside of the loop scope. So I think a compiler could determine that and fully optimize w/ parallelization or whatever.

The 2nd one calls outside of the loop scope, so the compiler could keep track of that to determine that it can't safely do any optimization where the elements may be unordered. Even if it is a call to something like putc which is then inlined, it must eventually make a call and/or write a value to a hardware reference, either of which - call or dereference - the optimizer could flag as "can't use an unordered optimization".

IIRC, the 3rd one is undefined behavior because the aggregate is modified inside the loop. But even if it wasn't, the same restrictions for #2 could be put on it because it calls outside of the loop scope.
May 12, 2006
On Fri, 12 May 2006 02:52:10 +1000, Walter Bright <newshound@digitalmars.com> wrote:

> Sean Kelly wrote:
>> Very cool.  One thing... does this work:
>>      foreach( Person p; per.each )
>>     {
>>         if( p.age > 50 )
>>             p.listRemove();
>>     }
>>  ie. can you remove elements within a foreach?
>
> It's undefined behavior. foreach is entitled to assume that the aggregate is a loop invariant, although the contents of the aggregate's elements can change. This precludes things like resizing an array inside a foreach, etc.

I fell for this one just yesterday. I had ...

     foreach( Person p; People)
     {
         Foo();
         SomeFunc(); // Potentially adds an item to the People list.
     }

but the new items were never processed by Foo() !

-- 
Derek Parnell
Melbourne, Australia
May 12, 2006
Dave wrote:
> 
> Maybe I'm taking this out of context because I just jumped into this thread, but...
> 
> I think a D compiler (because foreach is built-in) could safely optimize those three w/o a lot of magic right now.
> 
> For #1, neither sum nor val are references and there isn't a call outside of the loop scope. So I think a compiler could determine that and fully optimize w/ parallelization or whatever.
> 
> The 2nd one calls outside of the loop scope, so the compiler could keep track of that to determine that it can't safely do any optimization where the elements may be unordered. Even if it is a call to something like putc which is then inlined, it must eventually make a call and/or write a value to a hardware reference, either of which - call or dereference - the optimizer could flag as "can't use an unordered optimization".
> 
> IIRC, the 3rd one is undefined behavior because the aggregate is modified inside the loop. But even if it wasn't, the same restrictions for #2 could be put on it because it calls outside of the loop scope.

I think you're right.  I'll admit much of my confusion here is a result of not knowing just how aggressive current optimization strategies are.  Also, this seems like a good opportunity to explore how this might extend to implicit parallelization and such, which seems to follow similar rules but isn't at all common (I think the Intel compiler does parallel loop unrolling in certain instances, but that's about it).  Ideally, it would be nice if there were some means of allowing optimizations to take place for which there currently isn't enough information, similar to the various levels of parallelization in the C++ example.  But at the very least I'd be happy if I didn't have to give up on foreach altogether in instances where the result should be well-defined before optimizations take place (#3 is still undefined, but my linked list example should not be).


Sean
May 12, 2006
Derek Parnell wrote:
> On Fri, 12 May 2006 02:52:10 +1000, Walter Bright <newshound@digitalmars.com> wrote:
> 
>> Sean Kelly wrote:
>>> Very cool.  One thing... does this work:
>>>      foreach( Person p; per.each )
>>>     {
>>>         if( p.age > 50 )
>>>             p.listRemove();
>>>     }
>>>  ie. can you remove elements within a foreach?
>>
>> It's undefined behavior. foreach is entitled to assume that the aggregate is a loop invariant, although the contents of the aggregate's elements can change. This precludes things like resizing an array inside a foreach, etc.
> 
> I fell for this one just yesterday. I had ...
> 
>      foreach( Person p; People)
>      {
>          Foo();
>          SomeFunc(); // Potentially adds an item to the People list.
>      }
> 
> but the new items were never processed by Foo() !

If People is a dynamic array and adding a person to the list results in a reallocation then I'd expect this.  I would actually consider this undefined behavior for dynamic arrays because whether a reallocation occurs is implementation defined.  But for a linked list it should work, darnit :-)


Sean