May 11, 2008
Yigal Chripun Wrote:

> Dee Girl wrote:
> > Yigal Chripun Wrote:
> > 
> > 
> > This should be good for the standard library. It is the most flexible approach. You can always do what you want in one line:
> > 
> > void mysort(int array[], bool delegate comp(int, int)) { sort!(comp)(array); }
> > 
> > Now you can call mysort many times and there is only one version of std.sort (if I understand it correctly).
> 
> not exactly. there's always a balance between memory and time, if you
> improve the first second worsens.
> for each different comp function there will be a different sort instance
> created by the compiler.

I do not think so. In fact I am sure you are entirely wrong this time.

mysort has a signature that is not parameterized. The language definitions (I just read again) implies the model of compiling is that not-parameterized functions have only one implementation.

If mysort has only one implementation, it is not possible for mysort to use different instantiations of sort. It is called the pigeonhole principle. It uses only one instantiations for all calls. Unless D generates code at runtime. ^_^

> a simple example:
> struct num(N) { int number = N; }
> num!(1) is different from num!(2);
> so with this method you'll get a bigger executable that contains the
> sort code twice.

I understand that very well (better than English...) I did just implemented a compiler for a mini-pl0 language with generics in class. These problems of code duplication are very clear when I wrote the compiler my self! ^_^ But maybe you did not understand how mysort works without duplicating code.

> with my way on the other hand you'll always have only one sort function.
> the cost in my case is that sort will always call the comp function to
> compare while the template can avoid that function call. simply time vs.
> memory. memory is cheap nowadays and the linker could be made better to
> try to minimize the memory footprint. on the other hand cpu chips today
> come with multi-cores and the compiler could optimize many calls to
> small functions. so in the obj file, the compiler can inline the
> function body instead of calling it.
> the main question would be: what are you optimizing for?
> if you develop on a cellphone memory could be more important then time,
> but on a missile a nano second could mean the missile missed its target.
> so it all depends on what you want to achieve.

If you believe this then you should be happy with std.sort. Which gives the option to optimize for what you want. All other sort implementation I read only go one way and it's hard to make them go an other way.

> > But think what if your mysort is the only implementation given in the standard library. Maybe sometimes it is too slow for your program. What do you do?
> > 
> > void fastsort!(alias comp)(int[] array) { bool dg(int a, int b) {
> > return comp(a, b); } mysort(array, dg); }
> > 
> > This will not be faster at all ^_^. As a friend said: You can build a dynamic library from a static one but not inverse. Things only stack one direction. I think it is good the way sort is now in the standard library, it offers the maximum possible and opens all options. It is very impressive for someone coming from many other language.
> > 
> >> Now, when you use a string with the template, the sort will contain the code itself inline to the sort instance so it "saves" you a function call. IMO, the string solution is not clean ( C MACROs? )and while for simple cases it could save you a function call it's premature optimization IMO.
> > 
> > To me it looks very convinent!
> > 
> >> Also, if I use delegates, I can't see how a template can save that function call. so it just bloats my executable with unneeded copies of sort.
> > 
> > You can use mysort above. It is my first contribution to D. ^_^
> > 
> >> I hate when library code forces me to this. I think the programmer knows his code better than the library writer and should be able to choose himself whether he needs to use a compile time solution to save function calls or a run-time solution to prevent unnecessary executable bloat. this optimization should be made by the programmer and _not_ the library writer. This solution does not give me that freedom.
> > 
> > It looks like you understand the exact oposite. std.sort gives you all freedom, anything else would give less. I have studied many languages but none offers the same flexibility. Functional languages force you to copy. Java, Smalltalk and C# force virtual calls on comparisons. C++ force you to go outside your function and write the comparison even simple as a.member < b.member! D is the best and std.algorithm very elegant.
> > 
> >> this solution is a classic STL piece of code, BUT, the main difference is that for C++ this is probably the only reasonable solution as C++ doesn't have delegates. this is why when coding for D I urge people to rethink their solution instead of just doing what is best in C++ (or any other preferred language of the library writer). This is a prime example why the C++ solution is wrong for D.
> > 
> > The other rant you wrote and the interesting discussion with Janice are good to read. But there is little objective points you bring. I see you do not like anything like C++ and then you go back and try to make arguments.
> > 
> > One objective point is you like a container hierarchy. I have worked with them and they can be useful. But like with mysort the same argument works. You can build a hierarchy of containers on top of fast containers. But not the inverse. Dee Girl
> 
> My rant emphasizes specific things, since that's the point of a rant. I could talk about all the things I like in C++ and Janice would agree and we won't have anything interesting to talk about. I do use c++ and it is a powerful language. My point is not that I'm against templates, or that I hate C++. What i'm trying to say is that using any one tool for everything is wrong.

But you want to use one tool for all job. You want indirect calls and they force your design only one way.

> I'm not arguing that delegates should replace all
> template code in D! I'm arguing that both solutions need to be evaluated
> and the better one chosen, also note that these two options overlap
> sometimes and (when it's suitable) both solutions should be provided so
> the user could choose his preferred trade-off (time vs. memory).

Maybe once it is clear the inclusion relationship: that with the template you can have dynamic. Then you will be happy. Dee Girl
May 11, 2008
Janice Caron wrote:
> On 11/05/2008, Yigal Chripun <yigal100@gmail.com> wrote:
>>  for each different comp function there will be a different sort instance
>>  created by the compiler.
> 
> But surely, every container must itself be templatised on the element? I mean, unless you go the Java way and /only/ have a container for Objects. So your plan would instantiate at least one sort function for every element type.
> 
> Moreover, since your OrderedCollection interface would demand that every container must implement sort, then every different kind of collection would have it's own sort function.
> 
> So under your scheme, the total number of sort functions would equal the total number of container types. In other word, if your program used
> 
>     List!(char) list1, list2;
>     List!(int) list3;
>     List!(C) list4, list5;
>     Array!(double) array1;
> 
> then there would be four sort functions. In what way is that fewer than would be produced by
> 
>     sort(list1);
>     sort(list2);
>     sort(list3);
>     sort(list4);
>     sort(list5);
>     sort(array1);
> 
> ? The way I see it, we both get four. However, if choose /not/ to sort list3, then I only get three instantiations, but you still get four.
> 
> 
> 
> 
>>  What i'm trying to say is that using any one tool for
>>  everything is wrong.
> 
> You should have just said that to start with, because it's a statement of the obvious and no one will disagree with you. :-)
> 
> 
>>  I'm not arguing that delegates should replace all
>>  template code in D! I'm arguing that both solutions need to be evaluated
>>  and the better one chosen
> 
> Why are you assuming that that didn't happen?
> 
> 
>>  also note that these two options overlap
>>  sometimes and (when it's suitable) both solutions should be provided so
>>  the user could choose his preferred trade-off (time vs. memory).
> 
> Except that, if you can trivially create one from the other, then providing one is effectively providing both anyway.

From what I know, an efficient STL implementation would use C style
containers with void* under the hood to minimize executable bloat.
in a perfect world, my idealized container collection would do something
similar. the interfaces need to use templates so that I have a list!(T)
and such with a proper hierarchy, however under the hood all those
List!(T) with different T types will use the same code. also, My
idealized collection framework would use mixins extensively to share
common code.
so, if all those tools are combined properly, the amount of sort
instances should be minimal.
the Java way is the extreme opposite of the c++ way, I want something in
the middle, Eiffel has the best design IMO, couple that API with C
efficiency and we got a winner :)
the idea is of course take the best from each language and combine it to
something even better.
BTW, you really should take a look at Eiffel. for example, it has
multiple-inheritance implemented in a completely different way than C++
avoiding most of c++ problems with it. they have many novel ideas there
that D can and should learn from.

My perfect collection framework would be based on the Eiffel design, avoiding problems in the Java model and problems in the C++ model.

--yigal
May 11, 2008
your reply is correct but it is misleading.
a different example:
void Fn() {
  int i;
}

every time the function is called a new int i is generated on the stack
and destroyed when the function is exited. so for a specific call to Fn
there can be only one variable i, in the course of the program that same
variable is created and destroyed many times ( for each call to the
function).
you are correct in saying that there is only one mysort function but
every time you call it with a different comp delegate it needs to have a
different sort instance. the difference is that the int i is generated
on the stack but all those different sort instances are not. those are
code which goes into the executable.
so while in any specific time mysort uses only one sort instance, over
the course of the entire program the sort template can get instantiated
more than once.
does it make more sense to you, now?

--Yigal
May 11, 2008
On 11/05/2008, Yigal Chripun <yigal100@gmail.com> wrote:
> your reply is correct but it is misleading.

Whose reply? What is this a reply to?

You might want to continue replying point-by-point. It makes things a lot easier to follow.


>  you are correct in saying that there is only one mysort function

Who said that, and in what context?


>  but
>  every time you call it with a different comp delegate it needs to have a
>  different sort instance.

OK, I see what you're saying. If I do

    sort!(English)(text);
    sort!(Czech)(text);

then I do indeed get two instantiations (which presumably will sort text alphabetically by English and Czech rules respectively).

HOWEVER. If I write

    void mysort(T)(T[] array, int delegate(T,T) compare)
    {
        sort!(compare)(array);
    }

then I have implemented your desired solution, trivially. And now I can call

    mysort(array,English);
    mysort(array,Check);

and still there's only one instantiation of sort. It's exactly what Dee's been saying all along. You can make dynamic from static, but you can't make static from dynamic.
May 11, 2008
Janice Caron wrote:
> On 11/05/2008, Yigal Chripun <yigal100@gmail.com> wrote:
>> your reply is correct but it is misleading.
> 
> Whose reply? What is this a reply to?
> 
> You might want to continue replying point-by-point. It makes things a lot easier to follow.
> 
> 
>>  you are correct in saying that there is only one mysort function
> 
> Who said that, and in what context?
> 
> 
sorry for that, that is a reply to dee.

>>  but
>>  every time you call it with a different comp delegate it needs to have a
>>  different sort instance.
> 
> OK, I see what you're saying. If I do
> 
>     sort!(English)(text);
>     sort!(Czech)(text);
> 
> then I do indeed get two instantiations (which presumably will sort text alphabetically by English and Czech rules respectively).
> 
> HOWEVER. If I write
> 
>     void mysort(T)(T[] array, int delegate(T,T) compare)
>     {
>         sort!(compare)(array);
>     }
> 
> then I have implemented your desired solution, trivially. And now I can call
> 
>     mysort(array,English);
>     mysort(array,Check);
> 
> and still there's only one instantiation of sort. It's exactly what Dee's been saying all along. You can make dynamic from static, but you can't make static from dynamic.

I don't follow. How can there be only one sort instance?
the mysort function uses both sorts so the code for both needs to be in
the executable...the first call instantiates the first sort for English,
and the second instantiates the second sort for Czech.

May 11, 2008
On 11/05/2008, Yigal Chripun <yigal100@gmail.com> wrote:
> From what I know, an efficient STL implementation would use C style
>  containers with void* under the hood to minimize executable bloat.

I /think/ I get what you're saying here, which is (correct me if I'm
wrong, but) given

    class C {}
    class D {}

then

    Vector!(C) c;
    Vector!(D) d;

produces two instantiations of Vector! - one for C and one for D - and yet those instantiations will produce byte-for-byte identical machine code.

However - this is a linker problem, not a template problem, and Walter has said that this will be fixed in the future. So in the future, whenever any two or more segments consist of identical bytes, the linker may keep one and throw the rest away.

The moral of the story is: specify your intent, but leave the implementation to the compiler/linker.

If you try to "outguess" the compiler, by instantiating only one single Vector!(void*), then in the long run, all you are doing is creating obfuscated code for short term gain.
May 11, 2008
On 11/05/2008, Yigal Chripun <yigal100@gmail.com> wrote:
> I don't follow. How can there be only one sort instance?

Because I only instantiated it once - specifically with the delegate named "compare".

>  the mysort function uses both sorts

No, the mysort function is handed a delegate an input parameter.


> so the code for both needs to be in
> the executable...

The comparison functions need to be in the executable, but not two separate instantiations of sort.

Why is my last post not clear? You get two instantiations with

    sort!(English)(text);
    sort!(Czech)(text);

because I statically passed in two separate delegates, but you only get one instantiation with

    sort!(compare)(array)

because I statically passed in only one delegate. That you passed two into mysort is irrelevant, because that's a runtime thing.
May 11, 2008
Yigal Chripun Wrote:

> your reply is correct but it is misleading.
> a different example:
> void Fn() {
>   int i;
> }
> 
> every time the function is called a new int i is generated on the stack
> and destroyed when the function is exited. so for a specific call to Fn
> there can be only one variable i, in the course of the program that same
> variable is created and destroyed many times ( for each call to the
> function).
> you are correct in saying that there is only one mysort function but
> every time you call it with a different comp delegate it needs to have a
> different sort instance. the difference is that the int i is generated
> on the stack but all those different sort instances are not. those are
> code which goes into the executable.
> so while in any specific time mysort uses only one sort instance, over
> the course of the entire program the sort template can get instantiated
> more than once.
> does it make more sense to you, now?

I am sorry Yigal. What you say makes no sense. Your previous reply is wrong and this reply is even more wrong.

It is simple logic. mysort has only one body. Inside that body is has a regular call to a function that implements the sorting. That call can only go to one function! It is very clear. D does not generate code at runtime.

It is a silly exercise but we can do this. Please compile this program.

=== test.d
#!/home/yasuko/bin/compile-launch -w
import std.algorithm;
import std.stdio;

void mysort(int[] array, bool delegate(int, int) comp)
{
    sort!(comp)(array);
}

void main()
{
    int[] array = [ 1, 2, 3, 4 ];
    bool comp1(int a, int b) { return a > b; }
    mysort(array, &comp1);
    //writeln(array);
    bool comp2(int a, int b) { return a < b; }
    mysort(array, &comp2);
    //sort!(comp2)(array);
    //writeln(array);
}
===

writeln is taken out so it does not get confusing. Please run

dmd -g test.d
objdump -d test.o|less

Search for word "section" in the disassembled file. There are 19. It seems each function has its section.

1. section .text is the all file

2. section .text._D4test6mysortFAiDFiiZbZv is the mysort function. I do not know the naming scheme of D but it is easy to guess which function is.

3. section .text._Dmain is the main funciton

4. section .text._D4test4mainFZv8compare1MFiiZb is compare1

5. section .text._D4test4mainFZv8compare2MFiiZb is compare2

6. section .text._D4test6mysortFAiDFiiZbZv108__T4sortS36_D4test6mysortFAiDFiiZbZv4compDFiiZbVE3std9algorithm12SwapStrategy0S233std9algorithm8iterSwapTAiZ4sortMFAiZv is the std.sort function

7. section .text._D3std8iterator12__T5beginTiZ5beginFAiZPi is begin funciton

8. section .text._D3std8iterator10__T3endTiZ3endFAiZPi is end function

9. section .text._D4test6mysortFAiDFiiZbZv112__T8sortImplS36_D4test6mysortFAiDFiiZbZv4compDFiiZbVE3std9algorithm12SwapStrategy0S233std9algorithm8iterSwapTAiZ8sortImplMFAiZv is the sortimpl function

10. section .text._D4test6mysortFAiDFiiZbZv112__T8sortImplS36_D4test6mysortFAiDFiiZbZv4compDFiiZbVE3std9algorithm12SwapStrategy0S233std9algorithm8iterSwapTAiZ8sortImplMFAiZv4predMFiZb is the pred function

11. section .text._D4test6mysortFAiDFiiZbZv55__T8getPivotS36_D4test6mysortFAiDFiiZbZv4compDFiiZbTPiZ8getPivotMFPiPiZPi is the getPivot function

12. section .text._D3std9algorithm16__T8iterSwapTPiZ8iterSwapFPiPiZv is the iterSwap function

13. section .text._D3std9contracts17__T8pointsToTiTiZ8pointsToFKiKiZb is the pointsTo function

14. section .text._D3std9algorithm11__T4swapTiZ4swapFKiKiZv is the swap function

15. section .text._D3std8iterator12__T5rangeTiZ5rangeFPiPiZAi is the range function

16. section .text._D4test6mysortFAiDFiiZbZv112__T8sortImplS36_D4test6mysortFAiDFiiZbZv4compDFiiZbVE3std9algorithm12SwapStrategy0S233std9algorithm8iterSwapTAiZ8sortImplMFAiZv243__T9partitionS165_D4test6mysortFAiDFiiZbZv112__T8sortImplS36_D4test6mysortFAiDFiiZbZv4compDFiiZbVE3std9algorithm12SwapStrategy0S233std9algorithm8iterSwapTAiZ8sortImplMFAiZv4predMFiZbVE3std9algorithm12SwapStrategy0S233std9algorithm8iterSwapTAiZ9partitionMFAiZPi is the partition function

17. section .text._D4test6mysortFAiDFiiZbZv55__T8isSortedS36_D4test6mysortFAiDFiiZbZv4compDFiiZbTAiZ8isSortedMFAiZb is the isSorted function

18. section .text._D4test6mysortFAiDFiiZbZv55__T8isSortedS36_D4test6mysortFAiDFiiZbZv4compDFiiZbTAiZ8isSortedMFAiZb4predMFiiZb is another isSorted function (I do not now why there are two)

19. section .text._D4test6mysortFAiDFiiZbZv55__T8isSortedS36_D4test6mysortFAiDFiiZbZv4compDFiiZbTAiZ8isSortedMFAiZb133__T12findAdjacentS108_D4test6mysortFAiDFiiZbZv55__T8isSortedS36_D4test6mysortFAiDFiiZbZv4compDFiiZbTAiZ8isSortedMFAiZb4predMFiiZbTAiZ12findAdjacentMFAiZPi is the isAdjacent function.

It is clear now that there is only one instantiation of std.sort. I hope now it is all clear. Thank you, Dee Girl
May 11, 2008
Janice Caron wrote:
<snip>
if interpret_cast can be implemented by the user and it is quite
dangerous, than i think people who need that should implement that
themselves with union as you showed. no need for D to provide that
functionality by default.
I'm still not sure about constancy casts. but in order to provide the
minimal solution, I'd prefer cast!(T) to do constancy casts over adding
another construct for that. also, I don't like the static cast(), and it
might cause ambiguity problems. what is this good for?

open question: if this proposal does not provide a reinterpret_cast, can we split the cast!(T) to perform either a type change /or/ a constancy change?

int n = ...;
auto d1 = cast(double)n; //converts the n to double so 1 => 1.0
// below case uses the same bit pattern, exception on invalid double
// value
auto d2 = cast!(double)n;

same goes for classes with user defined cast and cast! operators.
downcast can be treated as a compiler pre-defined conversion from a
class to its derived classes. uses the cast(Derived) form and throws
exception on error.
constancy is performed via cast!(T).
pointers are cast with cast!(T)
so to convert const(int)* to const(long*) you'll need something like:
auto res = cast!(const(long*))cast(const(long)*)x;

May 11, 2008
Janice Caron Wrote:

> Why is my last post not clear? You get two instantiations with
> 
>     sort!(English)(text);
>     sort!(Czech)(text);
> 
> because I statically passed in two separate delegates, but you only get one instantiation with
> 
>     sort!(compare)(array)
> 
> because I statically passed in only one delegate. That you passed two into mysort is irrelevant, because that's a runtime thing.

Thank you. This is a clear explaining. It took me long time to write my massage with disassembling a program! I wish I saw your post first. I would not have to write it ^_^ Thank you, Dee Girl