November 15, 2013
Ary Borenszweig:

> And so LLVM, which is what Crystal uses as a backend.

LDC2 uses the same back end :-)


> But I thought ranges were meant to be fast. No allocations and all of that. In fact, I was kind of sad that Crystal doesn't have a similar concept so it could never get as fast as D ranges. But if D ranges are not fast, what's the point of having them and making everyone use them?

If you use ranges badly you will get a slow program, if you use them well with a good back-end, you will have a fast program.

I have written a third intermediate program, it's longer than yours, and it seems much slower than your code:


void main(in string[] args) {
    import std.stdio, std.conv, std.algorithm, std.array;

    immutable n = (args.length == 2) ? args[1].to!uint : 10;
    if (n == 0)
        return;

    auto seq = "1";
    writefln("%2d: n. digits: %d", 1, seq.length);
    foreach (immutable i; 2 .. n + 1) {
        Appender!string result;
        foreach (immutable digit, immutable count; seq.group) {
            result ~= "123"[count - 1];
            result ~= digit;
        }
        seq = result.data;
        writefln("%2d: n. digits: %d", i, seq.length);
    }
}


On my system it runs in 0.34 seconds for n=50. Could you compare some of the timings of the various D/Crystal versions on your system (using ldc2 for D)?

Bye,
bearophile
November 15, 2013
On Friday, 15 November 2013 at 13:13:51 UTC, Ary Borenszweig wrote:
> On 11/15/13 10:07 AM, Chris wrote:
>> On Friday, 15 November 2013 at 12:47:21 UTC, bearophile wrote:
>>> Ary Borenszweig:
>>>
>>>> Here's what I was able to do in some minutes:
>>>>
>>>> ---
>>>> if ARGV.length != 1
>>>> puts "missing argument: n"
>>>> exit 1
>>>> end
>>>>
>>>> n = ARGV[0].to_i
>>>> str = "1"
>>>> buffer = String::Buffer.new(20)
>>>> n.times do
>>>> puts str.length
>>>> str.each_chunk do |digit, count|
>>>>   buffer << '0' + count
>>>>   buffer << digit
>>>> end
>>>> str = buffer.to_s
>>>> buffer.clear
>>>> end
>>>>
>>>> With n=70 it takes about 4.89s. With n=45 it takes about 0.012s.
>>>
>>> This program is much longer in number of tokens to the first D
>>> program. You can write a D program about as fast as this in about the
>>> same number of tokens.
>>>
>>> Perhaps I should add an intermediate third version that shows code
>>> that's not as extreme as the two versions there. Thank you for the
>>> implicit suggestion.
>>>
>>>
>>>> And with Crystal you could do the second version as well, because you
>>>> have access to low level stuff like pointers.
>>>
>>> In Crystal do you have final switches, gotos, etc too?
>>>
>>>
>>>> And also, the language is pretty new so there's still
>>>> a lot of optimizations to be done.
>>>
>>> And LDC2 will improve in the meantime.
>>>
>>>
>>>> I also thought ranges were pretty fast because of their nature.
>>>
>>> It also matters a lot how you use them, this is normal in computer
>>> programming.
>>>
>>>
>>>> Why are they slow in this example?
>>>
>>> Just because the first example is not written for speed, I didn't even
>>> add run-time timings for it at first. And it's not that slow.
>>>
>>> Bye,
>>> bearophile
>>
>> Slightly OT: Why do languages like Ruby (and now Crystal) have to state
>> the obvious in an awkward way?
>>
>> (2...max).each do
>>
>> Of course you _do_ _each one_ from 2 to max. Is it to make it more
>> "human"?
>
> Absolutely. You are a human and you spend a lot of time reading code. The more human the code looks to you, the better, I think, as long as it doesn't become too long or too annoying to read, like:
>
> for every number between 2 and max do
>   ...
> end
>
> :-P

Well, that was exactly my point. As a human being you don't need the patronizing (and highly annoying) "for every number ...". This is what you say when you explain it to a newbie. But there is no need to spell this out in the syntax. Syntax of programming languages is (or should be) like road signs, or any other signs. Concise and expressive. Else, what's the point? I know that languages like Lua have the philosophy that non-programmers should be able to use it. But every human being is capable of abstracting things. There is no need for this terrible syntax

(2..max).each do:

end

It doesn't add anything to the code except for useless characters. Humans have used signs and acronyms for ages. We can cope with it. I once saw the most beautiful encrypted message in Arabic, which when read properly unfolds into an array of characters and meaning. We humans can deal with it. I still don't see why x++; is a problem and has to be spelled out as x = x + 1, or even x += 1 (slightly better).

If Ruby programmers had invented spelling, you would "Double U Ar I Tee ee" like this. Ha ha ha! :-)
November 15, 2013
On 11/15/13 10:33 AM, bearophile wrote:
> Ary Borenszweig:
>
>> And so LLVM, which is what Crystal uses as a backend.
>
> LDC2 uses the same back end :-)
>
>
>> But I thought ranges were meant to be fast. No allocations and all of
>> that. In fact, I was kind of sad that Crystal doesn't have a similar
>> concept so it could never get as fast as D ranges. But if D ranges are
>> not fast, what's the point of having them and making everyone use them?
>
> If you use ranges badly you will get a slow program, if you use them
> well with a good back-end, you will have a fast program.
>
> I have written a third intermediate program, it's longer than yours, and
> it seems much slower than your code:
>
>
> void main(in string[] args) {
>      import std.stdio, std.conv, std.algorithm, std.array;
>
>      immutable n = (args.length == 2) ? args[1].to!uint : 10;
>      if (n == 0)
>          return;
>
>      auto seq = "1";
>      writefln("%2d: n. digits: %d", 1, seq.length);
>      foreach (immutable i; 2 .. n + 1) {
>          Appender!string result;
>          foreach (immutable digit, immutable count; seq.group) {
>              result ~= "123"[count - 1];
>              result ~= digit;
>          }
>          seq = result.data;
>          writefln("%2d: n. digits: %d", i, seq.length);
>      }
> }
>
>
> On my system it runs in 0.34 seconds for n=50. Could you compare some of
> the timings of the various D/Crystal versions on your system (using ldc2
> for D)?

Sure.

This last version you wrote, compiling it with "-enable-inlining -release -O3", takes 0.054s (please tell me if I'm missing some flags, in Crystal I just used --release). The Crystal version takes 0.031s. I also tried with n=70. In D: 9.265s. In Crystal: 4.863s.

I couldn't compile the first version written in D because I get:

Error: pure function '__lambda1' cannot call impure function 'join'

(I think you mentioned this in another post in this thread)

The super-optimized D version, with n=70, takes 1.052s. This is the fastest.

However, I'm starting to think that all those immutable, final switches and gotos are useless if they don't give a performance benefit (well, final switches do give you more safety). Maybe it's just that D/ldc doesn't use the immutability information and everything else to do aggressive optimizations?
November 15, 2013
On 11/15/13 11:39 AM, Chris wrote:
> Well, that was exactly my point. As a human being you don't need the
> patronizing (and highly annoying) "for every number ...". This is what
> you say when you explain it to a newbie. But there is no need to spell
> this out in the syntax. Syntax of programming languages is (or should
> be) like road signs, or any other signs. Concise and expressive. Else,
> what's the point? I know that languages like Lua have the philosophy
> that non-programmers should be able to use it. But every human being is
> capable of abstracting things. There is no need for this terrible syntax
>
> (2..max).each do:
>
> end

No need to do that. You can, if you want to. I would have done:

2.upto(max) do
  ...
end

>
> It doesn't add anything to the code except for useless characters.

What do you mean by useless characters? How do you do it in D?
November 15, 2013
Ary Borenszweig:

Your timings are good enough for me, I have updated the rosettacode page with the third D version.

> However, I'm starting to think that all those immutable, final switches and gotos are useless if they don't give a performance benefit (well, final switches do give you more safety).

In the second D program if you compile with ldc2 the final switch gives a significant performance increase :-)


> Maybe it's just that D/ldc doesn't use the immutability information and everything else to do aggressive optimizations?

This was discussed some time ago. Probably there are ways for ldc2 to use a little better the static information of well annotated D code.

Bye,
bearophile
November 15, 2013
On Friday, 15 November 2013 at 15:20:20 UTC, Ary Borenszweig wrote:
> On 11/15/13 11:39 AM, Chris wrote:
>> Well, that was exactly my point. As a human being you don't need the
>> patronizing (and highly annoying) "for every number ...". This is what
>> you say when you explain it to a newbie. But there is no need to spell
>> this out in the syntax. Syntax of programming languages is (or should
>> be) like road signs, or any other signs. Concise and expressive. Else,
>> what's the point? I know that languages like Lua have the philosophy
>> that non-programmers should be able to use it. But every human being is
>> capable of abstracting things. There is no need for this terrible syntax
>>
>> (2..max).each do:
>>
>> end
>
> No need to do that. You can, if you want to. I would have done:
>
> 2.upto(max) do
>   ...
> end
>
>>
>> It doesn't add anything to the code except for useless characters.
>
> What do you mean by useless characters? How do you do it in D?

I prefer the C style syntax:

foreach (whatever) {
  // ...
} // closes block

All this "end" stuff is superfluous, but at least it closes blocks. My pet hate as far as syntax is concerned is Python. The indentation (t)error. Change, copy paste > run: Indentation error on line ... WTF?! In C style it doesn't matter if you have

if (a > b) { return; }

if (a > b) {
  return;
}

if (a > b)
{
return;
}

or in loops

    foreach (i; array) {
if (i == "Hello!") { break; }
    }

Cleaning up indentation is the programmer's business not the language's.
November 15, 2013
Am Fri, 15 Nov 2013 02:09:52 +0100
schrieb "bearophile" <bearophileHUGS@lycos.com>:

> I have created two interesting D entries for this Rosettacode Task, is someone willing to create a Reddit entry for this? They show very different kinds of code in D.
> 
> http://rosettacode.org/wiki/Look-and-say_sequence#D
> 
> Bye,
> bearophile

APL is awesome!

-- 
Marco

November 15, 2013
Ary Borenszweig:

> This last version you wrote, compiling it with "-enable-inlining -release -O3", takes 0.054s (please tell me if I'm missing some flags, in Crystal I just used --release). The Crystal version takes 0.031s. I also tried with n=70. In D: 9.265s. In Crystal: 4.863s.

This version is more than twice faster, because group() avoids decoding UTF:


void main(in string[] args) {
    import std.stdio, std.conv, std.algorithm, std.array, std.string;

    immutable n = (args.length == 2) ? args[1].to!uint : 10;
    if (n == 0)
        return;

    auto seq = ['1'];
    writefln("%2d: n. digits: %d", 1, seq.length);
    foreach (immutable i; 2 .. n + 1) {
        Appender!(typeof(seq)) result;
        foreach (const digit, const count; seq.representation.group) {
            result ~= "123"[count - 1];
            result ~= digit;
        }
        seq = result.data;
        writefln("%2d: n. digits: %d", i, seq.length);
    }
}


> I couldn't compile the first version written in D because I get:
>
> Error: pure function '__lambda1' cannot call impure function 'join'

Just remove the "pure" for ldc2.


> However, I'm starting to think that all those immutable, final switches and gotos are useless if they don't give a performance benefit

Adding immutable to variables sometimes helps catch some troubles in your code. In a modern language variables should be immutable on default, and functions should be pure on default, because the default choice should be the faster (if it's equally safe) and because when you read code it's simpler to reason on things that have less capabilities (like the capability to mutate for a value).

Bye,
bearophile
November 17, 2013
On Friday, 15 November 2013 at 01:09:53 UTC, bearophile wrote:
> I have created two interesting D entries for this Rosettacode Task, is someone willing to create a Reddit entry for this? They show very different kinds of code in D.
>
> http://rosettacode.org/wiki/Look-and-say_sequence#D
>
> Bye,
> bearophile

I have a task if you are interested, but I didn't bother to login to create it.
So if you like, you can create the page...

Here it is:
In a sequence of a million random integers, return the length and the indexes of the longest duplicate sequence (display indexes counting from one, not zero).

In order for everybody to start with the same random sequence, it may be useful to specify a simple implementation for the generating function.

November 17, 2013
On Friday, 15 November 2013 at 13:33:33 UTC, bearophile wrote:
> If you use ranges badly you will get a slow program, if you use them well with a good back-end, you will have a fast program.
>

And so, what are the rules for not using ranges badly ? What should be avoided ?