March 22, 2015
On Sunday, 22 March 2015 at 03:43:33 UTC, Walter Bright wrote:
> On 3/21/2015 2:08 PM, "Ola Fosheim Grøstad" <ola.fosheim.grostad+dlang@gmail.com> wrote:
>> On Saturday, 21 March 2015 at 19:35:02 UTC, Walter Bright wrote:
>>> I know I shouldn't, but I'll bite. Show me the "low level C code" that
>>> effectively uses SIMD vector registers.
>>
>> You are right, you should not bite. C code is superfluous, this is a general
>> issue with efficient parallel computations. You want to avoid dependencies
>> within a single register.
>>
>> E.g. take a recurrence relation and make an efficient SIMD implementation for
>> it. You might need to try to expand the terms so you have N independent
>> formulas. If it uses floating point you will have to be careful about drift
>> between the N formulas that are computed in parallel.
>>
>
> I.e. there isn't low level C code that effectively uses SIMD vector registers. You have to use the auto-vectorizer, which tries to reconstruct high level operations out of C low level code, then recompile.

At least it still compiles into efficient code for the PDP11?
March 22, 2015
On Wed, Mar 18, 2015 at 1:09 AM, Paulo Pinto via Digitalmars-d <digitalmars-d@puremagic.com> wrote:

> On Tuesday, 17 March 2015 at 20:50:51 UTC, Bienlein wrote:
>
>>> Go is only a CSP-like, it isn't CSP. cf Python-CSP and PyCSP, not to
>>> mention JCSP and GPars.
>>
>> I'm not really sure whether this can be put exactly that way. On a machine with 4 GB RAM you can spawn about 80.000 goroutines...
>
> What about using a JVM with green threads support or Quasar, wouldn't it be more comparable?
>
> --
> Paulo
>

It seems it is the same issue. From the Quasar user manual (http://docs.paralleluniverse.co/quasar/):

Fibers are not meant to replace threads in all circumstances. A fiber should be used when its body (the code it executes) blocks very often waiting on other fibers (e.g. waiting for messages sent by other fibers on a channel, or waiting for the value of a dataflow-variable). For long-running computations that rarely block, traditional threads are preferable. Fortunately, as we shall see, fibers and threads interoperate very well.

--
Ziad


March 22, 2015
On 3/19/15 6:30 PM, bearophile wrote:
> Andrei Alexandrescu:
>
>> You may want to answer there, not here. I've also posted a response.
>
> There is this, with an attachment:
> https://issues.dlang.org/show_bug.cgi?id=11810

I destroyed:

https://github.com/D-Programming-Language/phobos/pull/3089


Andrei


March 22, 2015
On Sunday, 22 March 2015 at 03:43:33 UTC, Walter Bright wrote:
> I.e. there isn't low level C code that effectively uses SIMD vector registers. You have to use the auto-vectorizer, which tries to reconstruct high level operations out of C low level code, then recompile.

I don't think low level hardware registers qualify as "high level constructs", which is the term you used. Besides, all major C compilers ship with builtin vector types and support for standard hardware vendor SIMD intrinsics. But even if you dismiss that, even a less sophisticated contemporary compiler is capable of using SIMD for carefully manually unrolled expressions. Still, even without explicit SIMD instructions, the superscalar nature of desktop CPUs requires you to break dependencies to avoid bubbles in the pipeline.
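
As a concrete illustration of the builtin vector types mentioned above, here is a minimal sketch. It assumes GCC or Clang (the `vector_size` attribute is a compiler extension, not ISO C), and the names `f32x4` and `sum4` are made up for this example. The four lanes accumulate independently, so no addition has to wait on a neighbouring lane within the same register.

```c
#include <assert.h>
#include <stddef.h>

/* GCC/Clang extension (not ISO C): a vector of four floats, 16 bytes wide. */
typedef float f32x4 __attribute__((vector_size(16)));

/* Sums x[0..n) using four independent partial sums held in one vector. */
float sum4(const float *x, size_t n)
{
    assert(x != NULL || n == 0);

    f32x4 acc = {0.0f, 0.0f, 0.0f, 0.0f};
    size_t i = 0;

    /* Each lane only depends on its own previous value. */
    for (; i + 4 <= n; i += 4) {
        f32x4 v = {x[i], x[i + 1], x[i + 2], x[i + 3]};
        acc += v;
    }

    /* Combine the four lanes, then handle the leftover elements. */
    float total = acc[0] + acc[1] + acc[2] + acc[3];
    for (; i < n; i++)
        total += x[i];
    return total;
}
```

Note that this adds the elements in a different order than a plain scalar loop, so for general floating-point input the result can differ in the last bits.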

So in order to optimize the filling of an array with the Fibonacci sequence a plain high level library generator is insufficient. You also need to utilize the closed formula for fib(x) so that you can generate sequences in parallel, e.g. compute the sequence fib(0), fib(1)… in parallel with fib(N), fib(N+1) etc.

Without having the closed formula to obtain fib(N-2) and fib(N-1) a regular optimizer will simply not be able to break the dependencies as effectively as a handwritten low level loop.
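
A minimal sketch of that idea in C, under stated assumptions: the array is split in half and each half is filled by its own dependency chain, so the two additions per iteration do not depend on each other. To keep the integers exact, the seed for the upper half is computed with the fast-doubling identities rather than the floating-point closed formula; that substitution is mine, as are the names `fib_pair` and `fill_fib`. The limit of 94 entries comes from fib(93) being the largest value that fits in a `uint64_t`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stores fib(k) in *fk and fib(k+1) in *fk1 using fast doubling:
   fib(2m) = fib(m) * (2*fib(m+1) - fib(m)), fib(2m+1) = fib(m)^2 + fib(m+1)^2. */
static void fib_pair(size_t k, uint64_t *fk, uint64_t *fk1)
{
    if (k == 0) { *fk = 0; *fk1 = 1; return; }

    uint64_t a, b;
    fib_pair(k / 2, &a, &b);            /* a = fib(m), b = fib(m+1), m = k/2 */
    uint64_t c = a * (2 * b - a);       /* fib(2m)   */
    uint64_t d = a * a + b * b;         /* fib(2m+1) */

    if (k % 2 == 0) { *fk = c; *fk1 = d; }
    else            { *fk = d; *fk1 = c + d; }
}

/* Fills out[0..n) with fib(0)..fib(n-1) using two independent streams. */
void fill_fib(uint64_t *out, size_t n)
{
    assert(n <= 94);                    /* fib(93) is the last one that fits */

    size_t half = n / 2;
    uint64_t lo0 = 0, lo1 = 1;          /* lower stream starts at fib(0)    */
    uint64_t hi0, hi1;                  /* upper stream starts at fib(half) */
    fib_pair(half, &hi0, &hi1);

    for (size_t i = 0; i < half; i++) {
        out[i] = lo0;
        out[half + i] = hi0;
        /* These two additions are independent of each other. */
        uint64_t lo2 = lo0 + lo1;
        uint64_t hi2 = hi0 + hi1;
        lo0 = lo1; lo1 = lo2;
        hi0 = hi1; hi1 = hi2;
    }

    /* For odd n the upper half has one extra element. */
    if (n % 2 != 0)
        out[n - 1] = hi0;
}
```

Whether a given compiler actually schedules or vectorizes the two chains well is something to measure; the sketch only shows how the dependency is broken at the source level.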
March 22, 2015
On Saturday, 21 March 2015 at 15:09:32 UTC, Andrei Alexandrescu wrote:
> On 3/20/15 9:43 PM, Sebastiaan Koppe wrote:
>> On Saturday, 21 March 2015 at 01:31:21 UTC, Andrei Alexandrescu wrote:
>>> On 3/20/15 5:56 PM, Walter Bright wrote:
>>>> On 3/20/2015 5:23 PM, Andrei Alexandrescu wrote:
>>>>> Yah, and uses reference counting for management. -- Andrei
>>>>
>>>> Ref counting won't improve splitLines, because it must keep them all.
>>>
>>> Yah, all solutions based on "let's keep all lines so we count them at
>>> the end" are suboptimal. -- Andrei
>>
>> What about `.count("\n")` ?
>
> Using "count" is among the solutions I measured: http://stackoverflow.com/questions/28922323/improving-line-wise-i-o-operations-in-d/29153508#29153508 -- Andrei

FWIW, I get timings similar to Python and PyPy with an optimized LDC build, and the others are not bad. Maybe it is a Windows-specific problem (I use Arch). (Posted to SO.)


March 22, 2015
> Something like
>
>     while (n != EOF) {
>         n = read(fd, buf, sizeof(buf));
>         if (n==-1) throw(...);
>         if (strcmp(buf, PREFIX) == 0) {
>              return buf;
>         }
>     }
>     return NULL;
>
> Requires no prior knowledge, and has a similar effect.
>

I'm surprised nobody commented on "no prior knowledge". How are you supposed to guess what strcmp(buf, PREFIX) does if you don't know the function? Why is it being compared to 0? What's this -1 magic number? What does read do? What is EOF? Unless you're born with C/POSIX programming knowledge, this is prior knowledge. I know C well enough, but it took me some time to understand what it does.

And honestly, compared to
>   File("/tmp/a").byChunk(4096).joiner.startsWith(s)
you can easily guess that you have a file - do some nonobvious magic on it - and check if it starts with `s` just by reading it as plain English.
March 23, 2015
On 2015-03-23 at 00:15, krzaq wrote:
>> Something like
>>
>>     while (n != EOF) {
>>         n = read(fd, buf, sizeof(buf));
>>         if (n==-1) throw(...);
>>         if (strcmp(buf, PREFIX) == 0) {
>>              return buf;
>>         }
>>     }
>>     return NULL;
>>
>> Requires no prior knowledge, and has a similar effect.
>>
>
> I'm surprised nobody commented on "no prior knowledge". How are you supposed to guess what strcmp(buf, PREFIX) does if you don't know the function? Why is it being compared to 0? What's this -1 magic number? What does read do? What is EOF? Unless you're born with C/POSIX programming knowledge, this is prior knowledge. I know C well enough, but it took me some time to understand what it does.

Indeed. I know of strcmp (because of prior knowledge) but had to have a look at "man 2 read" just in case to verify if read was used correctly. Of course there is also this conveniently omitted part in throw(...). Throw what? Use switch to go over errno and pick the suitable Exception or rather pick one generic Exception and put the output of strerror as its msg? Why should we be bothered with such low-level tasks?
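
For reference, here is roughly what a careful version of that check might look like in C on a POSIX system. This is a sketch under stated assumptions, not a canonical implementation: the name `file_starts_with`, the 256-byte prefix limit and the error policy (return -1 and leave `errno` set) are all choices made for this example. It shows the details the loop glosses over: `read` returns 0 at end of file (not `EOF`), may return fewer bytes than requested, and fills a buffer that is not NUL-terminated, so `memcmp` is used instead of `strcmp`.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Returns 1 if the file at `path` begins with `prefix`, 0 if it does not,
   and -1 on error (errno describes the failure). */
int file_starts_with(const char *path, const char *prefix)
{
    assert(path != NULL && prefix != NULL);

    char buf[256];                      /* arbitrary limit for this sketch */
    size_t want = strlen(prefix);
    size_t got = 0;
    if (want > sizeof buf) { errno = EINVAL; return -1; }

    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;

    /* read() may deliver fewer bytes than asked for, so loop until done. */
    while (got < want) {
        ssize_t n = read(fd, buf + got, want - got);
        if (n == -1) {
            if (errno == EINTR)
                continue;               /* interrupted by a signal: retry */
            int saved = errno;
            close(fd);
            errno = saved;
            return -1;
        }
        if (n == 0)
            break;                      /* end of file before `want` bytes */
        got += (size_t)n;
    }
    close(fd);

    /* The buffer is raw bytes, not a C string, hence memcmp. */
    return got == want && memcmp(buf, prefix, want) == 0;
}
```

A caller would still have to decide what to do with the -1 case, e.g. `if (file_starts_with("/tmp/a", "#!") == -1) perror("/tmp/a");`, which is exactly the part hidden behind `throw(...)`.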

> And honestly, compared to
>>   File("/tmp/a").byChunk(4096).joiner.startsWith(s)
> you can *easily* guess that you have a file - do some nonobvious magic on it - and check if *it* starts with `s` just by reading it as plain English.

Now you've injured yourself with your own weapon. I can guess that File(path) opens the file for reading (probably because of other language libraries) and that byChunk(size) reads it one chunk at a time (but frankly only because it looked similar to byLine which I've known before), but what the hell is joiner? Does it glue ranges together so that they appear as a single contiguous one? Oh, wait, I may have actually read about it somewhere already. But if I didn't, I wouldn't have a clue.

What should start with s? The file, any chunk, the joiner - whatever it meant? It is much clearer than the loop, but I'm not sure I'd guess what it does, because of the two middle elements in the UFCS chain. This *nonobvious magic* may have transformed the contents of the file in a way that makes startsWith(s) do something different.
March 23, 2015
On Monday, 23 March 2015 at 11:31:16 UTC, FG wrote:
>> And honestly, compared to
>>>  File("/tmp/a").byChunk(4096).joiner.startsWith(s)
>> you can *easily* guess that you have a file - do some nonobvious magic on it - and check if *it* starts with `s` just by reading it as plain English.
>
> Now you've injured yourself with your own weapon. I can guess that File(path) opens the file for reading (probably because of other language libraries)

That's why I used the word "guess" ;)

> and that byChunk(size) reads it one chunk at a time (but frankly only because it looked similar to byLine which I've known before), but what the hell is joiner? Does it glue ranges together so that they appear as a single contiguous one? Oh, wait, I may have actually read about it somewhere already. But if I didn't, I wouldn't have a clue.

I'd argue that joiner is intuitive enough, but I agree on byChunk. I am also baffled why this byLine/byChunk madness is necessary at all, it should be something like
File("path").startsWith(s)
or
File("path").data.startsWith(s)

The same goes for stdin: something as simple as cin >> intvariable in C++ becomes an almost insurmountable task in D.

> What should start with s? The file, any chunk, the joiner - whatever it meant? It is much clearer than the loop, but I'm not sure I'd guess what it does, because of the two middle elements in the UFCS chain. This *nonobvious magic* may have transformed the contents of the file in a way that makes startsWith(s) do something different.

You're right, I didn't even think of that.
March 23, 2015
On 3/23/15 4:59 AM, krzaq wrote:
> File("path").data.startsWith(s)

We could define File.byByte, but that's minor. I appreciate the ability to decide on the buffering strategy. -- Andrei

March 23, 2015
On 2015-03-23 at 12:59, krzaq wrote:
> I'd argue that joiner is intuitive enough, but I agree on byChunk. I am also baffled why this byLine/byChunk madness is necessary at all, it should be something like
> File("path").startsWith(s)
> or
> File("path").data.startsWith(s)

Yeah, that would be useful for example to test magic values at the beginning of files:

    string[] scripts;
    foreach (string path; dirEntries(topDir, SpanMode.depth))
        if (isFile(path) && File(path).startsWith("#!"))
            scripts ~= path;

but that's the simplest case of a bigger problem, because here you just need the first few bytes, and you don't want to read the whole file, nor anything more than a sector.

OTOH, there are also file formats like ZIP that put the meta information at the end of the file and scatter the rest of the data all over the place using offset information. You don't need to read everything just to grab the metadata. But, when I had a look at the sources of libzip, I went crazy seeing all the code performing tons of file seeking, reading into buffers and handling them[1].
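
To make the ZIP case concrete, here is a small C sketch of the first step: finding the end-of-central-directory (EOCD) record in a buffer that holds only the tail of the file. The record starts with the signature "PK\5\6", is at least 22 bytes long, and ends with a 16-bit comment length followed by the comment, so at most the last 22 + 65535 bytes need to be read. The function names `find_eocd` and `read_le32` are invented for this example, and the sketch ignores ZIP64 and multi-disk archives.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { EOCD_MIN_SIZE = 22 };   /* fixed part of the record, without comment */

/* Reads a little-endian 32-bit value from p. */
uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Scans `tail` (the last `len` bytes of a file) backwards for the EOCD
   record. Returns its offset within `tail`, or -1 if none is found. */
long find_eocd(const unsigned char *tail, size_t len)
{
    assert(tail != NULL || len == 0);
    if (len < EOCD_MIN_SIZE)
        return -1;

    /* i runs from len - 22 down to 0. */
    for (size_t i = len - EOCD_MIN_SIZE + 1; i-- > 0; ) {
        if (tail[i] == 'P' && tail[i + 1] == 'K' &&
            tail[i + 2] == 5 && tail[i + 3] == 6) {
            /* Comment length sits at offset 20; a genuine record plus its
               comment must end exactly at the end of the file. */
            size_t comment_len = (size_t)tail[i + 20] |
                                 ((size_t)tail[i + 21] << 8);
            if (i + EOCD_MIN_SIZE + comment_len == len)
                return (long)i;
        }
    }
    return -1;
}
```

With the record located, `read_le32(tail + pos + 16)` gives the offset of the central directory, which is where the next seek would go. Everything after that (seeking, reading each directory entry, buffering) is the part that balloons in real libraries.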

D's std.zip took a simple approach and doesn't deal with that at all; it reads the whole file into memory. That makes the algorithm more clearly visible, but at the same time it makes the module completely useless if you want to handle archives that are larger than the available memory, and over-the-top if all you wanted was to extract a single file from the archive or only read the directory structure.

So, how do you envision something representing a file, i.e. a mix of "BufferedRange" and "SeekableRange", that would neatly handle buffering and seeking, without you dropping to stdc IO or wanting to shoot yourself when you look at the code?


[1] for your amusement: http://hg.nih.at/libzip/file/78b8e3fa72a0/lib/zip_open.c