January 20, 2022
On Thursday, 20 January 2022 at 22:31:17 UTC, Steven Schveighoffer wrote:
>
> Because it would allow altering const data.
>

I'm not sure I understand. At what point in this function is valuesArray modified, and thus preventing it being passed in with const?

// ---

int[][int][] CreateDataSet
(ref const int[] idArray, ref int[][] valuesArray, const int numRecords)
{
    int[][int][] records;
    records.reserve(numRecords);

    foreach(i, const id; idArray)
        records ~= [ idArray[i] : valuesArray[i] ];

    return records.dup;
}

// ----

January 20, 2022
On 1/20/22 15:01, forkit wrote:
> On Thursday, 20 January 2022 at 22:31:17 UTC, Steven Schveighoffer wrote:
>>
>> Because it would allow altering const data.
>>
>
> I'm not sure I understand. At what point in this function is valuesArray
> modified, and thus preventing it being passed in with const?
>
> // ---
>
> int[][int][] CreateDataSet
> (ref const int[] idArray, ref int[][] valuesArray, const int numRecords)
> {
>      int[][int][] records;

Elements of records are mutable.

>      records.reserve(numRecords);
>
>      foreach(i, const id; idArray)
>          records ~= [ idArray[i] : valuesArray[i] ];

If that were allowed, you could mutate the elements of valuesArray through the elements of records, which would break the const promise made to your caller.

Aside: There is no reason to pass arrays and associative arrays as 'ref const' in D as they are already reference types. Unlike C++, there is no copying of the elements. When you pass by value, just a couple of fundamental types are copied.

Furthermore and in theory, there may be a performance penalty when an array is passed by reference because elements would be accessed by dereferencing twice: Once for the parameter reference and once for the .ptr property of the array. (This is in theory.)

void foo(ref const int[]) {}  // Unnecessary
void foo(const int[]) {}      // Idiomatic
void foo(in int[]) {}         // Intentful :)

Passing arrays by reference makes sense when the function will mutate the argument.
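
As a minimal sketch of that difference (the function names below are made up for this illustration): a slice passed by value still shares its elements with the caller, but an append made inside the function is only visible to the caller when the parameter is 'ref'.

```d
// Hypothetical helper names, used only for this illustration.
void setFirst(int[] a)          { a[0] = 99; } // writes through to the caller's elements
void appendByValue(int[] a)     { a ~= 4; }    // only the local copy of the slice grows
void appendByRef(ref int[] a)   { a ~= 4; }    // the caller's slice itself grows

void main()
{
    int[] data = [1, 2, 3];

    setFirst(data);
    assert(data == [99, 2, 3]);   // elements are shared, no copy was made

    appendByValue(data);
    assert(data.length == 3);     // the caller's length is unchanged

    appendByRef(data);
    assert(data == [99, 2, 3, 4]);
}
```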

Ali

January 20, 2022
On 1/20/22 15:10, Ali Çehreli wrote:

> void foo(const int[]) {}      // Idiomatic

As H. S. Teoh would add at this point, that is not idiomatic but the following are (with different meanings):

void foo(const(int)[]) {}      // Idiomatic
void foo(const(int[])) {}      // Idiomatic
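
A small sketch of how those two differ in practice (sumTail and sumHead are hypothetical names for this example): with const(int)[] only the elements are const, so the function may re-slice its own parameter; with const(int[]) the slice itself is const as well.

```d
// Elements are const, the slice itself is a mutable local.
int sumTail(const(int)[] a)
{
    int total;
    while (a.length)
    {
        total += a[0];
        a = a[1 .. $];     // fine: rebinding the local slice
    }
    return total;
}

// Both the slice and its elements are const.
int sumHead(const(int[]) a)
{
    // a = a[1 .. $];      // would not compile: 'a' cannot be reassigned
    int total;
    foreach (x; a)
        total += x;
    return total;
}

void main()
{
    assert(sumTail([1, 2, 3]) == 6);
    assert(sumHead([1, 2, 3]) == 6);
}
```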

> void foo(in int[]) {}         // Intentful :)

I still like that one. :)

Ali

January 20, 2022

On 1/20/22 6:01 PM, forkit wrote:
> On Thursday, 20 January 2022 at 22:31:17 UTC, Steven Schveighoffer wrote:
>>
>> Because it would allow altering const data.
>>
>
> I'm not sure I understand. At what point in this function is valuesArray modified, and thus preventing it being passed in with const?

The compiler rules aren't enforced based on what code you wrote; the compiler doesn't have the capability of proving that your code doesn't modify things.

Instead, it enforces simple rules that guarantee that const data cannot be modified.

I'll make it into a simpler example:

const int[] arr = [1, 2, 3, 4, 5];
int[] arr2 = arr;

This code does not modify any data in arr. But that in itself isn't easy to prove. In order to ensure that arr is never modified, the compiler would have to analyze all the code, and every possible way that arr2 might escape or be used somewhere at some point to modify the data. It doesn't have the capability or time to do that (if I understand correctly, this is NP-hard).

Instead, it just says, you can't convert references from const to mutable without a cast. That guarantees that you can't modify const data. However, it does rule out a certain class of code that might not modify the const data, even if it has the opportunity to.

It's like saying, "we don't let babies play with sharp knives" vs. "we will let babies play with sharp knives but stop them just before they stab themselves."
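
A minimal sketch of the rule in action, assuming what is wanted is a mutable array with the same contents: the conversion is rejected purely on the types, and .dup gives a separate mutable copy, so the const original stays untouched.

```d
void main()
{
    const int[] arr = [1, 2, 3, 4, 5];

    // Mutable -> const converts implicitly; const -> mutable does not.
    static assert( is(int[] : const(int)[]));
    static assert(!is(const(int)[] : int[]));
    static assert(!is(typeof(arr) : int[]));

    // .dup allocates a fresh mutable copy of the elements.
    int[] copy = arr.dup;
    copy[0] = 42;

    assert(arr[0] == 1);    // the const data was not modified
    assert(copy[0] == 42);
}
```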

-Steve

January 21, 2022
On Thursday, 20 January 2022 at 23:49:59 UTC, Ali Çehreli wrote:
>

so here is final code, in idiomatic D, as far as I can tell ;-)

curious output when using -profile=gc

.. a line referring to: std.array.Appender!(immutable(char)[]).Appender.Data std.array.Appender!string.Appender.this C:\D\dmd2\windows\bin\..\..\src\phobos\std\array.d:3330

That's not really helpful, as I'm not sure what line of my code it's referring to.

// ---------------

/+
  =====================================================================
   This program creates a sample dataset consisting of 'random' records,
   and then outputs that dataset to a file.

   Arguments can be passed on the command line,
   or otherwise default values are used instead.

   Example of that output can be seen at the end of this code.
   =====================================================================
+/

module test;
@safe

import std.stdio : write, writef, writeln, writefln;
import std.range : iota;
import std.array : array, byPair;
import std.random : Random, unpredictableSeed, dice, choice, uniform;
import std.algorithm : map, uniq, canFind;
import std.conv : to;
import std.stdio : File;
import std.format;

debug { import std; }

Random rnd;
static this() {  rnd = Random(unpredictableSeed); } // thanks Ali

void main(string[] args)
{
    int recordsNeeded, valuesPerRecord;
    string fname;

    if(args.length < 4)
    {
        recordsNeeded = 10;
        valuesPerRecord = 8;
        fname = "D:/rnd_records.txt";
    }
    else
    {
        // assumes valid values being passed in ;-)
        recordsNeeded = to!int(args[1]);
        valuesPerRecord = to!int(args[2]);
        fname = args[3];
    }

    int[] idArray;
    createUniqueIDArray(idArray, recordsNeeded);

    int[][] valuesArray;
    createValuesArray(valuesArray, recordsNeeded, valuesPerRecord);

    int[][int][] records = CreateDataSet(idArray, valuesArray, recordsNeeded);
    ProcessRecords(records, fname);

    writefln("All done. Check if records written to %s", fname);
}

void createUniqueIDArray
(ref int[] idArray, const(int) recordsNeeded)
{
    idArray.reserve(recordsNeeded);
    debug { writefln("idArray.capacity is %s", idArray.capacity); }

    int i = 0;
    int x;
    while(i != recordsNeeded)
    {
       // id needs to be 9 digits, and needs to start with 999
       x = uniform(999*10^^6, 10^^9); // thanks Stanislav

       // ensure every id added is unique.
       if (!idArray.canFind(x))
       {
           idArray ~= x; // NOTE: does NOT appear to register with -profile=gc
           i++;
       }
    }
}

void createValuesArray
(ref int[][] valuesArray, const(int) recordsNeeded, const(int) valuesPerRecord)
{
    valuesArray = iota(recordsNeeded)
            .map!(i => iota(valuesPerRecord)
            .map!(valuesPerRecord => cast(int)rnd.dice(0.6, 1.4))
            .array).array;  // NOTE: does register with -profile=gc
}

int[][int][] CreateDataSet
(const(int)[] idArray, int[][] valuesArray, const(int) numRecords)
{
    int[][int][] records;
    records.reserve(numRecords);
    debug { writefln("records.capacity is %s", records.capacity); }

    foreach(i, const id; idArray)
    {
        // NOTE: below does register with -profile=gc
        records ~= [ idArray[i] : valuesArray[i] ];
    }
    return records.dup;
}

void ProcessRecords
(in int[][int][] recArray, const(string) fname)
{
    auto file = File(fname, "w");
    scope(exit) file.close;

    string[] formattedRecords;
    formattedRecords.reserve(recArray.length);
    debug { writefln("formattedRecords.capacity is %s", formattedRecords.capacity); }

    void processRecord(const(int) id, const(int)[] values)
    {
        // NOTE: below does register with -profile=gc
        formattedRecords ~= id.to!string ~ values.format!"%(%s,%)";
    }

    foreach(ref const record; recArray)
    {
        foreach (ref rp; record.byPair)
        {
            processRecord(rp.expand);
        }
    }

    foreach(ref rec; formattedRecords)
        file.writeln(rec);
}

/+
sample file output:

9992511730,1,0,1,0,1,0,1
9995369731,1,1,1,1,1,1,1
9993136031,1,0,0,0,1,0,0
9998979051,1,1,1,1,0,1,1
9998438090,1,1,0,1,1,0,0
9995132750,0,0,1,0,1,1,1
9997123630,0,1,1,1,0,1,1
9998351590,1,0,0,1,1,1,1
9991454121,1,1,1,1,1,0,1
9997673520,1,1,1,1,1,1,1

+/

// ---------------




January 21, 2022
On Friday, 21 January 2022 at 01:35:40 UTC, forkit wrote:
>

oops. nasty mistake to make ;-)


module test;
@safe

should be:

module test;
@safe:


January 20, 2022
On 1/20/22 17:35, forkit wrote:

> module test;
> @safe

Does that make just the following definition @safe or the entire module @safe? Trying... Yes, I am right. To make the module safe, use the following syntax:

@safe:
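
A small sketch of the difference, using hypothetical function names and std.traits.isSafe to check the result at compile time:

```d
import std.traits : isSafe;

@safe void onlyThis() {}   // attribute applies to this one declaration
void unmarked() {}         // an ordinary function is @system by default

@safe:                     // with the colon: applies to everything that follows

void alsoSafe() {}

static assert( isSafe!onlyThis);
static assert(!isSafe!unmarked);
static assert( isSafe!alsoSafe);

void main() {}
```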

>      idArray.reserve(recordsNeeded);
[...]
>             idArray ~= x; // NOTE: does NOT appear to register with
> -profile=gc

Because you've already reserved enough memory above. Good.
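
One way to see this, as a rough sketch: after reserve, appends that fit within the reserved capacity should reuse the same memory block, so the array's .ptr does not change.

```d
void main()
{
    int[] ids;
    ids.reserve(10);
    assert(ids.capacity >= 10);

    auto before = ids.ptr;        // start of the reserved block

    foreach (i; 0 .. 10)
        ids ~= i;                 // fits in the reserved capacity

    assert(ids.length == 10);
    assert(ids.ptr is before);    // no reallocation took place
}
```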

>      int[][int][] records;
>      records.reserve(numRecords);

That's good for the array part. However...

>          // NOTE: below does register with -profile=gc
>          records ~= [ idArray[i] : valuesArray[i] ];

The right hand side is a freshly generated associative array. For every element of 'records', there is a one-element AA created. AA will need to allocate memory for its element. So, GC allocation is expected there.
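
If an array of one-element AAs is not actually required (that is an assumption; the requirement isn't stated), a single AA keyed by id is one possible alternative. The sketch below uses a hypothetical createDataSet; the AA still allocates as it grows, but there is no separate AA per record.

```d
// Hypothetical alternative: one AA for the whole data set.
int[][int] createDataSet(const(int)[] ids, int[][] values)
{
    int[][int] records;
    foreach (i, id; ids)
        records[id] = values[i];   // shares the value arrays, no copy
    return records;
}

void main()
{
    auto records = createDataSet([7, 8], [[1, 0], [0, 1]]);
    assert(records.length == 2);
    assert(records[7] == [1, 0]);
    assert(records[8] == [0, 1]);
}
```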

>      string[] formattedRecords;
>      formattedRecords.reserve(recArray.length);
[...]
>          // NOTE: below does register with -profile=gc
>          formattedRecords ~= id.to!string ~ values.format!"%(%s,%)";

Again, although 'formattedRecords' has reserved memory, the right hand side has dynamic memory allocations.

1) id.to!string allocates

2) format allocates memory for its 'string' result (I think the Appender report comes from format's internals.)

3) Operator ~ makes a new string from the previous two

(Somehow, I don't see three allocations though. Perhaps an NRVO is applied there. (?))

I like the following better, which reduces the allocations:

        formattedRecords ~= format!"%s%(%s,%)"(id.to!string, values);

>      foreach(ref rec; formattedRecords)
>          file.writeln(rec);

The bigger question is, why did 'formattedRecords' exist at all? You could have written the output directly to the file. But even *worse* and with apologies, ;) here is something crazy that achieves the same thing:

void ProcessRecords
(in int[][int][] recArray, const(string) fname)
{
    import std.algorithm : joiner;
    auto toWrite = recArray.map!(e => e.byPair);
    File("rnd_records.txt", "w").writefln!"%(%(%(%s,%(%s,%)%)%)\n%)"(toWrite);
}

I've done lots of trial and error for the required number of nested %( %) pairs. Phew...

Ali

January 21, 2022
On Friday, 21 January 2022 at 02:30:35 UTC, Ali Çehreli wrote:
>
> The bigger question is, why did 'formattedRecords' exist at all? You could have written the output directly to the file.

Oh. this was intentional, as I wanted to write once, and only once, to the file.

The consequence of that decision of course, is the extra memory allocations...

But in my example code I only create 10 records. In reality, my dataset will have hundreds of thousands of records, so I don't want to write hundreds of thousands of times to the same file.

> But even *worse* and with apologies, ;) here is something crazy that achieves the same thing:
>
> void ProcessRecords
> (in int[][int][] recArray, const(string) fname)
> {
>     import std.algorithm : joiner;
>     auto toWrite = recArray.map!(e => e.byPair);
>     File("rnd_records.txt", "w").writefln!"%(%(%(%s,%(%s,%)%)%)\n%)"(toWrite);
> }
>
> I've done lots of trial and error for the required number of nested %( %) pairs. Phew...
>
> Ali

Yes, that does look worse ;-)

But I'm looking into that code to see if I can salvage something from it ;-)


January 21, 2022
On Friday, 21 January 2022 at 03:45:08 UTC, forkit wrote:
> On Friday, 21 January 2022 at 02:30:35 UTC, Ali Çehreli wrote:
>>
>> The bigger question is, why did 'formattedRecords' exist at all? You could have written the output directly to the file.
>
> Oh. this was intentional, as I wanted to write once, and only once, to the file.
>

oops. looking back at that code, it seems I didn't write what I intended :-(

I might have to use a kind of string builder instead, then write one massive string to the file in a single call.

similar to C#: File.WriteAllText(Path, finalString);

January 20, 2022
On Fri, Jan 21, 2022 at 03:50:37AM +0000, forkit via Digitalmars-d-learn wrote: [...]
> I might have to use a kindof stringbuilder instead, then write a massive string once to the file.
[...]

std.array.appender is your friend.
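
A minimal sketch of that approach, under some assumptions: buildOutput is a hypothetical helper, and the format string puts a comma between the id and the first value, which the code earlier in the thread does not do. Each record is formatted straight into one Appender buffer, and the finished string can then be written in a single call.

```d
import std.array : appender;
import std.format : formattedWrite;

// Hypothetical helper: formats every record into one string.
string buildOutput(const int[][int][] records)
{
    auto buf = appender!string();
    foreach (record; records)
        foreach (id, values; record)
            buf.formattedWrite!"%s,%(%s,%)\n"(id, values);
    return buf.data;
}

void main()
{
    int[][int][] records = [[7: [1, 0, 1]], [8: [0, 0]]];
    assert(buildOutput(records) == "7,1,0,1\n8,0,0\n");
}
```

The result could then be written with one call such as File(fname, "w").write(buildOutput(records)), or with std.file.write(fname, buildOutput(records)), which is roughly the counterpart of C#'s File.WriteAllText.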


T

-- 
Meat: euphemism for dead animal. -- Flora