Jump to page: 1 24  
Page
Thread overview
constness for arrays
Jul 18, 2006
xs0
Jul 18, 2006
Reiner Pope
Jul 18, 2006
David Medlock
Jul 18, 2006
xs0
Jul 18, 2006
Don Clugston
Jul 18, 2006
David Medlock
Jul 18, 2006
xs0
Jul 19, 2006
xs0
Jul 19, 2006
Andrew Fedoniouk
Jul 19, 2006
Chad J
Jul 19, 2006
Andrew Fedoniouk
Jul 19, 2006
Chad J
Jul 19, 2006
Andrew Fedoniouk
Jul 19, 2006
Chad J
Jul 19, 2006
xs0
Jul 19, 2006
Andrew Fedoniouk
Jul 19, 2006
xs0
Jul 19, 2006
Johan Granberg
Jul 20, 2006
Andrew Fedoniouk
Jul 19, 2006
Andrew Fedoniouk
Jul 19, 2006
Dave
Jul 19, 2006
Andrew Fedoniouk
Jul 19, 2006
Dave
Jul 19, 2006
xs0
Jul 19, 2006
xs0
Jul 20, 2006
Bruno Medeiros
Jul 19, 2006
Reiner Pope
Jul 19, 2006
Andrew Fedoniouk
Jul 20, 2006
Reiner Pope
Jul 19, 2006
Craig Black
Jul 19, 2006
xs0
Jul 20, 2006
Don Clugston
Jul 20, 2006
Andrew Fedoniouk
Jul 21, 2006
Don Clugston
Jul 21, 2006
Andrew Fedoniouk
Jul 21, 2006
Ben Phillips
Jul 21, 2006
Andrew Fedoniouk
Jul 22, 2006
Derek Parnell
Jul 22, 2006
Andrew Fedoniouk
Jul 22, 2006
Derek Parnell
July 18, 2006
As has been discussed, the lack of something like C++ const hurts most when using arrays, as you can't code around it like with classes, structs or primitives (with the latter two you can just pass by value, for classes you can make readonly versions). The fact that inbuilt strings are also arrays makes the problem occur often.

I was wondering whether the following would resolve that issue:

- the top bit of arrays' .length becomes an indicator of the readonlyness of the array reference

- type of .length is changed from uint to int (just to indicate the proper maximum value; it still can't be negative (from the user's POV))

- arrays get a .isReadonly property which tests the top bit

- arrays get a .lock() method that sets it to 1

- .dup clears it (obviously :)

- reading .length masks the bit out

- setting .length sets the bit to zero if reallocation occurs, and leaves it intact otherwise

- arrays get a .readonly property which returns a copy of the array reference with the bit set

- optionally, arrays get a .needToWrite() method which does the following: { if (arr.isReadonly) arr=arr.dup; }  (yes, the name sucks)


Now this has the following (imho) neat properties:

- initial implementation should be quite trivial, I bet Walter could do it in a few hours; eventually, debug builds could prevent you from writing to a readonly array, but that's even not that important

- losing that one bit has no real effect on anything

- it can be tested for at runtime

- it has practically negligible impact on efficiency:
  - reading .length needs one instruction more (AND with 0x7fffffff)
  - setting .length needs about three instructions more
  - reading and writing to the array has no additional cost
  - moving the reference around also costs the same
  - new operations are quite trivial as well

- fits with COW perfectly

- there's no const-pollution and no need to write two versions of functions


A quick example of the possibilities:

char[] toUpper(char[] txt)
{
    for (int i=0; i<txt.length; i++) {
        char c = s[i];
        if ('a'<=c && c<='z') {
            txt.needToWrite();
            txt[i] = c-(cast(char)'a'-'A');
        }
    }
}

char[] FOO = toUpper("foo"); // constants are readonly, so COW is made

char[] bibi = getBibi(); // who owns it? I can finally know if it's me
char[] BIBI = toUpper(bibi); // write into bibi, if owned
char[] BIBI = toUpper(bibi.readonly); // leave bibi alone, as I need it


What y'all think?


xs0

PS:
Credits: the idea is not all mine, I got it from the discussion with Reiner Pope on D.learn

I'm also sorry if this was already suggested, but I don't remember anything of the sorts discussed..
July 18, 2006
I like it. Same capabilities as I was grasping at, but much more elegant.

>   - reading and writing to the array has no additional cost
I just want to point out that this means that you can compile a release version with checking disabled for max speed, but for extra safety, you could set write-checking for debug builds, where the slowdown should be more acceptable.
July 18, 2006
xs0 wrote:
> As has been discussed, the lack of something like C++ const hurts most when using arrays, as you can't code around it like with classes, structs or primitives (with the latter two you can just pass by value, for classes you can make readonly versions). The fact that inbuilt strings are also arrays makes the problem occur often.
> 
> I was wondering whether the following would resolve that issue:
> 
> - the top bit of arrays' .length becomes an indicator of the readonlyness of the array reference
> 
<snip>

I like it except it drops the max array size by half, doesn't it?

Since we are talking about dynamic arrays here, why not just:

1. add a flags byte or short to the internal array structure to hold it.

2. make the pointers lower bit hold it- this of course assumes the pointer is at least word-aligned.  I am not sure if this would conflict with structs which are align(1).

-DavidM
July 18, 2006
David Medlock wrote:
> xs0 wrote:
>> As has been discussed, the lack of something like C++ const hurts most when using arrays, as you can't code around it like with classes, structs or primitives (with the latter two you can just pass by value, for classes you can make readonly versions). The fact that inbuilt strings are also arrays makes the problem occur often.
>>
>> I was wondering whether the following would resolve that issue:
>>
>> - the top bit of arrays' .length becomes an indicator of the readonlyness of the array reference
>>
> <snip>
> 
> I like it except it drops the max array size by half, doesn't it?

Theoretically, but in practice:
- if you have a 64-bit machine, you don't care
- if you have a 32-bit machine, you can't get the full 4GB anyway (on Windows, a process can only allocate 2GB, I bet it's similar in other OSes)
- with anything larger than a byte you don't even theoretically need that bit

> Since we are talking about dynamic arrays here, why not just:
> 
> 1. add a flags byte or short to the internal array structure to hold it.

because that would increase the size of array structure, making it consume more memory (I'd guess at least 4 bytes per reference) and slower (more data to copy)

> 2. make the pointers lower bit hold it- this of course assumes the pointer is at least word-aligned.  I am not sure if this would conflict with structs which are align(1).

that wouldn't work - the pointer is unrestricted and can point anywhere..


xs0
July 18, 2006
xs0 wrote:
> As has been discussed, the lack of something like C++ const hurts most when using arrays, as you can't code around it like with classes, structs or primitives (with the latter two you can just pass by value, for classes you can make readonly versions). The fact that inbuilt strings are also arrays makes the problem occur often.
> 
> I was wondering whether the following would resolve that issue:
> 
> - the top bit of arrays' .length becomes an indicator of the readonlyness of the array reference

This is a really interesting idea. You're essentially chasing a performance benefit, rather than program correctness. Some benchmarks ought to be able to tell you if the performance benefit is real:

instead of char[], use

struct CharArray {
 char [] arr;
 bool readOnly;
}

for both the existing and proposed behaviour (for the existing one, readonly is ignored, but include it to make the parameter passing fair).

For code that makes heavy use of COW, I suspect that the benefit could be considerable. You probably don't need to eliminate many .dups to pay for the slightly slower .length.

The situation where a function only occasionally returns a read-only string is probably quite common:

char [] func(int n) {
  if (n==0) return "annoying";
  else return toString(n);
}
.. and you have to .dup it for the rare case where it's a literal.

July 18, 2006
Don Clugston wrote:
> xs0 wrote:
> 
>> As has been discussed, the lack of something like C++ const hurts most when using arrays, as you can't code around it like with classes, structs or primitives (with the latter two you can just pass by value, for classes you can make readonly versions). The fact that inbuilt strings are also arrays makes the problem occur often.
>>
>> I was wondering whether the following would resolve that issue:
>>
>> - the top bit of arrays' .length becomes an indicator of the readonlyness of the array reference
> 
> 
> This is a really interesting idea. You're essentially chasing a performance benefit, rather than program correctness. Some benchmarks ought to be able to tell you if the performance benefit is real:
> 
> instead of char[], use
> 
> struct CharArray {
>  char [] arr;
>  bool readOnly;
> }
> 
> for both the existing and proposed behaviour (for the existing one, readonly is ignored, but include it to make the parameter passing fair).
> 
> For code that makes heavy use of COW, I suspect that the benefit could be considerable. You probably don't need to eliminate many .dups to pay for the slightly slower .length.
> 
> The situation where a function only occasionally returns a read-only string is probably quite common:
> 
> char [] func(int n) {
>   if (n==0) return "annoying";
>   else return toString(n);
> }
> .. and you have to .dup it for the rare case where it's a literal.
> 

Agreed , Don.

Its important to note this is two issues:

1.  A readonly property of arrays.
2.  Implementation of #1.

If Walter agrees on #1, I am sure he is best person to ask for advice on #2 (at least in the case of DMD).

-DavidM
July 18, 2006
Don Clugston wrote:
> xs0 wrote:
>> As has been discussed, the lack of something like C++ const hurts most when using arrays, as you can't code around it like with classes, structs or primitives (with the latter two you can just pass by value, for classes you can make readonly versions). The fact that inbuilt strings are also arrays makes the problem occur often.
>>
>> I was wondering whether the following would resolve that issue:
>>
>> - the top bit of arrays' .length becomes an indicator of the readonlyness of the array reference
> 
> This is a really interesting idea. You're essentially chasing a performance benefit, rather than program correctness. Some benchmarks ought to be able to tell you if the performance benefit is real:

Well, actually the primary motivation is correctness*, but the whole thing does indeed seem to benefit performance as well (well, pending some actual tests, but it seems quite obvious).

> instead of char[], use
> 
> struct CharArray {
>  char [] arr;
>  bool readOnly;
> }
> 
> for both the existing and proposed behaviour (for the existing one, readonly is ignored, but include it to make the parameter passing fair).

will probably do, as soon as I have some time (but if anyone else feels like it, go ahead ;)


> The situation where a function only occasionally returns a read-only string is probably quite common:
> 
> char [] func(int n) {
>   if (n==0) return "annoying";
>   else return toString(n);
> }
> .. and you have to .dup it for the rare case where it's a literal.

Yup.. and it's also common to .dup just because there is no indication at all whether it is required or not, and it's the safe thing to do.. like, if you call something like ucFirst(toLower("BIBI")) there's no need for ucFirst to .dup (already done in toLower), but it still does as it has no idea that is the case..


xs0

*) correctness in the sense that it becomes easier to write correct programs, and not in the sense that the compiler would force you to do so, at least until debugtime checking is done
July 19, 2006
Dynamic constness versus static (compile time) constness is not new.

For example in Ruby you can dynamicly declare object/array readonly and
its runtime will control all modifications and note - in full as Ruby's
sandbox
(as any other VM based runtime) has all facilities to fully control
immutability of such objects.

In case of runtimes like D (natively compileable) such control is not an
option.

I beleive that proposed runtime flag a) is not a constness in any sense
b) does not solve compile verification of readonlyness and
c) can be implemented now by defining:
struct vector
{
    bool readonly;
    T*  data;
    uint length;
}

Declarative contness prevents data misuse at compile time when runtime constness moves problem into execution time when is a) too late to do anything and b) expensive.

I would mention old idea again - real solution would be in creating of
mechanism of disabling exiting or creating new opertaions
for intrinsic types.

For example string definition might look like as:

typedef  string char[]
{
    disable opAssign;
    ....
    char[] tolower() { ..... }
}

In any case such mechanism a) is more universal than const in C++
b) allows to create flexible type systems and finally
c) this will also legalize situation with
"external methods" D has now for array types.

The later one alone is a good enough motivation to do so
as current situation with "external methods" looks like as
just a bug of design or compiler to be honest.

I am yet silent that it will make D's type system unique
in this respect among other languages.

Andrew Fedoniouk.
http://terrainformatica.com





July 19, 2006
Andrew Fedoniouk wrote:
> Dynamic constness versus static (compile time) constness is not new.
> 
> For example in Ruby you can dynamicly declare object/array readonly and
> its runtime will control all modifications and note - in full as Ruby's sandbox
> (as any other VM based runtime) has all facilities to fully control
> immutability of such objects.
> 
> In case of runtimes like D (natively compileable) such control is not an
> option.
> 
> I beleive that proposed runtime flag a) is not a constness in any sense
> b) does not solve compile verification of readonlyness and
> c) can be implemented now by defining:
> struct vector
> {
>     bool readonly;
>     T*  data;
>     uint length;
> }
> 
> Declarative contness prevents data misuse at compile time
> when runtime constness moves problem into execution time
> when is a) too late to do anything and b) expensive.
> 
> I would mention old idea again - real solution would be in creating of
> mechanism of disabling exiting or creating new opertaions
> for intrinsic types.
> 
> For example string definition might look like as:
> 
> typedef  string char[]
> {
>     disable opAssign;
>     ....
>     char[] tolower() { ..... }
> }
> 
> In any case such mechanism a) is more universal than const in C++
> b) allows to create flexible type systems and finally
> c) this will also legalize situation with
> "external methods" D has now for array types.
> 
> The later one alone is a good enough motivation to do so
> as current situation with "external methods" looks like as
> just a bug of design or compiler to be honest.
> 
> I am yet silent that it will make D's type system unique
> in this respect among other languages.
> 
> Andrew Fedoniouk.
> http://terrainformatica.com
> 

I like that typedef.  Should be templatable though...

typedef(T) array T[]
{
    ...
}

Or some such.  In an earlier post ("Module level operator overloading" at http://www.digitalmars.com/drn-bin/wwwnews?digitalmars.D/39504) I was hoping for external functions as operator overloads and IFTI to help with things like array operations.  I just didn't know about external functions at the time.  But if this is supposed to replace external functions, how would I do the array op overloads that external functions would help me with?  Would be unfortunate to write something like this...

typedef(T) T[] T[] // mmm what would this do
{
  void opAdd(T[] array1, T[] array2)
  {
    etc...
  }

  ...
}
July 19, 2006
Andrew Fedoniouk wrote:
> Dynamic constness versus static (compile time) constness is not new.
> 
> For example in Ruby you can dynamicly declare object/array readonly and
> its runtime will control all modifications and note - in full as Ruby's sandbox
> (as any other VM based runtime) has all facilities to fully control
> immutability of such objects.
> 
> In case of runtimes like D (natively compileable) such control is not an
> option.
> 
> I beleive that proposed runtime flag a) is not a constness in any sense
> b) does not solve compile verification of readonlyness and
> c) can be implemented now by defining:
> struct vector
> {
>     bool readonly;
>     T*  data;
>     uint length;
> }
> 
> Declarative contness prevents data misuse at compile time
> when runtime constness moves problem into execution time
> when is a) too late to do anything and b) expensive.
> 
> I would mention old idea again - real solution would be in creating of
> mechanism of disabling exiting or creating new opertaions
> for intrinsic types.
> 
> For example string definition might look like as:
> 
> typedef  string char[]
> {
>     disable opAssign;
>     ....
>     char[] tolower() { ..... }
> }
> 
> In any case such mechanism a) is more universal than const in C++
> b) allows to create flexible type systems and finally
> c) this will also legalize situation with
> "external methods" D has now for array types.
> 
> The later one alone is a good enough motivation to do so
> as current situation with "external methods" looks like as
> just a bug of design or compiler to be honest.
> 
> I am yet silent that it will make D's type system unique
> in this respect among other languages.
> 
> Andrew Fedoniouk.
> http://terrainformatica.com
> 
> 

What do you mean by external methods?

This?

import std.stdio;
void main()
{
    char[] str = "abc";
    writefln(str.ucase()); // "ABC"
}
char[] ucase(char[] str)
{
    foreach(inout char c; str) if(c >= 'a' && c <= 'z') c += 'A' - 'a';
    return str;
}

If so, that's not a bug, it's intentional. Line 4141 of expression.c.

- Dave
« First   ‹ Prev
1 2 3 4