March 29, 2005
On Mon, 28 Mar 2005 19:05:39 -0500, Ben Hinkle <ben.hinkle@gmail.com> wrote:
>>>>>             if (!line) break;
>>>>
>>>> "if line == null" then break... no idea what this is good for.
>>>
>>> I think this isn't needed. I think it probably is why blank lines stop
>>> the
>>> foreach.
>>
>> I think readLine is broken. It needs to return "" and not null.
>> The difference being that "" has a non null "line.ptr" and "line is null"
>> is not true.
>
> IMO the right way to check if a string is empty is asking if the length is 0.

No. You cannot tell empty from null with length, eg.

char[] isnull = null;
char[] isempty = "";

assert(isnull.length == 0);
assert(isempty.length == 0);

compile, run, no asserts.

> Setting an array's length to 0 automatically sets the ptr to null. So
> relying on any specific behavior of the ptr of a 0 length array is dangerous at best (since it would rely on always slicing to resize).

I agree. I currently use "is" or "===" to tell them apart. eg.

char[] isnull = null;
char[] isempty = "";

assert(isnull === null);
assert(isempty !== null);
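
For anyone reading this with a current compiler, here is a rough sketch of the same check. It assumes present-day D syntax, where `===`/`!==` are spelled `is`/`!is` and string literals have type `string`. The helper names `isMissing` and `isEmptyButPresent` are invented for illustration and are not part of any library.

```d
// Sketch in present-day D syntax; both helper names are made up for this example.

// True only for the null array (null ptr and zero length).
bool isMissing(const(char)[] s)
{
    return s is null;
}

// True for a zero-length array that still refers to something.
bool isEmptyButPresent(const(char)[] s)
{
    return s !is null && s.length == 0;
}

void main()
{
    string missing = null;
    string empty = "";

    // length alone cannot tell the two apart...
    assert(missing.length == 0 && empty.length == 0);
    // ...and neither can ==, which only compares contents.
    assert(missing == empty);

    // Identity comparison can.
    assert(isMissing(missing));
    assert(isEmptyButPresent(empty));
}
```

This only shows the identity test; it makes no promise about what resizing an array does to its ptr.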

I, at first, suspected the behaviour above to be a side effect of D's behaviour of appending \0 to hard-coded/static strings (thus ptr cannot be null for ""). If this behaviour were removed ptr would have 'nothing' to point at. However...

char[] isempty;
char[] test;

test.length = 3;
test[0] = 'a';
test[1] = 'b';
test[2] = 'c';
	
isempty = test[0..0];
	
assert(isempty.length == 0);
assert(isempty !== null);

it appears not, but, as you mention:

> For example the statement
>   str.length = str.length;
> does nothing if length > 0 and sets the ptr to null if length == 0.

isempty.length = isempty.length;
	
assert(isempty.length == 0);
assert(isempty !== null);

asserts on the 2nd assert statement as it has set the ptr to null.

> One can argue about D's behavior about nulling the ptr but that's the
> current situation.

Indeed. Setting length to 0, should IMO create an empty string, not un-assign or free the string. Setting the reference to null should un-assign or free the string.

To be honest I don't really care what it does *so long as* I can tell an empty string (array assigned to something with length 0) apart from one that does not exist (unassigned array, init to null).

The simple fact of the matter being that in some situations these two things need to be treated differently.

In some cases an AA and the "in" operator can be used as a workaround, as "in" checks for existence. I didn't think of this idea immediately (someone else suggested it). It would be nice if the functionality was more immediately apparent.
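
A small sketch of that workaround, assuming current D syntax, where `in` yields a pointer to the value or null. The `settings` table and the `hasKey` helper are hypothetical names used only for this example.

```d
// Hypothetical helper: reports whether a key exists, whatever its value is.
bool hasKey(const string[string] table, string key)
{
    // `in` gives a pointer to the stored value, or null if the key is absent.
    return (key in table) !is null;
}

void main()
{
    string[string] settings;
    settings["comment"] = "";              // present, but empty

    assert(hasKey(settings, "comment"));   // the key exists
    assert(settings["comment"].length == 0);
    assert(!hasKey(settings, "author"));   // the key does not exist at all
}
```

Existence is carried by the table rather than by the array's ptr, so it does not depend on how empty arrays are represented.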

To clarify I don't want to make it harder to treat them the same, which you can currently do with "if (length == 0)" I just want a guaranteed method of telling them apart.

> Perhaps it should be illegal to implicitly cast a dynamic array to a ptr.

If the array ptr is null the result will be null, right? I don't see a problem with this.

Regan
March 29, 2005
"Regan Heath" <regan@netwin.co.nz> wrote in message news:opsodiv9b023k2f5@nrage.netwin.co.nz...
> On Mon, 28 Mar 2005 19:05:39 -0500, Ben Hinkle <ben.hinkle@gmail.com> wrote:
>>>>>>             if (!line) break;
>>>>>
>>>>> "if line == null" then break... no idea what this is good for.
>>>>
>>>> I think this isn't needed. I think it probably is why blank lines stop
>>>> the
>>>> foreach.
>>>
>>> I think readLine is broken. It needs to return "" and not null.
>>> The difference being that "" has a non null "line.ptr" and "line is
>>> null"
>>> is not true.
>>
>> IMO the right way to check if a string is empty is asking if the length is 0.
>
> No. You cannot tell empty from null with length, eg.
>
> char[] isnull = null;
> char[] isempty = "";
>
> assert(isnull.length == 0);
> assert(isempty.length == 0);
>
> compile, run, no asserts.

uhh - I think we have different definitions of the word "empty". I take it
you define empty to be non-null ptr and 0 length, correct? I take empty to
mean anything that compares as equal to "". In D length==0 is equivalent to
=="":
 str.length == 0 iff str == ""
That is why I consider testing length to be the simplest/fastest way to test
for "empty". For example
int main() {
  char[] x;
  x = new char[5];
  assert(x != "");
  assert(x.length != 0);

  x = x[0..0];
  assert(x == "");
  assert(x.length == 0);

  char[] y = "";
  assert(y == "");
  assert(y.length == 0);

  char[] z = null;
  assert(z == "");
  assert(z.length == 0);

  return 0;
}


>> Setting an array's length to 0 automatically sets the ptr to null. So relying on any specific behavior of the ptr of a 0 length array is dangerous at best (since it would rely on always slicing to resize).
>
> I agree. I currently use "is" or "===" to tell them apart. eg.
>
> char[] isnull = null;
> char[] isempty = "";
>
> assert(isnull === null);
> assert(isempty !== null);
>
> I, at first, suspected the behaviour above to be a side effect of D's behaviour of appending \0 to hard-coded/static strings (thus ptr cannot be null for ""). If this behaviour were removed ptr would have 'nothing' to point at. However...
>
> char[] isempty;
> char[] test;
>
> test.length = 3;
> test[0] = 'a';
> test[1] = 'b';
> test[2] = 'c';
>
> isempty = test[0..0];
>
> assert(isempty.length == 0);
> assert(isempty !== null);
>
> it appears not, but, as you mention:

It is also true that
char[] isempty = "";
char[] isempty2 = test[0..0];
assert( isempty !== isempty2);

>> For example the statement
>>   str.length = str.length;
>> does nothing if length > 0 and sets the ptr to null if length == 0.
>
> isempty.length = isempty.length;
>
> assert(isempty.length == 0);
> assert(isempty !== null);
>
> asserts on the 2nd assert statement as it has set the ptr to null.
>
>> One can argue about D's behavior about nulling the ptr but that's the current situation.
>
> Indeed. Setting length to 0, should IMO create an empty string, not un-assign or free the string. Setting the reference to null should un-assign or free the string.
>
> To be honest I don't really care what it does *so long as* I can tell an empty string (array assigned to something with length 0) apart from one that does not exist (unassigned array, init to null).

ah - here I can see what empty means to you. It is true our definitions of "empty" differ.

> The simple fact of the matter being that in some situations these two things need to be treated differently.

That's what "is" and !== are for. But those are rare occasions I would bet.

> In some cases an AA and the "in" operator can be used as a workaround, as "in" checks for existance. I didn't think of this idea immediately (someone else suggested it). It would be nice if the functionality was more immediately apparent.
>
> To clarify I don't want to make it harder to treat them the same, which you can currently do with "if (length == 0)" I just want a guaranteed method of telling them apart.
>
>> Perhaps it should be illegal to implicitly cast a dynamic array to a ptr.
>
> If the array ptr is null the result will be null, right? I don't see a problem with this.

I was suggesting making it illegal so that casually testing !line would be illegal. Instead it would have to be !line.ptr which makes it more obvious what is actually being tested (ie - the length is ignored and just the ptr is checked)

By the way, when would you like readLine to return a null string as opposed to a non-null zero-length string?


March 29, 2005
Regan Heath wrote (Ben read your feedback as well thanx):

>>>     {
>>>         int result = 0;
>>>         char[] line;
>>>          while(!eof())
>>>         {
>>>             line = readLine();
>>
>>
>> How come readLine() knows of the stream?
> 
> Because LineReader is a child class of BufferedFile, which is a stream.  The readLine call above calls the readLine of the parent class  BufferedFile.

Ah... that is one of the things I really hate as an OOP beginner. It is very difficult to check where the heck certain "behavior" comes from. If the programmer is indeed fully aware of the parent classes, that may be clearer, but when I only see the "new" code, I find it very confusing. I am not even sure one *can* look up the original definition of the parent classes?

>>>             result = dg(line);
>>>             if (result) break;
>>
>> Don't understand these lines either.
> 
> As Ben said, it's part of the foreach "magic", his links should explain  it. If not, let us know how the docs are deficient and hopefully someone  can improve them.

Reminds me that I don't actually understand D, and that I only use certain code snippets all over the place so far. :)


>> Why did you use size_t for lineno, would int now also work? (I tested  this and it works fine to replace all size_t with int).
> 
> As Ben mentioned, size_t is either a 32 or 64 bit type depending on the  underlying OS/processor. I believe the idea is that using it chooses the  most "sensible" type for holding "size" values on the current OS/processor.

Aha... IIRC there was something like that in ANSI C as well... I never trusted it ;)... so size_t is something like a special optimization case. I.e. when do you decide to use good old int, and when do you feel size_t would be a better choice?


>> IOW you seem to have defined a new stream?
> 
> Yes. I have extended/added foreach-ability to any Stream class.

Neat indeed.



BTW, I decided to go the simple way:

File lg = open_Read_Log( glb.log );
File mg = open_Write_Mlg( metafile );

File open_Read_Log( char[] logfile )
{
	char[13] warn = "open_Read_Log";

	if( ! std.file.exists(logfile) )
    {
		Err(warn, "Can't open *read* your log file... '"~logfile~"'", "Ensure log file exists and double check path!");
		exit(1);
	}	

	// Define/create "handle" for logfile READ
	File lg = new File( logfile, FileMode.In );
	// If logfile open error: "Error: file '...' not found"
	return lg;
}

etc.

What surprised me in open_Read_Log(), when comparing it to my ANSI C code:

    if (fgets(line, M2AXCHR, link)==NULL){
      if(ferror(link)!=0){ puts("Error during log read..."); exit(1); }
      clearerr(link); break;}

You can check for file existence.

But you do not seem to be able to handle
    "new File( logfile, FileMode.In )"
errors... i.e. if something happens, D will exit with an internal Error message.

Presumably one could "catch" such errors to provide own error messages?


Same seems to be the case with

    while( ! lg.eof() )	
    {
	line = lg.readLine();
    }

Should a readLine() error occur, then D throws an internal Error message.

I am not sure I *really* want to catch errors, should this be possible in the above 2 cases. But maybe that could be useful?


AEon
March 29, 2005
On Mon, 28 Mar 2005 21:13:54 -0500, Ben Hinkle <ben.hinkle@gmail.com> wrote:
> I take it you define empty to be non-null ptr and 0 length, correct?

"empty" - "Holding or containing nothing."

In my mind something is "empty" if it:
  a. contains nothing.
  b. exists.

It cannot be "empty" if it contains something.
It cannot be "empty" if it does not exist.

So, my first question. How do I represent "non-existent" in D?

Some abstract ideas/thoughts. A pointer/reference/handle/whatever is a construct which we use to access some data. This construct IMO needs the ability to (1) indicate the (non)existence of the data (2) give us access to the data.

In C I would use a pointer eg.

char *ptr = NULL;
ptr = NULL;  //no value exists
ptr = "";    //value exists, it is empty.

The humble pointer can indicate that no data exists, by pointing at NULL (which is defined to be an invalid address for data). The pointer can indicate the existing data by pointing at its address. The data it points to may be empty if it "contains nothing" (what that means depends on the data itself).

D's char[] is a reference not a pointer. A reference should be able to represent 1 & 2 above but its implementation in D blurs the distinction between "non-existent" and "existing but empty" due to its relationship with null and its behaviour when setting length to 0.

In short:
- A char[] should not go from "empty" to "non-existent" without being explicitly assigned to "non-existent" (AKA null).
- "empty" (AKA "") should not compare equal to "non-existent" (AKA null).

It appears to me that the only reliable way in D to indicate "non-existent" is to throw an exception. Perhaps this is acceptable, perhaps it's the D way and I simply have to get used to it.

<snip>

>>> Perhaps it should be illegal to implicitly cast a dynamic array to a ptr.
>>
>> If the array ptr is null the result will be null, right? I don't see a
>> problem with this.
>
> I was suggesting making it illegal so that casually testing !line would be illegal. Instead it would have to be !line.ptr which makes it more obvious what is actually being tested (ie - the length is ignored and just the ptr is checked)

I don't think this is necessary.

> By the way, when would you like readLine to return a null string as opposed to an non-null-zero-length string?

At the end of file.

readLine() - null means no lines "exist".
readLine() - "" means a line "exists" but is "empty" of chars.
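
To make that contract concrete, here is a minimal sketch in current D syntax. `LineSource` and its `next` method are invented for this example and read from an in-memory string. This is not how std.stream's readLine is implemented. It relies on a zero-length slice of existing data keeping a non-null ptr.

```d
// Invented stand-in for a line reader; not the std.stream implementation.
struct LineSource
{
    string data;   // the whole input
    size_t pos;    // index of the next unread character

    // Returns null when no line is left, and a non-null (possibly
    // zero-length) slice of `data` for every line that exists.
    string next()
    {
        if (pos >= data.length)
            return null;                      // no line "exists"

        size_t start = pos;
        while (pos < data.length && data[pos] != '\n')
            ++pos;

        string line = data[start .. pos];     // may be empty, never null here
        if (pos < data.length)
            ++pos;                            // step over the newline
        return line;
    }
}

void main()
{
    auto src = LineSource("first\n\nthird");
    assert(src.next() == "first");

    string blank = src.next();
    assert(blank.length == 0 && blank !is null);  // a line exists, it is empty

    assert(src.next() == "third");
    assert(src.next() is null);                   // end of input
}
```

A caller can then loop with `while ((line = src.next()) !is null)` and blank lines will not end the loop.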

Regan
March 29, 2005
On Tue, 29 Mar 2005 22:47:53 +1200, Regan Heath wrote:

> On Mon, 28 Mar 2005 21:13:54 -0500, Ben Hinkle <ben.hinkle@gmail.com> wrote:
>> I take it you define empty to be non-null ptr and 0 length, correct?
> 
> "empty" - "Holding or containing nothing."
> 
> In my mind something is "empty" if it:
>    a. contains nothing.
>    b. exists.
> 
> It cannot be "empty" if it contains something.
> It cannot be "empty" if it does not exist.
> 
> So, my first question. How do I represent "non existant" in D?
> 
> Some abstract ideas/thoughts. A pointer/reference/handle/whatever is a construct which we use to access some data. This construct IMO needs the ability to (1) indicate the (non)existance of the data (2) give us access to the data.
> 
> In C I would use a pointer eg.
> 
> char *ptr = NULL;
> ptr = NULL;  //no value exists
> ptr = "";    //value exists, it is empty.
> 
> The humble pointer can indicate that no data exists, by pointing at NULL (which is defined to be an invalid address for data). The pointer can indicate the existing data by pointing at it's address. The data it points to may be empty if it "contains nothing" (what that means depends on the data itself).
> 
> D's char[] is a reference not a pointer. A reference should be able to represent 1 & 2 above but it's implementation in D blurs the distinction between "non existant" and "existing but empty" due to it's relationship with null and it's behaviour when setting length to 0.
> 
> In short:
> - A char[] should not go from "empty" to "non existant" without being
> explicitly assigned to "non existant" (AKA null).
> - "empty" (AKA "") should not compare equal to "non existant" (AKA null).
> 
> It appears to me that the only reliable way in D to indicate "non existant" is to throw an exception. Perhaps this is acceptable, perhaps it's the D way and I simply have to get used to it.
> 
> <snip>
> 
>>>> Perhaps it should be illegal to implicitly cast a dynamic array to a ptr.
>>>
>>> If the array ptr is null the result will be null, right? I don't see a problem with this.
>>
>> I was suggesting making it illegal so that casually testing !line would be illegal. Instead it would have to be !line.ptr which makes it more obvious what is actually being tested (ie - the length is ignored and just the ptr is checked)
> 
> I don't think this is necessary.
> 
>> By the way, when would you like readLine to return a null string as opposed to an non-null-zero-length string?
> 
> At the end of file.
> 
> readLine() - null means no lines "exist".
> readLine() - "" means a line "exists" but is "emtpy" of chars.

All of this is well said and presented. I'm in total agreement with this point of view.

An empty string is a string that is empty.

-- 
Derek Parnell
Melbourne, Australia
29/03/2005 9:03:46 PM
March 29, 2005
On Tue, 29 Mar 2005 05:35:07 +0200, AEon <aeon2001@lycos.de> wrote:
> Regan Heath wrote (Ben read your feedback as well thanx):
>
>>>>     {
>>>>         int result = 0;
>>>>         char[] line;
>>>>          while(!eof())
>>>>         {
>>>>             line = readLine();
>>>
>>>
>>> How come readLine() knows of the stream?
>>  Because LineReader is a child class of BufferedFile, which is a stream.  The readLine call above calls the readLine of the parent class  BufferedFile.
>
> Ah... that is one of the things I really hate as a OOP beginner. It is very difficult to check where the heck certain "behavior" comes from. If   the programmer is indeed fully aware of the parent classes, that may be clearer, but when I only see the "new" code, I find it very confusing. I am not even sure one *can* look up the original definition of the parent classes?

In this case you can look in dmd\src\phobos\std\stream.d for the class definition of BufferedFile.

You may be interested in an old thread on method name resolution:
  http://www.digitalmars.com/d/archives/digitalmars/D/6928.html

It's kinda involved but relevant to your comments above as the method name resolution affects the behaviour of a derived class. The idea being D's method name resolution makes it simpler/explicit WRT the behaviour of classes with overloaded methods.

>>>>             result = dg(line);
>>>>             if (result) break;
>>>
>>> Don't understand these lines either.
>>  As Ben said, it's part of the foreach "magic", his links should explain  it. If not, let us know how the docs are deficient and hopefully someone  can improve them.
>
> Reminds me that I don't actually understand D, and that I only use certain code sniplets all over the place sofar. :)

I wouldn't worry overmuch. I still find it hard to remember how to code things like opApply, I copy/paste from the docs and then modify each time I do it.
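
For reference, the general shape of opApply looks roughly like this in current D syntax. `Lines` is an invented stand-in for the LineReader class discussed above, iterating over an array instead of a stream, and `visitedUntil` is a made-up helper that exercises it.

```d
// Sketch only: `Lines` is an invented stand-in for the LineReader in this thread.
struct Lines
{
    string[] items;

    // foreach passes the loop body to opApply as the delegate `dg`.
    int opApply(scope int delegate(ref string) dg)
    {
        int result = 0;
        foreach (ref item; items)
        {
            result = dg(item);
            // Nonzero means the loop body left early (break, return, goto),
            // so stop iterating and hand the value back to the caller.
            if (result) break;
        }
        return result;
    }
}

// Counts how many items are visited up to and including `stopAt`.
size_t visitedUntil(Lines lines, string stopAt)
{
    size_t visited = 0;
    foreach (ref string line; lines)
    {
        ++visited;
        if (line == stopAt) break;   // makes dg return nonzero inside opApply
    }
    return visited;
}

void main()
{
    auto lines = Lines(["one", "", "three"]);
    assert(visitedUntil(lines, "") == 2);         // the blank item is visited
    assert(visitedUntil(lines, "missing") == 3);  // runs to the end
}
```

The `result = dg(line); if (result) break;` pair is what lets a `break` in the foreach body end the loop inside opApply.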

>>> Why did you use size_t for lineno, would int now also work? (I tested  this and it works fine to replace all size_t with int).
>>  As Ben mentioned, size_t is either a 32 or 64 bit type depending on the  underlying OS/processor. I believe the idea is that using it chooses the  most "sensible" type for holding "size" values on the current OS/processor.
>
> Aha... IIRC there was something like that in ANSI C as well... I never trusted it ;)... so size_t is something like a special optimization case. I.e. when do you decide to use good old int, and when do you feel size_t would be a better choice?

Good question. I would use 'int' when the size of the type is important, i.e. I need 32 bits. I would use size_t when the size is unimportant, so long as it is "big enough".
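
A tiny illustration of that rule of thumb, written for a current compiler. `countNewlines` is a made-up function name. The static asserts only restate properties of the types themselves.

```d
// Made-up example: a count that can never be negative and may, in
// principle, be as large as the input, so size_t is a natural fit.
size_t countNewlines(const(char)[] text)
{
    size_t n = 0;
    foreach (c; text)
        if (c == '\n')
            ++n;
    return n;
}

// int is always 32 bits in D, whatever the platform.
static assert(int.sizeof == 4);
// Array lengths are size_t, which is as wide as a pointer.
static assert(is(typeof("abc".length) == size_t));
static assert(size_t.sizeof == (void*).sizeof);

void main()
{
    assert(countNewlines("a\nb\n") == 2);
}
```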

> But you do not seem to be able to handle
>      "new File( logfile, FileMode.In )"
> errors... i.e. if something happens D, will exit with an internal Error message.
>
> Presumably one could "catch" such errors to provide own error messages?

Yes.

try {
  File f = new File(logfile, FileMode.In);
}
catch (OpenException e) {
  writefln("OPEN ERROR - ",e);
}

> Same seems to be the case with
>
>      while( ! lg.eof() )	
>      {
> 	line = lg.readLine();
>      }
>
> Should a readLine() error occur, then D trows a internal Error message.

try {
  while( ! lg.eof() )	
  {
    line = lg.readLine();
  }
}
catch (ReadException e) {
  writefln("READ ERROR - ",e);
}

> I am not sure I *really* want to catch errors, should this be possible in the above 2 cases. But maybe that could be useful?

Exceptions are the recommended error handling mechanism for D. The argument/confusion centers around what is worthy of an exception and what is not.

For example IMO in the code above not being able to open a file is exceptional (you have assumed it exists by opening in FileMode.In), but, reaching the end of the file is not exceptional as it's guaranteed to happen eventually.

Uncaught exceptions are automatically handled by the default handler. For trivial applications, letting it handle your exceptions (like the failure to open a file) might be exactly what you want. It's your choice.
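
For comparison, a rough sketch of the same idea against today's standard library. It assumes std.stdio's `File` (whose constructor throws `ErrnoException` when the open fails) rather than the std.stream classes used above. The function name `tryCountLines` and the file name are invented for the example.

```d
import std.exception : ErrnoException;
import std.stdio : File, stderr;

// Hypothetical helper: counts the lines in a file, reporting failures
// with our own messages instead of the default handler's.
bool tryCountLines(string path, out size_t count)
{
    File f;
    try
    {
        f = File(path, "r");               // throws ErrnoException on failure
    }
    catch (ErrnoException e)
    {
        stderr.writeln("OPEN ERROR - ", e.msg);
        return false;
    }

    try
    {
        foreach (line; f.byLine())         // simply ends at end of file
            ++count;
    }
    catch (Exception e)
    {
        stderr.writeln("READ ERROR - ", e.msg);
        return false;
    }
    return true;
}

void main()
{
    size_t n;
    if (tryCountLines("no-such-file.log", n))
        stderr.writeln("lines: ", n);
}
```

Reaching the end of the file ends the loop without any exception; only the open and read failures go through catch blocks.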

Regan
March 29, 2005
"Regan Heath" <regan@netwin.co.nz> wrote in message news:opsoeax3jt23k2f5@nrage.netwin.co.nz...
> On Mon, 28 Mar 2005 21:13:54 -0500, Ben Hinkle <ben.hinkle@gmail.com> wrote:
>> I take it you define empty to be non-null ptr and 0 length, correct?
>
> "empty" - "Holding or containing nothing."
>
> In my mind something is "empty" if it:
>   a. contains nothing.
>   b. exists.
>
> It cannot be "empty" if it contains something.
> It cannot be "empty" if it does not exist.
>
> So, my first question. How do I represent "non existant" in D?

What you describe is ok with me but I don't think it maps well to D's
arrays. To me I don't really look at existence or non-existence but instead
the following two rules
1) all arrays have a well-defined length
2) arrays with non-zero length have a well-defined pointer
One can tread carefully to preserve pointers with 0 length arrays but it
takes effort.

>> By the way, when would you like readLine to return a null string as opposed to an non-null-zero-length string?
>
> At the end of file.
>
> readLine() - null means no lines "exist".
> readLine() - "" means a line "exists" but is "emtpy" of chars.

The foreach will stop automatically at eof. It's like a foreach stopping at the end of an array when it has no more elements. It doesn't run once more with null - it just stops.


March 29, 2005
Regan Heath wrote:

>> But you do not seem to be able to handle
>>      "new File( logfile, FileMode.In )"
>> errors... i.e. if something happens D, will exit with an internal Error  message.
>>
>> Presumably one could "catch" such errors to provide own error messages?
> 
> Yes.
> 
> try {
>   File f = new File(logfile, FileMode.In);
> }
> catch (OpenException e) {
>   writefln("OPEN ERROR - ",e);
> }

Have read several examples by now. Is there a complete list of catch "keywords"? The D documentation mentions a few, but probably not all?

e.g.	catch (ArrayBoundsError)
	catch (Object o)
	catch (std.asserterror.AssertError ae)

>> Same seems to be the case with
>>
>>      while( ! lg.eof() )        {
>>     line = lg.readLine();
>>      }
>>
>> Should a readLine() error occur, then D trows a internal Error message.
> 
> try {
>   while( ! lg.eof() )     {
>     line = lg.readLine();
>   }
> }
> catch (ReadException e) {
>   writefln("READ ERROR - ",e);
> }

Ahh... info like that could be helpful in the official docs.


>> I am not sure I *really* want to catch errors, should this be possible  in the above 2 cases. But maybe that could be useful?
> 
> Exceptions are the recommended error handling mechanism for D. The  argument/confusion centers around what is worthy of an exception and what is not.
> 
> For example IMO in the code above not being able to open a file is  exceptional (you have assumed it exists by opening in FileMode.In), but,  reaching the end of the file is not exceptional as it's guaranteed to  happen eventually.
> 
> Uncaught exceptions are automatically handled by the default handler, for  trivial applications allowing it to handle your exceptions (like the  failure to open a file) might be exactly what you want. It's your choice.

Well in the above examples it would basically just give me the chance to write out my own messages. But since these cases are serious, there is nothing much one could save.

AEon
March 29, 2005
On Tue, 29 Mar 2005 08:29:36 -0500, Ben Hinkle <ben.hinkle@gmail.com> wrote:
> "Regan Heath" <regan@netwin.co.nz> wrote in message
> news:opsoeax3jt23k2f5@nrage.netwin.co.nz...
>> On Mon, 28 Mar 2005 21:13:54 -0500, Ben Hinkle <ben.hinkle@gmail.com>
>> wrote:
>>> I take it you define empty to be non-null ptr and 0 length, correct?
>>
>> "empty" - "Holding or containing nothing."
>>
>> In my mind something is "empty" if it:
>>   a. contains nothing.
>>   b. exists.
>>
>> It cannot be "empty" if it contains something.
>> It cannot be "empty" if it does not exist.
>>
>> So, my first question. How do I represent "non existant" in D?
>
> What you describe is ok with me but I don't think it maps well to D's
> arrays.

Exactly my point. It would only take a few small changes to "fix" the problem as I see it.

> To me I don't really look at existance or non-existance but instead the following two rules
> 1) all arrays have a well-defined length
> 2) arrays with non-zero length have a well-defined pointer
> One can tread carefully to preserve pointers with 0 length arrays but it
> takes effort.

Indeed. So, how do you handle existence/non-existence?

>>> By the way, when would you like readLine to return a null string as
>>> opposed to an non-null-zero-length string?
>>
>> At the end of file.
>>
>> readLine() - null means no lines "exist".
>> readLine() - "" means a line "exists" but is "emtpy" of chars.
>
> The foreach will stop automatically at eof. It's like a foreach stopping at the end of an array when it has no more elements. It doesn't run once more with null - it just stops.

Which foreach? My one? Assume now that I remove the eof() check. What happens now?

Regan
March 29, 2005
On Tue, 29 Mar 2005 19:33:30 +0200, AEon <aeon2001@lycos.de> wrote:
>>> But you do not seem to be able to handle
>>>      "new File( logfile, FileMode.In )"
>>> errors... i.e. if something happens D, will exit with an internal Error  message.
>>>
>>> Presumably one could "catch" such errors to provide own error messages?
>>  Yes.
>>  try {
>>   File f = new File(logfile, FileMode.In);
>> }
>> catch (OpenException e) {
>>   writefln("OPEN ERROR - ",e);
>> }
>
> Have read several examples by now. Is there a complete list of catch "keywords"? The D documentions mentions a few, but probably not all?
>
> e.g.	catch (ArrayBoundsError)
> 	catch (Object o)
> 	catch (std.asserterror.AssertError ae)

Each "catch keyword" is a class derived from the Exception or Error classes. They are defined in the modules that use them. I agree it would be nice to have a complete list. Eventually I can imagine a documentation generator listing all the exceptions that can be thrown by a function.
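
As a sketch of what that means in practice (current D syntax): the class `LogOpenException` and the functions `openLog` and `describeFailure` are invented for this example. `openLog` always throws, standing in for a real open that failed.

```d
// An exception type is just a class derived from Exception.
class LogOpenException : Exception
{
    this(string msg, string file = __FILE__, size_t line = __LINE__)
    {
        super(msg, file, line);
    }
}

// Stand-in for a real open routine; it always fails so the example is
// self-contained.
void openLog(string path)
{
    throw new LogOpenException("cannot open " ~ path);
}

string describeFailure(string path)
{
    try
    {
        openLog(path);
        return "ok";
    }
    catch (LogOpenException e)   // most specific class first
    {
        return "log: " ~ e.msg;
    }
    catch (Exception e)          // the base class catches everything else derived from it
    {
        return "other: " ~ e.msg;
    }
}

void main()
{
    assert(describeFailure("my.log") == "log: cannot open my.log");
}
```

So the set of things you can catch is open-ended: any module can define its own class, and a catch of the base class also covers the derived ones.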

Regan