De-Referencing A Pointer (page 3) - D Programming Language Discussion Forum

March 22, 2006

Re: De-Referencing A Pointer

Posted by David L. Davis
in reply to Rory Starkweather

Rory, I've put together some code that I got to work. You can download it in the following zip file: http://spottedtiger.tripod.com/Downloads/SampleDLL.zip

David L.

-------------------------------------------------------------------
"Dare to reach for the Stars...Dare to Dream, Build, and Achieve!"
-------------------------------------------------------------------

MKoD: http://spottedtiger.tripod.com/D_Language/D_Main_XP.html

March 22, 2006

Re: De-Referencing A Pointer

Posted by tjohnson at prtsoftware . com
in reply to David L. Davis

In article <dvqi0d$29mi$1@digitaldaemon.com>, David L. Davis says...
>
>Rory, I've put together some code that I got to work, you can download it in the following zip file: http://spottedtiger.tripod.com/Downloads/SampleDLL.zip
>
>David L.
>
>-------------------------------------------------------------------
>"Dare to reach for the Stars...Dare to Dream, Build, and Achieve!"
>-------------------------------------------------------------------
>
>MKoD: http://spottedtiger.tripod.com/D_Language/D_Main_XP.html

Thanks.  I was going to ask for this.

Tom J

March 22, 2006

Re: De-Referencing A Pointer

Posted by James Dunne
in reply to James Dunne

Attachments:

DInStr - works.zip

James Dunne wrote:
> Rory Starkweather wrote:
> 
>> ...
> 
> As much as I'd love to help you out, I'm also hopelessly lost.  Can you ZIP up your project and document your test cases and post them on the NG or e-mail them to me personally?  The e-mail I post with on the newsgroups is my valid e-mail.
> 

The working code is attached with a sample Excel spreadsheet with embedded VBA test code.

Since VB strings are the equivalent of D's wchar* strings, you cannot easily use D's Phobos string-handling functions: most of them expect char[] arguments, not wchar[].

My conclusion is that we really need wchar[] and dchar[] string-handling functions in Phobos, or at least an assurance that all the string-handling functions that accept char[] arguments assume UTF-8 encoding, not just ASCII.
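One workaround for the char[]-only functions is to transcode the UTF-16 text first. The following is only a minimal sketch, written for a current D compiler rather than the 2006 one discussed here; `findViaUtf8` is a hypothetical helper name, not part of Phobos or of the attached project. Note that the index it returns refers to the UTF-8 copy, so it cannot be handed back to VB as a position in the original string.

```d
import std.stdio : writeln;
import std.string : indexOf;
import std.utf : toUTF16, toUTF8;

// Hypothetical helper: search UTF-16 text by transcoding it to UTF-8 first.
// The result is an index into the UTF-8 copy (in code units), or -1.
ptrdiff_t findViaUtf8(const(wchar)[] text, dchar c)
{
    string utf8 = toUTF8(text);
    return utf8.indexOf(c);
}

void main()
{
    wstring fromVb = "åäö"w;               // stands in for a string received from VB
    writeln(findViaUtf8(fromVb, 'ö'));     // index within the UTF-8 copy
    writeln(toUTF16(toUTF8(fromVb)) == fromVb); // the round trip preserves the text
}
```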

I had to look at the find() function in the std.string module to see whether it accepts ASCII or UTF-8, and I couldn't figure out which!  The code is:

int find(char[] s, dchar c)
{
    char* p;

    if (c <= 0x7F)
    {   // Plain old ASCII
        p = cast(char*)memchr(s, c, s.length);
        if (p)
            return p - cast(char *)s;
        else
            return -1;
    }

    // c is a universal character
    foreach (int i, dchar c2; s)
    {
        if (c == c2)
            return i;
    }
    return -1;
}

It doesn't make a lick of sense to me that one can iterate over a char[] with foreach and expect dchars outside the ASCII range to come out of it...  Is there something going on under the hood that I'm not aware of?  Is this code trying to imply that the char[] is magically being treated as UTF-8?

-- 
-----BEGIN GEEK CODE BLOCK-----
Version: 3.1
GCS/MU/S d-pu s:+ a-->? C++++$ UL+++ P--- L+++ !E W-- N++ o? K? w--- O
M--@ V? PS PE Y+ PGP- t+ 5 X+ !R tv-->!tv b- DI++(+) D++ G e++>e
h>--->++ r+++ y+++
------END GEEK CODE BLOCK------

James Dunne

March 23, 2006

Re: De-Referencing A Pointer

Posted by Derek Parnell
in reply to James Dunne

On Thu, 23 Mar 2006 02:59:09 +1100, James Dunne <james.jdunne@gmail.com> wrote:



> I had to look at the std.string.d module's find() method to see if it is
> accepting ASCII or UTF-8, and I couldn't figure out which!  The code is:
>
> int find(char[] s, dchar c)
> {
>      char* p;
>
>      if (c <= 0x7F)
>      {	// Plain old ASCII
> 	p = cast(char*)memchr(s, c, s.length);
> 	if (p)
> 	    return p - cast(char *)s;
> 	else
> 	    return -1;
>      }
>
>      // c is a universal character
>      foreach (int i, dchar c2; s)
>      {
> 	if (c == c2)
> 	    return i;
>      }
>      return -1;
> }
>
> This doesn't make a lick of sense to me why one can iterate over a
> char[] with foreach, expecting dchars to come out of it outside the
> range of ASCII...  is there something going on under the hood that I'm
> not aware of?

foreach has a mode of operation that automatically decodes UTF encodings one character at a time.  Thus foreach(dchar c; "abcdef"c) is valid code.
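As a rough illustration of that decoding mode, the sketch below walks the same string twice, once as raw char code units and once as decoded dchar values. It is written for a current D compiler (where string is an alias for immutable(char)[]), and the function names are made up for this example.

```d
import std.stdio : writefln;

// Number of UTF-8 code units (bytes) in s: no decoding takes place.
size_t countCodeUnits(string s)
{
    size_t n;
    foreach (char u; s)
        ++n;
    return n;
}

// Number of code points in s: foreach decodes the UTF-8 sequence
// because the loop variable is declared as dchar.
size_t countCodePoints(string s)
{
    size_t n;
    foreach (dchar c; s)
        ++n;
    return n;
}

void main()
{
    string s = "aé€"; // 1 + 2 + 3 code units in UTF-8
    writefln("%s code units, %s code points",
             countCodeUnits(s), countCodePoints(s));
}
```

With this input the program should report 6 code units and 3 code points.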

> Is this code trying to imply that the char[] is being
> treated as UTF-8 magically?

char[] *is* UTF-8; it is not ASCII.  No magic here.

-- 
Derek Parnell
Melbourne, Australia

March 23, 2006

Re: De-Referencing A Pointer

Posted by James Dunne
in reply to Derek Parnell

Derek Parnell wrote:
> On Thu, 23 Mar 2006 02:59:09 +1100, James Dunne <james.jdunne@gmail.com>  wrote:
> 
> 
> 
>> I had to look at the std.string.d module's find() method to see if it is
>> accepting ASCII or UTF-8, and I couldn't figure out which!  The code is:
>>
>> int find(char[] s, dchar c)
>> {
>>      char* p;
>>
>>      if (c <= 0x7F)
>>      {    // Plain old ASCII
>>     p = cast(char*)memchr(s, c, s.length);
>>     if (p)
>>         return p - cast(char *)s;
>>     else
>>         return -1;
>>      }
>>
>>      // c is a universal character
>>      foreach (int i, dchar c2; s)
>>      {
>>     if (c == c2)
>>         return i;
>>      }
>>      return -1;
>> }
>>
>> This doesn't make a lick of sense to me why one can iterate over a
>> char[] with foreach, expecting dchars to come out of it outside the
>> range of ASCII...  is there something going on under the hood that I'm
>> not aware of?
> 
> 
> The foreach() has a mode of operation that automatically converts UTF  encodings one character at a time.  Thus  foreach(dchar c; "abcdef"c) is  valid code.
> 

Okay, that's where I was confused.  Thanks!

>> Is this code trying to imply that the char[] is being
>> treated as UTF-8 magically?
> 
> 
> char[] *is* utf-8 it is not ASCII.  No magic here.
> 

I understand that; I just didn't know about the hidden foreach magic.

-- 
-----BEGIN GEEK CODE BLOCK-----
Version: 3.1
GCS/MU/S d-pu s:+ a-->? C++++$ UL+++ P--- L+++ !E W-- N++ o? K? w--- O M--@ V? PS PE Y+ PGP- t+ 5 X+ !R tv-->!tv b- DI++(+) D++ G e++>e h>--->++ r+++ y+++
------END GEEK CODE BLOCK------

James Dunne

March 23, 2006

Break/Continue Structure

Posted by Rory Starkweather
in reply to James Dunne

 I guess I would like to ask why I shouldn't do this at the same time I ask how to do it.

 I've been looking at a piece of code like:

    foreach (int i, dchar c; theString)
    {
        if (c == searchChar)
            return i + 1;
    }
    return 0;
}

 I understand the reason for doing this, but prefer to do things like this:

    int iPointer;

    iPointer = 0;
    foreach (int i, dchar c; theString)
    {
        if (c == searchChar)
            iPointer = i + 1;
        // ?? break;
    }
    return (iPointer);
}

 I realize that the extra integer takes up a little memory space, but . . .

 My questions are:
Will 'break' work here?
Why not do it this way?

March 24, 2006

Re: Break/Continue Structure

Posted by Derek Parnell
in reply to Rory Starkweather

On Thu, 23 Mar 2006 17:51:09 -0600, Rory Starkweather wrote:

>   I guess I would like to ask why I shouldn't do this at the same time I
> ask how to do it.
> 
>   I've been looking at a piece of code like:
> 
> 	foreach (int i, dchar c; theString)
> 	{
>                  if (c == searchChar)
> 		return i + 1;
> 	}
> 	return 0;
> }
> 
>   I understand the reason for doing this, but prefer to do things like this:
> 
> 	int iPointer;
> 
> 	iPointer = 0;
> 	foreach (int i, dchar c; theString)
> 	{
>                  if (c == searchChar)
> 		iPointer =  i + 1;
> 		// ?? break;
> 	}
> 	return (iPointer);
> }
> 
>   I realize that the extra integer takes up a little memory space, but . . .
> 
>   My questions are:
> Will 'break' work here?
Yes it will, though it should be coded ...

 iPointer = 0;
 foreach (int i, dchar c; theString)
 {
   if (c == searchChar)
   {
     iPointer =  i + 1;
     break;
   }
 }
 return (iPointer);


> Why not do it this way?

It is just a coding-style issue. People code to different standards.
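For reference, the single-exit version could be wrapped up as a complete function along these lines. This is a sketch for a current D compiler; the function name `findPosition` and the string parameter type are assumptions, since the original function header was not posted.

```d
import std.stdio : writeln;

// Hypothetical InStr-style helper: returns the 1-based position of
// searchChar in theString, or 0 when it is absent.
int findPosition(string theString, dchar searchChar)
{
    int iPointer = 0;
    foreach (i, dchar c; theString)
    {
        if (c == searchChar)
        {
            iPointer = cast(int) i + 1;
            break; // leaves the foreach; execution resumes at the return below
        }
    }
    return iPointer;
}

void main()
{
    writeln(findPosition("hello", 'l')); // first 'l' is the third character
    writeln(findPosition("hello", 'z')); // not present
}
```

The braces around the if body matter: without them only the assignment is conditional, and the break would end the loop on the first iteration.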

BTW, using the foreach this way can be misleading. The pointer value returned represents the number of dchars examined and *not* an index into theString. This is significant if theString is not a dchar[].

-- 
Derek
(skype: derek.j.parnell)
Melbourne, Australia
"Down with mediocracy!"
24/03/2006 11:07:48 AM

March 24, 2006

Re: Break/Continue Structure

Posted by Oskar Linde
in reply to Derek Parnell

Derek Parnell wrote:

> BTW, using the foreach this way can be misleading. The pointer value returned represents the number of dchars examined and *not* an index into theString. This is significant if theString is not a dchar[].

That is not correct. The index returned is an index into the char[] array, not the number of dchars processed:

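import std.stdio;  // for writefln

void main() {
        // size_t keeps the index type portable across 32- and 64-bit targets
        foreach(size_t ix, dchar c; "åäö"c)
                writefln("c = %s, ix = %s",c,ix);
}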

Prints:

c = å, ix = 0
c = ä, ix = 2
c = ö, ix = 4
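If a character count is wanted instead of the code-unit index, one option is to keep a separate counter. The sketch below, for a current D compiler, puts the two side by side; both function names are invented for the illustration.

```d
import std.stdio : writefln;

// Index of the first code unit of needle in s, as foreach reports it.
ptrdiff_t codeUnitIndex(string s, dchar needle)
{
    foreach (ix, dchar c; s)
        if (c == needle)
            return ix;
    return -1;
}

// Number of whole characters (code points) that precede needle in s.
ptrdiff_t codePointIndex(string s, dchar needle)
{
    ptrdiff_t n = 0;
    foreach (dchar c; s)
    {
        if (c == needle)
            return n;
        ++n;
    }
    return -1;
}

void main()
{
    writefln("code-unit index = %s, code-point index = %s",
             codeUnitIndex("åäö", 'ö'), codePointIndex("åäö", 'ö'));
}
```

For 'ö' in "åäö" the two values differ (4 versus 2) because each of the preceding characters occupies two code units in UTF-8.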

/Oskar

March 24, 2006

Re: Break/Continue Structure

Posted by Rory Starkweather
in reply to Derek Parnell

Derek Parnell wrote:
> On Thu, 23 Mar 2006 17:51:09 -0600, Rory Starkweather wrote:
> 
> 
>>  I guess I would like to ask why I shouldn't do this at the same time I 
>>ask how to do it.
>>
>>  I've been looking at a piece of code like:
>>
>>	foreach (int i, dchar c; theString)
>>	{
>>                 if (c == searchChar)
>>		return i + 1;
>>	}
>>	return 0;
>>}
>>
>>  I understand the reason for doing this, but prefer to do things like this:
>>
>>	int iPointer;
>>	
>>	iPointer = 0;
>>	foreach (int i, dchar c; theString)
>>	{
>>                 if (c == searchChar)
>>		iPointer =  i + 1;
>>		// ?? break;
>>	}
>>	return (iPointer);
>>}
>>
>>  I realize that the extra integer takes up a little memory space, but . . .
>>
>>  My questions are:
>>Will 'break' work here?
> 
> Yes it will, though it should be coded ...

 Understood. The // ?? was added for emphasis.

> 
>  iPointer = 0;
>  foreach (int i, dchar c; theString)
>  {
>    if (c == searchChar)
>    {
>      iPointer =  i + 1;
>      break;
>    }
>  }
>  return (iPointer);
> 
> 
> 
>>Why not do it this way?
> 
> 
> It is just a coding-style issue. People code to different standards.
> 
> BTW, using the foreach this way can be misleading. The pointer value
> returned represents the number of dchars examined and *not* an index into
> theString. This is significant if theString is not a dchar[].

 Good point. Thanks for mentioning it. I hadn't really considered that. Another option that has been suggested is using 'ifind' after suitable conversions. 'ifind' is pretty much guaranteed to give me a pointer to the actual character I want, isn't it?

March 24, 2006

Re: Break/Continue Structure

Posted by Rory Starkweather
in reply to Oskar Linde

Oskar Linde wrote:
> Derek Parnell wrote:
> 
> 
>>BTW, using the foreach this way can be misleading. The pointer value
>>returned represents the number of dchars examined and *not* an index into
>>theString. This is significant if theString is not a dchar[].
> 
> 
> That is not correct. The index returned is an index into the char[] array,
> not the number of dchars processed:
> 
> void main() {
>         foreach(uint ix, dchar c; "åäö"c)
>                 writefln("c = %s, ix = %s",c,ix);
> }
> 
> Prints:
> 
> c = å, ix = 0
> c = ä, ix = 2
> c = ö, ix = 4
> 
> /Oskar

 I'm having some trouble understanding this implementation of the 'foreach' construct. From the definition of the 'foreach' expression, the purpose of the 'c' in . . .; "åäö"c) is not clear to me. Does this implicitly declare an array of items with the same data type as 'c'? In other words, three dchars in this case? From Oskar's comment that seems unlikely.

 For me a large part of the problem is the variable naming convention in the documentation, which seems a little ambiguous, although reading the entries carefully usually clarifies things. I think I am just not used to one-letter variable names yet, a style I have never been comfortable with. Oddly enough, I am also trying to learn Oberon-2 now, and Wirth uses the same convention.
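On the question about the trailing 'c': in D it is a string literal suffix and has nothing to do with the loop variable of the same name. The suffixes c, w and d select a char, wchar or dchar string respectively. A small sketch, assuming a current D compiler where these types are spelled string, wstring and dstring:

```d
import std.stdio : writeln;

void main()
{
    auto utf8  = "åäö"c; // char string, UTF-8
    auto utf16 = "åäö"w; // wchar string, UTF-16
    auto utf32 = "åäö"d; // dchar string, UTF-32

    static assert(is(typeof(utf8)  == string));
    static assert(is(typeof(utf16) == wstring));
    static assert(is(typeof(utf32) == dstring));

    // .length counts code units, so the same text has different lengths.
    writeln(utf8.length, " ", utf16.length, " ", utf32.length);
}
```

So "åäö"c is six UTF-8 code units, not three dchars, which is consistent with the indices 0, 2 and 4 in Oskar's output.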

Copyright © 1999-2021 by the D Language Foundation