June 08, 2004
In article <opr9aqpftm5a2sq9@digitalmars.com>, Regan Heath says...
>
>On Tue, 8 Jun 2004 14:56:32 -0700, Walter <newshound@digitalmars.com> wrote:
>> The function uppercases the input string. It shouldn't modify its inputs.
>
>A perfect example of where 'in' should mean 'const' and the compiler should catch this error.
>
>Regan
>

Regan: If an "in" acts like an "inout" for strings, does it do this for all the other different types (int, real, long, etc.) too?  :(  Seems confusing, when is an "in" an "in" and not an "inout"?

-------------------------------------------------------------------
"Dare to reach for the Stars...Dare to Dream, Build, and Achieve!"
June 08, 2004
On Tue, 8 Jun 2004 22:35:23 +0000 (UTC), David L. Davis <SpottedTiger@yahoo.com> wrote:

> In article <opr9aqpftm5a2sq9@digitalmars.com>, Regan Heath says...
>>
>> On Tue, 8 Jun 2004 14:56:32 -0700, Walter <newshound@digitalmars.com>
>> wrote:
>>> The function uppercases the input string. It shouldn't modify its inputs.
>>
>> A perfect example of where 'in' should mean 'const' and the compiler
>> should catch this error.
>>
>> Regan
>>
>
> Regan: If an "in" acts like an "inout" for strings, does it do this for all the
> other different types (int, real, long, etc.) too?

No. Strings are passed by reference; the other types (int, real, long, etc.) are not. See below.

>  :(  Seems confusing, when is
> an "in" an "in" and not an "inout"?

Exactly!

I believe strings and other arrays are all passed by reference and due to this you can change the *contents* of the string, but not the *reference* to the string. If you passed it as an inout you could change both the *contents* and the *reference*.
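
A minimal sketch of that distinction, assuming a present-day D compiler, where an `inout` parameter is spelled `ref` and string literals need a `.dup` to be writable. The function names are made up for illustration:

```d
import std.stdio;

// Hypothetical helper: the array is passed as a plain (in) parameter.
void changeContents(char[] s)
{
    s[0] = '*';        // writes through to the caller's characters
    s = "other".dup;   // rebinds only the local copy of the reference
}

// Hypothetical helper: the array is passed by reference (old "inout").
void changeReference(ref char[] s)
{
    s = "other".dup;   // rebinds the caller's reference as well
}

void main()
{
    char[] a = "hello".dup;

    changeContents(a);
    writeln(a);        // contents changed, reference untouched

    changeReference(a);
    writeln(a);        // the caller now sees the new array
}
```

The first call alters the caller's characters but cannot make `a` point elsewhere; the second call can.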

Regan.

> -------------------------------------------------------------------
> "Dare to reach for the Stars...Dare to Dream, Build, and Achieve!"



-- 
Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/
June 08, 2004
In article <ca5evb$1d8$1@digitaldaemon.com>, David L. Davis says...
>
>If an "in" acts like an "inout" for strings, does it do this for all the other different types (int, real, long, etc.) too?

For arrays, classes and pointers, but not for primitive types or for structs.
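
A small sketch of which parameter kinds let the callee's writes leak back to the caller, assuming a present-day D compiler. The type and function names are hypothetical, chosen only for this illustration:

```d
import std.stdio;

struct PointS { int x; }   // value type: copied when passed
class  PointC { int x; }   // reference type: the handle is copied

// Hypothetical helpers; each one tries to modify its argument.
void touchInt(int i)        { i = 1; }
void touchStruct(PointS s)  { s.x = 1; }
void touchClass(PointC c)   { c.x = 1; }
void touchArray(int[] a)    { a[0] = 1; }
void touchPointer(int* p)   { *p = 1; }

void main()
{
    int i = 0;
    PointS s;
    PointC c = new PointC;
    int[] a = [0];
    int viaPointer = 0;

    touchInt(i);
    touchStruct(s);
    touchClass(c);
    touchArray(a);
    touchPointer(&viaPointer);

    // Only the class, array and pointer cases are visible to the caller.
    writefln("int=%d struct=%d class=%d array=%d pointer=%d",
             i, s.x, c.x, a[0], viaPointer);
}
```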


> :(

I echo that sentiment.


>Seems confusing, when is
>an "in" an "in" and not an "inout"?

Unlike C and C++, D has no const qualifier that lets the compiler catch these errors. You can catch them with DbC, but you have to try REALLY hard. The following example WILL assert as a consequence of a DbC const error (I've tested it):

>   private char[][] backup;
>   void f(in char[] s)
>   in
>   {
>       backup.length = backup.length + 1;
>       backup[backup.length-1] = s.dup;
>   }
>   out
>   {
>       assert(s == backup[backup.length-1]);
>       backup.length = backup.length - 1;
>   }
>   body
>   {
>       s[0] ='*'; // violates my DbC assertion of s's constness
>   }
>
>   int main(char[][] args)
>   {
>       char[] s = "hello";
>       f(s);
>       return 0;
>   }

However - even THAT won't work if an exception is thrown or if the code is multi-threaded. You'd also have to make the whole thing synchronized AND wrap it in try/catch to ensure you caught that. (And, so far as I know, there is no way to introduce either "synchronized" or "try/catch" in a debug build only, without writing the whole function twice).

So - like you so eloquently put it earlier,

:(

Jill


June 09, 2004
In article <ca5ct6$2vif$1@digitaldaemon.com>, Walter says...
>
>The function uppercases the input string. It shouldn't modify its inputs.
>

Walter: Third time around is normally the "Charm!" Anywayz, I've been hammering away at these two functions ifind() and irfind(), and I believe I've made them much better than before, thanks to both you and Jill for the advice.

Please let me know if I've still missed something, but if not I may ask some of the folks here to do a little testing of these functions. Right now tho I'm tired and seeing double (it's late here), so I'll check what's up in the morning. I will be bright-eyed and bushy-tailed tomorrow to fix any problems found.

Thxs for giving me a chance at donating some code...I've learned a few more things about how to use "D", and that's all good indeed! :))


import std.c.stdio;
import std.string;

/****************************************************************************
* Function      : int ifind( in char[], in char[] )
* Author        : David L. 'SpottedTiger' Davis
* Language      : DigitalMars "D" aka Mars v0.92
* Created Date  : 03.Jun.04
* Modified Date : 08.Jun.04 Removed the wrapper function and the
*                           default parameter, mainly because the
*                           string being passed in should be already
*                           sliced so that the next search will find
*                           the matching sub-string value. Also per
*                           advice from Walter, <g> I've removed every
*                           tolower() call, and now locally all characters
*                           in the strings are set to lowercase where they
*                           sit without the need to create another copy.
*               : 09.Jun.04 Reworked the whole thing! Fixed the problem
*                           with the input string getting stepped on, and
*                           now only the sSubStr is duped to another string.
*                           While the sStr string is looked at in a loop
*                           looking for the matching SubString...a character
*                           at a time.
* Requirements  : std.string
* Licence       : Same as those for the Phobos (Runtime Library)
*****************************************************************************
*
* Note: Meant to be a case insensitive version of std.string.find
*/
int ifind
(
    in char[] sStr,
    in char[] sSubStr
)
{
    char[] sSubStrTmp;
    bool   bFoundMatch   = false;
    int    iFound1stPos  = -1;
    int    iSubStrRunner = 0;
    char   cCharTmp;

    // If either of the string parameters are empty, return not found
    if ( sStr.length < 1 || sSubStr.length < 1 || sSubStr.length > sStr.length )
        return -1;

    // Get a working copy of sSubStr
    sSubStrTmp = sSubStr.dup;

    // sSubStrTmp set to lowercase locally
    // lowercase ascii a = '\x61', uppercase ascii A = '\x41'
    foreach ( int iStrPos, char cChar; sSubStrTmp )
        sSubStrTmp[ iStrPos ] = ( find( uppercase, cChar ) != -1 ) ? sSubStrTmp[ iStrPos ] + 0x20 : sSubStrTmp[ iStrPos ];

    foreach ( int iStrPos, char cChar; sStr )
    {
        cCharTmp = cChar;

        // If cChar is an uppercase ASCII, make it lowercase for the compare
        cCharTmp = ( cCharTmp >= '\x41' && cCharTmp <= '\x5A' ? cCharTmp + '\x20' : cCharTmp );

        //printf( "iStrPos=%d, cChar=%c, cCharTmp=%c, sStr.length=%d\n", iStrPos, cChar, cCharTmp, sStr.length );

        // Find the very 1st character of the Sub String is found within the Main String
        if ( cCharTmp == sSubStrTmp[ 0 ] && bFoundMatch == false )
        {
            iFound1stPos  = iStrPos;
            bFoundMatch   = true;
            iSubStrRunner = 1;

            if ( sSubStrTmp.length == 1 ) return iFound1stPos;
            continue;
        }
        // Match the rest of the characters in the Sub String is found within the Main String
        else if ( cCharTmp == sSubStrTmp[ iSubStrRunner ] && bFoundMatch == true )
        {
            iSubStrRunner++;
            if ( iSubStrRunner > sSubStrTmp.length - 1 ) return iFound1stPos;
            continue;
        }
        // Not all characters match, reset
        else if ( bFoundMatch == true )
        {
            // Not a total match, reset back to defaults
            iFound1stPos  = -1;
            bFoundMatch   = false;
            iSubStrRunner = 0;
        }

    }

    return -1;

} // end int ifind( in char[], in char[] )


/****************************************************************************
* Function      : int irfind( in char[], in char[] )
* Author        : David L. 'SpottedTiger' Davis
* Language      : DigitalMars "D" aka Mars v0.92
* Created Date  : 03.Jun.04
* Modified Date : 08.Jun.04 Removed the wrapper function and the
*                           default parameter, mainly because the
*                           string being passed in should be already
*                           sliced so that the next search will find
*                           the matching sub-string value. Also per
*                           advice from Walter, <g> I've removed every
*                           tolower() call, and now locally all characters
*                           in the strings are set to lowercase where they
*                           sit without the need to create another copy.
*               : 09.Jun.04 Reworked the whole thing! Fixed the problem
*                           with the input string getting stepped on, and
*                           now only the sSubStr is duped to another string.
*                           While the sStr string is looked at in a loop
*                           looking for the matching SubString...a character
*                           at a time.
* Requirements  : std.string
* Licence       : Same as those for the Phobos (Runtime Library)
****************************************************************************
*
* Note: Meant to be a case insensitive version of std.string.rfind.
*/
int irfind
(
    in char[] sStr,
    in char[] sSubStr
)
{

    char[] sSubStrTmp;
    bool   bFoundMatch   = false;
    int    iFound1stPos  = -1;
    int    iSubStrRunner = 0;
    char   cCharTmp;

    // If either of the string parameters are empty, return not found
    if ( sStr.length < 1 || sSubStr.length < 1 || sSubStr.length > sStr.length )
        return -1;

    // Get a working copy of sSubStr
    sSubStrTmp = sSubStr.dup;

    // sSubStrTmp set to lowercase locally
    // lowercase ascii a = '\x61', uppercase ascii A = '\x41'
    foreach ( int iStrPos, char cChar; sSubStrTmp )
        sSubStrTmp[ iStrPos ] = ( find( uppercase, cChar ) != -1 ) ? sSubStrTmp[ iStrPos ] + 0x20 : sSubStrTmp[ iStrPos ];

    for ( int iStrPos = sStr.length - 1; iStrPos >= 0; iStrPos-- )
    {

        cCharTmp = sStr[ iStrPos ];

        // If cChar is an uppercase ASCII, make it lowercase for the compare
        cCharTmp = ( cCharTmp >= '\x41' && cCharTmp <= '\x5A' ? cCharTmp + '\x20' : cCharTmp );

        //printf( "iStrPos=%d, cChar=%c, cCharTmp=%c, sStr.length=%d\n", iStrPos, sStr[ iStrPos ], cCharTmp, sStr.length );

        // Find the very 1st character of the Sub String is found within the Main String
        if ( cCharTmp == sSubStrTmp[ 0 ] && bFoundMatch == false )
        {
            iFound1stPos  = iStrPos;
            bFoundMatch   = true;
            iSubStrRunner = 1;

            //printf( "iStrPos=%d, cChar=%c, cCharTmp=%c, sStr.length=%d\n", iStrPos, sStr[ iStrPos ], cCharTmp, sStr.length );

            if ( sSubStrTmp.length == 1 ) return iFound1stPos;

            if ( iStrPos + 1 > sStr.length - 1 ) continue;

            for ( int iInnerLoop = iStrPos + 1; iInnerLoop < sStr.length; iInnerLoop++ )
            {
                cCharTmp = sStr[ iInnerLoop ];

                // If cChar is an uppercase ASCII, make it lowercase for the compare
                cCharTmp = ( cCharTmp >= '\x41' && cCharTmp <= '\x5A' ? cCharTmp + '\x20' : cCharTmp );

                // Match the rest of the characters in the Sub String is found within the Main String
                if ( cCharTmp == sSubStrTmp[ iSubStrRunner ] && bFoundMatch == true )
                {
                    iSubStrRunner++;
                    if ( iSubStrRunner > sSubStrTmp.length - 1 ) return iFound1stPos;
                    continue;
                }
                // Not all characters match, reset
                else if ( bFoundMatch == true )
                {
                    // Not a total match, reset back to defaults
                    iFound1stPos  = -1;
                    bFoundMatch   = false;
                    iSubStrRunner = 0;
                    break;
                }
            }
        }
    }

    return -1;

} // end int irfind( in char[], in char[] )

// Test ifind() and irfind() to find multiple of the same sub-string
int main()
{

    int    iStrPos;
    int    iSlicePos;
    char[] sStrTest  = "ApO 123355 PO Box 23, Waterpool Street Portland, Texas";

    printf( "Original Before = %.*s\n\n", sStrTest );

    iStrPos   = 0;
    iSlicePos = 0;

    while ( iSlicePos != -1 )
    {
        // A slice excludes its upper bound, so run it to sStrTest.length
        // to keep the last character in the search.
        iSlicePos = ifind( sStrTest[ iStrPos .. sStrTest.length ], "PO" );

        if ( iSlicePos != -1 )
        {
            printf( "Found \'PO\' at position with ifind()= %d\n", iStrPos + iSlicePos );
            iStrPos = iStrPos + iSlicePos + "PO".length;
        }
    }

    printf("\n\n");

    iStrPos  = sStrTest.length - 1;
    iSlicePos = 0;

    while ( iSlicePos != -1 && iStrPos >= 0 )
    {
        iSlicePos = irfind( sStrTest[ 0 .. iStrPos + 1 ], "PO" );

        if ( iSlicePos != -1 )
        {
            printf( "Found \'PO\' at position with irfind()= %d\n", iSlicePos );
            iStrPos = iSlicePos - "PO".length;
        }
    }

    printf("\n\n");
    printf( "Original After = %.*s\n", sStrTest );

    return 0;

} // end int main()

-------------------------------------------------------------------
"Dare to reach for the Stars...Dare to Dream, Build, and Achieve!"
June 09, 2004
Arcane Jill wrote:

<snip>
> The bit SLICE (or bit array, depending on your point of view) was never a
> problem (apart from the bugs). The bit /itself/ is the problem. Walter's
> suggestion will make bit slicing work, but the code below will still fall over:
> 
> 
>>      bit[] b;
>>      b.length = 64;
>>      bit* p = &b[3];
>>      *p = 1;
> 
> 
> Anywhere where you get a pointer to a bit, or a reference to a bit (and this
> includes passing a bit as an out or inout function parameter) you get a problem.

I thought that inout bits were already not supported.  Unless that's only in foreach....

> However, if Walter could make all of these situations compile-errors, he may
> have got it sussed!

Or have a bit offset in the bit pointer itself.  Which would turn it into a 35-bit object....

Of course, it could be 64-bit, in the form (byteAddress << 32) | (bitOffset << 29), which would make incrementing it a doddle....
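
A rough sketch of that 64-bit layout, assuming a present-day D compiler and 32-bit byte addresses. The function names are hypothetical; only the `(byteAddress << 32) | (bitOffset << 29)` packing comes from the suggestion above:

```d
import std.stdio;

// Hypothetical packing: byte address in the high 32 bits,
// bit offset (0..7) in bits 29..31, low 29 bits unused.
ulong makeBitPtr(uint byteAddress, uint bitOffset)
{
    return (cast(ulong) byteAddress << 32) | (cast(ulong) (bitOffset & 7) << 29);
}

uint byteAddressOf(ulong p) { return cast(uint) (p >> 32); }
uint bitOffsetOf(ulong p)   { return cast(uint) (p >> 29) & 7; }

// Stepping to the next bit is a single addition: the carry out of the
// 3-bit offset field moves the byte address on by one.
ulong nextBit(ulong p) { return p + (1UL << 29); }

void main()
{
    ulong p = makeBitPtr(100, 7);   // last bit of byte 100
    p = nextBit(p);                 // rolls over into byte 101, bit 0
    writefln("byte %d, bit %d", byteAddressOf(p), bitOffsetOf(p));
}
```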

Stewart.

-- 
My e-mail is valid but not my primary mailbox, aside from its being the unfortunate victim of intensive mail-bombing at the moment.  Please keep replies on the 'group where everyone may benefit.
June 09, 2004
In article <ca54is$2h2r$1@digitaldaemon.com>, David L. Davis says...

> sStr[ iStrPos ] + 0x20

Ah! Now these old ASCII habits really should be dropped. Hauke has written this magnificent charToUpper() routine. It should be used.

> I feel like a young Skywalker in training, learning how to best use "The Force!"

Other than that: Impressive - Obi-Wan has taught you well. (Hope I'm not too
discouraging).  :)

Jill


June 09, 2004
Haven't read any related postings before, so I don't know if I'm far off with my reply. Anyway, if you try to convert a lower char to an upper char in a clever and old-fashioned way, do it like this, young Jedi:

lower to upper -> var &= 0x5F;
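
A quick sketch of what that mask does, assuming a present-day D compiler; `maskToUpper` is a made-up name for illustration. The mask clears bit 5 (0x20) and bit 7, so it is only safe when the byte is already known to be an ASCII letter:

```d
import std.stdio;

// Hypothetical helper wrapping the "var &= 0x5F" trick.
char maskToUpper(char c)
{
    return cast(char) (c & 0x5F);
}

void main()
{
    writeln(maskToUpper('a'));   // a lowercase letter becomes uppercase
    writeln(maskToUpper('Q'));   // an uppercase letter is left alone

    // Anything that is not a letter gets silently corrupted:
    // '1' (0x31) turns into 0x11, a control character.
    writeln(cast(int) maskToUpper('1'));
}
```

So the trick needs a range check in front of it before it can be applied to arbitrary input.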

Chr. Grade

In article <ca71c4$2b8l$1@digitaldaemon.com>, Arcane Jill says...
>
>In article <ca54is$2h2r$1@digitaldaemon.com>, David L. Davis says...
>
>> sStr[ iStrPos ] + 0x20
>
>Ah! Now these old ASCII habits really should be dropped. Hauke has written this magnificent charToUpper() routine. It should be used.
>
>> I feel like a young Skywalker in training, learning how to best use "The Force!"
>
>Other than that: Impressive - Obi-Wan has taught you well. (Hope I'm not too
>discouraging).  :)
>
>Jill
>
>


June 09, 2004
Arcane Jill wrote:

> In article <ca54is$2h2r$1@digitaldaemon.com>, David L. Davis says...
> 
> 
>>sStr[ iStrPos ] + 0x20
> 
> 
> Ah! Now these old ASCII habits really should be dropped. Hauke has written this
> magnificent charToUpper() routine. It should be used.
<snip>

Except that that snippet converts upper to lower.

There's always

    sStr[iStrPos] + 'a' - 'A'

which'll work as long as the uppercase alphabet is a constant offset from the lowercase alphabet.

Stewart.
June 09, 2004
Stewart Gordon wrote:
> Arcane Jill wrote:
> 
>> In article <ca54is$2h2r$1@digitaldaemon.com>, David L. Davis says...
>>
>>
>>> sStr[ iStrPos ] + 0x20
>>
>>
>>
>> Ah! Now these old ASCII habits really should be dropped. Hauke has written this
>> magnificent charToUpper() routine. It should be used.
> 
> <snip>
> 
> Except that that snippet converts upper to lower.

Well, charToLower then.


> There's always
> 
>     sStr[iStrPos] + 'a' - 'A'
> 
> which'll work as long as the uppercase alphabet is a constant offset from the lowercase alphabet.

That is exactly the same as using 0x20 directly, since D's character literals are always Unicode (no codepage stuff involved). So 'a'-'A' always equals 0x20.

In any case, in Unicode upper and lower case characters do not have a constant offset to each other. That is only true for the ASCII subset.
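
A short sketch that makes the point visible, assuming a present-day Phobos where `std.uni.toLower` accepts a single `dchar`. The three sample letters are my own choice, and each one sits at a different distance from its lowercase form:

```d
import std.stdio;
import std.uni : toLower;

void main()
{
    // 'A' (U+0041), 'A with macron' (U+0100) and
    // 'Y with diaeresis' (U+0178), written as escapes.
    foreach (dchar upper; "\u0041\u0100\u0178"d)
    {
        dchar lower = toLower(upper);
        writefln("U+%04X -> U+%04X, offset %d",
                 cast(uint) upper, cast(uint) lower,
                 cast(int) lower - cast(int) upper);
    }
}
```

Only the ASCII letter comes out with the familiar 0x20 offset; the third one even maps to a lower code point.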

Hauke
June 09, 2004
Hauke Duden wrote:

<snip>
> In any case, in Unicode upper and lower case characters do not have a constant offset to each other. That is only true for the ASCII subset.

Yes, you do have a point there.  What's more, there isn't a 1:1 mapping between uppercase and lowercase characters.  And the mappings that there are aren't language independent.  So we can't write a single formula that'll correctly case-convert all text in all languages.

Stewart.