Thread overview
String convention
Jul 01, 2006
Niklas Ulvinge
Jul 01, 2006
Kirk McDonald
Jul 01, 2006
Niklas Ulvinge
Jul 01, 2006
Frits van Bommel
Jul 01, 2006
Frits van Bommel
Jul 01, 2006
Derek Parnell
Jul 01, 2006
Derek Parnell
Jul 01, 2006
Niklas Ulvinge
Jul 03, 2006
Don Clugston
Jul 01, 2006
Hasan Aljudy
July 01, 2006
I couldn't find any info about it, so I'm asking here...

I just looked at D and it sounds rather interesting.

Now to my question:
Strings in D start with some data that defines the length of the string.
Why did they decide to use this approach?
What is this 'data' at the beginning of the string?

This has a limitation: strings can't be longer than 'data' allows. Is there a way around this?


An idea I got when I wrote a dynamic array (in C) was to use s[-1] as the size
of array s (and s[-2] for the capacity, but that isn't necessary here...).

Couldn't this be used with strings?
Then this would work:
string s = "IDK\0";
printf("%s",s);
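
The s[-1] idea above can be sketched in D with raw pointers. This is only an illustration of the length-prefix layout, not how D strings work; it assumes a current D2 compiler, and makePrefixed, prefixedLength and freePrefixed are hypothetical helper names. The size is stored as one size_t in front of the text rather than as a char, so it is read through a size_t pointer at index -1.

```d
import core.stdc.stdio : printf;
import core.stdc.stdlib : free, malloc;
import core.stdc.string : memcpy;

// Hypothetical helper: allocates [length][chars...]['\0'] as one block and
// returns a pointer to the first char, so the length sits just before the text.
char* makePrefixed(const(char)[] text)
{
    auto raw = cast(size_t*) malloc(size_t.sizeof + text.length + 1);
    if (raw is null)
        return null;
    raw[0] = text.length;              // header: the size
    auto s = cast(char*)(raw + 1);     // text starts right after the header
    memcpy(s, text.ptr, text.length);
    s[text.length] = '\0';             // terminator, so C functions still work
    return s;
}

// Reads the header that sits one size_t before the text.
size_t prefixedLength(const(char)* s)
{
    return (cast(const(size_t)*) s)[-1];
}

// Frees the whole block, starting at the header.
void freePrefixed(char* s)
{
    free((cast(size_t*) s) - 1);
}

void main()
{
    char* s = makePrefixed("IDK");
    assert(prefixedLength(s) == 3);
    printf("%s\n", s);
    freePrefixed(s);
}
```

The cost of this layout is that a substring cannot share memory with its parent, because every string needs its own header in front of its characters.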

Niklas Ulvinge
aka IDK wishes
everyone happy
programming!!!
July 01, 2006
Niklas Ulvinge wrote:
> I couldn't find any info about it, so I'm asking here...
> 
> I just looked at D and it sounds rather interesting.
> 
> Now to my question:
> Strings in D start with some data that defines the length of the string.
> Why did they decide to use this approach?
> What is this 'data' at the beginning of the string?
> 

This is an implementation detail, and shouldn't matter to your code.

> This has a limitation: strings can't be longer than 'data' allows.
> Is there a way around this?
> 

I think you are somewhat confused. Strings in D are dynamic arrays of type char. They may be of any length, so long as you have enough RAM.

http://www.digitalmars.com/d/arrays.html

> 
> An idea I got when I wrote a dynamic array (in C) was to use s[-1] as the size
> of array s (and s[-2] for the capacity, but that isn't necessary here...).
> 
> Couldn't this be used with strings?
> Then this would work:
> string s = "IDK\0";

The D syntax is:

char[] s = "IDK";

The \0 is not needed as strings in D are not null-terminated. The length of the string may be retrieved with "s.length".
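
A minimal sketch of the point about termination, assuming a current D2 compiler (where the built-in string type replaces the char[] declaration shown above). Because the length lives outside the character data, a zero char is an ordinary element and does not end the string.

```d
import std.stdio : writeln;

void main()
{
    string s = "IDK";        // no "\0" needed
    assert(s.length == 3);

    string t = "ID\0K";      // a zero char is just another element
    assert(t.length == 4);
    assert(t[2] == '\0');

    writeln(s, " has ", s.length, " chars");
}
```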

> printf("%s",s);
> 
> Niklas Ulvinge
> aka IDK wishes
> everyone happy
> programming!!!


-- 
Kirk McDonald
Pyd: Wrapping Python with D
http://dsource.org/projects/pyd/wiki
July 01, 2006
"Niklas Ulvinge" <Niklas_member@pathlink.com> wrote in message news:e86km2$12ar$1@digitaldaemon.com...

> This has a limitation: strings can't be longer than 'data' allows. Is there a way around this?

Keep in mind that the "length" member of an array is as wide as the machine word, so the longest possible array would take up the entire memory space :S

> printf("%s",s);

Never ever ever ever use printf() in D.  Please.  The spec uses it profusely, but that's because a lot of the examples were written before std.stdio.writefln() existed.  Use that instead.  You don't even need a format string with writefln, e.g.

import std.stdio;

...

char[] s = "hi";
writefln(s);
writefln(4, ", ", 5);
writefln("hello");


July 01, 2006
In article <e86l86$1364$1@digitaldaemon.com>, Kirk McDonald says...
>
>Niklas Ulvinge wrote:
>> I couldn't find any info about it, so I'm asking here...
>> 
>> I just looked at D and it sounds rather interesting.
>> 
>> Now to my question:
>> Strings in D start with some data that defines the length of the string.
>> Why did they decide to use this approach?
>> What is this 'data' at the beginning of the string?
>> 
>
>This is an implementation detail, and shouldn't matter to your code.
>
>> This has a limitation: strings can't be longer than 'data' allows. Is there a way around this?
>> 
>
>I think you are somewhat confused. Strings in D are dynamic arrays of type char. They may be of any length, so long as you have enough RAM.
>
>http://www.digitalmars.com/d/arrays.html
>
>> 
>> An idea I got when I wrote a dynamic array (in C) was to use s[-1] as the size
>> of array s (and s[-2] for the capacity, but that isn't necessary here...).
>> 
>> Couldn't this be used with strings?
>> Then this would work:
>> string s = "IDK\0";
>
>The D syntax is:
>
>char[] s = "IDK";
>
>The \0 is not needed as strings in D are not null-terminated. The length of the string may be retrieved with "s.length".
>

Thanks for clearing things up a bit.

First, if the data doesn't have a terminator, then it can't have an arbitrarily big (infinite)
size.
This is because s.length needs to be a variable, and that variable has to
have a size, which makes it impossible to make arbitrarily big arrays.

I only need to rephrase my questions...
What type is s.length (or, for that matter, any dynamic array's size)?
Or are dynamic arrays implemented in a different way?

Niklas Ulvinge
aka IDK wishes
everyone happy
programming!!!
July 01, 2006
Kirk McDonald wrote:
> Niklas Ulvinge wrote:
>> I just looked at D and it sounds rather interesting.

Always good to hear.

>> Now to my Q:
>> Strings in D start with some data that defines the length of the string.

Actually, the /reference/ to the (dynamic) string begins with that data (i.e. what in C would be the pointer to it is in D twice as long, the first half containing the length).
With static strings, the length is encoded in the type (i.e. a char[3] has length 3) and doesn't need to be stored separately.

>> Why did they decide to use this approach?

Having constant-time access to the length of a string makes a lot of operations more efficient. Having it be separate from the string data enables you to do cool things like efficient string slicing. (see http://www.digitalmars.com/d/arrays.html#slicing )

Oh, and everything I'm saying about strings goes for *any* array type.
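
A small sketch of why the separate length makes slicing cheap, assuming a current D2 compiler (string literals are immutable there, hence the .dup to get a mutable char[]). A slice is just a new length/pointer pair that refers to the same characters.

```d
import std.stdio : writeln;

void main()
{
    char[] s = "hello world".dup;    // mutable copy of the literal
    char[] w = s[6 .. 11];           // slice: no characters are copied

    assert(w == "world");
    assert(w.length == 5);           // read straight from the reference
    assert(w.ptr == s.ptr + 6);      // points into the same memory

    w[0] = 'W';                      // writing through the slice...
    assert(s == "hello World");      // ...changes the original

    // The reference itself is as big as a length plus a pointer.
    static assert((char[]).sizeof == 2 * size_t.sizeof);

    int[] a = [1, 2, 3, 4];
    assert(a[1 .. 3] == [2, 3]);     // the same works for any element type

    writeln(s, " / ", w);
}
```

With a null-terminated or length-prefixed layout, taking "world" out of "hello world" would work only for a suffix, or would require a copy.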

>> What is this 'data' at the beginning of the string?

An integer value: the length of the string.

> This is an implementation detail, and shouldn't matter to your code.
> 
>> This has a limitation: strings can't be longer than 'data' allows.

'data' is a size_t. The 'limit' you refer to is therefore /at least/ (the maximum size of addressable memory) - 1 (more on machines with weird pointer types or for arrays of elements with size > 1).

>> Is there a way around this?

Buy a computer that can address more memory at the same time (like a 64-bit one, if you're coming from a 32-bit machine) and make sure your compiler generates appropriate code (i.e. use a 64-bit-aware compiler for a 64-bit platform).

>> Couldn't this be used with strings?
>> Then this would work:
>> string s = "IDK\0";
> 
> The D syntax is:
> 
> char[] s = "IDK";
> 
> The \0 is not needed as strings in D are not null-terminated. The length of the string may be retrieved with "s.length".

Well, he wants to use it with printf("%s", ...), so then adding a null terminator at the end would probably be a good idea:

>> printf("%s",s);

Though, of course, writef is a better alternative.
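
If printf really has to be used, two ways of handing it a D string can be sketched as follows. This assumes a current D2 compiler with Phobos; std.string.toStringz returns a pointer to null-terminated data, and the "%.*s" form passes the length explicitly so no terminator is needed.

```d
import core.stdc.stdio : printf;
import std.stdio : writefln;
import std.string : toStringz;

void main()
{
    string s = "IDK";

    // Option 1: get a null-terminated C string first.
    printf("%s\n", toStringz(s));

    // Option 2: tell printf how many chars to read; no terminator needed.
    printf("%.*s\n", cast(int) s.length, s.ptr);

    // The D way: the length travels with the string.
    writefln("%s", s);
}
```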
July 01, 2006
On Sun, 02 Jul 2006 06:07:30 +1000, Niklas Ulvinge <Niklas_member@pathlink.com> wrote:

> I couldn't find any info about it, so I'm asking here...
>
> I just looked at D and it sounds rather interesting.
>
> Now to my question:
> Strings in D start with some data that defines the length of the string.


No, that is a misunderstanding. Strings in D are variable-length arrays of characters, and a variable-length array consists of two data items. The first is the array reference, a pseudo-struct with two members: the length (uint of 32 bits) and a pointer to the first array element (void *). The second is the array data itself, a contiguous block of RAM that holds at least the number of elements specified in the 'length' member.

But you as a coder don't need to worry about this because the compiler handles all the manipulation for you.
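
For anyone who does want to look, the two-member reference can be inspected with a hand-written mirror struct. This is a sketch under assumptions: ArrayRef and viewOf are hypothetical names, the layout (length first, then pointer) is the one the D ABI documents for dynamic arrays, and the length is declared as size_t so the code also holds on 64-bit targets. It targets a current D2 compiler.

```d
import std.stdio : writeln;

// Hypothetical mirror of the array reference: length first, then pointer.
struct ArrayRef
{
    size_t length;
    void*  ptr;
}

// Reinterprets the bits of a char[] reference as an ArrayRef.
ArrayRef viewOf(char[] s)
{
    return *cast(ArrayRef*) &s;
}

void main()
{
    static assert(ArrayRef.sizeof == (char[]).sizeof);

    char[] s = "IDK".dup;
    ArrayRef r = viewOf(s);

    assert(r.length == s.length);            // same number the compiler reports
    assert(r.ptr == cast(void*) s.ptr);      // same address as the first char

    writeln("length = ", r.length, ", data at ", r.ptr);
}
```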

> Why did they decide to use this approach?

Because it makes for very fast and flexible dynamic arrays. Slices become easy to implement and fast.

> What is this 'data' at the beginning of the string?

There is no data at the beginning of the string data. There is a separate array reference though.

> This has a limitation: strings can't be longer than 'data' allows.

Currently, UTF-8 strings are limited to 4 gigabytes. This might change on 64-bit architectures. But if you are dealing with strings that big you probably need to rethink your algorithms anyhow ;-)

> Is there a way around this?

Solve the problem when you get to it. Are you actually running into limitations already?

> An idea I got when I wrote a dynamic array (in C) was to use s[-1] as the size
> of array s (and s[-2] for the capacity, but that isn't necessary here...).

That's right, it isn't.

> Couldn't this be used with strings?
> Then this would work:
> string s = "IDK\0";
> printf("%s",s);

Do not use the C function 'printf'. Use the D function 'writef' and your formatting issues will disappear.

alias string char[];
string s = "IDK";
writef("%s", s);

-- 
Derek Parnell
Melbourne, Australia
July 01, 2006
See also my other post, made just before I saw this one. This is a summary.

Niklas Ulvinge wrote:
> In article <e86l86$1364$1@digitaldaemon.com>, Kirk McDonald says...
> Thanks for clearing things up a bit.
> 
> First, if the data doesn't have a terminator, then it can't have an arbitrarily big (infinite)
> size.

Nor can it *with* a terminator. Your computer's memory has a finite size, deal with it :).

> This is because s.length needs to be a variable, and that variable has to
> have a size, which makes it impossible to make arbitrarily big arrays.
> 
> I only need to rephrase my questions...
> What type is s.length (or, for that matter, any dynamic array's size)?

Should be a size_t.
size_t.max >= (addressable memory).sizeof - 1
So the theoretical maximum length string is as large as it could be with a terminator. Assuming your OS lets you use that much, which it typically won't.
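
The claim about the type can be checked at compile time. A minimal sketch, assuming a current D2 compiler: the static asserts fail to compile if .length is not a size_t or if size_t is not pointer-sized on the target.

```d
import std.stdio : writeln;

void main()
{
    string s = "IDK";

    // .length has type size_t, and size_t is as wide as a pointer.
    static assert(is(typeof(s.length) == size_t));
    static assert(size_t.sizeof == (void*).sizeof);

    // The printed values depend on the target (32-bit or 64-bit).
    writeln(size_t.sizeof * 8, "-bit length, max value ", size_t.max);
}
```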
July 01, 2006
On Sun, 02 Jul 2006 07:13:30 +1000, Derek Parnell <derek@psych.ward> wrote:

Oops! I just woke up and haven't had my morning coffee yet ;)

The alias should be

 alias char[] string;


-- 
Derek Parnell
Melbourne, Australia
July 01, 2006
D strings are dynamic arrays.
The "length" does not limit anything. You can change it at any time, however you like.

char[] s = "....."; //some string

.
.
.

s.length = s.length + XYZ; //change the length of the array to whatever thing you like
s = "exxxaa"; //or like this

I can't think of any limitation.

P.S. Don't use printf with D. Use writef instead, and don't forget to import std.stdio
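
A runnable sketch of resizing, assuming a current D2 compiler (the literals need .dup there to become mutable char[]). Growing the length adds default-initialized elements, which for char is char.init (0xFF) rather than zero.

```d
void main()
{
    char[] s = "abc".dup;

    s.length = s.length + 2;         // grow: new elements are char.init
    assert(s.length == 5);
    assert(s[0 .. 3] == "abc");
    assert(s[3] == char.init);

    s.length = 2;                    // shrink: keeps the first two chars
    assert(s == "ab");

    s ~= "cd";                       // append: the runtime reallocates if needed
    assert(s == "abcd");

    s = "exxxaa".dup;                // or point s at different data entirely
    assert(s.length == 6);
}
```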

July 01, 2006
Thanks for all the replies, now I understand most of what I wanted to know (although the question about the internal structure of dynamic arrays still remains...).


In article <op.tb03wsb06b8z09@ginger.vic.bigpond.net.au>, Derek Parnell says...
>But you as a coder don't need to worry about this because the compiler handles all the manipulation for you.
>

I think like the 'real programmers' ;) :
"Real programmers can write assembly language in any language"

This is very hard to do in D, but really easy in C.

Take the foreach statement as an example.
In D, the compiler handles the implementation.
I want to know how it is implemented.


In languages where "a" + "b" = "ab" works, there could be programmers who don't
see that concatenating is much more complex than adding a couple of numbers.
In D this is a little better, because it's hard to find the concatenation char (I
don't have it now, because of an odd bug in Firefox).
In C/C++ this is better still, because it was a function, which indicated how hard it was
to do.

Some programmers may, instead of using:
writef(a,b,c)
concatenate them, which would be very bad.
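
Both points can be sketched in code, assuming a current D2 compiler. For a plain array, foreach behaves roughly like an index loop over 0 .. length (a conceptual equivalence, not a claim about the exact code the compiler emits). The concatenation operator is ~, and it builds a new array; std.stdio.write and writeln accept several arguments and output them in turn, which is the modern counterpart of the writef(a,b,c) call above.

```d
import std.stdio : writeln;

void main()
{
    string a = "con", b = "cat", c = "enate";

    // foreach over an array visits the same elements as an index loop.
    size_t sum1, sum2;
    foreach (ch; a)
        sum1 += ch;
    for (size_t i = 0; i < a.length; ++i)
        sum2 += a[i];
    assert(sum1 == sum2);

    // ~ allocates a new array and copies all three pieces into it.
    string joined = a ~ b ~ c;
    assert(joined == "concatenate");
    assert(joined.ptr !is a.ptr);    // not the same memory as a

    // Passing the pieces separately avoids building the joined string.
    writeln(a, b, c);
}
```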

