October 12, 2003
"Hauke Duden" <H.NS.Duden@gmx.net> wrote in message news:bmbgpn$19tm$1@digitaldaemon.com...
> That might be possible for functions in the runtime lib, but what about 3rd party C libraries?
>
> As I understand it, one of the major design goals of D is to be able to easily interact with existing C code. And I agree with Matthew that the need for this kind of manual conversion is very error-prone. I think this is definitely a problem that needs to be addressed.
>
> The more I think about it, the more I realize that we'll need a real string class that takes care of such issues. Maybe that class could always add a terminating zero that is not included in the length. That way we maintain compatibility with all the existing string libraries and can still pull off nifty stuff like having embedded zeros if you use your strings only with pure D code.
>
> Another thing: what about OS functions? The whole Win32 API expects zero-terminated strings. And we cannot wrap everything into D functions, can we? It gets even worse with COM interfaces. Since these use an object oriented approach with a virtual function table, you cannot replace individual functions with wrappers. You would have to wrap the whole object with all its interfaces - which isn't possible, since you might not know all the interfaces it supports!

You can use char* to interface with C code. They'll work fine as null terminated strings.


October 12, 2003
"Walter" <walter@digitalmars.com> wrote in message news:bmc4uf$23qj$1@digitaldaemon.com...
> You can use char* to interface with C code. They'll work fine as null terminated strings.

That's not what I meant. The main problem is that D strings are by default not null-terminated. If they were, then all the array operations (concatenation, etc.) wouldn't produce proper strings. So you have to add a terminating null to your strings just before passing them to a C function, and remove it whenever you get a string back from a C function. Which is an error-prone and strenuous thing to do.
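
A minimal sketch of that round trip, written in C++ rather than D so that the C side of the boundary is visible. `Slice`, `to_c_string` and `from_c_string` are hypothetical names made up for this illustration, not part of any D or C library; `Slice` only stands in for a length-specified array such as a D `char[]`.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for a D char[]: pointer plus length, no terminator implied.
struct Slice {
    const char* ptr;
    std::size_t len;
};

// Going out to C: copy the slice into storage that ends in '\0'.
std::string to_c_string(Slice s) {
    return std::string(s.ptr, s.len);  // c_str() on the result is null-terminated
}

// Coming back from C: measure up to the terminator and leave it out of the length.
Slice from_c_string(const char* p) {
    return Slice{p, std::strlen(p)};
}
```

Every call into C needs the first helper and every string coming back needs the second, which is the bookkeeping being objected to here.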

You suggested that a solution to this issue would be to create D versions of all C string functions, so that they can handle non-null-terminated strings. This is what my reply was about - I tried to point out that this would be a huge and (in the case of COM) sometimes impossible task.

Hauke


October 12, 2003
"Hauke Duden" <H.NS.Duden@gmx.net> wrote in message news:bmca96$2b66$1@digitaldaemon.com...
> "Walter" <walter@digitalmars.com> wrote in message news:bmc4uf$23qj$1@digitaldaemon.com...
> > You can use char* to interface with C code. They'll work fine as null terminated strings.
> That's not what I meant. The main problem is that D strings are by default not null-terminated.

I see what you mean now, but that's not strictly true. The null termination in C strings is by convention, it has nothing to do with the C core language other than "literal strings" are null terminated. Literal strings in D are null terminated, as well (the null is just not reflected in the .length property). Hence, if you use char*, and take care to follow the C conventions with it, just as you would in C, it will work like it does in C.
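
The literal case can be checked from the C side. The following is only a small C++ sketch with illustrative names (`literal`, `manual`, `literal_storage`, `literal_length`); it shows the C/C++ behaviour, not the D compiler's.

```cpp
#include <cstddef>
#include <cstring>

// A literal "abc" occupies four bytes: 'a', 'b', 'c' and the implicit '\0'.
const char literal[] = "abc";

// An array built by hand has no terminator unless the programmer adds one.
const char manual[3] = {'a', 'b', 'c'};

// Storage size counts the terminator...
std::size_t literal_storage() { return sizeof(literal); }

// ...while the conventional length stops just before it.
std::size_t literal_length() { return std::strlen(literal); }
```

Passing `manual` to `strlen` would read past the end of the array, which is the sense in which termination is a convention the programmer has to uphold.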

> If they were, then all the array operations (concatenation, etc.) wouldn't produce proper strings. So you have to add a terminating null to your strings just before passing them to a C function, and remove it whenever you get a string back from a C function. Which is an error-prone and strenuous thing to do.

Actually, you have to frequently manually insert the 0 in C strings too when programming in C. It's error-prone and tedious (though not strenuous <g>).

> You suggested that a solution to this issue would be to create D versions of all C string functions, so that they can handle non-null-terminated strings. This is what my reply was about - I tried to point out that this would be a huge and (in the case of COM) sometimes impossible task.

You're right, it would be impractical for COM. But COM also has multiple representations for strings, like BSTR, LPSTR, OLESTR, etc. There's no way to paper over all these things, one needs to examine each COM API function when using it to be sure the right kind of string is passed. So I suggest when using COM interfaces with C style null-terminated strings, use char*'s (or wchar*'s or BSTR's or whatever) and use null-terminated strings in D. It won't be any more work than it would be in C.

I don't think there's a practical way to have strings be both length specified and null terminated without throwing away much of the benefit of length specified strings, and without creating all kinds of odd cases where it doesn't work right anyway.


October 12, 2003
"Walter" <walter@digitalmars.com> wrote in message news:bmcd3q$2er9$1@digitaldaemon.com...
> I see what you mean now, but that's not strictly true. The null termination in C strings is by convention, it has nothing to do with the C core language other than "literal strings" are null terminated.

Yeah, you're right. However, it is a pretty strong convention, since almost all C functions, both in the RTL and in 3rd party libraries, expect strings to be null-terminated.

> Literal strings in D are
> null terminated, as well (the null is just not reflected in the .length
> property).

That's good to know! Is it guaranteed, or is it only a quirk in the current implementation?

My main point, however, is that to be easily usable with C functions all strings would have to be null-terminated, not just literal ones. The more I think about this, the more I'm certain that there should be a standard string class that overloads the operators to achieve this.

> > You suggested that a solution to this issue would be to create D versions of all C string functions, so that they can handle non-null-terminated strings. This is what my reply was about - I tried to point out that this would be a huge and (in the case of COM) sometimes impossible task.
>
> You're right, it would be impractical for COM. But COM also has multiple representations for strings, like BSTR, LPSTR, OLESTR, etc. There's no way to paper over all these things, one needs to examine each COM API function when using it to be sure the right kind of string is passed.

You're right. Though most COM methods use wide char strings, so it might be tempting to just pass a wchar array and forget the null-terminator handling.

> I don't think there's a practical way to have strings be both length specified and null terminated without throwing away much of the benefit of length specified strings, and without creating all kinds of odd cases where it doesn't work right anyway.

I think it is possible. I have done something like that in C++ before. The main problem is that slicing a part out of an existing string should be done without creating a copy of the data. This can be dealt with by postponing the appending of a null terminator until a C compatible string is actually needed. So you would be able to keep the benefits of length-specified strings as long as you don't pass them into C code.
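
A rough C++ sketch of the deferred-terminator idea, under stated assumptions: the class name `LazyString` and its members are hypothetical, this is not the implementation referred to above, and `slice` assumes `begin <= end <= length()` without checking it.

```cpp
#include <cstddef>
#include <cstring>
#include <memory>
#include <string>

// Hypothetical LazyString: a length-specified view onto shared character data.
class LazyString {
public:
    explicit LazyString(const std::string& s)
        : data_(std::make_shared<std::string>(s)), off_(0), len_(s.size()) {}

    std::size_t length() const { return len_; }

    // Slicing shares the buffer; no characters are copied.
    LazyString slice(std::size_t begin, std::size_t end) const {
        LazyString r(*this);
        r.off_ = off_ + begin;
        r.len_ = end - begin;
        r.cache_.reset();  // the new slice needs its own terminated copy, if any
        return r;
    }

    // A terminated copy is made only here, and only when the slice does not
    // already end where the shared buffer (and its terminator) ends.
    const char* c_str() const {
        if (off_ + len_ == data_->size())
            return data_->c_str() + off_;
        if (!cache_)
            cache_ = std::make_shared<std::string>(*data_, off_, len_);
        return cache_->c_str();
    }

private:
    std::shared_ptr<std::string> data_;          // shared, never modified
    std::size_t off_, len_;                      // the view into data_
    mutable std::shared_ptr<std::string> cache_; // filled lazily by c_str()
};
```

Code that never calls `c_str()` pays nothing for C compatibility; the copy happens once per slice, at the point where the string actually crosses into C.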

I think this is a pretty important issue that needs to be solved as soon as possible. I'll try and implement a D string class that makes this kind of thing easier when I can find some time.

Hauke

