TypeInfo (page 4) - D Programming Language Discussion Forum

August 11, 2002

Posted by Sean L. Palmer
in reply to Walter

For streaming, you 99% of the time want to write a datum to a file, that will be read by a later run of that same program.  Sometimes a different program altogether.

I have a hard time imagining a scheme where the compiler could auto generate GUID's such that they stay the same from one compile to the next, let alone generate the same GUID on different machines if your friend decides to make a small change to your program, compile it, then try to read an existing data file generated by your program on your machine.

Yes, GUID's are necessary for COM.  However, COM is not the entire programming universe, and I don't see the entire programming industry standardizing on COM anytime soon.

GUID's are overkill for this problem...  that's the main point I'm trying to make.  I'd rather see the full string of the class/module name embedded in my datafiles than have GUID's.

Besides that, GUID's are large.  Why use 128 bits when 32 (or fewer!) bits would do?  Most programs will contain fewer than 100 classes.  Large programs may have 1000.

Sean

"Walter" <walter@digitalmars.com> wrote in message news:aj6fp2$2bi8$2@digitaldaemon.com...
>
> "Sean L. Palmer" <seanpalmer@earthlink.net> wrote in message news:aj5669$13bs$1@digitaldaemon.com...
> > GUIDs are a bad solution for aforementioned reasons.  Look at my earlier post in this thread.  If you go that route, the programmer has to generate GUID's for every class probably and those GUID's would I guess have to live in the source code.  If I wanted that I could do it myself.  I'd like to avoid that PITA in day-to-day work if possible.
>
> The GUIDs would be generated by default by the compiler, and so for intra-process stuff, should work fine. Only for cross-process etc. would they need to be hand generated. The need to support GUIDs to work with COM isn't going to go away, and 3 schemes coexisting together is too much.

August 11, 2002

Posted by Pavel Minayev
in reply to Sean L. Palmer

On Sun, 11 Aug 2002 13:58:25 -0700 "Sean L. Palmer" <seanpalmer@earthlink.net> wrote:

> I have a hard time imagining a scheme where the compiler could auto generate GUID's such that they stay the same from one compile to the next, let alone generate the same GUID on different machines if your friend decides to make a small change to your program, compile it, then try to read an existing data file generated by your program on your machine.

Yes, that's the problem.

> GUID's are overkill for this problem...  that's the main point I'm trying to make.  I'd rather see the full string of the class/module name embedded in my datafiles than have GUID's.

So far I don't see any other way but use module.class string. Only it can guarantee uniqueness.

> Besides that, GUID's are large.  Why use 128 bits when 32 (or fewer!) bits would do?  Most programs will contain fewer than 100 classes.  Large programs may have 1000.

Yes, but there's still a chance that two of those 100 classes might have the same CRC32 (or whatever hashfunc you decide to use).
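To put a rough number on that chance, here is a minimal sketch in Python (chosen only for illustration; the thread itself has no code). It assumes CRC32 values of class names behave like uniformly random 32-bit numbers, which is an approximation, and the function names `crc32_id`, `find_collisions` and `collision_probability` are made up for this example.

```python
import zlib


def crc32_id(qualified_name: str) -> int:
    """Return a 32-bit ID derived from a 'module.Class' string."""
    return zlib.crc32(qualified_name.encode("utf-8")) & 0xFFFFFFFF


def find_collisions(names):
    """Group names by CRC32 and return only the groups holding more than one name."""
    by_id = {}
    for name in names:
        by_id.setdefault(crc32_id(name), []).append(name)
    return {h: group for h, group in by_id.items() if len(group) > 1}


def collision_probability(n: int, bits: int = 32) -> float:
    """Probability that n uniformly random `bits`-bit IDs are not all distinct.

    Assumption: IDs are independent and uniform (the birthday problem).
    """
    space = 2 ** bits
    p_all_distinct = 1.0
    for i in range(n):
        p_all_distinct *= (space - i) / space
    return 1.0 - p_all_distinct


if __name__ == "__main__":
    for n in (100, 1000):
        print(n, "classes:", collision_probability(n))
```

Under that assumption the chance is on the order of one in a million for 100 classes and about one in ten thousand for 1000, so a collision is unlikely but not impossible. A build step could run something like `find_collisions` over all class names and fail loudly if it ever returns a non-empty result.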

August 13, 2002

Posted by Sean L. Palmer
in reply to Pavel Minayev

True re: the hash function.  Since identifiers are only allowed to use a small subset of valid ANSI characters, a couple bits per character could be eliminated.  In 32 bits with 6 bits per character you can get only 5 characters with no possibility of loss of information.  For 128 bits you can get 21 characters.
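The arithmetic above can be sketched as follows (Python, for illustration only). The alphabet is an assumption: underscore, digits and ASCII letters give 63 symbols, and code 0 is reserved as padding so that short names still decode correctly. `pack`, `unpack` and `capacity` are hypothetical names.

```python
import string

# 63 identifier characters; code 0 is reserved as padding,
# so all 64 codes fit in exactly 6 bits.
ALPHABET = "_" + string.digits + string.ascii_uppercase + string.ascii_lowercase
CODE = {ch: i + 1 for i, ch in enumerate(ALPHABET)}
BITS_PER_CHAR = 6


def capacity(total_bits: int) -> int:
    """Number of whole 6-bit characters that fit in total_bits."""
    return total_bits // BITS_PER_CHAR


def pack(name: str, total_bits: int = 32) -> int:
    """Pack an ASCII identifier losslessly into an integer of total_bits bits."""
    if len(name) > capacity(total_bits):
        raise ValueError("name too long for %d bits" % total_bits)
    value = 0
    for ch in name:
        if ch not in CODE:
            raise ValueError("character %r is outside the 6-bit alphabet" % ch)
        value = (value << BITS_PER_CHAR) | CODE[ch]
    return value


def unpack(value: int) -> str:
    """Inverse of pack: read 6-bit codes from the low end until none remain."""
    chars = []
    while value:
        chars.append(ALPHABET[(value & 0x3F) - 1])
        value >>= BITS_PER_CHAR
    return "".join(reversed(chars))
```

`capacity(32)` is 5 and `capacity(128)` is 21, matching the figures above. The sketch rejects anything outside the ASCII alphabet, so it does not handle Unicode identifiers.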

Ahh well just put the whole string in.  Pays to use short class and module names I guess.  ;)

Sean

August 13, 2002

Posted by Pavel Minayev
in reply to Sean L. Palmer

On Tue, 13 Aug 2002 00:36:04 -0700 "Sean L. Palmer" <seanpalmer@earthlink.net> wrote:

> True re: the hash function.  Since identifiers are only allowed to use a small subset of valid ANSI characters, a couple bits per character could be eliminated.  In 32 bits with 6 bits per character you can get only 5 characters with no possibility of loss of information.  For 128 bits you can get 21 characters.

And what will you do with Unicode identifiers? =)

> Ahh well just put the whole string in.  Pays to use short class and module names I guess.  ;)

Strings can be compressed, after all. Huffman or something.

August 14, 2002

Posted by anderson
in reply to Sean L. Palmer

What about writing info about the present class structure to disk in debug mode? When the class structure changes, you simply update the list, which should contain all the class info. To determine which is which, it's simply a matter of loading the list at initialization and handing out the numbers to the classes. This way you can use the entire 32 bits = 4294967296 entries per group. When you distribute the code you also distribute the data file. I mentioned something like this in a previous post.
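A minimal sketch of that idea, done as a library rather than in the compiler (Python, for illustration only). The JSON file format, the `ClassIdRegistry` class and its method names are all assumptions made for this example; nothing in the thread specifies them.

```python
import json
import os
import tempfile


class ClassIdRegistry:
    """Maps 'module.Class' names to small integer IDs that persist across runs."""

    def __init__(self, path):
        self.path = path
        self.ids = {}
        # Load the list written by earlier runs, if there is one.
        if os.path.exists(path):
            with open(path, "r", encoding="utf-8") as f:
                self.ids = json.load(f)

    def id_for(self, qualified_name):
        """Return the stored ID for a name, assigning the next free one if new."""
        if qualified_name not in self.ids:
            self.ids[qualified_name] = max(self.ids.values(), default=-1) + 1
            self._save()
        return self.ids[qualified_name]

    def name_for(self, class_id):
        """Reverse lookup, used when reading a data file back in."""
        for name, cid in self.ids.items():
            if cid == class_id:
                return name
        raise KeyError(class_id)

    def _save(self):
        with open(self.path, "w", encoding="utf-8") as f:
            json.dump(self.ids, f, indent=2, sort_keys=True)


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "class_ids.json")
    first_run = ClassIdRegistry(path)
    file_id = first_run.id_for("stream.File")
    second_run = ClassIdRegistry(path)  # simulates a later run of the program
    print(second_run.id_for("stream.File") == file_id)
```

Existing entries are never renumbered, so IDs already written into data files stay valid as long as the list file travels with the program.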

However, I would still like this to be a standard in the compiler. If I had time I'd write a standard lib to support it (although it wouldn't be as efficient as a compile-time version).

August 15, 2002

Posted by Sean L. Palmer
in reply to Pavel Minayev

"Pavel Minayev" <evilone@omen.ru> wrote in message news:CFN374820383640162@news.digitalmars.com...
> On Tue, 13 Aug 2002 00:36:04 -0700 "Sean L. Palmer" <seanpalmer@earthlink.net> wrote:
>
> > True re: the hash function.  Since identifiers are only allowed to use a small subset of valid ANSI characters, a couple bits per character could be eliminated.  In 32 bits with 6 bits per character you can get only 5 characters with no possibility of loss of information.  For 128 bits you can get 21 characters.
>
> And what will you do with Unicode identifiers? =)

Compile them, I hope!!  ;)

> > Ahh well just put the whole string in.  Pays to use short class and module names I guess.  ;)
>
> Strings can be compressed, after all. Huffman or something.

With pregenerated statistics about frequency of occurrences of symbols and pairs of symbols, they could actually compress quite well.  Probably just about as fast as calculating a CRC32.  Just as with the original string, the variable memory requirement is kinda annoying, though.
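As a rough illustration of compressing names against pregenerated statistics, here is a sketch of a static Huffman code in Python. It uses single-symbol frequencies only (not pairs), and the sample corpus `SAMPLE_NAMES`, the alphabet and all function names are assumptions invented for this example. Writer and reader must build the table from the same fixed statistics, which is why ties are broken deterministically.

```python
import heapq
import string
from collections import Counter
from itertools import count


def build_code(freqs):
    """Build a Huffman code (symbol -> bit string) from a fixed frequency table."""
    tiebreak = count()  # unique counter keeps heap ordering deterministic
    heap = [(f, next(tiebreak), sym) for sym, f in sorted(freqs.items())]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next(tiebreak), (left, right)))
    code = {}

    def walk(node, prefix):
        if isinstance(node, tuple):  # internal node: (left, right)
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:  # leaf: a single character
            code[node] = prefix or "0"

    walk(heap[0][2], "")
    return code


# Hypothetical "pregenerated statistics": every legal character gets a
# baseline count of 1, plus counts taken from a small sample of names.
ALPHABET = string.ascii_letters + string.digits + "_."
SAMPLE_NAMES = ["stream.Stream", "stream.File", "object.Object",
                "thread.Thread", "string.String"]
FREQS = Counter({ch: 1 for ch in ALPHABET})
FREQS.update("".join(SAMPLE_NAMES))
CODE = build_code(FREQS)
DECODE = {bits: ch for ch, bits in CODE.items()}


def encode(name: str) -> str:
    """Encode a name as a string of '0'/'1' characters."""
    return "".join(CODE[ch] for ch in name)


def decode(bits: str) -> str:
    """Decode by accumulating bits until they match a code word (prefix-free)."""
    out, current = [], ""
    for b in bits:
        current += b
        if current in DECODE:
            out.append(DECODE[current])
            current = ""
    return "".join(out)
```

The encoded length still varies with the name, so this shrinks the data but does not remove the variable memory requirement.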

Sean

August 15, 2002

Posted by Pavel Minayev
in reply to Sean L. Palmer

On Thu, 15 Aug 2002 01:43:44 -0700 "Sean L. Palmer" <seanpalmer@earthlink.net> wrote:

> With pregenerated statistics about frequency of occurrences of symbols and pairs of symbols, they could actually compress quite well.  Probably just about as fast as calculating a CRC32.  Just as with the original string, the variable memory requirement is kinda annoying, though.

Since length of D identifiers is not limited in specification, I guess there is no workaround. We'll have to live with it.

August 16, 2002

Posted by Walter
in reply to Sean L. Palmer

"Sean L. Palmer" <seanpalmer@earthlink.net> wrote in message news:aj6in3$2e87$1@digitaldaemon.com...
> For streaming, you 99% of the time want to write a datum to a file, that will be read by a later run of that same program.  Sometimes a different program altogether.

Yes.

> I have a hard time imagining a scheme where the compiler could auto generate GUID's such that they stay the same from one compile to the next, let alone generate the same GUID on different machines if your friend decides to make a small change to your program, compile it, then try to read an existing data file generated by your program on your machine.

Actually, autogenerated GUIDs should deliberately not be the same from compile to compile. Only if they are hardcoded will they be the same.
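The difference between the two kinds of GUID can be sketched in Python with the standard `uuid` module. Random (version 4) UUIDs stand in for autogenerated GUIDs, and a value derived from a hard-coded namespace plus the class name (version 5) stands in for a hardcoded one. The name-based variant is not something proposed in this thread; it is shown only as one way a stable value could be obtained, and `APP_NAMESPACE` and the function names are hypothetical.

```python
import uuid

# Hypothetical namespace GUID for one application; hard-coded so it never changes.
APP_NAMESPACE = uuid.UUID("12345678-1234-5678-1234-567812345678")


def autogenerated_guid() -> uuid.UUID:
    """Random (version 4) GUID: a fresh value on every call."""
    return uuid.uuid4()


def name_based_guid(qualified_name: str) -> uuid.UUID:
    """Name-based (version 5) GUID: the same name always gives the same value."""
    return uuid.uuid5(APP_NAMESPACE, qualified_name)


if __name__ == "__main__":
    print(autogenerated_guid() == autogenerated_guid())
    print(name_based_guid("stream.File") == name_based_guid("stream.File"))
```

A version tag can be folded into the name (for example a made-up `"stream.File#v2"`) to give each version of a class its own stable value.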

> Yes, GUID's are necessary for COM.  However, COM is not the entire programming universe, and I don't see the entire programming industry standardizing on COM anytime soon.

COM is actually a brilliant idea. The implementation of it went off course somewhere, but I don't think it was the GUIDs that did that.

> GUID's are overkill for this problem...  that's the main point I'm trying to make.  I'd rather see the full string of the class/module name embedded in my datafiles than have GUID's.

The full name doesn't allow for different versions of the same class.

> Besides that, GUID's are large.  Why use 128 bits when 32 (or fewer!) bits would do?  Most programs will contain fewer than 100 classes.  Large programs may have 1000.

I don't think you can guarantee uniqueness with only 32 bits without knowing every other D class.

August 16, 2002

Posted by Walter
in reply to Pavel Minayev

"Pavel Minayev" <evilone@omen.ru> wrote in message news:CFN374836441665856@news.digitalmars.com...
> Since length of D identifiers is not limited in specification, I guess there is no workaround. We'll have to live with it.

Unfortunately, I'm stuck a bit with the OMF limitations on name lengths. But it's fairly large.

August 16, 2002

Posted by anderson
in reply to Walter

> I don't think you can guarantee uniqueness with only 32 bits without knowing every other D class.

What if you restricted it to just a group of classes (put each class in a group)? That way you'd only need to store unique IDs within that group of classes.

I also suggested a solution (however inefficient) to this in a previous email, using the class names and saving them to disk between compiles. This method wouldn't rely on compression. However, this technique could be done by the programmer or in a standard lib.

PS - Is it possible to get the name of a class as a string in C++?

Copyright © 1999-2021 by the D Language Foundation