April 14, 2005
How to bridge the gap between user defined types and built in types?
I think this needs a new thread.  Here is a snip of what I previously 
posted...


I got to thinking about what languages try to achieve by allowing 
operator overloading. I think they try to make user defined types 
behave exactly like intrinsics.  This is especially important for a 
language that allows templates, as your templates should work with both 
built in types and user defined types.  A while ago Derek ran across a 
problem (thread title "opValue()") where his templates wouldn't work with 
aggregates because "=" does a reference copy, when he in fact wanted a 
deep copy.
I think that this is a similar problem - do programmers want their 
operators to do a reference compare, or a deep compare?

In thinking along these lines I realised that D (at the moment) does not 
allow you to create user defined types that behave exactly as built in 
types, because user defined types (i.e. classes) are references, and built 
in types are not.  It looks to me like there is a huge change in 
thinking required between using built in types and user defined types.
Now, for most objects I think that references are correct and efficient. 
However, objects that are designed to look and act like built in 
types (e.g. a BigNum implementation) will NEVER behave exactly 
like built ins.

So I think the real question might be - how do we bridge the gap between 
user defined types and built in types?  This gap doesn't only show up in 
operators, but in passing values/references to functions.
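
The gap described above can be reproduced in a few lines. This is only a sketch in current D syntax (which postdates this thread); the names CounterClass, CounterStruct and assignCopies are made up for illustration and do not come from the original posts.

```d
// Hypothetical types for illustration; neither appears in the thread.
class CounterClass   { int n; }
struct CounterStruct { int n; }

// Returns true if "T copy = original" produced an independent copy,
// i.e. changing the copy leaves the original untouched.
bool assignCopies(T)(T original)
{
    T copy = original;       // value copy for structs, reference copy for classes
    copy.n = original.n + 1; // modify through the copy
    return copy.n != original.n;
}

void main()
{
    int a = 1;
    int b = a;               // built in type: b is a separate value
    b = 2;
    assert(a == 1);

    assert(assignCopies(CounterStruct(1)));   // struct: independent copy
    assert(!assignCopies(new CounterClass));  // class: both names share one object
}
```

The same template body gives different answers depending on whether T is a value type or a class, which is the kind of surprise generic code runs into.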

...

I'll now solicit opinions and thoughts from the floor :)
I don't know enough about OO, or do enough pure software programming to 
know the right answer.

Brad
April 14, 2005
Re: How to bridge the gap between user defined types and built in types?
brad@domain.invalid wrote:
> I think this needs a new thread.  Here is a snip of what I previously 
> posted...
> 
> 
> I got to thinking about what languages try to achieve with allowing 
> operator overloading, I think that they try to make user defined types 
> behave exactly as intrinsics.  This is especially important for a 
> language that allows templates, as your templates should work with built 
> in types and user defined types.  A while ago Derek ran across a problem 
>   (thread title "opValue()") where his templates wouldn't work with 
> aggregates because "=" does a reference copy, when he infact wanted a 
> deep copy.
> I think that this is a similar problem - do programmers want their 
> operators to do a reference compare, or a deep compare?
> 
> In thinking along these lines I realised that D (at the moment) does not 
> allow you to create user defined types that behave exactly as built in 
> types, because user defined types (ie classes) are references, and built 
> in types are not.  It looks to me like there is a huge change in 
> thinking required between using built in types and user defined types.
> Now, for most objects I think that references are correct and efficient. 
>  However for the objects that are designed to look at act like built in 
> types (ie, BigNum implementation), then they will NEVER behave exactly 
> like built ins.
> 
> So I think the real question might be - how do we bridge the gap between 
> user defined types and built in types?  This gap doesn't only show up in 
> operators, but in passing values/references to functions.
> 
> ....
> 
> I'll now solicit opinions and thoughts from the floor :)
> I don't know enough about OO, or do enough pure software programming to 
> know the right answer.
> 
> Brad

Isn't a struct exactly what you want? It can have methods and operators 
and whatnot, yet has value semantics instead of reference semantics...
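
A rough sketch of the struct suggestion follows, assuming current D operator overloading syntax (opBinary; the 2005 compiler spelled this differently). The Fixed type is hypothetical and only stands in for a number-like user defined type.

```d
// Hypothetical number-like value type.
struct Fixed
{
    long raw;

    // a + b produces a new value and leaves both operands alone.
    Fixed opBinary(string op)(Fixed rhs) const
        if (op == "+")
    {
        return Fixed(raw + rhs.raw);
    }

    // Ordering for <, <=, >, >=.
    int opCmp(const Fixed rhs) const
    {
        return raw < rhs.raw ? -1 : (raw > rhs.raw ? 1 : 0);
    }
}

void main()
{
    Fixed a = Fixed(1);
    Fixed b = a;          // value semantics: b is an independent copy
    b.raw = 5;
    assert(a.raw == 1);   // a was not affected

    assert(a + b == Fixed(6)); // default == on structs compares members
    assert(a < b);
}
```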
April 14, 2005
Re: How to bridge the gap between user defined types and built in types?
xs0 wrote:
> brad@domain.invalid wrote:
> 
>> I think this needs a new thread.  Here is a snip of what I previously 
>> posted...
>>
>>
>> I got to thinking about what languages try to achieve with allowing 
>> operator overloading, I think that they try to make user defined types 
>> behave exactly as intrinsics.  This is especially important for a 
>> language that allows templates, as your templates should work with 
>> built in types and user defined types.  A while ago Derek ran across a 
>> problem   (thread title "opValue()") where his templates wouldn't work 
>> with aggregates because "=" does a reference copy, when he infact 
>> wanted a deep copy.
>> I think that this is a similar problem - do programmers want their 
>> operators to do a reference compare, or a deep compare?
>>
>> In thinking along these lines I realised that D (at the moment) does 
>> not allow you to create user defined types that behave exactly as 
>> built in types, because user defined types (ie classes) are 
>> references, and built in types are not.  It looks to me like there is 
>> a huge change in thinking required between using built in types and 
>> user defined types.
>> Now, for most objects I think that references are correct and 
>> efficient.  However for the objects that are designed to look at act 
>> like built in types (ie, BigNum implementation), then they will NEVER 
>> behave exactly like built ins.
>>
>> So I think the real question might be - how do we bridge the gap 
>> between user defined types and built in types?  This gap doesn't only 
>> show up in operators, but in passing values/references to functions.
>>
>> ....
>>
>> I'll now solicit opinions and thoughts from the floor :)
>> I don't know enough about OO, or do enough pure software programming 
>> to know the right answer.
>>
>> Brad
> 
> 
> Isn't a struct exactly what you want? It can have methods and operators 
> and whatnot, yet has value semantics instead of reference semantics...

Except that structs don't have constructors and destructors... which 
makes them second class objects.
April 14, 2005
Solution - Re: How to bridge the gap between user defined types and built in types?
"Jan-Eric Duden" <jeduden@whisset.com> wrote in message news:425E4730.5080908@whisset.com...
> xs0 wrote:
> > brad@domain.invalid wrote:
> >
> >> I think this needs a new thread.  Here is a snip of what I previously
> >> posted...
> >>
> >>
> >> I got to thinking about what languages try to achieve with allowing
> >> operator overloading, I think that they try to make user defined types
> >> behave exactly as intrinsics.  This is especially important for a
> >> language that allows templates, as your templates should work with
> >> built in types and user defined types.  A while ago Derek ran across a
> >> problem   (thread title "opValue()") where his templates wouldn't work
> >> with aggregates because "=" does a reference copy, when he infact
> >> wanted a deep copy.
> >> I think that this is a similar problem - do programmers want their
> >> operators to do a reference compare, or a deep compare?
> >>
> >> In thinking along these lines I realised that D (at the moment) does
> >> not allow you to create user defined types that behave exactly as
> >> built in types, because user defined types (ie classes) are
> >> references, and built in types are not.  It looks to me like there is
> >> a huge change in thinking required between using built in types and
> >> user defined types.
> >> Now, for most objects I think that references are correct and
> >> efficient.  However for the objects that are designed to look at act
> >> like built in types (ie, BigNum implementation), then they will NEVER
> >> behave exactly like built ins.
> >>
> >> So I think the real question might be - how do we bridge the gap
> >> between user defined types and built in types?  This gap doesn't only
> >> show up in operators, but in passing values/references to functions.
> >>
> >> ....
> >>
> >> I'll now solicit opinions and thoughts from the floor :)
> >> I don't know enough about OO, or do enough pure software programming
> >> to know the right answer.
> >>
> >> Brad
> >
> >
> > Isn't a struct exactly what you want? It can have methods and operators
> > and whatnot, yet has value semantics instead of reference semantics...
>
> Except that structs don't have constructors and destructors... which
> makes them second class objects.

[----]

I posted a solution to this problem in another thread, but it's such a long thread that most people probably skipped all or most of it.   I know I only skimmed it myself, and missed the beginning of the thread entirely.

Anyway, here it is again, since I think it should answer your question well (with additions and improvements)...

Wherever the default opCmp() function (and therefore the "==" operator) currently compares addresses rather than contents, change opCmp() to proceed as follows when comparing (first_item == second_item), in this order, exiting as soon as a return value has been established...

If both items being compared are void, return 0 (equals).

If first_item is void then return -1 (less than).

If second_item is void then return 1 (greater than).

Compare identities and return 0 (equals) if identical.

Compare sizes and scan for the first difference within the smaller size number of bytes.

If no differences are found, return the sign of the difference between the sizes.

Otherwise return the sign of the difference at that location where the first difference was found.
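
The steps above can be sketched roughly as follows. This is an illustration, not the behaviour of any D compiler: items are modelled as byte slices, "void" is modelled as a null slice, and the name defaultCompare is invented for the example.

```d
// Sketch of the proposed default comparison over raw contents.
// Assumption: a null slice stands for a "void" item.
int defaultCompare(const(ubyte)[] first, const(ubyte)[] second)
{
    // Both void, then only one of them void.
    if (first is null && second is null) return 0;
    if (first is null)  return -1;
    if (second is null) return 1;

    // Same identity: same storage and same size.
    if (first.ptr is second.ptr && first.length == second.length) return 0;

    // Scan the shorter length for the first difference.
    const n = first.length < second.length ? first.length : second.length;
    foreach (i; 0 .. n)
    {
        if (first[i] != second[i])
            return first[i] < second[i] ? -1 : 1; // sign of the first difference
    }

    // No difference found: sign of the difference between the sizes.
    return first.length < second.length ? -1
         : (first.length > second.length ? 1 : 0);
}

void main()
{
    ubyte[] a = [1, 2, 3];
    ubyte[] b = [1, 2, 4];
    ubyte[] c = [1, 2];

    assert(defaultCompare(null, null) == 0);
    assert(defaultCompare(null, a) == -1);
    assert(defaultCompare(a, null) == 1);
    assert(defaultCompare(a, a) == 0);      // identical item
    assert(defaultCompare(a, a.dup) == 0);  // different item, same contents
    assert(defaultCompare(a, b) == -1);     // differs at the last byte
    assert(defaultCompare(a, c) == 1);      // equal prefix, a is longer
}
```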

Furthermore, for any items where the "is" operator currently compares the contents by default, change it to compare the locations they are stored at instead, if known.  As nonsensical as that may seem, it would at least be consistent functionality for the operator, and there may even turn out to be emergent uses for it.

Perhaps some simple override could allow programmers to change "is" back to comparing the contents rather than the addresses for things like numeric literals, for those who choose to use it that way, or to support backward compatibility... but I would consider that a minor issue.  If the location of one or more of the items being compared is unknown (if this is even possible, but I don't see how it could be), then default to comparing the value or contents of the items.

This way, all of the comparison operators would function predictably in a way that could be defined independent of what is being compared.  The comparison would be efficient when possible, and would always be an accurate and predictable comparison that can be described without knowledge of what kinds of things are being compared.

The least efficient case would be when the entire contents of the two items are searched for differences, and that would only happen if no differences were found before that and the two items were not one and the same.  This could actually be accomplished rather quickly even if the items have very large contents, since the default comparison would be a very straightforward comparison independent of what kind of items are being compared.

Optionally (this part may need some input) an exception could be raised if the contents of two items are found to be identical but no cast from one item to the other is known, since this would in some ways indicate that they may not really be "equal" after all.

There would no longer be ambiguous overlap between the "==" operator and the "is" operator, as each would have a clearly defined default function.  In cases where the old functionality is wanted, or where a faster or otherwise different functionality is wanted, the operators could still be overridden, and any built-in overrides for certain kinds of items could be made where possible to conform to the same criteria of using "is" to compare addresses or locations for equality and the operators based on opCmp() to compare values or contents for equality or inequality.

Perhaps "is" could also have a corresponding "is<" and "is>" added to give more information... or maybe "precedes", "follows" and "isnt" but for now, I think what I have offered here can be converted into an actual solution to the current ambiguity and overlap of the "is" and "==" functionality.

Okay... I'm tired, and I know I've made some mistakes in typing this... some of them potentially rather substantial... for example, I am pretty sure I made a reference to an integer NAN value.  If so... take that to mean something like "raise an exception" or whatever would make sense, because I'm just too tired to go back and edit it.

I'm sending this as is, because I don't want to risk passing out at the keyboard from exhaustion, and hopefully it will still turn out to be helpful.  It's been a rough day... but tomorrow's another one.  Oh, that didn't come out right at all.  Hehe.  Goodnight.

TechnoZeus
April 14, 2005
Re: Solution - Re: How to bridge the gap between user defined types and built in types?
I'm afraid that only solves portions of the problem as I see it.  The 
main dividing line is that all builtin objects (and structs) are always 
value, but Objects are always by reference.
Which means
 - the assignment operator "=" can behave differently
 - function arguments can be either value or reference

And other things I am sure are broken as well.  This kind of thing 
mostly (I think) bites you when writing templates that can take both 
built-in types and user defined types.

Brad
April 14, 2005
Re: Solution - Re: How to bridge the gap between user defined types
In article <d3mjsr$1obd$1@digitaldaemon.com>, brad@domain.invalid says...
>
>I'm afraid that only solves portions of the problem as I see it.  The 
>main dividing line is that all builtin objects (and structs) are always 
>value, but Objects are always by reference.
>Which means
>  - the assignment operator "=" can behave differently
>  - function arguments can be either value or reference
>
>And other things I am sure are broken as well.  This kind of thing 
>mostly (I think) bites you when writing templates that can take both 
>built-in types and user defined types.
>

Just a thought: would having ".dup" available universally help the assignment
problem?

int a,b;
Integer c,d;

a = b.dup; // duplicate the value of b
c = d.dup; // duplicate the value of object d ('deep' copy)

I would expect the signature "T opDup()" to be used in case of objects;
providing it as a base member on Object would force the return type to "Object"
(messy).

As for function arguments, you could try using "inout" for all parameters to
ensure that value types are given reference semantics.  It's not the best
workaround, but it gets the job done.
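
A minimal sketch of the two ideas above, under some assumptions: ".dup" on every type is approximated by a free function template called duplicate (a made-up name), the Integer class is hypothetical, and by-reference parameters use "ref", which is how current D spells what the 2005 compiler called "inout". Array slices are not handled here.

```d
// Hypothetical class standing in for the post's "Integer".
class Integer
{
    int value;
    this(int v) { value = v; }
    Integer dup() { return new Integer(value); } // deep copy
}

// Free function usable as x.duplicate on any type through UFCS.
T duplicate(T)(T x)
{
    static if (is(T == class))
        return x is null ? null : x.dup(); // assumes the class defines dup()
    else
        return x; // plain value types: the by-value parameter is already a copy
}

// By-reference parameter: the caller's variable is modified.
void addOne(ref int x) { ++x; }

void main()
{
    int b = 7;
    int a = b.duplicate;      // duplicate the value of b
    assert(a == 7);

    auto d = new Integer(3);
    auto c = d.duplicate;     // duplicate the object d refers to
    assert(c !is d);          // a different object...
    c.value = 9;
    assert(d.value == 3);     // ...so d is unaffected

    int n = 1;
    addOne(n);
    assert(n == 2);
}
```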

- EricAnderton at yahoo
April 14, 2005
Re: How to bridge the gap between user defined types and built in types?
>  However for the objects that are designed to look at act like built in
> types (ie, BigNum implementation), then they will NEVER behave exactly 
> like built ins.

I didn't have many problems mimicking a primitive numeric type when 
implementing the "big num" gmp wrapper. It's a class so you have to 'new' 
them, so that's one difference. Another is that toString is a member 
function instead of a top-level function. It would be nice to make it easier 
to write template code that works for primitives, structs and classes 
when it makes sense, but I see that as a post-1.0 feature. Since the 
operator overloading returns new objects there isn't a problem of reference 
vs value. In general big nums work just like ints or other primitives.
Check out http://home.comcast.net/~benhinkle/gmp-d/
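
To illustrate the point about operators returning new objects, here is a small sketch. It is not the gmp wrapper linked above; Big is a hypothetical class that wraps a long instead of an arbitrary-precision value, and it uses current D operator syntax.

```d
// Hypothetical stand-in for a big number class.
class Big
{
    private long v; // a real implementation would hold arbitrary precision data

    this(long v) { this.v = v; }

    long value() const { return v; }

    // Each operator allocates a fresh result, so operands are never modified.
    Big opBinary(string op)(Big rhs)
        if (op == "+" || op == "*")
    {
        return new Big(mixin("v " ~ op ~ " rhs.v"));
    }
}

// Generic code that only uses operators works for int and Big alike.
T sumOfSquares(T)(T a, T b)
{
    return a * a + b * b;
}

void main()
{
    assert(sumOfSquares(3, 4) == 25);

    auto x = new Big(3);
    auto y = new Big(4);
    auto r = sumOfSquares(x, y);
    assert(r.value == 25);
    assert(x.value == 3 && y.value == 4); // operands untouched
}
```

As long as the template never assigns into an existing object, the reference versus value difference does not show.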
April 14, 2005
Re: Solution - Re: How to bridge the gap between user defined types
"pragma" <pragma_member@pathlink.com> wrote in message news:d3ml8g$1pgv$1@digitaldaemon.com...
> In article <d3mjsr$1obd$1@digitaldaemon.com>, brad@domain.invalid says...
> >
> >I'm afraid that only solves portions of the problem as I see it.  The
> >main dividing line is that all builtin objects (and structs) are always
> >value, but Objects are always by reference.
> >Which means
> >  - the assignment operator "=" can behave differently
> >  - function arguments can be either value or reference
> >
> >And other things I am sure are broken as well.  This kind of thing
> >mostly (I think) bites you when writing templates that can take both
> >built-in types and user defined types.
> >
>
> Just a thought: would having ".dup" available universally help the assignment
> problem?
>
> int a,b;
> Integer c,d;
>
> a = b.dup; // duplicate the value of b
> c = d.dup; // duplicate the value of object d ('deep' copy)
>
> I would expect the signature "T opDup()" to be used in case of objects;
> providing it as a base member on Object would force the return type to "Object"
> (messy).
>
> As for function arguments, you could try using "inout" for all parameters to
> ensure that value types are given reference semantics.  It's not the best
> workaround, but it gets the job done.
>
> - EricAnderton at yahoo

How about giving all items, including literals, two properties as follows?
.contents
and
.address

such that .contents always means the internal value or contents of an item, and .address always means the location of that item in memory?

For example, if X is an integer, then (X.address == 0) would compare the address of X with the integer value 0, while (X.contents == 0) would compare the integer value of X with the integer value 0.  If Y is an object, then (Y.address == 0) would compare the address in memory where the contents of Y are stored to the integer value 0, while (Y.contents == 0) would by default compare the bit pattern of the contents of Y with the bit pattern of the numeric literal 0.

This concept can actually be generalized to the point where one can code things like "if (12.address >= "hello".address){printf("the string was stored at a lower memory address than the number.");}" and there it allows instant removal of ambiguity no matter what kinds of things you're working with, as well as the ability to copy the contents of any object with a simple assignment such as "Y.contents = Z.contents" without wondering if you're going to end up making Y and Z point to the same object.
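
Something close to this can be approximated in library code. The sketch below is only an approximation under stated assumptions: addressOf and copyContents are invented helper names, not existing D properties, the Point class is hypothetical, and literals such as 12 are not covered because a "ref" parameter needs a variable.

```d
// Hypothetical class used for the demonstration.
class Point
{
    int x, y;
    this(int x, int y) { this.x = x; this.y = y; }
}

// Roughly ".address": where the item's contents live in memory.
const(void)* addressOf(T)(ref T item)
{
    static if (is(T == class))
        return cast(const(void)*) item; // the object, not the reference variable
    else
        return &item;
}

// Roughly "Y.contents = Z.contents" for class objects: copy field by field
// (only the fields declared in T itself), leaving dst a distinct object.
void copyContents(T)(T dst, T src)
    if (is(T == class))
{
    foreach (i, _; src.tupleof)
        dst.tupleof[i] = src.tupleof[i];
}

void main()
{
    int a = 5;
    int b = 5;
    assert(a == b);                        // same contents
    assert(addressOf(a) !is addressOf(b)); // different locations

    auto p = new Point(1, 2);
    auto q = p;                            // second name for the same object
    auto r = new Point(0, 0);
    assert(addressOf(p) is addressOf(q));

    copyContents(r, p);
    assert(r.x == 1 && r.y == 2);          // contents copied
    assert(addressOf(r) !is addressOf(p)); // still two separate objects
}
```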


Of course, I still think that the ambiguity should be removed from operators like "is" and "==" as much as reasonably possible, but a set of universal attributes like this would also help to facilitate that goal.

TechnoZeus
April 14, 2005
Re: Solution - Re: How to bridge the gap between user defined types
> How about giving all items, including litterals, two properties as follows?
> .contents
> and
> .address
> 
> such that .conents always means the internal value or contents of an item, and .address always means the location of that item in memory?
> 
> For example, if X is an integer, then (X.address == 0) would compare the address of X with the integer value 0, while (X.contents == 0) would conpare the integer value of X with the integer value 0, and Y is an object, then (Y.address == 0) would compare the address in memory where the contents of Y are stored to the integer value 0, while (Y.contents == 0) would by default compare the bit pattern of the contents of Y with the bit pattern of the numeric litteral 0.
> 
> This concept can actually be generallized to the point where one can code things like "if (12.address >= "hello".address){printf("the string was stored at a lower memory address than the number.");}" and there it allows instant removal of ambiguity no matter what kinds of things you're working with, as well as the ability to copy the contents of any object with a simple assignment such as "Y.contents = Z.contents" without wondering if you're going to end up making Y and Z point to the same object.
> 
> 
> Of course, I still think that the ambiguity should be removed from operators like "is" and "==" as much as reasonably possible, but a set of universal attributes like this would also help to facilitate that goal.
> 
> TechnoZeus
> 
> 
That has a lot of appeal, but it certainly is a lot of typing.  It would be 
nice to always have those properties, so that in the cases where you need to 
be sure you can be totally unambiguous.  I agree that "==" and "is" should 
be kept as unambiguous as possible.
BTW - this is a problem that every language I can think of has - built 
in primitives are value, objects are generally reference/pointer.  Maybe 
it would be interesting to make the builtins by reference by default?

Brad
April 14, 2005
Re: Solution - Re: How to bridge the gap between user defined types
I haven't worked with it since it was still in its beta development, so I'm not sure if they retained it, but I know the C# language "had" this problem worked out years ago.  Their solution was to allow all literals to be treated as objects when the programmer chose to do so, and to add a set of properties similar to what I have mentioned here, which were a part of every object.  By allowing literals to be boxed up as objects when an object based property access was attempted, they made it unnecessary to distinguish between them other than for the sake of writing efficient code.

This is my point.  Make it consistent and easy to learn.  Those who have gotten past the newbie phase can then concentrate on writing more efficient code while those who haven't can concentrate on graduating to the next level.  This goes not only for the programming language as a whole, but also for any part of it that is still to any reasonable degree unfamiliar to a particular person, and by extension, to the tools used by programmers to facilitate programming in the language.

TZ

<brad@domain.invalid> wrote in message news:d3ms27$1vb1$1@digitaldaemon.com...
>
> > How about giving all items, including litterals, two properties as follows?
> > .contents
> > and
> > .address
> >
> > such that .conents always means the internal value or contents of an item, and .address always means the location of that item in memory?
> >
> > For example, if X is an integer, then (X.address == 0) would compare the address of X with the integer value 0, while (X.contents == 0) would conpare the integer value of X with the integer value 0, and Y is an object, then (Y.address == 0) would compare the address in memory where the contents of Y are stored to the integer value 0, while (Y.contents == 0) would by default compare the bit pattern of the contents of Y with the bit pattern of the numeric litteral 0.
> >
> > This concept can actually be generallized to the point where one can code things like "if (12.address >= "hello".address){printf("the string was stored at a lower memory address than the number.");}" and there it allows instant removal of ambiguity no matter what kinds of things you're working with, as well as the ability to copy the contents of any object with a simple assignment such as "Y.contents = Z.contents" without wondering if you're going to end up making Y and Z point to the same object.
> >
> >
> > Of course, I still think that the ambiguity should be removed from operators like "is" and "==" as much as reasonably possible, but a set of universal attributes like this would also help to facilitate that goal.
> >
> > TechnoZeus
> >
> >
> That has a lot of appeal, but certainly is a lot of typing.  It would be
> nice to always have those properties so that in the cases you need to be
> sure you can be totally unambigious.  I agree that "==" and "is" should
> be kept & as un-ambigious as possible.
> BTW - this is a problem that every language I can think of has - built
> in primitives are value, objects are generally reference/pointer.  Maybe
> it would be interesting to make the builtins defaultly by reference?
>
> Brad