Interesting Research Paper on Constructors in OO Languages (page 2) - D Programming Language Discussion Forum


July 16, 2013

Re: Interesting Research Paper on Constructors in OO Languages

Posted by H. S. Teoh
in reply to Dicebot

On Tue, Jul 16, 2013 at 11:18:31AM +0200, Dicebot wrote:
> On Tuesday, 16 July 2013 at 08:19:10 UTC, H. S. Teoh wrote:
> >Just discovered a new trick in D: struct inheritance using alias this. :)
> 
> Wasn't this stated in TDPL as one of primary design rationales behind "alias this"? :)

Haha, you're right. I read it before but apparently the only thing that stuck in my mind is that alias this is to allow a type to masquerade as another type. But looking at the relevant sections again, Andrei did describe it as "subtyping", both w.r.t. classes and structs. Touché. :)
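
For readers who have not seen the idiom, here is a minimal sketch of struct "subtyping" via alias this. The names `BaseArgs`, `DerivedArgs` and `readBase` are made up for illustration and do not come from TDPL or the thread:

```d
import std.stdio;

// Hypothetical "base" struct.
struct BaseArgs
{
    int basefield = 1;
}

// "Derived" struct: embeds a BaseArgs and forwards to it via alias this.
struct DerivedArgs
{
    BaseArgs base;
    alias base this;   // unresolved members and conversions fall through to base
    int parm = 3;
}

// Accepts only the "base" type.
int readBase(BaseArgs args)
{
    return args.basefield;
}

void main()
{
    DerivedArgs d;
    d.basefield = 10;                 // really d.base.basefield
    assert(d.base.basefield == 10);
    assert(readBase(d) == 10);        // implicit conversion to BaseArgs
    writeln("parm = ", d.parm, ", basefield = ", d.basefield);
}
```

The conversion only goes one way: a `BaseArgs` cannot be passed where a `DerivedArgs` is expected, which is what makes this behave like subtyping.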

T

-- 
Mediocrity has been pushed to extremes.

July 16, 2013

Re: Interesting Research Paper on Constructors in OO Languages

Posted by H. S. Teoh

On Tue, Jul 16, 2013 at 01:17:30AM -0700, H. S. Teoh wrote: [...]
> 	mixin template BuilderArgs(string fields) {
> 		struct Args {
> 			typeof(super).Args base;
> 			alias base this;
> 			mixin(fields);
> 		}
> 	};
> 
> 	class MyClass : BaseClass {
> 	public:
> 		// Hmm, doesn't look too bad!
> 		mixin BuilderArgs!(q{
> 			int parm1 = 1;
> 			int parm2 = 2;
> 		});
> 		this(Args args) {
> 			super(args);
> 			...
> 		}
> 	}
> 
> 	class AnotherClass : BaseClass {
> 	public:
> 		// N.B. Looks exactly the same like MyClass.args except
> 		// for the fields! The template automatically picks up
> 		// the right base class Args to "inherit" from.
> 		mixin BuilderArgs!(q{
> 			string anotherparm1 = "abc";
> 			string anotherparm2 = "def";
> 		});
> 		this(Args args) {
> 			super(args);
> 			...
> 		}
> 	}
> 
> Not bad at all!  Though, I haven't actually tested any of this code, so I've no idea if it will actually work yet. But it certainly looks promising! I'll give it a spin tomorrow morning (way past my bedtime now).
[...]

Yep, confirmed that this code actually works! Here's the actual test code that I wrote:

	import std.stdio;

	mixin template CtorArgs(string fields) {
		struct Args {
			static if (!is(typeof(super) == Object)) {
				typeof(super).Args base;
				alias base this;
			}
			mixin(fields);
		}
	}

	class Base {
	private:
		int sum;
	public:
		mixin CtorArgs!(q{
			int basefield1 = 1;
			int basefield2 = 2;
		});
		this(Args args) {
			sum = args.basefield1 + args.basefield2;
		}
		int getResult() {
			return sum;
		}
	}

	class Derived : Base {
		int derivedSum;
	public:
		mixin CtorArgs!(q{
			int parm1 = 3;
			int parm2 = 4;
		});
		this(Args args) {
			super(args);
			derivedSum = args.parm1 + args.parm2;
		}
		override int getResult() {
			return super.getResult() + derivedSum;
		}
	}

	class AnotherDerived : Base {
	private:
		int anotherSum;
	public:
		mixin CtorArgs!(q{
			int another1 = 5;
			int another2 = 6;
		});
		this(Args args) {
			super(args);
			anotherSum = args.another1 + args.another2;
		}
		override int getResult() {
			return super.getResult() + anotherSum;
		}
	}

	// Test usage in a deeper hierarchy
	class VeryDerived : AnotherDerived {
		int divisor;
	public:
		mixin CtorArgs!(q{
			int divisor = 5;
		});
		this(Args args) {
			super(args);
			this.divisor = args.divisor;
		}
		override int getResult() {
			return super.getResult() / divisor;
		}
	}

	void main() {
		Derived.Args args1;
		args1.basefield1 = 10;
		args1.parm1 = 20;
		auto obj1 = new Derived(args1);
		assert(obj1.getResult() == 10 + 2 + 20 + 4);

		AnotherDerived.Args args2;
		args2.basefield2 = 20;
		args2.another1 = 30;
		auto obj2 = new AnotherDerived(args2);
		assert(obj2.getResult() == 1 + 20 + 30 + 6);

		VeryDerived.Args args3;
		args3.divisor = 7;
		auto obj3 = new VeryDerived(args3);
		assert(obj3.getResult() == 2);
	}

Note the nice thing about this: you can construct the ctor arguments (har har) in any order you like, and it Just Works. Referencing ctor parameters of base class ctors is just as easy; no need for ugliness like "args.base.base.base.baseparm1" thanks to alias this.  The ctors themselves just hand Args over to the base class: alias this makes the struct inheritance pretty transparent. The mixin line itself is identical across the board, thanks to the static if in the mixin template, so you can actually re-root the class hierarchy or otherwise move classes around the hierarchy without having to re-wire any of the Args handling, and things will Just Work.

Wow. So not only does this technique work in D, it works much *better* than my original C++ code! I think I shall add this to my personal D library. :) (Unless people think this is Phobos material.)


T

-- 
People who are more than casually interested in computers should have at least some idea of what the underlying hardware is like. Otherwise the programs they write will be pretty weird. -- D. Knuth

July 16, 2013

Re: Interesting Research Paper on Constructors in OO Languages

Posted by Wyatt
in reply to Craig Dillabaugh

On Tuesday, 16 July 2013 at 13:35:00 UTC, Craig Dillabaugh wrote:
> How do you envision this working where Name or Age must be set to
> a value not known at compile time?

I'm not sure if it's practical or covers all the bases, but it sounds like you would need to keep track of member initialisation during compilation, and abort if the code attempts to use the object or one of its members as an AssignExpression without initialising the whole thing.

Setting aside the fact that there's compiler work mentioned at all, have I missed some nuance of this pattern?  I guess there's the situation where you conditionally may or may not assign, or pass it around and accrete mutations, so it might be best to only do it for some properly-annotated (how?) subset of the whole?  Not sure.

-Wyatt

July 16, 2013

Re: Interesting Research Paper on Constructors in OO Languages

Posted by Craig Dillabaugh
in reply to Wyatt

On Tuesday, 16 July 2013 at 16:07:30 UTC, Wyatt wrote:
> On Tuesday, 16 July 2013 at 13:35:00 UTC, Craig Dillabaugh wrote:
>> How do you envision this working where Name or Age must be set to
>> a value not known at compile time?
>
> I'm not sure if it's practical or covers all the bases, but it sounds like you would need to keep track of member initialisation during compilation, and abort if the code attempts to use the object or one of its members as an AssignExpression without initialising the whole thing.
>
> Setting aside the fact that there's compiler work mentioned at all, have I missed some nuance of this pattern?  I guess there's the situation where you conditionally may or may not assign, or pass it around and accrete mutations, so it might be best to only do it for some properly-annotated (how?) subset of the whole?  Not sure.
>
> -Wyatt

July 16, 2013

Re: Interesting Research Paper on Constructors in OO Languages

Posted by Craig Dillabaugh
in reply to Wyatt

On Tuesday, 16 July 2013 at 16:07:30 UTC, Wyatt wrote:
> On Tuesday, 16 July 2013 at 13:35:00 UTC, Craig Dillabaugh wrote:
>> How do you envision this working where Name or Age must be set to
>> a value not known at compile time?
>
> I'm not sure if it's practical or covers all the bases, but it sounds like you would need to keep track of member initialisation during compilation, and abort if the code attempts to use the object or one of its members as an AssignExpression without initialising the whole thing.
>
> Setting aside the fact that there's compiler work mentioned at all, have I missed some nuance of this pattern?  I guess there's the situation where you conditionally may or may not assign, or pass it around and accrete mutations, so it might be best to only do it for some properly-annotated (how?) subset of the whole?  Not sure.
>
> -Wyatt

Sorry for the empty post (previous).  In general, I think the proposed idea is quite nice, and as Dicebot pointed out, my initial concern was misguided because the invariant is evaluated at runtime, not compile time (and Dicebot, I checked the docs, and you are correct about it getting stripped for release builds).

July 16, 2013

Re: Interesting Research Paper on Constructors in OO Languages

Posted by Regan Heath
in reply to Craig Dillabaugh

On Tue, 16 Jul 2013 14:34:59 +0100, Craig Dillabaugh <cdillaba@cg.scs.careton.ca> wrote:

> On Tuesday, 16 July 2013 at 09:47:35 UTC, Regan Heath wrote:
>
> clip
>
>>
>> We have class invariants.. these define the things which must be initialised to reach a valid state.  If we had compiler recognisable properties as well, then we could have an initialise construct like..
>>
>> class Foo
>> {
>>   string name;
>>   int age;
>>
>>   invariant
>>   {
>>     assert(name != null);
>>     assert(age > 0);
>>   }
>>
>>   property string Name...
>>   property int Age...
>> }
>>
>> void main()
>> {
>>   Foo f = new Foo() {
>>     Name = "test",    // calls property Name setter
>>     Age = 12          // calls property Age setter
>>   };
>> }
>>
>> The compiler could statically verify that the variables tested in the invariant (name, age) were set (by setter properties) inside the initialise construct {} following the new Foo().
>>
>> R
>
> How do you envision this working where Name or Age must be set to
> a value not known at compile time?

The idea isn't to run the invariant itself at compile time - as you say, a runtime only value may be used.  In fact, in the example above the compiler would have to hold off running the invariant until the closing } of that initialise statement or it may fail.

The idea was to /use/ the code in the invariant to determine which member fields should be set during the initialisation statement and then statically verify that a call was made to some member function to set them.  The actual values set aren't important, just that some attempt has been made to set them.  That's about the limit of what I think you could do statically, in the general case.

In some specific cases we could extend this to say that if all the values set were evaluable at compile time, then we could actually run the invariant using CTFE, perhaps.

R

-- 
Using Opera's revolutionary email client: http://www.opera.com/mail/

July 16, 2013

Re: Interesting Research Paper on Constructors in OO Languages

Posted by Jérôme M. Berger
in reply to Regan Heath

Regan Heath wrote:
> Or, perhaps another way to ask a similar Q is.. can the compiler statically verify that a create-set-call style object has been initialised, or rather that an attempt has at least been made to initialise all the required parts.
> 
	Here's a way to do it in Scala:
http://blog.rafaelferreira.net/2008/07/type-safe-builder-pattern-in-scala.html

	Basically, the builder object is a generic that has a boolean
parameter for each mandatory parameter. Setting a parameter casts
the builder object to the same generic with the corresponding
boolean set to true. And the "build" method is only available when
the type system recognizes that all the booleans are true.

	Note however that this will not work if you try to mutate the
builder instance. IOW, this will work (assuming you only need to
specify foo and bar):

> auto instance = builder().withFoo (1).withBar ("abc").build();

but this won't work:

> auto b = builder();
> b.withFoo (1);
> b.withBar ("abc");
> auto instance = b.build();

	Something similar should be doable in D (although I'm a bit afraid
of the template bloat it might create…)
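
A minimal sketch of how the Scala idea might translate to D, assuming only two mandatory parameters. `Thing`, `Builder` and `builder` are hypothetical names, and this is not a port of the linked article's code:

```d
// Hypothetical product type with two mandatory parameters.
struct Thing
{
    int foo;
    string bar;
}

// One boolean template parameter per mandatory field.
struct Builder(bool hasFoo, bool hasBar)
{
    private int foo;
    private string bar;

    // Each setter returns a *different* instantiation with its flag set to true.
    Builder!(true, hasBar) withFoo(int value)
    {
        return Builder!(true, hasBar)(value, bar);
    }

    Builder!(hasFoo, true) withBar(string value)
    {
        return Builder!(hasFoo, true)(foo, value);
    }

    // build() only exists once every flag is true.
    static if (hasFoo && hasBar)
    {
        Thing build()
        {
            return Thing(foo, bar);
        }
    }
}

auto builder()
{
    return Builder!(false, false)();
}

void main()
{
    auto instance = builder().withFoo(1).withBar("abc").build();
    assert(instance == Thing(1, "abc"));

    // Missing withBar: there is no build() member to call.
    static assert(!__traits(compiles, builder().withFoo(1).build()));

    // Mutating style: b keeps its original type, so build() never appears.
    static assert(!__traits(compiles, {
        auto b = builder();
        b.withFoo(1);
        b.withBar("abc");
        return b.build();
    }));
}
```

With n mandatory parameters this can produce up to 2^n instantiations of `Builder`, which is the template bloat concern in concrete form.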

		Jerome
-- 
mailto:jeberger@free.fr
http://jeberger.free.fr
Jabber: jeberger@jabber.fr

July 16, 2013

Re: Interesting Research Paper on Constructors in OO Languages

Posted by H. S. Teoh
in reply to Regan Heath

On Tue, Jul 16, 2013 at 06:17:48PM +0100, Regan Heath wrote:
> On Tue, 16 Jul 2013 14:34:59 +0100, Craig Dillabaugh <cdillaba@cg.scs.careton.ca> wrote:
> 
> >On Tuesday, 16 July 2013 at 09:47:35 UTC, Regan Heath wrote:
> >
> >clip
> >
> >>
> >>We have class invariants.. these define the things which must be initialised to reach a valid state.  If we had compiler recognisable properties as well, then we could have an initialise construct like..
> >>
> >>class Foo
> >>{
> >>  string name;
> >>  int age;
> >>
> >>  invariant
> >>  {
> >>    assert(name != null);
> >>    assert(age > 0);
> >>  }
> >>
> >>  property string Name...
> >>  property int Age...
> >>}
> >>
> >>void main()
> >>{
> >>  Foo f = new Foo() {
> >>    Name = "test",    // calls property Name setter
> >>    Age = 12          // calls property Age setter
> >>  };
> >>}

Maybe I'm missing something obvious, but isn't this essentially the same thing as having named ctor parameters?

[...]
> The idea was to /use/ the code in the invariant to determine which member fields should be set during the initialisation statement and then statically verify that a call was made to some member function to set them.  The actual values set aren't important, just that some attempt has been made to set them.  That's about the limit of what I think you could do statically, in the general case.
[...]

This seems to be the same thing as using named parameters: assuming the compiler actually supported such a thing, it would be able to tell at compile-time whether all required named parameters have been specified, and abort if not. There would be no need for any invariant-based guessing of what fields are required and what aren't, and no need for adding any property feature to the language -- the function signature of the ctor itself indicates what is required, and the compiler can check this at compile-time. (Of course, actual verification of the ctor parameters can only happen at runtime -- which is OK.)

This still doesn't address the issue of ctor argument proliferation, though: if each level of the class hierarchy adds 1-2 additional parameters, you still need to write tons of boilerplate in your derived classes to percolate those additional parameters up the inheritance tree. If a base class ctor requires parameters parmA, parmB, parmC, then any derived class ctor must declare at least parmA, parmB, parmC in their function signature (or provide default values for them), and you must still write super(parmA, parmB, parmC) in order to percolate these parameters to the base class. If the derived class requires additional parameters, say parmD, then that's added on top of all of the base class ctor arguments. And any further derived class will now have to declare at least parmA, parmB, parmC, parmD, and then tack on any additional parameters they may need. This is not scalable -- deeply derived classes will have ctors with ridiculous numbers of arguments.

Now imagine if at some point you need to change some base class ctor parameters. Now instead of making a single change to the base class, you have to update every single derived class to make the same change to every ctor, so that the new version of the parameter (or new parameter) is properly percolated up the inheritance tree. This defeats the goal in OOP of restricting the scope of changes to only localized changes. This is especially bad when you need to add an *optional* parameter to the base class: you have to do all that work of updating every single derived class yet most of the code that uses those derived classes doesn't even care about this new parameter! That's a lot of work for almost no benefit. (And you can't get away without doing it either, since a user of a derived class may at some point want to customize that optional base class parameter, so *all* derived class ctors must also declare it as an optional parameter.)
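
To make the proliferation concrete, here is a small sketch with conventional ctors. The class names and the extra parameter `parmE` are invented for illustration; `parmA` to `parmD` follow the description above, with `parmC` treated as optional:

```d
class Base
{
    int a, b, c;

    this(int parmA, int parmB, int parmC = 0)
    {
        a = parmA;
        b = parmB;
        c = parmC;
    }
}

class Derived : Base
{
    int d;

    // Must repeat every Base parameter just to forward it.
    this(int parmA, int parmB, int parmD, int parmC = 0)
    {
        super(parmA, parmB, parmC);
        d = parmD;
    }
}

class MoreDerived : Derived
{
    int e;

    // ...and again one level down, plus its own parameter.
    this(int parmA, int parmB, int parmD, int parmE, int parmC = 0)
    {
        super(parmA, parmB, parmD, parmC);
        e = parmE;
    }
}

void main()
{
    auto obj = new MoreDerived(1, 2, 4, 5);
    assert(obj.a == 1 && obj.b == 2 && obj.c == 0);
    assert(obj.d == 4 && obj.e == 5);
}
```

Adding one more optional parameter to `Base` means editing all three signatures and both `super(...)` calls, even though only `Base` uses the value.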

I think my approach of using builder structs with a parallel inheritance tree is still better: adding/removing/changing parameters to a base class's builder struct automatically propagates to all derived classes with no further code change. With the help of mixin templates, the amount of boilerplate is greatly reduced. And thanks to the use of typeof(super), you can even shuffle classes around your class hierarchy without needing to change anything more than the base class name in the class declaration -- the mixin automatically picks up the right base class builder struct to inherit from, thus guaranteeing that the parallel hierarchy is consistent at all times.

The only weakness I can see is that mandatory arguments with no reasonable default values can't be easily handled. In the simple cases, you can expand the mixin to allow you to specify builder struct ctors that have required arguments; but then this suffers from the same scalability problems that we were trying to solve in the first place, since all derived classes' builder structs will now require mandatory arguments to be propagated through their ctors. But I think this shouldn't be a big problem in practice: we can use Nullable fields in the builder struct and have the class ctor verify that all mandatory arguments are present, and throw an error if any arguments are not set properly.
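
A rough sketch of the Nullable variant, assuming `std.typecons.Nullable` and `std.exception.enforce` from Phobos. `Widget` and its fields are hypothetical, and the mixin template is left out to keep the example short:

```d
import std.exception : assertThrown, enforce;
import std.typecons : Nullable;

class Widget
{
    struct Args
    {
        Nullable!string name;   // mandatory: no sensible default exists
        int width = 80;         // optional: has a default
    }

    string name;
    int width;

    this(Args args)
    {
        // Runtime check that the mandatory argument was supplied.
        enforce(!args.name.isNull, "Widget.Args.name must be set");
        name = args.name.get;
        width = args.width;
    }
}

void main()
{
    Widget.Args args;
    args.width = 120;
    assertThrown(new Widget(args));   // name was never set

    args.name = "ok";
    auto w = new Widget(args);
    assert(w.name == "ok" && w.width == 120);
}
```

The cost is that a forgotten mandatory argument is only reported at runtime, when the ctor runs, rather than at compile time.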

T

-- 
ASCII stupid question, getty stupid ANSI.

July 17, 2013

Re: Interesting Research Paper on Constructors in OO Languages

Posted by Regan Heath
in reply to Jérôme M. Berger

On Tue, 16 Jul 2013 18:54:06 +0100, Jérôme M. Berger <jeberger@free.fr> wrote:

> Regan Heath wrote:
>> Or, perhaps another way to ask a similar Q is.. can the compiler
>> statically verify that a create-set-call style object has been
>> initialised, or rather that an attempt has at least been made to
>> initialise all the required parts.
>>
> 	Here's a way to do it in Scala:
> http://blog.rafaelferreira.net/2008/07/type-safe-builder-pattern-in-scala.html

I saw the builder pattern mentioned in the original thread..

> 	Basically, the builder object is a generic that has a boolean
> parameter for each mandatory parameter. Setting a parameter casts
> the builder object to the same generic with the corresponding
> boolean set to true. And the "build" method is only available when
> the type system recognizes that all the booleans are true.

But I hadn't realised it could enforce things statically, this is a cool idea.

> 	Note however that this will not work if you try to mutate the
> builder instance. IOW, this will work (assuming you only need to
> specify foo and bar):
>
>> auto instance = builder().withFoo (1).withBar ("abc").build();

This looks like good D style, to me, in keeping with the UFCS chains etc.

> but this won't work:
>
>> auto b = builder();
>> b.withFoo (1);
>> b.withBar ("abc");
>> auto instance = b.build();

But you could create a separate variable for each with call, couldn't you? Very inefficient, but possible.  I don't think this syntax/style is a requirement, and I prefer the chain style above it.

> 	Something similar should be doable in D (although I'm a bit afraid
> of the template bloat it might create…)

Indeed.  The issue I have with the builder is the requirement for more classes/templates/etc in addition to the original objects.  D could likely define them in the standard library, but as you say there would be template bloat.

R

-- 
Using Opera's revolutionary email client: http://www.opera.com/mail/

July 17, 2013

Re: Interesting Research Paper on Constructors in OO Languages

Posted by Regan Heath
in reply to H. S. Teoh

On Tue, 16 Jul 2013 23:01:57 +0100, H. S. Teoh <hsteoh@quickfur.ath.cx> wrote:
> On Tue, Jul 16, 2013 at 06:17:48PM +0100, Regan Heath wrote:
>> On Tue, 16 Jul 2013 14:34:59 +0100, Craig Dillabaugh
>> <cdillaba@cg.scs.careton.ca> wrote:
>>
>> >On Tuesday, 16 July 2013 at 09:47:35 UTC, Regan Heath wrote:
>> >
>> >clip
>> >
>> >>
>> >>We have class invariants.. these define the things which must be
>> >>initialised to reach a valid state.  If we had compiler
>> >>recognisable properties as well, then we could have an
>> >>initialise construct like..
>> >>
>> >>class Foo
>> >>{
>> >>  string name;
>> >>  int age;
>> >>
>> >>  invariant
>> >>  {
>> >>    assert(name != null);
>> >>    assert(age > 0);
>> >>  }
>> >>
>> >>  property string Name...
>> >>  property int Age...
>> >>}
>> >>
>> >>void main()
>> >>{
>> >>  Foo f = new Foo() {
>> >>    Name = "test",    // calls property Name setter
>> >>    Age = 12          // calls property Age setter
>> >>  };
>> >>}
>
> Maybe I'm missing something obvious, but isn't this essentially the same
> thing as having named ctor parameters?

Yes, if we're comparing this to ctors with named parameters.  I wasn't doing that however, I was asking this Q:

"Or, perhaps another way to ask a similar Q is.. can the compiler statically verify that a create-set-call style object has been initialised, or rather that an attempt has at least been made to initialise all the required parts."

Emphasis on "create-set-call" :)  The weakness of the create-set-call style is the desire for a valid object as soon as an attempt can be made to use it.  That implies the need for some sort of enforcement of initialisation and, as I mentioned in my first post, raises the issue of preventing this initialisation from being spread out or intermingled with other code, which makes its semantics harder to see.

My idea here attempted to solve those issues with create-set-call only.

> [...]
>> The idea was to /use/ the code in the invariant to determine which
>> member fields should be set during the initialisation statement and
>> then statically verify that a call was made to some member function
>> to set them.  The actual values set aren't important, just that some
>> attempt has been made to set them.  That's about the limit of what I
>> think you could do statically, in the general case.
> [...]
>
> This still doesn't address the issue of ctor argument proliferation,
> though

It wasn't supposed to :)  create-set-call ctors have no arguments.

> if each level of the class hierarchy adds 1-2 additional
> parameters, you still need to write tons of boilerplate in your derived
> classes to percolate those additional parameters up the inheritance
> tree.

In the create-set-call style additional required 'arguments' would appear as setter member functions whose underlying data member is verified in the invariant and would therefore be enforced by the syntax I detailed.

> Now imagine if at some point you need to change some base class ctor
> parameters. Now instead of making a single change to the base class, you
> have to update every single derived class to make the same change to
> every ctor, so that the new version of the parameter (or new parameter)
> is properly percolated up the inheritance tree.

This is one reason why create-set-call might be desirable, no ctor arguments, no problem.

So, to take my idea a little further - WRT class inheritance.  The compiler, for a derived class, would need to inspect the invariants of all classes involved (these are and-ed already), inspect the constructors of the derived classes (for calls to initialise members), and the initialisation block I described and verify statically that an attempt was made to initialise all the members which appear in all the invariants.

> I think my approach of using builder structs with a parallel inheritance
> tree is still better

It may be, it certainly looked quite neat but I haven't had a detailed look at it TBH.  I think you've misunderstood my idea however, or rather, the issues it was intended to solve :)  Perhaps my idea is too limiting for you?  I could certainly understand that point of view.

I think another interesting idea is using the builder pattern with create-set-call objects.

For example, a builder template class could inspect the object for UDAs indicating a data member which is required during initialisation.  It would contain a bool[] to flag each member as initialised or not and expose a setMember() method which would call the underlying object's setMember() and return a reference to itself.

At some point, these setMember() methods would want to return another template class which contained just a build() member.  I'm not sure how/if this is possible in D.
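
A sketch of the runtime-checked half of this idea, under several assumptions: `Required` is a made-up UDA, `Person`, `Builder` and `builderFor` are hypothetical names, fields are assigned directly instead of through setter methods, and an associative array stands in for the bool[]. It does not attempt the static step of returning a type that has only build():

```d
import std.exception : assertThrown, enforce;
import std.traits : hasUDA;

// Hypothetical UDA marking members that must be set before build().
struct Required {}

class Person
{
    @Required string name;
    @Required int age;
    string nickname;   // optional
}

final class Builder(T)
{
    private T obj;
    private bool[string] isSet;   // records which members have been assigned

    this()
    {
        obj = new T;
    }

    // Assigns one member by name and returns this builder for chaining.
    Builder set(string member, V)(V value)
    {
        static assert(__traits(hasMember, T, member), "no such member: " ~ member);
        __traits(getMember, obj, member) = value;
        isSet[member] = true;
        return this;
    }

    // Throws unless every @Required field of T has been assigned.
    T build()
    {
        static foreach (i; 0 .. T.tupleof.length)
        {
            static if (hasUDA!(T.tupleof[i], Required))
            {
                enforce((__traits(identifier, T.tupleof[i]) in isSet) !is null,
                        "required member not set: " ~ __traits(identifier, T.tupleof[i]));
            }
        }
        return obj;
    }
}

Builder!T builderFor(T)()
{
    return new Builder!T;
}

void main()
{
    auto p = builderFor!Person()
        .set!"name"("Ann")
        .set!"age"(30)
        .build();
    assert(p.name == "Ann" && p.age == 30);

    // age is @Required but never set, so build() throws at runtime.
    assertThrown(builderFor!Person().set!"name"("Bob").build());
}
```

Making the check static would mean encoding the set of initialised members in the builder's type, along the lines of the Scala approach mentioned earlier in the thread.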

R

-- 
Using Opera's revolutionary email client: http://www.opera.com/mail/
