May 25, 2021
On Tuesday, 25 May 2021 at 12:38:56 UTC, jmh530 wrote:
> Refactoring won't happen overnight, and the more people who understand the compiler, the more can assist with that and other things in the meantime.

The best way to refactor is to partition and encapsulate; then you can replace one item at a time. The only people who can do this are people who are willing to dig deep into the codebase.

But this thread is about experimentation. Experimentation on top of parts that are considered unstable is futile. You don't need to understand every single piece of the compiler to have fun extending it. And you should focus your efforts on the stable parts.

Parts that are considered unstable need to be encapsulated and provide interfaces so that people can build on those interfaces instead of making change hard by tying more stuff to the code that you want to replace.


May 25, 2021
On Tuesday, 25 May 2021 at 12:47:53 UTC, Ola Fosheim Grøstad wrote:
> Parts that are considered unstable need to be encapsulated and provide interfaces so that people can build on those interfaces instead of making change hard by tying more stuff to the code that you want to replace.

The key point here is designing new and better interfaces. You cannot refactor yourself into heaven with no redesign.

But it does not have to be disruptive. As an example, let's pretend we want a new AST and a new IR. Here is a non-disruptive sequence:

1. Write new AST and translation to old AST. Not disruptive.

2. Write translation from old AST to new IR, and encourage backends to transition. Not disruptive.

3. Transition to new IR, by making old AST private. Backends are ready. Not disruptive.

4. Move passes one by one to new IR. Not disruptive.

5. Write translation from new AST to new IR. Done.

You don't want to document your old interface, because you don't want people to depend on it. You want to document your new interface and encourage people to transition. Then you can eventually make the old interface private and replace the old parts in peace.
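
Step 1 could look roughly like the following minimal sketch. It is written in Python rather than D for brevity, and every name in it (`OldIntExp`, `OldBinExp`, `NewLiteral`, `NewBinary`, `to_old_ast`, `old_backend_eval`) is hypothetical, invented for illustration; none of them comes from the actual compiler. The idea is only that a redesigned AST is lowered to the old one, so code that understands the old AST keeps working unchanged.

```python
from dataclasses import dataclass
from typing import Union


# "Old" AST: hypothetical stand-ins for the legacy node types.
@dataclass(frozen=True)
class OldIntExp:
    value: int


@dataclass(frozen=True)
class OldBinExp:
    op: str
    left: "OldExp"
    right: "OldExp"


OldExp = Union[OldIntExp, OldBinExp]


# "New" AST: hypothetical redesigned node types.
@dataclass(frozen=True)
class NewLiteral:
    value: int


@dataclass(frozen=True)
class NewBinary:
    operator: str
    operands: tuple  # (lhs, rhs)


NewExp = Union[NewLiteral, NewBinary]


def to_old_ast(node: NewExp) -> OldExp:
    """Translate a new-AST node into the equivalent old-AST node."""
    if isinstance(node, NewLiteral):
        return OldIntExp(node.value)
    if isinstance(node, NewBinary):
        lhs, rhs = node.operands
        return OldBinExp(node.operator, to_old_ast(lhs), to_old_ast(rhs))
    raise TypeError(f"unknown new-AST node: {node!r}")


def old_backend_eval(exp: OldExp) -> int:
    """Stand-in for existing code that only understands the old AST."""
    if isinstance(exp, OldIntExp):
        return exp.value
    if isinstance(exp, OldBinExp):
        left = old_backend_eval(exp.left)
        right = old_backend_eval(exp.right)
        if exp.op == "+":
            return left + right
        if exp.op == "*":
            return left * right
        raise ValueError(f"unsupported operator: {exp.op}")
    raise TypeError(f"unknown old-AST node: {exp!r}")


# 2 + 3 * 4, built with the new AST and consumed by old-AST code.
tree = NewBinary("+", (NewLiteral(2), NewBinary("*", (NewLiteral(3), NewLiteral(4)))))
print(old_backend_eval(to_old_ast(tree)))
```

Only `to_old_ast` knows about both representations, which is what lets the old node types be made private later without touching their consumers.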



May 25, 2021
On Tuesday, 25 May 2021 at 13:11:17 UTC, Ola Fosheim Grøstad wrote:
> 1. Write new AST and translation to old AST. Not disruptive.
>
> 2. Write translation from old AST to new IR, and encourage backends to transition. Not disruptive.
>
> 3. Transition to new IR, by making old AST private. Backends are ready. Not disruptive.
>
> 4. Move passes one by one to new IR. Not disruptive.
>
> 5. Write translation from new AST to new IR. Done.

If you are unsure if the new IR is stable you can rearrange the sequence like this instead:

new AST -> new IR -> old AST

This is perhaps better, as it gives backends more time to transition.

May 25, 2021
+10086.

May 25, 2021
On Tuesday, 25 May 2021 at 13:28:02 UTC, zjh wrote:
> +10086.

Refactoring doesn't take much time, because the functionality has already been implemented. Refactoring has great benefits: a clear hierarchy, clear dependencies and clear interfaces.

May 25, 2021
On Tuesday, 25 May 2021 at 13:33:25 UTC, zjh wrote:
> On Tuesday, 25 May 2021 at 13:28:02 UTC, zjh wrote:
>> +10086.
>
> Refactoring doesn't take much time, because the functionality has already been implemented. Refactoring has great benefits: a clear hierarchy, clear dependencies and clear interfaces.

It takes time, but it is a necessary part of the life cycle, and does not have to be disruptive. It can happen bit by bit as long as you have a new design that is clean.

The nice thing is that one can easily detect regressions by comparing a dump from the old compiler with a dump from the new compiler. Do this for all D programs on GitHub and you can feel confident that the new compiler has not introduced new errors.

So: for a given D program, you compare the new IR translated to the old AST by the new compiler with the old AST from the old compiler. If they are equal, then the new compiler has passed the test.
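
A minimal sketch of such a comparison harness, in Python for brevity. The two dump functions are hypothetical stand-ins (a real harness would invoke the two compiler builds and capture whatever dump format they produce), and the tiny corpus is made up. `find_regressions` is likewise an invented name.

```python
import difflib
from typing import Callable, Dict, List

Dumper = Callable[[str], str]


def dump_old(source: str) -> str:
    """Hypothetical stand-in for the old compiler's AST dump."""
    statements = [s.strip() for s in source.split(";") if s.strip()]
    return "\n".join(f"Stmt({s})" for s in statements)


def dump_new(source: str) -> str:
    """Hypothetical stand-in for the new compiler: build the new IR,
    translate it back to the old AST, then dump it in the old format."""
    ir = [("stmt", s.strip()) for s in source.split(";") if s.strip()]
    return "\n".join(f"Stmt({text})" for _kind, text in ir)


def find_regressions(corpus: Dict[str, str], old: Dumper, new: Dumper) -> Dict[str, List[str]]:
    """Return a unified diff for every program whose two dumps differ."""
    regressions: Dict[str, List[str]] = {}
    for name, source in corpus.items():
        before, after = old(source), new(source)
        if before != after:
            regressions[name] = list(difflib.unified_diff(
                before.splitlines(), after.splitlines(),
                fromfile="old", tofile="new", lineterm=""))
    return regressions


# Made-up corpus; in practice this would be a large set of real programs.
corpus = {
    "hello.d": "import std.stdio; writeln(1);",
    "math.d": "int x = 2; int y = x * 3;",
}
print(find_regressions(corpus, dump_old, dump_new) or "no regressions")
```

An empty result only says that the two compilers agree on the programs in the corpus, which is why the corpus should be as large and varied as possible.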

May 25, 2021
On Tuesday, 25 May 2021 at 08:32:46 UTC, sighoya wrote:
> You can't encode the full semantics into one function name with parameter names without bloating those names.
In this case, it might be good to have a documentation comment; otherwise the behavior should be clear from the function name and args.

> However, small comments inside the function would also be beneficial.
Having such comments inside a function body means you've failed to make the code easy to read and understand. Instead of such inline comments, consider extracting that piece into a function with the right name. Adding such comments should be the last option when deciding what to do with that piece of code. Note that the next dev who changes that piece of code will most probably just forget to update that comment, meaning that it will tell a lie instead of the truth.
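
A small before/after sketch of that extraction, in Python for brevity. The discount rule and all names (`order_total_commented`, `apply_bulk_discount`, `order_total`, the two constants) are made up for illustration.

```python
# Before: the intent lives in an inline comment.
def order_total_commented(lines):
    total = 0
    for unit_cents, quantity in lines:
        subtotal = unit_cents * quantity
        # 10% off when ten or more units are ordered
        if quantity >= 10:
            subtotal = subtotal * 90 // 100
        total += subtotal
    return total


# After: the same logic, extracted into a named function with named constants.
BULK_QUANTITY = 10
BULK_PERCENT_OFF = 10


def apply_bulk_discount(subtotal_cents, quantity):
    """Return the subtotal, reduced when the bulk threshold is reached."""
    if quantity < BULK_QUANTITY:
        return subtotal_cents
    return subtotal_cents * (100 - BULK_PERCENT_OFF) // 100


def order_total(lines):
    return sum(apply_bulk_discount(unit * qty, qty) for unit, qty in lines)


# Each line is (unit price in cents, quantity).
lines = [(100, 1), (200, 10)]
print(order_total_commented(lines), order_total(lines))
```

In the second version the threshold and the percentage have names, so changing either one does not leave a stale comment behind.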


May 25, 2021
On Tuesday, 25 May 2021 at 16:00:32 UTC, Alexandru Ermicioi wrote:
> On Tuesday, 25 May 2021 at 08:32:46 UTC, sighoya wrote:
>> You can't encode the full semantics into one function name with parameter names without bloating those names.
>
> In this case, it might be good to have a documentation comment; otherwise the behavior should be clear from the function name and args.

Agree.

> Having such comments inside a function body means you've failed to make the code easy to read and understand. Instead of such inline comments, consider extracting that piece into a function with the right name.

It's a trade-off. Over-modularization can also be an anti-pattern, as it significantly reduces locality.
The other point is how to deal with dynamic context, which may be solved with templates, but what a hack.
Anyhow, you don't always code at a very high level; sometimes the code is a bit more low-level or indirect, and then it's good to have some thread to follow.

> Adding such comments should be the last option when deciding what to do with that piece of code.

Naming is more important, definitely. But succinct comments for small sections aren't that bad, and are sometimes better than modularizing the section into a function such as:

void firstAddToThenUpdateStructureThenFinalize...

Giving this function a shorter name that describes less of what it does would be OK, but such a name is sometimes too general to be understood in your context.
Splitting this function into smaller parts may allow shorter names for these operations, but the point is the context: it's not always clear even with semantically correct naming, which is mostly not possible without being too general.

It's like commit messages: I like to start a commit message with the technical detail:

Update ClassA:

Then, in the following lines, I add some points describing the newly added semantics, which are too much to compress into a single line.

If you could add context in some other way, that would be pretty good. For instance, Swift parameter labels are a first step in the right direction:

send(message:"Hello World",from:"Earth",to:"Mars")
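
Other languages have rough analogues. As a sketch, Python's keyword-only parameters force callers to spell out the labels in a similar way. The parameter names `sender` and `recipient` are my own choice here, since `from` is a reserved word in Python, and the function body is a made-up placeholder.

```python
def send(*, message: str, sender: str, recipient: str) -> str:
    """Everything after the bare * must be passed by name."""
    return f"{sender} -> {recipient}: {message}"


print(send(message="Hello World", sender="Earth", recipient="Mars"))
```

Calling `send("Hello World", "Earth", "Mars")` raises a `TypeError`, so the context is always visible at the call site.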

> Note that the next dev who changes that piece of code will most probably just forget to update that comment, meaning that it will tell a lie instead of the truth.

Yes, but I can make the same argument for modularization: if someone changes the body without renaming the function, it fails in the same way.

May 25, 2021
On Tuesday, 25 May 2021 at 09:05:26 UTC, Ola Fosheim Grøstad wrote:
> On Tuesday, 25 May 2021 at 08:32:46 UTC, sighoya wrote:
>> You can't encode the full semantics into one function name with parameter names without bloating those names.
>
> We can assume that the reader has read a book on compiler design and is familiar with the terminology and the most common algorithms. Provide a reference to Wikipedia if unsure whether the reader is with you...

For very general things, yes, this is possible, but there are structures and algorithms out there which don't resemble what you've learned, or for which nobody has invented or discovered a simple name.
Everyone has a different intuition for how to solve a problem, and that can be pretty hard to follow without comments when all you have to read are index operations, shifts and type names that are as specific as the cosmos.

> Functions that are only called from a few places can have long descriptive names, that is not a negative.

It's a trade-off, but I appreciate this in tests, for instance.



> Yes, obviously. But adding 6 lines of comments for every trivial function is not helpful. It is a useless policy. It is a policy for the sake of having a policy.

Yes, I agree with this, and six lines is usually too much. Look at the example I linked before: it was mostly a one-line comment.


> If time is invested in documenting things that should be changed... then change becomes less likely: "look, the documentation is over there, change not needed".

Okay, that may be true, but it also makes it easier to dive in and have fun changing things.

> Anyway, documentation is the wrong solution to structural issues. It does not enable anything.

Agree.

> It is kinda like saying a city does not need road signs because there is a good map available. Or that a city that is a labyrinth of one-way streets is easy to navigate with the right kind of map. Driving while looking at a map is not a good experience. And when things change, can you then trust the map?
>
> *shrug*

I think the metaphor speaks against you as the map is the wiki article you mentioned :)


May 25, 2021
On Tuesday, 25 May 2021 at 19:08:18 UTC, sighoya wrote:
> I think the metaphor speaks against you as the map is the wiki article you mentioned :)

Nah, because that would be in the documentation, so you are already looking at the map.

But I think that for a compiler you should assume basic terminology to be known. If people have an interest in this, they will pick up a book on compiler design and implementation.