Compilation strategy (page 11) - D Programming Language Discussion Forum

December 18, 2012

Re: Compilation strategy

Posted by Walter Bright
in reply to Rob T

On 12/17/2012 3:11 PM, Rob T wrote:
>> I suspect most file transport protocols already compress the data, so
>> compressing it ourselves probably accomplishes nothing. There are also
>> compressed filesystems, so storing files in a compressed manner likely
>> accomplishes little.
>
> Yes, however my understanding is that HTTP-based file transfers are often not
> compressed despite the protocol specifically supporting the feature. The problem
> is not with the protocol; it's that some clients and servers simply do not
> implement the feature or are in a misconfigured state. HTTP, as you know, is very
> widely used for transferring files.

I don't think fixing misconfigured HTTP servers is something D should address.

> Another thing to consider is using byte code for interpretation; that way D
> could be used directly in game engines in place of Lua or other scripting
> methods, or even as a replacement for JavaScript. Of course you know best if
> this is practical for a language like D, but maybe a subset of D is practical, I
> don't know.

Again, there is zero advantage to using a bytecode for this rather than source code. Recall that CTFE is an interpreter. (It has some efficiency problems, but that is not related to the file format.)
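
To make the "CTFE is an interpreter" point concrete, here is a minimal sketch. The function name `factorial` is only an illustration and does not come from the thread. The compiler evaluates an ordinary function during compilation, working from its source:

```d
import std.stdio : writeln;

// An ordinary function; nothing marks it as compile-time only.
ulong factorial(uint n) pure nothrow @safe
{
    ulong result = 1;
    foreach (i; 2 .. n + 1)
        result *= i;
    return result;
}

// Initialising an enum forces the compiler to interpret factorial()
// during compilation, straight from the function's source.
enum ulong atCompileTime = factorial(10);
static assert(atCompileTime == 3_628_800);

void main()
{
    // The same function also runs as normal compiled code at run time.
    writeln(factorial(10) == atCompileTime);
}
```

The interpreter needs the function body to do this, which is why the format the body is stored in matters less than whether it is available at all.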

There is no technical reason why tokenized and compressed D source code cannot be interpreted and effectively serve the role of "bytecode". I'll come out and say that bytecode is probably the biggest software misfeature anyone ever set store by :-)
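
A rough sketch of the compression half of that idea, using Phobos' `std.zlib`. The tokenizing step is left out, and the helper names `pack` and `unpack` are made up for this illustration. The packed blob is opaque binary, yet the exact source comes back out of it, ready to hand to a compiler or interpreter:

```d
import std.array : replicate;
import std.stdio : writefln;
import std.zlib : compress, uncompress;

// Pack source text into a compact binary blob (the stand-in for "bytecode").
ubyte[] pack(string source)
{
    return compress(source);
}

// Recover the exact source text from a packed blob.
string unpack(const(ubyte)[] blob)
{
    return (cast(char[]) uncompress(blob)).idup;
}

void main()
{
    // Deliberately repetitive input so the size reduction is easy to see.
    string source = "int add(int a, int b) { return a + b; }\n".replicate(50);
    auto blob = pack(source);

    assert(unpack(blob) == source);      // the round trip is lossless
    assert(blob.length < source.length); // and the blob is smaller
    writefln("%s bytes of source, %s bytes packed", source.length, blob.length);
}
```

A real tool would presumably run a lexer first and compress the token stream instead of raw text, but the round-trip property is the same.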

> Wow, I think that's exactly what we could use! It serves multiple optional use
> cases all at once!
>
> Was there a technical reason for you not getting around to implementing it, or
> just a lack of time?

There always seemed something more important to be doing, and Andrei thought it would be better to put such a capability in rdmd rather than dmd.

December 18, 2012

Re: Compilation strategy

Posted by Walter Bright
in reply to Denis Koroskin

On 12/17/2012 3:27 PM, Denis Koroskin wrote:
> On Mon, 17 Dec 2012 13:47:36 -0800, Walter Bright <newshound2@digitalmars.com>
> wrote:
>
>> I've often thought Java bytecode was a complete joke. It doesn't deliver any
>> of its promises. You could tokenize Java source code, run the result through
>> an lzw compressor, and get the equivalent functionality in every way.
>>
>
> Not true at all. Bytecode is semi-optimized,

I'm not unaware of that; recall I wrote a Java compiler. The "semi-optimized" is generous. The bytecode simply doesn't allow for any significant optimization.

> easier to manipulate (obfuscate, instrument, etc),

The obfuscators for bytecode are ineffective. It's probably marginally easier to instrument, but adjusting a Java compiler to emit instrumented code is just as easy.

> JVM/CLR bytecode is shared by many languages
> (Java, Scala/C#,F#) so you don't need a separate parser for each language,

That's true. But since there's a 1:1 correspondence between bytecode and Java, you can just as easily emit Java from your backend.

> and there is hardware that supports running JVM bytecode on the metal.

There's a huge problem with that approach. Remember I said that bytecode can't be more than trivially optimized? In hardware, there's no optimization, so it's going to be doomed to slow execution. Even a trivial JIT will beat it, and if you go beyond basic code generation to using a real optimizer, that will beat the pants off of any hardware bytecode machine.

Which is why such machines have not caught on. It's the wrong place in the compilation process to put the hardware.

> Try doing the same with lzw'd source code.

Modern CPU design is heavily influenced by the kind of instructions compilers like to emit. So, in a sense, this is already the case and has been for decades.

(Note the disuse of some instructions in the 8086 that compilers never emit, and their consequent relegation to having as little silicon as possible reserved for them, and the consequent caveats to "never use those instructions, they are terribly slow".)

December 18, 2012

Re: Compilation strategy

Posted by Walter Bright
in reply to deadalnix

On 12/17/2012 4:47 PM, deadalnix wrote:
> On Tuesday, 18 December 2012 at 00:42:13 UTC, Walter Bright wrote:
>> On 12/17/2012 3:03 PM, deadalnix wrote:
>>> I know that. I'm not arguing against that. I'm arguing against the fact that this
>>> is a blocker. This is a blocker in very few use cases, in fact. I just look at the
>>> whole picture here. People needing that are the exception, not the rule.
>>
>> I'm not sure what you mean. A blocker for what?
>>
>>
>>> And what prevents us from using a bytecode that loses information?
>>
>> I'd turn that around and ask why have a bytecode?
>>
>
> Because it is CTFEable efficiently, without requiring either to recompile the
> source code or even distribute the source code.

I've addressed that issue several times now. I know I'm arguing against scores of billions of dollars invested in JVM bytecode, but the emperor isn't wearing clothes.


>>> As long as it is CTFEable, most people will be happy.
>>
>> CTFE needs the type information and AST trees and symbol table. Everything
>> needed for decompilation.
>>
>
> You do not need more information than what is in a di file.

Yeah, which is source code. I think you just conceded :-)
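
A small single-file sketch of why that is. The names `slowLookup` and `square` are hypothetical. A function declared without a body, the way a stripped interface file would present it, cannot be evaluated during compilation, while one whose body is visible can:

```d
// Signature only, the way a stripped interface (.di) file would show it.
int slowLookup(int key);

// Signature plus body, as it must appear if importers are to use it in CTFE.
int square(int x) pure nothrow @safe
{
    return x * x;
}

// CTFE works where the body is visible...
enum nine = square(3);
static assert(nine == 9);

// ...and is rejected where only the signature is available.
static assert(!__traits(compiles, { enum e = slowLookup(3); }));

void main()
{
}
```

So anything an importer should be able to run at compile time has to ship with its body, whatever the file is called.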


> Java and C# put more
> info in that because of runtime reflection (and still, there are tools to strip
> most of it, no type info, granted, but everything else), something we don't need.

There's nothing to be stripped from .class files without rendering them unusable.

December 18, 2012

Re: Compilation strategy

Posted by H. S. Teoh
in reply to Walter Bright

On Mon, Dec 17, 2012 at 04:42:13PM -0800, Walter Bright wrote:
> On 12/17/2012 3:03 PM, deadalnix wrote:
[...]
> >And what prevents us from using a bytecode that loses information?
> 
> I'd turn that around and ask why have a bytecode?
> 
> 
> >As long as it is CTFEable, most people will be happy.
> 
> CTFE needs the type information and AST trees and symbol table. Everything needed for decompilation.
> 
> I know that bytecode has been around since 1995 in its current incarnation, and there's an ingrained assumption that since there's such an extensive ecosystem around it, that there is some advantage to it.
> 
> But there isn't.

Now this, I have to agree with. The only advantage to bytecode is that if you have two interpreters on two different platforms, then bytecode on one can run verbatim on the other. But:

1) Bytecode is slower than native code, and always will be.

2) Unless, of course, you're running a machine that runs the bytecode directly. But that just means your code is native to that machine, and the interpreters on other machines are emulators. So you're already using native code anyway. And since you're already at it, might as well just use native code on the other machines, too.

3) Performance can be improved to (near) native speeds with a JIT compiler. But then you might as well go native to begin with. Why wait till runtime to do compilation, when it can be done beforehand?

4) Bytecode cannot be (easily) linked with native libraries. Various wrappers and other workarounds are necessary. The bytecode/native boundary is often inefficient, because generally there's need of translation between bytecode interpreter data types and native data types.

5) There are other issues, but I can't be bothered to think of them right now.
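
For contrast with point 4, a minimal sketch of the native side: compiled D can call a C library function directly, with no wrapper layer or data-type translation beyond getting a NUL-terminated pointer. The helper name `cLength` is only for illustration:

```d
import core.stdc.string : strlen;
import std.string : toStringz;

// Calls the C standard library's strlen directly; the only adaptation is
// producing a NUL-terminated pointer from the D string.
size_t cLength(string s)
{
    return strlen(s.toStringz);
}

void main()
{
    assert(cLength("hello") == 5);
}
```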

But anyway, this is getting a bit off-topic. The original issue was separate compilation, and .di files.

Just for the record, I'd like to state that I am *not* convinced about the need to obfuscate library code (either by using .di or by other means), primarily because it's futile, but also because I believe in open source code. However, I know a LOT of employers and enterprises are NOT comfortable with the idea, and would not so much as consider a particular language/toolchain if they can't at least have the illusion of security.  You may say it's silly, and I'd agree, but that does nothing to help adoption.

Using PIMPL only helps if you're trying to hide implementation details of a struct or class. Anything that requires CTFE is out of the question. Templates are out of the question (this was also true with C++). This reduces the incentive to adopt D, since they might as well just stick with C++. We lose.
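
A sketch of the distinction, kept in one file so it runs as-is. All names (`Counter`, `CounterImpl`, `makeCounter`, `doubled`) are hypothetical. In a shipped library, `CounterImpl` would be declared opaque and the method bodies would be stripped from the interface file, whereas the template has to stay fully visible:

```d
import std.stdio : writeln;

// In a shipped interface file this could be just `struct CounterImpl;`,
// so callers never see the fields.
private struct CounterImpl
{
    int count;
}

// The public type holds only an opaque pointer (the PIMPL idiom).
struct Counter
{
    private CounterImpl* impl;

    void bump() { ++impl.count; }
    int value() const { return impl.count; }
}

Counter makeCounter()
{
    return Counter(new CounterImpl);
}

// A template cannot be hidden this way: the compiler needs its body
// at every point of instantiation.
T doubled(T)(T x)
{
    return x + x;
}

void main()
{
    auto c = makeCounter();
    c.bump();
    c.bump();
    writeln(c.value);
    writeln(doubled(21));
}
```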

If we implement a way of "hiding" implementation details that *allows* CTFE and templates (and thus one up the C++ situation), this will create a stronger incentive for D adoption. It doesn't matter if it's not hard to "unhide" the implementation; we don't lose anything (having no way to hide implementation is what we already have), plus it increases our chances of adoption -- esp. by enterprises, who are generally the kind of people who even care about this issue in the first place, and who are the people we *want* to attract. Sounds like a win to me.

But then again, even if we never do this, it makes no difference to *me* -- the current situation is good enough for *me*. The question is whether or not we want D to be better received by enterprises.

T

-- 
I am a consultant. My job is to make your job redundant. -- Mr Tom

December 18, 2012

Re: Compilation strategy

Posted by Simen Kjaeraas

On 2012-12-18, 02:28, H. S. Teoh wrote:

> If we implement a way of "hiding" implementation details that *allows*
> CTFE and templates (and thus one up the C++ situation), this will create
> a stronger incentive for D adoption. It doesn't matter if it's not hard
> to "unhide" the implementation; we don't lose anything (having no way to
> hide implementation is what we already have), plus it increases our
> chances of adoption -- esp. by enterprises, who are generally the kind
> of people who even care about this issue in the first place, and who are
> the people we *want* to attract. Sounds like a win to me.

.zip already has encryption, and unpacking those files and feeding them to
the compiler should be a rather simple tool. Sure, if someone makes it, it
could probably become part of the distribution. But making it a part of the
compiler seems more than excessive.
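
A rough sketch of such a tool, using Phobos' `std.zip`. The helper name `unpackSources`, the output directory name, and the `dmd -c` invocation are assumptions for illustration. As far as I know `std.zip` offers no password support, so this handles unencrypted archives only:

```d
import std.file : mkdirRecurse, read, tempDir, write;
import std.path : buildPath, dirName, extension;
import std.process : execute;
import std.stdio : writeln;
import std.zip : ZipArchive;

// Extract every .d/.di member of the archive at `zipPath` into `outDir`
// and return the paths of the extracted files.
// Note: member names are trusted here; a real tool should reject names
// containing ".." before writing anything.
string[] unpackSources(string zipPath, string outDir)
{
    auto zip = new ZipArchive(read(zipPath));
    string[] files;
    foreach (name, member; zip.directory)
    {
        if (name.extension != ".d" && name.extension != ".di")
            continue;
        auto target = buildPath(outDir, name);
        mkdirRecurse(target.dirName);
        write(target, zip.expand(member));
        files ~= target;
    }
    return files;
}

void main(string[] args)
{
    if (args.length < 2)
    {
        writeln("usage: unzipbuild <sources.zip>");
        return;
    }
    auto outDir = buildPath(tempDir, "unzipped-src");
    auto files = unpackSources(args[1], outDir);

    // Hand the extracted files to the compiler; assumes dmd is on PATH
    // (execute throws if it cannot be started).
    auto result = execute(["dmd", "-c", "-od" ~ outDir] ~ files);
    writeln(result.output);
}
```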

-- 
Simen

December 18, 2012

Re: Compilation strategy

Posted by evilrat
in reply to H. S. Teoh

On Tuesday, 18 December 2012 at 01:30:22 UTC, H. S. Teoh wrote:
>
> If we implement a way of "hiding" implementation details that *allows*
> CTFE and templates (and thus one up the C++ situation), this will create
> a stronger incentive for D adoption. It doesn't matter if it's not hard
> to "unhide" the implementation; we don't lose anything (having no way to
> hide implementation is what we already have), plus it increases our
> chances of adoption -- esp. by enterprises, who are generally the kind
> of people who even care about this issue in the first place, and who are
> the people we *want* to attract. Sounds like a win to me.
>

I agree with that. Involving the big guys is necessary to keep the language alive; if we don't, then D will become just another fan-loved language. That would be really bad...

Well, that's the point.

December 18, 2012

Re: Compilation strategy

Posted by Walter Bright
in reply to H. S. Teoh

On 12/17/2012 5:28 PM, H. S. Teoh wrote:
> Using PIMPL only helps if you're trying to hide implementation details
> of a struct or class. Anything that requires CTFE is out of the
> question. Templates are out of the question (this was also true with
> C++). This reduces the incentive to adopt D, since they might as well
> just stick with C++. We lose.

I've never seen any closed-source companies reticent about using C++ because of obfuscation issues, which are the same as for D, so I do not see this as a problem.


> If we implement a way of "hiding" implementation details that *allows*
> CTFE and templates (and thus one up the C++ situation), this will create
> a stronger incentive for D adoption. It doesn't matter if it's not hard
> to "unhide" the implementation;

Yes, it does, because we would be lying if we were pretending this was an effective solution.

> we don't lose anything (having no way to
> hide implementation is what we already have), plus it increases our
> chances of adoption -- esp. by enterprises, who are generally the kind
> of people who even care about this issue in the first place, and who are
> the people we *want* to attract. Sounds like a win to me.

We'd lose credibility with them, as people will laugh at us over this.


> But then again, even if we never do this, it makes no difference to *me*
> -- the current situation is good enough for *me*. The question is
> whether or not we want D to be better received by enterprises.

As I said, C++ is well received by enterprises. This is not an issue.

December 18, 2012

Re: Compilation strategy

Posted by Walter Bright
in reply to Simen Kjaeraas

On 12/17/2012 5:40 PM, Simen Kjaeraas wrote:
> .zip already has encryption,

Just for the record, zip file "encryption" is trivially broken, and there are free downloadable tools to do that.

About all it will do is keep your kid sister from reading your diary.

December 18, 2012

Re: Compilation strategy

Posted by Rob T
in reply to Walter Bright

On Tuesday, 18 December 2012 at 01:52:21 UTC, Walter Bright wrote:
>> If we implement a way of "hiding" implementation details that *allows*
>> CTFE and templates (and thus one up the C++ situation), this will create
>> a stronger incentive for D adoption. It doesn't matter if it's not hard
>> to "unhide" the implementation;
>
> Yes, it does, because we would be lying if we were pretending this was an effective solution.

If you can hide the implementation details for other reasons, then no such claim needs to be made at all; in fact, you can explicitly warn people that the code is not really hidden should they think otherwise.

Your suggestion concerning the use of zip files is a good idea, although you mention the encryption algo is very weak, but is there any reason to use a weak encryption algo, and is there even a reason to bother maintaining compatibility with the common zip format? I would expect that there are other compression formats that could be used.

--rt

December 18, 2012

Re: Compilation strategy

Posted by Walter Bright
in reply to Rob T

On 12/17/2012 6:13 PM, Rob T wrote:
> Your suggestion concerning the use of zip files is a good idea, although you
> mention the encryption algo is very weak, but is there any reason to use a weak
> encryption algo, and is there even a reason to bother maintaining compatibility
> with the common zip format?

Using standard zip tools is a big plus.

Copyright © 1999-2021 by the D Language Foundation