Thread overview
j2d - translating Java to D with the language machine - progress
Apr 01, 2006
Peri Hankey
Apr 01, 2006
Peri Hankey
Apr 02, 2006
Walter Bright
Apr 04, 2006
Peri Hankey
Apr 04, 2006
Georg Wrede
April 01, 2006
Hello - just to let you all know that the j2d java-to-D-language translator is now producing clean code (accepted by the gdc syntax pass) for all the bits of gnu classpath that I have tried, and also for the joeq_core part of the joeq jvm-in-java.

The difficult bit starts now: getting it to produce code that is actually correct and works.

See http://languagemachine.sourceforge.net/j2d.html

=== results ===

  j2d    gdc1    gdc2     lines java
  y      y       n          917 java/applet/
  y      y       n        98944 java/awt/
  y      y       n        11843 java/beans/
  y      y       n        23187 java/io/
  y      y       n        33871 java/lang/
  y      y       n         2780 java/math/
  y      y       n        15173 java/net/
  y      y       n        13287 java/nio/
  y      y       n         6233 java/rmi/
  y      y       n        23482 java/security/
  y      y       n         9250 java/sql/
  y      y       n        11861 java/text/
  y      y       n        56636 java/util/

  j2d    gdc1    gdc2     lines javax
  y      y       n         4081 javax/accessibility/
  y      y       n         7661 javax/crypto/
  y      y       n        14485 javax/imageio/
  y      y       n          138 javax/management/
  y      y       n         8754 javax/naming/
  y      y       n         4082 javax/net/
  y      y       n        17942 javax/print/
  y      y       n         1805 javax/rmi/
  y      y       n         8148 javax/security/
  y      y       n         8851 javax/sound/
  y      y       n         1033 javax/sql/
  y      y       n       205028 javax/swing/
  y      y       n         1049 javax/transaction/
  y      y       n        11274 javax/xml/

  j2d    gdc1    gdc2     lines gnu
  y      y       n        12911 gnu/classpath/
  y      y       n        42958 gnu/CORBA/
  y      y       n       107506 gnu/java/
  y      y       n        92887 gnu/javax/
  y      y       n         5959 gnu/regexp/
  y      y       n          542 gnu/test/
  y      y       n        94523 gnu/xml/

  j2d    gdc1    gdc2     lines vm
  y      y       n         7505 vm/reference/

  j2d    gdc1    gdc2     lines joeq
  y      y       n           62 joeq/Allocator/
  y      y       n         8083 joeq/Class/
  y      y       n          385 joeq/ClassLib/
  y      y       n        71684 joeq/Compiler/
  y      y       n         1962 joeq/Interpreter/
  y      y       n         3303 joeq/Main/
  y      y       n          306 joeq/Memory/
  y      y       n         2246 joeq/Runtime/
  y      y       n         2440 joeq/Support/
  y      y       n          578 joeq/UTF/
  y      y       n         1096 joeq/Util/

NB 'y?' means that the status is subject to regression tests, or that there are some errors which are understood. The actual code produced is probably wrong in many places.

At present gdc compilation is failing mainly because:

* gdc complains that modules in the java.lang hierarchy are importing themselves - the code generator needs to notice this

* gdc complains "invalid UTF character \U0000ffff" for "char MAX_VALUE = '\uFFFF';" and similar (should be wchar or dchar?)

Of course there is still a great deal to do. The sources (about 700 rules, 1100 lines) are in SVN at dsource - it's easiest to start at

   http://languagemachine.sourceforge.net/j2d.html

Suggestions, feedback, assistance all welcome as ever.

Peri

-- 
Peri Hankey                               mpah@thegreen.co.uk
http://languagemachine.sourceforge.net - The language machine
April 01, 2006
It looks now as if the 'invalid UTF character' message is a problem with the gdc/dmd front end in the version (gdc 0.17/dmd140) that I am using. Is this fixed in later versions of dmd? How else would a java string

   '\uffff'

be translated as initialising a D language wchar variable? The following

   wchar x = '\uffff';

is flagged as invalid.

Peri

April 02, 2006
Peri Hankey wrote:
> It looks now as if the 'invalid UTF character' message is a problem with the gdc/dmd front end in the version (gdc 0.17/dmd140) that I am using. Is this fixed in later versions of dmd? How else would a java string
> 
>    '\uffff'
> 
> be translated as initialising a D language wchar variable? The following
> 
>    wchar x = '\uffff';
> 
> is flagged as invalid.

That's because \uFFFF is an invalid UTF sequence. You can try instead:

	wchar x = cast(wchar)0xFFFF;
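
On the translator side, one way to apply this suggestion would be to emit every Java character literal as a numeric cast, so that no escape sequence reaches the D front end. A minimal sketch in Java, assuming a hypothetical helper `toDWcharExpr` (it is not part of j2d). Java itself accepts '\uffff' as an ordinary char with the same value as Character.MAX_VALUE:

```java
// Hypothetical helper, not part of j2d: renders a Java char as the D
// cast expression suggested above instead of an escape sequence.
class CharLiteralSketch {
    static String toDWcharExpr(char c) {
        // Java chars are 16-bit code units, so four hex digits always suffice.
        return String.format("cast(wchar)0x%04X", (int) c);
    }

    public static void main(String[] args) {
        char max = '\uffff';                     // legal in Java
        System.out.println(toDWcharExpr(max));   // prints cast(wchar)0xFFFF
        System.out.println(toDWcharExpr('a'));   // prints cast(wchar)0x0061
    }
}
```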
April 04, 2006
Walter Bright wrote:
> That's because \uFFFF is an invalid UTF sequence. You can try instead:
> 
>     wchar x = cast(wchar)0xFFFF;

As I said, we are essentially down to real problems. The j2d translator produces syntactically acceptable D-language code for the main elements of gnu classpath - the gnu, java, and javax package hierarchies - I haven't yet tried to do much with other parts. Here are some of the questions/problems that have come up:

invalid UTF sequences:
* the wchar fix is ok for a single character
* but the same problem can arise in a java String literal
* not just \uffff, some other values as well

   however these seem to be accepted by gdc:

   char []c = "abc\xffffxyz";   // must see what this actually does
   wchar[]w = "abc\xffffxyz";   // can be used for java string literal?
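
Before choosing an encoding for string literals, it may help to find out which Java literals contain code units of this kind. The sketch below is in Java and rests on an assumption: that the troublesome values are lone surrogates and the noncharacters U+FFFE and U+FFFF. The exact set rejected by gdc has not been checked here. The class and method names are made up for illustration. As far as I know, D's \x escape takes exactly two hex digits, so the two lines above may not contain the character they appear to.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical check, not part of j2d: reports the positions in a Java
// string whose code units may need special treatment in the D output.
class StringLiteralSketch {
    static List<Integer> suspectIndices(String s) {
        List<Integer> out = new ArrayList<>();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (Character.isHighSurrogate(c) && i + 1 < s.length()
                    && Character.isLowSurrogate(s.charAt(i + 1))) {
                i++;            // well-formed surrogate pair: skip both units
            } else if (Character.isSurrogate(c) || c == 0xFFFE || c == 0xFFFF) {
                out.add(i);     // lone surrogate or noncharacter (assumed suspect)
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(suspectIndices("abc\uffffxyz"));   // prints [3]
    }
}
```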

name conflicts between java and D-language:
* between identifiers from java and D-language reserved words etc
* in general, can be fixed by adding a suffix to names from java
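
A minimal sketch of the suffix idea, written in Java. The keyword table is deliberately partial and the names `D_RESERVED` and `toDName` are hypothetical. It renames only identifiers that collide, whereas the mapping shown further down appears to suffix every name:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical renaming step, not part of j2d.
class NameSuffixSketch {
    // Partial list for illustration only: words reserved in D that are
    // ordinary identifiers in Java. A real table needs the full D keyword list.
    static final Set<String> D_RESERVED = new HashSet<>(Arrays.asList(
        "alias", "in", "out", "version", "module", "function",
        "delegate", "unittest", "with", "scope", "cast", "foreach"));

    // Appends "_" when the Java identifier would be a reserved word in D.
    static String toDName(String javaName) {
        return D_RESERVED.contains(javaName) ? javaName + "_" : javaName;
    }

    public static void main(String[] args) {
        System.out.println(toDName("version"));   // prints version_
        System.out.println(toDName("count"));     // prints count
    }
}
```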

name conflicts in generated code:
* a class in java can have both 'int x' and 'int x()'
* may be able to use different suffixes to distinguish these cases
* may have to use dictionaries - not yet required otherwise
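
To make the case concrete, here is a small Java class in which a field and a method share the name `x`, followed by a sketch of kind-specific suffixes. The suffixes `_f` and `_m` and the helper `toDMember` are invented for this example and are not what j2d does:

```java
// Legal Java: fields and methods live in separate namespaces.
class FieldAndMethod {
    int x = 3;                     // field named x
    int x() { return x * 2; }      // method named x
}

// Hypothetical renaming step: one suffix per kind of member, so the two
// members above get distinct names in the generated code.
class MemberSuffixSketch {
    enum Kind { FIELD, METHOD }

    static String toDMember(String name, Kind kind) {
        return kind == Kind.METHOD ? name + "_m" : name + "_f";
    }

    public static void main(String[] args) {
        FieldAndMethod p = new FieldAndMethod();
        System.out.println(p.x + " " + p.x());                 // prints 3 6
        System.out.println(toDMember("x", Kind.FIELD));        // prints x_f
        System.out.println(toDMember("x", Kind.METHOD));       // prints x_m
    }
}
```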

constants in java interfaces
* Java interfaces can include constants (initialised at runtime)
* quite tricky - useful feature - Walter, what do you think?
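
For readers who have not met this feature, a short Java illustration. Fields declared in an interface are implicitly public, static and final, and an initialiser that is not a compile-time constant runs when the interface is initialised. The names below are made up for the example:

```java
// Helper used only to give the interface a run-time initialiser.
class TableBuilder {
    static int[] squares(int n) {
        int[] t = new int[n];
        for (int i = 0; i < n; i++) t[i] = i * i;
        return t;
    }
}

interface Tables {
    int SIZE = 4;                                   // compile-time constant
    int[] SQUARES = TableBuilder.squares(SIZE);     // computed at run time
}

// Implementing classes see the constants without qualification.
class UsesTables implements Tables {
    int lastSquare() { return SQUARES[SIZE - 1]; }

    public static void main(String[] args) {
        System.out.println(new UsesTables().lastSquare());   // prints 9
    }
}
```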

java packages, classes, interfaces vs D-language modules
* the unit of import in Java is class/interface or package
* the unit of import in D is the module
* in java class/interface name is effectively module name

* so: (A) produce a modules.d per java package

  (java) package x;  -> (D) module x_.thismodule_;
  (java) import x.*; -> (D) import x_.modules;
  (java) import x.y; -> (D) import x_.y_;

* or: (B) concatenate all D-code for a java package into one file

But in either case there are gotchas, and the ordering of the imports or of the separate code chunks needs to relate to dependencies.
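
The three rules of option (A) can be sketched as a small Java function. Two assumptions go beyond the rules as written: every segment of a multi-part name gets the `_` suffix, and `thismodule_` is kept as a literal placeholder. The class and method names are hypothetical:

```java
// Hypothetical rendering of the option (A) mapping, not taken from j2d.
class ImportMappingSketch {
    // Adds "_" to each dotted segment (assumption for multi-segment names).
    static String suffixSegments(String dotted) {
        StringBuilder sb = new StringBuilder();
        for (String seg : dotted.split("\\.")) {
            if (sb.length() > 0) sb.append('.');
            sb.append(seg).append('_');
        }
        return sb.toString();
    }

    // Translates one Java package or import declaration to its D counterpart.
    static String translate(String javaLine) {
        String line = javaLine.trim();
        if (line.endsWith(";")) line = line.substring(0, line.length() - 1).trim();
        if (line.startsWith("package ")) {
            return "module " + suffixSegments(line.substring(8).trim()) + ".thismodule_;";
        }
        if (line.startsWith("import ")) {
            String target = line.substring(7).trim();
            if (target.endsWith(".*")) {
                String pkg = target.substring(0, target.length() - 2);
                return "import " + suffixSegments(pkg) + ".modules;";
            }
            return "import " + suffixSegments(target) + ";";
        }
        throw new IllegalArgumentException("not a package or import line: " + javaLine);
    }

    public static void main(String[] args) {
        System.out.println(translate("package x;"));    // prints module x_.thismodule_;
        System.out.println(translate("import x.*;"));   // prints import x_.modules;
        System.out.println(translate("import x.y;"));   // prints import x_.y_;
    }
}
```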

dependency ordering:
* circular dependencies in gnu classpath
* java copes (maybe) by selective imports
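
One standard way to get a dependency order and to find the circular groups at the same time is a topological sort (Kahn's algorithm). This is a general technique, not something j2d is known to use. The sketch below is in Java. It assumes the import graph is already available as a map from each module to the modules it imports. Modules that import themselves end up in the leftover list too:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical ordering step, not part of j2d.
class DependencyOrderSketch {
    // Returns modules with every import placed before its importer.
    // Modules caught in a cycle are appended to 'leftover' instead.
    static List<String> order(Map<String, List<String>> deps, List<String> leftover) {
        Map<String, Integer> pending = new TreeMap<>();      // unresolved import counts
        Map<String, List<String>> users = new HashMap<>();   // module -> its importers
        for (Map.Entry<String, List<String>> e : deps.entrySet()) {
            pending.putIfAbsent(e.getKey(), 0);
            for (String d : e.getValue()) {
                pending.putIfAbsent(d, 0);
                pending.merge(e.getKey(), 1, Integer::sum);
                users.computeIfAbsent(d, k -> new ArrayList<>()).add(e.getKey());
            }
        }
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> e : pending.entrySet()) {
            if (e.getValue() == 0) ready.add(e.getKey());
        }
        List<String> ordered = new ArrayList<>();
        while (!ready.isEmpty()) {
            String m = ready.remove();
            ordered.add(m);
            for (String u : users.getOrDefault(m, Collections.emptyList())) {
                if (pending.merge(u, -1, Integer::sum) == 0) ready.add(u);
            }
        }
        for (Map.Entry<String, Integer> e : pending.entrySet()) {
            if (e.getValue() > 0) leftover.add(e.getKey());  // part of a cycle
        }
        return ordered;
    }

    public static void main(String[] args) {
        Map<String, List<String>> deps = new HashMap<>();
        deps.put("a", Collections.singletonList("b"));
        deps.put("c", Collections.singletonList("d"));
        deps.put("d", Collections.singletonList("c"));
        List<String> cyclic = new ArrayList<>();
        System.out.println(order(deps, cyclic));   // prints [b, a]
        System.out.println(cyclic);                // prints [c, d]
    }
}
```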

So, lots of progress, some problems. Any ideas?

Peri

April 04, 2006
Peri Hankey wrote:
> constants in java interfaces
> * Java interfaces can include constants (initialised at runtime)
> * quite tricky - useful feature - Walter, what do you think?

After some thought, I believe it's hard to motivate not implementing this in D too.