UNICODE operators

December 03, 2003

Posted by Mark Brudnak

When reading the D spec I noticed that it supports UNICODE UTF-8, UTF-16, UTF-32 source code formats. I propose that D extend its available set of operators (and maintain the current set) and draw from the unicode extensions for additional operators.  For example:

LOGICAL OPERATORS
==================
≤ (unicode 2264)     may be used instead of     <=
≥ (unicode 2265)     may be used instead of     >=
≠ (unicode 2260)     may be used instead of     !=
≟ (unicode 225F)     may be used instead of     ==
∧ (unicode 2227)     may be used instead of     &&
∨ (unicode 2228)     may be used instead of     ||

INFIX OPERATORS (may only be overloaded)
================
∘ (unicode 2218)    may be introduced as the Schur product
⋅ (unicode 22C5)    may be introduced as the dot product
× (unicode 00D7)    may be introduced as the cross product
∪ (unicode 222A)    may be introduced as the union of two sets

etc...

UNARY OPERATORS (may only be overloaded)
============
√ (unicode 221A)    may be introduced as the square root

These were just chosen to provide some examples.  There are a slew of symbols, most of which do not make sense in a programming environment. However, some of these symbols may be useful to those who wish to overload them for a particular class they are developing.

e.g.

a = b × c ;

is cleaner than

a = cross(b, c) ;

or worse yet

a = b.cross(c) ;

The largest difficulty with such a scheme is that our keyboards are not UNICODE friendly.  This I see as a problem for the editor and operating system and not so much for the D language itself.

Anyway ... your thoughts??

Mark.
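
For anyone who wants to try the three spellings side by side, here is a minimal sketch in present-day D. As far as I know, D only lets you overload its built-in operator tokens, so × itself is not available. The Vec3 type, the free cross function and the choice of ^ as a stand-in for × are all assumptions made for illustration, not something proposed in this post.

```d
import std.stdio;

// Hypothetical 3-component vector, for illustration only.
struct Vec3
{
    double x, y, z;

    // Member-function spelling: b.cross(c)
    Vec3 cross(Vec3 r) const
    {
        return Vec3(y * r.z - z * r.y,
                    z * r.x - x * r.z,
                    x * r.y - y * r.x);
    }

    // Operator spelling: b ^ c. The ^ token is an assumed stand-in,
    // since × is not an operator token the language recognizes.
    Vec3 opBinary(string op)(Vec3 r) const
        if (op == "^")
    {
        return cross(r);
    }
}

// Free-function spelling: cross(b, c)
Vec3 cross(Vec3 a, Vec3 b)
{
    return a.cross(b);
}

void main()
{
    auto b = Vec3(1, 0, 0);
    auto c = Vec3(0, 1, 0);
    auto expected = Vec3(0, 0, 1);

    assert(b.cross(c) == expected);
    assert(cross(b, c) == expected);
    assert((b ^ c) == expected);   // parentheses needed: ^ keeps its built-in precedence
    writeln(b ^ c);
}
```

Note that the borrowed ^ keeps the precedence the language already gives it, so the operator spelling needs parentheses in places where a real × presumably would not.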

December 03, 2003

Re: UNICODE operators

Posted by Georg Wrede
in reply to Mark Brudnak

In article <bqjndj$138p$1@digitaldaemon.com>, Mark Brudnak says...
>
>When reading the D spec I noticed that it supports UNICODE UTF-8, UTF-16, UTF-32 source code formats. I propose that D extend its available set of operators (and maintain the current set) and draw from the unicode
>[...]
>The largest difficulty with such a scheme is that our keyboards are not UNICODE friendly.  This I see as a problem for the editor and operating system and not so much for the D language itself.

I see it as a problem for code maintainers and debugging people. _They_ are not guaranteed to have the latest and most international OS version at hand, or if they do, they still might not be able to see or even type such characters.

December 03, 2003

Re: UNICODE operators

Posted by Walter
in reply to Mark Brudnak

These ideas have merit. Something useful ought to be done with unicode! The lack of a decent unicode keyboard is a problem, though, as it will be hard for anyone to type in the unicode operators.

December 03, 2003

Re: UNICODE operators

Posted by Sean L. Palmer
in reply to Mark Brudnak

I want more operators.  I am with you.  I want to take advantage of unicode.

I really see no reason why we should not be able to take any combination of characters that Unicode classifies as symbols, and make an operator out of it.  The designers of D cannot possibly predict all the operators people are going to need or want.

Sean

December 03, 2003

Re: UNICODE operators

Posted by Walter
in reply to Sean L. Palmer

"Sean L. Palmer" <palmer.sean@verizon.net> wrote in message news:bqlasv$d7e$1@digitaldaemon.com...
> I really see no reason why we should not be able to take any combination of
> characters that Unicode classifies as symbols, and make an operator out of it.
> The designers of D cannot possibly predict all the operators people are
> going to need or want.

Some problems:
1) the precedence level of those operators.
2) what this implies is user-definable tokens, which is a big problem with a
language that has as a design goal the ability to tokenize it without
needing to do parse or semantic analysis.
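
The second point is easier to see with a lexer in hand. What follows is only a sketch, assuming a small fixed set of Unicode operators (three from the list in the original post); the Tok, Token, isIdentChar and lex names are made up for illustration and do not come from the D compiler. With a fixed table, each operator is recognized from its code point alone, so tokenizing needs no parse or semantic information. User-definable operator tokens are harder, presumably because the lexer cannot know the table ahead of time.

```d
import std.stdio;
import std.uni : isAlpha, isWhite;
import std.utf : decode;

// Hypothetical token kinds; not taken from any real D compiler.
enum Tok { identifier, lessEqual, greaterEqual, notEqual, other }

struct Token
{
    Tok kind;
    string text;
}

// Simplified identifier rule: letters and underscore only (no digits).
bool isIdentChar(dchar c)
{
    return isAlpha(c) || c == '_';
}

Token[] lex(string src)
{
    Token[] toks;
    size_t i = 0;
    while (i < src.length)
    {
        immutable start = i;
        immutable dchar c = decode(src, i); // advances i past one code point
        if (isWhite(c))
            continue;

        // The token kind is decided from the code point alone.
        Tok kind;
        switch (c)
        {
            case '\u2264': kind = Tok.lessEqual;    break;
            case '\u2265': kind = Tok.greaterEqual; break;
            case '\u2260': kind = Tok.notEqual;     break;
            default:
                kind = isIdentChar(c) ? Tok.identifier : Tok.other;
                break;
        }

        if (kind == Tok.identifier)
        {
            // Extend the identifier while the next code point still fits.
            while (i < src.length)
            {
                size_t next = i;
                if (!isIdentChar(decode(src, next)))
                    break;
                i = next;
            }
        }
        toks ~= Token(kind, src[start .. i]);
    }
    return toks;
}

void main()
{
    // U+2264 written as an escape so the source itself stays ASCII.
    auto toks = lex("alpha \u2264 beta");
    assert(toks.length == 3);
    assert(toks[1].kind == Tok.lessEqual);
    foreach (t; toks)
        writeln(t.kind, " ", t.text);
}
```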

December 03, 2003

Re: UNICODE operators

Posted by Mark J. Brudnak
in reply to Sean L. Palmer

The UNICODE spec already defines a lot of mathematical symbols (hundreds of them). In my view combining ASCII symbols to form more operators is *not* the way to go.  It would make the syntax even more difficult to parse and would probably lead to ambiguous syntax.  A UNICODE character is one text symbol which can map to an operation (easy to parse).

In "ASCII-land" the best approach to arbitrary operators is to define them with strings along with some yet-to-be-defined "bracket operator" to delimit them.

For example, say I wanted to define some obscure binary operator like the vector exterior product. My operator would then be defined by a string like 'extprod' and some language-defined bracket, say <[ and ]>.  To call this operator, the code would look like this:

myBivector = oneVector <[extprod]> anotherVector ; /* traditional infix notation w/ bulky operator */

The operator would be defined as:

class ga {
    float [] vector ;
    int    size ;

    ga = <[extprod]>( ga vectorB) {
        /* compute the exterior product of 'this' and vectorB */
    }
}

It is bulky, but it would allow the definition of arbitrary operators in ASCII!  As was said earlier, UNICODE is the way to go, it has a defined symbol for the exterior product :^).
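
Something close to the <[extprod]> spelling can be approximated in present-day D without any new syntax, by overloading an existing operator on a small helper type. This is a workaround sketch, not the feature proposed above: Vec2, ExtProd, Pending and the a /extprod/ b spelling are all assumptions made for illustration.

```d
import std.stdio;

// Hypothetical 2-component vector, for illustration only.
struct Vec2
{
    double x, y;
}

// Hypothetical tag type standing in for the <[extprod]> bracket.
struct ExtProd
{
    // Handles "vec / extprod": remembers the left operand.
    Pending opBinaryRight(string op)(Vec2 lhs) const
        if (op == "/")
    {
        return Pending(lhs);
    }
}

// Intermediate value produced by "vec / extprod".
struct Pending
{
    Vec2 lhs;

    // Handles "(vec / extprod) / vec": computes the single bivector
    // component of the 2D exterior product.
    double opBinary(string op)(Vec2 rhs) const
        if (op == "/")
    {
        return lhs.x * rhs.y - lhs.y * rhs.x;
    }
}

enum extprod = ExtProd();

void main()
{
    auto a = Vec2(1, 0);
    auto b = Vec2(0, 1);

    double area = a /extprod/ b;   // parsed as (a / extprod) / b
    assert(area == 1);
    writeln(area);
}
```

The cost of this trick is that every operator defined this way inherits the precedence and associativity of /, whether or not that suits the operation.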

mark.

December 03, 2003

Re: UNICODE operators

Posted by Hauke Duden
in reply to Mark Brudnak

Mark Brudnak wrote:

> When reading the D spec I noticed that it supports UNICODE UTF-8, UTF-16,
> UTF-32 source code formats. I propose that D extend its available set of
> operators (and maintain the current set) and draw from the unicode
> extensions for additional operators.  For example:
> 
> LOGICAL OPERATORS
> ==================
> ? (unicode 2264)     may be used instead of     <=
> ? (unicode 2265)     may be used instead of     >=
> ? (unicode 2260)     may be used instead of     !=
> ? (unicode 225F)     may be used instead of     ==
> ? (unicode 2227)     may be used instead of     &&
> ? (unicode 2228)     may be used instead of     ||
> 

My email client shows '?' for all your suggestions. I expect most current code editors will do the same, since most programming languages use ASCII encoding for their source code.

It would be quite some task to figure out what another programmer meant when he wrote:

x = ((a ? b) ? c ? d ) ? e;
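
The '?' effect is easy to reproduce. A minimal sketch, assuming the mail client or editor does a lossy conversion to 7-bit ASCII (the toAsciiLossy helper is hypothetical, not a library function): every code point outside ASCII collapses to the same '?', so the distinct operators can no longer be told apart.

```d
import std.array : appender;
import std.stdio;

// Hypothetical helper imitating a lossy Unicode-to-ASCII conversion.
string toAsciiLossy(string s)
{
    auto result = appender!string();
    foreach (dchar c; s)               // decodes one code point per iteration
        result.put(c < 0x80 ? cast(char) c : '?');
    return result.data;
}

void main()
{
    // U+2264, U+2227 and U+2265 written as escapes so this file stays ASCII.
    auto original = "x = a \u2264 b \u2227 c \u2265 d;";
    assert(toAsciiLossy(original) == "x = a ? b ? c ? d;");
    writeln(toAsciiLossy(original));
}
```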

Some operating systems (e.g. Win9x) don't even have support for printing unicode text on the screen, unless the characters used happen to also be available in the current code page. So it would be close to impossible to write a proper Unicode code editor on those OSs.

And then, of course, there's the problem of entering such operators. My keyboard doesn't have any keys for (unicode 2264), (unicode 2265), etc.

It's a great idea, but currently I fear it is not practical.

Hauke

December 03, 2003

Re: UNICODE operators

Posted by Andy Friesen
in reply to Mark Brudnak

Mark Brudnak wrote:

> When reading the D spec I noticed that it supports UNICODE UTF-8, UTF-16,
> UTF-32 source code formats. I propose that D extend its available set of
> operators (and maintain the current set) and draw from the unicode
> extensions for additional operators.  For example:
> [...]
> The largest difficulty with such a scheme is that our keyboards are not
> UNICODE friendly.  This I see as a problem for the editor and operating
> system and not so much for the D language itself.
> 
> Any way ... your thoughts??
> 
> Mark.

Bjarne suggested something similar to this for C++ once: http://www.research.att.com/~bs/whitespace98.pdf

(yes, this is a joke)

 -- andy

December 03, 2003

Re: UNICODE operators

Posted by Ilya Minkov
in reply to Mark J. Brudnak

Mark J. Brudnak wrote:

> It is bulky, but it would allow the definition of arbitrary operators in
> ASCII!  As was said earlier, UNICODE is the way to go, it has a defined
> symbol for the exterior product :^).

You would need a big, no, really HUGE parser to handle these...

Because the parsing is not generic, you would have to set operator precedence by constructing a big... mess!

Or should all these operators have the same precedence?

Or even make it an error to rely on the precedence of these operators, like lint does?

Another idea: 'blabla' should be enough for the ascii infix notation.

-eye

December 03, 2003

Re: UNICODE operators

Posted by Sean L. Palmer
in reply to Walter

That really doesn't matter.  That's what Character Map or BabelMap are for!

Besides you'd likely be able to cut and paste them either from the header or the documentation.

If someone writes some code that uses weird unicode operators, you don't have to use it (or you can wrap it in ugly function call syntax).

Sean


"Walter" <walter@digitalmars.com> wrote in message news:bql9s5$bkg$1@digitaldaemon.com...
> These ideas have merit. Something useful ought to be done with unicode! The
> lack of a decent unicode keyboard is a problem, though, as it will be hard for anyone to type in the unicode operators.
