why ; ? (page 2)

May 05, 2008

Re: why ; ?

Posted by bearophile
in reply to Tomasz Sowinski

Tomasz Sowinski Wrote:

As you say, such things in D probably aren't going to change, but with some careful design most (I think all) of those problems can be solved.
And you can even go all the way :-)

This is a little old D example code of mine:

import std.stdio, std.stream, std.string, std.ctype, std.gc, std.file;

void traduct(char[] n, char[] digits, int start, char[][] words, char[][][char[]] gdict) {
    if (start >= digits.length)
        writefln(n, ": ", words.join(" "));
    else {
        auto found_word = false;
        for(auto i = start; i < digits.length; i++)
            if (digits[start .. i+1] in gdict) {
                found_word = true;
                foreach(hit; gdict[digits[start .. i+1]])
                    traduct(n, digits, i+1, words ~ [hit], gdict);
            }
        if (!found_word && (!words || (words && !std.ctype.isdigit(words[words.length-1][0]))))
            traduct(n, digits, start+1, words ~ [digits[start..start+1]], gdict);
    }
}

void main() {
    std.gc.disable();
    auto gtable = maketrans("ejnqrwxdsyftamcivbkulopghzEJNQRWXDSYFTAMCIVBKULOPGHZ",
                            "0111222333445566677788899901112223334455666777888999");

    size_t line_start;
    char[][][char[]] gdict;
    auto input_dict = cast(char[])std.file.read("dictionary.txt");

    foreach (current_pos, c; input_dict)
        if (c == '\n') { // words with DOS newlines too
            auto word = input_dict[line_start .. current_pos].strip();
            // word isn't a string, it's just a reference (start-end index) to
            //   the input_dict string, despite being stripped.
            gdict[word.translate(gtable, "\"")] ~= word;
            line_start = current_pos+1;
        }
    auto word = input_dict[line_start .. input_dict.length].strip();
    if (word.length > 0)
        gdict[word.translate(gtable, "\"")] ~= word;

    foreach(char[] n; new BufferedFile("input.txt"))
        traduct(n, n.removechars("/-"), 0, [], gdict);
}


The alternative version without ; and braces may look unusual for C programmers:


import std.stdio, std.stream, std.string, std.ctype, std.gc

void traduct(char[] n, char[] digits, int start, char[][] words, char[][][char[]] gdict):
    if (start >= digits.length):
        writefln(n, ": ", words.join(" "))
    else:
        auto found_word = false
        foreach(i; range(start, digits.length)):
            if (digits[start .. i+1] in gdict):
                found_word = true
                foreach(hit; gdict[digits[start .. i+1]]):
                    traduct(n, digits, i+1, words ~ [hit], gdict)
        if (!found_word && (!words || (words && !std.ctype.isdigit(words[words.length-1][0])))):
            traduct(n, digits, start+1, words ~ [digits[start..start+1]], gdict)

void main():
    std.gc.disable()
    auto gtable = maketrans("ejnqrwxdsyftamcivbkulopghzEJNQRWXDSYFTAMCIVBKULOPGHZ",
                            "0111222333445566677788899901112223334455666777888999")

    char[][][char[]] gdict
    foreach(char[] w; new BufferedFile("dictionary.txt")):
        gdict[w.translate(gtable, "\"")] ~= w.dup

    foreach(char[] n; new BufferedFile("input.txt")):
        traduct(n, n.removechars("/-"), 0, [], gdict)


It seems some people have tried that:

http://www.imitationpickles.org/pyplus/
http://blog.micropledge.com/2007/09/nobraces/
http://micropledge.com/projects/nobraces

But they use very simple means, so they fail in certain situations. To solve the problem better, a PyMeta (OMeta-based) parser may be useful: http://washort.twistedmatrix.com/
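To make "very simple means" concrete, here is a minimal Python sketch of a line-based converter from the colon-and-indent style back to braces and semicolons. It is an assumption about how such a preprocessor might work, not the code of any of the linked projects; the function name `indent_to_braces` and the four-space indent are made up for illustration.

```python
def indent_to_braces(source: str, indent: str = "    ") -> str:
    """Naively rewrite colon/indent blocks as brace blocks with ';' terminators."""
    out = []
    depth = 0  # number of currently open brace blocks
    for raw in source.splitlines():
        if not raw.strip():
            continue  # blank lines are simply dropped in this sketch
        stripped = raw.lstrip(" ")
        level = (len(raw) - len(stripped)) // len(indent)
        # Close every block that is deeper than the current line.
        while depth > level:
            depth -= 1
            out.append(indent * depth + "}")
        if stripped.endswith(":"):
            # A trailing colon opens a new block.
            out.append(indent * level + stripped[:-1] + " {")
            depth = level + 1
        else:
            # Everything else is assumed to be one complete statement.
            out.append(indent * level + stripped + ";")
    # Close whatever is still open at end of input.
    while depth > 0:
        depth -= 1
        out.append(indent * depth + "}")
    return "\n".join(out)
```

This handles the simple nested if/else shape, but because it treats every line as a complete statement, a call continued onto a second line (like the maketrans call above) gets a stray ; after its first line. That is the kind of situation where a real parser is needed.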

Bye,
bearophile

On Mon, 05 May 2008 12:13:48 +0400, Tomasz Sowinski <tomeksowi@gmail.com> wrote:
> Just another feature thought. Never gonna happen, but still...
>
> What's the reason of having lines end with a semicolon? Anything else than a legacy issue with C/C++?
>
> The only thing I can think of is having multiple statements in one line, but that only makes code unreadable. Wouldn't getting rid of ; improve readability?
>
>
> Tomek

Although Javascript has a C-style syntax, it doesn't force you to use semicolons. From my experience, the code doesn't get any readability improvements that way.

downs wrote:
> void main() { writefln("Hello World") int a float b = 4 writefln(a, " - ", b) return 0 }
>

eyes.. bleeding.. Please don't do that.

Another (non style) reason to have the ; is that it provides a bit of redundancy in the code. This results in you needing 2 errors before it compiles wrong rather than just one:

int j = 6;
void main()
{
    bob();
    writef("%d", j); // 5 or 6?
}

void bob()
{
    int i
    j = 5
    // this could be
    //int i, j = 5; // declare i and j as local var
    //int i; j = 5; // declare i and modify .j
}
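The two readings of `int i j = 5` can be checked mechanically. The following Python sketch uses a deliberately tiny toy grammar (an assumption for illustration, nothing like D's real grammar) and lists every way of inserting a single `,` or `;` that turns the token run into a valid program.

```python
import re

# Toy grammar (assumed for illustration only):
#   declaration: int NAME {, NAME} [= NUMBER] ;
#   assignment:  NAME = NUMBER ;
NAME = r"(?!int\b)[A-Za-z_]\w*"
STATEMENT = rf"(?:int {NAME}(?: , {NAME})*(?: = \d+)?|{NAME} = \d+) ;"
PROGRAM = re.compile(rf"(?:{STATEMENT}(?: |$))+")


def is_valid(tokens):
    """True if the space-joined tokens form a sequence of toy statements."""
    return PROGRAM.fullmatch(" ".join(tokens)) is not None


def single_repairs(tokens):
    """Try one ',' or ';' in every gap (plus a final ';') and keep valid results."""
    repairs = []
    for pos in range(1, len(tokens)):
        for sep in (",", ";"):
            candidate = tokens[:pos] + [sep] + tokens[pos:] + [";"]
            if is_valid(candidate):
                repairs.append(" ".join(candidate))
    return repairs


tokens = ["int", "i", "j", "=", "5"]
print(is_valid(tokens + [";"]))   # the run as written is rejected
print(single_repairs(tokens))     # both the ',' and the ';' reading survive
```

With mandatory separators the compiler rejects the text and the programmer has to pick one of the two repairs; a rule that silently treats the newline as a terminator picks the second reading without asking.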

Tomasz Sowinski wrote:
>
> There is a meaningful newline character anyway to know where the // comment ends

As a side issue, that is a lexical effect. The newline is part of the comment token, so it likely never even shows up in the parser.
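A small Python sketch of that lexical effect, assuming a regex-based tokenizer (the token names and the tiny token set are invented for the example, not taken from the D lexer): the comment rule consumes the newline itself, so the token list handed to a parser contains no newline at all.

```python
import re

TOKEN = re.compile(r"""
    (?P<comment>//[^\n]*\n?)   # line comment, including its terminating newline
  | (?P<space>\s+)             # any other whitespace, newlines included
  | (?P<word>[A-Za-z_]\w*)
  | (?P<number>\d+)
  | (?P<punct>[=;(){},])
""", re.VERBOSE)


def tokenize(source):
    """Return (kind, text) pairs; comments and whitespace are dropped here."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        if match.lastgroup not in ("comment", "space"):
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


print(tokenize("j = 5; // set j\nk = 6;"))
```

Making newlines significant would mean changing the `space` and `comment` rules to emit a token, which is a different design from the one described in the reply.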

bearophile Wrote:
> The alternative version without ; and braces may look unusual for C programmers:
> [...]

I like the version without ; but how can you tell where a block ends without braces? indents?

Tomasz Sowinski wrote:
> What's the reason of having lines end with a semicolon?

A well designed language has some redundancy built in. The reason for the redundancy is so the compiler can detect and diagnose errors. If there was no redundancy, any random stream of characters, i.e.

    oidhfoi123413j4h1ohsc!@#$%^&*(vjkasdasdf

would be a valid program.

Having the ; end statements provides a nice "anchor" point for the parser. It means that what comes before it must form a grammatically correct statement. Otherwise, the compiler must hopefully keep scanning forward, and then try all kinds of parse trees out on the jumble of tokens looking for a set of statements that will fit it.

Furthermore, when the compiler does diagnose an error, error recovery can be as simple as "skip forward to the ;, then restart the statement parser." Without such an anchor, you'll get one error message followed by a cascade of useless drivel.

BTW, double entry bookkeeping, invented in the middle ages, was a huge advance in accounting. It essentially made everything redundant, which helped find and correct arithmetic errors. It spawned the term "balancing the books" which is nothing more than tracking down and reconciling all the errors. Without the redundancy, there'd be no way to balance the books because there'd be no way to detect errors.
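The "skip forward to the ;, then restart the statement parser" recovery can be sketched in a few lines of Python. The toy language here (only `name = number;` statements) and the function name `check_statements` are assumptions made for the example; a real compiler's recovery is more involved.

```python
import re


def check_statements(source):
    """Check a toy language of `name = number;` statements.

    Returns (accepted, errors): the statements that parsed and one message
    per broken statement.
    """
    tokens = re.findall(r"[A-Za-z_]\w*|\d+|[=;]|\S", source)
    accepted, errors = [], []
    i = 0
    while i < len(tokens):
        stmt = tokens[i:i + 4]
        if (len(stmt) == 4
                and re.fullmatch(r"[A-Za-z_]\w*", stmt[0])
                and stmt[1] == "="
                and stmt[2].isdigit()
                and stmt[3] == ";"):
            accepted.append(" ".join(stmt[:3]))
            i += 4
            continue
        errors.append(f"bad statement starting at token {i}: {tokens[i]!r}")
        # Recovery: skip forward to the next ';' and restart right after it.
        while i < len(tokens) and tokens[i] != ";":
            i += 1
        i += 1
    return accepted, errors


accepted, errors = check_statements("a = 1; b = = 2; c = 3;")
print(accepted)
print(errors)
```

Because the ; marks where the broken statement ends, the sketch reports one error for `b = = 2` and still accepts the statement after it, instead of misreading everything that follows.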

Jarrett Billingsley wrote:
> The one major downside to this change in the grammar, however, is that it makes lexical analysis dependent upon syntactic analysis, since the significance of newlines depends upon the current construct being parsed. D prides itself on having no such interdependencies, and you'd be hard-pressed to convince Walter to do otherwise.

It's not pride based. There are sound technical reasons.
