April 20, 2012
I'm writing some code that does some very simplistic parsing, and I'm just totally geeking out on how awesome D is for writing such code:

	import std.conv;
	import std.regex;
	import std.stdio;
	import std.string;	// for format()

	struct Data {
		string name;
		string phone;
		int age;
		... // a whole bunch of other stuff
	}

	void main() {
		Data d;
		foreach (line; stdin.byLine()) {
			auto m = match(line, `(\w+)\s+(\w+)`);
			if (!m) continue;

			auto key = m.captures[1];
			auto value = m.captures[2];

			alias void delegate(string key, string value) attrDg;
			attrDg[string] dgs = [
				"name": delegate(string key, string value) {
					d.name = value;
				},
				"phone": delegate(string key, string value) {
					d.phone = value;
				},
				"age": delegate(string key, string value) {
					d.age = to!int(value);
				},
				...	// whole bunch of other stuff to
					// parse different attributes
			];
			attrDg errordg = delegate(string key, string value) {
				throw new Exception("Invalid attribute '%s'"
					.format(key));
			};

			// This is pure awesomeness:
			dgs.get(key.idup, errordg)(key.idup, value.idup);
		}
		// ... do something with Data
	}

Basically, I use std.regex to extract keywords from the input, then use an AA to map each keyword to the code that implements it.  That AA of delegates is just pure awesomeness. AA.get's default value parameter lets you process keywords and handle errors with a single AA lookup.  I mean, this is even better than Perl for this kind of text-processing code!
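A minimal, self-contained sketch of the single-lookup dispatch described above. The names `Handler`, `applyPairs` and `onUnknown` are made up for illustration, and the input is a plain array of pairs rather than regex captures:

```d
import std.conv : to;

struct Data {
    string name;
    int age;
}

// Hypothetical alias, standing in for attrDg in the post.
alias Handler = void delegate(string key, string value);

// Applies [key, value] pairs to a fresh Data; unknown keys throw.
Data applyPairs(string[][] pairs) {
    Data d;

    // Keyword -> handler table; every handler closes over `d`.
    Handler[string] handlers = [
        "name": delegate(string key, string value) { d.name = value; },
        "age":  delegate(string key, string value) { d.age = to!int(value); },
    ];

    // Fallback returned by get() when the key is not in the table.
    Handler onUnknown = delegate(string key, string value) {
        throw new Exception("Invalid attribute '" ~ key ~ "'");
    };

    foreach (pair; pairs) {
        // One lookup picks either the real handler or the fallback.
        handlers.get(pair[0], onUnknown)(pair[0], pair[1]);
    }
    return d;
}

void main() {
    auto d = applyPairs([["name", "Alice"], ["age", "42"]]);
    assert(d.name == "Alice" && d.age == 42);
}
```

Passing an unknown key such as "colour" ends up in `onUnknown`, so the error path needs no separate `in` check.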

The only complaint is that I couldn't write auto[string] dgs and have the compiler auto-infer the delegate type. :-) Additionally, I wasn't sure if I could omit the "delegate(string,string)" after each keyword; if that's actually allowed, then this would make D totally pwn Perl!!

(I left out some stuff that makes this code even more of a joy to write: using nested try/catch blocks, I can throw exceptions from deep-down parsing code and have the loop that loops over input lines automatically prefix error messages with the filename/line number where the error occurred. This way, even errors thrown by to!int() will be formatted nicely. With Perl, this gets extremely messy due to its pathological use of $. for line numbers which can get overwritten in unexpected places if you're processing more than one file at a time.)
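The left-out code is not shown in the post, so the following is only a guess at its general shape: a try/catch inside the line loop that prefixes any exception message with the file name and line number. `parseAge` and `checkLines` are hypothetical names, and errors are collected into an array here instead of being printed or rethrown.

```d
import std.conv : to;
import std.format : format;
import std.stdio : writeln;

// Hypothetical "deep-down" parsing routine; to!int throws on bad input.
int parseAge(string text) {
    return to!int(text);
}

// Parses every line and returns one message per failing line,
// each prefixed with "file:line: ".
string[] checkLines(string fileName, string[] lines) {
    string[] errors;
    foreach (i, line; lines) {
        try {
            parseAge(line);
        } catch (Exception e) {
            // i is zero-based, so add 1 for a human-readable line number.
            errors ~= format("%s:%s: %s", fileName, i + 1, e.msg);
        }
    }
    return errors;
}

void main() {
    foreach (msg; checkLines("people.txt", ["42", "forty-two", "7"]))
        writeln(msg);
}
```

Because the counter lives in the loop itself, it cannot be clobbered by another file being read at the same time, which is the problem described with Perl's `$.`.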

Did I mention I'm totally in love with D?? Seriously. It can handle system-level code and "high-level" text-processing code with total impunity. What's there not to like?!


T

-- 
Without geometry, life would be pointless. -- VS
April 20, 2012
"H. S. Teoh" <hsteoh@quickfur.ath.cx> wrote in message news:mailman.1953.1334894800.4860.digitalmars-d@puremagic.com...
> I'm writing some code that does some very simplistic parsing, and I'm just totally geeking out on how awesome D is for writing such code:
>

Heh, yup :)

I grew up on C/C++ (well, after outgrowing BASIC anyway), and one of the first things that blew me away about D was its string-processing.

>
> alias void delegate(string key, string value) attrDg;
> attrDg[string] dgs = [
> "name": delegate(string key, string value) {
> d.name = value;
> },
> "phone": delegate(string key, string value) {
> d.phone = value;
> },
> "age": delegate(string key, string value) {
> d.age = to!int(value);
> },
> ... // whole bunch of other stuff to
> // parse different attributes
> ];

Yea, I've done the same trick :) Fantastic stuff. But the one issue I have with it is that you can't do this:

void delegate()[string] dgs = [
    "name": delegate() {
        // do stuff
    },
    "phone": delegate() {
        // do stuff
        dgs["name"](); // ERR! (Shit!)
        // do stuff
    }
];

That limitation is kind of annoying sometimes. I think I filed a ticket for it...

http://d.puremagic.com/issues/show_bug.cgi?id=3995

Ahh, shit, it's been marked invalid :(
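One possible workaround (not from the ticket, just a sketch): declare the AA first and fill it in afterwards, so that the name is already in scope when the delegate literals are compiled. `callOrder` and `log` are made-up names for the demo.

```d
// Returns the order in which the handlers ran.
string[] callOrder() {
    string[] log;

    // Declared before any handler is written, so handlers may refer to it.
    void delegate()[string] dgs;

    dgs["name"] = delegate() { log ~= "name"; };
    dgs["phone"] = delegate() {
        log ~= "phone";
        dgs["name"]();   // fine here: dgs is an ordinary captured variable
    };

    dgs["phone"]();
    return log;
}

void main() {
    assert(callOrder() == ["phone", "name"]);
}
```

The cost is losing the single-literal form; the lookup of `"name"` happens when `"phone"` is called, not when it is defined.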

>Did I mention I'm totally in love with D?? Seriously. It can handle system-level code and "high-level" text-processing code with total impunity. What's there not to like?!

Yup. Like, totally. :)


April 20, 2012
20.04.2012 8:06, H. S. Teoh wrote:
> I'm writing some code that does some very simplistic parsing, and I'm
> just totally geeking out on how awesome D is for writing such code:
>
> 	import std.conv;
> 	import std.regex;
> 	import std.stdio;
>
> 	struct Data {
> 		string name;
> 		string phone;
> 		int age;
> 		... // a whole bunch of other stuff
> 	}
>
> 	void main() {
> 		Data d;
> 		foreach (line; stdin.byLine()) {
> 			auto m = match(line, "(\w+)\s+(\w+)");

It's better not to create a regex every iteration. Use e.g.
---
auto regEx = regex(`(\w+)\s+(\w+)`);
---
before the foreach. Of course, you are not claiming this is a high-performance program, but creating a regex on every iteration is too common a mistake to show in code that newbies will read.
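A small sketch of that hoisting, run over an array of strings so it is self-contained. `extractPairs` is a made-up name, and it uses `matchFirst`, a newer std.regex call than the `match` in the original post:

```d
import std.regex : matchFirst, regex;

// Returns [key, value] for every line of the form "key value".
string[][] extractPairs(string[] lines) {
    // Compiled once, before the loop. The backtick (raw) string keeps
    // \w and \s intact for the regex engine.
    auto re = regex(`(\w+)\s+(\w+)`);

    string[][] pairs;
    foreach (line; lines) {
        auto m = matchFirst(line, re);
        if (m.empty) continue;      // this line does not match
        pairs ~= [m[1], m[2]];      // capture groups 1 and 2
    }
    return pairs;
}

void main() {
    auto pairs = extractPairs(["name Alice", "???", "age 42"]);
    assert(pairs == [["name", "Alice"], ["age", "42"]]);
}
```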

> 			if (!m) continue;
>
> 			auto key = m.captures[1];

A single `.idup` here would be better than two later on. (Sorry, I just like to nitpick.)
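Why the copy is needed at all: `byLine` reuses its buffer between iterations, so slices of the current line are only valid until the next line is read. A sketch with a hand-made buffer standing in for the one `byLine` owns (`copyThenClobber` is a hypothetical name):

```d
// Copies the buffer's contents once, then overwrites the buffer the way
// reading the next line would, and returns the copy.
string copyThenClobber(char[] buffer) {
    const(char)[] capture = buffer;   // a slice, like m.captures[1]
    string key = capture.idup;        // one immutable copy, made up front

    buffer[] = 'x';                   // the "next line" overwrites the buffer

    assert(capture[0] == 'x');        // the slice now sees the new contents
    return key;                       // the copy is unaffected
}

void main() {
    assert(copyThenClobber("name".dup) == "name");
}
```

Making the copy where the capture is taken means the later code can pass `key` around as an ordinary `string` without calling `.idup` again.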

> 			auto value = m.captures[2];
>
> 			alias void delegate(string key, string value) attrDg;
> 			attrDg[string] dgs = [
> 				"name": delegate(string key, string value) {
> 					d.name = value;
> 				},
> 				"phone": delegate(string key, string value) {
> 					d.phone = value;
> 				},
> 				"age": delegate(string key, string value) {
> 					d.age = to!int(value);
> 				},
> 				...	// whole bunch of other stuff to
> 					// parse different attributes
> 			];
> 			attrDg errordg = delegate(string key, string value) {
> 				throw Exception("Invalid attribute '%s'"
> 					.format(key));
> 			};
>
> 			// This is pure awesomeness:
> 			dgs.get(key.idup, errordg)(key.idup, value.idup);
> 		}
> 		// ... do something with Data
> 	}
>
> Basically, I use std.regex to extract keywords from the input, then use
> an AA to map keywords to code that implement said keyword.  That AA of
> delegates is just pure awesomeness. AA.get's default value parameter
> lets you process keywords and handle errors with a single AA lookup.  I
> mean, this is even better than Perl for this kind of text-processing
> code!
>
> The only complaint is that I couldn't write auto[string] dgs and have
> the compiler auto-infer the delegate type. :-) Additionally, I wasn't
> sure if I could omit the "delegate(string,string)" after each keyword;
> if that's actually allowed, then this would make D totally pwn Perl!!

A shorter variant:
---
void delegate(string, string)[string] dgs = [
	"name" : (key, value) { d.name = value; },
	"phone": (key, value) { d.phone = value; },
	"age"  : (key, value) { d.age = to!int(value); },
	...	// whole bunch of other stuff to
		// parse different attributes
];

// `delegate` is needed because otherwise `errordg` will be inferred
// as a `function`, not `delegate` and `dgs.get` will fail
auto errordg = delegate(string key, string value) {
	throw new Exception("Invalid attribute '%s'"
		.format(key));
};
---

>
> (I left out some stuff that makes this code even more of a joy to write:
> using nested try/catch blocks, I can throw exceptions from deep-down
> parsing code and have the loop that loops over input lines automatically
> prefix error messages with the filename/line number where the error
> occurred. This way, even errors thrown by to!int() will be formatted
> nicely. With Perl, this gets extremely messy due to its pathological use
> of $. for line numbers which can get overwritten in unexpected places if
> you're processing more than one file at a time.)
>
> Did I mention I'm totally in love with D?? Seriously. It can handle
> system-level code and "high-level" text-processing code with total
> impunity. What's there not to like?!
>
>
> T
>


-- 
Денис В. Шеломовский
Denis V. Shelomovskij
April 20, 2012
Denis Shelomovskij wrote:
> A shorter variant:
> ---
> void delegate(string, string)[string] dgs = [
> 	"name" : (key, value) { d.name = value; },
> 	"phone": (key, value) { d.phone = value; },
> 	"age"  : (key, value) { d.age = to!int(value); },
> 	...	// whole bunch of other stuff to
> 		// parse different attributes
> ];

That's a pretty slick example of D's type inference. This example is worthy of a reference in the docs somewhere, IMO. Although, written to use UFCS of course:

    auto m = line.match(`(\w+)\s+(\w+)`);

    ...

    "age" : (key, value) { d.age = value.to!int(); }

:D gotta love UFCS!
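For anyone who has not met UFCS yet, a tiny sketch: `a.f(b)` is rewritten to `f(a, b)` when `a` has no member named `f`, so both spellings below call the same functions. `parseTrimmed` is a made-up helper.

```d
import std.conv : to;
import std.string : strip;

// Strips surrounding whitespace, then converts to int, written as a UFCS chain.
int parseTrimmed(string text) {
    return text.strip.to!int;   // same as to!int(strip(text))
}

void main() {
    assert(to!int("42") == "42".to!int);
    assert(parseTrimmed(" 42 ") == 42);
}
```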
April 20, 2012
On 20.04.2012 8:44, Denis Shelomovskij wrote:
> 20.04.2012 8:06, H. S. Teoh wrote:
>> I'm writing some code that does some very simplistic parsing, and I'm
>> just totally geeking out on how awesome D is for writing such code:
>>
>> import std.conv;
>> import std.regex;
>> import std.stdio;
>>
>> struct Data {
>> string name;
>> string phone;
>> int age;
>> ... // a whole bunch of other stuff
>> }
>>
>> void main() {
>> Data d;
>> foreach (line; stdin.byLine()) {
>> auto m = match(line, "(\w+)\s+(\w+)");
>
> It's better not to create a regex every iteration. Use e.g.
> ---
> auto regEx = regex(`(\w+)\s+(\w+)`);
> ---
> before foreach. Of course, you are not claiming this as a
> high-performance program, but creating a regex every iteration is too
> common mistake to show such code to newbies.

And that's why I plugged this hole - it happens too often. At least up to mm... 16 regexes are cached.



-- 
Dmitry Olshansky
April 20, 2012
On 20.04.2012 9:19, F i L wrote:
> Denis Shelomovskij wrote:
>> A shorter variant:
>> ---
>> void delegate(string, string)[string] dgs = [
>> "name" : (key, value) { d.name = value; },
>> "phone": (key, value) { d.phone = value; },
>> "age" : (key, value) { d.age = to!int(value); },
>> ... // whole bunch of other stuff to
>> // parse different attributes
>> ];
>

How about putting it on dlang.org? (In the [your code here] section.)

-- 
Dmitry Olshansky
April 20, 2012
On 2012-04-20 06:06, H. S. Teoh wrote:
> I'm writing some code that does some very simplistic parsing, and I'm
> just totally geeking out on how awesome D is for writing such code:
>
> 	import std.conv;
> 	import std.regex;
> 	import std.stdio;
>
> 	struct Data {
> 		string name;
> 		string phone;
> 		int age;
> 		... // a whole bunch of other stuff
> 	}
>
> 	void main() {
> 		Data d;
> 		foreach (line; stdin.byLine()) {
> 			auto m = match(line, "(\w+)\s+(\w+)");
> 			if (!m) continue;
>
> 			auto key = m.captures[1];
> 			auto value = m.captures[2];
>
> 			alias void delegate(string key, string value) attrDg;
> 			attrDg[string] dgs = [
> 				"name": delegate(string key, string value) {
> 					d.name = value;
> 				},
> 				"phone": delegate(string key, string value) {
> 					d.phone = value;
> 				},
> 				"age": delegate(string key, string value) {
> 					d.age = to!int(value);
> 				},
> 				...	// whole bunch of other stuff to
> 					// parse different attributes
> 			];
> 			attrDg errordg = delegate(string key, string value) {
> 				throw Exception("Invalid attribute '%s'"
> 					.format(key));
> 			};
>
> 			// This is pure awesomeness:
> 			dgs.get(key.idup, errordg)(key.idup, value.idup);
> 		}
> 		// ... do something with Data
> 	}
>
> Basically, I use std.regex to extract keywords from the input, then use
> an AA to map keywords to code that implement said keyword.  That AA of
> delegates is just pure awesomeness. AA.get's default value parameter
> lets you process keywords and handle errors with a single AA lookup.  I
> mean, this is even better than Perl for this kind of text-processing
> code!
>
> The only complaint is that I couldn't write auto[string] dgs and have
> the compiler auto-infer the delegate type. :-) Additionally, I wasn't
> sure if I could omit the "delegate(string,string)" after each keyword;
> if that's actually allowed, then this would make D totally pwn Perl!!

I think you should be able to write:

"age": (key, value) {
    d.age = to!int(value);
}

Or perhaps even:

"age": (key, value) => d.age = to!int(value);

-- 
/Jacob Carlborg
April 20, 2012
On Fri, 20 Apr 2012 00:06:41 -0400, H. S. Teoh <hsteoh@quickfur.ath.cx> wrote:


> The only complaint is that I couldn't write auto[string] dgs and have
> the compiler auto-infer the delegate type. :-)

Does this not work?

auto dgs = ...

Also, it doesn't look like that needs to be in the inner loop.  Every time an AA literal is evaluated, a new AA is allocated, so you are allocating a fresh AA for every input line.

-Steve
April 20, 2012
As a D learner, I find this thread very interesting. It would be great to follow it up with a polished, error-catching version that incorporates people's tweaks.
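A rough attempt at such a version, folding in the tweaks from this thread: regex and AA built once outside the loop, a single `.idup` per capture, the shorter lambda syntax, `throw new`, and a line-number prefix on errors. Treat it as a sketch rather than anyone's final code. `parseData`, `AttrDg` and the `sourceName` parameter are made up here, `matchFirst` is a newer std.regex call than the `match` used above, and the elided "whole bunch of other stuff" is left out.

```d
import std.conv : to;
import std.format : format;
import std.regex : matchFirst, regex;
import std.stdio : stdin, writeln;

struct Data {
    string name;
    string phone;
    int age;
}

alias AttrDg = void delegate(string key, string value);

// Reads "key value" lines from any range of lines (stdin.byLine(), or an
// array of strings) and fills in a Data. Errors are rethrown with a
// "source:line: " prefix.
Data parseData(Lines)(Lines lines, string sourceName) {
    Data d;

    // Both built once, outside the loop.
    auto re = regex(`(\w+)\s+(\w+)`);
    AttrDg[string] dgs = [
        "name":  (key, value) { d.name = value; },
        "phone": (key, value) { d.phone = value; },
        "age":   (key, value) { d.age = to!int(value); },
    ];
    AttrDg errordg = delegate(string key, string value) {
        throw new Exception(format("Invalid attribute '%s'", key));
    };

    size_t lineNo = 0;
    foreach (line; lines) {
        ++lineNo;
        auto m = matchFirst(line, re);
        if (m.empty) continue;          // skip lines that don't match

        // byLine reuses its buffer, so copy each capture exactly once.
        string key = m[1].idup;
        string value = m[2].idup;

        try {
            dgs.get(key, errordg)(key, value);
        } catch (Exception e) {
            // Covers unknown keys and conversion errors from to!int alike.
            throw new Exception(
                format("%s:%s: %s", sourceName, lineNo, e.msg));
        }
    }
    return d;
}

void main() {
    auto d = parseData(stdin.byLine(), "<stdin>");
    writeln(d);
}
```

Taking the lines as a template parameter is only there so the same function can be fed an array of strings when experimenting, e.g. `parseData(["name Alice", "age 42"], "test")`.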