September 01, 2008 Re: "Protocol Buffers" for Tango & Phobos ??
Posted in reply to Nick B

== Quote from Nick B (nick.barbalich@gmail.com)'s article
> Hi
> I came across this the other day, and since no one has mentioned it on this
> newsgroup before, I thought I would bring the subject to the
> community's attention, so to speak.
> Google has very recently open sourced "Protocol Buffers".
> What is it, you ask? In a couple of lines: it is a language-neutral,
> platform-neutral, extensible way of serializing structured data for use
> in communications protocols, data storage, and more.
> Think XML, but smaller, faster, and simpler.
> Why not just use XML ?
> Protocol buffers have many advantages over XML for serializing
> structured data. Protocol buffers:
> * are simpler
> * are 3 to 10 times smaller
> * are 20 to 100 times faster
> * are less ambiguous
> * generate data access classes that are easier to use programmatically.
> PB supports Java, Python, and C++ currently.
> A more detailed overview can be found here:
> http://code.google.com/apis/protocolbuffers/docs/overview.html
> and the FAQ here:
> http://code.google.com/apis/protocolbuffers/docs/faq.html
> See the answer to the question "Can I add support for a new language to
> protocol buffers?" inside the FAQ.
> Some Tips and comments can be found here:
> http://zunger.livejournal.com/164024.html
> My questions.
> Does the D community see this of interest ?
> Is this something they might use ?
> Do they see value in it being added to the respective
> Tango or Phobos frameworks ?
> any other comments ?
> cheers
> Nick B.
Hate to say it but this is yet another case of reinventing the wheel. The worst thing about this throwback to the early 90s is its inherent violation of DRY. This package intermingles and confuses three separate issues, treating them as a monolithic whole, namely: serialization, marshalling, and versioning.
Binary solutions such as this, while more efficient byte-wise, run into portability problems especially with floating point values. They also lose the human readability of the data (sans the use of special tools). Often the use of a binary solution is a case of premature optimization and indicates bad design at a higher level. The marshalling strategy should be 'pluggable' so that one can use an easier to debug, human readable, data format during development.
Versioning problems can be (and have been) addressed in a number of ways over the years. The simplest, and imo often the best, is to sidestep the problem by serializing a map of the data rather than just the raw data. That is, each value has associated metadata indicating its field name (a key-value mapping iow).
If XML is too heavy a hammer, JSON (with or without embedded metadata) is a good
alternative with quite a bit of industry support. In fact, the use of JSON
encoding with embedded metadata gives you a lightweight solution that works with
nearly every language in present use. Even better, any language with good
reflection support (such as Java) allows an implementation that does not violate DRY.
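
The key-value approach described above can be sketched roughly as follows. This is only an illustration, not code from the thread: the record fields, the `V2_DEFAULTS` table, and the `decode_v2` helper are all invented here. It shows a reader written for a newer schema accepting both older and newer messages, because every value travels with its field name.

```python
import json

# Hypothetical schema for "version 2" of a record; "email" was added in v2.
V2_DEFAULTS = {"name": "", "age": 0, "email": None}

def encode(record: dict) -> str:
    """Serialize a record as a JSON object, so each value carries its field name."""
    return json.dumps(record, sort_keys=True)

def decode_v2(text: str) -> dict:
    """Read with the v2 schema: missing fields get defaults, unknown fields are dropped."""
    data = json.loads(text)
    return {key: data.get(key, default) for key, default in V2_DEFAULTS.items()}

# A message written by an older program that knows nothing about "email".
old_message = encode({"name": "alice", "age": 30})
# A message written by a later program that added a field v2 does not know.
new_message = encode({"name": "bob", "age": 40, "email": "bob@example.com", "nick": "b"})

print(decode_v2(old_message))
print(decode_v2(new_message))
```

The cost is that the field names are repeated in every message, which is the size overhead discussed elsewhere in this thread.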
This "Protocol Buffers" solution is a dinosaur, a throwback, yet another wheel when we need ground effects or anti-gravity.
My 2c,
Brian
September 01, 2008 Re: "Protocol Buffers" for Tango & Phobos ??
Posted in reply to Brian Price

Brian Price, on September 1 at 14:31, you wrote to me:
> Hate to say it but this is yet another case of reinventing the wheel. The worst thing about this throwback to the early 90s is its inherent violation of DRY. This package intermingles and confuses three separate issues, treating them as a monolithic whole, namely: serialization, marshalling, and versioning.

Different problems have different solutions. What you say is really nice and makes sense when *latency* and other performance issues are not a problem for you (it's obviously a problem for Google and for a lot of other people, like me; I was writing a little "framework", similar to Google's PB, at work just before they released it =).

When you deal with (almost) realtime systems, you can't pay the price for human readability, extra parsing time, extra abstraction and extra bytes thrown through the wire. You just can't.

You don't even care about portability (this kind of thing usually runs in a very controlled environment).

I just wanted to clarify this.

--
Leandro Lucarella (luca) | Group blog: http://www.mazziblog.com.ar/blog/
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05)
PARENTS REPORT THAT THEIR DAUGHTER RAN OFF WITH A YOUNG PARAGUAYAN -- Crónica TV
September 01, 2008 Re: Array append performance 2
Posted in reply to bearophile

This is the first beta version. It seems to work well enough (despite memory overflows that are far too easy to trigger). I'll add ddoc comments and better comments, clean up the code, improve the unit tests, etc. If you spot problems I'd like to know about them.

You will probably find the finished version inside my libs in one or two days.

I presume Walter isn't interested in putting this into Phobos, nor in 'fixing' the built-in arrays (or maybe he's just busy).

Bye,
bearophile
September 01, 2008 Re: Array append performance 2
Posted in reply to bearophile

Well, a finished version of ArrayBuilder is in my libs, in the 'builders' module:
http://www.fantascienza.net/leonardo/so/libs_d.zip

I have already used it to replace the old string appending used in putr() (the printing functions), and now printing is about 2.5 times faster (using it for that purpose is also a way to run hundreds of unit tests on ArrayBuilder), so I'm quite pleased.

Bye, and thank you for all the help and answers (though more bugs may still be present),
bearophile
September 02, 2008 Re: Array append performance 2
Posted in reply to bearophile

bearophile wrote:
> Well, a finished version of ArrayBuilder is in my libs, in the 'builders' module:
> http://www.fantascienza.net/leonardo/so/libs_d.zip
>
> I have already used it to replace the old string appending used in the putr() (printing functions) and now printing is about 2.5 times faster (and using it for that purpose is a way to perform hundreds of unit tests on ArrayBuilder), so I'm quite pleased.
>
> Bye, and thank you for all the help and answers (but more bugs can be present still),
> bearophile
Very cool.
Philosophically, though, isn't the whole purpose of having dynamic arrays in D to avoid having to create library implementations like this?
I can definitely appreciate how something like this provides proof-of-concept for compiler & stdlib improvements, but in the end, having to rely on an array-builder class seems very non-D-ish.
Thoughts?
--benji
September 02, 2008 Re: Array append performance 2
Posted in reply to Benji Smith

Benji Smith:
> Philosophically, though, isn't the whole purpose of having dynamic arrays in D to avoid having to create library implementations like this?

I agree, but so far I am not able to modify/improve the language. And an ArrayBuilder isn't meant to replace dynamic arrays; it's just a way to build them faster, "patching" what I perceive as one of their problems.

> I can definitely appreciate how something like this provides proof-of-concept for compiler & stdlib improvements,

My libs are meant to improve the std libs, if/where/when the compiler can't be modified. They are non-standard, but they are largish libs anyway, and I'm doing my best in developing them :-)

Bye,
bearophile
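
Why a builder can make appending faster can be sketched as follows. This is not bearophile's implementation (his D code lives in the linked libs and is not reproduced in this thread); the class below is a hypothetical Python stand-in that assumes the usual technique of keeping spare capacity and doubling it when full. The `copies` counter and `naive_copies` helper exist only to make the cost visible, and the naive count is a worst case for comparison, not a claim about what the D runtime does.

```python
class ArrayBuilder:
    """Hypothetical builder: keeps spare capacity and doubles it when full."""

    def __init__(self):
        self._buf = [None] * 4   # preallocated storage
        self._len = 0            # number of slots in use
        self.copies = 0          # elements moved during regrowth (illustration only)

    def append(self, item):
        if self._len == len(self._buf):
            # Grow geometrically, so regrowth happens only O(log n) times.
            new_buf = [None] * (2 * len(self._buf))
            new_buf[:self._len] = self._buf
            self.copies += self._len
            self._buf = new_buf
        self._buf[self._len] = item
        self._len += 1

    def to_array(self):
        """Return the built array, trimmed to the used length."""
        return self._buf[:self._len]


def naive_copies(n):
    """Copies made if every single append reallocated and copied the whole array."""
    return sum(range(n))  # 0 + 1 + ... + (n - 1)


builder = ArrayBuilder()
for i in range(1000):
    builder.append(i)

print(builder.copies, naive_copies(1000))
```

With doubling, the total number of element copies stays proportional to the final length, whereas copying on every append grows quadratically.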
September 02, 2008 Re: Array append performance 2
Posted in reply to Benji Smith

Reply to Benji,
> bearophile wrote:
>
>> Well, a finished version of ArrayBuilder is in my libs, in the
>> 'builders' module: http://www.fantascienza.net/leonardo/so/libs_d.zip
>>
>> I have already used it to replace the old string appending used in
>> the putr() (printing functions) and now printing is about 2.5 times
>> faster (and using it for that purpose is a way to perform hundreds of
>> unit tests on ArrayBuilder), so I'm quite pleased.
>>
>> Bye, and thank you for all the help and answers (but more bugs can be
>> present still), bearophile
>>
> Very cool.
>
> Philosophically, though, isn't the whole purpose of having dynamic
> arrays in D to avoid having to create library implementations like
> this?
>
> I can definitely appreciate how something like this provides
> proof-of-concept for compiler & stdlib improvements, but in the end,
> having to rely on an array-builder class seems very non-D-ish.
>
> Thoughts?
>
> --benji
>
I think that in many cases the information the compiler needs to produce the required optimizations is too hard to encode in D-like languages. D can handle the individual cases, but when to do the rest is a problem that can span too much code to expect the compiler to solve it. A human with knowledge of what is supposed to happen can use libs to improve the situation.
OTOH the best solution would be to describe programs in a way that lets a computer do that work. We can dream.
September 02, 2008 Re: "Protocol Buffers" for Tango & Phobos ??
Posted in reply to Leandro Lucarella

== Quote from Leandro Lucarella (llucax@gmail.com)'s article
> Brian Price, on September 1 at 14:31, you wrote to me:
> > Hate to say it but this is yet another case of reinventing the wheel. The worst thing about this throwback to the early 90s is its inherent violation of DRY. This package intermingles and confuses three separate issues, treating them as a monolithic whole, namely: serialization, marshalling, and versioning.
> Different problems have different solutions. What you say is really nice
> and makes sense when *latency* and other performance issues are not a problem
> for you (it's obviously a problem for Google and for a lot of other people,
> like me; I was writing a little "framework", similar to Google's PB, at work
> just before they released it =).
> When you deal with (almost) realtime systems, you can't pay the price for
> human readability, extra parsing time, extra abstraction and extra bytes
> thrown through the wire. You just can't.
> You don't even care about portability (this kind of thing usually runs in
> a very controlled environment).
> I just wanted to clarify this.
In a (near) realtime embedded system with a high wire cost (cellphone, smart meters, etc) I'd have to agree with you. However, my initial impression was that this was being proposed/offered as a general purpose object streaming system.
Multicore/hyperthreaded processors with multiple levels of caching have a huge latency (comparatively) between core and wire. Even in a high demand system there are generally plenty of CPU cycles left over to do quite a bit of extra parsing and data massaging (compression). A compressed text message (XML or JSON format) is very competitive in size with a binary format (depending, of course, on the actual makeup of the data payload). I expect that in a few years embedded systems will be sporting processors with capabilities similar to today's desktops and servers. At that time, it may be worthwhile to re-evaluate the decision to go with a 'brittle' binary format in new designs.
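
The size claim above is easy to check for a given payload. The following is a minimal sketch under stated assumptions: the sample readings are made up, zlib stands in for whatever compressor a real system would use, and the fixed `struct` layout is just one possible binary encoding (it is not Protocol Buffers). The result depends heavily on the data, so no particular numbers are claimed here.

```python
import json
import struct
import zlib

# Made-up sample data: 1000 readings, each with an id and a value.
readings = [{"id": i, "value": i * 0.5} for i in range(1000)]

# Text form: JSON, with the field names repeated in every record.
text = json.dumps(readings).encode("utf-8")
compressed_text = zlib.compress(text, 9)

# Binary form: fixed layout, little-endian uint32 id + float64 value per record.
binary = b"".join(struct.pack("<Id", r["id"], r["value"]) for r in readings)

print("json:", len(text), "json+zlib:", len(compressed_text), "binary:", len(binary))
```

Running this against a representative sample of real traffic, and timing the compression step as well, would show whether the trade-off holds for a particular system.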
On a personal note, some of my friends are laughing at me over this - ten or twelve years ago I was vehemently against the idea of using human readable data formats for transmission - for the same reasons you give above. Of course my cellphone today probably has nearly the power of my desktop back then.
Sorry for any misunderstanding,
Brian
September 03, 2008 Re: Array append performance 2
Posted in reply to bearophile

"bearophile" <bearophileHUGS@lycos.com> wrote in message news:g9i1nb$26pd$1@digitalmars.com...
> Benji Smith:
>> Philosophically, though, isn't the whole purpose of having dynamic
>> arrays in D to avoid having to create library implementations like this?
>
> I agree, but so far I am not able to modify/improve the language. And an ArrayBuilder isn't meant to replace dynamic arrays, it's just a way to build them faster, "patching" what I perceive as one of their problems.
>

There have been quite a few examples of contributors modifying the D runtime to improve things like array concatenation. If it generally performs better and is demonstrably correct, I'm pretty sure Walter and the Tango crew would incorporate it.

In other words, have a go at improving the arraycat code in internal/gc/gc.d if you have the time. It's simple enough to modify and build.

>
>> I can definitely appreciate how something like this provides
>> proof-of-concept for compiler & stdlib improvements,
>
> My libs are meant to improve std libs, if/where/when the compiler can't be modified. They are un-standard but they are largish libs anyway, and I'm developing them doing my best :-)
>
> Bye,
> bearophile
September 03, 2008 Re: "Protocol Buffers" for Tango & Phobos ??
Posted in reply to Brian Price

Brian Price wrote:
> == Quote from Nick B (nick.barbalich@gmail.com)'s article
>> Hi
>> I came across this the other day, and since no one has mentioned it on this
>> newsgroup before, I thought I would bring the subject to the
>> community's attention, so to speak.
>> Google has very recently open sourced "Protocol Buffers".
>> What is it, you ask? In a couple of lines: it is a language-neutral,
>> platform-neutral, extensible way of serializing structured data for use
>> in communications protocols, data storage, and more.
>> Think XML, but smaller, faster, and simpler.
>> Why not just use XML ?
>> Protocol buffers have many advantages over XML for serializing
>> structured data. Protocol buffers:
>> * are simpler
>> * are 3 to 10 times smaller
>> * are 20 to 100 times faster
>> * are less ambiguous
>> * generate data access classes that are easier to use programmatically.
>> PB supports Java, Python, and C++ currently.
>> A more detailed overview can be found here:
>> http://code.google.com/apis/protocolbuffers/docs/overview.html
>> and the FAQ here:
>> http://code.google.com/apis/protocolbuffers/docs/faq.html
>> See the answer to the question "Can I add support for a new language to
>> protocol buffers?" inside the FAQ.
>> Some Tips and comments can be found here:
>> http://zunger.livejournal.com/164024.html
>> My questions.
>> Does the D community see this of interest ?
>> Is this something they might use ?
>> Do they see value in it being added to the respective
>> Tango or Phobos frameworks ?
>> any other comments ?
>> cheers
>> Nick B.
>
> Hate to say it but this is yet another case of reinventing the wheel. The worst
> thing about this throwback to the early 90s is its inherent violation of DRY.
> This package intermingles and confuses three separate issues, treating them as a
> monolithic whole, namely: serialization, marshalling, and versioning.

Can you explain where exactly you see a violation of DRY?

Also, can you explain the big difference between serialization and marshalling? I find it hard to draw any meaningful line between the two..

> Binary solutions such as this, while more efficient byte-wise, run into
> portability problems especially with floating point values.

So, what sort of problems do you see with this specific solution? To me, it looks like a text based solution would have more problems with floating point values, because they typically get converted to/from decimal..

> They also lose the
> human readability of the data (sans the use of special tools). Often the use of a
> binary solution is a case of premature optimization and indicates bad design at a
> higher level. The marshalling strategy should be 'pluggable' so that one can use
> an easier to debug, human readable, data format during development.

Well, in my experience, most data tends to live somewhere not readily accessible and/or human readable, so you need a tool one way or the other. Making generic tools for PBs shouldn't be too hard; as far as I can tell, they do have support for generic message handling, 'reflection', there's a DebugString() method on each of them, etc..

> Versioning problems can be (and have been) addressed in a number of ways over the
> years, the simplest and imo often the best, is to sidestep the problem by
> serializing a map of the data rather than just the raw data. That is, each value
> has associated metadata indicating its field name (a key-value mapping iow).

And the PB encoding differs from this how exactly? Each value has a key preceding it, which identifies which field it is..

> If XML is too heavy a hammer, JSON (with or without embedded metadata) is a good
> alternative with quite a bit of industry support. In fact, the use of JSON
> encoding with embedded metadata gives you a lightweight solution that works with
> nearly every language in present use.
> Even better, any language with good
> reflection support (such as Java) allows an implementation that does not violate DRY.

I haven't tried it, but it looks like generic mapping to/from JSON would be very easily done on top of PBs, for when you need it (like when publishing a JavaScript API). But why would you want to use JSON in any other case? Compared to PBs, it's harder/slower to parse, definitely takes more space/bandwidth, has fewer types, and you don't get the nice free API protocol buffers provide you for each message type.

> This "Protocol Buffers" solution is a dinosaur, a throwback, yet another wheel
> when we need ground effects or anti-gravity.

Well, you haven't convinced me yet :)

Mitja
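
The point about each value being preceded by a key can be made concrete with a small sketch of the Protocol Buffers wire format for integer fields, as described in Google's encoding documentation: the key is `(field_number << 3) | wire_type`, and both key and value are written as base-128 varints. The function names below are invented for this illustration and are not part of any Protocol Buffers library; only wire type 0 (varint) is handled.

```python
WIRE_VARINT = 0  # wire type used for int32/int64/bool/enum fields


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint (low 7 bits first)."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_varint(data: bytes, pos: int):
    """Decode one varint starting at pos; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def encode_int_field(field_number: int, value: int) -> bytes:
    """Write the key (field number plus wire type), then the value."""
    key = (field_number << 3) | WIRE_VARINT
    return encode_varint(key) + encode_varint(value)


def decode_int_fields(data: bytes) -> dict:
    """Read key/value pairs back into a {field_number: value} map."""
    fields = {}
    pos = 0
    while pos < len(data):
        key, pos = decode_varint(data, pos)
        field_number, wire_type = key >> 3, key & 0x7
        if wire_type != WIRE_VARINT:
            raise ValueError("this sketch only handles varint fields")
        fields[field_number], pos = decode_varint(data, pos)
    return fields


message = encode_int_field(1, 150) + encode_int_field(2, 7)
print(message.hex())               # 0896011007
print(decode_int_fields(message))  # {1: 150, 2: 7}
```

The difference from a JSON-style map is that the key is a small field number rather than a field name, so the mapping from number to name has to come from the shared message definition.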
Copyright © 1999-2021 by the D Language Foundation