Jump to page: 1 2
Thread overview
Streaming parsers in D a novel design
Apr 18, 2023
Dmitry Olshansky
Apr 18, 2023
Sebastiaan Koppe
Apr 18, 2023
H. S. Teoh
Apr 19, 2023
Walter Bright
Apr 18, 2023
Sebastiaan Koppe
Apr 18, 2023
Dmitry Olshansky
Apr 19, 2023
Jacob Shtokolov
Apr 19, 2023
Sebastiaan Koppe
Apr 19, 2023
Dmitry Olshansky
Apr 19, 2023
Sebastiaan Koppe
Apr 20, 2023
Dmitry Olshansky
Apr 21, 2023
Sebastiaan Koppe
Apr 19, 2023
Adam D Ruppe
Apr 19, 2023
Dmitry Olshansky
Apr 20, 2023
ikod
April 18, 2023

A streaming parser is doing 2 things:

  • waiting for input
  • processing the current input and spitting out Events

This maps to D beautifully:

  1. Waiting for input means it's an OutputRange with explicit put method!

  2. Processing events means it's an InputRange of events. It may even be ForwardRange, mine first of this kind is Forward range.

Destroy!

--
Olshansky Dmitry

April 18, 2023

On Tuesday, 18 April 2023 at 08:13:05 UTC, Dmitry Olshansky wrote:

>

A streaming parser is doing 2 things:

  • waiting for input
  • processing the current input and spitting out Events

This maps to D beautifully:

  1. Waiting for input means it's an OutputRange with explicit put method!

  2. Processing events means it's an InputRange of events. It may even be ForwardRange, mine first of this kind is Forward range.

Destroy!

--
Olshansky Dmitry

Sounds like a natural fit! I think it's interesting to explore related concepts that other languages have like their versions of iterators/ranges, but also streams, observables, interaction with promises (observables vs async iterables). Another related topic is validation vs parsing (converting data from unstructured input to a more structured form). I don't have time to discuss in detail at the moment, so I'll just share a few links.

https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-validate/
https://serokell.io/blog/parser-combinators-in-haskell

http://csl.stanford.edu/~christos/pldi2010.fit/meijer.duality.pdf
https://www.youtube.com/watch?v=sTSQlYX5DU0&ab_channel=ReactConference

https://dev.solita.fi/2021/10/14/grokking-clojure-transducers.html
https://www.youtube.com/watch?v=6mTbuzafcII
https://hypirion.com/musings/recursive-transducers
https://hypirion.com/musings/haskell-transducers

https://www.youtube.com/watch?v=vohGJjGxtJQ&ab_channel=CppCon
https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2022/p2300r5.html

April 18, 2023
On Tue, Apr 18, 2023 at 08:13:05AM +0000, Dmitry Olshansky via Digitalmars-d wrote:
> A streaming parser is doing 2 things:
> - waiting for input
> - processing the current input and spitting out Events
> 
> This maps to D beautifully:
> 
> 1. Waiting for input means it's an OutputRange with explicit put method!
> 
> 2. Processing events means it's an InputRange of events. It may even be ForwardRange, mine first of this kind is Forward range.
[...]

In practice, the input would be an input range consumed by the parser which returns an input range of tokens.  IOW the parser is a filter that transforms an input range of characters into a range of tokens.

You'd only need an output range interface if you needed to explicitly feed characters to the parser, such as if you were using an async I/O API and you just received a block of fresh input, and now you need to stuff the input into the parser.  But even then, I wouldn't do it this way, because on the other end, the input range wouldn't be able to return a result if, say, the next block of input hasn't arrived yet. So .empty would have to block, which is an ugly design for something with an input range API.

At the end of the day, I think treating a parser as a transformation (range of char -> range of tokens) is a better abstraction than trying to shoehorn it into something involved output ranges.


T

-- 
Nearly all men can stand adversity, but if you want to test a man's character, give him power. -- Abraham Lincoln
April 18, 2023

On Tuesday, 18 April 2023 at 08:13:05 UTC, Dmitry Olshansky wrote:

>

A streaming parser is doing 2 things:

  • waiting for input
  • processing the current input and spitting out Events

This maps to D beautifully:

  1. Waiting for input means it's an OutputRange with explicit put method!

  2. Processing events means it's an InputRange of events. It may even be ForwardRange, mine first of this kind is Forward range.

Destroy!

--
Olshansky Dmitry

I think it will work beautifully until the day you want to do it asynchronously.

April 18, 2023

On Tuesday, 18 April 2023 at 13:01:40 UTC, Petar Kirov [ZombineDev] wrote:

>

https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2022/p2300r5.html

For an implementation in D, see: https://github.com/symmetryinvestments/concurrency

April 18, 2023

On Tuesday, 18 April 2023 at 14:26:00 UTC, Sebastiaan Koppe wrote:

>

On Tuesday, 18 April 2023 at 08:13:05 UTC, Dmitry Olshansky wrote:

>

A streaming parser is doing 2 things:

  • waiting for input
  • processing the current input and spitting out Events

This maps to D beautifully:

  1. Waiting for input means it's an OutputRange with explicit put method!

  2. Processing events means it's an InputRange of events. It may even be ForwardRange, mine first of this kind is Forward range.

Destroy!

--
Olshansky Dmitry

I think it will work beautifully until the day you want to do it asynchronously.

Photon takes care of that:
https://github.com/DmitryOlshansky/photon

April 18, 2023
On 4/18/23 7:24 AM, H. S. Teoh wrote:

> At the end of the day, I think treating a parser as a transformation
> (range of char -> range of tokens) is a better abstraction than trying
> to shoehorn it into something involved output ranges.

This is exactly how jsoniopipe works (though the input is not a "range" but an iopipe of course).

The resulting thing is essentially a "range", but not with the range functions. The implementation doesn't quite fit the range paradigm.

https://github.com/schveiguy/jsoniopipe

-Steve
April 18, 2023
On 4/18/2023 7:24 AM, H. S. Teoh wrote:
> In practice, the input would be an input range consumed by the parser
> which returns an input range of tokens.  IOW the parser is a filter that
> transforms an input range of characters into a range of tokens.

The lexer produces a stream of tokens, not the parser. The output of the parser is an AST.

April 19, 2023

On Tuesday, 18 April 2023 at 14:26:00 UTC, Sebastiaan Koppe wrote:

>

On Tuesday, 18 April 2023 at 08:13:05 UTC, Dmitry Olshansky wrote:

>

A streaming parser is doing 2 things:

  • waiting for input
  • processing the current input and spitting out Events

I think it will work beautifully until the day you want to do it asynchronously.

Ranges can be async, in this case, you just need to return the Result type (pending|error|success) and provide hints to something that will be pulling from this range.

It will look like an infinite range, but not fully infinite: it returns "success" once the result is ready.

The sync version will always return error|success without the 'pending' phase.

That kind of interface is similar to Rust's futures, providing poll-based primitives with some wakeup hints (callbacks).

April 19, 2023
On Tuesday, 18 April 2023 at 08:13:05 UTC, Dmitry Olshansky wrote:
> This maps to D beautifully:

I recently wrote a new stream class that works kinda like this and thought about doing put() and front etc, but I wanted heterogeneous types.

Still, it kinda did work.

https://github.com/adamdruppe/arsd/blob/master/core.d#L5040

there's the test rn to show how it works
« First   ‹ Prev
1 2