Formatted read consumes input (page 2) - D Programming Language Discussion Forum

September 07, 2012

Re: Formatted read consumes input

Posted by Jonathan M Davis in reply to Steven Schveighoffer

On Friday, September 07, 2012 10:52:07 Steven Schveighoffer wrote:
> We have three situations:
> 
> 1. input range is a ref type already (i.e. a class or a pImpl struct), no need to pass this by ref, just wastes cycles doing double dereference.
> 2. input range is a value type, and you want to preserve the original.
> 3. input range is a value type, and you want to update the original.
> 
> I'd like to see the library automatically make the right decision for 1, and give you some mechanism to choose between 2 and 3.  To preserve existing code, 3 should be the default.

Does it _ever_ make sense for a range to be an input range and not a forward range and _not_ have it be a reference type? Since it would be implicitly saving it if it were a value type, it would then make sense that it should have save on it. So, I don't think that input ranges which aren't forward ranges make any sense unless they're reference types, in which case, there's no point in taking them by ref, and you _can't_ preserve the original.

- Jonathan M Davis
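
To make the "implicit save" point concrete, here is a minimal sketch. `Counter`, `CounterRef` and `consumeTwo` are hypothetical names invented for this illustration, not Phobos types. It assumes a value-type range that deliberately omits `save`: passing it by value copies it, so the caller's range is untouched, which is exactly what `save` would have provided. The class version shares its state with the callee.

```d
import std.range.primitives : isForwardRange, isInputRange;

// Hypothetical value-type input range that does not define `save`.
struct Counter
{
    int cur, end;
    @property bool empty() const { return cur >= end; }
    @property int front() const { return cur; }
    void popFront() { ++cur; }
}

// It is an input range, but not a forward range.
static assert(isInputRange!Counter && !isForwardRange!Counter);

// Hypothetical reference-type range with the same interface.
final class CounterRef
{
    int cur, end;
    this(int cur, int end) { this.cur = cur; this.end = end; }
    @property bool empty() const { return cur >= end; }
    @property int front() const { return cur; }
    void popFront() { ++cur; }
}

// Takes the range by value and pops two elements.
void consumeTwo(R)(R r)
{
    r.popFront();
    r.popFront();
}

void main()
{
    auto v = Counter(0, 5);
    consumeTwo(v);
    assert(v.front == 0); // the by-value copy acted as an implicit save

    auto c = new CounterRef(0, 5);
    consumeTwo(c);
    assert(c.front == 2); // the class reference was shared, so the original advanced
}
```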

September 07, 2012

Re: Formatted read consumes input

Posted by monarch_dodra in reply to Steven Schveighoffer

On Friday, 7 September 2012 at 14:51:45 UTC, Steven Schveighoffer wrote:
> On Fri, 07 Sep 2012 10:35:37 -0400, monarch_dodra
>
> This looks ugly.  Returning a tuple and having to split the result is horrible, I hated dealing with that in C++ (and I even wrote stuff that returned pairs!)
>
> Not only that, but there are possible ranges which may not be reassignable.
>
> I'd rather have a way to wrap a string into a ref-based input range.
>
> We have three situations:
>
> 1. input range is a ref type already (i.e. a class or a pImpl struct), no need to pass this by ref, just wastes cycles doing double dereference.
> 2. input range is a value type, and you want to preserve the original.
> 3. input range is a value type, and you want to update the original.
>
> I'd like to see the library automatically make the right decision for 1, and give you some mechanism to choose between 2 and 3.  To preserve existing code, 3 should be the default.
>
> -Steve

True...

Still, I find it horrible to have to create a named "dummy" variable just when I simply want to pass a copy of my range.

I think I found 2 other solutions:
1: auto ref.
2: Kind of like auto ref: Just provide a non-ref overload. This creates less executable bloat.

Like this:
--------
// Formatted read for an rvalue input range. Inside this overload `r` is an
// lvalue, so the call below resolves to the ref overload.
uint formattedRead(R, Char, S...)(R r, const(Char)[] fmt, S args)
{
    return formattedRead(r, fmt, args);
}
// Standard formatted read
uint formattedRead(R, Char, S...)(ref R r, const(Char)[] fmt, S args)
--------
This allows me to write, as I would expect:

--------
import std.format : formattedRead;
import std.range : save;
import std.stdio : writefln;

void main()
{
  string s = "x42xT";
  int v;
  formattedRead(s.save, "x%dx", &v); // Passing a copy
  writefln("[%s][%s]", v, s);
  formattedRead(s, "x%dx", &v); // Please consume me
  writefln("[%s][%s]", v, s);
}
--------
[42][x42xT] //My range is unchanged
[42][T]     //My range was consumed
--------

I think this is a good solution. Do you see anything I may have failed to see?
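
For anyone who wants to try the overload pattern without patching Phobos, here is a self-contained sketch of the same idea. `readDigits` is a hypothetical stand-in for `formattedRead`, written only for this illustration; it assumes the usual D overload rule that an lvalue argument prefers the `ref` overload while an rvalue can only bind to the by-value one.

```d
import std.ascii : isDigit;
import std.range.primitives : empty, front, popFront, save;

// Hypothetical parser: reads leading decimal digits and advances the
// caller's range. Returns the number of characters consumed.
uint readDigits(R)(ref R r, ref int value)
{
    uint count;
    value = 0;
    while (!r.empty && isDigit(r.front))
    {
        value = value * 10 + cast(int)(r.front - '0');
        r.popFront();
        ++count;
    }
    return count;
}

// By-value overload, selected for rvalues such as `s.save`. Inside the
// body `r` is an lvalue, so this forwards to the ref overload above.
uint readDigits(R)(R r, ref int value)
{
    return readDigits(r, value);
}

void main()
{
    string s = "42xT";
    int v;

    readDigits(s.save, v);           // rvalue: only the temporary is consumed
    assert(v == 42 && s == "42xT");

    readDigits(s, v);                // lvalue: the original is consumed
    assert(v == 42 && s == "xT");
}
```

The design trade-off is the one raised in the thread: the call site decides, through lvalue versus rvalue, whether the remainder of the input is observable afterwards.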

September 07, 2012

Re: Formatted read consumes input

Posted by monarch_dodra in reply to monarch_dodra

On Friday, 7 September 2012 at 15:34:12 UTC, monarch_dodra wrote:
> I think this is a good solution. Do you see anything I may have failed to see?

I've made a pull request out of it.

https://github.com/D-Programming-Language/phobos/pull/777

September 07, 2012

Re: Formatted read consumes input

Posted by Steven Schveighoffer in reply to Jonathan M Davis

On Fri, 07 Sep 2012 11:04:36 -0400, Jonathan M Davis <jmdavisProg@gmx.com> wrote:

> On Friday, September 07, 2012 10:52:07 Steven Schveighoffer wrote:
>> We have three situations:
>>
>> 1. input range is a ref type already (i.e. a class or a pImpl struct), no
>> need to pass this by ref, just wastes cycles doing double dereference.
>> 2. input range is a value type, and you want to preserve the original.
>> 3. input range is a value type, and you want to update the original.
>>
>> I'd like to see the library automatically make the right decision for 1,
>> and give you some mechanism to choose between 2 and 3.  To preserve
>> existing code, 3 should be the default.
>
> Does it _ever_ make sense for a range to be an input range and not a forward
> range and _not_ have it be a reference type?

No it doesn't.  That is case 1.

However, it's quite easy to forget to define "save" when your range really is a forward range.  I don't really know a good way to fix this.  To assume that an input-and-not-forward range has reference semantics is prone to inappropriate code compiling just fine.

Clearly we can say classes are easily defined as not needing ref.

-Steve

September 07, 2012

Re: Formatted read consumes input

Posted by Steven Schveighoffer in reply to monarch_dodra

On Fri, 07 Sep 2012 11:34:28 -0400, monarch_dodra <monarchdodra@gmail.com> wrote:

> On Friday, 7 September 2012 at 14:51:45 UTC, Steven Schveighoffer wrote:
>> On Fri, 07 Sep 2012 10:35:37 -0400, monarch_dodra
>>
>> This looks ugly.  Returning a tuple and having to split the result is horrible, I hated dealing with that in C++ (and I even wrote stuff that returned pairs!)
>>
>> Not only that, but there are possible ranges which may not be reassignable.
>>
>> I'd rather have a way to wrap a string into a ref-based input range.
>>
>> We have three situations:
>>
>> 1. input range is a ref type already (i.e. a class or a pImpl struct), no need to pass this by ref, just wastes cycles doing double dereference.
>> 2. input range is a value type, and you want to preserve the original.
>> 3. input range is a value type, and you want to update the original.
>>
>> I'd like to see the library automatically make the right decision for 1, and give you some mechanism to choose between 2 and 3.  To preserve existing code, 3 should be the default.
>>
>> -Steve
>
> True...
>
> Still, I find it horrible to have to create a named "dummy" variable just when I simply want to pass a copy of my range.
>
> I think I found 2 other solutions:
> 1: auto ref.
> 2: Kind of like auto ref: Just provide a non-ref overload. This creates less executable bloat.
>
> Like this:
> --------
> //Formatted read for R-Value input range.
> uint formattedRead(R, Char, S...)(R r, const(Char)[] fmt, S args)
> {
>      return formattedRead(r, fmt, args);
> }
> //Standard formatted read
> uint formattedRead(R, Char, S...)(ref R r, const(Char)[] fmt, S args)
> --------
> This allows me to write, as I would expect:
>
> --------
> void main()
> {
>    string s = "x42xT";
>    int v;
>    formattedRead(s.save, "x%dx", &v); //Passing a copy
>    writefln("[%s][%s]", v, s);
>    formattedRead(s, "x%dx", &v); //Please consume me
>    writefln("[%s][%s]", v, s);
> }
> --------
> [42][x42xT] //My range is unchanged
> [42][T]     //My range was consumed
> --------
>
> I think this is a good solution. Do you see anything I may have failed to see?

Well, this does work.  But I don't like that the semantics depend on whether the value is an rvalue or not.

Note that ranges that are true input ranges (e.g. a file) still consume their data even when passed as rvalues; there is no way around it.

-Steve

September 07, 2012

Re: Formatted read consumes input

Posted by monarch_dodra in reply to Steven Schveighoffer

On Friday, 7 September 2012 at 18:15:00 UTC, Steven Schveighoffer wrote:
>
> Well, this does work.  But I don't like that the semantics depend on whether the value is an rvalue or not.
>
> Note that ranges that are true input ranges (e.g. a file) still consume their data even when passed as rvalues; there is no way around it.
>
> -Steve

Yes, but that is another issue: it is a "copy" vs "save" semantic issue. In theory, one should assume that *even* with pass by value, if you want your range not to be consumed, you have to call "save". Most ranges are value types, so we tend to forget it. As a matter of fact, std.algorithm had a few save-related bugs like that.

But, contrary to post 1, that is not the actual issue being fixed here. It is merely a "compile with unnamed" fix:
formattedRead(file.save, ...)
And now it compiles fine. AND the range is saved. That's it. Nothing more, nothing less.

...

That's *if* file provides "save". I do not know much about file/stream handling in D, but you get my "save" point.
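
The "copy" vs "save" distinction can be sketched as follows. `Span`, `countByCopy` and `countBySave` are hypothetical names made up for this example; the sketch assumes a reference-type forward range, where a plain copy shares state and only an explicit `save` yields an independent range.

```d
import std.range.primitives : isForwardRange;

// Hypothetical reference-type forward range over the integers cur .. end.
final class Span
{
    int cur, end;
    this(int cur, int end) { this.cur = cur; this.end = end; }
    @property bool empty() const { return cur >= end; }
    @property int front() const { return cur; }
    void popFront() { ++cur; }
    @property Span save() { return new Span(cur, end); }
}

static assert(isForwardRange!Span);

// Walks whatever it was handed; for a class this is the caller's object.
size_t countByCopy(R)(R r)
{
    size_t n;
    for (; !r.empty; r.popFront())
        ++n;
    return n;
}

// Walks an explicitly saved range, leaving the caller's range alone.
size_t countBySave(R)(R r)
if (isForwardRange!R)
{
    return countByCopy(r.save);
}

void main()
{
    auto r = new Span(0, 3);

    assert(countBySave(r) == 3);
    assert(!r.empty);            // original untouched thanks to save

    assert(countByCopy(r) == 3);
    assert(r.empty);             // pass by value still consumed the class range
}
```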

September 08, 2012

Re: Formatted read consumes input

Posted by kenji hara in reply to monarch_dodra

I have commented on the pull request.
I don't like adding convenience interfaces to the std.format module.

https://github.com/D-Programming-Language/phobos/pull/777#issuecomment-8385551

Kenji Hara

2012/9/8 monarch_dodra <monarchdodra@gmail.com>:
> On Friday, 7 September 2012 at 15:34:12 UTC, monarch_dodra wrote:
>>
>> I think this is a good solution. Do you see anything I may have failed to see?
>
>
> I've made a pull request out of it.
>
> https://github.com/D-Programming-Language/phobos/pull/777

September 08, 2012

Re: Formatted read consumes input

Posted by kenji hara in reply to monarch_dodra

2012/9/8 monarch_dodra <monarchdodra@gmail.com>:
[snip]
>
> Still, I find it horrible to have to create a named "dummy" variable just when I simply want to pass a copy of my range.

Why are you afraid of declaring a "dummy" variable?
formattedRead is a parser, not an algorithm (as I said in the pull
request comment). After calling it, zero or more elements will remain.
And in almost all cases, the remainder will be used for another purpose, or just
checked to be empty.

int n = formattedRead(input_range, fmt, args...);
next_parsing(input_range);   // reusing input_range
assert(input_range.empty);  // or just checking that it is empty

If formattedRead could receive an rvalue, the call would discard the remainder, which can cause hidden bugs.

int n = formattedRead(r.save, fmt, args...);
// If the remainder is not empty, it is ignored. Is this expected, or
// is it a logic bug?

auto dummy = r.save;
int n = formattedRead(dummy, fmt, args...);
assert(dummy.empty);   // You can assert that the remainder is empty.

formattedRead returns multiple pieces of state (the values that are read, how many values are read, and the remainder of the input), so allowing callers to ignore them would invite misuse and bugs.

Kenji Hara
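
As a runnable version of the lvalue pattern described above, here is a small sketch using `std.format.formattedRead` with the input and format string from earlier in the thread. Only the variable names are chosen for this example. It shows the three results the caller can inspect: the return count, the parsed value, and the remainder left in the range.

```d
import std.format : formattedRead;

void main()
{
    string input = "x42xT";
    int v;

    // The range is passed as an lvalue, so it is advanced in place.
    auto n = formattedRead(input, "x%dx", &v);

    assert(n == 1);       // one value was read
    assert(v == 42);      // the parsed value
    assert(input == "T"); // the remainder is still visible to the caller
}
```

Whether the remainder should be checked, reused, or ignored is the caller's decision; this sketch only shows that it remains available when an lvalue is passed.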

September 08, 2012

Re: Formatted read consumes input

Posted by monarch_dodra in reply to kenji hara

On Saturday, 8 September 2012 at 12:10:26 UTC, kenji hara wrote:
> 2012/9/8 monarch_dodra <monarchdodra@gmail.com>:
> [snip]
>>
>> Still, I find it horrible to have to create a named "dummy" variable just
>> when I simply want to pass a copy of my range.
>
> Why are you afraid of declaring a "dummy" variable?
> formattedRead is a parser, not an algorithm (as I said in the pull
> request comment). After calling it, zero or more elements will remain.
> And in almost all cases, the remainder will be used for another purpose, or just
> checked to be empty.
>
> int n = formattedRead(input_range, fmt, args...);
> next_parsing(input_range);   // reusing input_range
> assert(input_range.empty);  // or just checking that it is empty
>
> If formattedRead could receive an rvalue, the call would discard the
> remainder, which can cause hidden bugs.
>
> int n = formattedRead(r.save, fmt, args...);
> // If the remainder is not empty, it is ignored. Is this expected, or
> // is it a logic bug?
>
> auto dummy = r.save;
> int n = formattedRead(dummy, fmt, args...);
> assert(dummy.empty);   // You can assert that the remainder is empty.
>
> formattedRead returns multiple pieces of state (the values that are read, how
> many values are read, and the remainder of the input), so allowing callers to
> ignore them would invite misuse and bugs.
>
> Kenji Hara

Hum, I think I see your point, although in my opinion, checking the return value is all that is required for generic error checking.

Checking the state of the range afterwards is being extra careful for a specific use case, and should not necessarily be forced onto the programmer.

I'll close the pull in the morning.
