Formatted read consumes input
August 23, 2012
As title implies:

----
import std.stdio;
import std.format;

void main()
{
  string s = "42";
  int v;
  formattedRead(s, "%d", &v);
  writefln("[%s] [%s]", s, v);
}
----
[] [42]
----

Is this the "expected" behavior?

Furthermore, it is not possible to try to "save" s:
----
import std.stdio;
import std.format;
import std.range;

void main()
{
  string s = "42";
  int v;
  formattedRead(s.save, "%d", &v);
  writefln("[%s] [%s]", s, v);
}
----
main.d(9): Error: template std.format.formattedRead does not match any function template declaration
C:\D\dmd.2.060\dmd2\windows\bin\..\..\src\phobos\std\format.d(526): Error: template std.format.formattedRead(R,Char,S...) cannot deduce template function from argument types !()(string,string,int*)
----

The workaround is to have a named backup:
  auto ss = s.save;
  formattedRead(ss, "%d", &v);


I've traced the root issue to formattedRead's signature, which is:
uint formattedRead(R, Char, S...)(ref R r, const(Char)[] fmt, S args);

Is there a particular reason for this pass by ref? It is inconsistent with the rest of Phobos, and even with C's scanf.

Is this a file-able bug_report/enhancement_request?
August 24, 2012
On Thursday, 23 August 2012 at 11:33:19 UTC, monarch_dodra wrote:
> As title implies:
>
> ----
> import std.stdio;
> import std.format;
>
> void main()
> {
>   string s = "42";
>   int v;
>   formattedRead(s, "%d", &v);
>   writefln("[%s] [%s]", s, v);
> }
> ----
> [] [42]
> ----
>
> Is this the "expected" behavior?

Yes, both the parse family and formattedRead operate on a ref argument, which means they modify it in place. Also consider that two consecutive reads should obviously read the first and second values in the string, not the same one twice.


> Furthermore, it is not possible to try to "save" s:
> ----
> import std.stdio;
> import std.format;
> import std.range;
>
> void main()
> {
>   string s = "42";
>   int v;
>   formattedRead(s.save, "%d", &v);
>   writefln("[%s] [%s]", s, v);
> }
> ----

Yes, because ref doesn't bind to an rvalue.
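
To illustrate the rvalue point, here is a minimal sketch. `consumeFront` is a hypothetical helper written for this example, not a Phobos function, and the compile-time check assumes default compiler settings (no rvalue-ref preview switches).

```d
import std.range : save;

// Hypothetical helper (not part of Phobos): takes its input by ref and
// drops the first code unit, the way a consuming reader would.
void consumeFront(ref string r)
{
    r = r[1 .. $];
}

void main()
{
    string s = "42";

    // A named variable is an lvalue, so it binds to the ref parameter.
    auto ss = s.save;
    consumeFront(ss);
    assert(ss == "2");
    assert(s == "42"); // the original is untouched

    // s.save is a temporary (an rvalue). Assuming default compiler
    // settings, it does not bind to ref, so the call does not compile.
    static assert(!__traits(compiles, consumeFront(s.save)));
}
```

This is the same situation as passing `s.save` straight to a function whose first parameter is declared `ref R r`.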

> The workaround is to have a named backup:
>   auto ss = s.save;
>   formattedRead(ss, "%d", &v);
>
>
> I've traced the root issue to formattedRead's signature, which is:
> uint formattedRead(R, Char, S...)(ref R r, const(Char)[] fmt, S args);
>

As I explained above, the only sane logic for multiple reads is to consume the input, and to do so it needs ref.
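
A small sketch of what "multiple reads" looks like in practice, assuming the input holds two integers separated by a space (the input string and variable names are made up for the example):

```d
import std.format : formattedRead;

void main()
{
    string s = "42 7";
    int first, second;

    // The first read consumes "42" and leaves " 7" in s.
    formattedRead(s, "%d", &first);
    assert(first == 42);
    assert(s == " 7");

    // The second read continues where the first one stopped. The leading
    // space in the format skips the whitespace before the next number.
    formattedRead(s, " %d", &second);
    assert(second == 7);
    assert(s == "");
}
```

Because `s` is advanced in place, the second call needs no bookkeeping from the caller to find the next value.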

> Is there a particular reason for this pass by ref? It is inconsistent with the rest of Phobos, and even with C's scanf.

C's scanf is a poor argument, as it uses pointers instead of ref (it can't do ref, as there is no ref in C :) ). Yet it doesn't allow reading things across a couple of calls, AFAIK. In C, scanf returns the number of arguments successfully read, not the number of bytes, so there is no way to continue from where it stopped.

BTW it's not documented what formattedRead returns ... just ouch.
August 24, 2012
On Friday, 24 August 2012 at 11:18:55 UTC, Dmitry Olshansky wrote:
> On Thursday, 23 August 2012 at 11:33:19 UTC, monarch_dodra wrote:
>> I've traced the root issue to formattedRead's signature, which is:
>> uint formattedRead(R, Char, S...)(ref R r, const(Char)[] fmt, S args);
>>
>
> As I explained above, the only sane logic for multiple reads is to consume the input, and to do so it needs ref.

I had actually considered that argument. But a lot of algorithms have the same approach, yet they don't take refs, they *return* the consumed front:

----
R formattedRead(R, Char, S...)(R r, const(Char)[] fmt, S args)

auto s2 = formattedRead(s, "%d", &v);
----

Or arguably:

----
Tuple!(size_t, R) formattedRead(R, Char, S...)(R r, const(Char)[] fmt, S args)
----

"minCount", "boyerMooreFinder" and "levenshteinDistanceAndPath" all take this approach to return a consumed range plus an index/count.
August 24, 2012
24.08.2012 16:16, monarch_dodra wrote:
> On Friday, 24 August 2012 at 11:18:55 UTC, Dmitry Olshansky wrote:
>> On Thursday, 23 August 2012 at 11:33:19 UTC, monarch_dodra wrote:
>>> I've traced the root issue to formattedRead's signature, which is:
>>> uint formattedRead(R, Char, S...)(ref R r, const(Char)[] fmt, S args);
>>>
>>
>> As I explained above, the only sane logic for multiple reads is to
>> consume the input, and to do so it needs ref.
>
> I had actually considered that argument. But a lot of algorithms have
> the same approach, yet they don't take refs, they *return* the consumed
> front:
>
> ----
> R formattedRead(R, Char, S...)(R r, const(Char)[] fmt, S args)
>
> auto s2 = formattedRead(s, "%d", &v);
> ----
>
> Or arguably:
>
> ----
> Tuple!(size_t, R) formattedRead(R, Char, S...)(R r, const(Char)[] fmt, S
> args)
> ----
>
> "minCount", "boyerMooreFinder" and "levenshteinDistanceAndPath" all take
> this approach to return a consumed range plus an index/count.

It's because `formattedRead` is designed to work with an input range which isn't a forward range (not save-able).

-- 
Денис В. Шеломовский
Denis V. Shelomovskij
August 24, 2012
On Friday, 24 August 2012 at 11:18:55 UTC, Dmitry Olshansky wrote:
> C's scanf is a poor argument, as it uses pointers instead of ref (it can't do ref, as there is no ref in C :) ). Yet it doesn't allow reading things across a couple of calls, AFAIK. In C, scanf returns the number of arguments successfully read, not the number of bytes, so there is no way to continue from where it stopped.
>
> BTW it's not documented what formattedRead returns ... just ouch.

Actually... look up "%n" in sscanf; it's wonderful, I use it all the time.
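
For readers who have not met it, a minimal sketch of the "%n" trick, written in D against the C library binding `core.stdc.stdio.sscanf` (the input string and variable names are made up for the example):

```d
import core.stdc.stdio : sscanf;

void main()
{
    // D string literals are zero-terminated, so they can be passed to C.
    const(char)* input = "42 worlds";
    int v;
    int consumed;

    // %n stores the number of characters read so far. It is not counted
    // in sscanf's return value, which only counts assigned conversions.
    int matched = sscanf(input, "%d%n", &v, &consumed);
    assert(matched == 1);
    assert(v == 42);
    assert(consumed == 2);

    // The caller can resume from where the first call stopped.
    const(char)* rest = input + consumed;
    assert(rest[0] == ' ');
}
```

The offset has to be tracked by hand, which is the bookkeeping that a ref-based reader does implicitly.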

August 24, 2012
On Friday, 24 August 2012 at 13:08:43 UTC, Denis Shelomovskij wrote:
> 24.08.2012 16:16, monarch_dodra wrote:
>> On Friday, 24 August 2012 at 11:18:55 UTC, Dmitry Olshansky wrote:
>>> On Thursday, 23 August 2012 at 11:33:19 UTC, monarch_dodra wrote:
>>>> I've traced the root issue to formattedRead's signature, which is:
>>>> uint formattedRead(R, Char, S...)(ref R r, const(Char)[] fmt, S args);
>>>>
>>>
>>> As I explained above, the only sane logic for multiple reads is to
>>> consume the input, and to do so it needs ref.
>>
>> I had actually considered that argument. But a lot of algorithms have
>> the same approach, yet they don't take refs, they *return* the consumed
>> front:
>>
>> ----
>> R formattedRead(R, Char, S...)(R r, const(Char)[] fmt, S args)
>>
>> auto s2 = formattedRead(s, "%d", &v);
>> ----
>>
>> Or arguably:
>>
>> ----
>> Tuple!(size_t, R) formattedRead(R, Char, S...)(R r, const(Char)[] fmt, S
>> args)
>> ----
>>
>> "minCount", "boyerMooreFinder" and "levenshteinDistanceAndPath" all take
>> this approach to return a consumed range plus an index/count.
>
> It's because `formattedRead` is designed to work with an input range which isn't a forward range (not save-able).

You had me ready to throw in the towel on that argument, but thinking harder about it, that doesn't really change anything actually:

At the end of formattedRead, the passed range has a certain state. Whether you give this range back to the caller via "pass by ref" or "return by value" has nothing to do with save-ability.
August 24, 2012
On 24-Aug-12 17:43, Tove wrote:
> On Friday, 24 August 2012 at 11:18:55 UTC, Dmitry Olshansky wrote:
>> C's scanf is a poor argument, as it uses pointers instead of ref (it
>> can't do ref, as there is no ref in C :) ). Yet it doesn't allow
>> reading things across a couple of calls, AFAIK. In C, scanf returns the
>> number of arguments successfully read, not the number of bytes, so
>> there is no way to continue from where it stopped.
>>
>> BTW it's not documented what formattedRead returns ... just ouch.
>
> Actually... look up "%n" in sscanf; it's wonderful, I use it all the time.
>
God... what an awful kludge :)

-- 
Olshansky Dmitry
September 07, 2012
On Thu, 23 Aug 2012 07:33:13 -0400, monarch_dodra <monarchdodra@gmail.com> wrote:

> As title implies:
>
> ----
> import std.stdio;
> import std.format;
>
> void main()
> {
>    string s = "42";
>    int v;
>    formattedRead(s, "%d", &v);
>    writefln("[%s] [%s]", s, v);
> }
> ----
> [] [42]
> ----
>
> Is this the "expected" behavior?
>
> Furthermore, it is not possible to try to "save" s:
> ----
> import std.stdio;
> import std.format;
> import std.range;
>
> void main()
> {
>    string s = "42";
>    int v;
>    formattedRead(s.save, "%d", &v);
>    writefln("[%s] [%s]", s, v);
> }
> ----
> main.d(9): Error: template std.format.formattedRead does not match any function template declaration
> C:\D\dmd.2.060\dmd2\windows\bin\..\..\src\phobos\std\format.d(526): Error: template std.format.formattedRead(R,Char,S...) cannot deduce template function from argument types !()(string,string,int*)
> ----
>
> The workaround is to have a named backup:
>    auto ss = s.save;
>    formattedRead(ss, "%d", &v);
>
>
> I've traced the root issue to formattedRead's signature, which is:
> uint formattedRead(R, Char, S...)(ref R r, const(Char)[] fmt, S args);
>
> Is there a particular reason for this pass by ref? It is inconsistent with the rest of Phobos, and even with C's scanf.
>
> Is this a file-able bug_report/enhancement_request?

I believe it behaves as designed, but could be designed in such a way that does not need ref input range.  In fact, I think actually R needing to be ref is a bad thing.  Consider that if D didn't consider string literals to be lvalues (an invalid assumption IMO), then passing a string literal as the input would not work!

The only issue is, what if you *do* want ref behavior for strings?  You would need to wrap the string into a ref'd range.  That is not a good proposition.  Unfortunately, the way IFTI works, there isn't an opportunity to affect the parameter type IFTI decides to use.

I think a reasonable enhancement would be to add a formattedReadNoref (or better named alternative) that does not take a ref argument.
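
A minimal sketch of what such a function might look like. The name `formattedReadNoRef` is hypothetical and not part of Phobos; the sketch simply forwards to the existing formattedRead through a by-value copy of the range.

```d
import std.format : formattedRead;

// Hypothetical name, not part of Phobos. The range is taken by value, so
// the caller's copy is left untouched and temporaries are accepted.
uint formattedReadNoRef(R, Char, S...)(R r, const(Char)[] fmt, S args)
{
    // r is a local lvalue here, so it can be handed to formattedRead.
    return formattedRead(r, fmt, args);
}

void main()
{
    string s = "42 worlds";
    int v;

    const n = formattedReadNoRef(s, "%d", &v);
    assert(n == 1);
    assert(v == 42);
    assert(s == "42 worlds"); // still intact

    // A temporary works too, since nothing is bound by ref.
    formattedReadNoRef(s[0 .. 2], "%d", &v);
    assert(v == 42);
}
```

Note that for a value-type range the remaining input is lost to the caller, so this only covers the "preserve the original" use case.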

-Steve
September 07, 2012
On Friday, 7 September 2012 at 13:58:43 UTC, Steven Schveighoffer wrote:
> On Thu, 23 Aug 2012 07:33:13 -0400, monarch_dodra
>
> The only issue is, what if you *do* want ref behavior for strings?  You would need to wrap the string into a ref'd range.
>  That is not a good proposition.  Unfortunately, the way IFTI works, there isn't an opportunity to affect the parameter type IFTI decides to use.
>
> [SNIP]
>
> -Steve

If you *do* want ref behavior, I still don't see why we don't just do it the algorithm way and return by value:

----
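import std.format;
import std.range;
import std.stdio;
import std.typecons;

Tuple!(uint, R)
formattedRead2(R, Char, S...)(R r, const(Char)[] fmt, S args)
{
    // r is a by-value copy; formattedRead advances it in place.
    auto ret = formattedRead(r, fmt, args);
    // Return the read count together with the remaining input.
    return Tuple!(uint, R)(ret, r);
}

void main()
{
  string s = "42 worlds";
  int v;
  // Index [1] selects the remaining input from the returned tuple.
  s = formattedRead2(s.save, "%d", &v)[1];
  writefln("[%s][%s]", v, s);
}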
----

September 07, 2012
On Fri, 07 Sep 2012 10:35:37 -0400, monarch_dodra <monarchdodra@gmail.com> wrote:

> On Friday, 7 September 2012 at 13:58:43 UTC, Steven Schveighoffer wrote:
>> On Thu, 23 Aug 2012 07:33:13 -0400, monarch_dodra
>>
>> The only issue is, what if you *do* want ref behavior for strings?  You would need to wrap the string into a ref'd range.
>>  That is not a good proposition.  Unfortunately, the way IFTI works, there isn't an opportunity to affect the parameter type IFTI decides to use.
>>
>> [SNIP]
>>
>> -Steve
>
> If you *do* want ref behavior, I still don't see why we don't just do it the algorithm way and return by value:
>
> ----
> Tuple!(uint, R)
> formattedRead2(R, Char, S...)(R r, const(Char)[] fmt, S args)
> {
>      auto ret = formattedRead(r, fmt, args);
>      return Tuple!(uint, R)(ret, r);
> }
>
> void main()
> {
>    string s = "42 worlds";
>    int v;
>    s = formattedRead2(s.save, "%d", &v)[1];
>    writefln("[%s][%s]", v, s);
> }
> ----
>

This looks ugly.  Returning a tuple and having to split the result is horrible, I hated dealing with that in C++ (and I even wrote stuff that returned pairs!)

Not only that, but there are possible ranges which may not be reassignable.

I'd rather have a way to wrap a string into a ref-based input range.

We have three situations:

1. input range is a ref type already (i.e. a class or a pImpl struct), no need to pass this by ref, just wastes cycles doing double dereference.
2. input range is a value type, and you want to preserve the original.
3. input range is a value type, and you want to update the original.

I'd like to see the library automatically make the right decision for 1, and give you some mechanism to choose between 2 and 3.  To preserve existing code, 3 should be the default.

-Steve