[Issue 1482] New: std.file docs are insufficient
September 07, 2007
http://d.puremagic.com/issues/show_bug.cgi?id=1482

           Summary: std.file docs are insufficient
           Product: D
           Version: 2.005
          Platform: PC
        OS/Version: Linux
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Phobos
        AssignedTo: bugzilla@digitalmars.com
        ReportedBy: jlquinn@us.ibm.com


The prototype for read() returns a void[].  The docs say it returns an array of bytes.  Why does it return void[] instead of byte[] or ubyte[]?  What types is it safe to cast the return value to?

Does read() return the complete file from a single call?

write() and append() have the same issue.  What types may be passed in safely? What exactly gets written out if you pass an array of dchar?

The class (and others) really need overview text at the top.  Look at the Java API docs for inspiration.


-- 

September 07, 2007
http://d.puremagic.com/issues/show_bug.cgi?id=1482


thecybershadow@gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                URL|                            |http://digitalmars.com/d/pho
                   |                            |bos/std_file.html




------- Comment #1 from thecybershadow@gmail.com  2007-09-07 18:46 -------
It returns an array of the bytes that it read from the file. I think that the bug here is that the specs don't describe what a void[] really is. The closest thing to that is the "Implicit Conversions" section here:

http://digitalmars.com/d/arrays.html#strings
(linked to closest anchor, scroll a bit below)

A void[] is functionally the same as a byte[] or ubyte[], with the differences:
1) you can't access an element of it (since the type of the underlying data is
unknown)
2) any array implicitly converts to void[] - this allows you to work with
functions that take void[] arguments without using explicit casts, except when
you need to pass something that isn't an array, in which case you have to
resort to syntax like write(filename, (&mystruct)[0..1]).
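
The two differences listed above can be illustrated with a minimal sketch. It assumes a current D 2.x compiler, and the variable names are illustrative only; nothing here is taken from the Phobos documentation itself.

```d
import std.stdio : writeln;

void main()
{
    int[] numbers = [1, 2, 3];

    // Any array converts implicitly to void[]; the length becomes a byte count.
    void[] raw = numbers;
    assert(raw.length == numbers.length * int.sizeof);

    // Reading raw[0] as a value would not compile, because void has no
    // element type. Cast to a concrete element type before touching the data.
    auto bytes = cast(ubyte[]) raw;
    auto again = cast(int[]) raw;

    assert(bytes.length == 3 * int.sizeof);
    assert(again == [1, 2, 3]);
    writeln(bytes.length, " bytes, ints: ", again);
}
```

Both casts reinterpret the same memory without copying it, so the result is only meaningful when the bytes really do hold values of the target type.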

IMO D or Phobos should have a simple syntax or template that allows you to convert any variable to a void[], which is essentially a void* and a length. Currently I use this (pretty crude) template (which likely can be rewritten in a much better way):

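struct BufferEx
{
        // Overlay a buffer (void[]) on its raw representation:
        // a D dynamic array is stored as a length followed by a pointer.
        union
        {
                buffer buf;
                struct Fields
                {
                        size_t length;
                        void* ptr;
                }
                Fields fields;
        }
}

// View the bytes of any variable as a buffer, without copying.
// (ref replaces the older inout parameter syntax.)
buffer toBuffer(T)(ref T data)
{
        BufferEx b;
        b.fields.ptr = &data;
        b.fields.length = T.sizeof;
        return b.buf;
}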

The std.file functions aren't part of a class. Upside is simple and readable code for small programs, downside is possible name collisions ("read"/"write" is a common thing to be found in the global namespace). Yay for selective/static/renaming imports, I guess. Note that Tango uses a File class (which makes the code more bloated, as you need two operations - class instantiation and the operation - for a single file operation).
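
As a rough sketch of how static, renaming, and selective imports avoid such collisions (the local read() function and the alias names io and join are hypothetical, chosen only for illustration):

```d
static import std.file;              // std.file names must be fully qualified
import io = std.stdio;               // renaming import: io.writeln
import std.path : join = buildPath;  // selective import with a local alias

// A user-defined read() that would otherwise compete with std.file.read.
string read()
{
    return "local read";
}

void main()
{
    // Unambiguous: std.file.read is reachable only through its full name.
    assert(read() == "local read");

    auto scratch = join(std.file.tempDir(), "example.bin");
    io.writeln(scratch);
}
```

With the static import, std.file.read(path) and the local read() can coexist in the same module without any ambiguity.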

P.S. It's none of my business, but is IBM interested in D now? :)


-- 

September 07, 2007
http://d.puremagic.com/issues/show_bug.cgi?id=1482





------- Comment #2 from thecybershadow@gmail.com  2007-09-07 18:52 -------
Oops, in the snippet of code above I forgot to include:

public alias void[] buffer;

(actually it's in a compiler version clause which mixins "public alias
const(void)[] buffer;" for the 2.0 branch)


-- 

September 08, 2007
<d-bugmail@puremagic.com> wrote in message news:fbsnsp$q84$1@digitalmars.com...
<snip>
> IMO D or Phobos should have a simple syntax or template that allows you to
> convert any variable to a void[], which is essentially a void* and a length.
> Currently I use this (pretty crude) template (which likely can be rewritten in
> a much better way):
<snip>

And I use

   cast(void[]) (&var)[0..1]

Of course, you'd need to .dup it if you want it to be valid after var has gone out of scope.
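
A minimal sketch of that idiom wrapped in a helper, assuming a current D 2.x compiler. The Header struct and the asBytes name are hypothetical and not part of Phobos.

```d
// Hypothetical record type, used only for this demonstration.
struct Header
{
    uint magic;
    ushort versionNo;
    ushort flags;
}

// Hypothetical helper: view one variable as void[] without copying.
// `return ref` states that the result refers to the argument's storage.
void[] asBytes(T)(return ref T value)
{
    return cast(void[]) (&value)[0 .. 1];
}

void main()
{
    auto h = Header(0xCAFEBABE, 1, 0);

    void[] view = asBytes(h);
    assert(view.length == Header.sizeof);
    assert(view.ptr is &h);          // same memory, no copy

    // .dup makes an independent copy that outlives (and ignores changes to) h.
    void[] copy = view.dup;
    h.magic = 0;
    assert((cast(Header[]) view)[0].magic == 0);
    assert((cast(Header[]) copy)[0].magic == 0xCAFEBABE);
}
```

The view is only valid while the variable it points at is alive, which is why the copy is needed for anything that has to persist.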

Stewart. 

September 11, 2007
http://d.puremagic.com/issues/show_bug.cgi?id=1482





------- Comment #3 from jlquinn@us.ibm.com  2007-09-11 09:04 -------
(In reply to comment #1)
> It returns an array of the bytes that it read from the file. I think that the bug here is that the specs don't describe what a void[] really is. Closest to that is the "Implicit Conversions" sections here:
> 
> http://digitalmars.com/d/arrays.html#strings
> (linked to closest anchor, scroll a bit below)
> 
> A void[] is functionally the same as a byte[] or ubyte[], with the differences:

It's essentially the same as char[] as well, then, right?  A char is really just  a byte that is given preferential treatment as a UTF-8 string, no?

> The std.file functions aren't part of a class. Upside is simple and readable code for small programs, downside is possible name collisions ("read"/"write" is a common thing to be found in the global namespace). Yay for selective/static/renaming imports,

The docs don't make this 100% clear.  I think it's mostly visual formatting, combined with a lack of a cohesive summary for the module.

A related doc note ...  I find that the docs make it difficult to distinguish the methods associated with a class when there are multiple classes on a single page.  Again, I'd call it visual formatting.

Another issue I found is that I didn't get a definitive answer on whether reading files automatically treats them as UTF-8 or not.


> P.S. It's none of my business, but is IBM interested in D now? :)

I can't speak for the company, but I personally find the core language more pleasant than C++.  If it offers better performance than java then it becomes more interesting to me :-)



-- 

September 11, 2007
> A related doc note ...  I find that the docs make it difficult to distinguish
> the methods associated with a class when there are multiple classes on a single
> page.  Again, I'd call it visual formatting.

I find the same thing.

> Another issue I found is that I didn't get a definitive answer on whether
> reading files automatically treats them as UTF-8 or not.

I believe std.file.read simply reads bytes: it doesn't do any conversion, and it doesn't handle UTF-8, UTF-16, or UTF-32 BOMs or anything else.

>> P.S. It's none of my business, but is IBM interested in D now? :)
> 
> I can't speak for the company, but I personally find the core language more
> pleasant than C++.  If it offers better performance than java then it becomes
> more interesting to me :-)

D offers performance similar to, if not better than, C/C++ in some cases.

Regan
September 12, 2007
http://d.puremagic.com/issues/show_bug.cgi?id=1482





------- Comment #4 from thecybershadow@gmail.com  2007-09-12 07:07 -------
(In reply to comment #3)
> It's essentially the same as char[] as well, then, right?  A char is really just  a byte that is given preferential treatment as a UTF-8 string, no?

Indeed (also character literals don't implicitly cast to integers, and
vice-versa).

> The docs don't make this 100% clear.  I think it's mostly visual formatting, combined with a lack of a cohesive summary for the module.
> 
> A related doc note ...  I find that the docs make it difficult to distinguish the methods associated with a class when there are multiple classes on a single page.  Again, I'd call it visual formatting.

It's true. I've gotten into the habit of just checking the library source most of the time...

> Another issue I found is that I didn't get a definitive answer on whether reading files automatically treats them as UTF-8 or not.

The std.file routines treat the files as raw data. No conversion is performed. You are free to operate on the data as you please; however, Phobos routines that take char[] arguments will usually expect them to be encoded in UTF-8.
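
A small sketch of what "raw data" means in practice, assuming a current D 2.x Phobos. The scratch file name and the readUtf8 helper are hypothetical. With no size limit given, read returns the whole file from a single call, and an array of dchar is written as its in-memory bytes in native byte order.

```d
import std.file : read, remove, tempDir, write;
import std.path : buildPath;
import std.utf : validate;

// Hypothetical helper: read a whole file and check that it is valid UTF-8.
// std.utf.validate throws a UTFException if it is not.
char[] readUtf8(string path)
{
    auto text = cast(char[]) read(path);
    validate(text);
    return text;
}

void main()
{
    // Hypothetical scratch file, removed again when main exits.
    auto path = buildPath(tempDir(), "issue1482_demo.txt");
    scope (exit) remove(path);

    // A dchar array goes out as raw UTF-32 code units, 4 bytes each.
    dchar[] wide = "hi"d.dup;
    write(path, wide);
    assert((cast(ubyte[]) read(path)).length == 2 * dchar.sizeof);

    // UTF-8 text comes back as exactly the bytes that were written.
    write(path, "h\u00E9llo");
    assert(readUtf8(path) == "h\u00E9llo");
}
```

Nothing in read or write looks at the encoding; any validation or transcoding is a separate, explicit step.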

Pretty sure there was a page about Unicode and UTF-8 in D somewhere...


-- 

October 11, 2009
http://d.puremagic.com/issues/show_bug.cgi?id=1482


Andrei Alexandrescu <andrei@metalanguage.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
                 CC|                            |andrei@metalanguage.com
         AssignedTo|nobody@puremagic.com        |andrei@metalanguage.com


-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
September 25, 2010
http://d.puremagic.com/issues/show_bug.cgi?id=1482


Andrei Alexandrescu <andrei@metalanguage.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|                            |FIXED


--- Comment #5 from Andrei Alexandrescu <andrei@metalanguage.com> 2010-09-25 15:38:02 PDT ---
http://www.dsource.org/projects/phobos/changeset/2048
