September 04, 2013 Multidimensional dynamic array of strings initialized with split()

Hello friends,

with the following code

    import std.stdio;
    import std.array;

    auto file71 = File(argv[2], "r");

    string[][] buffer;
    foreach (line; file71.byLines) {
        buffer ~= split(line, "\t");
    }

I am trying to cut the lines from the file with tab as delimiter to pre-fetch the content of a file before further processing.

Each split() call gives correct string[] values in and of itself. But when I try to read buffer, after the loop, I got corrupted data, like this:

    [ ["-", "_Unit226", "constructor", "sub_00BE896C\t1\t?:?\t\t//con", "t", "uc...

Obviously the concatenation is doing no good, since there are tabs in the values...

What am I missing here? Is it that split() allocated memory that gets overwritten in the loop and the ~= just copies the subarrays not copying the subsubarrays? How to overcome this?

Thank you very much,
Ludovit
September 04, 2013 Re: Multidimensional dynamic array of strings initialized with split()

Posted in reply to Ludovit Lucenic
On Thu, Sep 05, 2013 at 12:57:34AM +0200, Ludovit Lucenic wrote:
> Hello friends,
>
> with the following code
>
> import std.stdio;
> import std.array;
>
> auto file71 = File(argv[2], "r");
>
> string[][] buffer;
> foreach (line; file71.byLines) {
>     buffer ~= split(line, "\t");
> }
>
> I am trying to cut the lines from the file with tab as delimiter to pre-fetch the content of a file before further processing.
>
> Each split() call gives correct string[] values in and of itself. But when I try to read buffer, after the loop, I got corrupted data, like this:
>
> [ ["-", "_Unit226", "constructor", "sub_00BE896C\t1\t?:?\t\t//con", "t", "uc...
>
> Obviously the concatenation is doing no good, since there are tabs in the values...
>
> What am I missing here? Is it that split() allocated memory that gets overwritten in the loop and the ~= just copies the subarrays not copying the subsubarrays? How to overcome this?
[...]

The problem is that File.byLine() reuses its buffer for efficiency, and split is optimized to return slices into that buffer instead of copying each substring. So after every iteration the buffer (and therefore the slices into it) gets overwritten.

Replace the loop body with the following and it should work:

    buffer ~= split(line.dup, "\t");

T

--
Dogs have owners ... cats have staff. -- Krista Casada
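The aliasing described in this reply can be reproduced without a file. The following is only a minimal sketch: the helper names splitThenOverwrite and copyThenOverwrite are made up for illustration, and filling the buffer with '#' merely stands in for byLine reading the next line into the same memory.

```d
import std.array : split;

// Hypothetical helper: split a mutable buffer, then overwrite the buffer
// in place, the way a reused line buffer gets overwritten by the next read.
char[][] splitThenOverwrite(char[] buf)
{
    char[][] fields = split(buf, "\t"); // fields are slices into buf, no copy
    buf[] = '#';                        // simulate the next line arriving
    return fields;                      // the slices now show the new bytes
}

// Same steps, but the line is copied with .idup before splitting,
// so the fields point at their own immutable memory.
string[] copyThenOverwrite(char[] buf)
{
    string[] fields = split(buf.idup, "\t");
    buf[] = '#';
    return fields;                      // unaffected by the overwrite
}

void main()
{
    import std.stdio : writeln;

    writeln(splitThenOverwrite("alpha\tbeta".dup)); // fields were clobbered
    writeln(copyThenOverwrite("alpha\tbeta".dup));  // fields are intact
}
```

Copying costs one allocation per line, which is the price of keeping every row after the loop has moved on.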
September 04, 2013 Re: Multidimensional dynamic array of strings initialized with split()

Posted in reply to H. S. Teoh
On Wednesday, 4 September 2013 at 23:06:10 UTC, H. S. Teoh wrote:
>
> The problem is that File.byLine() reuses its buffer for efficiency, and
> split is optimized to return slices into that buffer instead of copying
> each substring. So after every iteration the buffer (and therefore the
> slices into it) gets overwritten.
>
> Replace the loop body with the following and it should work:
>
> buffer ~= split(line.dup, "\t");
>
>
> T
Thank you so much for your explanation.
Helped me a lot to understand things and works actually :-)
LL

September 05, 2013 Re: Multidimensional dynamic array of strings initialized with split()

Posted in reply to Ludovit Lucenic
I have created a wiki on this one.
http://wiki.dlang.org/Read_table_data_from_file

September 05, 2013 Re: Multidimensional dynamic array of strings initialized with split()

Posted in reply to Ludovit Lucenic
On 09/05/2013 01:14 AM, Ludovit Lucenic wrote:
> I have created a wiki on this one.
> http://wiki.dlang.org/Read_table_data_from_file
>
Compiling with "DMD64 D Compiler v2.064-devel-52cc287" produces the following errors:
* You had byLines in your original code as well. Shouldn't it be byLine?
* You are missing the closing brace of the foreach loop as well.
* "Error: cannot append type char[][] to type string[][]" I have to replace .dup with .idup
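With those three points applied, the original eager loop might look like the sketch below. The function name readTable and its path parameter are placeholders chosen for this example, not names from the wiki page.

```d
import std.array : split;
import std.stdio : File;

// Reads a tab-separated file eagerly into rows of fields.
// `readTable` and `path` are illustrative names only.
string[][] readTable(string path)
{
    string[][] buffer;
    foreach (line; File(path, "r").byLine) // byLine, not byLines
    {
        // .idup makes an immutable copy of the reused line buffer, so
        // split yields string[] (not char[][]) and the fields stay valid
        // after the next iteration overwrites the buffer.
        buffer ~= split(line.idup, "\t");
    }                                      // closing brace of the loop
    return buffer;
}

void main(string[] args)
{
    import std.stdio : writeln;

    if (args.length > 1)
        writeln(readTable(args[1]));
}
```

Each row is appended as one string[] element, so the result has one entry per line of the file.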
The following version is lazy:

    import std.stdio;
    import std.array;
    import std.algorithm;

    auto readInData(File inputFile, string fieldSeparator)
    {
        return
            inputFile
            .byLine
            .map!(line => line
                  .idup
                  .split("\t"));
    }

The caller can either use the result lazily:

    import std.range;

    void main()
    {
        auto file = File("deneme.txt");
        writeln(readInData(file, "\t").take(2));
    }

Or call .array on the result to consume the range eagerly:

    auto table = readInData(file, "\t").array;

Ali
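
Note that the lazy readInData in this post accepts a fieldSeparator argument but splits on a hard-coded "\t". A variant that actually uses the parameter could be sketched as follows; this is an adaptation for illustration, not the code from the wiki page, and it assumes the input path comes from the first command-line argument.

```d
import std.algorithm : map;
import std.array : array, split;
import std.stdio : File;

// Lazily yields one string[] per line, split on the caller's separator.
auto readInData(File inputFile, string fieldSeparator)
{
    return inputFile
        .byLine
        // .idup copies the reused line buffer before it is split
        .map!(line => line.idup.split(fieldSeparator));
}

void main(string[] args)
{
    import std.stdio : writeln;

    if (args.length < 2)
        return;

    // .array consumes the lazy range and stores every row
    auto table = readInData(File(args[1]), "\t").array;
    writeln(table.length, " rows read");
}
```

Because the copy happens inside the map, the rows stay valid whether the caller consumes the range lazily or collects it with .array.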

September 22, 2017 Re: Multidimensional dynamic array of strings initialized with split()

Posted in reply to Ali Çehreli
On Thursday, 5 September 2013 at 16:22:46 UTC, Ali Çehreli wrote:
>
> Compiling with "DMD64 D Compiler v2.064-devel-52cc287" produces the following errors:
>
> * You had byLines in your original code as well. Shouldn't it be byLine?
>
> * You are missing the closing brace of the foreach loop as well.
>
> * "Error: cannot append type char[][] to type string[][]" I have to replace .dup with .idup

Thank you for pointing out the errors, Ali. I have updated the example.

>
> The following version is lazy:
>
> import std.stdio;
> import std.array;
> import std.algorithm;
>
> auto readInData(File inputFile, string fieldSeparator)
> {
>     return
>         inputFile
>         .byLine
>         .map!(line => line
>               .idup
>               .split("\t"));
> }
>
> The caller can either use the result lazily:
>
> import std.range;
>
> void main()
> {
>     auto file = File("deneme.txt");
>     writeln(readInData(file, "\t").take(2));
> }
>
> Or call .array on the result to consume the range eagerly:
>
> auto table = readInData(file, "\t").array;
>
> Ali

Thank you for the alternative approaches.
This thread is linked from Credits section, if someone wants to find out more on the topic from the wiki.