Thread overview
How to work with one very large text table but not in memory
Apr 06, 2021
Alain De Vos
Apr 06, 2021
Ali Çehreli
Apr 06, 2021
tsbockman
April 06, 2021

I have one very large text table I want to work with.
But I don't want to keep the table in memory. What should I use?
Using an SQL database is overkill in my setting.
There are 10 columns but millions of rows.

April 06, 2021
On 4/6/21 12:55 PM, Alain De Vos wrote:
> I have one very large text table I want to work with.
> But I don't want to keep the table in memory. What should I use?
> Using an SQL database is overkill in my setting.
> There are 10 columns but millions of rows.

Jon Degenhardt of eBay uses D in similar ways with tsv-utils. He has good documentation here:

  https://github.com/eBay/tsv-utils

I have a feeling one of his tools may already be useful to you. :)

Personally, I would just parse the file line by line, filtering as I go if needed, build an array or an associative array from the rows I care about, and use the data from there.
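
A minimal sketch of that line-by-line approach, under some assumptions that are not from the original post: the table is tab-separated, has no header row, and the value column is numeric. The function name `sumByKey` and the column indices are hypothetical, chosen only for illustration.

```d
import std.algorithm : splitter;
import std.array : array;
import std.conv : to;
import std.stdio : File;

/// Sums one numeric column per key while reading one line at a time.
/// keyCol and valueCol are zero-based column indices (hypothetical layout).
double[string] sumByKey(string path, size_t keyCol, size_t valueCol)
{
    double[string] totals;
    // byLine reuses its buffer, so only the current row is held in memory.
    foreach (line; File(path).byLine)
    {
        auto fields = line.splitter('\t').array;
        // Filter: skip rows that are too short to contain both columns.
        if (fields.length <= keyCol || fields.length <= valueCol)
            continue;
        // idup copies the key because the line buffer is overwritten next turn.
        auto key = fields[keyCol].idup;
        // Assumes the value column parses as a number; to!double throws otherwise.
        totals[key] = totals.get(key, 0.0) + fields[valueCol].to!double;
    }
    return totals;
}
```

Only the associative array grows with the data, so memory use depends on the number of distinct keys rather than the number of rows. Calling `sumByKey("table.tsv", 0, 3)` would total column 3 for each distinct value in column 0.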

Ali

April 06, 2021

On Tuesday, 6 April 2021 at 19:55:03 UTC, Alain De Vos wrote:
> I have one very large text table I want to work with.
> But I don't want to keep the table in memory. What should I use?
> Using an SQL database is overkill in my setting.
> There are 10 columns but millions of rows.

You might find memory-mapped files useful: http://phobos.dpldocs.info/std.mmfile.MmFile.html

This allows D code to access the entire contents of the file as though it were a giant byte array in RAM, without requiring that there actually be enough physical RAM available to really do that. The OS is responsible for paging data to and from the disk as it is accessed.
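
As a rough illustration, here is a sketch that maps a file with `MmFile` and scans it as a byte slice to count rows. The function name `countRows` is hypothetical, and the sketch assumes a non-empty file with `\n` line endings; mapping an empty file may fail, depending on the platform.

```d
import std.mmfile : MmFile;

/// Counts newline-terminated rows by scanning the mapped bytes.
size_t countRows(string path)
{
    // The single-argument constructor maps the whole file read-only.
    auto mmf = new MmFile(path);
    scope (exit) destroy(mmf); // unmap and close when the function returns

    // mmf[] yields void[]; view it as bytes. Nothing is copied here,
    // the OS pages data in as the loop touches it.
    auto data = cast(const(ubyte)[]) mmf[];

    size_t rows;
    foreach (b; data)
        if (b == '\n')
            ++rows;
    return rows;
}
```

Slices taken from `mmf[]` are only valid while the mapping is alive, so anything that must outlive the `MmFile` object should be copied first (for example with `.dup`).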