Jump to page: 1 2
Thread overview
Lisp like lists and a problem with TANGO fromStringz
Mar 02, 2008
Bjoern
Mar 02, 2008
Bjoern
Mar 02, 2008
Bjoern
Mar 02, 2008
bearophile
Mar 02, 2008
Bjoern
Lisp like lists -> box/invariant
Mar 02, 2008
Bjoern
Mar 02, 2008
Oskar Linde
Mar 03, 2008
bearophile
Mar 03, 2008
kov_serg
Mar 03, 2008
bearophile
Array append performance (Was: Re: Lisp like lists and a problem with TANGO fromStringz)
Mar 03, 2008
Oskar Linde
Mar 03, 2008
Bjoern
March 02, 2008
Undefined Symbol ... fromStringz
Hi
even after spending a few hours I was not able to figure out where I
made an mistake.

The error msg as follows :

C:\dmd\projects\bsdutil>dmd lisplist
C:\dmd\bin\link.exe lisplist,,,user32+kernel32/noi+tango-user-dmd.lib;
OPTLINK (R) for Win32  Release 8.00.1
Copyright (C) Digital Mars 1989-2004  All rights reserved.
lisplist.obj(lisplist)
  Error 42: Symbol Undefined _D5tango4stdc7stringz11fromStringzFPaZAa
--- errorlevel 1

somehow :) nagged, hope somebody is willing to help
Many thanks in advance :
Bjoern

attached the source



March 02, 2008
"Bjoern" <nanali@nospam-wanadoo.fr> wrote in message news:fqea2o$tnj$1@digitalmars.com...
> Undefined Symbol ... fromStringz
> Hi
> even after spending a few hours I was not able to figure out where I
> made an mistake.
>
> The error msg as follows :
>
> C:\dmd\projects\bsdutil>dmd lisplist
> C:\dmd\bin\link.exe lisplist,,,user32+kernel32/noi+tango-user-dmd.lib;
> OPTLINK (R) for Win32  Release 8.00.1
> Copyright (C) Digital Mars 1989-2004  All rights reserved.
> lisplist.obj(lisplist)
>  Error 42: Symbol Undefined _D5tango4stdc7stringz11fromStringzFPaZAa
> --- errorlevel 1
>
> somehow :) nagged, hope somebody is willing to help
> Many thanks in advance :
> Bjoern

If you're not using Tango trunk (that is, you're using the 0.99.4 release), it's called "fromUtf8z".  They just forgot to rename it.

If that doesn't work, well hell.


March 02, 2008
Bjoern:
> z = cast(CONS *) malloc(CONS.sizeof);

Suggestion 1: all those malloc will slow your code down, and on 32 bit CPU they burn more than 8 bytes for each cons. So I suggest you to manage a free list of them, allocating them at blocks, for example of about 64 kb. This reduces memory used, speeds up the allocation/deallocation, and may increase cache locality of your cons cells pointers a little too.

Suggestion 2: I don't fully understand yet how the GC heap interacts with the C heap, so the following may lead to GC problems. You can allocate those memory blocks from the C heap instead of the GC heap (then you may want to add a root for each externally given pointer, this may be a bit messy). The C malloc gives you an aligned memory block, so all cons structs have an address aligned to 8/16 bytes too. So you have one or more bits free into the pointers. So you can use your cons cells to represent far more than just pairs of pointers, like chars, symbols, not-too-much-short integers, almost-floats, short strings (just need their length, that is small. Small strings are very common), etc. Many Lisp implementations use such tagged data, this reduces pointer deferences and memory use. The problem is that this is dynamic typing, so such cons become a kind of box/variant (and this isn't too much bad, I think).

[Unrelated question: why aren't D dynamic arrays triples of values? So they can store their actual length and the memory they have available, so they can support  something like the reserve() of C++ STL. Is the GC already storing such third value? But then I haven't found a reliable way to perform a reserve-like operation.]

Bye,
bearophile
March 02, 2008
Jarrett Billingsley schrieb:
> "Bjoern" <nanali@nospam-wanadoo.fr> wrote in message news:fqea2o$tnj$1@digitalmars.com...
>> Undefined Symbol ... fromStringz
>> Hi
>> even after spending a few hours I was not able to figure out where I
>> made an mistake.
>>
>> The error msg as follows :
>>
>> C:\dmd\projects\bsdutil>dmd lisplist
>> C:\dmd\bin\link.exe lisplist,,,user32+kernel32/noi+tango-user-dmd.lib;
>> OPTLINK (R) for Win32  Release 8.00.1
>> Copyright (C) Digital Mars 1989-2004  All rights reserved.
>> lisplist.obj(lisplist)
>>  Error 42: Symbol Undefined _D5tango4stdc7stringz11fromStringzFPaZAa
>> --- errorlevel 1
>>
>> somehow :) nagged, hope somebody is willing to help
>> Many thanks in advance :
>> Bjoern
> 
> If you're not using Tango trunk (that is, you're using the 0.99.4 release), it's called "fromUtf8z".  They just forgot to rename it.
> 
> If that doesn't work, well hell. 
> 
> 
fromUtf8z results in undefined identifier ...

Ah, the joy of Sunday afternoon programming :)

As workaround I use :
alias bool function(char*, char*) CompFunc;
bool string_compare(char* a, char* b)
{
	return ( a[0 .. strlen(a)] == b[0 .. strlen(b)] );
	//return (fromStringz(a) == fromStringz(b));
}

Thanks Jarret, have a nice day
B
March 02, 2008
bearophile schrieb:
> Bjoern:
>> z = cast(CONS *) malloc(CONS.sizeof);
> 
> Suggestion 1: all those malloc will slow your code down, and on 32 bit CPU they burn more than 8 bytes for each cons. So I suggest you to manage a free list of them, allocating them at blocks, for example of about 64 kb. This reduces memory used, speeds up the allocation/deallocation, and may increase cache locality of your cons cells pointers a little too.
> 

Thanks bearophile,
well I thought 1) make it work then make it fast.

I'll pick up your idea. I recall that Uwe Salomon uses this technic in his Indigo container library. I'll have a look.

> Suggestion 2: I don't fully understand yet how the GC heap interacts with the C heap, so the following may lead to GC problems. You can allocate those memory blocks from the C heap instead of the GC heap (then you may want to add a root for each externally given pointer, this may be a bit messy). The C malloc gives you an aligned memory block, so all cons structs have an address aligned to 8/16 bytes too. So you have one or more bits free into the pointers. So you can use your cons cells to represent far more than just pairs of pointers, like chars, symbols, not-too-much-short integers, almost-floats, short strings (just need their length, that is small. Small strings are very common), etc. Many Lisp implementations use such tagged data, this reduces pointer deferences and memory use. The problem is that this is dynamic typing, so such cons become a kind of box/variant (and this isn't too much bad, I think).
> 
Yes,  could lead to problems, I think. So I will re-read Walters memory management article. HTH to understand.

The source compiles now. using dmd 1.025 but the output is not as expected. Maybe this is a GC problem. Dunno yet. I've attached the source. Hope you have a look.

> [Unrelated question: why aren't D dynamic arrays triples of values? So they can store their actual length and the memory they have available, so they can support  something like the reserve() of C++ STL. Is the GC already storing such third value? But then I haven't found a reliable way to perform a reserve-like operation.]
> 

If YOU don't know it .... :)

> Bye,
> bearophile

Ah, the joy of Sunday afternoon programming .. Bjoern



March 02, 2008
bearophile wrote:

quote :
So you have one or more bits free into the pointers. So you can use your cons cells to represent far more than just pairs of pointers, like chars, symbols, not-too-much-short integers, almost-floats, short strings (just need their length, that is small. Small strings are very common), etc. Many Lisp implementations use such tagged data, this reduces pointer deferences and memory use. The problem is that this is dynamic typing, so such cons become a kind of box/variant (and this isn't too much bad, I think).
end quote
---------
Yes, this is excactly what I want!
The idea is to implement a  rowset (database related) datatype based on lists.
Well, having a LISP list in D is, beside a possible rowset implementation, not that bad, I guess.
Thanks, Bjoern
March 02, 2008
Jarrett Billingsley schrieb:
> "Bjoern" <nanali@nospam-wanadoo.fr> wrote in message news:fqea2o$tnj$1@digitalmars.com...
>> Undefined Symbol ... fromStringz
>> Hi
>> even after spending a few hours I was not able to figure out where I
>> made an mistake.
>>
>> The error msg as follows :
>>
>> C:\dmd\projects\bsdutil>dmd lisplist
>> C:\dmd\bin\link.exe lisplist,,,user32+kernel32/noi+tango-user-dmd.lib;
>> OPTLINK (R) for Win32  Release 8.00.1
>> Copyright (C) Digital Mars 1989-2004  All rights reserved.
>> lisplist.obj(lisplist)
>>  Error 42: Symbol Undefined _D5tango4stdc7stringz11fromStringzFPaZAa
>> --- errorlevel 1
>>
>> somehow :) nagged, hope somebody is willing to help
>> Many thanks in advance :
>> Bjoern
> 
> If you're not using Tango trunk (that is, you're using the 0.99.4 release), it's called "fromUtf8z".  They just forgot to rename it.
> 
> If that doesn't work, well hell. 
> 
> 
Oh, fromStringz() works fine, except you use f.i. Cout(fromStringz(c)). The undefined symbol error message is very confusing.
Bjoern
March 02, 2008
bearophile wrote:

> [Unrelated question: why aren't D dynamic arrays triples of values? So they can store their actual length and the memory they have available, so they can support  something like the reserve() of C++ STL. Is the GC already storing such third value? But then I haven't found a reliable way to perform a reserve-like operation.]

The gc is storing that third value. Each allocated chunk has a size. If an array starts at the beginning of a gc chunk, its capacity is defined as the size of that chunk. Here is a way to reserve space for arrays:

// int rather than size_t due to IFTI sensitivity
void reserve(T)(inout T[] arr, int len) {
	if (len > arr.length) {
		auto t = arr.length;
		arr.length = len;
		arr.length = t;
	}
}

-- 
Oskar
March 03, 2008
To: Newsgroups: digitalmars.D
Subject: Re: Lisp like lists and a problem with TANGO fromStringz

Oskar Linde>Here is a way to reserve space for arrays:<

I did try a function equal to that one, but have you timed it? My timing results don't show any advantage of using such reserve.

------------------------------

D code:

// this D code is written to look as much as possible as the C++ version
import std.stdio: printf;
import std.c.time: clock, CLOCKS_PER_SEC, time_t;
import std.conv: toInt;

double myclock() {
    time_t t = clock();
    if (t == -1)
        return 0.0;
    else
        return t/cast(double)CLOCKS_PER_SEC;
}

void reserve(T)(ref T[] arr, int len) {
    // int rather than size_t due to IFTI sensitivity
    if (len > arr.length) {
        auto t = arr.length;
        arr.length = len;
        arr.length = t;
    }
}

int main(string[] argv) {
    int n = toInt(argv[1]);
    double t = myclock();

    switch (argv[2][0]) {
        case '1':
            int[] a;
            for (int i = 0; i < n; ++i)
                a ~= i;
            break;
        case '2':
            int[] a;
            a.reserve(n);
            for (int i = 0; i < n; ++i)
                a ~= i;
            break;
        case '3':
            auto a = new int[n];
            for (int i = 0; i < n; ++i)
                a[i] = i;
            break;
    }

    printf("%.3f s\n", myclock() - t);
    return 0;
}


------------------------------

C++ code:

#include <stdio.h>
#include <time.h>
#include <vector>

using namespace std;

double myclock() {
    time_t t = clock();
    if (t == -1)
        return 0.0;
    else
        return t/(double)CLOCKS_PER_SEC;
}

int main(int argc, char **argv) {
    int n = atoi(argv[1]);
    double t = myclock();

    switch (argv[2][0]) {
        case '1': {
            vector<int> a;
            for (int i = 0; i < n; ++i)
                a.push_back(i);
            break;
        }
        case '2': {
            vector<int> a;
            a.reserve(n);
            for (int i = 0; i < n; ++i)
                a.push_back(i);
            break;
        }
        case '3': {
            vector<int> a(n);
            a.reserve(n);
            for (int i = 0; i < n; ++i)
                a[i] = i;
            break;
        }
    }

    printf("%.3f s\n", myclock() - t);
    return 0;
}

-----------------------------------

OUTPUT TIMINGS:

DMD 1.026, N = 12_000_000:
  append:            3.3 s
  reserve + append:  3.3 s
  allocate + assign: 1.0 s

C++ MinGW 4.2.1, N = 12_000_000, best of 3:
  append:            1.4 s
  reserve + append:  0.5 s
  allocate + assign: 1.0 s

C++ compiled with -O3 -s
DMD compiled with -O -release -inline
Win on Pentium3 CPU.

Notes:
- All timings are best of 3, rounded to 2 significant digits.
- I don't understand how in C++ the append() can twice faster than the assign.

Bye,
bearophile
March 03, 2008
bearophile Wrote:

...
> 
> OUTPUT TIMINGS:
> 
> DMD 1.026, N = 12_000_000:
>   append:            3.3 s
>   reserve + append:  3.3 s
>   allocate + assign: 1.0 s
> 
> C++ MinGW 4.2.1, N = 12_000_000, best of 3:
>   append:            1.4 s
>   reserve + append:  0.5 s
>   allocate + assign: 1.0 s
> 
> C++ compiled with -O3 -s
> DMD compiled with -O -release -inline
> Win on Pentium3 CPU.
> 
> Notes:
> - All timings are best of 3, rounded to 2 significant digits.
> - I don't understand how in C++ the append() can twice faster than the assign.
...
It's simple.
1st do not use clock function -- it measure processor time for current thread. Simply use funtion that measure the time passed.
2nd reason is CPU cache behaviour. Turn it off and compare again.

http://en.wikipedia.org/wiki/CPU_cache
http://leitl.org/docs/comp/AMD_block_prefetch_paper.pdf
...

« First   ‹ Prev
1 2