February 05, 2012
On Sat, 28 Jan 2012 17:58:40 -0800, J Arrizza wrote:

Isn't this the killer "app" for D (like RoR for Ruby, etc.)? There was a thread a while ago where someone said the popularity of a language depends on an app that drives its use.

There's also the upcoming ARM-based Blackberry Pi. That could achieve a lot of penetration in the education sector internationally.

Catch 'em while they're young!

Steve
February 06, 2012
Don't you mean Raspberry Pi?

http://www.raspberrypi.org/

On Sunday, 5 February 2012 at 15:45:11 UTC, Steve Teale wrote:
> On Sat, 28 Jan 2012 17:58:40 -0800, J Arrizza wrote:
>
> Isn't this the killer "app" for D (like ROR for Ruby, etc.) ?  There was a
> thread a while ago where someone said the popularity of a language depends
> on an app that drives the use of that language.
>
> There's also the up-coming ARM based Blackberry Pi. That could achieve a lot of penetration in the education sector internationally.
>
> Catch em while they're young!
>
> Steve


February 07, 2012
On Mon, 06 Feb 2012 12:22:45 +0100, Ludovic Silvestre wrote:

> Don't you mean Raspberry Pi?
> 
Ah, yes - I keep doing that.

I have a VirtualBox emulator for it up and running, and will try to build GDC within it as soon as I get myself into town and buy more memory.

Steve
February 07, 2012
Hi, this is me again with another "size matters" topic. This time it's not the executable size, no! Instead, I want to discuss a runtime memory footprint and speed issue that affects everyone, and how to improve the situation dramatically.

In D we allocate memory through the GC, and it is initialized according to
the type's .init, which gives us a safe default. In most cases this results
in the memory block being zeroed out, as in the case of allocating ubyte[]
buffers. Let's assume we have a program that allocates some buffers in
advance that it may not use fully. This often happens when the input data
is much smaller than anticipated. So our memory manager should handle this
situation well:
   o  it has to zero out a memory block
   o  we probably won't need all of it

So here is a small benchmark that allocates 512 * 1 MB, first using the typical method: new ubyte[1024 * 1024]. The output is:

	** new ubyte[1024 * 1024]
	   resource usage: +526840 KB
	   user time: +0.098s | sys. time: +0.368s

As expected, physical memory usage increases by ~512 MB, and we spend a considerable amount of time in the system finding free memory blocks and in our program initializing the data to zero. Can we use the GC more directly? Let's try GC.calloc:

	** GC.calloc(1024 * 1024)
	   resource usage: +525104 KB
	   user time: +0.089s | sys. time: +0.370s

Again, 512 MB and about the same time. Nothing gained, but my RAM is starting to fill up. By the way, how does a good old call to C's 'malloc' compare? That gives us a block of garbage-'initialized' data, a situation we left behind for good in D! So here we go with another test:

	** malloc(1024 * 1024)
	   resource usage: +2048 KB
	   user time: +0.000s | sys. time: +0.002s

Oh nice! May I say... these 512 calls were practically free? 2 MB and 0.002 seconds aren't worth talking about. The operating system didn't actually allocate the memory; it merely gave us a virtual address range to use. Only when we write to the memory will physical pages be bound. That's perfect for a generously sized buffer, right? Well... we still want it zeroed out, so let's initialize the data to zero with ptr[0 .. 1024 * 1024] = 0:

	** malloc(1024 * 1024) + zero out
	   resource usage: +526336 KB
	   user time: +0.053s | sys. time: +0.366s

... and we are back at square one, except that the user time is notably lower. What we need is a facility that gives us lazily allocated, zeroed-out memory. And guess what, it's not too much to ask for. Here is 'calloc' to the rescue:

	** calloc(1, 1024 * 1024)
	   resource usage: +2048 KB
	   user time: +0.001s | sys. time: +0.001s

How does it work? The operating system fakes the memory allocation and
just gives us 131072 references (512 MB in 4 KB pages) to a special
read-only memory page of zeroes. The semantics are copy-on-write: we start
with a view of zeroed-out memory and only get real, private pages once we
write into them. (Sorry if I'm telling some of you nothing new, but I only
found this out today ;) )
The question I have is: should we go and improve druntime with that
knowledge? I'm not aware of any caveats; are there any?
Thanks for reading. The test program for Linux is in the attachment (I
used GDC to compile).

-- Marco

February 07, 2012
Aw Opera fooled me again into answering a post, instead of creating a new one - ignore this, I'll repost a proper thread.