Thread overview
Vectorization vs. sql
Feb 16, 2005
Knud Sørensen
Feb 16, 2005
Charlie Patterson
Feb 16, 2005
Knud Sørensen
Feb 17, 2005
Charlie Patterson
Feb 17, 2005
Dave
Feb 16, 2005
Craig Black
Feb 17, 2005
Norbert Nemec
Feb 17, 2005
Georg Wrede
Feb 17, 2005
Norbert Nemec
Feb 17, 2005
Georg Wrede
Feb 18, 2005
Norbert Nemec
Feb 19, 2005
Georg Wrede
Feb 20, 2005
Norbert Nemec
Feb 21, 2005
Georg Wrede
Feb 21, 2005
Dave
Feb 21, 2005
Norbert Nemec
Feb 21, 2005
Dave
Feb 24, 2005
Charles Hixson
Feb 24, 2005
Norbert Nemec
Feb 17, 2005
Norbert Nemec
Feb 17, 2005
Knud Sørensen
Feb 20, 2005
Knud Sørensen
Feb 20, 2005
Norbert Nemec
Feb 21, 2005
Georg Wrede
Feb 21, 2005
Knud Sørensen
Feb 21, 2005
Norbert Nemec
Feb 21, 2005
Knud Sørensen
Feb 21, 2005
Norbert Nemec
Feb 22, 2005
Knud Sørensen
Feb 22, 2005
Norbert Nemec
Feb 22, 2005
Knud Sørensen
Feb 21, 2005
Georg Wrede
February 16, 2005
I think vectorization is very similar to SQL.

Take matrix multiplication in SQL.


select a.index1,b.index2, sum(a.value*b.value) into c (index1,index2,value) from a,b where a.index2=b.index1 group by a.index1,b.index2;


So, maybe we can make a similar notation for vectorization in D.

Something like: vectorize sum(a[i,j]*b[j,k]) into c[i,k] with i=1..l,k=1..n over j=1..m;

Just an idea.

Knud
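
For readers who want to see what the proposed `vectorize ... over j` statement would compute, here is a minimal sketch in present-day D, written as plain loops. The function name `matMul` and the use of nested `int[][]` arrays are assumptions made for illustration; the thread specifies neither, and indices here start at 0 rather than 1.

```d
// Hypothetical helper (not from the thread): C = A * B, where
// c[i][k] is the sum over j of a[i][j] * b[j][k].
int[][] matMul(int[][] a, int[][] b)
{
    assert(b.length > 0, "b must have at least one row");
    immutable l = a.length;       // rows of a
    immutable m = b.length;       // shared dimension
    immutable n = b[0].length;    // columns of b
    auto c = new int[][](l, n);   // zero-initialised result
    foreach (i; 0 .. l)
        foreach (k; 0 .. n)
            foreach (j; 0 .. m)   // the "over j" part is a sum, hence +=
                c[i][k] += a[i][j] * b[j][k];
    return c;
}

unittest
{
    auto a = [[1, 2], [3, 4]];
    auto b = [[5, 6], [7, 8]];
    assert(matMul(a, b) == [[19, 22], [43, 50]]);
}
```

The `group by a.index1, b.index2` in the SQL version plays the same role as the two outer loops, and `sum(...)` corresponds to the accumulation in the innermost one.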

February 16, 2005
"Knud Sørensen" <12tkvvb02@sneakemail.com> wrote in message news:pan.2005.02.16.17.09.36.655981@sneakemail.com...
>I think vectorization is very similar to sql.

Sorry to be the first to rain on the parade, but lots of people despise SQL (and lots admire it).  It is usually considered to have an "impedance mismatch" with imperative languages, and that's how I tend to feel too.  It's hard to map the mind back and forth, and setting up variables, etc., is done differently.  The question would be: does another syntax for working "through" an array buy enough to justify itself?  Or is it easy enough to write one or a couple of for loops when the occasional need arises?


February 16, 2005
On Wed, 16 Feb 2005 13:51:00 -0500, Charlie Patterson wrote:

> 
> "Knud Sørensen" <12tkvvb02@sneakemail.com> wrote in message news:pan.2005.02.16.17.09.36.655981@sneakemail.com...
>>I think vectorization is very similar to sql.
> 
> Sorry to be the first to rain on the parade, but lots of people despise SQL
> (and lots admire it).  It is usually considered to have an "impedance mismatch"
> with imperative languages, and that's how I tend to feel too.
>  It's hard to
> map the mind back and forth, and setting up variables, etc., is done
> differently.  The question would be: does another syntax for working "through"
> an array buy enough to justify itself?
> Or is it easy enough to write one
> or a couple of for loops when the occasional need arises?


So far, no one has posted a general syntax for vectorization,
and I would like to play with this syntax to see where it leads.

I invite you to play along so we might learn from each other ;-)

If you think vectorization is just for loops, you might read up
on the subject.

http://www.google.com/custom?domains=www.digitalmars.com&q=vectorization&sa=Search&sitesearch=www.digitalmars.com

February 16, 2005
"Knud Sørensen" <12tkvvb02@sneakemail.com> wrote in message news:pan.2005.02.16.17.09.36.655981@sneakemail.com...
>I think vectorization is very similar to sql.
>
> Take matrix multiplication in sql.
>
>
> select a.index1,b.index2, sum(a.value*b.value) into c
> (index1,index2,value)
> from a,b where a.index2=b.index1 group by a.index1,b.index2;
>
>
> So, maybe we can make a similar notation for vectorization in D.
>
> Something like:
> vectorize sum(a[i,j]*b[j,k]) into c[i,k] with i=1..l,k=1..n over j=1..m;
>
> Just an idea.
>
> Knud

Why not just have a vectorize keyword that will give a hint to the compiler to optimize a block of code?  Then just use regular D loops.

for(int i = 1; i < l; i++)
for(int j = 1; j < m; j++)
for(int k = 1; k < n; k++)
{
   vectorize { c[i, k] = a[i,j] * b[j, k]; }
}


February 17, 2005
Knud Sørensen wrote:
> I think vectorization is very similar to sql.
> 
> Take matrix multiplication in sql.
> 
> 
> select a.index1,b.index2, sum(a.value*b.value) into c (index1,index2,value) from a,b where a.index2=b.index1 group by a.index1,b.index2;
> 
> 
> So, maybe we can make a similar notation for vectorization in D.
> 
> Something like: vectorize sum(a[i,j]*b[j,k]) into c[i,k] with i=1..l,k=1..n over j=1..m;

I guess most people in this group will not like that SQL-like syntax. Still, the idea is very similar to what I already had in mind for vectorized expressions:

	c = [i in 0..l,k in 0..n](sum([j in 0..m](a[i,j]*b[j,k])))

(The exact syntax is still disputable.)

The idea is to make
	[x in a..b](some_expr(x))
a vectorized expression, i.e. an expression that returns an array whose entries are evaluated in no specific order and only if needed. I.e.

	([i in 0..5](a[i]*b[i]))[2]

would be equivalent to

	a[2]*b[2]

or, a bit trickier:

	([i in 2..5](a[i]*b[i]))[2] 	== 	a[4]*b[4]

The main difference from the SQL-like 'vectorize' statement is that this is not restricted to complete statements, though of course it works just as well for assignments:

	[i in 0..m](a[i] = b[i+1]+c);

Furthermore, it should work across functions as well:

	int[] add(int[] a, int[] b) {
		assert(a.length == b.length);
		return [i in 0..a.length](a[i]+b[i]);
	}

The compiler should be able to recognize the vectorized expression in the return statement and optimize it when the function is inlined.
(This detail needs further refinement)
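
The "evaluated only if needed" behaviour described above can be approximated in present-day D with lazy ranges from the standard library. This is only a sketch of the semantics, not the proposed syntax: `lazyProduct` is a hypothetical name, and `iota`/`map` stand in for the `[i in lo..hi](...)` notation.

```d
import std.algorithm : map;
import std.array : array;
import std.range : iota;

// Hypothetical stand-in for [i in lo..hi](a[i]*b[i]): a lazy range
// whose elements are computed only when they are read.
auto lazyProduct(int[] a, int[] b, size_t lo, size_t hi)
{
    return iota(lo, hi).map!(i => a[i] * b[i]);
}

// Element-wise sum, corresponding to the add() function in the post.
int[] add(int[] a, int[] b)
{
    assert(a.length == b.length);
    return iota(a.length).map!(i => a[i] + b[i]).array;
}

unittest
{
    auto a = [1, 2, 3, 4, 5];
    auto b = [10, 20, 30, 40, 50];
    // Indexing the lazy range evaluates just that one element.
    assert(lazyProduct(a, b, 0, 5)[2] == a[2] * b[2]);
    // With a range starting at 2, index 2 refers to i == 4.
    assert(lazyProduct(a, b, 2, 5)[2] == a[4] * b[4]);
    assert(add(a, b) == [11, 22, 33, 44, 55]);
}
```

Whether a compiler would turn such an expression into vector instructions is a separate question; the sketch only shows the indexing rules from the two examples in the post.
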
February 17, 2005
Craig Black wrote:
> "Knud Sørensen" <12tkvvb02@sneakemail.com> wrote in message news:pan.2005.02.16.17.09.36.655981@sneakemail.com...
> 
>>I think vectorization is very similar to sql.
>>
>>Take matrix multiplication in sql.
>>
>>
>>select a.index1,b.index2, sum(a.value*b.value) into c (index1,index2,value)
>>from a,b where a.index2=b.index1 group by a.index1,b.index2;
>>
>>
>>So, maybe we can make a similar notation for vectorization in D.
>>
>>Something like:
>>vectorize sum(a[i,j]*b[j,k]) into c[i,k] with i=1..l,k=1..n over j=1..m;
>>
>>Just an idea.
>>
>>Knud
> 
> 
> Why not just have a vectorize keyword that will give a hint to the compiler to optimize a block of code?  Then just use regular D loops.
> 
> for(int i = 1; i < l; i++)
> for(int j = 1; j < m; j++)
> for(int k = 1; k < n; k++)
> {
>    vectorize { c[i, k] = a[i,j] * b[j, k]; }
> } 

I don't like that idea too much: first, you say "do in that order", then you say "but ignore the order".

The for-loop simply is far too flexible to combine it with vectorization. What is needed is a new kind of a loop that does not imply an order in the first place.

(See my suggestion in the other post)
February 17, 2005
>> Why not just have a vectorize keyword that will give a hint to the compiler to optimize a block of code?  Then just use regular D loops.
>>
>> for(int i = 1; i < l; i++)
>> for(int j = 1; j < m; j++)
>> for(int k = 1; k < n; k++)
>> {
>>    vectorize { c[i, k] = a[i,j] * b[j, k]; }
>> } 
> 
> 
> I don't like that idea too much: first, you say "do in that order", then you say "but ignore the order".
> 
> The for-loop simply is far too flexible to combine it with vectorization. What is needed is a new kind of a loop that does not imply an order in the first place.

Why not

>> vectorize(int i; minVi; maxVi) {
>> vectorize(int j; minVj; maxVj) {
>> vectorize(int k; minVk; maxVk)
>> {
>>    c[i, k] = a[i,j] * b[j, k];
>> } }}
February 17, 2005
Georg Wrede wrote:
> Why not
> 
>  >> vectorize(int i; minVi; maxVi) {
>  >> vectorize(int j; minVj; maxVj) {
>  >> vectorize(int k; minVk; maxVk)
>  >> {
>  >>    c[i, k] = a[i,j] * b[j, k];
>  >> } }}

Because this means that you can only vectorize statements and not expressions. This again means:
* you cannot express vectorized returns
* you cannot mix vectorized operations with other array operations
* you cannot feed the result of a vectorized operation into a function without storing it into a temporary variable

Furthermore, this syntax still seems to imply that the 'i' loop is the outermost loop. Loop reordering in particular is an important optimization that an advanced optimizing compiler could perform.

(Instead of having a syntax that implies an order and then saying "But the compiler is allowed to change the order", I would strongly prefer a syntax that differs substantially from sequential statements.)

By the way: be aware that I am only considering *vectorization*, i.e. command-level parallelization. Parallelizing whole blocks of code is a completely different story and should be handled separately. That kind of parallelization needs different strategies in the compiler and targets a largely different audience.
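
To make the loop-reordering point concrete, here is a small sketch in present-day D. Both function names are hypothetical and the code is not from the thread; it only shows that for this computation the nesting order of the loops does not change the result, which is what would leave a compiler free to pick whichever order suits the hardware.

```d
// Order i, k, j: the innermost loop accumulates into one c[i][k].
int[][] mulIKJ(int[][] a, int[][] b)
{
    auto c = new int[][](a.length, b[0].length);
    foreach (i; 0 .. a.length)
        foreach (k; 0 .. b[0].length)
            foreach (j; 0 .. b.length)
                c[i][k] += a[i][j] * b[j][k];
    return c;
}

// Order i, j, k: the innermost loop walks along a row of b and of c.
int[][] mulIJK(int[][] a, int[][] b)
{
    auto c = new int[][](a.length, b[0].length);
    foreach (i; 0 .. a.length)
        foreach (j; 0 .. b.length)
            foreach (k; 0 .. b[0].length)
                c[i][k] += a[i][j] * b[j][k];
    return c;
}

unittest
{
    auto a = [[1, 2, 3], [4, 5, 6]];
    auto b = [[7, 8], [9, 10], [11, 12]];
    assert(mulIKJ(a, b) == mulIJK(a, b));
    assert(mulIJK(a, b) == [[58, 64], [139, 154]]);
}
```

With integers the two orders give identical results. With floating-point elements the sums may be rounded differently depending on the order, which is one reason a language would have to state explicitly that the order is unspecified.
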
February 17, 2005
"Knud Sørensen" <12tkvvb02@sneakemail.com> wrote in message
>> Sorry to be the first to rain on the parade, but lots of people despise SQL
>> (and lots admire it).  It is usually considered to have an "impedance
>> mismatch"
>> with imperative languages, and that's how I tend to feel too.
>>  It's hard to
>> map the mind back and forth, and setting up variables, etc., is done
>> differently.  The question would be: does another syntax for working
>> "through"
>> an array buy enough to justify itself?
>> Or is it easy enough to write one
>> or a couple of for loops when the occasional need arises?
>
>
> So far, no one has posted a general syntax for vectorization,
> and I would like to play with this syntax to see where it leads.
>
> I invite you to play along so we might learn from each other ;-)

Thanks and I will follow along and help if I can.  But my point still stands that any relational-looking syntax will have that "impedance mismatch."

My understanding of D is that it cleans up where Java is too limited and C++ is too complicated.  I think vectorization is "out there" and not used often enough to warrant scaring some potential new users looking for an upgrade from Java.  IMHO.

Which is not to say I don't want any new concepts.  I think a good regular expressions engine would be used in every non-class-assignment program and is worthy of debating syntax.  (Maybe completely forgetting about the grep/awk/perl legacy.)  Again, I just think vectorization is an advanced concept that won't be used enough to warrant all that effort grafting it into D.  Sorry.


February 17, 2005
In article <cv2fha$b1q$1@digitaldaemon.com>, Charlie Patterson says...
>
>
>"Knud Sørensen" <12tkvvb02@sneakemail.com> wrote in message
>>> Sorry to be the first to rain on the parade, but lots of people despise SQL
>>> (and lots admire it).  It is usually considered to have an "impedance
>>> mismatch"
>>> with imperative languages, and that's how I tend to feel too.
>>>  It's hard to
>>> map the mind back and forth, and setting up variables, etc., is done
>>> differently.  The question would be: does another syntax for working
>>> "through"
>>> an array buy enough to justify itself?
>>> Or is it easy enough to write one
>>> or a couple of for loops when the occasional need arises?
>>
>>
>> So far, no one has posted a general syntax for vectorization,
>> and I would like to play with this syntax to see where it leads.
>>
>> I invite you to play along so we might learn from each other ;-)
>
>Thanks and I will follow along and help if I can.  But my point still stands that any relational-looking syntax will have that "impedance mismatch."
>

Agreed here, but...

>My understanding of D is that it cleans up where Java is too limited and C++ is too complicated.  I think vectorization is "out there" and not used often enough to warrant scaring some potential new users looking for an upgrade from Java.  IMHO.
>

...I'm not so sure here. Quite apart from the performance implications of the compiler vectorizing, what this discussion also implies is some sort of abbreviated syntax for common array operations.

IMHO, abbreviated array operations would probably be a larger productivity enhancement to D and, if done right, not nearly as scary as learning the regex syntax for new users ;)

That is not to say that the regex ideas are not great - they are - and would also be very welcome by me.

- Dave

>Which is not to say I don't want any new concepts.  I think a good regular expressions engine would be used in every non-class-assignment program and is worthy of debating syntax.  (Maybe completely forgetting about the grep/awk/perl legacy.)  Again, I just think vectorization is an advanced concept that won't be used enough to warrant all that effort grafting it into D.  Sorry.
>
>
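
As an illustration of the kind of abbreviated array operation Dave mentions, here is a sketch using the slice-operation syntax available in present-day D. The function name `scaledSum` is hypothetical, and nothing here implies that this is the syntax the participants had in mind at the time.

```d
// Element-wise a + b * scale without an explicit loop, using D's
// built-in array (slice) operations.
int[] scaledSum(int[] a, int[] b, int scale)
{
    assert(a.length == b.length);
    auto c = new int[](a.length);
    c[] = a[] + b[] * scale;   // applied to every element
    return c;
}

unittest
{
    assert(scaledSum([1, 2, 3], [10, 20, 30], 2) == [21, 42, 63]);

    auto v = [1, 2, 3];
    v[] *= 2;                  // in-place element-wise update
    assert(v == [2, 4, 6]);
}
```

The statement form says nothing about the order in which elements are processed, which is close to the property Norbert asks for elsewhere in the thread.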

