Thread overview
array.reverse segfaults
Oct 22, 2008
Moritz Warning
Oct 22, 2008
Bill Baxter
Oct 22, 2008
Moritz Warning
Oct 22, 2008
Moritz Warning
Oct 22, 2008
Denis Koroskin
Oct 23, 2008
Moritz Warning
October 22, 2008
Hi,

This piece of code segfaults on Debian Linux (with dmd 1.035).
Can someone tell me why?

char[] get(char[] str)
{
    return new char[](4);
}

void main(char[][] args)
{
    char[] str = get("abc");
    char[] reversed = str.reverse; // <-- access violation
}
October 22, 2008
On Wed, Oct 22, 2008 at 7:16 PM, Moritz Warning <moritzwarning@web.de> wrote:
> Hi,
>
> This piece of code segfaults on Debian Linux (with dmd 1.035):
> Can someone tell me why?
>
> char[] get(char[] str)
> {
>    return new char[](4);
> }
>
> void main(char[][] args)
> {
>    char[] str =  get("abc");
>   char[] reversed = str.reverse; // <-- access violation
> }

Does str.reverse actually return anything?
I think you need to do that as:
   str.reverse;
   char[] reversed = str;

However, if it's not supposed to return anything, then it's a bug that
it compiles.
If it's supposed to return something, then it's a bug that it crashes.
Does it work doing it as two lines?

--bb
October 22, 2008
On Wed, 22 Oct 2008 19:28:08 +0900, Bill Baxter wrote:

> Does str.reverse actually return anything? I think you need to do that
> as:
>    str.reverse;
>    char[] reversed = str;
> 
> However, if it's not supposed to return anything, then it's a bug that
> it compiles.
> If it's supposed to return something, then it's a bug that it crashes.
> Does it work doing it as two lines?
> 
> --bb

From the specs:
.reverse: Reverses in place the order of the elements in the array. Returns the array.

Removing the assignment to "reversed" doesn't change anything.
October 22, 2008
Moritz Warning wrote:
> Hi,
> 
> This piece of code segfaults on Debian Linux (with dmd 1.035):
> Can someone tell me why?
> 
> char[] get(char[] str)
> {
>     return new char[](4);
> }
> 
> void main(char[][] args)
> {
>     char[] str =  get("abc");
>    char[] reversed = str.reverse; // <-- access violation
> }

Simpler version:

void main()
{
    char[4] str;
    str.reverse;
}

Crashes in _adReverseChar when trying to memmove (3 - 255) bytes ;)

My best guess is that it just doesn't handle char.init values properly!
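
To make the arithmetic behind that crash concrete: a char that is never assigned holds char.init, which is 0xFF in D, so every element of `char[4] str` is 0xFF. If that byte is taken as a stride of 255 and subtracted from a small unsigned offset, the result wraps around to an enormous byte count. The sketch below only illustrates this. The names defaultCharByte and bytesToMove are hypothetical, and the exact types used inside _adReverseChar are an assumption, since the thread does not show them.

```d
/// Returns the byte value held by a default-initialised char.
ubyte defaultCharByte()
{
    char c;                 // no initialiser, so c == char.init (0xFF)
    return cast(ubyte) c;
}

/// Computes `hi - stride` with unsigned arithmetic, as a length passed
/// to memmove would be (assumption: the runtime uses an unsigned type).
size_t bytesToMove(size_t hi, size_t stride)
{
    return hi - stride;     // wraps around whenever stride > hi
}
```

With hi = 3 and stride = 255 the subtraction does not yield -252 but a value close to size_t.max, which would explain why memmove runs far outside the four-byte array.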
October 22, 2008
Tomas Lindquist Olsen wrote:
> Simpler version:
> 
> void main()
> {
>     char[4] str;
>     str.reverse;
> }
> 
> Crashes in _adReverseChar when trying to memmove (3 - 255) bytes ;)
> 
> My best guess is that it just doesn't handle char.init values properly!

When it tries to get the lower stride, it gets 0xFF from the table, but it doesn't check if this value is usable.

Probably just ignoring these invalid bytes would make it work. But I think the real question is, what should _adReverseChar really do on invalid UTF-8 input?
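
A minimal sketch of what "checking whether the value is usable" could look like, assuming a lookup that reports 0 for bytes that cannot start a UTF-8 sequence. The names leadStride and strideIsUsable are hypothetical and are not the runtime's actual table or functions. The check is also simplified: it rejects bad lead bytes and truncated sequences but does not catch every overlong or out-of-range encoding.

```d
/// Byte length of the UTF-8 sequence introduced by lead byte `b`,
/// or 0 when `b` cannot start a sequence at all.
size_t leadStride(ubyte b)
{
    if (b < 0x80) return 1;               // ASCII
    if (b >= 0xC2 && b <= 0xDF) return 2; // two-byte lead
    if (b >= 0xE0 && b <= 0xEF) return 3; // three-byte lead
    if (b >= 0xF0 && b <= 0xF4) return 4; // four-byte lead
    return 0;  // continuation byte, 0xC0, 0xC1, or 0xF5..0xFF
}

/// True when a complete sequence starts at s[i] and fits inside s.
bool strideIsUsable(const(char)[] s, size_t i)
{
    if (i >= s.length) return false;
    immutable n = leadStride(cast(ubyte) s[i]);
    if (n == 0 || n > s.length - i) return false;
    foreach (k; 1 .. n)
    {
        // every trailing byte must have the form 10xxxxxx
        if ((s[i + k] & 0xC0) != 0x80) return false;
    }
    return true;
}
```

What the caller does when strideIsUsable returns false (skip the byte, assert, or throw) is exactly the open question raised above.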
October 22, 2008
On Wed, 22 Oct 2008 13:10:20 +0200, Tomas Lindquist Olsen wrote:

> When it tries to get the lower stride, it gets 0xFF from the table, but it doesn't check if this value is usable.
> 
> Probably just ignoring these invalid bytes would make it work. But I think the real question is, what should _adReverseChar really do on invalid UTF-8 input?

I think it should do the same as on an invalid pointer: result in undefined behavior (=> segfault).
October 22, 2008
On Wed, 22 Oct 2008 15:21:03 +0400, Moritz Warning <moritzwarning@web.de> wrote:

> I think it should do the same as on an invalid pointer: result in
> undefined behavior (=> segfault).

It should fail an assert(isValidUtf8String(str)) before the in-place reverse, which throws in debug mode.
Release behaviour is open to debate, but I think it should be more robust. Given invalid input it may produce arbitrary wrong output, but a segfault? That goes too far.
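
One way to express that idea in current D is an `in` contract, which is checked in normal builds and compiled out by -release. This is a sketch under assumptions: isValidUtf8String is the hypothetical predicate named in the post, implemented here on top of std.utf.validate, and checkedReverse is a made-up stand-in for the runtime's reverse that delegates to Phobos' std.algorithm.mutation.reverse.

```d
import std.algorithm.mutation : reverse;
import std.utf : UTFException, validate;

/// Hypothetical predicate: true when `s` is well-formed UTF-8.
bool isValidUtf8String(const(char)[] s)
{
    try
    {
        validate(s);        // throws UTFException on malformed input
        return true;
    }
    catch (UTFException)
    {
        return false;
    }
}

/// Stand-in for the runtime's reverse. The contract is only checked
/// when the program is built without -release.
char[] checkedReverse(char[] s)
in
{
    assert(isValidUtf8String(s), "reverse called on invalid UTF-8");
}
do
{
    reverse(s);             // code-point aware for char[] in Phobos
    return s;
}
```

In a non-release build, calling checkedReverse on a default-initialised char[4] fails the assert instead of reaching the reversal. What a release build should do is left open here, as in the discussion.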
October 22, 2008
On Wed, Oct 22, 2008 at 9:46 AM, Denis Koroskin <2korden@gmail.com> wrote:
> It should not pass the assert(isValidUtf8String(str)) prior to in-place
> reverse, thus throwing an exception in debug mode.
> Release behaviour is open to debate, but I think it should be more
> robust. Given wrong input it may produce whatever wrong output, but
> segfault? That's too bold.
>

I'd expect it to behave like every other piece of code in the runtime that deals with Unicode and throw a UtfException, or whatever it is called.
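
A sketch of that behaviour in current D, where the exception thrown by std.utf.validate is called UTFException. The function name reverseUtf8 is hypothetical, and the approach (validate first, reverse the bytes of each sequence, then reverse the whole buffer) is one common technique, not necessarily what the runtime does.

```d
import std.algorithm.mutation : reverse;
import std.utf : stride, validate;

/// Reverses `s` in place by code point and returns it.
/// Throws UTFException, in every build mode, if `s` is not valid UTF-8.
char[] reverseUtf8(char[] s)
{
    validate(s);                        // reject bad input before touching it
    auto bytes = cast(ubyte[]) s;       // same memory, viewed as raw bytes
    for (size_t i = 0; i < bytes.length; )
    {
        immutable n = stride(s, i);     // safe: input was validated above
        reverse(bytes[i .. i + n]);     // flip the bytes of one sequence
        i += n;
    }
    reverse(bytes);                     // flipping everything restores each
    return s;                           // sequence and reverses their order
}
```

Because validation happens before any byte is moved, invalid input leaves the array unchanged and the caller gets an exception rather than a crash. The cost is one extra pass over the data, which is the speed trade-off any always-on check has to justify.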
October 23, 2008
On Wed, 22 Oct 2008 17:46:26 +0400, Denis Koroskin wrote:

> It should not pass the assert(isValidUtf8String(str)) prior to in-place reverse, thus throwing an exception in debug mode. Release behaviour is open to debate, but I think it should be more robust. Given wrong input it may produce whatever wrong output, but segfault? That's too bold.

I was only referring to release builds.
IMHO, if additional robustness doesn't result in a speed hit,
then throwing an exception would be better.