Thread overview
Pasting illegal character causes segmentation fault
Dec 25, 2009
Mitja
Dec 25, 2009
Frank Benoit
Dec 25, 2009
Mitja
Dec 25, 2009
Frank Benoit
Dec 26, 2009
Mitja
Dec 26, 2009
Frank Benoit
December 25, 2009
When I paste an illegal character (it looks like a rectangle) into a Text widget,
the application exits with a segmentation fault. I cannot catch the bad character with a VerifyListener, because by then it is already too late. Is there any other way to detect it?

Platform is Debian 5.0.3 (lenny), DMD v1.033.
December 25, 2009
On 25.12.2009 13:22, Mitja wrote:
> When I paste an illegal character (it looks like a rectangle) into a Text widget,
> the application exits with a segmentation fault. I cannot catch the bad character with a VerifyListener, because by then it is already too late. Is there any other way to detect it?
> 
> Platform is Debian 5.0.3 (lenny), DMD v1.033.

Can you give a stack trace, the exact character you pasted, and perhaps a small compilable code example that reproduces the bug?
December 25, 2009
strace output:
select(4, [3], [3], NULL, NULL)         = 1 (out [3])
writev(3, [{"\24\0\6\0005\2@\5\343\2\0\0\0\0\0\0\0\0\0\0\377\377\377\37"..., 24}], 1) = 24
select(4, [3], [], NULL, NULL)          = 1 (in [3])
read(3, "\1\10S6\4\0\0\0006\1\0\0\0\0\0\0\17\0\0\0VJ\25\10\374\325 \10tc\346\277\203"..., 4096) = 48
read(3, 0x840a2fc, 4096)                = -1 EAGAIN (Resource temporarily unavailable)
select(4, [3], [3], NULL, NULL)         = 1 (out [3])
writev(3, [{"+\0\1\0"..., 4}], 1)       = 4
select(4, [3], [], NULL, NULL)          = 1 (in [3])
read(3, "\1\2T6\0\0\0\0#\0@\5\374\325 \10tc\346\277x+\177\td\"\36\10Y3\t\10"..., 4096) = 32
read(3, 0x840a2fc, 4096)                = -1 EAGAIN (Resource temporarily unavailable)
select(4, [3], [3], NULL, NULL)         = 1 (out [3])
writev(3, [{"\23\0\3\0005\2@\5\343\2\0\0+\0\1\0"..., 16}], 1) = 16
select(4, [3], [], NULL, NULL)          = 1 (in [3])
read(3, "\34\"U65\2@\5\343\2\0\0\0\252Fc\1\2\0\0\4\0\0\0\1\0\0\0@B\233\t\1"..., 4096) = 64
read(3, 0x840a2fc, 4096)                = -1 EAGAIN (Resource temporarily unavailable)
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
+++ killed by SIGSEGV +++

Example code:
module text;

import dwt.DWT;
import dwt.custom.StyleRange;
import dwt.widgets.Text;
import dwt.layout.FillLayout;
import dwt.widgets.Display;
import dwt.widgets.Shell;

void main() {
    Display display = new Display();
    Shell shell = new Shell(display);
    shell.setLayout(new FillLayout());
    Text text = new Text(shell, DWT.BORDER);
    shell.pack();
    shell.open();
    while (!shell.isDisposed()) {
        if (!display.readAndDispatch())
            display.sleep();
    }
    display.dispose();
}

I cannot reproduce the character here.
The way I got it was by pasting a block of Japanese or Chinese characters into the text widget, then selecting a few characters from that block with the mouse and copying them.
As a result of the copy, illegal character(s) would appear in Klipper (the KDE clipboard), usually at the front or at the end of the copied selection.
The segmentation fault would occur when I pasted back that very same selection, now with illegal characters prepended or appended.



December 25, 2009
On 25.12.2009 23:26, Mitja wrote:
> strace output:
> select(4, [3], [3], NULL, NULL)         = 1 (out [3])
> writev(3, [{"\24\0\6\0005\2@\5\343\2\0\0\0\0\0\0\0\0\0\0\377\377\377\37"..., 24}], 1) = 24
> select(4, [3], [], NULL, NULL)          = 1 (in [3])
> read(3, "\1\10S6\4\0\0\0006\1\0\0\0\0\0\0\17\0\0\0VJ\25\10\374\325 \10tc\346\277\203"..., 4096) = 48
> read(3, 0x840a2fc, 4096)                = -1 EAGAIN (Resource temporarily unavailable)
> select(4, [3], [3], NULL, NULL)         = 1 (out [3])
> writev(3, [{"+\0\1\0"..., 4}], 1)       = 4
> select(4, [3], [], NULL, NULL)          = 1 (in [3])
> read(3, "\1\2T6\0\0\0\0#\0@\5\374\325 \10tc\346\277x+\177\td\"\36\10Y3\t\10"..., 4096) = 32
> read(3, 0x840a2fc, 4096)                = -1 EAGAIN (Resource temporarily unavailable)
> select(4, [3], [3], NULL, NULL)         = 1 (out [3])
> writev(3, [{"\23\0\3\0005\2@\5\343\2\0\0+\0\1\0"..., 16}], 1) = 16
> select(4, [3], [], NULL, NULL)          = 1 (in [3])
> read(3, "\34\"U65\2@\5\343\2\0\0\0\252Fc\1\2\0\0\4\0\0\0\1\0\0\0@B\233\t\1"..., 4096) = 64
> read(3, 0x840a2fc, 4096)                = -1 EAGAIN (Resource temporarily unavailable)
> --- SIGSEGV (Segmentation fault) @ 0 (0) ---
> +++ killed by SIGSEGV +++
> 

Oh, I did not mean strace (which traces the OS calls a process makes); I meant a stack trace. You can perhaps get one with the current Tango (I think), or by running the program in GDB and, after the segfault, running the "backtrace" command. It shows the line of code where the segfault happens and the lines it was called from.


> Example code:
> module text;
> 
> import dwt.DWT;
> import dwt.custom.StyleRange;
> import dwt.widgets.Text;
> import dwt.layout.FillLayout;
> import dwt.widgets.Display;
> import dwt.widgets.Shell;
> 
> void main() {
>     Display display = new Display();
>     Shell shell = new Shell(display);
>     shell.setLayout(new FillLayout());
>     Text text = new Text(shell, DWT.BORDER);
>     shell.pack();
>     shell.open();
>     while (!shell.isDisposed()) {
>       if (!display.readAndDispatch())
>         display.sleep();
>     }
>     display.dispose();
> }
> 

thanks.


> I cannot reproduce the character here.
> The way I got it was by pasting a block of Japanese or Chinese characters into the text widget, then selecting a few characters from that block with the mouse and copying them.
> As a result of the copy, illegal character(s) would appear in Klipper (the KDE clipboard), usually at the front or at the end of the copied selection.
> The segmentation fault would occur when I pasted back that very same selection, now with illegal characters prepended or appended.
> 

Hm, perhaps you can paste the same text into an editor, save it, and open the file in a hex viewer (hexdump) to make the byte values visible.
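
If no hex viewer is at hand, the same byte view can be produced from D itself. The following is only a minimal sketch, assuming a current D2 compiler with Phobos rather than the D1 toolchain used in this thread. The function name `hexDump` and the sample bytes are made up for illustration and are not part of DWT.

```d
import std.array : appender;
import std.format : formattedWrite;
import std.stdio : writeln;

/// Formats `data` as rows of up to 16 hex bytes, each row prefixed
/// with the offset of its first byte. (Hypothetical helper.)
string hexDump(const(ubyte)[] data)
{
    auto sink = appender!string();
    foreach (offset; 0 .. data.length)
    {
        if (offset % 16 == 0)
        {
            if (offset != 0)
                sink.put('\n');                     // start a new row
            sink.formattedWrite("%08x:", offset);   // row offset
        }
        sink.formattedWrite(" %02x", data[offset]); // one byte
    }
    return sink.data;
}

void main()
{
    // Illustrative bytes only: two stray bytes, a space, and one
    // three-byte UTF-8 sequence.
    immutable ubyte[] sample = [0xbe, 0xb4, 0x20, 0xe3, 0x83, 0x88];
    writeln(hexDump(sample));
}
```

For a real check, the bytes could come from a saved file (for example via `std.file.read`) instead of a literal array.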

December 26, 2009
gdb backtrace:
(gdb) run
Starting program:
[Thread debugging using libthread_db enabled]
[New Thread 0xb74fd6b0 (LWP 5486)]

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0xb74fd6b0 (LWP 5486)]
0x080a1458 in _D3dwt8graphics6Device6Device11logFunctionUPaiPaPvZv ()
Current language:  auto; currently asm
(gdb) backtrace
#0  0x080a1458 in _D3dwt8graphics6Device6Device11logFunctionUPaiPaPvZv ()
#1  0xb7ebcab7 in IA__g_logv (log_domain=0xb7c3c4ab "Gdk", log_level=G_LOG_LEVEL_WARNING,
    format=0xb7c5b58c "Error converting selection from UTF8_STRING", args1=0xbf971288 "��ŷ P\026\n\005\202ŷ�\022\227�") at gmessages.c:474
#2  0xb7ebccb9 in IA__g_log (log_domain=0xb7c3c4ab "Gdk", log_level=G_LOG_LEVEL_WARNING, format=0xb7c5b58c "Error converting selection from UTF8_STRING")
    at gmessages.c:517
#3  0xb7c2ea7e in make_list (text=<value optimized out>, length=<value optimized out>, latin1=0, list=0xbf971364) at gdkselection-x11.c:523
#4  0xb7c2eef6 in IA__gdk_text_property_to_utf8_list_for_display (display=0xa165020, encoding=0x4a, format=8,
    text=0xa2281f8 "\203\236�\203��\203\206�\202��", length=12, list=0xbf971364) at gdkselection-x11.c:597
#5  0xb7a22df4 in IA__gtk_selection_data_get_text (selection_data=0xbf971870) at gtkselection.c:1431
#6  0xb7af7335 in request_text_received_func (clipboard=0xa1ee4f0, selection_data=0xbf971870, data=0xa2271a0) at gtkclipboard.c:911
#7  0xb7af636b in selection_received (widget=0xa1e02e8, selection_data=0xbf971870, time=1699038784) at gtkclipboard.c:847
#8  0xb79bdc1f in _gtk_marshal_VOID__BOXED_UINT (closure=0xa224318, return_value=0x0, n_param_values=3, param_values=0xbf971618,
    invocation_hint=0xbf971528, marshal_data=0xb7af6310) at gtkmarshalers.c:1584
#9  0xb7e451db in IA__g_closure_invoke (closure=0xa224318, return_value=0x0, n_param_values=3, param_values=0xbf971618, invocation_hint=0xbf971528)
    at gclosure.c:490
#10 0xb7e55fc3 in signal_emit_unlocked_R (node=0xa176170, detail=0, instance=0xa1e02e8, emission_return=0x0, instance_and_params=0xbf971618)
    at gsignal.c:2440
#11 0xb7e574b9 in IA__g_signal_emit_valist (instance=0xa1e02e8, signal_id=51, detail=0, var_args=0xbf97186c "\200]\"\nE") at gsignal.c:2199
#12 0xb7e59bae in IA__g_signal_emit_by_name (instance=0xa1e02e8, detailed_signal=0xb7bae9df "selection_received") at gsignal.c:2267
#13 0xb7a1ff72 in gtk_selection_retrieval_report (info=0xa22cc80, type=<value optimized out>, format=<value optimized out>,
    buffer=0xa2281f8 "\203\236�\203��\203\206�\202��", length=12, time=1699038784) at gtkselection.c:2772
#14 0xb7a20315 in _gtk_selection_notify (widget=0xa1e02e8, event=0xa192f80) at gtkselection.c:2578
#15 0xb79bfb10 in _gtk_marshal_BOOLEAN__BOXED (closure=0xa177080, return_value=0xbf971a5c, n_param_values=2, param_values=0xbf971b38,
    invocation_hint=0xbf971a48, marshal_data=0xb7a20220) at gtkmarshalers.c:84
#16 0xb7e43789 in g_type_class_meta_marshal (closure=0xa177080, return_value=0xbf971a5c, n_param_values=2, param_values=0xbf971b38,
    invocation_hint=0xbf971a48, marshal_data=0xfc) at gclosure.c:567
#17 0xb7e451db in IA__g_closure_invoke (closure=0xa177080, return_value=0xbf971a5c, n_param_values=2, param_values=0xbf971b38, invocation_hint=0xbf971a48)
    at gclosure.c:490
#18 0xb7e565ff in signal_emit_unlocked_R (node=0xa1770b8, detail=0, instance=0xa1e02e8, emission_return=0xbf971cf8, instance_and_params=0xbf971b38)
    at gsignal.c:2478
#19 0xb7e57298 in IA__g_signal_emit_valist (instance=0xa1e02e8, signal_id=50, detail=0,
    var_args=0xbf971d7c "\224\035\227�\200/\031\n�\002\036\no\022���\002\036\nH)\027\n") at gsignal.c:2209
#20 0xb7e57669 in IA__g_signal_emit (instance=0xa1e02e8, signal_id=50, detail=0) at gsignal.c:2243
#21 0xb7adc594 in gtk_widget_event_internal (widget=0xa1e02e8, event=0xa192f80) at gtkwidget.c:4678
#22 0xb79b9e1b in IA__gtk_main_do_event (event=0xa192f80) at gtkmain.c:1534
#23 0x08094b2b in _D3dwt8internal3gtk2OS2OS101__T17ForwardGtkOsCFuncS75_D3dwt8internal1c3gtk17gtk_main_do_eventPUPS3dwt8internal1c3gdk9_GdkEventZvZ17gtk_main_do_eventFPS3dwt8internal1c3gdk9_GdkEventZv ()
#24 0x080e1948 in _D3dwt7widgets7Display7Display13eventProcMethMFPS3dwt8internal1c3gdk9_GdkEventZv ()
#25 0x080e186d in _D3dwt7widgets7Display7Display13eventProcFuncUPS3dwt8internal1c3gdk9_GdkEventPvZv ()
#26 0xb7c2240a in gdk_event_dispatch (source=0xa171908, callback=0, user_data=0x0) at gdkevents-x11.c:2351
#27 0xb7eb382b in IA__g_main_context_dispatch (context=0xa171950) at gmain.c:2012
#28 0xb7eb6d46 in g_main_context_iterate (context=0xa171950, block=0, dispatch=1, self=0xa14b8e0) at gmain.c:2645
#29 0xb7eb72c7 in IA__g_main_context_iteration (context=0xa171950, may_block=0) at gmain.c:2708
#30 0x0808cc25 in _D3dwt8internal3gtk2OS2OS50__T17ForwardGtkOsCFuncS24g_main_context_iterationZ24g_main_context_iterationFPviZi ()
#31 0x080e5f88 in _D3dwt7widgets7Display7Display15readAndDispatchMFZb ()
#32 0x08150186 in _D3gui3GUI4drawMFZv ()
#33 0x08054e80 in _Dmain ()
#34 0x0815faa4 in _D6dmain24mainUiPPaZi7runMainMFZv ()
#35 0x0815f879 in _D6dmain24mainUiPPaZi7tryExecMFDFZvZv ()
#36 0x0815fae1 in _D6dmain24mainUiPPaZi6runAllMFZv ()
---Type <return> to continue, or q <return> to quit--- 

After more fiddling around with this, I think only copying from a StyledText widget
to the clipboard appends illegal characters, not copying from the standard Text widget.

There appear to be at least 7 different questionable characters.

Emacs' representation of illegal characters:
\276\264 トモデル \343
\201\250セマンティ
\203\236ンティ\343

Hex dump:
00000000: beb4 20e3 8388 e383 a2e3 8387 e383 ab20  .. ............
00000010: e30a 81a8 e382 bbe3 839e e383 b3e3 8386  ................
00000020: e382 a30a 839e e383 b3e3 8386 e382 a3e3  ................

Actual text:
�� トモデル �
��セマンティ
��ンティ�



December 26, 2009
Sorry, I thought I could take the time for this,
but I can't, so I am not able to give you support and fix the problem.

Is someone else willing to fix this?