GVim and Unicode ( was Re: What's left for 1.0?) (page 8) - D Programming Language Discussion Forum

November 20, 2006

Re: GVim and Unicode ( was Re: What's left for 1.0?)

Posted by Lars Ivar Igesund
in reply to Daniel Keep

Daniel Keep wrote:

> 
> 
> Lars Ivar Igesund wrote:
>> Daniel Keep wrote:
>> 
>>>
>>> BCS wrote:
>>>> ...
>>>> However, I don't know how to put in a BOM.
>>> You can use Notepad to do it.  Yes, the crappy old Notepad that comes with Windows.  When you go File -> Save As, make sure to set the encoding as appropriate.
>>>
>>> I'm still very annoyed that Notepad has better Unicode support than GVim
>> 
>> How so? I've never had any problems getting GVim properly set up for Unicode.
>> 
> 
> It's basically a font problem.  GVim allows you to select exactly two fonts: a "normal" monospace font, and a "wide" font (which is used for things like kanji.)
> 
> The problem is that once you've picked those fonts, GVim will never use anything else.  This is a pain because you end up with heaps of unknown Unicode characters.  For example, none of the weird characters I used in the examples for my article on text in D show up in GVim (except for the hiragana, since I have a Japanese font installed), but they all show up in Notepad, which falls back to other fonts if the one it's using doesn't have that character.  There appear to be options for selecting a set of fonts to use, but they don't work on Windows.
> 
> -- Daniel
> 

Right, that is a use case I've not needed to test :/

-- 
Lars Ivar Igesund
blog at http://larsivi.net
DSource & #D: larsivi

November 20, 2006

Re: What's left for 1.0?

Posted by Jeff
in reply to Bill Baxter

Though I don't really have anything interesting to add, I'll second this, since it's important to me.

November 20, 2006

Re: What's left for 1.0?

Posted by Olli Aalto
in reply to Marcin Kuszczak

Marcin Kuszczak wrote:
> But if Walter is not happy enough with this implementation now maybe there
> should be at least added alias in object.d:
> alias char[] string;
> 

I'm not an expert on these things, but while reading Daniel Keep's excellent article on text in D, I got an idea about the alias declaration. Why not have something like this in either object.d or std.string?

version(UTF8)
{
    alias char[] string;
}
version(UTF16)
{
    alias wchar[] string;
}
version(UTF32)
{
    alias dchar[] string;
}

It would default to UTF8, if not defined on command line. This way everyone could use the version their application requires.

Am I way out of line here? As I said I'm not an expert and don't know if that just creates more problems.

O.

November 20, 2006

Re: What's left for 1.0?

Posted by Daniel Keep
in reply to Olli Aalto

Olli Aalto wrote:
> Marcin Kuszczak wrote:
>> But if Walter is not happy enough with this implementation now maybe
>> there
>> should be at least added alias in object.d:
>> alias char[] string;
>>
> 
> I'm not an expert on these things, but while reading Daniel Keep's excellent article on text in D, I got an idea about the alias declaration. Why not have something like this in either object.d or std.string?
> 
> version(UTF8)
> {
>     alias char[] string;
> }
> version(UTF16)
> {
>     alias wchar[] string;
> }
> version(UTF32)
> {
>     alias dchar[] string;
> }
> 
> It would default to UTF8, if not defined on command line. This way everyone could use the version their application requires.
> 
> Am I way out of line here? As I said I'm not an expert and don't know if that just creates more problems.
> 
> O.

Imagine you compile the standard library with -version=UTF8.  Let's take the following function:

> int find(string s, dchar c) { ... }

This would be compiled as:

> int find(char[] s, dchar c) { ... }

You then write some code to use that:

> ...
> string attr = "key:value";
> ...
> auto pos = find(attr, ':');
> ...

For whatever reason, your program will run optimally using UTF-32.

> dmd -version=UTF32 app.d

But that means that in the standard library, "string" is really "char[]", and in your program it's "dchar[]".  You try to link against the standard library, and the linker barfs (quite correctly) since the function you're using doesn't exist.

It's a nice idea, but with the current object formats, and the way conditional compilation works, I don't think it's actually possible.
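The mismatch described above can be seen without involving a linker: in D the parameter types are part of a function's mangled name, so `find(char[], dchar)` and `find(dchar[], dchar)` end up as different symbols. A minimal sketch, assuming a current D compiler rather than the 2006 one; the `.mangleof` check and the alias names `FindUtf8` and `FindUtf32` are illustrative and do not come from the thread:

```d
// Hypothetical names: the two function types that `find(string, dchar)`
// would have under -version=UTF8 and -version=UTF32 respectively.
alias FindUtf8  = int function(char[],  dchar);
alias FindUtf32 = int function(dchar[], dchar);

// .mangleof gives the compiler's encoding of a type as used in symbol
// names. The encodings differ, so a caller compiled against one
// signature cannot resolve to a library that only defines the other.
static assert(FindUtf8.mangleof != FindUtf32.mangleof);

unittest
{
    // The array types alone already mangle differently.
    assert((char[]).mangleof != (dchar[]).mangleof);
}
```

Compiling with `-unittest -main` should be enough to exercise the checks.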

	-- Daniel

P.S.  See sig.

-- 
Unlike Knuth, I have neither proven or tried the above; it may not even make sense.

v2sw5+8Yhw5ln4+5pr6OFPma8u6+7Lw4Tm6+7l6+7D i28a2Xs3MSr2e4/6+7t4TNSMb6HTOp5en5g6RAHCP  http://hackerkey.com/

November 20, 2006

Re: What's left for 1.0?

Posted by Olli Aalto
in reply to Daniel Keep

Daniel Keep wrote:
> Imagine you compile the standard library with -version=UTF8.  Let's take
> the following function:
> 

How about if the standard library didn't use string? So it would still have 3 versions of find for example?

>> int find(string s, dchar c) { ... }
> 
> This would be compiled as:
> 
>> int find(char[] s, dchar c) { ... }
> 
> You then write some code to use that:
> 
>> ...
>> string attr = "key:value";
>> ...
>> auto pos = find(attr, ':');
>> ...
> 

Something like this:

int find(char[], char c) { ... }
int find(wchar[], wchar c) { ... }
int find(dchar[], dchar c) { ... }

void foo()
{
  string attr = "key:value";
  ...
  auto pos = find(attr, ':');
  ...
}

Wouldn't that link properly?
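For comparison, the three hand-written overloads can be collapsed into one function template. This is only a sketch written for a current D compiler, not something proposed in the thread, and it takes `dchar` for the needle in every case (an assumption that differs from the overloads above). Because a template is instantiated in the module that uses it, the caller does not depend on which character width the library was built with:

```d
// One template instead of three overloads. `C` is deduced as char,
// wchar or dchar from the argument.
int find(C)(const(C)[] s, dchar c)
{
    // A dchar loop variable makes foreach decode code units into whole
    // code points; `i` is the index of the code unit where each code
    // point starts.
    foreach (i, dchar d; s)
    {
        if (d == c)
            return cast(int) i;
    }
    return -1; // not found
}

unittest
{
    assert(find("key:value",  ':') == 3);  // UTF-8
    assert(find("key:value"w, ':') == 3);  // UTF-16
    assert(find("key:value"d, ':') == 3);  // UTF-32
    assert(find("key=value",  ':') == -1); // no match
}
```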

Hmm...

Probably not good enough. This whole idea is based on the assumption that the application writer knows the environment where and how his/her application will be used. If the application was compiled as UTF-8 and it gets a UTF-32 character as input, it would not be very good? Would the coder be required to put all the input through std.utf.toUTF8()?
Or is that something that should be done even now?
Well, you all seem to be smart people so I'm content to wait for whatever you come up with. :)
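As a rough illustration of the "put all the input through std.utf.toUTF8()" idea, the sketch below converts whatever width arrives to UTF-8 once, at the boundary. It assumes a current Phobos; `normalise` is a hypothetical helper, not a library function:

```d
import std.utf : toUTF8, toUTF16;

// Hypothetical helper: accept input of any code-unit width and hand
// back UTF-8, so the rest of the program deals with one representation.
string normalise(C)(const(C)[] input)
{
    return toUTF8(input);
}

unittest
{
    wstring w = "h\u00E9llo"w; // UTF-16 input
    dstring d = "h\u00E9llo"d; // UTF-32 input

    assert(normalise(w) == "h\u00E9llo");
    assert(normalise(d) == "h\u00E9llo");

    // U+00E9 takes two code units in UTF-8, so five code points
    // occupy six chars.
    assert(normalise(d).length == 6);

    // Converting back gives the original UTF-16 text.
    assert(toUTF16(normalise(d)) == w);
}
```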

> For whatever reason, your program will run optimally using UTF-32.
> 
>> dmd -version=UTF32 app.d
> 
> But that means that in the standard library, "string" is really
> "char[]", and in your program it's "dchar[]".  You try to link against
> the standard library, and the linker barfs (quite correctly) since the
> function you're using doesn't exist.
> 
> It's a nice idea, but with the current object formats, and the way
> conditional compilation works, I don't think it's actually possible.
> 

Yes, I got the feeling that it was a little too simple. :)

Personally I think I'll stick mostly to dchar[]s in real applications and char[]s in test programs. Memory is cheap enough these days that it doesn't matter to me right now.

O.

November 21, 2006

Re: What's left for 1.0?

Posted by Walter Bright
in reply to Olli Aalto

Something similar is done in the C world using #define UNICODE. It looks like a good idea, but it's awful. Applications just don't want to be *all* UNICODE or *no* UNICODE. Most want to be mixed.

November 21, 2006

Re: What's left for 1.0?

Posted by Anders F Björklund
in reply to Walter Bright

Walter Bright wrote:

> Something similar is done in the C world using #define UNICODE. It looks like a good idea, but it's awful. Applications just don't want to be *all* UNICODE or *no* UNICODE. Most want to be mixed.

I always thought that in D it was whether to use char[] or wchar_t[] ?

In wxD there will be two versions: version(ANSI) means that it will
use char[] in D and char* in C++ and version(UNICODE) means that it
will use wchar_t[] in D and wchar_t* in C++ - for the wxString class.

At least that's needed for the implementation, unsure about public API.

All the wx methods are using "string" parameters now, which is an alias
for the default char[] type in D. This might change to "dstring" later,
if that struct wrapper has merits to unify code better on the D side...

I really don't want to do two functions for each string-using method ?

--anders

PS. wchar_t is an alias which is wchar in Windows and dchar in Unix.

November 22, 2006

Re: What's left for 1.0? (static data)

Posted by Walter Bright
in reply to Bill Baxter

Bill Baxter wrote:
> So is this whole issue really just a bug with deducing what's const and what's not?

Possibly.

November 24, 2006

Re: What's left for 1.0?

Posted by Bill Baxter
in reply to Kirk McDonald

Kirk McDonald wrote:
> Bill Baxter wrote:
>> BCS wrote:
>>> There is no way to differentiate between function overloads.
>>>
>>>
>>> int foo(){ return 0;}
>>> int foo(int i){ return 0;}
>>>
>>>
>>> int bob()
>>> {
>>>         // foo() or foo(int)?
>>>     auto fn = &foo;
>>>     auto tmp = TemplateTakingFn!(foo);
>>> }
>>
>> That should probably be an "error: ambiguous" if it isn't already, but anyway can't you do   'int function() fn = &foo' to get the one you want?
>>
>> --bb
> 
> I've played with just about every permutation of this problem during the course of writing Pyd.
> 
> int foo() { return 0; }
> int foo(int i) { return 0; }
> 
> void main() {
>     auto fn = &foo; // Uses the lexically first function
>     static assert(is(typeof(fn) == int function()));
>     fn();
>     //fn(12); // Error: expected 0 arguments, not 1
>     int function(int) fn2 = &foo; // Works
>     fn2(12);
> }
> 
> In writing Pyd, I've come to the conclusion that if you have a template that accepts an arbitrary function as an alias parameter (and then does anything involving the type of that function), you should always have a second parameter representing the type of the function. (And you can easily make this second parameter have a default value of typeof(&fn).) In this way the user can be sure the template is getting the proper overload of the function.
> 

Ugh.  I just hit this using my flexible signals and slots wrapper.

   widget.value_changed.fconnect(&other_widget.value);

Doh!
I really don't want writing a gui to involve gobs of code like:

  widget.value_changed.fconnect!(
       typeof(&other_widget.value))(&other_widget.value);

Now I really wish I had a tool to find property-like methods in my source code so I can at least make sure they are in the right lexical ordering. :-(

Has there been any bug/enhancement filed on this that I can keep a watch on?

For my case I'm not even sure what would be the right thing for it to do.  What I really need to happen is for fconnect to prefer methods with non-trivial argument lists, but I can't rule out the possibility someone actually wants to connect up a no-argument slot like "updateGui()" to a signal.

Maybe we need to be able to optionally specify the arguments when taking a delegate, like:

    connect(&obj.value(int));

-bb
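Short of new syntax, the overload can be pinned down by giving the delegate an explicit type, the same way `int function(int) fn2 = &foo;` does for plain functions earlier in the thread. A minimal sketch, assuming a current D compiler; `Widget` here is a stand-in class, not the signals and slots wrapper from the post:

```d
// Stand-in class with a property-like getter/setter pair.
class Widget
{
    private int v;
    int  value()      { return v; } // getter overload
    void value(int x) { v = x; }    // setter overload
}

unittest
{
    auto w = new Widget;

    // The declared delegate type selects the overload, regardless of
    // the lexical order of the two methods.
    void delegate(int) setter = &w.value;
    int delegate()     getter = &w.value;

    setter(42);
    assert(getter() == 42);
}
```

This is still more typing than `&other_widget.value` alone, but it avoids depending on which overload appears first in the source.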

November 28, 2006

Re: What's left for 1.0?

Posted by Stewart Gordon
in reply to Bill Baxter

Bill Baxter wrote:
> BCS wrote:
>> Bill Baxter wrote:
>>> So, what's left on everyone's lists for D1.0 must-have features?
>>>
>>> I glanced over the "Pending Peeves" list, but none of those things seems particularly serious to me.
>>>
>>> What about the iterators?  Mostly that can be a library thing that comes after the 1.0 release, but it would be nice if foreach at least had the smarts built-in to use an iterator once the method names are decided upon.
>>>
>>> --bb
>>
>> There is no way to differentiate between function overloads.
>>
>>
>> int foo(){ return 0;}
>> int foo(int i){ return 0;}
>>
>>
>> int bob()
>> {
>>         // foo() or foo(int)?
>>     auto fn = &foo;
>>     auto tmp = TemplateTakingFn!(foo);
>> }
> 
> That should probably be an "error: ambiguous" if it isn't already, but anyway can't you do   'int function() fn = &foo' to get the one you want?

I can indeed.  And indeed, trying to autotype such a thing should certainly be an error.

http://d.puremagic.com/issues/show_bug.cgi?id=52

Stewart.

-- 
-----BEGIN GEEK CODE BLOCK-----
Version: 3.1
GCS/M d- s:-@ C++@ a->--- UB@ P+ L E@ W++@ N+++ o K-@ w++@ O? M V? PS- PE- Y? PGP- t- 5? X? R b DI? D G e++++ h-- r-- !y
------END GEEK CODE BLOCK------

My e-mail is valid but not my primary mailbox.  Please keep replies on the 'group where everyone may benefit.
