String Parsing with \" in a ".." text line
March 21, 2005
I started to code my parser, a *lot* easier with D, commands like std.string.split work miracles.

But I am still wondering how to optimize parsing, in this case of a configuration file:

<code>
// comments
[General]
game		"Quake III Arena"
gameInfo	"Retail, Rocket Arena III, Q3: Team Arena"
gameOpt		"-q3a"			// *** comment
gameMode	"16"
// comments
</code>

I do a

   std.string.find(line, "game")

to find out if the line contains my key-variable. And then a

  char[][] splitLine = std.string.split(line, "\"");

accessing the value of the var of interest via

 splitLine[1]

Now that is fine and dandy. But when I want to allow the user to use double quotes (") in the config file, this will turn ugly, since the above split does not distinguish between " and \".

Any ideas how to elegantly read the var/value pairs should the value contain a \"?

(In C I did some very evil manual hacking to make that work).

Thanx.

AEon
March 21, 2005
On Mon, 21 Mar 2005 01:27:31 +0000 (UTC), AEon <AEon_member@pathlink.com> wrote:
> Any ideas how to elegantly read the var/value pairs should the value contain a
> \"?
>
> (In C I did some very evil manual hacking to make that work).

I think you have to write your own version of split, one that allows "escaped" characters. Once written I'd recommend it for inclusion into std.string.

Regan
March 21, 2005
Regan Heath says...

>> Any ideas how to elegantly read the var/value pairs should the value
>> contain a
>> \"?
>>
>> (In C I did some very evil manual hacking to make that work).
>
>I think you have to write your own version of split, one that allows "escaped" characters. Once written I'd recommend it for inclusion into std.string.

:)... will take a while to get a useful version written, since I am still learning about all the goodies in std.string.


Basically what could be useful would be a

char[][] splitx(char[] stringToSplit, char[] delimiter, char[] nonDelimiters)

of sorts:

splitx( line, "\"", "\\\"");

A simpler solution would be to use another delimiter in my config files. But that would leave the problem, that any delimiter could also be needed in the text.

If I find anything useful, will post the code.
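
A minimal sketch of the escape-aware split discussed above, offered as an illustration only. It is written for a current D compiler (the `string` type rather than the 2005-era `char[]`), and `splitEscaped` is a hypothetical name, not a Phobos function. Instead of taking a list of "non-delimiters" as in the `splitx` idea, it assumes a single escape character (backslash by default) and drops that backslash from the output.

```d
import std.stdio : writeln;

/// Hypothetical helper (not part of Phobos): splits `s` at every `delim`
/// that is not preceded by the escape character `esc`.
/// An escaped character is copied literally and the escape is dropped.
string[] splitEscaped(string s, char delim = '"', char esc = '\\')
{
    string[] parts;
    string current;
    for (size_t i = 0; i < s.length; ++i)
    {
        char c = s[i];
        if (c == esc && i + 1 < s.length)
        {
            // Keep the character after the escape, skip the escape itself.
            current ~= s[i + 1];
            ++i;
        }
        else if (c == delim)
        {
            // Unescaped delimiter: close the current field.
            parts ~= current;
            current = null;
        }
        else
        {
            current ~= c;
        }
    }
    parts ~= current; // the last field, possibly empty
    return parts;
}

void main()
{
    auto fields = splitEscaped(`gameInfo  "Retail \"beta\" build"  // comment`);
    // As with split(line, "\""), index 1 holds the quoted value.
    writeln(fields[1]);
}
```

Note that a trailing comment containing a double quote would still produce extra fields, so this only addresses the \" case.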

AEon
March 21, 2005
AEon wrote:
> I started to code my parser, a *lot* easier with D, commands like
> std.string.split work miracles.
> 
> But I am still wondering how to optimize parsing, in this case of a
> configuration file:
> 
> <code>
> // comments
> [General]
> game		"Quake III Arena"
> gameInfo	"Retail, Rocket Arena III, Q3: Team Arena"
> gameOpt		"-q3a"			// *** comment
> gameMode	"16"
> // comments
> </code>

Is this a third-party file format?  If not, why not define a format that's that little bit easier to parse?  I'd be inclined to go for something resembling Windows .ini files.  But if you still want to do it this way....

> I do a 
> 
>    std.string.find(line, "game")
> 
> to find out if the line contains my key-variable.

Which won't work if "game" is somewhere in the value, not in the key. How about checking whether the line _begins_ with "game"?

> And then a
> 
>   char[][] splitLine = std.string.split(line, "\"");
> 
> accessing the value of the var of interest via
> 
>  splitLine[1]
> 
> Now that is fine and dandy. But when I want to allow the user to use double
> quotes (") in the config file, this will turn ugly, since the above split does
> not distinguish between " and \".
<snip>

By using split for this you're making life difficult for yourself.  How about just picking out the first and last quotes, using find and findr?

Stewart.

-- 
My e-mail is valid but not my primary mailbox.  Please keep replies on the 'group where everyone may benefit.
March 21, 2005
AEon wrote:
> I started to code my parser, a *lot* easier with D, commands like
> std.string.split work miracles.
> 
> But I am still wondering how to optimize parsing, in this case of a
> configuration file:
> 
> <code>
> // comments
> [General]
> game		"Quake III Arena"
> gameInfo	"Retail, Rocket Arena III, Q3: Team Arena"
> gameOpt		"-q3a"			// *** comment
> gameMode	"16"
> // comments
> </code>
> 
> I do a 
> 
>    std.string.find(line, "game")
> 
> to find out if the line contains my key-variable. And then a
> 
>   char[][] splitLine = std.string.split(line, "\"");
> 
> accessing the value of the var of interest via
> 
>  splitLine[1]
> 
> Now that is fine and dandy. But when I want to allow the user to use double
> quotes (") in the config file, this will turn ugly, since the above split does
> not distinguish between " and \".
> 
> Any ideas how to elegantly read the var/value pairs should the value contain a
> \"?
> 
> (In C I did some very evil manual hacking to make that work).
> 
> Thanx.
> 
> AEon


Why not just use an existing scripting language for your configuration files?

I would recommend Small (http://www.compuphase.com/small.htm) or
Lua (http://www.lua.org/).

This scripting language would be useful within your game as well.

-David
March 21, 2005
David Medlock wrote:
<snip>
> Why not just use an existing scripting language for your configuration files?
> 
> I would recommend Small (http://www.compuphase.com/small.htm) or
> Lua (http://www.lua.org/).

Around two years ago I invented a configuration language called Configur8.  It's basically a slightly more powerful version of Windows INI files (with one or two syntactical differences).  It's no match for a scripting language, but is perfect for stuff like the above appears to be.

I haven't yet created a D interface, but I plan to do it at some point.

Stewart.

-- 
My e-mail is valid but not my primary mailbox.  Please keep replies on the 'group where everyone may benefit.
March 21, 2005
Stewart Gordon...

good points

>> <code>
>> // comments
>> [General]
>> game		"Quake III Arena"
>> gameInfo	"Retail, Rocket Arena III, Q3: Team Arena"
>> gameOpt		"-q3a"			// *** comment
>> gameMode	"16"
>> // comments
>> </code>
>
>Is this a third-party file format?  If not, why not define a format that's that little bit easier to parse?  I'd be inclined to go for something resembling Windows .ini files.  But if you still want to do it this way....

True... it is totally up to me to define the format. I felt that was the easiest way to format the cfg file, and it is easy to read.

>> I do a
>> 
>>    std.string.find(line, "game")
>> 
>> to find out if the line contains my key-variable.
>
>Which won't work if "game" is somewhere in the value, not in the key. How about checking whether the line _begins_ with "game"?

I had thought of that, but then forgot to check for it. Sigh :)


>> And then a
>> 
>>   char[][] splitLine = std.string.split(line, "\"");
>> 
>> accessing the value of the var of interest via
>> 
>>  splitLine[1]
>> 
>> Now that is fine and dandy. But when I want to allow the user to use double quotes (") in the config file, this will turn ugly, since the above split does not distinguish between " and \".
><snip>
>
>By using split for this you're making life difficult for yourself.  How about just picking out the first and last quotes, using find and findr?

Well, as long as there is no \" in the line, split will do the job much quicker. Just checked: you are talking about regular expressions. I still need to learn about those.

AEon
March 21, 2005
David Medlock says...

>Why not just use an existing scripting language for your configuration files?
>
>I would recommend Small (http://www.compuphase.com/small.htm) or
>Lua (http://www.lua.org/).
>
>This scripting language would be useful within your game as well.

Is that not a tad overkill... I only want to define a few variables and log file obituaries, that need to be as readable as possible.

AEon
March 21, 2005
AEon wrote:
<snip>
>> By using split for this you're making life difficult for yourself.  How about just picking out the first and last quotes, using find and findr?
> 
> Well, as long as there is no \" in the line, split will do the job much quicker. Just checked: you are talking about regular expressions. I still need to learn about those.

I actually meant the find and rfind (oops, where did findr come from?) in std.string, not std.regexp.
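
A rough sketch of the first-quote/last-quote approach, combined with the earlier suggestion to check that the line begins with the key. This assumes a current D compiler, where the `find` and `rfind` of the 2005 std.string are spelled `indexOf` and `lastIndexOf`. The helper name `quotedValue` is made up for illustration.

```d
import std.stdio : writeln;
import std.string : indexOf, lastIndexOf, stripLeft;
import std.algorithm.searching : startsWith;

/// Hypothetical helper: returns true and fills `value` when `line`
/// begins with `key` followed by whitespace and contains a quoted value.
bool quotedValue(string line, string key, out string value)
{
    line = stripLeft(line);
    if (!line.startsWith(key))
        return false;
    // Reject longer keys such as "gameInfo" when looking for "game".
    if (line.length == key.length
        || (line[key.length] != ' ' && line[key.length] != '\t'))
        return false;

    auto first = indexOf(line, '"');     // -1 if there is no quote
    auto last = lastIndexOf(line, '"');
    if (first < 0 || last <= first)
        return false;

    // Everything between the outermost quotes, escapes left untouched.
    value = line[first + 1 .. last];
    return true;
}

void main()
{
    string value;
    if (quotedValue(`gameInfo  "Retail \"beta\" build"  // *** comment`,
                    "gameInfo", value))
        writeln(value); // the \" pairs are still escaped at this point
}
```

Because only the outermost quotes matter, an embedded \" needs no special handling while slicing. The remaining weak spot is a trailing comment that itself contains a double quote.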

Stewart.

-- 
My e-mail is valid but not my primary mailbox.  Please keep replies on the 'group where everyone may benefit.
March 21, 2005
On Mon, 21 Mar 2005 16:44:36 +0000 (UTC), AEon <AEon_member@pathlink.com> wrote:
> David Medlock says...
>
>> Why not just use an existing scripting language for your configuration
>> files?
>>
>> I would recommend Small (http://www.compuphase.com/small.htm) or
>> Lua (http://www.lua.org/).
>>
>> This scripting language would be useful within your game as well.
>
> Is that not a tad overkill... I only want to define a few variables and log file
> obituaries, that need to be as readable as possible.

The simplest possible format...

If you assume your values cannot contain \r\n and your labels/settings cannot contain spaces then you can simply use the following format:

label<space>value<\r\n>

and parse it by calling "find" on each line, looking for a space, and assuming the rest of the line (minus the \r\n) is the value.

If you decide later on that you need \r\n in your values, you can encode them as the four literal characters \, r, \, n, e.g.

label<space>regan\r\nwas\r\nhere<\r\n>

In general, the fewer special characters you define, the fewer special cases you have to handle in values. Further, if you can pick special characters that you will never want to use in values, you don't have to handle any special cases at all.
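
For illustration, the label<space>value format above could be parsed roughly like this. This is a sketch for a current D compiler; `Setting` and `parseSetting` are hypothetical names, and the decoding step assumes the \, r, \, n encoding described above. It has no way to express a literal backslash followed by r or n.

```d
import std.stdio : writeln;
import std.string : indexOf, chomp;
import std.array : replace;

/// Hypothetical record for one parsed line.
struct Setting
{
    string label;
    string value;
}

/// Hypothetical helper: splits `line` at the first space into label and
/// value, then decodes the two-character sequences \r and \n in the value.
bool parseSetting(string line, out Setting s)
{
    line = chomp(line);            // drop the trailing line terminator
    auto sp = indexOf(line, ' ');
    if (sp <= 0)
        return false;              // no space, or an empty label

    s.label = line[0 .. sp];
    s.value = line[sp + 1 .. $]
        .replace(`\r`, "\r")       // backslash + r  ->  carriage return
        .replace(`\n`, "\n");      // backslash + n  ->  line feed
    return true;
}

void main()
{
    Setting s;
    if (parseSetting("label regan\\r\\nwas\\r\\nhere\r\n", s))
        writeln(s.label, " has a value of ", s.value.length, " characters");
}
```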

Regan
