Using regular expressions when reading a file
May 05, 2022

I want to use a configuration file with external settings. I'm trying to use regular expressions to read the Property = Value settings. I would like to do it all more beautifully. Is there any way to get rid of the line break character? How much does everything look "right"?

settings.conf:

host = 127.0.0.1
port = 5432
dbname = database
user = postgres

code:

auto file = File("settings.conf", "r");
string[string] properties;
auto p_property = regex(r"^\w+ *= *.+", "s");
while (!file.eof())
{
  string line = file.readln();
  auto m = matchAll(line, p_property);
  if (!m.empty())
  {
    string property = matchAll(line, regex(r"^\w+", "m")).hit;
    string value = replaceAll(line, regex(r"^\w+ *= *", "m"), "");
    properties[property] = value;
  }
}
file.close();
writeln(properties);

output:

["host":"127.0.0.1\n", "dbname":"mydb\n", "user":"postgres", "port":"5432\n"]
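
The trailing `\n` in the output comes from `readln`, which returns the line with its terminator still attached. A minimal sketch of one way to drop it before matching, independent of the regex; the sample line is made up for illustration:

```d
import std.stdio : writeln;
import std.string : chomp;

void main()
{
    // A line as readln() would return it, terminator included.
    string line = "port = 5432\n";

    // chomp removes one trailing line terminator, if there is one.
    string trimmed = line.chomp;

    writeln(trimmed);
}
```

Iterating with `File.byLine` or `File.byLineCopy` instead of `readln` has a similar effect, since both drop the terminator by default.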
May 05, 2022
On Thu, May 05, 2022 at 05:53:57PM +0000, Alexander Zhirov via Digitalmars-d-learn wrote:
> I want to use a configuration file with external settings. I'm trying to use regular expressions to read the `Property = Value` settings. I would like to do it all more beautifully. Is there any way to get rid of the line break character? How much does everything look "right"?
[...]
> ```d
> auto file = File("settings.conf", "r");
> string[string] properties;
> auto p_property = regex(r"^\w+ *= *.+", "s");
> while (!file.eof())
> {
>   string line = file.readln();
>   auto m = matchAll(line, p_property);
>   if (!m.empty())
>   {
>     string property = matchAll(line, regex(r"^\w+", "m")).hit;
>     string value = replaceAll(line, regex(r"^\w+ *= *", "m"), "");
>     properties[property] = value;
>   }
> }
> ```

Your regex already matches the `Property = Value` pattern; why not just use captures to extract the relevant parts of the match, instead of doing it all over again inside the if-statement?

	// I added captures (parentheses) to extract the property name
	// and value directly from the pattern.
	auto p_property = regex(r"^(\w+) *= *(.+)", "s");

	// I assume you only want one `Property = Value` pair per input
	// line, so you really don't need matchAll; matchFirst will do
	// the job.
	auto m = matchFirst(line, p_property);

	if (m) {
		// No need to run a match again, just extract the
		// captures
		string property = m[1];
		string value = m[2];
		properties[property] = value;
	}
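
For reference, the capture-based approach can be put together as a small self-contained sketch. The helper name `parseProperties` and the sample lines are assumptions for illustration, and the lines are assumed to arrive without their terminators:

```d
import std.regex : matchFirst, regex;
import std.stdio : writeln;

// Hypothetical helper: collect "Property = Value" lines into an associative array.
string[string] parseProperties(string[] lines)
{
    // Two capture groups: the property name and the value.
    auto pattern = regex(r"^(\w+) *= *(.+)");
    string[string] properties;

    foreach (line; lines)
    {
        auto m = matchFirst(line, pattern);
        if (!m.empty)
            properties[m[1]] = m[2]; // m[1] is the name, m[2] the value
    }
    return properties;
}

void main()
{
    auto properties = parseProperties(["host = 127.0.0.1", "port = 5432", "# comment"]);
    writeln(properties["host"]);
    writeln(properties["port"]);
}
```

Lines that do not match, such as the comment line here, are simply skipped.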


T

-- 
"You are a very disagreeable person." "NO."
May 05, 2022

On Thursday, 5 May 2022 at 18:15:28 UTC, H. S. Teoh wrote:
> 	auto m = matchFirst(line, p_property);

Yes, it looks more attractive. Thanks! I just don't quite understand how `matchFirst` works. I seem to have read the [description](https://dlang.org/phobos/std_regex.html#Captures), but I can't understand something.

And yet I have to manually remove the line break:

["host":"192.168.100.236\n", "dbname":"belpig\n", "user":"postgres", "port":"5432\n"]
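
As a rough illustration of what `matchFirst` hands back: it returns a `Captures` object for the first (leftmost) match, or an empty one when nothing matches. The input string below is made up so that text before and after the match is visible:

```d
import std.regex : matchFirst, regex;
import std.stdio : writeln;

void main()
{
    auto pattern = regex(r"(\w+) *= *(\w+)");
    auto m = matchFirst("set port = 5432 now", pattern);

    writeln(m.empty);  // false, because a match was found
    writeln(m.hit);    // the whole matched text, same as m[0]
    writeln(m[1]);     // first capture group
    writeln(m[2]);     // second capture group
    writeln(m.pre);    // input before the match
    writeln(m.post);   // input after the match

    // When nothing matches, the result is empty and has no groups to read.
    auto none = matchFirst("no equals sign here", pattern);
    writeln(none.empty);
}
```

`matchAll` differs in that it returns a range of such `Captures`, one per match, which is more than a single `Property = Value` line needs.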
May 05, 2022
On Thu, May 05, 2022 at 06:50:17PM +0000, Alexander Zhirov via Digitalmars-d-learn wrote:
> On Thursday, 5 May 2022 at 18:15:28 UTC, H. S. Teoh wrote:
> > 	auto m = matchFirst(line, p_property);
> 
> Yes, it looks more attractive. Thanks! I just don't quite understand how `matchFirst` works. I seem to have read the [description](https://dlang.org/phobos/std_regex.html#Captures), but I can't understand something.
> 
> And yet I have to manually remove the line break:
> ```sh
> ["host":"192.168.100.236\n", "dbname":"belpig\n", "user":"postgres",
> "port":"5432\n"]
> ```

You don't have to. Just add a `$` to the end of your regex, and it should match the newline. If you put it outside the capture parentheses, it will not be included in the value.
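
A small sketch of the `$` suggestion, under one assumption that is not spelled out above: the pattern is compiled with the `m` flag, so that `$` may match at the end of a line just before `\n`. With the `s` flag still in place, `.+` would consume the newline before `$` gets a chance.

```d
import std.regex : matchFirst, regex;
import std.stdio : writeln;

void main()
{
    // `$` sits outside the capture parentheses, so whatever it anchors
    // against is not part of the captured value.
    auto pattern = regex(r"^(\w+) *= *(.+)$", "m");

    auto m = matchFirst("port = 5432\n", pattern);
    if (!m.empty)
        writeln("[", m[2], "]"); // brackets make a stray newline visible
}
```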


T

-- 
In a world without fences, who needs Windows and Gates? -- Christian Surchi
May 05, 2022

On Thursday, 5 May 2022 at 18:58:41 UTC, H. S. Teoh wrote:
> You don't have to. Just add a `$` to the end of your regex, and it should match the newline. If you put it outside the capture parentheses, it will not be included in the value.

In fact, it turned out to be much easier. It was just necessary to use the `m` flag instead of the `s` flag:

auto p_property = regex(r"^(\w+) *= *(.+)", "m");
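
The likely reason this works: the `s` flag makes `.` match line terminators too, so `.+` ran on through the `\n`. Without `s`, `.` stops at the newline whether or not `m` is given. A sketch that compares the flags; `valueOf` is a hypothetical helper for illustration:

```d
import std.regex : matchFirst, regex;
import std.stdio : writeln;

// Hypothetical helper: the captured value for a line under the given flags.
string valueOf(string line, string flags)
{
    auto m = matchFirst(line, regex(r"^(\w+) *= *(.+)", flags));
    return m.empty ? null : m[2];
}

void main()
{
    string line = "port = 5432\n";

    // With "s" the dot also matches '\n', so the newline ends up in the value.
    writeln(valueOf(line, "s") == "5432\n");

    // With "m", or with no flags at all, the dot stops before '\n'.
    writeln(valueOf(line, "m") == "5432");
    writeln(valueOf(line, "") == "5432");
}
```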
May 05, 2022
On 5/5/22 12:05, Alexander Zhirov wrote:
> On Thursday, 5 May 2022 at 18:58:41 UTC, H. S. Teoh wrote:
>> You don't have to. Just add a `$` to the end of your regex, and it should match the newline. If you put it outside the capture parentheses, it will not be included in the value.
> 
> In fact, it turned out to be much easier. It was just necessary to use the `m` flag instead of the `s` flag:
> 
> ```d
> auto p_property = regex(r"^(\w+) *= *(.+)", "m");
> ```
> 

Couldn't help myself from improving. :) The following regex works in my Linux console. No issues with '\n'. (?) It also allows for leading and trailing spaces:

import std.regex;
import std.stdio;
import std.algorithm;
import std.array;
import std.typecons;
import std.functional;

void main() {
  auto p_property = regex(r"^ *(\w+) *= *(\w+) *$");
  const properties = File("settings.conf")
                     .byLineCopy
                     .map!(line => matchFirst(line, p_property))
                     .filter!(not!empty) // OR: .filter!(m => !m.empty)
                     .map!(m => tuple(m[1], m[2]))
                     .assocArray;

  writeln(properties);
}
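
One caveat with the pattern above: the value group is `(\w+)`, and `.` is not a word character, so a line such as `host = 127.0.0.1` from the sample file would be filtered out. A hedged variant of the same pipeline with a lazy `(.+?)` for the value; `parseConfig` is a hypothetical helper that takes the lines as an array, so the sketch runs without a file:

```d
import std.algorithm : filter, map;
import std.array : assocArray;
import std.regex : matchFirst, regex;
import std.stdio : writeln;
import std.typecons : tuple;

// Hypothetical helper: the same pipeline shape over an array of lines.
string[string] parseConfig(string[] lines)
{
    // (.+?) is lazy, so trailing spaces are left to the final " *".
    auto pattern = regex(r"^ *(\w+) *= *(.+?) *$");
    return lines
        .map!(line => matchFirst(line, pattern))
        .filter!(m => !m.empty)
        .map!(m => tuple(m[1], m[2]))
        .assocArray;
}

void main()
{
    // The original value group rejects a dotted address.
    writeln(matchFirst("host = 127.0.0.1", regex(r"^ *(\w+) *= *(\w+) *$")).empty);

    auto config = parseConfig(["host = 127.0.0.1", "  port = 5432  ", "not a setting"]);
    writeln(config["host"]);
    writeln(config["port"]);
}
```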

Ali
May 05, 2022

On Thursday, 5 May 2022 at 19:19:26 UTC, Ali Çehreli wrote:
> Couldn't help myself from improving. :) The following regex works in my Linux console. No issues with '\n'. (?) It also allows for leading and trailing spaces:
>
> import std.regex;
> import std.stdio;
> import std.algorithm;
> import std.array;
> import std.typecons;
> import std.functional;
>
> void main() {
>   auto p_property = regex(r"^ *(\w+) *= *(\w+) *$");
>   const properties = File("settings.conf")
>                      .byLineCopy
>                      .map!(line => matchFirst(line, p_property))
>                      .filter!(not!empty) // OR: .filter!(m => !m.empty)
>                      .map!(m => tuple(m[1], m[2]))
>                      .assocArray;
>
>   writeln(properties);
> }

It will need to be sorted out with a fresh head. 😀 Thanks!

May 06, 2022
On Thursday, 5 May 2022 at 17:53:57 UTC, Alexander Zhirov wrote:
> I want to use a configuration file with external settings. I'm trying to use regular expressions to read the `Property = Value` settings. I would like to do it all more beautifully. Is there any way to get rid of the line break character? How much does everything look "right"?

regex never looks right ;-)

try something else perhaps??

// ------------

module test;

import std;

void main()
{
    auto file = File("d:\\settings.conf", "r");
    string[string] aa;

    // create an associative array of settings -> [key:value]
    foreach (line; file.byLine().filter!(a => !a.empty))
    {
        auto myTuple = line.split(" = ");
        aa[myTuple[0].to!string] = myTuple[1].to!string;
    }

    // write out all the settings.
    foreach (key, value; aa.byPair)
        writefln("%s:%s", key, value);

    writeln;

    // write just the host value
    writeln(aa["host"]);

}


// ------------
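
The `split(" = ")` call above expects exactly one space on each side of the `=`. If that is too strict, a regex-free sketch that tolerates other spacing could split on the first `=` and trim both halves. `splitSetting` is a hypothetical helper, not part of the code above:

```d
import std.algorithm.searching : findSplit;
import std.stdio : writeln;
import std.string : strip;

// Hypothetical helper: split on the first '=' and trim both sides.
// Returns false when the line has no '=' or no key.
bool splitSetting(string line, out string key, out string value)
{
    auto parts = line.findSplit("=");
    if (parts[1].length == 0) // separator not found
        return false;

    key = parts[0].strip;
    value = parts[2].strip;
    return key.length > 0;
}

void main()
{
    string key, value;
    if (splitSetting("  host   =  127.0.0.1 ", key, value))
        writefln(key, value);
}

// Small wrapper so the example prints "key:value" like the code above.
void writefln(string key, string value)
{
    writeln(key, ":", value);
}
```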

May 06, 2022

On Friday, 6 May 2022 at 05:40:52 UTC, forkit wrote:
> auto myTuple = line.split(" = ");

Well, only if as a strict form :)

May 06, 2022
On Friday, 6 May 2022 at 07:51:01 UTC, Alexander Zhirov wrote:
> On Friday, 6 May 2022 at 05:40:52 UTC, forkit wrote:
>> auto myTuple = line.split(" = ");
>
> Well, only if as a strict form :)

well.. a settings file should be following a strict format.

..otherwise...anything goes... and good luck with that...

regex won't help you either in that case...

e.g:

user =som=eu=ser  (how you going to deal with this ?)
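
How to treat such a line is a policy decision rather than a regex limitation. As a sketch of two possible policies (both are assumptions, not something settled in this thread): a lenient one that takes everything after the first `=` as the value, and a strict one that rejects any value containing another `=`. `parseSetting` is a hypothetical helper:

```d
import std.regex : matchFirst, regex;
import std.stdio : writeln;

// Hypothetical helper: returns [key, value], or null when the line is rejected.
string[] parseSetting(string line, bool strict)
{
    // Lenient: everything after the first '=' is the value.
    // Strict: a value that contains another '=' does not match.
    auto pattern = strict
        ? regex(r"^(\w+) *= *([^=]+)$")
        : regex(r"^(\w+) *= *(.+)$");

    auto m = matchFirst(line, pattern);
    return m.empty ? null : [m[1], m[2]];
}

void main()
{
    string line = "user =som=eu=ser";

    writeln(parseSetting(line, false)); // lenient: key and the rest of the line
    writeln(parseSetting(line, true));  // strict: rejected, so an empty array
}
```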
