March 04, 2022
On Thu, Mar 03, 2022 at 06:36:35PM -0800, Ali Çehreli via Digitalmars-d-learn wrote:
> On 3/3/22 13:03, H. S. Teoh wrote:
> 
> > 	string s = "blahblah123blehbleh456bluhbluh";
> 
> > 	assert(result == 123456);
> 
> I assumed it would generate separate integers 123 and 456. I started to implement a range with findSkip, findSplit, and friends but failed. :/
[...]

	import std;
	void main() {
		string s = "blahblah123blehbleh456bluhbluh";
		auto result = s.matchAll(regex(`\d+`))
			.each!(m => writeln(m[0]));
	}

Output:
	123
	456

Takes only 3 lines of code. ;-)
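
If the separate integers are wanted as values rather than printed, the same regex can feed a `map`. This is only a sketch: `extractNumbers` is a made-up helper name, not something from Phobos or from the posts above, and it assumes every digit run fits in an `int`.

```d
import std.algorithm : map;
import std.array : array;
import std.conv : to;
import std.regex : matchAll, regex;

// Hypothetical helper: return every run of decimal digits in s as an int.
int[] extractNumbers(string s)
{
    return s.matchAll(regex(`\d+`))
            .map!(m => m.hit.to!int)   // m.hit is the matched slice
            .array;
}

void main()
{
    import std.stdio : writeln;
    writeln(extractNumbers("blahblah123blehbleh456bluhbluh")); // [123, 456]
}
```

`to!int` throws on overflow, so very long digit runs would need a wider type.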


T

-- 
People demand freedom of speech to make up for the freedom of thought which they avoid. -- Soren Aabye Kierkegaard (1813-1855)
March 04, 2022
On Thursday, 3 March 2022 at 23:46:49 UTC, H. S. Teoh wrote:
> ...
> This version doesn't even allocate extra storage for the filtered digits, since no storage is actually needed (each digit is spooled directly to the output).

OK but there is another problem, I tested your version and mine and there is a HUGE difference in speed:

LDC 1.27.1, with -O2:

import std.datetime.stopwatch;
import std.stdio: write, writeln, writef, writefln;
import std;

void printStrTim(string s,StopWatch sw){
    writeln("\nstr: ", s
            ,"\nTim(ms): ", sw.peek.total!"msecs"
            ,"\nTim(us): ", sw.peek.total!"usecs"
           );
}

void main(){
    auto sw = StopWatch(AutoStart.no);
    string s, str = "4A0B1de!2C9~6";
    int j;

    sw.start();
    for(j=0;j<1_000_000;++j){
        s="";
        foreach(i;str){
            (i >= '0' && i <= '9') ? s~=i : null;
        }
    }
    sw.stop();
    printStrTim(s,sw);

    s = "";
    sw.reset();
    sw.start();
    for(j=0;j<1_000_000;++j){
        s="";
    	s = str.filter!(ch => ch.isDigit).to!string;
    }
    sw.stop();
    printStrTim(s,sw);
}

Prints:

str: 401296
Tim(ms): 306
Tim(us): 306653

str: 401296
Tim(ms): 1112
Tim(us): 1112648

-------------------------------

Unless I did something wrong (if so, please tell me). By the way, on DMD it was worse: your version was about 5x slower.

Matheus.
March 04, 2022
On Fri, Mar 04, 2022 at 07:51:44PM +0000, matheus via Digitalmars-d-learn wrote: [...]
>     for(j=0;j<1_000_000;++j){
>         s="";
>     	s = str.filter!(ch => ch.isDigit).to!string;

This line allocates a new string for every single loop iteration.  This is generally not something you want to do in an inner loop. :-)
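
One way to keep the allocation out of the loop, assuming a reusable buffer is acceptable for the use case, is to append into a single `Appender!(char[])` and `clear` it on each pass. This is a sketch only: `digitsInto` is a hypothetical helper name, and the input is assumed to be ASCII so that iterating by `char` is enough.

```d
import std.array : Appender, appender;
import std.ascii : isDigit;

// Hypothetical helper: overwrite buf with the ASCII digits found in s.
void digitsInto(ref Appender!(char[]) buf, const(char)[] s)
{
    buf.clear();              // drop the old contents, keep the storage
    foreach (char c; s)       // iterate code units, no decoding
        if (isDigit(c))
            buf.put(c);
}

void main()
{
    import std.stdio : writeln;

    string str = "4A0B1de!2C9~6";
    auto buf = appender!(char[])();   // one buffer, reused on every iteration
    foreach (j; 0 .. 1_000_000)
        digitsInto(buf, str);
    writeln(buf.data); // 401296
}
```

`clear` is only available when the element type is mutable, which is why the buffer is `char[]` rather than `string`.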


>     }

[...]
> Unless I did something wrong (if so, please tell me). By the way, on DMD it was worse: your version was about 5x slower.
[...]

I don't pay any attention to DMD when I'm doing anything remotely performance-related. Its optimizer is known to be suboptimal. :-P


T

-- 
Study gravitation, it's a field with a lot of potential.
March 04, 2022
On Friday, 4 March 2022 at 19:51:44 UTC, matheus wrote:
> import std.datetime.stopwatch;
> import std.stdio: write, writeln, writef, writefln;
> import std;
>
> void printStrTim(string s,StopWatch sw){
>     writeln("\nstr: ", s
>             ,"\nTim(ms): ", sw.peek.total!"msecs"
>             ,"\nTim(us): ", sw.peek.total!"usecs"
>            );
> }
>
> void main(){
>     auto sw = StopWatch(AutoStart.no);
>     string s, str = "4A0B1de!2C9~6";
>     int j;
>
>     sw.start();
>     for(j=0;j<1_000_000;++j){
>         s="";
>         foreach(i;str){
>             (i >= '0' && i <= '9') ? s~=i : null;
>         }
>     }
>     sw.stop();
>     printStrTim(s,sw);
>
>     s = "";
>     sw.reset();
>     sw.start();
>     for(j=0;j<1_000_000;++j){
>         s="";
>     	s = str.filter!(ch => ch.isDigit).to!string;
>     }
>     sw.stop();
>     printStrTim(s,sw);
> }
>
> Prints:
>
> str: 401296
> Tim(ms): 306
> Tim(us): 306653
>
> str: 401296
> Tim(ms): 1112
> Tim(us): 1112648
>
> -------------------------------
>
> Unless I did something wrong (if so, please tell me).

The second version involves auto-decoding, which isn't actually needed. You can work around it with `str.byCodeUnit.filter!...`. On my machine, times become the same then.

Typical output:

str: 401296
Tim(ms): 138
Tim(us): 138505

str: 401296
Tim(ms): 137
Tim(us): 137376
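
Spelled out in full, the `byCodeUnit` variant might look like the following. It is a sketch: `digitsOnly` is a made-up name for illustration, and `isDigit` is taken from `std.ascii`, so only ASCII digits are matched.

```d
import std.algorithm : filter;
import std.ascii : isDigit;
import std.conv : to;
import std.utf : byCodeUnit;

// Hypothetical helper: keep only the ASCII digits, without auto-decoding.
string digitsOnly(string str)
{
    // byCodeUnit presents the string as a range of char instead of dchar,
    // so filter never has to decode UTF-8 sequences.
    return str.byCodeUnit.filter!(ch => ch.isDigit).to!string;
}

void main()
{
    import std.stdio : writeln;
    writeln(digitsOnly("4A0B1de!2C9~6")); // 401296
}
```

This is safe here because ASCII digits are always single code units in UTF-8; a predicate that needs to see whole code points would still require decoding.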

March 04, 2022
On Fri, Mar 04, 2022 at 08:38:11PM +0000, ag0aep6g via Digitalmars-d-learn wrote: [...]
> The second version involves auto-decoding, which isn't actually needed. You can work around it with `str.byCodeUnit.filter!...`. On my machine, times become the same then.
[...]

And this here is living proof of why autodecoding is a Bad Idea(tm).

Whatever happened to Andrei's std.v2 effort?!  The sooner we can shed this baggage, the better.


T

-- 
The two rules of success: 1. Don't tell everything you know. -- YHL
March 04, 2022
On Friday, 4 March 2022 at 19:51:44 UTC, matheus wrote:

> OK but there is another problem, I tested your version and mine and there is a HUGE difference in speed:

>     string s, str = "4A0B1de!2C9~6";

> Unless I did something wrong (if so, please tell me). By the way, on DMD it was worse: your version was about 5x slower.

To add to the already-mentioned difference in allocation strategies, try replacing the input with e.g. a command-line argument. Looping over a literal may be skewing the results.
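
A minimal sketch of that idea, assuming the input arrives as the first command-line argument and falling back to the literal from the original post otherwise. `pickInput` is a hypothetical helper, and the timed loop uses the `byCodeUnit` variant suggested elsewhere in this thread.

```d
import std.algorithm : filter;
import std.ascii : isDigit;
import std.conv : to;
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.stdio : writeln;
import std.utf : byCodeUnit;

// Hypothetical helper: use the first argument if present, else the literal.
string pickInput(string[] args)
{
    return args.length > 1 ? args[1] : "4A0B1de!2C9~6";
}

void main(string[] args)
{
    string str = pickInput(args);   // value is only known at run time
    string s;

    auto sw = StopWatch(AutoStart.yes);
    foreach (j; 0 .. 1_000_000)
        s = str.byCodeUnit.filter!(ch => ch.isDigit).to!string;
    sw.stop();

    writeln("str: ", s, "\nTim(us): ", sw.peek.total!"usecs");
}
```

Run it as e.g. `./bench '4A0B1de!2C9~6'` so the compiler cannot see the string at compile time.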

March 04, 2022
On Friday, 4 March 2022 at 20:33:08 UTC, H. S. Teoh wrote:
> On Fri, Mar 04, 2022 at 07:51:44PM +0000, matheus via ...
> I don't pay any attention to DMD when I'm doing anything remotely performance-related. Its optimizer is known to be suboptimal. :-P

Yes, in fact I usually do my coding/compiling with DMD because it is faster, then I go for LDC for production and speed.

Matheus.
March 04, 2022
On Friday, 4 March 2022 at 20:38:11 UTC, ag0aep6g wrote:
> ...
> The second version involves auto-decoding, which isn't actually needed. You can work around it with `str.byCodeUnit.filter!...`. On my machine, times become the same then.
>
> Typical output:
>
> str: 401296
> Tim(ms): 138
> Tim(us): 138505
>
> str: 401296
> Tim(ms): 137
> Tim(us): 137376

That's awesome, my timings are pretty much like yours. In fact, now with "byCodeUnit" it's faster. :)

Thanks,

Matheus.
March 04, 2022
On Friday, 4 March 2022 at 21:20:20 UTC, Stanislav Blinov wrote:
> On Friday, 4 March 2022 at 19:51:44 UTC, matheus wrote:
>
>> OK but there is another problem, I tested your version and mine and there is a HUGE difference in speed:
>
>>     string s, str = "4A0B1de!2C9~6";
>
>> Unless I did something wrong (if so, please tell me). By the way, on DMD it was worse: your version was about 5x slower.
>
> To add to the already-mentioned difference in allocation strategies, try replacing the input with e.g. a command-line argument. Looping over a literal may be skewing the results.

Interesting, I'll try it that way. Thanks.

Matheus.