August 05, 2013 Which option is faster...
Greetings!

I have this code,

foreach (...)
{
    if (std.string.tolower(fext[0]) == "doc" ||
        std.string.tolower(fext[0]) == "docx" ||
        std.string.tolower(fext[0]) == "xls" ||
        std.string.tolower(fext[0]) == "xlsx" ||
        std.string.tolower(fext[0]) == "ppt" ||
        std.string.tolower(fext[0]) == "pptx")
        continue;
}

foreach (...)
{
    if (std.string.tolower(fext[0]) == "doc")
        continue;
    if (std.string.tolower(fext[0]) == "docx")
        continue;
    if (std.string.tolower(fext[0]) == "xls")
        continue;
    if (std.string.tolower(fext[0]) == "xlsx")
        continue;
    if (std.string.tolower(fext[0]) == "ppt")
        continue;
    if (std.string.tolower(fext[0]) == "pptx")
        continue;
    ...
    ...
}

thanks.

josé

August 05, 2013 Re: Which option is faster...
Posted in reply to jicman | On 08/05/2013 03:59 PM, jicman wrote:
>
> Greetings!
>
> I have this code,
>
> foreach (...)
> {
>
>     if (std.string.tolower(fext[0]) == "doc" ||
>         std.string.tolower(fext[0]) == "docx" ||
>         std.string.tolower(fext[0]) == "xls" ||
>         std.string.tolower(fext[0]) == "xlsx" ||
>         std.string.tolower(fext[0]) == "ppt" ||
>         std.string.tolower(fext[0]) == "pptx")
>         continue;
> }
>
> foreach (...)
> {
>     if (std.string.tolower(fext[0]) == "doc")
>         continue;
>     if (std.string.tolower(fext[0]) == "docx")
>         continue;
>     if (std.string.tolower(fext[0]) == "xls")
>         continue;
>     if (std.string.tolower(fext[0]) == "xlsx")
>         continue;
>     if (std.string.tolower(fext[0]) == "ppt")
>         continue;
>     if (std.string.tolower(fext[0]) == "pptx")
>         continue;
>     ...
>     ...
> }
>
> thanks.
>
> josé
They are both equally slow.

August 05, 2013 Re: Which option is faster...
Posted in reply to jicman | On Monday, 5 August 2013 at 13:59:24 UTC, jicman wrote:
>
> Greetings!
>
> I have this code,
First option...
> foreach (...)
> {
>
>     if (std.string.tolower(fext[0]) == "doc" ||
>         std.string.tolower(fext[0]) == "docx" ||
>         std.string.tolower(fext[0]) == "xls" ||
>         std.string.tolower(fext[0]) == "xlsx" ||
>         std.string.tolower(fext[0]) == "ppt" ||
>         std.string.tolower(fext[0]) == "pptx")
>         continue;
> }
Second option...
> foreach (...)
> {
>     if (std.string.tolower(fext[0]) == "doc")
>         continue;
>     if (std.string.tolower(fext[0]) == "docx")
>         continue;
>     if (std.string.tolower(fext[0]) == "xls")
>         continue;
>     if (std.string.tolower(fext[0]) == "xlsx")
>         continue;
>     if (std.string.tolower(fext[0]) == "ppt")
>         continue;
>     if (std.string.tolower(fext[0]) == "pptx")
>         continue;
>     ...
>     ...
> }
>
> thanks.
>
> josé
So, after I saw this post I asked myself, what? So, the question is: which of the two foreach loop options is faster?
1. The concatenated if ||
2. The single ifs
I am trying to see if it matters. I have a project with lots of files, and if one option is faster, it will matter to code it the faster way.
Thanks.

August 05, 2013 Re: Which option is faster...
Posted in reply to Timon Gehr | On Monday, 5 August 2013 at 14:13:33 UTC, Timon Gehr wrote:
> On 08/05/2013 03:59 PM, jicman wrote:
>>
>> Greetings!
>>
>> I have this code,
>>
>> foreach (...)
>> {
>>
>>     if (std.string.tolower(fext[0]) == "doc" ||
>>         std.string.tolower(fext[0]) == "docx" ||
>>         std.string.tolower(fext[0]) == "xls" ||
>>         std.string.tolower(fext[0]) == "xlsx" ||
>>         std.string.tolower(fext[0]) == "ppt" ||
>>         std.string.tolower(fext[0]) == "pptx")
>>         continue;
>> }
>>
>> foreach (...)
>> {
>>     if (std.string.tolower(fext[0]) == "doc")
>>         continue;
>>     if (std.string.tolower(fext[0]) == "docx")
>>         continue;
>>     if (std.string.tolower(fext[0]) == "xls")
>>         continue;
>>     if (std.string.tolower(fext[0]) == "xlsx")
>>         continue;
>>     if (std.string.tolower(fext[0]) == "ppt")
>>         continue;
>>     if (std.string.tolower(fext[0]) == "pptx")
>>         continue;
>>     ...
>>     ...
>> }
>>
>> thanks.
>>
>> josé
>
> They are both equally slow.
How would you make it faster in D1?

August 05, 2013 Re: Which option is faster...
Posted in reply to jicman | Did you benchmark your current scenario? How do you know that this is the slow part? Or are you working on a tool that only compares extensions?
Btw: both are equally slow, and both duplicate code: std.string.tolower(fext[0]) is called multiple times. I hope your list isn't going to get much longer.
On 05.08.2013 15:59, jicman wrote:
>
> Greetings!
>
> I have this code,
>
> foreach (...)
> {
>
>     if (std.string.tolower(fext[0]) == "doc" ||
>         std.string.tolower(fext[0]) == "docx" ||
>         std.string.tolower(fext[0]) == "xls" ||
>         std.string.tolower(fext[0]) == "xlsx" ||
>         std.string.tolower(fext[0]) == "ppt" ||
>         std.string.tolower(fext[0]) == "pptx")
>         continue;
> }
>
> foreach (...)
> {
>     if (std.string.tolower(fext[0]) == "doc")
>         continue;
>     if (std.string.tolower(fext[0]) == "docx")
>         continue;
>     if (std.string.tolower(fext[0]) == "xls")
>         continue;
>     if (std.string.tolower(fext[0]) == "xlsx")
>         continue;
>     if (std.string.tolower(fext[0]) == "ppt")
>         continue;
>     if (std.string.tolower(fext[0]) == "pptx")
>         continue;
>     ...
>     ...
> }
>
> thanks.
>
> josé
>

August 05, 2013 Re: Which option is faster...
Posted in reply to dennis luehring | On Monday, 5 August 2013 at 14:27:43 UTC, dennis luehring wrote:
> Did you benchmark your current scenario? How do you know that this is the slow part? Or are you working on a tool that only compares extensions?
>
> Btw: both are equally slow, and both duplicate code: std.string.tolower(fext[0]) is called multiple times. I hope your list isn't going to get much longer.
Ok, how would you make it faster?

August 05, 2013 Re: Which option is faster...
Posted in reply to jicman | > Ok, how would you make it faster?
I don't see a better solution here. How would you reduce ONE lowercase conversion and SOME comparisons any further? (I don't think a hash or anything like that will help.) But I do know that anything like your continue-party gains nothing (it feels a bit like the script-kiddie claim "do it in assembler, that would make it a million times faster", blah blah).
Question: is this the slow part of your project?
Do you know that for sure, or is it just a feeling? HOW do you benchmark?
On 05.08.2013 16:31, jicman wrote:
> On Monday, 5 August 2013 at 14:27:43 UTC, dennis luehring wrote:
>> Did you benchmark your current scenario? How do you know
>> that this is the slow part? Or are you working on a tool
>> that only compares extensions?
>>
>> Btw: both are equally slow, and both duplicate code:
>> std.string.tolower(fext[0]) is called multiple times. I hope
>> your list isn't going to get much longer.
>
> Ok, how would you make it faster?
>
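On the "HOW do you benchmark?" question, here is a minimal timing sketch. It assumes a D2 compiler with a recent Phobos (std.datetime.stopwatch and std.uni.toLower), so it is not the D1 code from this thread; under D1 the call is spelled std.string.tolower and the timing facility would differ. The function names skipRepeated and skipCached and the sample extension list are made up for illustration, and real measurements should run over the project's own file list.

```d
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.stdio : writefln;
import std.uni : toLower;

// Variant 1: lowercases the extension again for every comparison,
// as in both loops from the original post.
bool skipRepeated(string ext)
{
    return toLower(ext) == "doc" || toLower(ext) == "docx"
        || toLower(ext) == "xls" || toLower(ext) == "xlsx"
        || toLower(ext) == "ppt" || toLower(ext) == "pptx";
}

// Variant 2: lowercases once and reuses the result.
bool skipCached(string ext)
{
    auto tmp = toLower(ext);
    return tmp == "doc" || tmp == "docx"
        || tmp == "xls" || tmp == "xlsx"
        || tmp == "ppt" || tmp == "pptx";
}

void main()
{
    // Hypothetical sample input, not taken from the thread.
    string[] exts = ["DOC", "txt", "Xlsx", "d", "PPTX", "html"];
    enum rounds = 100_000;

    size_t hits;
    auto sw = StopWatch(AutoStart.yes);
    foreach (i; 0 .. rounds)
        foreach (e; exts)
            if (skipRepeated(e))
                ++hits;
    writefln("repeated: %s ms (%s hits)", sw.peek.total!"msecs", hits);

    hits = 0;
    sw.reset(); // restart the measurement; the stopwatch keeps running
    foreach (i; 0 .. rounds)
        foreach (e; exts)
            if (skipCached(e))
                ++hits;
    writefln("cached:   %s ms (%s hits)", sw.peek.total!"msecs", hits);
}
```

Timing the whole tool first (for example, the directory walk versus the extension check) would show whether this comparison is the slow part at all.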

August 05, 2013 Re: Which option is faster...
Posted in reply to jicman | jicman:
> How would you make it faster in D1?
Compute std.string.tolower(fext[0]) once and put it in a temporary variable, then compare that variable with all your string literals. In most cases that's fast enough. If it's not enough, you could create a little finite state machine that represents your directed acyclic word graph and uses gotos to jump between states. The number of strings is small, so perhaps there are not enough code cache misses to nullify this optimization.
Bye,
bearophile
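A rough sketch of the second idea above, under some assumptions: it is written for D2 (std.ascii.toLower), it uses a plain switch on the first character instead of gotos, and it relies on the observation that all six words are a three-letter stem with an optional trailing 'x'. The name isOfficeExt is hypothetical. It only folds ASCII case, so it is a hand-rolled matcher for this exact word list, not a general solution.

```d
import std.ascii : toLower;

/// Returns true for doc, docx, xls, xlsx, ppt and pptx,
/// ignoring ASCII case, without allocating a lowercased copy.
bool isOfficeExt(const(char)[] ext)
{
    // Every accepted word is a three-letter stem plus an optional 'x'.
    if (ext.length < 3 || ext.length > 4)
        return false;
    if (ext.length == 4 && toLower(ext[3]) != 'x')
        return false;

    immutable c1 = toLower(ext[1]);
    immutable c2 = toLower(ext[2]);

    // The first character selects the branch; the rest must match the stem.
    switch (toLower(ext[0]))
    {
        case 'd': return c1 == 'o' && c2 == 'c'; // doc, docx
        case 'x': return c1 == 'l' && c2 == 's'; // xls, xlsx
        case 'p': return c1 == 'p' && c2 == 't'; // ppt, pptx
        default:  return false;
    }
}
```

Inside the loop this would be used as `if (isOfficeExt(fext[0])) continue;`. Whether it beats the temporary-variable version in practice is something only a measurement can tell.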

August 05, 2013 Re: Which option is faster...
Posted in reply to dennis luehring | On Monday, 5 August 2013 at 14:47:37 UTC, dennis luehring wrote:
> > Ok, how would you make it faster?
>
> I don't see a better solution here. How would you reduce ONE lowercase conversion and SOME comparisons any further? (I don't think a hash or anything like that will help.) But I do know that anything like your continue-party gains nothing (it feels a bit like the script-kiddie claim "do it in assembler, that would make it a million times faster", blah blah).
>
> Question: is this the slow part of your project?
> Do you know that for sure, or is it just a feeling? HOW do you benchmark?
>
> On 05.08.2013 16:31, jicman wrote:
>> On Monday, 5 August 2013 at 14:27:43 UTC, dennis luehring wrote:
>>> Did you benchmark your current scenario? How do you know
>>> that this is the slow part? Or are you working on a tool
>>> that only compares extensions?
>>>
>>> Btw: both are equally slow, and both duplicate code:
>>> std.string.tolower(fext[0]) is called multiple times. I hope
>>> your list isn't going to get much longer.
>>
>> Ok, how would you make it faster?
It is a tool that used to be a script, but I have ported it to D, which has already cut two hours off the run time of the last JScript version. I have not benchmarked it yet. I may. But I see that a great idea has been provided, which I will use. Thanks for the help.

August 05, 2013 Re: Which option is faster...
Posted in reply to jicman | On Monday, 5 August 2013 at 13:59:24 UTC, jicman wrote:
>
> Greetings!
>
> I have this code,
>
> foreach (...)
> {
>
>     if (std.string.tolower(fext[0]) == "doc" ||
>         std.string.tolower(fext[0]) == "docx" ||
>         std.string.tolower(fext[0]) == "xls" ||
>         std.string.tolower(fext[0]) == "xlsx" ||
>         std.string.tolower(fext[0]) == "ppt" ||
>         std.string.tolower(fext[0]) == "pptx")
>         continue;
> }
>
> foreach (...)
> {
>     if (std.string.tolower(fext[0]) == "doc")
>         continue;
>     if (std.string.tolower(fext[0]) == "docx")
>         continue;
>     if (std.string.tolower(fext[0]) == "xls")
>         continue;
>     if (std.string.tolower(fext[0]) == "xlsx")
>         continue;
>     if (std.string.tolower(fext[0]) == "ppt")
>         continue;
>     if (std.string.tolower(fext[0]) == "pptx")
>         continue;
>     ...
>     ...
> }
>
> thanks.
>
> josé
better:
foreach (...)
{
    auto tmp = std.string.tolower(fext[0]);
    if (tmp == "doc" || tmp == "docx"
        || tmp == "xls" || tmp == "xlsx"
        || tmp == "ppt" || tmp == "pptx")
    {
        continue;
    }
}
This is still not super-fast because (unless the compiler is very clever) it still means multiple passes over tmp. Also, it converts the whole string to lower case even when that's not necessary.
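One way to avoid building a lowercased copy at all is a case-insensitive comparison. The following is a minimal sketch assuming D2 and std.uni.icmp (the thread itself is about D1, where the library names differ); the function name matchesKnownExt is hypothetical, and the length filter assumes the extensions are plain ASCII. Note that icmp compares using Unicode case folding, so for unusual non-ASCII input it may not agree exactly with a toLower-then-compare approach.

```d
import std.uni : icmp;

/// Case-insensitive membership test that does not allocate
/// a lowercased copy of the extension.
bool matchesKnownExt(const(char)[] ext)
{
    // Cheap rejection first: every candidate is 3 or 4 characters long
    // (assumes ASCII extensions, so byte length equals character count).
    if (ext.length != 3 && ext.length != 4)
        return false;

    static immutable string[] known =
        ["doc", "docx", "xls", "xlsx", "ppt", "pptx"];

    foreach (string k; known)
        if (icmp(ext, k) == 0) // 0 means equal, ignoring case
            return true;
    return false;
}
```

This still makes up to six passes over the extension, so it addresses the allocation point rather than the multiple-passes point.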
If you have large numbers of possible matches you will probably want to be clever with your data structures / algorithms.
For example, you could create a tree-like structure to quickly eliminate possibilities as you read successive letters. You read one character, follow the appropriate branch, and check if there are any further branches; if not, there is no match and you break. Otherwise, you read the next character, follow the appropriate branch, and so on. This is infeasible for large (or even medium-sized) character sets without hashing, but might be pretty fast for a-z and a large number of short strings.
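The tree idea can be sketched roughly as follows, assuming D2, words restricted to the letters a-z, and ASCII case folding via std.ascii.toLower. The names Node, addWord, hasWord and buildOfficeTree are invented for this illustration. For only six short words this is almost certainly overkill; it is meant to show the shape of the structure for a much longer list.

```d
import std.ascii : toLower;

/// One node of the lookup tree; children are indexed by letter 'a'..'z'.
final class Node
{
    Node[26] next;  // null means there is no branch for that letter
    bool terminal;  // true if a stored word ends at this node
}

/// Adds a word made only of the letters a-z to the tree.
void addWord(Node root, string word)
{
    Node n = root;
    foreach (char c; word)
    {
        assert(c >= 'a' && c <= 'z', "only a-z is supported in this sketch");
        immutable i = c - 'a';
        if (n.next[i] is null)
            n.next[i] = new Node;
        n = n.next[i];
    }
    n.terminal = true;
}

/// Follows one branch per character and stops at the first missing branch.
bool hasWord(Node root, const(char)[] s)
{
    Node n = root;
    foreach (char c; s)
    {
        immutable char lc = toLower(c); // fold ASCII upper case on the fly
        if (lc < 'a' || lc > 'z')
            return false;
        n = n.next[lc - 'a'];
        if (n is null)
            return false; // no further branch, so no match
    }
    return n.terminal;
}

/// Builds the tree once, outside the file loop.
Node buildOfficeTree()
{
    auto root = new Node;
    foreach (w; ["doc", "docx", "xls", "xlsx", "ppt", "pptx"])
        addWord(root, w);
    return root;
}
```

The tree would be built once before the foreach, and the loop body would then use `if (hasWord(tree, fext[0])) continue;`, reading each character of the extension at most once.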