May 02, 2022

On Sunday, 1 May 2022 at 16:31:41 UTC, claptrap wrote:
> On Sunday, 1 May 2022 at 15:50:17 UTC, Guillaume Piolat wrote:
>> On Sunday, 1 May 2022 at 14:36:12 UTC, Ola Fosheim Grøstad wrote:
>>> Autotune is more about fashion and marketing
>>>
>>> Things that "objectively" sounds bad can become fashionable for a limited time period (or become a musical style and linger on).
>>
>> It has been just a fad for over 23 years now.
>
> And 99.9% of the time you're listening to AutoTuned vocals you dont even know.

Autotune and vocal mixing are two different things, albeit the general population doesn't know the difference and thinks they're the same.

A lot of people mistake vocal mixing for autotune, when it really isn't.

Autotune takes vocals as input and changes each pitch to match a specific target pitch.

Vocal mixing might fix individual notes that were simply sung wrong, like an A that accidentally became an A# a single time in the chorus; you don't go through every pitch in the vocal take. On top of that it might add reverb, compression, etc., all of which has nothing to do with autotune but improves the sound a lot.

May 02, 2022
On Monday, 2 May 2022 at 01:44:03 UTC, claptrap wrote:
> On Sunday, 1 May 2022 at 18:09:16 UTC, Walter Bright wrote:
>> On 5/1/2022 9:31 AM, claptrap wrote:
>>> And 99.9% of the time you're listening to AutoTuned vocals you dont even know.
>>
>> It's gotten to the point where I can tell :-)
>
> How do you know when you cant tell? You dont, you just assume because you spot it sometimes you can always tell, you cant.
>
> and the thing about singers being better in the 70s, it's not true, it's just that we've forgotten 90% of the music and we only remember the good stuff. It's natural selection. 20 or 30 years from now people will say the same about the 2010s, because all the crap will have been forgotten and only the good stuff remains. There's a name for it but I cant remember what it is.
>
> I mean seriously look up the charts for a specific week in the 70s, or 80s or whatever, most of it was awful. But we just remember the stuff that stood the test of time.

I agree entirely with you. Even though there's a lot of bad music being made, there's still so much good music too; I don't think it's really that much different from back then. I also believe that, for nostalgic reasons, people won't think newer music is better even when it is. It's the same reason some people think movies etc. were better back then, when that isn't close to the truth either. There are tons of movies I watched as a child and thought were amazing, then rewatched as an adult and hated.
May 02, 2022

On Monday, 2 May 2022 at 01:43:03 UTC, claptrap wrote:
> I said it likely wasn't "feasible" not that it was impossible. Even the high end digital effects units in the mid 90s only managed a handful of basic effects at the same time, and they usually did that by using multiple chips, with different chips handling different blocks in the chain. A phase vocoder would have been pretty hard to pull off on that kind of hardware even if it was possible to a level of quality that was useful.

Technically, even the Motorola 56000 can do over 500 FFTs per second with a window size of 1024, according to Wikipedia. So the phase vocoder part was feasible, but it might not have been sonically feasible, in the sense that you might not have ended up with a product believed to be marketable, or it wasn't believed feasible to reach a sonic quality that would satisfy the market. That could come down to pitch-tracking, phase-vocoder issues, or the details of putting it all together.
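As a rough sanity check on that FFT budget (my numbers, assuming a 44.1 kHz sample rate and a hop of a quarter window, i.e. 75% overlap):

$$
\frac{44100\ \text{samples/s}}{1024/4\ \text{samples per hop}} \approx 172\ \text{frames/s} \;\Rightarrow\; \approx 344\ \text{transforms/s (analysis + resynthesis)} < 500,
$$

so even counting one forward and one inverse transform per frame, that 56000 figure leaves some headroom.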

Phase vocoders do introduce artifacts in the sound; it kind of follows from the uncertainty principle: you get to choose between high resolution in time or high resolution in frequency, but not both. So when you modify chunks of sound only in the frequency domain (with no concern for time) and then glue those chunks back together, you get something that has changed not only in pitch (in the general case). It takes a fair amount of cleverness and time-consuming fiddling to "suppress" those "time domain artifacts" in such a way that we don't find them disturbing. (But as I said, by the late 90s such artifacts were becoming the norm in commercial music. House music pushed the sound of popular music in that direction throughout the 90s.)
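To put rough numbers on that trade-off (again assuming 44.1 kHz and a 1024-sample window): time resolution is roughly the window length and frequency resolution is roughly the bin spacing,

$$
\Delta t \approx \frac{N}{f_s} = \frac{1024}{44100} \approx 23\ \text{ms}, \qquad \Delta f \approx \frac{f_s}{N} = \frac{44100}{1024} \approx 43\ \text{Hz},
$$

and lengthening the window to sharpen $\Delta f$ inevitably smears $\Delta t$, and vice versa.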

However, the concept of decomposing sound into spectral components in order to modify or improve the resulting sound has been an active field ever since ordinary computers were able to run FFTs in reasonable time. So there is no reason to claim that someone suddenly woke up with an obvious idea that nobody had thought of before. It comes down to execution and catching a wave (being adopted).

In general, truly original innovators rarely succeed in producing a marketable product. Market success usually comes from someone else with the right knowledge taking ideas that exist, refining them, making them less costly to produce, and using good marketing at the right time (plus a stroke of luck, like being picked up by someone who gives it traction).

"Somone woke up with an obvious idea that nobody had thought about before" makes for good journalistic entertainment, but is usually not true. Successful products tend to come in the wake of "not quite there efforts". You very rarely find examples of the opposite. (The exception might be in chemistry where people stumble upon a substance with interesting properties.)

May 02, 2022

On Monday, 2 May 2022 at 08:52:06 UTC, Ola Fosheim Grøstad wrote:
> On Monday, 2 May 2022 at 01:43:03 UTC, claptrap wrote:
>> I said it likely wasn't "feasible" not that it was impossible. Even the high end digital effects units in the mid 90s only managed a handful of basic effects at the same time, and they usually did that by using multiple chips, with different chips handling different blocks in the chain. A phase vocoder would have been pretty hard to pull off on that kind of hardware even if it was possible to a level of quality that was useful.
>
> Technically, even the Motorola 56000 can do over 500 FFTs per second with a window size of 1024, according to Wikipedia. So the phase vocoder part was feasible, but it might not have been sonically feasible, in the sense that you might not have ended up with a product believed to be marketable, or it wasn't believed feasible to reach a sonic quality that would satisfy the market. That could come down to pitch-tracking, phase-vocoder issues, or the details of putting it all together.
>
> Phase vocoders do introduce artifacts in the sound; it kind of follows from the uncertainty principle: you get to choose between high resolution in time or high resolution in frequency, but not both. So when you modify chunks of sound only in the frequency domain (with no concern for time) and then glue those chunks back together, you get something that has changed not only in pitch (in the general case). It takes a fair amount of cleverness and time-consuming fiddling to "suppress" those "time domain artifacts" in such a way that we don't find them disturbing. (But as I said, by the late 90s such artifacts were becoming the norm in commercial music. House music pushed the sound of popular music in that direction throughout the 90s.)

The concept of "windowing" + "overlap-add" to reduce artifacts is quite old, e.g. the Harris window is from 1978. I don't know of better ones (Hanning is the typical choice). This doubles the number of FFTs required, but you seem to say that was technically possible.

May 02, 2022
On Monday, 2 May 2022 at 01:42:19 UTC, Walter Bright wrote:

> A language designed for native compilation draws a hard distinction between compile time and run time. You'll see this in the grammar for the language, in the form of a constant-expression for compile time, and just expression for run time. The constant-expression does constant folding at compile time. The runtime does not include a compiler.

Nope, Nemerle doesn't require a compiler at runtime (however, you can include it if you need to). The Nemerle compiler compiles the const-expressions into a dll (yes, the target is bytecode, but it could be native code - it doesn't matter) and then loads the compiled code back and executes it *at compile time*. It could just as well do interpretation the way D does. Both approaches have their pros and cons, but they do fundamentally the same thing.
May 02, 2022

On Monday, 2 May 2022 at 06:15:32 UTC, FeepingCreature wrote:
> But I still think this is fundamentally the right way to think about CTFE. The compiler runtime is just a backend target, and because the compiler is a library, the macro can just recurse back into the compiler for parsing and helper functions. It's elegant, it gets native performance and complete language support "for free", and most importantly, it did not require much effort to implement.

Yay!

May 02, 2022

On Monday, 2 May 2022 at 08:52:06 UTC, Ola Fosheim Grøstad wrote:
> (But as I said, by the late 90s such artifacts were becoming the norm in commercial music. House music pushed the sound of popular music in that direction throughout the 90s.)

Sometimes artifacts sound "good", be it for cultural or "objective" reasons.

Many small delays can help a voice "fit in the mix", and spectral leakage in a phase vocoder does just that. So some may want to go through an STFT process just for the sound of the leakage, which makes a voice sound "processed" (even without pitch change). Why? Because in a live performance, you would have those delays because of mic leakage.

The same is true of artifacts that lead to reduced dynamics (such as phase misalignment in a phase vocoder). Didn't like those annoying vocal dynamics? Here is less of them, as a side effect.

The phase shift in oversampling? It can make drums sound more processed by delaying the bass, again. To the point that people use oversampling for processors that only add minimal aliasing.

Plus in the 2020s, anything with the sound of a popular codec is going to sound "good" because it's the sound of streaming.

May 02, 2022

On Monday, 2 May 2022 at 00:24:24 UTC, Bruce Carneal wrote:
> Does writing a compile time function require any new knowledge/skill or is it like writing a runtime function? Accurately answering "they're like any other function, use functions in either context and you'll be fine" means you've got something immediately useful to newcomers, an ultra low friction path to more power.

It does require new knowledge - you have to stick "macro" on the function declaration. In D, you don't need to do that, because the grammatical context the function is used in determines whether the function will be executed at compile time.
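A minimal D sketch of that point (`fib` here is just an illustrative function, not something from the thread): the very same function is evaluated at compile time or at run time depending only on the context of the call.

```d
int fib(int n)
{
    return n < 2 ? n : fib(n - 1) + fib(n - 2);
}

void main()
{
    import std.stdio : writeln;

    enum ct = fib(10);        // enum initializer must be constant: forces CTFE
    static assert(ct == 55);  // verified by the compiler, no runtime cost

    int n = 10;
    auto rt = fib(n);         // ordinary runtime call of the very same function
    writeln(ct, " ", rt);     // prints: 55 55
}
```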

> Answering "no, but we have super duper xyz which is every bit as powerful theoretically and should probably be preferred because it's hard for people to understand and qualifies you for your programming wizard merit badge", means you, as a language designer, did not understand what you could have had.
>
> Unless I'm missing something big from the Nemerle wiki page those language designers did not understand what they could have had.

There is nothing big about CTFE. )

> I'm happy to give credit where it is due but I'd advise hanging on to that beer in this case. :-)

I need my beer badly right now!

May 02, 2022

On Monday, 2 May 2022 at 09:23:10 UTC, Guillaume Piolat wrote:
> Sometimes artifacts sound "good", be it for cultural or "objective" reasons.

Yes, this is true. Like the loudness competition, which led to excessive use of compression (multiband?) and ducking (to let the bass drum through), which in turn led to a sound image that was pumping in and out. I personally find that annoying, but when you see kids driving in the streets playing loud music, they seem to favour this "musically bad" sound. I guess they find excitement in it, whereas I think of it as poor mastering. And I guess in some genres it is now considered bad mastering if you don't use excessive compression.

I believe this loudness competition and "overproduction" have also affected non-pop genres. If you get the ability to tweak, it is difficult to stop in time... I frequently find the live performances of talented singers on YouTube more interesting than their studio recordings, actually.

The French music scene might be different? French "electro" seemed more refined/sophisticated in its sound than many other "similar" genres, but this is only my impression, which could be wrong.

> Many small delays can help a voice "fit in the mix", and spectral leakage in a phase vocoder does just that. So some may want to go through an STFT process just for the sound of the leakage, which makes a voice sound "processed" (even without pitch change). Why? Because in a live performance, you would have those delays because of mic leakage.

I hadn't thought of that. Interesting perspective about mics, but a phase vocoder has other challenges related to changing the frequency content. How would you create a glissando from scratch using just the inverse FFT? It is not so obvious. How do you tell the difference between a click and a "shhhhhhh" sound? The only difference is in the phase… so it is not very intuitive in the frequency domain, but very intuitive in the time domain. You don't only get spectral leakage from windowing, you can also get phasing artifacts when you manipulate the frequency content. And so on…
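To make the click-versus-noise point concrete (a sketch using the standard DFT definition): a unit impulse and a noise burst can share exactly the same flat magnitude spectrum, and only the phases distinguish them.

$$
x[n] = \delta[n] \;\Rightarrow\; X[k] = \sum_{n=0}^{N-1} \delta[n]\, e^{-i 2\pi k n / N} = 1 \quad\text{for all } k \;\;(|X[k]| = 1,\ \arg X[k] = 0),
$$

whereas keeping $|X[k]| = 1$ but randomizing the phases, $X[k] = e^{i\phi_k}$, inverse-transforms to a noise-like "shhh" spread across the whole window.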

But, the audience today is very much accustomed to electronic soundscapes in mainstream music, so sounding "artificial" is not a negative. In the 80s you could see people argue seriously and with a fair amount of contempt that electronic music wasn't real music… That is a big difference!

Maybe similar things are happening in programming. Maybe very young programmers have a completely different view of what programming should be like? I don't know, but I've got a feeling that they would view C as a relic of the past. If we were teens, would we then focus on the GPU and forget about the CPU, or just patching together libraries in Javascript? Javascript is actually quite capable today, so…

> The phase shift in oversampling? It can make drums sound more processed by delaying the bass, again. To the point that people use oversampling for processors that only add minimal aliasing.

I didn't understand this one. Do you mean that musicians misunderstand what is causing the effect, so that they think it comes from the main effect when it is actually caused by the internal delay of the unit? Or did you mean something else?

> Plus in the 2020s, anything with the sound of a popular codec is going to sound "good" because it's the sound of streaming.

I hadn't thought of that. I'm not sure if I hear the difference between the original and the MP3 when playing other people's music (maybe in the hi-hats). I do hear a difference when listening to my own mix (maybe because I've spent so many hours analysing it).

May 02, 2022

On Monday, 2 May 2022 at 08:57:21 UTC, user1234 wrote:
> The concept of "windowing" + "overlap-add" to reduce artifacts is quite old, e.g. the Harris window is from [1978]. I don't know of better ones (Hanning is the typical choice). This doubles the number of FFTs required, but you seem to say that was technically possible.

Yes, I assume anyone who knows about FFTs also knows the theory of windowing? The theoretically "optimal" one for analysis is DPSS, although Kaiser is basically the same, but I never use those.

I use 4x, not 2x, and Hann² (cos·cos) as the window function, for simplicity. The reason for this is that when you heavily modify the frequency content you need to window it again, so you multiply with cos(t) twice, but when you add the frames together the sum = 1. Probably not optimal, but easy to deal with for experiments.
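A small D sketch of that overlap-add property, under the assumption that "Hann²" means the square of the (periodic) Hann window and the hop is N/4 (4x overlap): the windows then overlap-add to a constant (3/2 here, a fixed gain you can normalize to 1).

```d
import std.math : cos, PI, isClose;
import std.stdio : writeln;

void main()
{
    enum N = 1024;      // window size
    enum hop = N / 4;   // 4x overlap
    double[N] w;
    foreach (n; 0 .. N)
        w[n] = 0.5 * (1.0 - cos(2.0 * PI * n / N)); // periodic Hann window

    // For every sample position, the squared windows of the four
    // overlapping frames sum to the same constant (3/2).
    foreach (n; 0 .. hop)
    {
        double s = 0.0;
        foreach (k; 0 .. 4)
            s += w[n + k * hop] * w[n + k * hop];
        assert(isClose(s, 1.5));
    }
    writeln("Hann^2 windows at hop N/4 overlap-add to a constant.");
}
```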

I also believe it is possible to use Hann-Poisson for analysis. It has excessive spectral leakage, but supposedly allows you to accurately find the peaks, as the flanks of the spectral leakage are monotone (a smooth slope), so you can use hill climbing. But I doubt you can use it for resynthesis.

What you could do is use Hann-Poisson for detecting peaks and then use another window function for resynthesis. I will try this some day :-).