March 21, 2015
On Sat, Mar 21, 2015 at 04:17:00AM +0000, Joakim via Digitalmars-d wrote:
[...]
> What I was going to say too, neither CLI nor GUI will win, speech recognition will replace them both, by providing the best of both. Rather than writing a script to scrape several shopping websites for the price of a Galaxy S6, I'll simply tell the intelligent agent on my computer "Find me the best deal on a S6" and it will go find it.

I dunno, I find that I can express myself far more precisely and concisely on the keyboard than I can verbally. Maybe for everyday tasks like shopping for the best deals voice recognition is Good Enough(tm), but for more complex tasks, I have yet to find something more expressive than the keyboard.


> As for touch, seems like a dead-end to me, far less expressive than anything else and really only geared for rudimentary interaction.  It may always be there but you likely won't use it much.

Yeah, it's just another variation of point-and-grunt. Except the grunt part is replaced with tap. :-P


> I do think some sort of hand gesture-based interface will stick around for when voice isn't expressive enough, ie you'll still want to use your hands when painting:
> 
> http://www.engadget.com/2015/03/19/jaw-dropping-magic-leap-demo/
> 
> That video is not the way it will be done, as waving your arms around Minority Report-style is way too much effort, but something akin to the small finger movements I make on my touch-based trackpad, but in 3D, will likely be it.

You might be on to something. Manipulation of 3D holograms via hand motion detection might be what eventually works best.


T

-- 
"Maybe" is a strange word.  When mom or dad says it it means "yes", but when my big brothers say it it means "no"! -- PJ jr.
March 21, 2015
On 2015-03-21 at 06:30, H. S. Teoh via Digitalmars-d wrote:
> On Sat, Mar 21, 2015 at 04:17:00AM +0000, Joakim via Digitalmars-d wrote:
> [...]
>> What I was going to say too, neither CLI nor GUI will win, speech
>> recognition will replace them both, by providing the best of both.
>> Rather than writing a script to scrape several shopping websites for
>> the price of a Galaxy S6, I'll simply tell the intelligent agent on my
>> computer "Find me the best deal on a S6" and it will go find it.
>
> I dunno, I find that I can express myself far more precisely and
> concisely on the keyboard than I can verbally. Maybe for everyday tasks
> like shopping for the best deals voice recognition is Good Enough(tm),
> but for more complex tasks, I have yet to find something more expressive
> than the keyboard.

"Find me the best deal on a S6" is only a little more complex than "make me a cup of coffee." Fine for doing predefined tasks but questionable as an ubiquitous input method. It's hard enough for mathematicians to dictate a theorem without using any symbolic notation. There is too much ambiguity and room for interpretation in speech to make it a reliable and easy input method for all tasks. Even in your example:

You say: "Find me the best deal on a S6."
I hear: "Fine me the best teal on A.S. six."
Computer: "Are you looking for steel?"

Now imagine the extra trouble if you mix languages. Also, how do you include meta-text control sequences in a message? By raising your voice or tilting your head when you say the magic words? Cf.:

"There was this famous quote QUOTE to be or not to be END QUOTE on page six END PARAGRAPH..."

Very awkward, if talking to oneself wasn't awkward already. Therefore I just cannot imagine voice being used anywhere where exact representation is required, especially in programming:

"Define M1 as a function that takes in two arguments. The state of the machine labelled ES and an integer number in range between two and six inclusive labelled X. The result of M1 is a boolean. M1 shall return true if and only if the ES member labelled squat THATS SQUAT WITH A T AT THE END is equal to zero modulo B. OH SHIT IT WAS NOT B BUT X. SCRATCH EVERYTHING."
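For comparison, here is roughly what that dictation amounts to when typed, as a minimal Python sketch. The names `MachineState`, `squat` and `m1` are my own stand-ins for the spoken "ES", "squat" and "M1"; nothing here comes from a real codebase, and rejecting an out-of-range X with `ValueError` is an assumption of the sketch, since the dictation only states the range.

```python
from dataclasses import dataclass


@dataclass
class MachineState:
    """Hypothetical stand-in for the machine state dictated as "ES"."""
    squat: int


def m1(es: MachineState, x: int) -> bool:
    """Return True if and only if es.squat is equal to zero modulo x."""
    # The dictation restricts x to the range two to six inclusive;
    # raising ValueError for other values is an assumption of this sketch.
    if not 2 <= x <= 6:
        raise ValueError("x must be between 2 and 6 inclusive")
    return es.squat % x == 0
```

A handful of typed lines, and the B-versus-X slip is a one-character fix rather than a reason to scratch everything.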

March 21, 2015
On Friday, 20 March 2015 at 22:55:24 UTC, Laeeth Isharc wrote:
> So one must be careful to avoid being dazzled by shiny 'scientific' approaches when their value remains yet to be proven.

I sense a recursive problem here...
March 21, 2015
On Friday, 20 March 2015 at 17:25:54 UTC, H. S. Teoh wrote:
> On Fri, Mar 20, 2015 at 05:04:20PM +0000, ketmar via Digitalmars-d wrote:
>> On Fri, 20 Mar 2015 13:28:45 +0000, Paulo  Pinto wrote:
>> 
>> > Given that I have been an IDE fan since the Amiga days, I fully
>> > agree.
>> > 
>> > Every time I am on UNIX I feel like a time travel to the days of
>> > yore.
>> 
>> being on non-nix system is a torture. there aren't even gcc, let alone
>> emacs/vim.
>
> Yeah, I've become so accustomed to the speed of keyboard-based controls
> that every time I use my wife's Windows laptop, I feel so frustrated at
> the rodent dependence and its slowness that I want to throw the thing
> out the window.
>
> But at another level, it's not even about keyboard vs. rodent... it's
> about *scriptability*. It's about abstraction. Typing commands at the
> CLI, while on the surface looks so tedious, actually has a powerful
> advantage: you can abstract it. You can encapsulate it into a script.
> Most well-designed CLI programs are scriptable, which means complex
> operations can be encapsulated and then used as new primitives with
> greater expressiveness.
>
> Sure you can have keyboard shortcuts in GUI programs, but you can't
> abstract a series of mouse clicks and drags or a series of keyboard
> shortcuts into a single action. They will forever remain in the realm of
> micromanagement -- click this menu, move mouse to item 6, open submenu,
> click that, etc.. I have yet to see a successful attempt at
> encapsulating a series of actions as a single meta-action (I've seen
> attempts at it, but none that were compelling enough to be useful.) You
> can't build meta-meta-actions from meta-actions. Everything is bound to
> what-you-see-is-all-you-get. You can't parametrize a series of mouse
> interactions the same way you can take a bash script and parametrize it
> to do something far beyond what the original sequence of typed commands
> did.
>
> Ultimately, I think rodent-based UIs will go the way of the dinosaur.
> It's a regression from the expressiveness of an actual language with
> grammar and semantics back to caveman-style point-and-grunt. It may take
> decades, maybe even centuries, before the current GUI trendiness fades
> away, but eventually it will become obvious that there is no future in a
> non-abstractible UI. Either CLIs will be proven by the test of time, or
> something else altogether will come along to replace the rodent dead-end
> with something more powerful. Something abstractible with the
> expressiveness of language and semantics, not regressive
> point-and-grunt.
>
>
> T

In general I'm in agreement with you, but I think there *is* a place for more visual structure than a terminal editing a text-file can give you (essentially 1-D or maybe 1.5D, whatever that means). Some models/data/tasks are inherently more intuitive and quicker to work with in 2D.
March 21, 2015
On Saturday, 21 March 2015 at 14:07:28 UTC, FG wrote:
> Now imagine the extra trouble if you mix languages. Also, how do you include meta-text control sequences in a message? By raising your voice or tilting your head when you say the magic words? Cf.:
>
> "There was this famous quote QUOTE to be or not to be END QUOTE on page six END PARAGRAPH..."
>
> Very awkward, if talking to oneself wasn't awkward already. Therefore I just cannot imagine voice being used anywhere where exact representation is required, especially in programming:
>
> "Define M1 as a function that takes in two arguments. The state of the machine labelled ES and an integer number in range between two and six inclusive labelled X. The result of M1 is a boolean. M1 shall return true if and only if the ES member labelled squat THATS SQUAT WITH A T AT THE END is equal to zero modulo B. OH SHIT IT WAS NOT B BUT X. SCRATCH EVERYTHING."

Just for fun. A visualization of the problem from 2007 (I doubt there has been a breakthrough since)

https://www.youtube.com/watch?v=MzJ0CytAsec

Piotrek
March 21, 2015
On Sat, Mar 21, 2015 at 03:10:37PM +0000, John Colvin via Digitalmars-d wrote:
> On Friday, 20 March 2015 at 17:25:54 UTC, H. S. Teoh wrote:
[...]
> >But at another level, it's not even about keyboard vs. rodent... it's about *scriptability*. It's about abstraction. Typing commands at the CLI, while on the surface looks so tedious, actually has a powerful advantage: you can abstract it. You can encapsulate it into a script. Most well-designed CLI programs are scriptable, which means complex operations can be encapsulated and then used as new primitives with greater expressiveness.
> >
> >Sure you can have keyboard shortcuts in GUI programs, but you can't abstract a series of mouse clicks and drags or a series of keyboard shortcuts into a single action. They will forever remain in the realm of micromanagement -- click this menu, move mouse to item 6, open submenu, click that, etc.. I have yet to see a successful attempt at encapsulating a series of actions as a single meta-action (I've seen attempts at it, but none that were compelling enough to be useful.) You can't build meta-meta-actions from meta-actions. Everything is bound to what-you-see-is-all-you-get. You can't parametrize a series of mouse interactions the same way you can take a bash script and parametrize it to do something far beyond what the original sequence of typed commands did.
> >
> >Ultimately, I think rodent-based UIs will go the way of the dinosaur. It's a regression from the expressiveness of an actual language with grammar and semantics back to caveman-style point-and-grunt. It may take decades, maybe even centuries, before the current GUI trendiness fades away, but eventually it will become obvious that there is no future in a non-abstractible UI. Either CLIs will be proven by the test of time, or something else altogether will come along to replace the rodent dead-end with something more powerful. Something abstractible with the expressiveness of language and semantics, not regressive point-and-grunt.
> >
> >
> >T
> 
> In general I'm in agreement with you, but I think there *is* a place for more visual structure than a terminal editing a text-file can give you (essentially 1-D or maybe 1.5D, whatever that means). Some models/data/tasks are inherently more intuitive and quicker to work with in 2D.

Certainly, some tasks are more suited for 2D, or even 3D, manipulation than editing a text file, say. But the fact that task X is more profitably manipulated with a 2D interface does not imply that *every* task is better manipulated the same way.

But at a more fundamental level, it's not really about text vs. graphics or 1D (1.5D) vs. 2D. It's about the ability to abstract, which is currently missing from today's ubiquitous GUIs. I would willingly leave my text-based interfaces behind if you could show me a GUI that gives me the same (or better) abstraction power as the expressiveness of a CLI script, for example. Contemporary GUIs fail me on the following counts:

1) Expressiveness: there is no simple way of conveying complex ideas like "from here until the first line that contains the word 'END', replace all occurrences of 'x' with 'y'". A single sed command could accomplish this, whereas using contemporary GUI idioms you'd need to invent a morass of hard-to-navigate nested submenus.

2) Speed: I can type the sed command in far less time than it takes to move my hand to the mouse, move the cursor across the screen, and click through said morass of nested submenus to select the requisite checkboxes to express what I want to do.

3) Abstraction power: I can parametrize said sed command, and put a whole collection of such commands into a script, that I can thereafter refer to by name to execute the same commands again, *without having to remember* the individual details of said commands.

4) Annotative power: As somebody else pointed out, I can add comments to a script explaining what is needed to perform task X, and why the given steps were chosen for that purpose. This alleviates the need to memorize obscure details about the system that you don't really care about to get your job done, and also serves to jog your memory when something goes wrong and you need to recall why things were done this way and how you might be able to fix it. I simply cannot see how these kinds of meta-annotations can even remotely be shoehorned into contemporary GUI idioms.

5) Precision: Even when working with graphical data, I prefer text-based interfaces where practical, not because text is the best way to work with them -- it's quite inefficient, in fact -- but because I can specify the exact coordinates of object X and the exact displacement(s) I desire, rather than fight with the inherently imprecise mouse movement and getting myself a wrist aneurysm trying to position object X precisely in a GUI. I have yet to see a GUI that allows you to specify things in a precise way without essentially dropping back to a text-based interface (e.g., an input field that requires you to type in numbers... which is actually not a bad solution; many GUIs don't even provide that, but instead give you the dreaded slider control which is inherently imprecise and extremely cumbersome to use. Or worse, the text box with the inconveniently-small 5-pixel up/down arrows that changes the value by 0.1 per mouse click, thereby requiring an impractical number of clicks to get you to the right value -- if you're really unlucky, you can't even type in an explicit number but can only use those microscopic arrows to change it).
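To make points 1, 3 and 4 concrete, here is a minimal Python sketch (not the sed command itself) of "from here until the first line that contains the word 'END', replace all occurrences of 'x' with 'y'", wrapped up as a named, parametrized, commented operation. The function name `replace_until_marker` and its parameters are hypothetical, chosen only for illustration; treating the marker line itself as part of the range, and running to the end of the input when no marker is found, are assumptions of the sketch.

```python
def replace_until_marker(lines, start, old, new, marker="END"):
    """Replace `old` with `new` from line `start` (0-based) through the
    first line at or after `start` that contains `marker`.

    Lines before `start` and after the marker line are left untouched.
    If no line contains the marker, the replacement runs to the end.
    """
    result = list(lines[:start])
    in_range = True
    for line in lines[start:]:
        if in_range:
            # Decide whether this line ends the range before editing it,
            # so the replacement cannot create or destroy the marker.
            is_last = marker in line
            line = line.replace(old, new)
            if is_last:
                in_range = False
        result.append(line)
    return result


text = ["x = 1", "x + x", "END of x block", "x stays"]
print(replace_until_marker(text, 1, "x", "y"))
```

Once it has a name and parameters, the same operation can be reused with a different starting line, marker or replacement without remembering the details, and the comments record why it works the way it does.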


A GUI that is NOT rodent-based would alleviate a large part of these problems, actually.  I've been using Vimperator for my browser recently, and in spite of its warts (mostly due to the fact that it's merely a customization layer on top of an essentially rodent-dependent browser core), it's proven to be a far more efficient way of using a GUI browser than the rodent. Well, OK, it's hearkening back to the CLI days of modal editors (y'know, vim), but it proves that it's not really *graphics* per se that are the problem, but it's today's obsession with the rodent that's the source of much of my complaint.


T

-- 
Just because you survived after you did it, doesn't mean it wasn't stupid!
March 21, 2015
On Saturday, 21 March 2015 at 14:07:28 UTC, FG wrote:
> On 2015-03-21 at 06:30, H. S. Teoh via Digitalmars-d wrote:
>> On Sat, Mar 21, 2015 at 04:17:00AM +0000, Joakim via Digitalmars-d wrote:
>> [...]
>>> What I was going to say too, neither CLI nor GUI will win, speech
>>> recognition will replace them both, by providing the best of both.
>>> Rather than writing a script to scrape several shopping websites for
>>> the price of a Galaxy S6, I'll simply tell the intelligent agent on my
>>> computer "Find me the best deal on a S6" and it will go find it.
>>
>> I dunno, I find that I can express myself far more precisely and
>> concisely on the keyboard than I can verbally. Maybe for everyday tasks
>> like shopping for the best deals voice recognition is Good Enough(tm),
>> but for more complex tasks, I have yet to find something more expressive
>> than the keyboard.
>
> "Find me the best deal on a S6" is only a little more complex than "make me a cup of coffee." Fine for doing predefined tasks but questionable as an ubiquitous input method. It's hard enough for mathematicians to dictate a theorem without using any symbolic notation. There is too much ambiguity and room for interpretation in speech to make it a reliable and easy input method for all tasks. Even in your example:
>
> You say: "Find me the best deal on a S6."
> I hear: "Fine me the best teal on A.S. six."
> Computer: "Are you looking for steel?"
>
> Now imagine the extra trouble if you mix languages. Also, how do you include meta-text control sequences in a message? By raising your voice or tilting your head when you say the magic words? Cf.:
>
> "There was this famous quote QUOTE to be or not to be END QUOTE on page six END PARAGRAPH..."
>
> Very awkward, if talking to oneself wasn't awkward already. Therefore I just cannot imagine voice being used anywhere where exact representation is required, especially in programming:
>
> "Define M1 as a function that takes in two arguments. The state of the machine labelled ES and an integer number in range between two and six inclusive labelled X. The result of M1 is a boolean. M1 shall return true if and only if the ES member labelled squat THATS SQUAT WITH A T AT THE END is equal to zero modulo B. OH SHIT IT WAS NOT B BUT X. SCRATCH EVERYTHING."


I don't expect programming will remain so low level in the future. We are still in the infancy of our skills compared with engineering disciplines that have a few centuries of progress behind them.

For me the future lies in something like Wolfram/Mathematica with natural voice processing.


March 21, 2015
On Saturday, 21 March 2015 at 14:07:28 UTC, FG wrote:
> On 2015-03-21 at 06:30, H. S. Teoh via Digitalmars-d wrote:
>> On Sat, Mar 21, 2015 at 04:17:00AM +0000, Joakim via Digitalmars-d wrote:
>> [...]
>>> What I was going to say too, neither CLI nor GUI will win, speech
>>> recognition will replace them both, by providing the best of both.
>>> Rather than writing a script to scrape several shopping websites for
>>> the price of a Galaxy S6, I'll simply tell the intelligent agent on my
>>> computer "Find me the best deal on a S6" and it will go find it.
>>
>> I dunno, I find that I can express myself far more precisely and
>> concisely on the keyboard than I can verbally. Maybe for everyday tasks
>> like shopping for the best deals voice recognition is Good Enough(tm),
>> but for more complex tasks, I have yet to find something more expressive
>> than the keyboard.
>
> "Find me the best deal on a S6" is only a little more complex than "make me a cup of coffee." Fine for doing predefined tasks but questionable as an ubiquitous input method. It's hard enough for mathematicians to dictate a theorem without using any symbolic notation. There is too much ambiguity and room for interpretation in speech to make it a reliable and easy input method for all tasks. Even in your example:
>
> You say: "Find me the best deal on a S6."
> I hear: "Fine me the best teal on A.S. six."
> Computer: "Are you looking for steel?"

Just tried it on Google's voice search; the first time, it thought I said "Find me the best deal on a last sex".  After 3-4 more tries ("a sex," "nsx," etc.) it finally got it right.  But it never messed up anything before "on," only the intentionally difficult S6, which requires context to understand.  Ask that question to the wrong person and they'd have no idea what you meant by S6 either.

My point is that the currently deployed, state-of-the-art systems are already much better than what you'd hear or what you think the computer would guess, and soon they will get that last bit right too.

> Now imagine the extra trouble if you mix languages. Also, how do you include meta-text control sequences in a message? By raising your voice or tilting your head when you say the magic words? Cf.:
>
> "There was this famous quote QUOTE to be or not to be END QUOTE on page six END PARAGRAPH..."

Just read that out normally and it'll be smart enough to know that the upper-case terms you highlighted are punctuation marks and not part of the sentence, by using various grammar and word frequency heuristics.  In the rare occurrence of real ambiguity, you'll be able to step down to a lower-level editing mode and correct it.

Mixing languages is already hellish with keyboards and will be a lot easier with speech recognition.

> Very awkward, if talking to oneself wasn't awkward already.

Put a headset on and speak a bit lower and nobody watching will know what you're saying or who you're saying it to.

> Therefore I just cannot imagine voice being used anywhere where exact representation is required, especially in programming:
>
> "Define M1 as a function that takes in two arguments. The state of the machine labelled ES and an integer number in range between two and six inclusive labelled X. The result of M1 is a boolean. M1 shall return true if and only if the ES member labelled squat THATS SQUAT WITH A T AT THE END is equal to zero modulo B. OH SHIT IT WAS NOT B BUT X. SCRATCH EVERYTHING."

As Paulo alludes to, the current textual representation of programming languages is optimized for keyboard entry.  Programming languages themselves will change to allow fluid speech input.

On Saturday, 21 March 2015 at 15:13:13 UTC, Piotrek wrote:
> Just for fun. A visualization of the problem from 2007 (I doubt there has been a breakthrough since)
>
> https://www.youtube.com/watch?v=MzJ0CytAsec

Got a couple minutes into that before I knew current speech recognition is much better, as it has progressed by leaps and bounds over the intervening eight years.  Doesn't mean it's good enough to throw away your keyboard yet, but it's nowhere near that bad anymore.

On Saturday, 21 March 2015 at 15:47:14 UTC, H. S. Teoh wrote:
> It's about the ability to abstract, which is
> currently missing from today's ubiquitous GUIs. I would willingly leave
> my text-based interfaces behind if you could show me a GUI that gives me
> the same (or better) abstraction power as the expressiveness of a CLI
> script, for example. Contemporary GUIs fail me on the following counts:
>
> 1) Expressiveness: there is no simple way of conveying complex
--snip--
> 5) Precision: Even when working with graphical data, I prefer text-based
> interfaces where practical, not because text is the best way to work
> with them -- it's quite inefficient, in fact -- but because I can
> specify the exact coordinates of object X and the exact displacement(s)
> I desire, rather than fight with the inherently imprecise mouse movement
> and getting myself a wrist aneurysm trying to position object X
> precisely in a GUI. I have yet to see a GUI that allows you to specify
> things in a precise way without essentially dropping back to a
> text-based interface (e.g., an input field that requires you to type in
> numbers... which is actually not a bad solution; many GUIs don't even
> provide that, but instead give you the dreaded slider control which is
> inherently imprecise and extremely cumbersome to use. Or worse, the text
> box with the inconveniently-small 5-pixel up/down arrows that changes
> the value by 0.1 per mouse click, thereby requiring an impractical
> number of clicks to get you to the right value -- if you're really
> unlucky, you can't even type in an explicit number but can only use
> those microscopic arrows to change it).

A lot of this is simply that you are a different kind of computer user than the vast majority of computer users.  You want to drive a Mustang with a manual transmission and a beast of an engine, whereas most computer users are perfectly happy with their Taurus with automatic transmission.  A touch screen or WIMP GUI suits their mundane tasks best, while you need more expressiveness and control so you use the CLI.

The great promise of voice interfaces is that they will _both_ be simple enough for casual users and expressive enough for power users, while being very efficient and powerful for both.  We still have some work to do to get these speech recognition engines there, but once we do, the entire visual interface to your computer will have to be redone to best suit voice input and nobody will use touch, mice, _or_ keyboards after that.
March 21, 2015
On Saturday, 21 March 2015 at 15:51:38 UTC, Paulo Pinto wrote:
> I don't expect programming will remain so low level in the future. We are still in the infancy of our skills compared with engineering disciplines that have a few centuries of progress behind them.
>
> For me the future lies in something like Wolfram/Mathematica with natural voice processing.

People have been saying this for longer than I've been alive.
March 21, 2015
On Saturday, 21 March 2015 at 19:20:18 UTC, deadalnix wrote:
> On Saturday, 21 March 2015 at 15:51:38 UTC, Paulo Pinto wrote:
>> I don't expect programming will remain so low level in the future. We are still in the infancy of our skills compared with engineering disciplines that have a few centuries of progress behind them.
>>
>> For me the future lies in something like Wolfram/Mathematica with natural voice processing.
>
> People have been saying this for longer than I've been alive.

Unless you've been alive for a few centuries, they still could be right. ;)