Mac Speech To Text For Windows

Active 2 years, 2 months ago

Apple's tight integration of POSIX-compliant file paths and a command-line interface, along with its historically strong hardware and manufacturing standards, has kept me on the Mac platform for years. However, Apple's recent lackluster focus on macOS and its hardware has disappointed me, and a Lenovo X1 Yoga (2nd Generation) has caught my attention.

Switching operating systems, however, raises two seemingly insurmountable concerns. This post focuses on text-to-speech (TTS) OS integration.

I've been through the Microsoft Narrator documentation, which I found unhelpful. Granted, my use case isn't related to being visually impaired. One of my use cases is for Narrator to read only selected text, as I outline below. For example, in this 2012 SuperUser post, the questioner has the same issue, and no satisfactory answer was provided.

I also wish to emphasize that 'copy and paste into a third-party TTS application' is unsatisfactory. On my Mac, I can provide an input and get an MP3 TTS file with no user intervention in between, for my #1 scenario below. Apart from the 'say' command, I do this entirely with open-source tools.

I've long taken advantage of Mac's Text-to-Speech integration. I use it in three specific ways, though a combination of the below defines 90% of my use-cases.

  1. Converting reformatted text from emails that I wish to have read to me at a later time
    • My current Mac workflow: I copy the source from my email and run a vim script that strips the HTML, leaving only the text I wish to have read. The script also inserts a 'silence' command, [[slnc 2000]], which helps me identify paragraph breaks when I listen to the result.
    • After text markup is complete, I pass the formatted text through the 'say' command, which creates an AIFF file of the text-to-speech output.
    • Using lame, I then convert this to an MP3 and, using dropcaster, push the MP3s to a static public location where my podcast client can retrieve them.
    • Thanks to bash scripts, the above takes 5 seconds of my time. The last time I switched from Mac to Windows, I dearly missed having this. I've used ReadAloud's TTS software in the past, but it was always more kludgy than the above.
  2. Live proof-reading of emails or documents I'm creating. I find errors more easily when I have my Mac read my written text back to me.
    • Yes, I can copy and paste into Notepad, but that's clumsy. Looking at Narrator's interface, I found it very difficult to figure out how to get Narrator to read selected text across applications, e.g., Outlook, Firefox, Word, and so forth.
  3. Using TTS to read selected browser text on long articles I wish to hear while I perform non-attention-demanding tasks.
    • This is similar to #2. However, if the text being read captures my attention, I might decide it's worth creating a podcast file, and I'll shift to the #1 process.
    • Firefox has a 'reader' mode which largely helps and works well under Windows.
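
Workflow #1 above can be sketched roughly as follows. This is an illustrative Python sketch, not the author's actual vim and bash scripts: the function names, the set of block-level tags, and the file names are assumptions made for the example. It strips HTML, inserts the 'say' silence command between paragraphs, and builds (but does not run) the 'say' and 'lame' command lines.

```python
import re
from html.parser import HTMLParser

# macOS `say` embedded command: pause for 2000 ms between paragraphs.
SILENCE = "[[slnc 2000]]"


class TextExtractor(HTMLParser):
    """Collect text content, treating block-level tags as paragraph breaks."""

    # Assumed set of tags that separate paragraphs in a typical HTML email.
    BLOCK_TAGS = {"p", "div", "li", "h1", "h2", "h3"}

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.BLOCK_TAGS:
            self.parts.append("\n\n")
        elif tag == "br":
            self.parts.append("\n")  # line break, not a new paragraph

    def handle_endtag(self, tag):
        if tag in self.BLOCK_TAGS:
            self.parts.append("\n\n")

    def handle_data(self, data):
        self.parts.append(data)


def html_to_say_markup(html):
    """Strip HTML and join the paragraphs with a silence marker."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    text = "".join(parser.parts)
    # Split on blank lines, then collapse whitespace inside each paragraph.
    paragraphs = [" ".join(p.split()) for p in re.split(r"\n\s*\n", text)]
    return f"\n{SILENCE}\n".join(p for p in paragraphs if p)


def build_mac_commands(text_path, stem):
    """Return the `say` and `lame` command lines as argument lists."""
    return [
        ["say", "-f", text_path, "-o", f"{stem}.aiff"],  # text file -> AIFF
        ["lame", f"{stem}.aiff", f"{stem}.mp3"],         # AIFF -> MP3
    ]
```

On a Mac, the marked-up text would be written to a file and each command passed to subprocess.run(cmd, check=True); publishing the MP3 with dropcaster is left out of the sketch.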

My questions are:

  1. Is there an equivalent way on Win10 to pass a formatted text file to an MS binary for processing, similar to the 'say' command on Mac? I see Docker images that are TTS-specific, though that seems more kludgy.
  2. What is the native way to have Windows 10 Narrator read selected text, in a fashion as straightforward as selecting text in any application, invoking a keyboard command, and having Win10 perform the TTS?

I'm open to there being different but similar ways to do the above. 'Copy and paste into Notepad', however, is a kludge as well. I'm hoping MS did its accessibility homework and deployment as well as Apple has.

Some notes to self as I continue to explore this question

  • There are several python packages that enable TTS within a python script. At first this looked promising, but there are several fatal issues, focusing on the python methods outlined here: https://pythonprogramminglanguage.com/text-to-speech/
    • I had problems installing pyttsx. I have brew-installed py2.7.13 and py3.6.1, and I was unable to install either version using pip or pip3. The original pyttsx is py2, with a fork for py3. This is too bad, as the design calls for the Python module to use the native TTS engine. If pyttsx worked on Python 3 and the project were more active, I'd be more amenable to troubleshooting the module's failure. You can read my comments to a proposed answer here.
    • pyTTS uses Google TTS. This sounds good, but it necessarily requires an internet connection. Since I want to match native TTS capability, this rules the option out.
  • There is a Docker option, https://github.com/parente/espeakbox, which works great, but the voice quality is where TTS was 6+ years ago. While I respect the author's desire to create a performant TTS engine, I love Mac's native TTS and I'd like to be on par with it.
    • Playing with other non-native TTS options, such as Merlin or Festival, I found the TTS quality is not on par with Mac or Windows native TTS.
  • as per Lưu Vĩnh Phúc's suggestion, it does appear easy to automate native Windows TTS, as per this page: https://www.pdq.com/blog/powershell-text-to-speech-examples/. I step closer to a solution.
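
The PowerShell approach from that page can be wrapped so that a text file goes in and a WAV file comes out, much like 'say -f ... -o ...' on the Mac. The sketch below is only an illustration under stated assumptions: it assumes Windows with PowerShell and the .NET System.Speech assembly available, and the helper names and file paths are hypothetical. It builds the command line but does not run it.

```python
def ps_quote(value):
    """Quote a string as a PowerShell single-quoted literal.

    Inside single quotes, PowerShell escapes a quote by doubling it.
    """
    return "'" + value.replace("'", "''") + "'"


def build_windows_tts_command(text_path, wav_path):
    """Return an argument list that speaks a text file into a WAV file.

    Uses the .NET System.Speech.Synthesis.SpeechSynthesizer class, i.e.
    the native Windows voices. Absolute paths are the safer choice here,
    since .NET resolves relative paths against the process directory.
    """
    script = "; ".join([
        "Add-Type -AssemblyName System.Speech",
        "$synth = New-Object System.Speech.Synthesis.SpeechSynthesizer",
        f"$synth.SetOutputToWaveFile({ps_quote(wav_path)})",
        f"$synth.Speak([System.IO.File]::ReadAllText({ps_quote(text_path)}))",
        "$synth.Dispose()",  # flushes and closes the WAV file
    ])
    return ["powershell", "-NoProfile", "-Command", script]
```

On Windows, the returned list could be passed to subprocess.run(cmd, check=True), and the resulting WAV converted to MP3 with lame as in the Mac workflow.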
Screenack

1 Answer

MS Office supported text-to-speech long before it was integrated into Windows (since Vista). As a result, you can always open MS Word and have it read the document to you. Just add the Speak button to the ribbon or Quick Access Toolbar, then select the text and click it, or assign a shortcut to the Speak feature.

Narrator also supports this feature. You just have to check the shortcut list.

Windows 10 supports Scan Mode to help you go faster. It can be toggled with Caps Lock + Spacebar.

However, Narrator doesn't work well with MS Office, so you need to copy the text to an external application. This can be achieved with an AutoHotkey script, which needs to copy the selected text and feed it to a VBS script.

I don't think reading a webpage is any different from reading plain text, but see 'How to use narrator for reading the content of web pages?'

Some other TTS applications on Windows can be found here

The spoken output can be recorded with any number of programs. If you don't want to listen to it and just need to save the output to a file, use stream-mixing software such as GraphStudioNext (included in the K-Lite Codec Pack) and redirect the output stream to a file, converting to MP3 first if needed.

All of the above can be automated with a script. Forget batch files: PowerShell is very powerful and can do anything that can be done with Bash. It can strip formatting from text and edit it, so there's no need for the vim script. There's also vim for Windows. If needed, you can always install Bash on Windows, or Cygwin. GUI automation can also be done with AutoHotkey.

phuclv

Set up Dictation

Choose Apple menu > System Preferences, click Keyboard, then click Dictation. Turn on Dictation and choose from these Dictation options:

  • Choose whether to use Enhanced Dictation, which lets you use dictation when you're not connected to the Internet.
  • Choose your language and dialect. Some languages, such as English, have multiple dialects.
  • Choose the keyboard shortcut you will use to start dictating.
  • Choose your preferred microphone from the pop-up menu below the microphone icon.


In macOS Sierra, you can ask Siri to “turn on Dictation” for you. Siri isn't the same as Dictation, but you can ask Siri to compose short messages, such as email and text messages.

Use Dictation

  1. Go to a document or other text field and place the insertion point where you want your dictated text to appear.
  2. Press the keyboard shortcut for starting dictation, or choose Edit > Start Dictation. The default shortcut is Fn Fn (press the Fn key twice).
    When your Mac is listening, it displays a microphone to the left or right of the page, aligned with the insertion point. If you turn on advanced dictation commands, the microphone appears in the lower-right corner of your screen, and you can drag it to another position. When your Mac can hear you, the input meter inside the microphone rises and falls as you speak.
  3. Speak the words that you want your Mac to type. Dictation learns the characteristics of your voice and adapts to your accent, so the more you use it, the better it understands you. If it doesn't understand you, learn what to do.
  4. To stop dictating, click Done below the microphone icon, press Fn once, or switch to another window.

Speak the following words to enter punctuation or other characters. These may vary by language or dialect.

  • apostrophe '
  • open bracket [
  • close bracket ]
  • open parenthesis (
  • close parenthesis )
  • open brace {
  • close brace }
  • open angle bracket <
  • close angle bracket >
  • colon :
  • comma ,
  • dash -
  • ellipsis …
  • exclamation mark !
  • hyphen -
  • period, point, dot, or full stop .
  • question mark ?
  • quote “
  • end quote ”
  • begin single quote '
  • end single quote '
  • semicolon ;
  • ampersand &
  • asterisk *
  • at sign @
  • backslash \
  • forward slash /
  • caret ^
  • center dot ·
  • large center dot •
  • degree sign °
  • hashtag or pound sign #
  • percent sign %
  • underscore _
  • vertical bar |
  • dollar sign $
  • cent sign ¢
  • pound sterling sign £
  • euro sign €
  • yen sign ¥
  • cross-eyed laughing face XD
  • frowny face :-(
  • smiley face :-)
  • winky face ;-)
  • copyright sign ©
  • registered sign ®
  • trademark sign ™
  • equals sign =
  • greater than sign >
  • less than sign <
  • minus sign -
  • multiplication sign x
  • plus sign +
  • caps on (formats next phrase in title case)
  • caps off (resumes default letter case)
  • all caps (formats next word in ALL CAPS)
  • all caps on (proceeds in ALL CAPS)
  • all caps off (resumes default letter case)
  • new line (adds line break)
  • numeral (formats next phrase as number)
  • roman numeral (formats next phrase as Roman numeral)
  • new paragraph (adds paragraph break)
  • no space on (formats next phrase without spaces)
  • no space off (resumes default spacing)
  • tab key (advances cursor to the next tab stop)


If you turned on Enhanced Dictation, you can also use dictation commands to bold, italicize, underline, select, copy, delete, undo, and perform other actions.

About Enhanced Dictation

Enhanced Dictation is available in OS X Mavericks v10.9 or later. With Enhanced Dictation:

  • You can dictate continuously.
  • You can dictate without being connected to the Internet.
  • Your words might convert to text more quickly.
  • You can use dictation commands to tell your Mac what to do.

Without Enhanced Dictation, your spoken words and certain other data are sent to Apple to be converted into text and help your Mac understand what you mean. As a result, your Mac must be connected to the Internet, your words might not convert to text as quickly, and you can speak for no more than 40 seconds at a time (30 seconds in OS X Yosemite or earlier).

If you're on a business or school network that uses a proxy server, Dictation might not be able to connect to the Internet. Have your network administrator refer to the list of network ports used by Apple software products.

About Dictation and privacy

To learn about Dictation and privacy, choose Apple menu > System Preferences, click Keyboard, click Dictation, then click the About Dictation & Privacy button. At all times, information collected by Apple is treated in accordance with Apple’s Privacy Policy.

Learn more

  • To use dictation on your iPhone, iPad, or iPod touch, tap the microphone on the onscreen keyboard, then speak. Consult your iPhone or iPad user guide for details.
  • If the Slow Keys or Sticky Keys feature is turned on in the Accessibility pane of System Preferences, the default keyboard shortcuts for dictation might not work. If you need to use those accessibility features, create a custom dictation shortcut: Choose Apple menu > System Preferences, click Keyboard, click Dictation, then choose “Customize” from the Shortcut menu.