Hello people! Please help me out with this project!!! It'll benefit a lot of people.
Please go straight to 'Current Problem' near the end!
Objective
- To create a FOSS Continuous Speech Recognition engine
- Initially for Linux, ultimately crossplatform
- A smartly designed GUI that finds the optimal balance between voice and hand input
This is a big project. Why am I pushing it? I have severe RSI. I _need_ speech recognition. And I _hate_ Windows. So I'm going to work towards creating it for Linux. I cannot do it on my own. My hands are fucked and besides I am a little stupid on Linux. Please muck in! We need more hands on board!
Why Vista Speech is shit
I hate the Vista speech recognition software. It makes me scream and yell. It was starting to damage my brain. The engine is great, but the interface is unbearable. Commands are intermingled with dictation. You dictate 'Fred was going to close the window'. And your window has gone! Designed by corporate fucking monkeys. I asked the Vista Speech team why it was so shit, and they said 'it's designed as a keyboard/mouse replacement. Not complement. Our specifications are _not_ to make an HCI optimally balanced between voice and hand.' Well, this is my specification.
So, how to make continuous speech for Linux? The FOSS community has CMU-Sphinx. It's the original speech engine. 20+ years ago, DARPA funded a department at CMU (Carnegie Mellon University) to do it. Vista Speech is accurate because they found and hired all the current Sphinx developers. Vista Speech is working off the Sphinx engine. Go to irc.freenode.net#cmusphinx and say hi to dhdfoo or nshm, who seem to be the two most active maintainers right now. They're not always on, just hang awhile. I'm on as ohmu.
So we've got the engine... Why don't we have continuous speech? Because it needs thousands of hours of training data. E.g. you need to record yourself saying 'Mary had a little lamb', and feed it into the database together with the text. Do this for 1000 people at 10 hours each. Now Sphinx can chew on that data and make a decent engine. Nobody's done it. There's a current attempt at www.voxforge.com.
The plan
- Stage 1: Use WINE to get Vista's Speech Engine operating in Linux
- Stage 2: Create a GUI that'll interface with this engine. The GUI will sporadically (unless the user disables the feature) send phrase-data to a central database (say VoxForge - I have contacted the maintainer and he is friendly)
- Stage 3: Once we have enough data, throw out the WINE-wrapped Vista Engine, and replace it with our own FOSS engine.
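To make 'phrase-data' concrete: each sample is just a recording plus the text that was spoken. Here's a minimal sketch of bundling one sample for upload. The function name make_submission, the prompts.txt file and the tarball layout are all made up for illustration; this is not VoxForge's actual submission format, which would have to be agreed with the maintainer.

```shell
# Sketch only: bundle one recording with its transcript into a tarball.
# make_submission, prompts.txt and the directory layout are hypothetical.
make_submission() {
    wav=$1      # path to the recording, e.g. lamb001.wav
    text=$2     # what was said, e.g. "Mary had a little lamb"
    outdir=$3   # where bundles are collected before upload

    id=$(basename "$wav" .wav)
    mkdir -p "$outdir/$id" || return 1
    cp "$wav" "$outdir/$id/" || return 1

    # One line per recording: "<id> <transcript>"
    printf '%s %s\n' "$id" "$text" > "$outdir/$id/prompts.txt"

    # Pack the sample directory; the upload step itself is left out on purpose.
    tar -czf "$outdir/$id.tar.gz" -C "$outdir" "$id"
}
```

Usage would be something like: make_submission lamb001.wav "Mary had a little lamb" ./outbox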
Where are we?
I have to contact the VoxForge maintainer again, and check that it's ok to pipe data to him. Shouldn't be a problem.
I've contacted RMS (Stallman), who says FSF will provide servers as long as they are not tarnished by non-free software. From the VoxForge guy: We may not need this.
I've contacted Nickolai (nshm on irc - Russian Sphinx guru) who is willing to adapt the Sphinx engine to accomplish stage 3.
Current Problem
Right now I need help with stage 1. Stage 1 is transferring Vista's speech recognition engine to Linux via wine. I have
- Run Vista, examined the commandline behind the 'Speech Recognition' icon:
%SystemRoot%\Speech\Common\sapisvr.exe -SpeechUX
(I really just need the engine ported over, but if I can get this working in wine, that's engine + GUI - a good test the engine's ported ok)
- Installed wine
- Run winecfg, set global Applications -> 'Vista'
- Copied relevant folders from VistaBox:
/WINDOWS/Speech/* -> /(wine's C-Drive)/WINDOWS/Speech/*
/WINDOWS/System32/Speech/* -> /(wine's C-Drive)/WINDOWS/System32/Speech/*
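For anyone repeating the copy step, here's a minimal sketch of it as a shell function. The name copy_speech_dirs is made up, and the example paths in the comments (a Vista partition mounted at /mnt/vista, wine's C-Drive at ~/.wine/drive_c) are assumptions; adjust them to your setup. The destination uses lower-case system32 so the files land in the folder wine already created rather than in a second, differently-cased one.

```shell
# Sketch only: copy Vista's two Speech trees into wine's windows directory.
copy_speech_dirs() {
    src_win=$1   # Vista's Windows directory, e.g. /mnt/vista/Windows (assumed mount point)
    dst_win=$2   # wine's windows directory, e.g. "$HOME/.wine/drive_c/windows"

    # "source subfolder:destination subfolder" pairs
    for pair in "Speech:Speech" "System32/Speech:system32/Speech"; do
        src="$src_win/${pair%%:*}"
        dst="$dst_win/${pair##*:}"
        if [ ! -d "$src" ]; then
            echo "missing: $src" >&2
            return 1
        fi
        mkdir -p "$dst" || return 1
        # "src/." copies the folder's contents, not the folder itself
        cp -R "$src/." "$dst/" || return 1
    done
}
```

Usage would be something like: copy_speech_dirs /mnt/vista/Windows "$HOME/.wine/drive_c/windows"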
- spud@spud-laptop:~/.wine/drive_c$ gedit SetPaths.bat
...and put in the following:
set PATH=C:\WINDOWS
set PATH=%PATH%;C:\WINDOWS\system32
set PATH=%PATH%;C:\WINDOWS\system32\Speech\Engines\SR
set PATH=%PATH%;C:\WINDOWS\system32\Speech\Engines\SR\en-US
set PATH=%PATH%;C:\WINDOWS\system32\Speech\SpeechUX
set PATH=%PATH%;C:\WINDOWS\system32\Speech\SpeechUX\en-gb
set PATH=%PATH%;C:\WINDOWS\system32\Speech\SpeechUX\en-us
set PATH=%PATH%;C:\WINDOWS\Speech\Common
set PATH=%PATH%;C:\WINDOWS\Speech\Common\en-US
set PATH=%PATH%;C:\WINDOWS\Speech\Engines\SR
set PATH=%PATH%;C:\WINDOWS\Speech\Engines\SR\en-GB
set PATH=%PATH%;C:\WINDOWS\Speech\Engines\SR\en-US
set PATH=%PATH%;C:\WINDOWS\Speech\Engines\Lexicon\en-GB
set PATH=%PATH%;C:\WINDOWS\Speech\Engines\Lexicon\en-US
echo 'Paths set! Have a look!'
PATH
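Typing all those set PATH lines by hand is easy to get wrong, so here's a minimal sketch that generates the same kind of batch file from whatever folders actually exist under the copied Speech trees. gen_setpaths is a made-up name, and it emits the paths in the case they have on disk (e.g. C:\windows rather than C:\WINDOWS), on the assumption that wine treats Windows paths case-insensitively.

```shell
# Sketch only: print a SetPaths.bat that adds every folder under the given
# roots (relative to wine's C-Drive) to PATH.
gen_setpaths() {
    drive_c=${1%/}   # e.g. "$HOME/.wine/drive_c"
    shift            # remaining args: roots such as windows/Speech

    printf 'set PATH=C:\\windows\n'
    printf 'set PATH=%%PATH%%;C:\\windows\\system32\n'

    for root in "$@"; do
        find "$drive_c/$root" -type d | LC_ALL=C sort | while IFS= read -r dir; do
            rel=${dir#"$drive_c"/}                        # strip the Unix prefix
            win=$(printf '%s' "$rel" | tr '/' '\\')       # slashes to backslashes
            printf 'set PATH=%%PATH%%;C:\\%s\n' "$win"
        done
    done

    printf 'PATH\n'   # cmd prints the resulting PATH, as in the hand-written file
}
```

Usage would be something like: gen_setpaths "$HOME/.wine/drive_c" windows/Speech windows/system32/Speech > "$HOME/.wine/drive_c/SetPaths.bat"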
- Launch a DOS prompt
spud@spud-laptop:~/.wine/drive_c$ wine cmd
CMD Version 1.0
C:\>SetPaths
'Path set to:'
PATH=C:\WINDOWS;C:\WINDOWS\system32;C:\WINDOWS\system32\Speech\Engines\SR;C:\WINDOWS\system32\Speech\Engines\SR\en-US;C:\WINDOWS\system32\Speech\SpeechUX;C:\WINDOWS\system32\Speech\SpeechUX\en-gb;C:\WINDOWS\system32\Speech\SpeechUX\en-us;C:\WINDOWS\Speech\Common;C:\WINDOWS\Speech\Common\en-US;C:\WINDOWS\Speech\Engines\SR;C:\WINDOWS\Speech\Engines\SR\en-GB;C:\WINDOWS\Speech\Engines\SR\en-US;C:\WINDOWS\Speech\Engines\Lexicon\en-GB;C:\WINDOWS\Speech\Engines\Lexicon\en-US
- C:\>sapisvr -fish
C:\>fixme:heap:HeapSetInformation (nil) 1 (nil) 0
err:ole:CoUninitialize Mismatched CoUninitialize
Good! Should fail - param is wrong.
- C:\>sapisvr -SpeechUX
C:\>fixme:heap:HeapSetInformation (nil) 1 (nil) 0
err:ole:CoGetClassObject class {1b2afb92-0b5e-4a30-b5cc-353db4f9e150} not registered
err:ole:CoGetClassObject class {1b2afb92-0b5e-4a30-b5cc-353db4f9e150} not registered
err:ole:create_server class {1b2afb92-0b5e-4a30-b5cc-353db4f9e150} not registered
fixme:ole:CoGetClassObject CLSCTX_REMOTE_SERVER not supported
err:ole:CoGetClassObject no class object {1b2afb92-0b5e-4a30-b5cc-353db4f9e150} could be created for context 0x17
- Googling {1b2afb92-0b5e-4a30-b5cc-353db4f9e150} gives
SpSapiServer Class
C:\Program Files\Common Files\Microsoft Shared\Speech\sapi.dll
thx to http://www.myplugins.info/guids/guid.php?guid=1B
There's no \Speech in \Microsoft Shared\, but there's a sapi.dll in the files I copied:
c:/windows/system32/Speech/Common/sapi.dll
so I guess I should register it.
spud@spud-laptop:~/.wine/drive_c$ wine regsvr32 c:/windows/system32/Speech/Common/sapi.dll
fixme:advapi:RegisterTraceGuidsW 0x34b0f75c 0x34b952d8 0x34ae1c2c 1 0x32f934 (null) (null) 0x34b952e0
Successfully registered DLL c:/windows/system32/Speech/Common/sapi.dll
Now try again:
spud@spud-laptop:~/.wine/drive_c$ wine ./windows/Speech/Common/sapisvr -SpeechUX
fixme:heap:HeapSetInformation (nil) 1 (nil) 0
fixme:advapi:RegisterTraceGuidsW 0x34b0f75c 0x34b952d8 0x34ae1c2c 1 0x32f4e8 (null) (null) 0x34b952e0
err:ntdll:NtQueryInformationToken Unhandled Token Information class 26!
fixme:ole:CoCreateInstance no instance created for interface {31e99ed0-6ad8-431b-ae3c-652d9e8c7832} of class {1b2afb92-0b5e-4a30-b5cc-353db4f9e150}, hres is 0x80070001
- Seems to be looking better. No idea how to proceed from here tho! I tried registering all the DLLs I imported, but I've got tangled and I'm not sure it's even the way to go. 3 succeed, several fail. Here's the sticking point:
spud@spud-laptop:~/.wine/drive_c$ wine cmd
CMD Version 1.0
C:\>regsvr32 c:/windows/system32/Speech/Common/sapi.dll
fixme:advapi:RegisterTraceGuidsW 0x34b0f75c 0x34b952d8 0x34ae1c2c 1 0x33f934 (null) (null) 0x34b952e0
Successfully registered DLL c:/windows/system32/Speech/Common/sapi.dll
C:\>regsvr32 c:/windows/system32/Speech/SpeechUX/speechuxcpl.dll
err:module:import_dll Library msvcrt.dll (which is needed by L"C:\\windows\\system32\\Speech\\SpeechUX\\speechuxcpl.dll") not found
err:module:import_dll Library msvcrt.dll (which is needed by L"C:\\windows\\system32\\DUser.dll") not found
err:module:import_dll Library DUser.dll (which is needed by L"C:\\windows\\system32\\Speech\\SpeechUX\\speechuxcpl.dll") not found
Failed to load DLL c:/windows/system32/Speech/SpeechUX/speechuxcpl.dll
C:\>regsvr32 c:/windows/system32/Speech/SpeechUX/SpeechUXPS.dll
err:module:import_dll Library msvcrt.dll (which is needed by L"C:\\windows\\system32\\Speech\\SpeechUX\\SpeechUXPS.dll") not found
Failed to load DLL c:/windows/system32/Speech/SpeechUX/SpeechUXPS.dll
C:\>regsvr32 c:/windows/Speech/Common/sqmapi.dll
wine: Call from 0x70d22a1e to unimplemented function KERNEL32.dll.InitializeCriticalSectionEx, aborting
fixme:ntdll:RtlNtStatusToDosErrorNoTeb no mapping for 80000100
Failed to load DLL c:/windows/Speech/Common/sqmapi.dll
C:\>regsvr32 c:/windows/Speech/Common/DUser.dll
wine: Call from 0x70d22a1e to unimplemented function KERNEL32.dll.InitializeCriticalSectionEx, aborting
fixme:ntdll:RtlNtStatusToDosErrorNoTeb no mapping for 80000100
Failed to load DLL c:/windows/Speech/Common/DUser.dll
C:\>regsvr32 c:/windows/system32/Speech/SpeechUX/en-gb/SpeechUXres.dll
DllRegisterServer not implemented in DLL c:/windows/system32/Speech/SpeechUX/en-gb/SpeechUXres.dll
C:\>regsvr32 c:/windows/system32/Speech/SpeechUX/en-us/SpeechUXres.dll
DllRegisterServer not implemented in DLL c:/windows/system32/Speech/SpeechUX/en-us/SpeechUXres.dll
C:\>regsvr32 c:/windows/system32/Speech/Engines/SR/spsreng.dll
fixme:heap:HeapSetInformation 0x560000 1 (nil) 0
Failed to register DLL c:/windows/system32/Speech/Engines/SR/spsreng.dll
C:\>regsvr32 c:/windows/system32/Speech/Engines/SR/spsrx.dll
fixme:heap:HeapSetInformation 0x560000 1 (nil) 0
Failed to register DLL c:/windows/system32/Speech/Engines/SR/spsrx.dll
C:\>regsvr32 c:/windows/system32/Speech/Engines/SR/srloc.dll
fixme:heap:HeapSetInformation 0x560000 1 (nil) 0
Failed to register DLL c:/windows/system32/Speech/Engines/SR/srloc.dll
C:\>regsvr32 c:/windows/system32/Speech/SpeechUX/SpeechUX.dll
fixme:advapi:RegisterTraceGuidsW 0x6cd15f38 0x6cd20180 0x6cd019f4 1 0x32f8d0 (null) (null) 0x6cd20188
fixme:advapi:RegisterTraceGuidsA 0x6ec16eb9 0x6ec265e8 0x6ec026b0 1 0x32f8cc (null) (null) 0x6ec265f0
fixme:advapi:RegisterTraceGuidsA 0x6ec16eb9 0x6ec26608 0x6ec026c0 1 0x32f8cc (null) (null) 0x6ec26610
wine: Call from 0x4b4775ab to unimplemented function USER32.dll.ChangeWindowMessageFilter, aborting
wine: Call from 0x4b46d706 to unimplemented function msvcrt.dll._except_handler4_common, aborting
:
(this line about 700 times)
:
wine: Call from 0x4b46d706 to unimplemented function msvcrt.dll._except_handler4_common, aborting
wine: Call from 0x4b46d706 to unimplemented function msvcrt.dll._except_handler4_common, aborting
err:seh:setup_exception_record stack overflow 1968 bytes in thread 0027 eip b7d271e3 esp 00230b80 stack 0x230000-0x231000-0x330000
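To avoid getting tangled when registering DLLs one at a time, here's a minimal sketch that tries every DLL under a folder and writes one OK/FAIL line per file. register_all and the REGSVR variable are made-up names. It calls regsvr32 from inside each DLL's own folder with a ./ path, the same form as the 'wine regsvr32 ./SpeechUX.dll' call used in this post, and it counts a DLL as registered only if the output contains the 'Successfully registered' message seen above.

```shell
# Command used to register one DLL; override REGSVR to dry-run with a stub.
REGSVR=${REGSVR:-"wine regsvr32"}

# Sketch only: try to register every *.dll under $1, log results to $2.
register_all() {
    dir=$1
    log=$2
    : > "$log"   # start with an empty log

    find "$dir" -type f -iname '*.dll' | LC_ALL=C sort | while IFS= read -r dll; do
        # Run from the DLL's folder; stdin is closed so the command cannot
        # swallow the file list feeding this loop.
        out=$(cd "$(dirname "$dll")" && $REGSVR "./$(basename "$dll")" 2>&1 </dev/null)
        case $out in
            *"Successfully registered"*) status=OK ;;
            *) status=FAIL ;;
        esac
        printf '%-4s %s\n' "$status" "$dll" >> "$log"
    done
}
```

Usage would be something like: register_all "$HOME/.wine/drive_c/windows/system32/Speech" register.log, then grep FAIL register.log to see what is left.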
This is becoming a hydra. First time round it complained about sqmapi.dll and DUser.dll not being present. So I've copied them across from Vista's /system32. I've placed a copy in wine's /system32 as well as in the folder sapisvr resides in. Yet on registering speechuxcpl.dll it's still complaining msvcrt.dll and DUser.dll cannot be found. msvcrt.dll is there! And I have copied DUser.dll there! wtf?
As for SpeechUX.dll - I reckon I really need to get this registered, as the command line I'm trying to execute is 'sapisvr -SpeechUX'. It is complaining about msvcrt.dll. So I'm copying a native one over from Vista's /system32 into the same folder as sapisvr.exe. I go to winecfg global app settings and add it as a native dll.
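For one-off experiments there is an alternative to the winecfg setting: wine also reads DLL overrides from the WINEDLLOVERRIDES environment variable. A minimal sketch follows; with_native is a made-up helper name, and "n,b" means try the native DLL first, then wine's builtin. Whether running a native Vista msvcrt.dll is a good idea at all is an open question, so treat this as a way to test, not a fix.

```shell
# Sketch only: run one command with native-then-builtin overrides for the
# given DLLs. The override applies to that command only.
with_native() {
    dlls=$1   # comma-separated DLL names without ".dll", e.g. "msvcrt,duser"
    shift
    WINEDLLOVERRIDES="$dlls=n,b" "$@"
}
```

Usage would be something like: with_native msvcrt wine regsvr32 ./SpeechUX.dll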
OK just realized I have to exit and reenter the DosShell to effect settings from winecfg. Here's the new output.
C:\> regsvr32 c:/windows/system32/Speech/SpeechUX/SpeechUX.dll
err:module:import_dll Library msvcrt.dll (which is needed by L"C:\\windows\\system32\\Speech\\SpeechUX\\SpeechUX.dll") not found
err:module:import_dll Library msvcrt.dll (which is needed by L"C:\\windows\\system32\\sqmapi.dll") not found
err:module:import_dll Library sqmapi.dll (which is needed by L"C:\\windows\\system32\\Speech\\SpeechUX\\SpeechUX.dll") not found
err:module:import_dll Library msvcrt.dll (which is needed by L"C:\\windows\\system32\\DUser.dll") not found
err:module:import_dll Library DUser.dll (which is needed by L"C:\\windows\\system32\\Speech\\SpeechUX\\SpeechUX.dll") not found
Failed to load DLL c:/windows/system32/Speech/SpeechUX/SpeechUX.dll
Maybe permissions are the problem?
spud@spud-laptop:~/.wine/drive_c/windows/system32$ ls -l sqmapi.dll
-rw-r--r-- 1 spud spud 134144 2008-11-02 22:06 sqmapi.dll
spud@spud-laptop:~/.wine/drive_c/windows/system32$ chmod a+r sqmapi.dll
I fixed this with all other copied files. No luck. Same readout. Stuck.
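To rule out the boring causes in one go (file not where I think it is, or not readable), here's a minimal sketch that looks for a DLL by name, ignoring case, in a list of folders. find_dll is a made-up name. It can only show that the file exists and is readable; it says nothing about why wine reports 'not found' when the file is there.

```shell
# Sketch only: report where a DLL exists (case-insensitively) in the given
# folders and whether it is readable. Returns non-zero if it is nowhere.
find_dll() {
    name=$1
    shift   # remaining args: folders to search (not recursive)

    hits=$(find "$@" -maxdepth 1 -type f -iname "$name" 2>/dev/null)
    if [ -z "$hits" ]; then
        echo "$name: not found"
        return 1
    fi

    printf '%s\n' "$hits" | while IFS= read -r f; do
        if [ -r "$f" ]; then
            echo "$name: $f (readable)"
        else
            echo "$name: $f (NOT readable)"
        fi
    done
}
```

Usage would be something like: find_dll duser.dll ~/.wine/drive_c/windows/system32 ~/.wine/drive_c/windows/Speech/Common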
- OK, today I found out my wine is 1.0, so I upgraded. Now on 1.1.7. Slightly different error:
spud@spud-laptop:~/.wine/drive_c/windows/system32/Speech/SpeechUX$ wine regsvr32 ./SpeechUX.dll
wine: Call from 0x70d22a1e to unimplemented function KERNEL32.dll.InitializeCriticalSectionEx, aborting
fixme:ntdll:RtlNtStatusToDosErrorNoTeb no mapping for 80000100
Failed to load DLL ./SpeechUX.dll
Then I upgraded wine to 1.1.7 and tried again
spud@spud-laptop:~/.wine/drive_c/windows/system32/Speech/SpeechUX$ wine regsvr32 ./SpeechUX.dll
fixme:ntdll:RtlInitializeCriticalSectionEx (0x70dae0d0,4000,0x04000000) semi-stub
fixme:ntdll:RtlInitializeCriticalSectionEx (0x70dae1c0,4000,0x04000000) semi-stub
fixme:ntdll:RtlInitializeCriticalSectionEx (0x70dae2f0,4000,0x04000000) semi-stub
fixme:ntdll:RtlInitializeCriticalSectionEx (0x70dae498,4000,0x04000000) semi-stub
fixme:ntdll:RtlInitializeCriticalSectionEx (0x70dae2c8,4000,0x04000000) semi-stub
fixme:ntdll:RtlInitializeCriticalSectionEx (0x70dae1d8,4000,0x04000000) semi-stub
fixme:ntdll:RtlInitializeCriticalSectionEx (0x70dae1f0,4000,0x04000000) semi-stub
fixme:ntdll:RtlInitializeCriticalSectionEx (0x70dae130,4000,0x04000000) semi-stub
fixme:ntdll:RtlInitializeCriticalSectionEx (0x70dae288,4000,0x04000000) semi-stub
fixme:ntdll:RtlInitializeCriticalSectionEx (0x70dae070,4000,0x04000000) semi-stub
fixme:ntdll:RtlInitializeCriticalSectionEx (0x70dae1a0,4000,0x04000000) semi-stub
fixme:ntdll:RtlInitializeCriticalSectionEx (0x70dae398,4000,0x04000000) semi-stub
fixme:ntdll:RtlInitializeCriticalSectionEx (0x70dae148,4000,0x04000000) semi-stub
fixme:ntdll:RtlInitializeCriticalSectionEx (0x70dae3b0,4000,0x04000000) semi-stub
fixme:advapi:RegisterTraceGuidsW 0x6cd15f38 0x6cd20180 0x6cd019f4 1 0x32f8d0 (null) (null) 0x6cd20188
fixme:advapi:RegisterTraceGuidsA 0x6ec16eb9 0x6ec265e8 0x6ec026b0 1 0x32f8cc (null) (null) 0x6ec265f0
fixme:advapi:RegisterTraceGuidsA 0x6ec16eb9 0x6ec26608 0x6ec026c0 1 0x32f8cc (null) (null) 0x6ec26610
wine: Call from 0x4b4775ab to unimplemented function USER32.dll.ChangeWindowMessageFilter, aborting
fixme:ntdll:RtlNtStatusToDosErrorNoTeb no mapping for 80000100
Failed to load DLL ./SpeechUX.dll
Looked up ChangeWindowMessageFilter on MSDN. No way am I gonna be able to implement that. This is the realm of upper-echelon wine core devs. Dayum.
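Since the logs are long and repetitive, here's a minimal sketch that boils a saved wine log down to what is actually missing: unimplemented functions, libraries reported as not found, and unregistered classes. summarize_wine_log is a made-up name, and the patterns are written only against the message formats that appear in the output pasted in this post, so other wine versions may word things differently.

```shell
# Sketch only: summarise a saved wine log. Counts come from "uniq -c".
summarize_wine_log() {
    log=$1

    echo "Unimplemented functions:"
    sed -n 's/.*unimplemented function \([^,]*\),.*/\1/p' "$log" | sort | uniq -c

    echo "Libraries reported as not found:"
    sed -n 's/.*import_dll Library \([^ ]*\) .*not found.*/\1/p' "$log" | sort | uniq -c

    echo "Unregistered classes:"
    sed -n 's/.*class \([{][-0-9a-fA-F]*[}]\) not registered.*/\1/p' "$log" | sort -u
}
```

Usage would be something like: wine regsvr32 ./SpeechUX.dll > wine.log 2>&1, then summarize_wine_log wine.log. That gives a short list to take to the wine devs instead of 700 copies of the same line.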
Can anyone help me get to the next level?
Sam
PS If you're interested in taking over this project, joining in or helping, many thanks! Please find me in #cmusphinx on freenode, or email me at sunfish7@gmail.com
Peace out
Sam