Thursday, October 20, 2011

E-learning / TTS Text To Speech / Machine Learning ML / python NLTK



http://searchstorage.techtarget.com/definition/How-many-bytes-for
http://fnoschese.wordpress.com/2011/05/10/khan-academy-my-final-remarks/
http://www.hackeducation.com/2011/07/19/the-wrath-against-khan-why-some-educators-are-questioning-khan-academy/
http://code.google.com/p/khanacademy/issues/detail?id=191
Price:
------------
Rackspace price: 10 GB = $2, so 10 MB costs 200 cents / 1000 = 0.2 cents. Each lesson is about 10 MB, so each lesson costs 0.2 cents. Say a user takes 10 lessons per day (100 minutes total): 10 x 0.2 = 2 cents per day.
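The storage arithmetic above can be checked with a few lines of Python. This is only a sketch: the constant names are made up for illustration, and treating 10 GB as 10,000 MB is a simplifying assumption.

```python
# Figures taken from the note above; the names are illustrative only.
PRICE_CENTS_PER_10GB = 200   # 10 GB = $2
MB_PER_10GB = 10_000         # decimal megabytes (assumption)
LESSON_SIZE_MB = 10          # one 10-minute lesson
LESSONS_PER_DAY = 10         # 100 minutes of lessons per day

cents_per_mb = PRICE_CENTS_PER_10GB / MB_PER_10GB
cents_per_lesson = cents_per_mb * LESSON_SIZE_MB
cents_per_day = cents_per_lesson * LESSONS_PER_DAY

print(f"per lesson: {cents_per_lesson:.2f} cents, per day: {cents_per_day:.2f} cents")
```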

Size of a 10 Minute .WAV File for NaturallySpeaking 3

Answer ID 3129 | Published 07/09/2002 12:00 AM | Updated 04/16/2010 04:54 PM
Question: How many megabytes will a 10 minute WAV file for use with NaturallySpeaking be?
Answer: The WAV file for use with NaturallySpeaking needs to be recorded at 11 kHz, 16-bit, mono. At that sampling rate, a WAV file is approximately 1.3 MB per minute. 10 minutes x 1.3 MB would be 13 MB.
An MP3 (music) downloadable file: 2 to 5 MB
10 min MP3 = 10 MB
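The 1.3 MB per minute figure for WAV follows from the recording format. A small sketch, assuming "11 kHz" means the common 11,025 Hz sample rate and ignoring the small WAV file header:

```python
def wav_bytes_per_minute(sample_rate_hz, bits_per_sample, channels):
    """Raw PCM size of one minute of audio, header not included."""
    bytes_per_sample = bits_per_sample // 8
    return sample_rate_hz * bytes_per_sample * channels * 60

# 11,025 Hz is an assumption for the "11 kHz" in the quoted answer.
per_minute = wav_bytes_per_minute(11_025, 16, 1)
ten_minutes_mb = per_minute * 10 / 1_000_000

print(per_minute, "bytes per minute")
print(round(ten_minutes_mb, 2), "MB for 10 minutes")
```

So an uncompressed 10-minute lesson is roughly 13 MB, a bit larger than the 10 MB MP3 estimate.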

1/ Homework Helper

What we need, as a list:


0/ We need to keep it simple at first:
  - just help with homework by looking at problem sheets from school (K-6; collect what schools use)
  - and help with concepts, showing their prerequisites


1/ We scanned the homework papers. They scanned well to PDF and the conversion to text was also good, about 90%. What we learned:
  - Why can't we type the text in the first place, instead of a) scanning to PDF, b) converting to text, c) editing the text to remove errors?
  - Type it (in India) and store it in an XML file so that you can format it whatever way you want. In India they put the typed file in a Dropbox sync folder. Provide a simple UI tool to author/edit the XML document, see here:
  http://www.syntext.com/products/serna-free/


1.2/ You need good content (see the saved Amazon wishlist: 'word problem', 'ace calculus', etc.) so that content is presented in a style many people liked (you know this from the reviews) and is easy.

How to Solve Word Problems in Algebra - we need this book; look inside the book:
   2 more than unknown    -> 2 + x
   5 less than unknown    -> x - 5
  We need to convert every math problem into this form by calling functions, and the result will be used to 'categorize' the given problem. I.e. when you get another problem like this, it will be classified as this type and we draw a picture based on it (using D3).
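A minimal sketch of the phrase-to-expression idea, covering only the two phrases listed above. The pattern table, the category names and the `classify` function are all hypothetical, not taken from the book or from any library.

```python
import re

# Hypothetical pattern table: (category, regex, expression template).
PATTERNS = [
    ("more_than", re.compile(r"(\d+)\s+more\s+than", re.IGNORECASE), "{n} + {var}"),
    ("less_than", re.compile(r"(\d+)\s+less\s+than", re.IGNORECASE), "{var} - {n}"),
]

def classify(phrase, var="x"):
    """Return (category, expression) for a phrase, or (None, None) if unknown."""
    for category, pattern, template in PATTERNS:
        match = pattern.search(phrase)
        if match:
            return category, template.format(n=match.group(1), var=var)
    return None, None

print(classify("2 more than unknown"))
print(classify("5 less than unknown"))
```

The returned category is what would pick the picture to draw; new phrase types would be added as extra rows in the table.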


Get all the Amazon wishlist books (best reviews, 20-50 etc.) and summarize what you want to follow based on a combination of all the books.

2/ We need NLTK and its cookbook.
http://text-processing.com/demo/tokenize/  - use this with the problem below:
 Ed had 22 more marbles than Doug. Doug lost 8 of his marbles at the playground. How many more marbles did Ed have than Doug then?
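A first tokenizing pass over the marble problem could look like the sketch below. It uses NLTK's `wordpunct_tokenize` when NLTK is installed, and otherwise falls back to a plain regex with the same splitting rule so the example still runs; picking out digits and the two names is only an illustrative next step, not a full parser.

```python
import re

try:
    # NLTK's regex-based tokenizer; it needs no extra corpus download.
    from nltk.tokenize import wordpunct_tokenize
except ImportError:
    # Fallback with the same splitting rule, so the sketch runs without NLTK.
    def wordpunct_tokenize(text):
        return re.findall(r"\w+|[^\w\s]+", text)

problem = ("Ed had 22 more marbles than Doug. Doug lost 8 of his marbles "
           "at the playground. How many more marbles did Ed have than Doug then?")

tokens = wordpunct_tokenize(problem)
numbers = [int(tok) for tok in tokens if tok.isdigit()]
names = sorted({tok for tok in tokens if tok in ("Ed", "Doug")})

print(tokens[:6])
print(numbers, names)
```

For this problem the two numbers combine as 22 + 8 = 30: Doug's loss widens the gap.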


book review: Overall the book is easy to read, has a huge set of sample recipes and feels very useful. 

3/ We need the D3 library; use it initially for the prototype to do simple animations (for the demo, HTML5 browsers are fine; as of March 2012 IE9 will be the production release).
3.2/ Use this for math rendering of SQRT etc.

1.2 / Machine Learning ML





1/ PyML: PyML is mostly command-line based (in this respect RapidMiner is much better, everything is done in the GUI).
2/ Orange: Orange seems clean and conceptually easy; see the screenshots.
   Review: Shortest script for doing training, cross validation, algorithms comparison and prediction.
  • I found Orange the easiest tool to learn.


3/ RapidMiner



RapidMiner, R, and Excel were again the most popular tools: http://www.kdnuggets.com/2011/05/tools-used-analytics-data-mining.html
Review: RapidMiner is an open source statistical and data mining package written in Java. This reviewer seems to be a good one, with credentials.

Tutorial videos; videos and answers from the author are at the bottom.
asr: wow, the RapidMiner GUI shows 'Train' and 'Test' windows and compares the predictions of 2 methods.
 - MLpy shows lots of theory; this shows in a GUI how to get results.
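What the 'Train' and 'Test' windows do can be sketched in plain Python: fit two methods on a training set, then compare their accuracy on a held-out test set. This is not RapidMiner's or MLpy's API; the two toy methods and the data below are invented purely for illustration.

```python
from collections import Counter

def majority_predict(train, x):
    """Baseline: always predict the most common training label."""
    labels = [label for _, label in train]
    return Counter(labels).most_common(1)[0][0]

def nearest_neighbor_predict(train, x):
    """1-nearest-neighbor on a single numeric feature."""
    nearest = min(train, key=lambda pair: abs(pair[0] - x))
    return nearest[1]

def accuracy(predict, train, test):
    """Fraction of test examples the method labels correctly."""
    hits = sum(1 for x, label in test if predict(train, x) == label)
    return hits / len(test)

# Invented toy data: (feature, label) pairs.
train = [(1.0, "low"), (2.0, "low"), (3.0, "low"), (8.0, "high"), (9.0, "high")]
test = [(1.5, "low"), (2.5, "low"), (8.5, "high"), (9.5, "high")]

baseline_acc = accuracy(majority_predict, train, test)
nn_acc = accuracy(nearest_neighbor_predict, train, test)
print("baseline:", baseline_acc, "nearest neighbor:", nn_acc)
```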



Good decision chart, all screenshots.
asr: for a starter, RapidMiner seems best; it has a Time Series extension for CL futures data.
 b/ Being Java, you can get Stochastic, MACD and many other indicators.
 c/ MLpy and SciPy may be good, but they do not have as much data mining as RapidMiner.
 d/ RapidMiner has commercial support, so it must be improving year by year.

RapidMiner vs. WEKA

The most similar data mining packages are RapidMiner and WEKA. They have many similarities:
  • Written in Java.
  • Free / open source software with GPL license.
  • RapidMiner includes many learning algorithms from WEKA.
My first thought was that RapidMiner has everything that WEKA has, plus a lot of other functionality, and is more polished.

What kind of technology is Watson based on?
Watson is an application of advanced Natural Language Processing, Information Retrieval, Knowledge Representation and Reasoning, and Machine Learning technologies to the field of open-domain question answering. At its core, Watson is built on IBM's DeepQA technology for hypothesis generation, massive evidence gathering, analysis, and scoring.




Machine learning deals with designing and developing algorithms to evolve behaviors based on empirical data. One key goal of machine learning is to be able to generalize from limited sets of data (paraphrased from [1]). Russell and Norvig [2] lists machine learning as a specific capability, namely the ability to "adapt to new circumstances and to detect and extrapolate patterns".


What is UIMA?
Unstructured Information Management applications are software systems that analyze large volumes of unstructured information in order to discover knowledge that is relevant to an end user. An example UIM application might ingest plain text and identify entities, such as persons, places, organizations; or relations, such as works-for or located-at.


Stanford ML course, complete set of videos:

2/ NLTK  Python Natural Language Processing



asr: summary, I did not see real 'value creation' in parsing Web text and giving info.
 - Yes, for Siri-like voice services you can provide input, but Google/Apple/MS need big 'repositories' to buy, not small ones.
 - Quora is doing this kind of wiki (summary), I guess auto-generated from all user posts for a given topic; look at the answer wiki on Quora.

http://ianozsvald.com/2011/01/30/review-for-python-text-processing-with-nltk-2-0-cookbook-packt-2010/
http://streamhacker.com/2010/12/15/python-text-processing-nltk-book-reviews/

books:
http://www.amazon.com/dp/0596516495/ref=rdr_ext_sb_ti_hist_1
http://www.amazon.com/Python-Text-Processing-NLTK-Cookbook/dp/1849516383/ref=pd_sim_b3
http://www.amazon.com/Programming-Collective-Intelligence-Building-Applications/dp/0596529325/ref=pd_sim_b2





2/ Text-to-Speech vs Human Narration for eLearning



AT&T TTS:
 The rep said $5500 for a Linux server license (for 10 servers, $2500 each); it can support up to 30 to 40 simultaneous users.
 - The minimum requirement is only 250 MB of RAM; I guess with something like 10 GB of RAM you can support 30 simultaneous users.
 - See the control tags, so it can speak math ...
The AT&T Natural Voices TTS engine does a great job of synthesizing most text without special instructions, but there may be special circumstances where you wish to fine-tune the pronunciation of certain words or phrases. The AT&T Natural Voices TTS engine allows users to mark up the text to be spoken to include special control tags that change the way the text is pronounced. The AT&T Natural Voices TTS engine supports a subset of the SSML control tags
-------------
Sphinx4 - CMU voice recognition software, Wiki
 Sphinx-4 is a state-of-the-art speech recognition system written entirely in the Java programming language. It was created via a joint collaboration between the Sphinx group at Carnegie Mellon University, Sun Microsystems Laboratories, Mitsubishi Electric Research Labs (MERL), and Hewlett Packard (HP), with contributions from the University of California at Santa Cruz (UCSC) and the Massachusetts Institute of Technology (MIT).
Sphinx-4 started out as a port of Sphinx-3 to the Java programming language.
Sphinx4 is used by these users
Speech recognition on Android:

-----

product: E-learning of subjects
  - see Khan Academy: our product only works up to K-8
  - model basis: like Khan Academy, 10-minute concepts (otherwise users lose interest, as Khan said)
  - lower production cost because TTS is used to create courses from 'input files'

issues: 1/ For math, 'Square root of 4 = 2' is spoken badly even by AT&T (so we need to find a way to tweak it for math; check with AT&T, talk to the rep).
 2/ See why so many companies failed in this field.
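One possible workaround for issue 1/ is to rewrite math notation into plain words before the text reaches the TTS engine, so no engine-specific control tags are needed. The sketch below is an assumption about how that could look: the function name `speakable` and the handful of symbols it covers are made up, and it has not been tried against the AT&T engine.

```python
import re

def speakable(expr):
    """Rewrite a few math symbols into words a TTS engine can read aloud."""
    # "√4" or "sqrt(4)" becomes "the square root of 4".
    text = re.sub(r"(?:√|sqrt)\s*\(?\s*(\d+)\s*\)?", r"the square root of \1", expr)
    text = text.replace("=", " equals ").replace("+", " plus ")
    # Collapse the extra spaces introduced by the replacements.
    return " ".join(text.split())

print(speakable("√4 = 2"))
print(speakable("sqrt(9) + 1 = 4"))
```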

asr: what is missing in E-learning:
 1/ Audio and video. How do you know that adding them will make it a killer success?
   - see Khan Academy, it is successful because of them
 2/ So why would people come to your site rather than Khan's?
   Khan is good, but human-resource intensive. What if we make a Khan equivalent with
   a) Text To Speech (TTS)   b) interactive video (with visuals, especially for word problems, by categorizing word problems into known algebra problems)
   - Word problems: show the problem and ask the user if he knows how to solve it. If the user says NO, give equations like x + 4 = y and y - 2 = 3 and ask if he can solve them. If he says YES, then show how to convert the word problem into equations.
   - If the answer is NO, then take him to the lesson on how to solve x + 4 = y and y - 2 = 3.
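The question flow for word problems can be written down as a tiny decision function. This is only a sketch of the branching described above; the function name and the returned strings are placeholders.

```python
def next_step(can_solve_word_problem, can_solve_equations):
    """Pick the next screen from the user's two yes/no answers."""
    if can_solve_word_problem:
        return "let the user solve the word problem"
    if can_solve_equations:
        return "show how to convert the word problem into equations"
    return "go to the lesson on solving the equations"

print(next_step(False, True))
print(next_step(False, False))
```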

tech:

subjects: chess, math K-8, specifically 'word problems', probability etc.
  physics, chemistry etc.

--------------
 - Here is a site; they have a ton of customers:
 http://www.dessci.com/en/
 Here is MathPlayer shown in IE (I downloaded the IE plugin; they say it works in Firefox, but not Chrome):
http://www.dessci.com/en/products/mathtype/compare/mathplayer.htm

compare: edHelper is doing simple HTML tables for math; these are meant for printing and doing math
  http://www.edhelper.com/math/math_grade2_review_1.htm
----------------------

Why didn't you use human voice-over?


Results?  Acceptance by Students?

Again, the responses are somewhat self-evident:

A. Yes. The TTS technology coupled with the software allowed us to create e-learning material in about half the time as human voice over. The maintenance of the e-learning material takes 75% less time than maintaining material with human voice over. This allows us to create and maintain material much faster with less resources and without needing specialized resources that have voices specialized for recording.
We have produced courses for 6000 people in the company and we are getting good feedback: 80% are satisfied, 10% love it and 10% feel offended. My conclusion is that the voices are "good enough" for training applications.



http://elearningtech.blogspot.com/2010/09/text-to-speech-vs-human-narration-for.html


His work in social media, e-Learning and Performance Support has won awards and has led him into engagements at many Fortune 500 companies 

Resources:
 - AT&T voices: http://www.wizzardsoftware.com/att_desktop_overview.php
 - FREE (may not be as high quality as the AT&T voices): http://sourceforge.net/projects/freetts/
   FreeTTS is a speech synthesis engine written entirely in the Java(tm) programming language. FreeTTS was written by the Sun Microsystems Laboratories Speech Team and is based on CMU's Flite engine. FreeTTS also includes a partial JSAPI 1.0


