Wednesday, November 16, 2011

Machine Learning (ML): Development of a Forecasting System for CL


0/ Sources
1/ Potential Collaborators
2/ My Learning Path: How I learned practical ML with Orange

0/ Sources

These slides cover two parts: 1) Data Mining 2) Machine Learning

Machine Learning Part Slides:
b/  Orange
c/  Association rule learning: analyzing and presenting strong rules discovered in databases using different measures of interestingness. Different methods:
 - Association Rules Basics (see all lessons 1 to 14; these are great slides)
 - Lecture 05: Association Rules Advanced Topics
 - Apriori algorithm: Apriori[6] is the best-known algorithm to mine association rules. It uses a breadth-first search strategy to count the support of itemsets and uses a candidate generation function which exploits the downward closure property of support.
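
To make the Apriori description concrete, here is a minimal pure-Python sketch of the level-wise (breadth-first) search with pruning by downward closure. It is not taken from the slides or from Orange; the function name `apriori`, the toy `baskets` data, and the 0.6 support threshold are all assumptions made for illustration.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset(itemset): support} for every frequent itemset.

    Hypothetical helper for illustration; support is the fraction of
    transactions that contain the itemset.
    """
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        return sum(itemset <= t for t in transactions) / n

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    current = {}
    for item in items:
        s = support(frozenset([item]))
        if s >= min_support:
            current[frozenset([item])] = s

    frequent = dict(current)
    k = 2
    while current:  # breadth-first: finish level k-1 before building level k
        prev = set(current)
        # Candidate generation: join frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Downward closure: drop a candidate if any (k-1)-subset is infrequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in prev for sub in combinations(c, k - 1))
        }
        # Count support only for the surviving candidates.
        current = {}
        for c in candidates:
            s = support(c)
            if s >= min_support:
                current[c] = s
        frequent.update(current)
        k += 1
    return frequent


# Toy market-basket data (made up for this example).
baskets = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]
frequent = apriori(baskets, min_support=0.6)

# One measure of interestingness: confidence(diapers -> beer)
# = support({diapers, beer}) / support({diapers}).
confidence = (frequent[frozenset({"diapers", "beer"})]
              / frequent[frozenset({"diapers"})])
```

On this toy data the pair {bread, beer} falls below the threshold, so no three-item candidate containing it is ever counted; that is the saving the downward closure property buys.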


Data Mining Part: 

Machine Learning and Data Mining: 15 Data Exploration and Preparation



Clustering: 


1/ Potential Collaborators:

(For an ML-based forecasting system, what I bring to the table: Vantage Point forecast data, Orange coding)
1/ Vatsals Shaw: http://www.vatsals.com/Work.aspx (see the ML stock prediction paper), http://www.linkedin.com/in/vatsals (under Interests: Algorithmic Trading Systems, Recommendation Engines)

2/ Sami Badawi (author of an open source software project called ShapeLogic)

Experts:
1/ Dr. Jochen L. Leidner's Blog: Reuters Research Dept.; endorsing Orange

2/ Microsoft ML team: Time Series Foundation (asr: interesting work; we can study it to learn the methodology)

Time Series Foundation (TSF) is a .NET toolset for exploring new algorithms in time series analysis and forecasting (see the .docx on this page for details of what this .NET toolset provides).

Time Series Foundation (TSF) is an open, .NET platform for exploring and prototyping new algorithms in time series analysis and forecasting. TSF is based on state space model methodology that includes all types of exponential smoothing, some autoregressive algorithms, and innovative algorithms for event detection and calendar event impact prediction. 
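
As a rough illustration of the exponential smoothing family mentioned above, here is a minimal Python sketch of simple exponential smoothing written in its error-correction (state space) form. This is not TSF code and does not use the TSF API, which is .NET; the function name, the `alpha` value, and the sample prices are assumptions made for this example.

```python
def simple_exponential_smoothing(series, alpha):
    """One-step-ahead forecasts from simple exponential smoothing.

    Error-correction form (hypothetical helper, not part of TSF):
        level_t = level_{t-1} + alpha * (y_t - level_{t-1})
    Returns (fitted, next_forecast), where fitted[i] is the forecast of
    series[i] made before observing it.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    level = float(series[0])  # initialise the state with the first observation
    fitted = []
    for y in series:
        fitted.append(level)            # forecast made before seeing y
        error = y - level               # one-step-ahead forecast error
        level = level + alpha * error   # state update
    return fitted, level


# Made-up closing prices, only to show the call.
prices = [10.0, 12.0, 11.0, 13.0]
fitted, next_forecast = simple_exponential_smoothing(prices, alpha=0.5)
```

A larger `alpha` makes the level follow recent observations more closely; a smaller one smooths more heavily.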
http://research.microsoft.com/en-us/um/people/alexeib/ 

Time Series Foundation (TSF) is about three things:
 - forecasting future behavior
 - detecting the impact of external events (for CL: tensions with Iran, pipeline break, Gulf Coast hurricane, etc.)
 - predicting the impact of future calendar events (for CL: contract expiry, long weekend, Fed meeting days)


3/ Time series clustering/classification techniques use some model information about the time series, which comes from the fact that time series data values are usually correlated. For example, the time series-specific techniques discussed in the book include time series clustering using Hidden Markov Models, clustering of ARIMA time series, and entropy-based time series classification. The book also discusses general (non-time series) classification/clustering techniques, such as decision trees and support vector machines. The reason for the inclusion of these techniques is that many time series can be summarized using global characteristics, such as trend, seasonality, skewness, etc., in which case one can use a non-time series classification/clustering technique.
http://www.cs.berkeley.edu/~pliang/cs294-spring08/lectures/time/slides.pdf  (a good one on time series)
http://timemachine.iic.harvard.edu/site_media/i/umaatutorial.pdf  (another of the best on time series)
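
The idea of summarizing a series by global characteristics and then applying a general-purpose classifier can be sketched as follows. This is an assumption-laden illustration, not the book's method: the feature definitions (least-squares slope for trend, autocorrelation at the seasonal lag for seasonality, third standardised moment for skewness) are one common choice among many, and a 1-nearest-neighbour rule stands in for the decision trees or support vector machines mentioned above to keep the example dependency-free. All names and data are hypothetical.

```python
import math


def global_features(series, season_length):
    """Summarise a (non-constant) series as (trend, seasonality, skewness)."""
    n = len(series)
    mean = sum(series) / n
    # Trend: least-squares slope of the series against its time index.
    t_mean = (n - 1) / 2
    slope = (sum((t - t_mean) * (y - mean) for t, y in enumerate(series))
             / sum((t - t_mean) ** 2 for t in range(n)))
    # Seasonality proxy: autocorrelation at the seasonal lag.
    var = sum((y - mean) ** 2 for y in series)
    acf = (sum((series[t] - mean) * (series[t - season_length] - mean)
               for t in range(season_length, n)) / var)
    # Skewness: third standardised moment.
    std = math.sqrt(var / n)
    skew = sum(((y - mean) / std) ** 3 for y in series) / n
    return (slope, acf, skew)


def nearest_label(features, labelled):
    """1-nearest-neighbour on raw feature vectors (no scaling; a sketch only)."""
    return min(labelled, key=lambda item: math.dist(features, item[0]))[1]


# Two labelled reference series (synthetic data).
trending = [float(t) for t in range(24)]
seasonal = [math.sin(2 * math.pi * t / 4) for t in range(24)]
labelled = [
    (global_features(trending, 4), "trending"),
    (global_features(seasonal, 4), "seasonal"),
]

# Unseen series are classified from their features, not their raw values.
query_a = [0.8 * t + 2.0 for t in range(24)]
query_b = [5.0 + 3.0 * math.sin(2 * math.pi * t / 4) for t in range(24)]
label_a = nearest_label(global_features(query_a, 4), labelled)
label_b = nearest_label(global_features(query_b, 4), labelled)
```

Because the classifier only sees the three-number feature vector, any non-time series method could replace the nearest-neighbour step; in practice the features would need scaling first.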

2/ My Learning Path: How I learned practical ML with Orange
____________________________________________________________________________________
I hope this path shows my 'discovery process' and will be useful for finding future subjects ...

1/ Initially looked at Python NLTK articles (in fact, looked at the NLTK book on Amazon a year ago)
2/ 9/2011: looked at NLTK for an e-learning project (how to produce similar sets of math questions)
3/ Looked inside the chapters of the WEKA book on Amazon and got a practical idea of ML from its worked examples (first thanks go to this one)
4/ Googled 'machine learning tutorial' and found Andrew Moore's PDF lecture notes (eye-opening; understood the theory)
5/ Looked at RapidMiner on Sami Badawi's blog, and also at his comments on Orange
6/ Downloaded Orange and played with it; got the complete idea (wow ...)











