Finished Machine Learning Course

I took a couple of ML-related courses at university and have used ML in practice, but it was still very interesting to review and refresh the core concepts.

Especially from such a badass ML expert as Andrew Ng, with lots of practical examples and advice.

My Notes and the Course by Andrew Ng.

Machine Learning Notes

  • Supervised / Unsupervised Learning.
  • Linear Regression, Logistic Regression, SVM.
  • Neural Networks.
  • K-means, PCA, Anomaly Detection.
  • Recommender Systems, Collaborative Filtering.
  • Bias / Variance, Regularization, Precision / Recall, Learning Curves, etc.
  • How to gather, amplify and generate training data.

Read more...

Countries with the best medicine

How can we decide whether medicine is good or bad? A couple of years ago I found some very interesting statistics, but only recently had time to investigate them in detail.

Countries with people having the longest life

So, we can roughly guess that if lifespan is long, then the medicine is probably good too. Also, since lifespan is affected by lots of other variables, like climate, food, ecology, lifestyle, income and so on, we can conclude that those are probably good too.

But the fact that lifespan is a sum of many variables has drawbacks:

Read more...

SQL for Data Analyst

These days lots of data is available in digital form, and the ability to analyze that data and get meaning from it is becoming more important. Usually such a job is called Data Analysis or Data Mining, and the person who does it is called a Data Analyst.

Actually, I wrote this article for my brother (who's an Analyst and asked me about SQL) and decided to publish it because it may also be useful to others.

The main skills of an Analyst are Mathematics, Statistics and Domain expertise. But in order to apply those skills, the Analyst must be able to get the data itself. The most widely supported way to access data is SQL, and an Analyst can benefit greatly from knowing its basics.

SQL is a declarative language for querying and transforming data stored in a relational database.

  • Declarative - it means that you declare what you want without explicitly telling how to do it, and the database figures out by itself how to fulfill your request in the best way. That's a good thing, because describing what you want is usually much simpler than explaining how to get it.
  • Relational database - a special type of database (the most widely used type) that stores data as rows (also called records) in tables.
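To make the "declarative" idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The orders table, its columns and the sample rows are hypothetical, made up only for illustration. The query states what is wanted (the total amount per country) and leaves it to the database to decide how to scan and group the rows.

```python
import sqlite3

def total_by_country(rows):
    """Load (country, amount) rows into a throwaway table and sum them per country."""
    conn = sqlite3.connect(":memory:")  # in-memory database, nothing is written to disk
    # Hypothetical table, used only for this illustration.
    conn.execute("CREATE TABLE orders (country TEXT, amount REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
    # Declarative part: we say WHAT we want, not HOW to loop over the rows.
    result = conn.execute(
        "SELECT country, SUM(amount) FROM orders "
        "GROUP BY country ORDER BY country"
    ).fetchall()
    conn.close()
    return result

sample = [("DE", 10.0), ("US", 5.0), ("DE", 2.5)]
totals = total_by_country(sample)
print(totals)
```

Each row of the result is one record: a country and the sum of its amounts.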

Read more...

Hadoop The Definitive Guide

If there’s a common theme, it is about raising the level of abstraction—to create building blocks for programmers who just happen to have lots of data to store, or lots of data to analyze, or lots of machines to coordinate, and who don’t have the time, the skill, or the inclination to become distributed systems experts to build the infrastructure to handle it.

This, in a nutshell, is what Hadoop provides: a reliable shared storage and analysis system. The storage is provided by HDFS, and analysis by MapReduce. There are other parts to Hadoop, but these capabilities are its kernel.
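As a rough illustration of the MapReduce idea mentioned above, here is a tiny word count in plain Python. It is only a sketch of the programming model (map, group by key, reduce) running in one process; it does not use Hadoop or HDFS, and all function names are made up for this example.

```python
from collections import defaultdict

def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in a line."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle(pairs):
    """Group all emitted values by key, as the framework would do between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce step: combine all values collected for one key."""
    return key, sum(values)

def word_count(lines):
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())

counts = word_count(["big data", "big machines"])
print(counts)
```

In a real cluster the map and reduce steps would run on many machines and the framework would handle the grouping; here everything happens in memory just to show the shape of the computation.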

Read more...

Code Statistics and the Power Law

Last week I used the Code statistics analyzer to examine my own projects and some others.

Complexity of Rails libraries

Ruby on Rails

Comparison of some Open Source projects

Read more...

Interesting resources about AI

Detect whether a domain name was generated by a human or a robot http://nbviewer.ipython.org/github/ClickSecurity/data_hacking/blob/master/dga_detection/DGA_Domain_Detection.ipynb

Some learning resources, simple approach, lots of interesting examples https://github.com/hangtwenty/dive-into-machine-learning

Podcast about AI http://www.thetalkingmachines.com/blog/

Infographics http://www.randalolson.com/blog/page/4/

Collection of easy and practical approaches http://www.igvita.com/2011/04/20/intuition-data-driven-machine-learning and in the body of that article there is also a link to another interesting presentation from Google about its machine translator, don't miss it.

Foundations of Intelligent Agents: how to generate algorithms and judge whether they are right using Kolmogorov Complexity.

Peter Norvig about Google Algorithms:

http://www.youtube.com/watch?v=HT540VrCDwg http://www.youtube.com/watch?v=nU8DcBF-qo4

Computing Like the Brain: a mathematical model of the brain, sparse distributed representation, semantic (locality-sensitive) hashing, sequential memory, prediction and anomaly detection, and some practical applications.

Read more...