Logistic regression is used for classification problems, where the predicted value is one of a fixed set of discrete values, such as 1 for positive or 0 for negative. The essentials of logistic regression are:
- the hypothesis function is the sigmoid function
- the cost function J(theta)
- gradient descent and related optimization algorithms
- advanced optimization, with regularization to address the overfitting problem
Basics of logistic regression
hypothesis function: htheta(x) = g(theta' * x),
where g(z) = 1 / (1 + exp(-z)) is the sigmoid function and theta' is the transpose of theta.
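To make the hypothesis concrete, here is a minimal Python sketch (not the Octave code from the assignment). The names `sigmoid` and `hypothesis` are my own, and x is assumed to already include the intercept term x[0] = 1.

```python
import math


def sigmoid(z):
    """Logistic function g(z) = 1 / (1 + exp(-z)).

    The two branches avoid overflow in exp() for large |z|.
    """
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def hypothesis(theta, x):
    """h_theta(x) = g(theta' * x), read as P(y = 1 | x; theta).

    Assumes x already contains the intercept term x[0] = 1.
    """
    z = sum(t * xi for t, xi in zip(theta, x))
    return sigmoid(z)
```

The output always lies between 0 and 1, which is what lets it be read as a probability.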
htheta(x) is the probability that y = 1 given x, parameterized by theta: htheta(x) = P(y=1 | x; theta).
if htheta(x) >= 0.5, then y = 1
if htheta(x) < 0.5, then y = 0
Decision boundary
Our goal is to compute a theta whose decision boundary classifies our training data.
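Since the sigmoid equals 0.5 exactly at z = 0, thresholding the hypothesis at 0.5 is the same as checking the sign of theta' * x. A small Python sketch of that prediction rule follows. The `predict` name and the theta values are hypothetical, chosen so that the boundary is the straight line x1 + x2 = 3.

```python
def predict(theta, x):
    """Return 1 if theta' * x >= 0 (i.e. h_theta(x) >= 0.5), else 0.

    Assumes x already contains the intercept term x[0] = 1.
    """
    z = sum(t * xi for t, xi in zip(theta, x))
    return 1 if z >= 0 else 0


# Hypothetical parameters: -3 + x1 + x2 = 0 is the decision boundary.
theta = [-3.0, 1.0, 1.0]

print(predict(theta, [1.0, 2.5, 2.5]))  # point above the line -> 1
print(predict(theta, [1.0, 1.0, 1.0]))  # point below the line -> 0
```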
In the course example, the training data can be separated into 2 categories by a straight line.
if (theta'x) >= 0, then htheta(x) >= 0.5, then y = 1
if (theta'x) < 0, then htheta(x) < 0.5, then y = 0
Cost function implementation
For the week 3 assignment, we predict university admission from the grades of 2 exams.
I optimized the implementation with vectorization:
function [J, grad] = costFunction(theta, X, y)
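For reference, the standard (unregularized) logistic regression cost is J(theta) = -(1/m) * sum(y * log(h) + (1 - y) * log(1 - h)), with gradient (1/m) * sum((h - y) * x_j). The following is a plain-Python sketch of those two formulas, not the Octave code from the assignment. It uses explicit loops for readability rather than vectorization, the name `cost_function` is my own, and it assumes the hypothesis never saturates to exactly 0 or 1 (otherwise log would fail).

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def cost_function(theta, X, y):
    """Return (J, grad) for unregularized logistic regression.

    X is a list of rows, each starting with the intercept term 1.
    y is a list of 0/1 labels.
    """
    m = len(y)
    n = len(theta)
    J = 0.0
    grad = [0.0] * n
    for xi, yi in zip(X, y):
        h = sigmoid(sum(t * v for t, v in zip(theta, xi)))
        # Cross-entropy contribution of this training example.
        J += -(yi * math.log(h) + (1 - yi) * math.log(1 - h))
        # Gradient contribution: (h - y) * x_j for every parameter j.
        for j in range(n):
            grad[j] += (h - yi) * xi[j]
    return J / m, [g / m for g in grad]
```

With theta set to all zeros, every prediction is 0.5 and the cost is log(2), about 0.693, whatever the data. That makes a handy sanity check.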
Cost function with regularization
Regularization addresses the overfitting problem.
- Underfitting: the model does not fit the training data, with high bias between predictions and actual values
- Just right: a good fit
- Overfitting: often caused by too many features and too little training data; the model fits the training data well but has high variance and predicts new data poorly
function [J, grad] = costFunctionReg(theta, X, y, lambda)
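The regularized version adds the penalty lambda / (2m) * sum(theta_j^2) to the cost and lambda / m * theta_j to each gradient component, skipping the intercept theta_0 by convention. Below is a plain-Python sketch of that, again an illustration and not the assignment's Octave code. The name `cost_function_reg` is my own, and `lam` stands in for lambda because `lambda` is a reserved word in Python.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def cost_function_reg(theta, X, y, lam):
    """Return (J, grad) for L2-regularized logistic regression.

    theta[0] is the intercept and is not penalized.
    """
    m = len(y)
    n = len(theta)
    J = 0.0
    grad = [0.0] * n
    for xi, yi in zip(X, y):
        h = sigmoid(sum(t * v for t, v in zip(theta, xi)))
        J += -(yi * math.log(h) + (1 - yi) * math.log(1 - h))
        for j in range(n):
            grad[j] += (h - yi) * xi[j]
    # Average over the m examples, then add the penalty terms.
    J = J / m + lam / (2.0 * m) * sum(t * t for t in theta[1:])
    grad = [g / m for g in grad]
    for j in range(1, n):
        grad[j] += lam / m * theta[j]
    return J, grad
```

With `lam = 0` this reduces to the unregularized cost.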
The lambda for regularization must be neither too large nor too small:
- a large lambda yields very small theta values, and the model underfits.
- a small lambda yields large theta values, and the model overfits.
- the lambda for the exercise is 1
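The shrinking effect of lambda can be seen with a few lines of batch gradient descent. This is a rough Python sketch under my own assumptions: the toy data is made up, and the learning rate `alpha` and the iteration count are arbitrary choices. The course exercise itself uses an Octave optimizer rather than a hand-written loop.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def gradient_reg(theta, X, y, lam):
    """Gradient of the regularized cost; theta[0] is not penalized."""
    m = len(y)
    grad = [0.0] * len(theta)
    for xi, yi in zip(X, y):
        err = sigmoid(sum(t * v for t, v in zip(theta, xi))) - yi
        for j, v in enumerate(xi):
            grad[j] += err * v / m
    for j in range(1, len(theta)):
        grad[j] += lam / m * theta[j]
    return grad


def gradient_descent(X, y, lam, alpha=0.1, iters=5000):
    """Plain batch gradient descent starting from theta = 0."""
    theta = [0.0] * len(X[0])
    for _ in range(iters):
        grad = gradient_reg(theta, X, y, lam)
        theta = [t - alpha * g for t, g in zip(theta, grad)]
    return theta


# Made-up, non-separable toy data; column 0 is the intercept term.
X = [[1.0, 0.5], [1.0, 1.0], [1.0, 1.5], [1.0, 2.0], [1.0, 2.5], [1.0, 3.0]]
y = [0, 0, 1, 0, 1, 1]

theta_no_reg = gradient_descent(X, y, lam=0.0)
theta_strong_reg = gradient_descent(X, y, lam=10.0)
print("lambda = 0: ", theta_no_reg)
print("lambda = 10:", theta_strong_reg)
```

The feature weight theta[1] comes out much smaller with lambda = 10 than with lambda = 0, which is the flattening that leads to underfitting when lambda is too large.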
GitHub assignments
Closing thoughts
After one year, I am learning logistic regression again. Last week, Andrew Ng left Baidu. Maybe these great people thought Baidu was not worth fighting for. For now I am still dedicated to a Spark project, focusing on Spark Streaming. As team leader, I am bearing a great burden and it is stressful, but it is a great chance to train my leadership. I am also wondering about my next opportunity. Learning machine learning is the right thing to do and worth the effort. Anyway, even though there is mist on the path, just go forward and fight~