Data Prediction (Modeling)/Overfitting (variance problem)
overfitting
* low bias
    * memorizes the label of every data point in the training set
    * a low-bias model is more accurate on the training set
* high variance
from: Bias-Variance Tradeoff (偏差和變異之權衡), 2012 | 逍遙文工作室
from: Statistics - Bias-variance trade-off (between overfitting and underfitting) [Gerardnico]
Training and testing error curves plotted as a function of training set size (learning curves)
- can tell us whether the model has a bias or a variance problem and give clues about how to address it.
If the model has a variance problem (overfitting)
* the training error curve will remain well below the testing error curve, and the testing error curve may not plateau.
* If the training curve does not plateau, this suggests that collecting more data will improve model performance.
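A minimal sketch of building such learning curves, assuming a synthetic 1-D regression task (noisy sine) and a deliberately over-flexible degree-9 polynomial model; the data, model, and sizes are illustrative stand-ins, not part of the original notes:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic task (an assumption for illustration): y = sin(2*pi*x) + noise.
x_all = rng.uniform(0.0, 1.0, 500)
y_all = np.sin(2 * np.pi * x_all) + rng.normal(0.0, 0.2, x_all.size)
x_test, y_test = x_all[400:], y_all[400:]   # held-out test set

def mse(y, y_hat):
    return float(np.mean((y - y_hat) ** 2))

# A degree-9 polynomial is flexible enough to overfit small training sets.
degree = 9
sizes = [20, 50, 100, 200, 400]
train_errs, test_errs = [], []
for n in sizes:
    coef = np.polyfit(x_all[:n], y_all[:n], degree)
    train_errs.append(mse(y_all[:n], np.polyval(coef, x_all[:n])))
    test_errs.append(mse(y_test, np.polyval(coef, x_test)))

# Plotting (sizes, train_errs) and (sizes, test_errs) shows the variance
# signature: a training curve far below a testing curve that is still falling.
```

At small `n` the gap between the two curves is large; as `n` grows the test error keeps dropping, which is the "more data will help" signal described above.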
* To reduce overfitting and bring the curves closer together, one should
    * increase the strength of regularization,
    * reduce the number of features,
    * and/or switch to an algorithm that can only fit simpler hypothesis functions.
from: Overfitting, bias-variance and learning curves - rmartinshort