Data Prediction (Modeling)/Overfitting (variance problem)

  • overfitting

    * Low bias
      * The model memorizes the labels of all the data in the training set
      * A low-bias model is more accurate on the training set
    * High variance

from: 偏差和變異之權衡 (Bias-Variance Tradeoff), 2012 | 逍遙文工作室

from: Statistics - Bias-variance trade-off (between overfitting and underfitting) [Gerardnico]
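The "memorizes the labels" point can be illustrated with a small sketch. It is not taken from the sources cited above; it assumes a synthetic 1-D dataset with 20% label noise and uses a 1-nearest-neighbour classifier as a stand-in for a low-bias, high-variance model. The names `make_data`, `predict_1nn`, `error_rate`, and `run_demo` are hypothetical helpers introduced only for this illustration.

```python
import random


def make_data(n, rng, noise=0.2):
    """Assumed toy data: x ~ Uniform[0, 1); the clean label is 1 when x > 0.5,
    and each label is flipped with probability `noise`."""
    data = []
    for _ in range(n):
        x = rng.random()
        y = int(x > 0.5)
        if rng.random() < noise:
            y = 1 - y
        data.append((x, y))
    return data


def predict_1nn(train, x):
    """Return the label of the training point closest to x."""
    _, label = min(train, key=lambda p: abs(p[0] - x))
    return label


def error_rate(train, data):
    """Fraction of points in `data` that the 1-NN model built on `train` gets wrong."""
    wrong = sum(predict_1nn(train, x) != y for x, y in data)
    return wrong / len(data)


def run_demo(seed=0):
    rng = random.Random(seed)
    train = make_data(100, rng)
    test = make_data(500, rng)
    # On the training set every point is its own nearest neighbour,
    # so the model simply replays the stored (noisy) labels.
    return error_rate(train, train), error_rate(train, test)


train_err, test_err = run_demo()
print(f"training error: {train_err:.2f}")
print(f"test error:     {test_err:.2f}")
```

Because the model stores the noisy labels verbatim, its training error says little about how it behaves on unseen points; the gap between the two printed numbers is the symptom of overfitting described above.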

  • Learning curves: training and testing error plotted as a function of training set size
    • They can indicate whether the model has a bias or a variance problem and give clues about how to address it.

  • If the model has a variance problem (overfitting):

    * The training error curve remains well below the testing error curve and may not plateau.
    * If the training curve does not plateau, collecting more data is likely to improve model performance.
    * To reduce overfitting and bring the two curves closer together, one should
      * increase the regularization strength,
      * reduce the number of features,
      * and/or use an algorithm that can only fit simpler hypothesis functions.

from: Overfitting, bias-variance and learning curves - rmartinshort
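As a rough sketch of how such learning curves can be computed, the example below trains a deliberately flexible model on growing training sets and records both errors. It is an illustration under stated assumptions, not the method of the cited post: the data are synthetic (1-D, 20% label noise), the model is a hypothetical piecewise-constant classifier that predicts the majority label in each of `n_bins` bins, and `make_data`, `fit_bins`, `error_rate`, and `learning_curve` are names made up for this example.

```python
import random


def make_data(n, rng, noise=0.2):
    """Assumed toy data: x ~ Uniform[0, 1); the clean label is 1 when x > 0.5,
    and each label is flipped with probability `noise`."""
    data = []
    for _ in range(n):
        x = rng.random()
        y = int(x > 0.5)
        if rng.random() < noise:
            y = 1 - y
        data.append((x, y))
    return data


def fit_bins(train, n_bins):
    """Predict the majority label per bin; empty bins use the overall majority."""
    counts = [[0, 0] for _ in range(n_bins)]
    for x, y in train:
        counts[min(int(x * n_bins), n_bins - 1)][y] += 1
    ones = sum(c[1] for c in counts)
    default = int(2 * ones > len(train))
    return [int(c[1] > c[0]) if c[0] + c[1] > 0 else default for c in counts]


def error_rate(model, data):
    """Fraction of points in `data` misclassified by the binned model."""
    n_bins = len(model)
    wrong = sum(model[min(int(x * n_bins), n_bins - 1)] != y for x, y in data)
    return wrong / len(data)


def learning_curve(sizes, n_bins=50, seed=0):
    """Return (n, training error, testing error) for each training set size."""
    rng = random.Random(seed)
    test = make_data(2000, rng)  # fixed held-out set shared by all sizes
    rows = []
    for n in sizes:
        train = make_data(n, rng)
        model = fit_bins(train, n_bins)
        rows.append((n, error_rate(model, train), error_rate(model, test)))
    return rows


for n, train_err, test_err in learning_curve([20, 50, 200, 1000, 5000]):
    gap = test_err - train_err
    print(f"n={n:5d}  train={train_err:.3f}  test={test_err:.3f}  gap={gap:+.3f}")
```

With many bins and few samples, most bins hold at most one point, so the training error sits far below the testing error; as the training set grows, the gap narrows. Calling `learning_curve` with a smaller `n_bins` restricts the model to simpler hypothesis functions, which is one way to try the last remedy in the list above on this toy problem.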

