(Kaggle Home Credit) Top 1% with only 2 submissions? How?

2018/08/31

learning from arnowaczynski

https://www.kaggle.com/c/home-credit-default-risk/discussion/64609

[[ Model blending strategy ]]

many CV runs with different random seeds, each with a different random split (10 folds)
simple average of top 30 models
ridge regression on top 60 models
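
The two blends above can be sketched with NumPy as follows. This is only an illustration under assumptions that are not in the original notes: `oof_preds` and `test_preds` are hypothetical arrays of out-of-fold and test predictions with one column per model, `cv_scores` holds each model's CV score (higher is better), the penalty `alpha=1.0` is arbitrary, and the closed-form ridge solution stands in for whatever ridge implementation was actually used.

```python
import numpy as np

def top_k(cv_scores, k):
    """Indices of the k models with the highest CV score."""
    return np.argsort(cv_scores)[::-1][:k]

def average_blend(test_preds, cv_scores, k=30):
    """Simple average of the test predictions of the top-k models."""
    idx = top_k(cv_scores, k)
    return test_preds[:, idx].mean(axis=1)

def ridge_blend(oof_preds, y, test_preds, cv_scores, k=60, alpha=1.0):
    """Fit ridge weights on the out-of-fold predictions of the top-k models,
    then apply those weights to the test predictions."""
    idx = top_k(cv_scores, k)
    X, X_test = oof_preds[:, idx], test_preds[:, idx]
    # Center the data so the intercept is handled outside the penalty.
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    # Closed-form ridge solution: (Xc'Xc + alpha*I) w = Xc'(y - mean(y))
    w = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(Xc.shape[1]),
                        Xc.T @ (y - y_mean))
    return (X_test - x_mean) @ w + y_mean
```

Fitting the ridge weights on out-of-fold predictions rather than in-sample predictions keeps the blend from rewarding models that merely overfit the training rows.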

[[ Hyperparameter search strategy ]]

num_boost_round = 10000 with early_stopping_rounds=200
bagging_freq = 1
tuning hyper-params(6): learning_rate, num_leaves, max_depth, min_data_in_leaf, feature_fraction, bagging_freq
random grid search

set a discrete search space for each param
while searching, print every CV score together with the selected hyper-parameters
at any time, stop the search and adjust the search space (make sure the same params are never evaluated more than once)
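
The search loop described above could look roughly like the sketch below. Everything beyond the parameter names is an assumption: the candidate values in `SEARCH_SPACE` are made up for illustration, `evaluate_cv` is a hypothetical callback that would run the 10-fold LightGBM CV with the fixed settings (`num_boost_round=10000`, early stopping after 200 rounds, `bagging_freq=1`) and return the score, and `bagging_fraction` is used as the sixth tuned parameter on the assumption that this is what was meant, since `bagging_freq` is already fixed at 1.

```python
import itertools
import random

# Hypothetical discrete search space; the values are illustrative only.
SEARCH_SPACE = {
    "learning_rate": [0.01, 0.02, 0.03],
    "num_leaves": [15, 31, 63],
    "max_depth": [4, 6, 8, -1],
    "min_data_in_leaf": [20, 50, 100],
    "feature_fraction": [0.1, 0.3, 0.5],
    "bagging_fraction": [0.7, 0.8, 0.9],
}

def random_grid_search(evaluate_cv, space, n_trials, seen=None, seed=0):
    """Evaluate up to n_trials random points of a discrete grid, skipping
    every combination already in `seen`, and print each CV score."""
    rng = random.Random(seed)
    seen = {} if seen is None else seen  # sorted (name, value) pairs -> score
    names = sorted(space)
    # Enumerate the grid and keep only combinations not evaluated before.
    candidates = []
    for values in itertools.product(*(space[n] for n in names)):
        key = tuple(zip(names, values))
        if key not in seen:
            candidates.append(key)
    rng.shuffle(candidates)
    for key in candidates[:n_trials]:
        params = dict(key)
        score = evaluate_cv(params)
        seen[key] = score  # stored in place, so results survive an interrupt
        print(f"cv={score:.5f} params={params}")
    return seen
```

Because `seen` is updated in place, the loop can be interrupted, the search space edited, and the function called again with the same `seen` dict without re-evaluating earlier combinations.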
