learning from arnowaczynski
[[ Model blending strategy ]]
many CV runs, each with a different random seed and a different random split (10 folds)
simple average of top 30 models
ridge regression on top 60 models
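
A minimal sketch of the two blends above, in plain Python. Everything not in the notes is an assumption: that a lower CV score is better, that ridge is fit on out-of-fold predictions without an intercept, the value of alpha, and the helper names (top_k, average_blend, ridge_weights, weighted_blend), which are made up for illustration.

```python
def top_k(models, k):
    """models: list of (cv_score, preds). Assumes lower cv_score is better."""
    return sorted(models, key=lambda m: m[0])[:k]


def average_blend(pred_lists):
    """Simple average: mean prediction per row across models."""
    n = len(pred_lists)
    return [sum(col) / n for col in zip(*pred_lists)]


def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ridge_weights(oof_preds, y, alpha=1.0):
    """Ridge weights w solving (P^T P + alpha * I) w = P^T y.

    oof_preds: one list of out-of-fold predictions per model (assumption).
    No intercept term; alpha=1.0 is a placeholder, not a tuned value.
    """
    k = len(oof_preds)
    gram = [
        [
            sum(a * b for a, b in zip(oof_preds[i], oof_preds[j]))
            + (alpha if i == j else 0.0)
            for j in range(k)
        ]
        for i in range(k)
    ]
    rhs = [sum(p * t for p, t in zip(oof_preds[i], y)) for i in range(k)]
    return solve_linear(gram, rhs)


def weighted_blend(pred_lists, weights):
    """Apply the learned weights to each model's predictions."""
    return [sum(w * p for w, p in zip(weights, col)) for col in zip(*pred_lists)]
```

Usage would be roughly: average_blend over the test predictions of top_k(models, 30), and ridge_weights fit on the out-of-fold predictions of top_k(models, 60), then weighted_blend on their test predictions.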
[[ Hyperparameter search strategy ]]
num_boost_round = 10000 with early_stopping_rounds=200
bagging_freq = 1
hyper-params to tune (6): learning_rate, num_leaves, max_depth, min_data_in_leaf, feature_fraction, bagging_freq
random grid search
set a discrete search space for each param
while searching, print every cv score with selected hyper-parameters
at any time, stop the search and adjust the search space (make sure the same params are never evaluated more than once)
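
A minimal sketch of the search loop described above, using only the standard library. The search-space values, the function name random_grid_search, and the stand-in scorer fake_cv_score are all hypothetical; in practice the scorer would run the real CV (e.g. with num_boost_round = 10000 and early stopping) and return its score. Only three of the params are shown for brevity.

```python
import itertools
import random


def random_grid_search(space, evaluate, n_trials, seen, seed=0):
    """Evaluate up to n_trials random, not-yet-seen combinations from a discrete space.

    space:    dict of param name -> list of candidate values
    evaluate: callable(params dict) -> cv score
    seen:     set shared across calls, so nothing is evaluated twice
              even after the search space has been adjusted
    """
    keys = sorted(space)
    candidates = [
        combo
        for combo in itertools.product(*(space[k] for k in keys))
        if tuple(zip(keys, combo)) not in seen
    ]
    random.Random(seed).shuffle(candidates)

    results = []
    try:
        for combo in candidates[:n_trials]:
            params = dict(zip(keys, combo))
            score = evaluate(params)
            seen.add(tuple(zip(keys, combo)))
            # Print every CV score with the selected hyper-parameters.
            print(f"cv_score={score:.5f} params={params}")
            results.append((score, params))
    except KeyboardInterrupt:
        # Stop at any time; results so far are kept.
        pass
    return results


def fake_cv_score(params):
    """Hypothetical stand-in for a real CV run; returns a made-up score."""
    return abs(params["num_leaves"] - 31) / 100 + params["learning_rate"]


# Placeholder search space, not the values from the original solution.
space = {
    "learning_rate": [0.005, 0.01, 0.02],
    "num_leaves": [15, 31, 63],
    "max_depth": [4, 6, 8],
}
seen = set()
first = random_grid_search(space, fake_cv_score, n_trials=5, seen=seen)

# After looking at the printed scores, adjust the space and continue.
space["num_leaves"] = [23, 31, 47]
second = random_grid_search(space, fake_cv_score, n_trials=5, seen=seen, seed=1)
```

Passing the same seen set into every call is what keeps already-evaluated combinations from being run again after the space changes.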