This repository has been archived by the owner on Jun 22, 2022. It is now read-only.
Exploring various dimension reduction techniques
Kamil A. Kaczmarek edited this page Jul 10, 2018 · 2 revisions
- factor analysis

      factor_analysis__n_components: 50
- sparse random projection

      sparse_random_projection__n_components: 50
- more row-wise aggregations
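
      import pandas as pd

      def aggregate_row(row):
          # Keep only the non-zero entries of this row; boolean indexing
          # replaces Series.nonzero(), which was removed in pandas 1.0.
          non_zero_values = row[row != 0]
          aggs = {'non_zero_mean': non_zero_values.mean(),
                  'non_zero_max': non_zero_values.max(),
                  'non_zero_min': non_zero_values.min(),
                  'non_zero_std': non_zero_values.std(),
                  'non_zero_sum': non_zero_values.sum(),
                  'non_zero_count': non_zero_values.count(),
                  # Share of non-zero entries among the non-null entries of the row.
                  'non_zero_fraction': non_zero_values.count() / row.count()
                  }
          return pd.Series(aggs)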
- not using raw features
Result: LightGBM on the new aggregations + projections (second best): 1.336 CV, 1.39 LB
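
As a rough illustration of the two projections listed above, here is a minimal sketch using scikit-learn's `FactorAnalysis` and `SparseRandomProjection` with 50 components each. The synthetic matrix `X`, its shape, its sparsity, and the `random_state` values are assumptions made for the example and are not taken from the original pipeline. Only the projected columns are stacked, mirroring the "not using raw features" choice.

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis
from sklearn.random_projection import SparseRandomProjection

# Synthetic, mostly-zero stand-in for the real training matrix (an assumption).
rng = np.random.default_rng(0)
X = rng.random((200, 300)) * (rng.random((200, 300)) < 0.1)

N_COMPONENTS = 50  # matches the n_components values in the configuration above

# Both transformers are fitted on the same raw matrix.
fa = FactorAnalysis(n_components=N_COMPONENTS, random_state=0)
srp = SparseRandomProjection(n_components=N_COMPONENTS, random_state=0)

X_fa = fa.fit_transform(X)
X_srp = srp.fit_transform(X)

# Keep only the projected columns; the raw features are left out.
X_reduced = np.hstack([X_fa, X_srp])
print(X_reduced.shape)
```

In a fuller pipeline, the row-wise aggregations would be joined to `X_reduced` column-wise before training the model.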
check our GitHub organization https://github.com/neptune-ml for more cool stuff 😃
Kamil & Kuba, core contributors
- honey bee 🐝 LightGBM and 5fold CV
- beetle 🪲 LightGBM on binarized dataset
- dromedary camel 🐪 LightGBM with row aggregations
- whale 🐳 LightGBM on dimension reduced dataset
- water buffalo 🐃 Exploring various dimension reduction techniques
- blowfish 🐡 bucketing row aggregations