ML3science CM110 contribution #6

Open
wants to merge 55 commits into
base: main
55 commits
0078e23
Adding Mali and Niger
AnasHKM Dec 17, 2022
0f0e9e7
- Add osm features for:
BadrTad Dec 17, 2022
e7700d7
Fix training by removing transpose
BadrTad Dec 18, 2022
5c6293c
Add osm features for MLI and NER
BadrTad Dec 18, 2022
daa162b
Merge remote-tracking branch 'origin/cm110/bt/osm' into cm110/bt/osm
BadrTad Dec 18, 2022
9d94433
Fit MLI and NER to model
BadrTad Dec 19, 2022
439a6a8
Generate combined countries figs (+MLI & NER)
BadrTad Dec 19, 2022
2e612e0
Refactor notebook for county figs
BadrTad Dec 19, 2022
92cf8e6
Add clusters for NER & MLI
BadrTad Dec 19, 2022
2afa25c
Add process lsms for uganda
BadrTad Dec 21, 2022
fa77d2b
Add NG to countries list
BadrTad Dec 21, 2022
0dfc6da
Modifying the country_keys.json
AnasHKM Dec 21, 2022
4f3f709
Merge remote-tracking branch 'origin/cm110/bt/osm' into cm110/bt/osm
AnasHKM Dec 21, 2022
51205db
Rectify 2009 to 2010 for uganda
BadrTad Dec 21, 2022
3fbb220
Merge remote-tracking branch 'origin/cm110/bt/osm' into cm110/bt/osm
AnasHKM Dec 21, 2022
af6381b
Remove echo
BadrTad Dec 21, 2022
94fd663
add bidict to requirements
BadrTad Dec 21, 2022
43f995a
Fix bug of matched_keys params
BadrTad Dec 21, 2022
758be8a
Merge remote-tracking branch 'origin/cm110/bt/osm' into cm110/bt/osm
AnasHKM Dec 21, 2022
ed26d9d
Remove bidict from process_all
BadrTad Dec 21, 2022
0de3024
Merge remote-tracking branch 'origin/cm110/bt/osm' into cm110/bt/osm
AnasHKM Dec 21, 2022
ad63d72
update arg metadata
BadrTad Dec 21, 2022
2ce77db
Merge remote-tracking branch 'origin/cm110/bt/osm' into cm110/bt/osm
AnasHKM Dec 21, 2022
d7caedd
Add log comments process_all
BadrTad Dec 21, 2022
5158e2a
Merge remote-tracking branch 'origin/cm110/bt/osm' into cm110/bt/osm
AnasHKM Dec 21, 2022
6e678d4
update pandas requirements
BadrTad Dec 21, 2022
b0c2ecd
Modifying the country_keys.json
AnasHKM Dec 21, 2022
39cdd8c
Merge remote-tracking branch 'origin/cm110/bt/osm' into cm110/bt/osm
BadrTad Dec 21, 2022
344e07e
remove prints
BadrTad Dec 21, 2022
6afee8f
drop rows from all_nominals
BadrTad Dec 21, 2022
4076687
Removing null values and updating the all_nominal file
AnasHKM Dec 21, 2022
9a55768
remove index col from csv
BadrTad Dec 18, 2022
0efe447
Removing indexes
AnasHKM Dec 21, 2022
2bcc37a
Remove manually the -99999999.0
AnasHKM Dec 21, 2022
b98191d
add cnn_colab drive version
BadrTad Dec 22, 2022
29be7df
Merge remote-tracking branch 'origin/cm110/bt/osm' into cm110/bt/osm
BadrTad Dec 22, 2022
5da5a4b
Generating the tfrecords raw
AnasHKM Dec 22, 2022
a46638b
delete unecessary check branch
BadrTad Dec 22, 2022
493d23d
Merge remote-tracking branch 'origin/cm110/bt/osm' into cm110/bt/osm
BadrTad Dec 22, 2022
9c9c99a
Generate the cnn_weights for all countries
AnasHKM Dec 22, 2022
b5f44a2
Generate the OSM Features for all countries
AnasHKM Dec 22, 2022
51d6472
Merge remote-tracking branch 'origin/cm110/bt/osm' into cm110/bt/osm
AnasHKM Dec 22, 2022
d21af08
Generating weights
AnasHKM Dec 22, 2022
f174401
Gjem
AnasHKM Dec 22, 2022
bbc91f9
Generating combined countries consumption results
AnasHKM Dec 22, 2022
7552d79
Time travel
AnasHKM Dec 22, 2022
da35fc9
Generating the clusters for each country
AnasHKM Dec 22, 2022
e669624
Gjem updated with UGA
AnasHKM Dec 22, 2022
fe5a6e5
Merge pull request #1 from AnasHKM/cm110/bt/osm
BadrTad Dec 22, 2022
19cb96b
Generate htmls
BadrTad Dec 22, 2022
26debd7
Refactor notebooks
BadrTad Dec 22, 2022
76e3a0d
Merge pull request #2 from AnasHKM/cm110/bt/osm
BadrTad Dec 22, 2022
9db2a08
Add new figs
BadrTad Dec 22, 2022
c077934
Update readme with our drive data
BadrTad Dec 22, 2022
4f7d83d
Update figs
BadrTad Dec 22, 2022
6 changes: 3 additions & 3 deletions README.md
@@ -28,16 +28,16 @@ We are using the surveys of the WorldBank as our true gold standart. You have to

- [0_download_country_codes](src/0_lsms_processing/0_download_country_codes.ipynb): Download the country codes for all Sub Saharian African countries from the WorldBank API to use the same country codes.
- [1_check_lsms_availability](src/0_lsms_processing/1_check_lsms_availability.ipynb): Checks the availability of the LSMS for the given countries.
- [2_consent_lsms_form](src/0_lsms_processing/2_consent_lsms_form.ipynb): Poor mans approach to automate the download. The WorldBank requires to fill a consent form and this file does it for us and downloads the survey files for us. You can download our downloaded surveys from [here](https://drive.google.com/file/d/1IlF66tdPrty5OmGdWGd7iN39KZCV-iKD/view?usp=sharing).
- [2_consent_lsms_form](src/0_lsms_processing/2_consent_lsms_form.ipynb): Poor mans approach to automate the download. The WorldBank requires to fill a consent form and this file does it for us and downloads the survey files for us. You can download our downloaded surveys from [here](https://drive.google.com/file/d/18W813JASB23pBPWm2S0890b1ZUX0HWVU/view?usp=sharing).
- [3_process_surveys](src/0_lsms_processing/3_process_surveys.ipynb): Preprocesses the RAW survey data. Please find the processing steps in [lib/lsms.py](src/lib/lsms.py).

After running this code you should have processed survey files in [data/lsms/processed](data/lsms/processed).

### Satellite data and features

To download the data please execute the [0_download_satellite.ipynb](src/1_feature_generation/0_download_satellite.ipynb) notebook. However we recommend you to execute it on Google Colab. For this we have a modified [Colab](src/1_feature_generation/0.1_download_satellite_colab.ipynb) of the notebook, which contains all necessary libs. Since you would need to install Earth Engine locally. You also need a [Google Earth Engine account](https://earthengine.google.com/) to execute the code. Researchers, NGO's and country get free access within a short time. You can download our extracted data from [here](https://drive.google.com/file/d/1HJ3Q6BhmcZsRxb-JjhSkL6zH7hoMj1HB/view?usp=sharing).
To download the data please execute the [0_download_satellite.ipynb](src/1_feature_generation/0_download_satellite.ipynb) notebook. However we recommend you to execute it on Google Colab. For this we have a modified [Colab](src/1_feature_generation/0.1_download_satellite_colab.ipynb) of the notebook, which contains all necessary libs. Since you would need to install Earth Engine locally. You also need a [Google Earth Engine account](https://earthengine.google.com/) to execute the code. Researchers, NGO's and country get free access within a short time. You can download our extracted data from [here](https://drive.google.com/drive/folders/1wV-fis5BG8_zCtB1NU6mi43o_5UfdMfG?usp=share_link).

After you download the data you can train the CNN using [1_cnn.ipynb](src/1_feature_generation/1_cnn.ipynb). Again we recommend to execute it on Colab for this you can use our [colab](src/1_feature_generation/1.1_cnn colab.ipynb) version. If you don't want to train the network from scratch, you can use our [weights](https://drive.google.com/file/d/1Vt6wC4d0qdbyzJlIILPCaf8zWoMbTzGB/view?usp=sharing).
After you download the data you can train the CNN using [1_cnn.ipynb](src/1_feature_generation/1_cnn.ipynb). Again we recommend to execute it on Colab for this you can use our [colab](src/1_feature_generation/1.1_cnn colab.ipynb) version. If you don't want to train the network from scratch, you can use our [weights](https://drive.google.com/drive/folders/1ctwl-LYlprZutHvoB2zg2ELC-TgJOHBg?usp=share_link).

⚠ Caution: The tfrecords need a lot of RAM!

15 changes: 13 additions & 2 deletions colab_ssh.ipynb
@@ -42,10 +42,21 @@
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3.8.3",
"language": "python",
"name": "python3"
},
"language_info": {
"name": "python"
"name": "python",
"version": "3.8.3"
},
"orig_nbformat": 4
"orig_nbformat": 4,
"vscode": {
"interpreter": {
"hash": "d02c496bff33d56b3c08f9b675298569973d3021c909db82586ed45629749a93"
}
}
},
"nbformat": 4,
"nbformat_minor": 2
55 changes: 55 additions & 0 deletions data/countries_meta/counties.csv
@@ -0,0 +1,55 @@
country
Algeria
Angola
Benin
Botswana
Burkina
Burundi
Cameroon
Cape Verde
Central African Republic
Chad
Comoros
Congo
Democratic Republic of the Congo
Djibouti
Egypt
Equatorial Guinea
Eritrea
Ethiopia
Gabon
Gambia
Ghana
Guinea
Guinea-Bissau
Ivory Coast
Kenya
Lesotho
Liberia
Libya
Madagascar
Malawi
Mali
Mauritania
Mauritius
Morocco
Mozambique
Namibia
Niger
Nigeria
Rwanda
Sao Tome and Principe
Senegal
Seychelles
Sierra Leone
Somalia
South Africa
South Sudan
Sudan
Swaziland
Tanzania
Togo
Tunisia
Uganda
Zambia
Zimbabwe
10 changes: 8 additions & 2 deletions data/countries_meta/counties_lsms_time_valid.csv
@@ -49,7 +49,7 @@ South Africa,ZAF,2015,https://microdata.worldbank.org/index.php/catalog/2882
South Africa,ZAF,1999,https://microdata.worldbank.org/index.php/catalog/1576
South Africa,ZAF,1993,https://microdata.worldbank.org/index.php/catalog/297
South Africa,ZAF,1993,https://microdata.worldbank.org/index.php/catalog/902
Uganda,UGA,2021,https://microdata.worldbank.org/index.php/catalog/3765
Uganda,UGA,2022,https://microdata.worldbank.org/index.php/catalog/3765
Uganda,UGA,2021,https://microdata.worldbank.org/index.php/catalog/4183
Uganda,UGA,2020,https://microdata.worldbank.org/index.php/catalog/3902
Uganda,UGA,2019,https://microdata.worldbank.org/index.php/catalog/3795
@@ -59,6 +59,7 @@ Uganda,UGA,2014,https://microdata.worldbank.org/index.php/catalog/2663
Uganda,UGA,2012,https://microdata.worldbank.org/index.php/catalog/2059
Uganda,UGA,2011,https://microdata.worldbank.org/index.php/catalog/2166
Uganda,UGA,2010,https://microdata.worldbank.org/index.php/catalog/1001
Tanzania,TZA,2021,https://microdata.worldbank.org/index.php/catalog/4542
Tanzania,TZA,2020,https://microdata.worldbank.org/index.php/catalog/3885
Tanzania,TZA,2016,https://microdata.worldbank.org/index.php/catalog/2584
Tanzania,TZA,2016,https://microdata.worldbank.org/index.php/catalog/2863
@@ -70,6 +71,11 @@ Tanzania,TZA,2010,https://microdata.worldbank.org/index.php/catalog/2251
Tanzania,TZA,2011,https://microdata.worldbank.org/index.php/catalog/1050
Tanzania,TZA,2009,https://microdata.worldbank.org/index.php/catalog/76
Tanzania,TZA,2015,https://microdata.worldbank.org/index.php/catalog/3814
Tanzania,TZA,2004,https://microdata.worldbank.org/index.php/catalog/79
Burkina Faso,BFA,2019,https://microdata.worldbank.org/index.php/catalog/4290
Burkina Faso,BFA,2014,https://microdata.worldbank.org/index.php/catalog/2538
Côte d'Ivoire,CIV,2019,https://microdata.worldbank.org/index.php/catalog/4292
Côte d'Ivoire,CIV,2016,https://microdata.worldbank.org/index.php/catalog/2789
Côte d'Ivoire,CIV,1989,https://microdata.worldbank.org/index.php/catalog/82
Côte d'Ivoire,CIV,1988,https://microdata.worldbank.org/index.php/catalog/85
Côte d'Ivoire,CIV,1987,https://microdata.worldbank.org/index.php/catalog/84
Côte d'Ivoire,CIV,1986,https://microdata.worldbank.org/index.php/catalog/83
4 changes: 3 additions & 1 deletion data/countries_meta/countries.csv
@@ -10,6 +10,7 @@ Equatorial Guinea
Gabon
Kenya
Nigeria
Burkina Faso
Rwanda
São Tomé and Príncipe
Tanzania
@@ -26,6 +27,7 @@ Lesotho
Madagascar
Malawi
Mauritius
Morocco
Mozambique
Namibia
Seychelles
@@ -37,7 +39,7 @@ Benin
Mali
Burkina Faso
Cape Verde
Ivory Coast
Côte d'Ivoire
Gambia
Ghana
Guinea
2 changes: 2 additions & 0 deletions data/countries_meta/countries_code.csv
@@ -23,6 +23,7 @@ Malawi,MWI
Mali,MLI
Mauritania,MRT
Mauritius,MUS
Morocco,MAR
Mozambique,MOZ
Namibia,NAM
Niger,NER
@@ -44,6 +45,7 @@ Tanzania,TZA
Burkina Faso,BFA
Zambia,ZMB
South Sudan,SSD
Côte d'Ivoire,CIV
Republic of the Congo,COG
Democratic Republic of the Congo,COD
Ivory Coast,CI
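
Review note: the hunk above leaves `Ivory Coast,CI` next to the newly added `Côte d'Ivoire,CIV` row, so the visible rows mix a two-letter code with three-letter ones. Below is a minimal sketch of a check that would flag such rows. It assumes every row is a plain `name,code` pair; the helper name `find_code_issues` and the inline sample are hypothetical and not part of this PR.

```python
import csv
import io


def find_code_issues(rows):
    """Return (name, code) pairs whose code is not three uppercase letters."""
    return [
        (name, code)
        for name, code in rows
        if not (len(code) == 3 and code.isalpha() and code.isupper())
    ]


# Sample rows copied from the last hunk of countries_code.csv shown above.
sample = io.StringIO(
    "Côte d'Ivoire,CIV\n"
    "Republic of the Congo,COG\n"
    "Democratic Republic of the Congo,COD\n"
    "Ivory Coast,CI\n"
)
rows = [tuple(row) for row in csv.reader(sample)]
issues = find_code_issues(rows)
print(issues)  # [('Ivory Coast', 'CI')]
```

To run it against the real file, the `io.StringIO` sample could be swapped for an opened file handle; whether that file carries a header row is not visible in this diff, so a header would need to be skipped first if present.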