eto_swe_interview: Option 1

Minimal normalization of a JSON file containing organization affiliation data from https://arxiv.org/

I first explored the data in a Jupyter notebook, comparing the total number of entries with the number of unique organization entries and looking at the most common words and the most common abbreviations (defined as tokens with every character capitalized). I could have explored further and made more use of these findings, but I chose to normalize the data along the lines of the example in the problem statement and to expand abbreviations of common words, such as "U." or "U" for "University".
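The approach described above can be sketched roughly as follows. This is a minimal illustration, not the code in this repository: the function names, the `ABBREVIATIONS` map, and the sample strings are all hypothetical, and only the "U." / "U" to "University" rule comes from the description above.

```python
import re
from collections import Counter

# Hypothetical abbreviation map. Only "U" -> "University" is taken from the
# README; any further entries would be assumptions about the data.
ABBREVIATIONS = {
    "U": "University",
}

# One word, an optional trailing period, and optional trailing punctuation.
_TOKEN = re.compile(r"(\w+)\.?([,;:]?)")


def normalize_affiliation(name: str) -> str:
    """Expand known abbreviations token by token and collapse whitespace."""
    expanded = []
    for token in name.split():
        match = _TOKEN.fullmatch(token)
        if match and match.group(1) in ABBREVIATIONS:
            # Replace the abbreviation, keeping a trailing comma etc.
            expanded.append(ABBREVIATIONS[match.group(1)] + match.group(2))
        else:
            # Unknown tokens (including dotted forms like "U.S.") pass through.
            expanded.append(token)
    return " ".join(expanded)


def count_all_caps(names):
    """Count tokens whose letters are all capitalized (candidate abbreviations)."""
    counts = Counter()
    for name in names:
        for token in name.split():
            word = token.strip(".,;:()")
            if word.isalpha() and word.isupper():
                counts[word] += 1
    return counts
```

Working token by token, instead of using one global regex substitution, keeps dotted forms such as "U.S." from being rewritten by the "U" rule. A call like `normalize_affiliation("Stanford U")` would go through the same lookup as `"U. of Chicago"`.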
