Monday, March 7
1:30-4:30pm
Chemical Heritage Foundation
Room: Ullyot S
Coordinators: Shawn Averkamp, Sara Rubinow, Matt Miller, Josh Hadro
Tools and standards abound for creating and enriching metadata, but measuring, monitoring, and managing metadata for the long haul can be a daunting task. What tools are out there to assess the shape of our metadata? How can visualizations show us the gaps or flaws in our description? What can web traffic analytics tell us about the value of our metadata? What is quality, really? We certainly don’t have all the answers, but together we can workshop the questions. Specific topics will be driven by the interests of attendees. The organizers will bring examples of their own work at NYPL in visualization, data analysis with Python, and Google Analytics assessment, and invite participants to bring their own tools and strategies to share in group discussion, short demos, and hands-on breakout sessions. Takeaways will include exposure to approaches and tools in use in the field and an expanded network of commiserators to help you through your next metadata audit.
Schedule:
1:30 - 1:40 Introductions
1:40 - 3:00 Presentations
3:00 - 3:10 Break
3:10 - 4:10 Hands-on sessions
4:10 - 4:30 Reporting back and discussion
You'll get the most out of your hands-on sessions if you install the necessary applications ahead of time. If you're planning to participate in the following hands-on sessions, please try to come prepared!
For this hands-on session, we'll be using Jupyter (IPython) notebook to walk through some simple functions and scripts. We'll also be working with lxml, a third-party Python library for parsing and manipulating XML. Fortunately, both of these are already included in the Anaconda Python distribution. We strongly recommend installing Anaconda for this workshop. It also comes bundled with pandas and all of its dependencies, so it will be useful to have if you're interested in learning more about data analysis.
For this hands-on session, we'll be using Jupyter (IPython) notebook to explore basic data analysis with pandas, a Python data analysis library. Fortunately, both IPython notebook and pandas (as well as two other required packages, numpy and matplotlib) are already included in the Anaconda Python distribution. We strongly recommend installing Anaconda for this session. It also comes bundled with lxml, a third-party Python library for parsing and manipulating XML, so it will be useful to have if you're interested in learning more about using Python to parse your XML data.
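If you'd like to confirm ahead of time that the Python libraries named above are available, something like the following may help. This is only a minimal sketch, not an official setup script from the organizers: the helper name check_packages and the REQUIRED list are our own illustration, and the script only checks that lxml, pandas, numpy, and matplotlib can be imported. It does not check Jupyter itself.

```python
import importlib

# Libraries named in the session descriptions (assumed import names).
REQUIRED = ["lxml", "pandas", "numpy", "matplotlib"]


def check_packages(names):
    """Map each package name to its version string, or None if it cannot be imported.

    Hypothetical helper for illustration only.
    """
    results = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            # The package is missing from this Python environment.
            results[name] = None
        else:
            # Not every package exposes __version__, so fall back to "unknown".
            results[name] = getattr(module, "__version__", "unknown")
    return results


if __name__ == "__main__":
    for name, version in check_packages(REQUIRED).items():
        status = version if version is not None else "NOT INSTALLED"
        print(f"{name}: {status}")
```

Run it with the Python interpreter that came with your Anaconda install. Any line reading NOT INSTALLED points to a library worth sorting out before the session.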
This hands-on session will be a beginner's intro to the d3 visualization library. We will use it to render metadata quality results in a form that allows quick visual analysis. All you will need for this session are the examples and data provided in the d3_viz folder, a text editor, and a web browser.
During the workshop, we'll take collaborative notes and share favorite resources in this Google Doc. We invite you to contribute!