Python API Wrapper for OpenAlex.
OpenAlex currently describes five different scholarly entity types and their connections: works, authors, venues, institutions, and concepts.
Each entity type comes with an endpoint of the same name that can be queried for a single (random or specific) entity or multiple (grouped or listed) entities.
pip (or pip3) install diophila
First off, you need to initialize a client. The client offers all methods to query the API.
from diophila import OpenAlex
openalex = OpenAlex()
You can use the client to query for a single random entity with the get_random_<entity> method:
random_author = openalex.get_random_author()
random_author['id']
Or, if you have a specific entity in mind, you can query for it by one of its IDs via the get_single_<entity> method:
specific_work = openalex.get_single_work("https://doi.org/10.1364/PRJ.433188", "doi")
specific_work['display_name']
If you are interested in entities grouped into facets, use the get_groups_of_<entities> method:
grouped_institutions = openalex.get_groups_of_institutions("type")
for group in grouped_institutions['group_by']:
    group['key']
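The grouped response can also be turned into a simple lookup table. The sketch below assumes that each entry in `group_by` carries a `count` next to its `key`, as the OpenAlex API returns them. `counts_by_key` is a hypothetical helper, not part of diophila, and `sample_response` is made-up illustrative data rather than real API output.

```python
def counts_by_key(grouped_response):
    """Map each group key to its count (assumes 'key' and 'count' fields)."""
    return {group["key"]: group["count"] for group in grouped_response["group_by"]}


# Made-up stand-in for the dict returned by get_groups_of_institutions("type")
sample_response = {
    "group_by": [
        {"key": "education", "count": 3},
        {"key": "company", "count": 2},
    ]
}

counts = counts_by_key(sample_response)
# Key of the group with the highest count
largest = max(counts, key=counts.get)
```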
And last but not least, you can get multiple entities of a type in a list by using the get_list_of_<entities> method. Note that this method uses pagination, either basic paging or cursor paging, depending on whether the pages parameter is supplied:
# if no `pages` parameter is supplied, we use cursor paging
pages = None
# if `pages` parameter is supplied, we use basic paging
pages = [1, 2, 3]
filters = {"is_oa": "true",
           "works_count": ">15000"}
pages_of_venues = openalex.get_list_of_venues(filters=filters, pages=pages)
for page in pages_of_venues:  # loop through pages
    for venue in page['results']:  # loop through list of venues
        venue['id']
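To make the difference between the two paging modes concrete, here is a minimal sketch of how such a switch could work. It is not diophila's actual implementation. It relies on the OpenAlex conventions that basic paging uses a `page` parameter and that cursor paging starts with `cursor=*` and continues with `meta.next_cursor`. `iterate_pages`, `fetch`, and `fake_fetch` are hypothetical names, and the pages served by `fake_fetch` are made up so the example runs without network access.

```python
def iterate_pages(fetch, pages=None):
    """Yield result pages from `fetch`, a hypothetical callable that takes
    query parameters as keyword arguments and returns a parsed JSON dict."""
    if pages is not None:
        # Basic paging: request exactly the page numbers that were asked for.
        for number in pages:
            yield fetch(page=number)
        return
    # Cursor paging: start with "*" and follow meta.next_cursor until it runs out.
    cursor = "*"
    while cursor:
        page = fetch(cursor=cursor)
        if not page["results"]:
            break
        yield page
        cursor = page["meta"].get("next_cursor")


def fake_fetch(page=None, cursor=None):
    """Stand-in for an HTTP call, serving made-up pages."""
    data = {"*": (["V1", "V2"], "c2"), "c2": (["V3"], None)}
    if page is not None:
        return {"meta": {"page": page}, "results": [f"V{page}"]}
    results, next_cursor = data[cursor]
    return {"meta": {"next_cursor": next_cursor}, "results": results}


basic = [p["results"] for p in iterate_pages(fake_fetch, pages=[1, 2, 3])]
cursored = [p["results"] for p in iterate_pages(fake_fetch)]
```

Basic paging is convenient when you only want a few specific pages, while cursor paging walks the whole result set without needing to know how many pages exist.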
Bonus: If you want to retrieve all works connected to another entity, you may use the entity's works_api_url property with the get_works_by_api_url method:
works_api_url = "https://api.openalex.org/works?filter=author.id:A1969205032"
pages_of_works = openalex.get_works_by_api_url(works_api_url)
for page in pages_of_works:
    for work in page['results']:
        work['display_name']
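A works_api_url is an ordinary works query whose `filter` parameter uses the same comma-separated `attribute:value` form as the filters dict shown earlier. The sketch below, using only the standard library, illustrates that correspondence. `build_filter` and `parse_filter` are hypothetical helpers, not diophila functions, and the sketch assumes that filter values contain no commas.

```python
from urllib.parse import parse_qs, urlparse


def build_filter(filters):
    """Join a dict of filters into comma-separated attribute:value form."""
    return ",".join(f"{name}:{value}" for name, value in filters.items())


def parse_filter(api_url):
    """Recover the filter dict from a URL such as an entity's works_api_url."""
    query = parse_qs(urlparse(api_url).query)
    pairs = query.get("filter", [""])[0].split(",")
    # Split on the first colon only, so values may themselves contain colons.
    return dict(pair.split(":", 1) for pair in pairs if pair)


works_api_url = "https://api.openalex.org/works?filter=author.id:A1969205032"
author_filter = parse_filter(works_api_url)
venue_filter = build_filter({"is_oa": "true", "works_count": ">15000"})
```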
It's a good idea to use the OpenAlex polite pool, which offers faster response times to users who provide an email address. If you would like to use it, simply initialize the client with your email address.
from diophila import OpenAlex
openalex = OpenAlex("[email protected]")
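On the API side, OpenAlex recognizes polite-pool requests by a `mailto` query parameter (an email address in the User-Agent header also works). How diophila passes the address along is not shown here, so the following is only a sketch of the query-parameter variant. `add_mailto` is a hypothetical helper and `you@example.com` is a placeholder address.

```python
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse


def add_mailto(url, email):
    """Return `url` with a mailto query parameter appended (hypothetical helper)."""
    parts = urlparse(url)
    query = parse_qsl(parts.query)
    query.append(("mailto", email))
    # Keep ':', ',' and '@' readable instead of percent-encoding them.
    return urlunparse(parts._replace(query=urlencode(query, safe=":,@")))


polite_url = add_mailto(
    "https://api.openalex.org/works?filter=author.id:A1969205032",
    "you@example.com",  # placeholder address
)
```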
The API currently doesn't have rate limits. However, if you need more than 100,000 calls per day, please drop the OpenAlex team a line at [email protected] or alternatively look into using a snapshot.
If you are using OpenAlex in your research, the OpenAlex team kindly asks you to cite https://doi.org/10.48550/arXiv.2205.01833