KeyError: cd:initials #98

Open
Viet1004 opened this issue Oct 2, 2024 · 5 comments

Comments

Viet1004 commented Oct 2, 2024

Hello,

I'm having trouble after refining via Scopus.

I get the error KeyError: ce:initials

Here is the code to reproduce the error:

docs_scopus, docs_notfound = litstudy.refine_scopus(docs)
litstudy.plot_affiliation_histogram(docs_scopus, limit=15)
I cannot share the CSV files here, but any export from IEEE or Scopus should reproduce it.
Could you help me with this?
Thank you very much.
If you need any further information, please let me know.

fzzinchemical commented Oct 10, 2024

Got the same issue running litstudy 1.0.6 on a Scopus dataset.
Here is the relevant part of the traceback:

File ~/Documents/Notebooks/.venv/lib/python3.12/site-packages/pybliometrics/scopus/abstract_retrieval.py:92, in AbstractRetrieval.authorgroup(self)
     80 # Author information
     81 for author in authors:
     82     new = auth(affiliation_id=aff_id,
     83                 organization=org,
     84                 city=aff.get('city'),
     85                 dptid=dep_id,
     86                 postalcode=aff.get('postal-code'),
     87                 addresspart=aff.get('address-part'),
     88                 country=aff.get('country'),
     89                 auid=make_int_if_possible(author.get('@auid')),
     90                 orcid=author.get('@orcid'),
     91                 surname=author.get('ce:surname'),
---> 92                 given_name=author.get('ce:given-name', author['ce:initials']),
     93                 indexed_name=chained_get(author, ['preferred-name', 'ce:indexed-name']))
     94     out.append(new)
     95 # Collaboration information

KeyError: 'ce:initials'
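
A note on why line 92 can fail even for authors that do have a given name: in Python, the default argument of `dict.get` is evaluated before `get` is called, so `author['ce:initials']` is looked up unconditionally. The sketch below reproduces that behavior with a made-up author record (the dictionary contents and both helper functions are hypothetical, not pybliometrics code) and shows one possible lazy alternative. It is an illustration of the language behavior, not a claim about how pybliometrics fixes or should fix this.

```python
# Hypothetical author record: it has a given name but no 'ce:initials' key.
author = {"ce:surname": "Doe", "ce:given-name": "Jane"}


def given_name_eager(author):
    # Same pattern as line 92 of the traceback: the fallback
    # author["ce:initials"] is evaluated before .get() runs, so a
    # missing 'ce:initials' key raises KeyError in every case.
    return author.get("ce:given-name", author["ce:initials"])


def given_name_lazy(author):
    # Only consult the fallback key when the primary key is absent,
    # and tolerate the fallback being absent as well.
    if "ce:given-name" in author:
        return author["ce:given-name"]
    return author.get("ce:initials")


try:
    given_name_eager(author)
    eager_error = None
except KeyError as exc:
    eager_error = exc

print(repr(eager_error))        # the KeyError raised by the eager variant
print(given_name_lazy(author))  # the given name, without touching initials
```

This would explain a failure on any record where Scopus omits `ce:initials`, regardless of whether `ce:given-name` is present.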

@fzzinchemical

Rereading the stack trace, this seems to be a pybliometrics issue. I'll investigate there.

@fzzinchemical

Alright, quick update and temporary fix:
Downgrade pybliometrics to version 4.0.0. It worked in my case, so fingers crossed it works for you as well.
I spent too many hours looking through the code for anything that did not make sense, and found nothing.

Alright, the thing you should try is the following:
pip install pybliometrics==4.0.0
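
After pinning, it can help to confirm which versions the notebook kernel actually sees, since a virtual environment and a notebook kernel do not always agree. A minimal sketch using only the standard library (the helper name `installed_version` is made up for this example):

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Print the versions visible to the interpreter running this code.
print("pybliometrics:", installed_version("pybliometrics"))
print("litstudy:", installed_version("litstudy"))
```

Running this in the same kernel that raises the KeyError shows whether the downgrade took effect there.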

Member

stijnh commented Oct 11, 2024

Thank you very much for looking into this! It is a difficult issue to debug, since it seems to be an issue with the Scopus API.

Great that you were able to find a temporary fix!

Do you have version numbers for which pybliometrics does and does not work? It might also be worth reporting the issue on their repository, since it looks to be a mismatch between how the data is provided by Scopus and how it is parsed by pybliometrics.

fzzinchemical commented Oct 11, 2024

I tried out a couple of things regarding the versions on a smaller batch. The results are the following:

  • I could not replicate the bug on smaller batches of studies, regardless of version (yes, all pybliometrics versions were tested).
  • I made a simple testbench to try out a couple of versions.

Test

Version   Functional?
4.0       NO
3.6       NO
3.5.2     NO
3.5.1     YES
3.5.0     YES
3.4.0     YES
3.3.0     YES
3.2.0     YES
3.1.0     NO

Now I wonder why it worked on 4.0.0... so I think I'll leave it there for now.
Hope this helps.
Also, the test results were somewhat inconsistent, so nothing here can really be trusted.
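
Since the bug did not show up on smaller batches, one way to narrow it down might be to probe each record individually and collect the ones that raise, instead of letting the first KeyError abort the whole run. The sketch below is generic and uses stand-in dictionaries; `find_failing` and the `probe` callable are hypothetical names, and which attribute to access on a real litstudy or pybliometrics object is an assumption that would need checking (the traceback only shows that the error surfaces in `AbstractRetrieval.authorgroup`).

```python
def find_failing(items, probe):
    """Call probe(item) on every item and collect (index, error) for KeyErrors."""
    failures = []
    for index, item in enumerate(items):
        try:
            probe(item)
        except KeyError as exc:
            failures.append((index, exc))
    return failures


# Stand-in records; the second one lacks 'ce:initials'.
records = [
    {"ce:given-name": "Jane", "ce:initials": "J."},
    {"ce:given-name": "John"},
]

# The probe repeats the lookup pattern from the traceback.
failures = find_failing(
    records, lambda r: r.get("ce:given-name", r["ce:initials"])
)

for index, exc in failures:
    print(index, repr(exc))
```

Knowing which records fail would also make a report to the pybliometrics repository easier to reproduce.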
