Mine PMC for ethics statements #499
Comments
The main purpose here would be to see
|
A simple query for "approval number" currently yields 11404 hits: |
Just to clarify that conflict of interest statements are within scope here as well. |
I just reran that "approval number" query from Oct 12, 2017, and it now yields 37963 results, i.e. roughly a 3.3-fold increase in about 3.5 years. In the meantime, I have begun to collaborate with @petermr, and we are trying to use his ContentMine pipeline (which is currently being ported to Python) to extract ethics statements from PMC. On the way, we have built a first, still very rough, dictionary (i.e. a set of words highly indicative of the topic of ethics statements), and we are also trying to get a list of ethics committees mentioned in PMC-indexed papers. |
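To illustrate how such a dictionary could be used, here is a minimal sketch that flags sentences containing dictionary terms. The term list, the naive sentence splitter and the function names are all assumptions made for illustration; they are not the project's actual dictionary or the ContentMine pipeline.

```python
import re

# Hypothetical mini-dictionary; the project's real dictionary is not shown in this thread.
ETHICS_TERMS = [
    "ethics committee",
    "institutional review board",
    "informed consent",
    "approval number",
    "declaration of helsinki",
]


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def find_ethics_sentences(text, terms=ETHICS_TERMS):
    """Return (sentence, matched_terms) pairs for sentences containing any term."""
    hits = []
    for sentence in split_sentences(text):
        lowered = sentence.lower()
        matched = [term for term in terms if term in lowered]
        if matched:
            hits.append((sentence, matched))
    return hits
```

Plain substring matching like this will miss spelling variants and abbreviations such as "IRB", so it is only a first filter before manual review.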
Meeting on April 29, 2021:
|
Some more notes on this by @ShweataNHegde sit at https://github.com/petermr/dictionary/wiki/Ethics-Statement-Project . |
A search for "approval number" now gives 38437 results, i.e. about 500 more than just two weeks ago. |
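For reference, the growth figures can be checked directly from the three hit counts quoted in this thread. This is plain arithmetic; only the variable and function names are made up here.

```python
def fold_increase(before, after):
    """Ratio of the later hit count to the earlier one."""
    return after / before


first_query = 11404   # "approval number" hits on Oct 12, 2017
rerun = 37963         # hits when the query was rerun about 3.5 years later
latest = 38437        # hits about two weeks after the rerun

print(round(fold_increase(first_query, rerun), 2))  # 3.33
print(latest - rerun)                               # 474
```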
There are ambiguities at multiple levels. For instance, this article states that
The problem here is that Johns Hopkins School of Medicine runs multiple IRBs, and there does not seem to be a straightforward mechanism to resolve the approval number to get more metadata about the process. |
There is an Office for Human Research Protections (OHRP) Database for Registered IORGs & IRBs, Approved FWAs, and Documents Received in Last 60 Days (https://ohrp.cit.nih.gov/search/irbsearch.aspx) that has identifiers for IRBs, but these do not resolve either. |
I have started to test the phrase extraction tool NLTK-RAKE.
https://towardsdatascience.com/extracting-keyphrases-from-text-rake-and-gensim-in-python-eefd0fad582f
As with all language tools it will take a day or two to see how useful it is.
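For readers unfamiliar with RAKE, here is a standalone sketch of the idea behind it: split text into candidate phrases at stopwords and punctuation, score each word by degree divided by frequency, and score each phrase as the sum of its word scores. This is not the API of the NLTK-RAKE package; the tiny stopword list and the function names are assumptions for illustration.

```python
import re
from collections import defaultdict

# Tiny illustrative stopword list (an assumption); real setups use a much longer one.
STOPWORDS = {
    "a", "an", "and", "the", "of", "by", "was", "were", "is", "in",
    "for", "to", "with", "this", "that", "all", "from",
}


def candidate_phrases(text, stopwords=STOPWORDS):
    """Split text into runs of non-stopwords; punctuation also ends a run."""
    phrases = []
    for fragment in re.split(r"[^\w\s-]+", text.lower()):
        current = []
        for word in fragment.split():
            if word in stopwords:
                if current:
                    phrases.append(tuple(current))
                current = []
            else:
                current.append(word)
        if current:
            phrases.append(tuple(current))
    return phrases


def rake_scores(text, stopwords=STOPWORDS):
    """Rank candidate phrases by the sum of degree/frequency word scores."""
    phrases = candidate_phrases(text, stopwords)
    freq = defaultdict(int)
    degree = defaultdict(int)
    for phrase in phrases:
        for word in phrase:
            freq[word] += 1
            degree[word] += len(phrase)  # co-occurrences, counting the word itself
    word_score = {word: degree[word] / freq[word] for word in freq}
    scores = {
        " ".join(phrase): sum(word_score[word] for word in phrase)
        for phrase in set(phrases)
    }
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))
```

Because longer phrases accumulate higher scores, multi-word terms such as committee names tend to rank above single words, which is the behaviour one would want to evaluate on ethics statements.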
--
Peter Murray-Rust
Founder ContentMine.org
and
Reader Emeritus in Molecular Informatics
Dept. Of Chemistry, University of Cambridge, CB2 1EW, UK
|
Ayush (openVirus volunteers) and I wrote a piece of code that extracts common phrases from a text file of manually scraped ethics statements: https://colab.research.google.com/drive/1sFj07mE2XRyeaplvsTs34-VaDHBjnt6U?usp=sharing |
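The notebook's code is not reproduced in this thread. As a rough sketch of what extracting common phrases can look like, the following counts word n-grams across a collection of statements; the function name, the tokenizer and the one-statement-per-string input format are assumptions, not the notebook's actual approach.

```python
import re
from collections import Counter


def common_phrases(statements, n=2, top=5):
    """Count word n-grams across statements and return the most frequent ones."""
    counts = Counter()
    for statement in statements:
        # Lowercase and keep only alphanumeric tokens.
        tokens = re.findall(r"[a-z0-9]+", statement.lower())
        counts.update(
            " ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)
        )
    return counts.most_common(top)
```

With the statements stored one per line in a text file, the lines of that file could be passed in directly as `statements`.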
Some updates from this week:
|
For more recent updates, see the notes over at Shweata's page. |
Here is a list of ethics-related entities Shweata has mined from articles on stem cells. |
Some more observations by Shweata and Peter sit here. We now have a dedicated organization, repo and wiki: |
The paper "How does nursing research differ internationally? A bibliometric analysis of six countries" has a Table 1 that looks at certain features of previous studies, including
|
The project with Shweata and Peter (and Ayush) has since led to a publication:
It outlines a workflow for mining ethics statements and discusses motivations, applications and complications. |
possible search terms:
etc.