Apr 18, 2023

Collecting Citations from Text V.2

  • Southern Connecticut State University
Protocol Citation: Rebecca Hedreen 2023. Collecting Citations from Text. protocols.io https://dx.doi.org/10.17504/protocols.io.bp2l6bkq5gqe/v2
Version created by Rebecca Hedreen
License: This is an open access protocol distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Protocol status: Other
Partial retraction: Problems with the ChatGPT addition have come up. It makes up first names if the input citation only contains initials.
Created: April 18, 2023
Last Modified: April 18, 2023
Protocol Integer ID: 80727
Keywords: citations
Disclaimer
DISCLAIMER – FOR INFORMATIONAL PURPOSES ONLY; USE AT YOUR OWN RISK

The protocol content here is for informational purposes only and does not constitute legal, medical, clinical, or safety advice, or otherwise; content added to protocols.io is not peer reviewed and may not have undergone a formal approval of any kind. Information presented in this protocol should not substitute for independent professional judgment, advice, diagnosis, or treatment. Any action you take or refrain from taking using or relying upon the information presented here is strictly at your own risk. You agree that neither the Company nor any of the authors, contributors, administrators, or anyone else associated with protocols.io, can be held responsible for your use of the information contained in or linked to this protocol or any of our Sites/Apps and Services.
Abstract
Basic steps and scripts for translating text citations into bibtex (.bib) files suitable for loading into citation management software or citation analysis scripts. Four publicly available, web-hosted tools are suggested; none of them requires programming.

Updated in April 2023 to include a ChatGPT option.
Materials
Computer with up-to-date web browser
Preparing text file
Copy citations from the source document(s) into a text (.txt) document.
Note
The preparation steps are not required when using ChatGPT (step 3.4) to produce bibtex, but they may give cleaner output.

Edit the text document so that each citation is on its own line, with one blank line between citations. Not all the scripts require the blank line, but it improves readability and importing.
Note
A text editor that includes line numbers (the type used by software programmers) makes this step easier.
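For long lists, the blank-line spacing can also be applied with a short script rather than by hand. The following is a minimal Python sketch, not part of the original protocol. It assumes each non-empty line of the pasted text is already one complete citation (citations wrapped across several lines still need manual joining), and the function name is hypothetical.

```python
def normalize_citations(raw_text: str) -> str:
    """Return the citations one per line, separated by exactly one blank line."""
    # Assumption: every non-empty input line is one complete citation.
    # Collapse runs of spaces/tabs inside each line to a single space.
    citations = [" ".join(line.split()) for line in raw_text.splitlines()]
    # Drop empty lines so the spacing can be rebuilt consistently.
    citations = [c for c in citations if c]
    return "\n\n".join(citations) + "\n"


if __name__ == "__main__":
    sample = (
        "Smith, J. (2020). A title.   Journal, 1(2), 3-4.\n"
        "Doe, A. (2019). Another title. Press.\n\n\n"
    )
    print(normalize_citations(sample), end="")
```

The result can be saved as the .txt file that is pasted or uploaded in the processing steps.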

Processing citations
I've found 4 web-hosted tools that translate a text file of citations into a bibtex file. Only one is needed to produce a .bib file, although different citation formats may work better in different tools.
Anystyle.io can be used on the web or as a Ruby script. This is the most flexible script, and it allows detailed editing before the final file export. The website is privately hosted, so it has not always been available. https://anystyle.io/
Software: Anystyle.io
Developer: Sylvester Keil (inukshuk on GitHub)

Expected result
.bib file

This Perl script is hosted on a website and does basic translation. It is unsupported, but it works quite well. http://www.snowelm.com/~t/doc/tips/makebib.perl.cgi
Software: makebib.perl script
Developer: Makino Takaki

Expected result
.bib file

Text2Bib is hosted by the University of Toronto and requires registration. Basic editing and checking against Google Scholar are available. Accurate and reliable. https://text2bib.economics.utoronto.ca/

Software: Text2Bib
Developer: J. O. Martin and Fabian Qifei Bai

Expected result
.bib file

ChatGPT can take text citations, even with errors or in non-standard formats, and format them as bibtex. Accuracy is generally high, but as the partial retraction note states, it makes up first names when the input citation contains only initials, so author names should be checked. The text does not need as much preparation as for the other services, but preparing it makes it easier to confirm that you are getting the correct number of fully formatted citations. The following prompt has worked well, and variations should also work.


Command
ChatGPT prompt for producing bibtex formatting from text citations
"Generate bibtex formatting for the following citations. Put all the bibtex in one place for easier copying." Paste the citations following the prompt. 

Software: ChatGPT
Developer: OpenAI

Note
Directing ChatGPT to put all the bibtex in "one place" stops it from producing individual entries that must be copied separately. This also seems to prevent it from stalling out partway through the citation list.

There are word/character limits in most versions of ChatGPT, so large lists must be broken up into smaller batches of citations.
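Splitting a long list into batches by hand is tedious, so the batching can be sketched in Python. This is only an illustration: the `max_chars` value of 6000 is an arbitrary placeholder, not a documented ChatGPT limit, and the function names are hypothetical. The prompt text is the one given above.

```python
PROMPT = (
    "Generate bibtex formatting for the following citations. "
    "Put all the bibtex in one place for easier copying."
)


def batch_citations(citations, max_chars=6000):
    """Group citations into batches whose combined length stays under max_chars."""
    # Assumption: max_chars is a guess; the real limit depends on the ChatGPT version.
    batches, current, size = [], [], 0
    for citation in citations:
        extra = len(citation) + 2  # +2 for the blank line between citations
        if current and size + extra > max_chars:
            batches.append(current)
            current, size = [], 0
        current.append(citation)
        size += extra
    if current:
        batches.append(current)
    return batches


def build_messages(citations, max_chars=6000):
    """Return one ready-to-paste message (prompt plus citations) per batch."""
    return [
        PROMPT + "\n\n" + "\n\n".join(batch)
        for batch in batch_citations(citations, max_chars)
    ]


if __name__ == "__main__":
    demo = ["Smith, J. (2020). A title. Journal.", "Doe, A. (2019). Another title. Press."]
    for message in build_messages(demo, max_chars=60):
        print(message)
        print("-" * 20)
```

Each message is pasted into ChatGPT separately, and the bibtex from each reply is appended to the same text file. A citation longer than `max_chars` is placed in a batch of its own rather than being cut.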


Expected result
The output appears in a ChatGPT code block. In the web version of ChatGPT (as opposed to the API), there is a Copy Code button at the top of the code block. The copied text can be saved to a text file for later import into a citation manager, or imported directly into Zotero using the Import from Clipboard option.
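Because of the problem described in the partial retraction (first names made up when the input has only initials), it is worth checking the author names before importing. The Python sketch below is a rough heuristic, not a tool from the original protocol. It flags any given name in a bibtex `author` field that never appears in the source citation text. It assumes each `author` field sits on a single line, and the function name is hypothetical. A name can escape the check if the same word happens to occur elsewhere in the source, so flagged and unflagged entries both deserve a quick look.

```python
import re

# Assumption: each author field is on one line, as in typical generated bibtex.
AUTHOR_FIELD = re.compile(r'^\s*author\s*=\s*[{"](.*)[}"]', re.IGNORECASE | re.MULTILINE)


def find_unsupported_given_names(bib_text, source_text):
    """Return (author, given_name) pairs whose given name is absent from the source text."""
    flagged = []
    for match in AUTHOR_FIELD.finditer(bib_text):
        # bibtex separates multiple authors with the word "and".
        for author in re.split(r"\s+and\s+", match.group(1).strip()):
            if "," in author:
                given = author.split(",", 1)[1]         # "Last, First" form
            else:
                given = " ".join(author.split()[:-1])   # "First Last" form
            for token in re.findall(r"[^\W\d_]+", given):
                if len(token) < 2:
                    continue  # a bare initial cannot have been invented
                pattern = r"\b" + re.escape(token) + r"\b"
                if not re.search(pattern, source_text, re.IGNORECASE):
                    flagged.append((author.strip(), token))
    return flagged


if __name__ == "__main__":
    source = "Smith, J. and Doe, Alice (2020). A title. Journal."
    bib = "@article{smith2020,\n  author = {Smith, John and Doe, Alice},\n  title = {A title}\n}"
    for author, name in find_unsupported_given_names(bib, source):
        print(f"Check author '{author}': '{name}' is not in the source text")
```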


Export
Once the script has exported a .bib file, load it into your citation manager of choice. Zotero has been the most reliable for importing. Zotero also has an option to "Import from Clipboard" if the output is available for copy & paste.
Software: Zotero
Developer: Zotero

Note
Zotero also produces more standardized .bib and .ris files than the scripts, especially the older scripts. When a direct import into other software didn't work, I have imported into Zotero, exported a .bib file, and imported that file into the other software.
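Before importing, a quick count can confirm that the .bib file holds as many entries as the prepared text file holds citations. The Python sketch below is one way to do that. It assumes the text file was prepared with one citation per line, and the function names and file names are hypothetical.

```python
import re

# An entry starts with "@type{"; these @-blocks are not bibliography entries.
ENTRY_START = re.compile(r"^\s*@(\w+)\s*\{", re.MULTILINE)
NON_ENTRY_TYPES = {"comment", "string", "preamble"}


def count_bib_entries(bib_text):
    """Count bibliography entries in bibtex text, skipping @comment, @string, @preamble."""
    return sum(
        1
        for match in ENTRY_START.finditer(bib_text)
        if match.group(1).lower() not in NON_ENTRY_TYPES
    )


def count_citations(prepared_text):
    """Count citations, assuming one citation per non-empty line."""
    return sum(1 for line in prepared_text.splitlines() if line.strip())


if __name__ == "__main__":
    # Hypothetical file names; replace with your own files.
    with open("citations.txt", encoding="utf-8") as f:
        n_text = count_citations(f.read())
    with open("citations.bib", encoding="utf-8") as f:
        n_bib = count_bib_entries(f.read())
    print(f"{n_text} citations in text, {n_bib} entries in .bib")
```

A mismatch usually means a citation was split, merged, or dropped during translation, and the affected entries can be fixed before loading the file into Zotero.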