DOI to BibTeX

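The Digital Object Identifier (DOI) resolution service at doi.org exposes an API for retrieving the BibTeX markup for a reference given its DOI: a request for the DOI's URL with the Accept header set to application/x-bibtex returns the BibTeX entry instead of redirecting to the publisher's page. The following Python 3 script takes a DOI on the command line and prints the corresponding BibTeX entry. For example,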

$ python doi2bib.py 10.1177/1470593113512323

@article{Avis_2013,
    doi = {10.1177/1470593113512323},
    url = {https://doi.org/10.1177%2F1470593113512323},
    year = 2013,
    month = {dec},
    publisher = {{SAGE} Publications},
    volume = {14},
    number = {4},
    pages = {451--475},
    author = {M. Avis and S. Forbes and S. Ferguson},
    title = {The brand personality of rocks: A critical evaluation of a brand personality scale},
    journal = {Marketing Theory}
}

The code:

import sys
import urllib.request
from urllib.error import HTTPError

BASE_URL = 'https://doi.org/'

try:
    doi = sys.argv[1]
except IndexError:
    print('Usage:\n{} <doi>'.format(sys.argv[0]))
    sys.exit(1)

url = BASE_URL + doi
req = urllib.request.Request(url)
req.add_header('Accept', 'application/x-bibtex')
try:
    with urllib.request.urlopen(req) as f:
        bibtex = f.read().decode()
    print(bibtex)
except HTTPError as e:
    if e.code == 404:
        print('DOI not found.')
    else:
        print('Service unavailable.')
    sys.exit(1)
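
If the lookup is needed from other code rather than from the command line, the same request can be wrapped in a function. The following is only a sketch under a few assumptions: the names build_request and fetch_bibtex are hypothetical, the 10-second timeout is an arbitrary choice, and the DOI is percent-encoded in case it contains characters that are not valid in a URL path.

```python
import sys
import urllib.parse
import urllib.request
from urllib.error import HTTPError, URLError

BASE_URL = 'https://doi.org/'

def build_request(doi, accept='application/x-bibtex'):
    """Return a Request asking doi.org for one representation of a DOI."""
    # Percent-encode anything unsafe in a URL path; '/' is left alone.
    url = BASE_URL + urllib.parse.quote(doi.strip(), safe='/')
    return urllib.request.Request(url, headers={'Accept': accept})

def fetch_bibtex(doi, timeout=10):
    """Return the BibTeX entry for doi as a string, or None on failure."""
    try:
        with urllib.request.urlopen(build_request(doi), timeout=timeout) as f:
            return f.read().decode('utf-8')
    except HTTPError as e:
        # HTTPError is a subclass of URLError, so it must be caught first.
        msg = 'DOI not found.' if e.code == 404 else 'Service unavailable.'
    except URLError as e:
        # Network-level failure, e.g. a DNS error or a refused connection.
        msg = 'Could not reach doi.org: {}'.format(e.reason)
    print(msg, file=sys.stderr)
    return None
```

Calling fetch_bibtex('10.1177/1470593113512323') should then return the entry shown above as a string, and None (with a message on standard error) if the lookup fails.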

Comments

Dr. Prateek Raj Gautam 2 years ago

Excellent work.
I faced one issue: the year entry in the bib file is not enclosed in curly brackets {},
so here is my modification, applied to the received string.

import urllib.request
from urllib.error import HTTPError

## define function
def doi2bib(DOI):
    ## DIR, Output and bibComment are module-level values set below
    BASE_URL = 'https://doi.org/'
    bibString = bibComment
    spacer = '\n\n'
    testStr = 'year = '
    for doi in DOI:
        url = BASE_URL + doi
        req = urllib.request.Request(url)
        req.add_header('Accept', 'application/x-bibtex')
        try:
            with urllib.request.urlopen(req) as f:
                bibtex = f.read().decode()
        except HTTPError as e:
            if e.code == 404:
                print(doi + ': DOI not found.')
            else:
                print('Service unavailable.')
            ## skip this DOI: there is no entry to append
            continue

        ## wrap a bare four-digit year in {}; braced years are left alone
        i = bibtex.find(testStr)
        if i != -1:
            split1 = i + len(testStr)
            split2 = split1 + 4
            if bibtex[split1:split2].isdigit():
                year = '{' + bibtex[split1:split2] + '}'
                bibtex = bibtex[:split1] + year + bibtex[split2:]

        bibString = bibString + spacer + bibtex

    ## write all entries to a single .bib file
    with open(DIR + Output + '.bib', 'w') as F:
        F.write(bibString)
    return bibString


## call function with following values
DIR = './'
Output = 'BibtexFromDOI'
bibComment = ''   ## text for the top of the file; not given originally, assumed empty
DOI = ['10.1109/tii.2019.2908437', '10.1049/iet-com.2019.1298']

bib = doi2bib(DOI)

christian 2 years ago

Interesting. Is it mandatory to have braces around a purely numeric value for the year, though?
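
For reference, BibTeX accepts a bare number as a field value, so year = 2013 is valid as it stands. If braces are wanted anyway, a regular-expression substitution is a shorter route than scanning the string character by character. A minimal sketch, assuming the year is a four-digit number in a field named year; brace_year is a hypothetical helper name:

```python
import re

def brace_year(bibtex):
    """Wrap a bare four-digit year in braces; braced years are left alone."""
    # \b before 'year' avoids matching inside longer field names;
    # the pattern only matches when digits follow the '=' directly.
    return re.sub(r'(\byear\s*=\s*)(\d{4})\b', r'\1{\2}', bibtex)
```

Applying it to an entry that already has a braced year returns the string unchanged, so it is safe to run on every response.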

Güray Hatipoğlu 4 months ago

It only partially retrieves an arXiv paper's metadata, then finishes with the following error:
""
c:\tex>Focus to learn more
'Focus' is not recognized as an internal or external command,
operable program or batch file.
""

For example, try this: 10.48550/arXiv.2304.00728
