Bibulous is a drop-in replacement for BibTeX, with the primary advantage that the bibliography template format is compact and very easy to modify.
The basic program flow is as follows:
If the input is a filename ending in ‘.aux’, then read through the .aux file and locate the lines \bibdata{...} and \bibstyle{...} to get the filename(s) for the bibliography database and style template.
If the input is a list of filenames, then assume that this is the complete list of files to use.
Parameters : | filename : str
|
---|---|
Returns : | filedict : dict
|
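As a rough illustration of the .aux case, the sketch below collects the names listed in \bibdata{...} and \bibstyle{...} lines. The function name `read_aux_filenames` and the shape of the returned dictionary are assumptions made for this example, not Bibulous's actual interface.

```python
import re

def read_aux_filenames(aux_text):
    # Hypothetical helper: gather database and style names from .aux text.
    filedict = {'bib': [], 'bst': []}
    for line in aux_text.splitlines():
        m = re.match(r'\\(bibdata|bibstyle)\{([^}]*)\}', line.strip())
        if m is None:
            continue
        key = 'bib' if m.group(1) == 'bibdata' else 'bst'
        # Several names may appear as a comma-separated list.
        filedict[key].extend(n.strip() for n in m.group(2).split(',') if n.strip())
    return filedict
```

The sketch leaves file extensions and paths untouched; resolving them is a separate step.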
Reduce the case of the string to lower case, except for the first character in the string, and except if any given character is at nonzero brace level.
Parameters : | s : str
|
---|---|
Returns : | t : str
|
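A minimal sketch of this brace-aware lower-casing, assuming that only curly braces define the brace level. The name `sentence_case_sketch` is hypothetical.

```python
def sentence_case_sketch(s):
    # Lower-case every character except the first one and anything inside braces.
    out = []
    level = 0
    for i, ch in enumerate(s):
        if ch == '{':
            level += 1
        elif ch == '}':
            level = max(level - 1, 0)
        if i == 0 or level > 0:
            out.append(ch)          # protected: keep the original case
        else:
            out.append(ch.lower())
    return ''.join(out)
```

For example, `sentence_case_sketch('The {NASA} Mission To Mars')` keeps the braced acronym and lower-cases the remaining words.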
Split a string into tokens, taking care not to allow the separator to act unless at brace level zero.
Parameters : | s : str
|
---|---|
Returns : | tokens : list of str
|
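The idea can be sketched as follows, assuming curly braces are the only grouping characters and that the separator is a plain string. The name `split_at_level_zero` and the default separator are illustrative choices.

```python
def split_at_level_zero(s, sep=' and '):
    # Split s on sep, but only where the brace level is zero.
    tokens = []
    level = 0
    start = 0
    i = 0
    while i < len(s):
        ch = s[i]
        if ch == '{':
            level += 1
        elif ch == '}':
            level -= 1
        elif level == 0 and s.startswith(sep, i):
            tokens.append(s[start:i])
            i += len(sep)
            start = i
            continue
        i += 1
    tokens.append(s[start:])
    return tokens
```

This is what keeps a braced corporate author such as `{Barnes and Noble}` from being split into two people.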
A generator replacement for re.finditer() that does not use regular expressions.
Parameters : | a_str : str
sub : str
|
---|
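One way to write such a generator is with `str.find`, as in the sketch below. It yields the start index of each non-overlapping match; whether the real function reports overlapping matches is not stated here, so that behaviour is an assumption. The name `find_all` is hypothetical.

```python
def find_all(a_str, sub):
    # Yield the start index of each non-overlapping occurrence of sub in a_str.
    if not sub:
        return                      # avoid an endless loop on an empty pattern
    start = 0
    while True:
        start = a_str.find(sub, start)
        if start == -1:
            return
        yield start
        start += len(sub)
```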
Parse a name field (“author” or “editor”) of a BibTeX entry into a list of dicts, one for each person.
Parameters : | namefield : str
key : str
nameabbrev : dict
disable : list of int, optional
|
---|---|
Returns : | namelist : list
|
Convert a name dictionary into a formatted name string.
Parameters : | namedict : dict
options : dict, optional
use_firstname_initials : bool
namelist_format : str
nameabbrev : list of str
|
---|---|
Returns : | namestr : str
|
Convert an input name element (first, middle, prefix, last, or suffix) to its initials.
Parameters : | name : str
options : dict, optional
|
---|---|
Returns : | new_name : str
|
Generate a list of level numbers for each character in a string.
Parameters : | s : str
ldelim : str
rdelim : str
is_regex : bool
|
---|---|
Returns : | oplevels : list of ints
|
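A small sketch of level counting for single-character delimiters. The convention used here (each delimiter is reported at the level it opens or closes) is an assumption, and the regex mode selected by `is_regex` is not covered.

```python
def delim_levels(s, ldelim='{', rdelim='}'):
    # Return one nesting level per character of s.
    levels = []
    level = 0
    for ch in s:
        if ch == ldelim:
            level += 1
            levels.append(level)    # opening delimiter belongs to the new level
        elif ch == rdelim:
            levels.append(level)    # closing delimiter belongs to the level it ends
            level = max(level - 1, 0)
        else:
            levels.append(level)
    return levels
```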
Return a list which gives the “quotation level” of each character in the string.
Parameters : | s : str
disable : list of int, optional
|
---|---|
Returns : | alevels : list
blevels : list
clevels : list
|
Notes
When using double-quotes, it is easy to break the parser, so they should be used only sparingly.
Split a string at locations given by a list of indices.
This can be used more flexibly than Python’s native string split() function, when the character you are splitting on is not always a valid splitting location.
Parameters : | s : str
ilist : list
|
---|---|
Returns : | slist : list of str
|
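For illustration, the sketch below splits at a list of indices and drops the character found at each index. Whether that character is dropped or kept in the real function is an assumption, as is the name `split_at_indices`.

```python
def split_at_indices(s, ilist):
    # Cut s at every index in ilist, discarding the character at that index.
    pieces = []
    start = 0
    for i in sorted(ilist):
        pieces.append(s[start:i])
        start = i + 1
    pieces.append(s[start:])
    return pieces
```

Combined with a brace-level scan, the caller can pass only those comma positions that sit at brace level zero.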
Split a string using more than one separator.
Copied from http://stackoverflow.com/questions/1059559/python-strings-split-with-multiple-separators.
Parameters : | s : str
sep : list of str
|
---|---|
Returns : | res : list
|
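A minimal sketch of splitting on several separators in turn; the name `multisplit_sketch` is hypothetical and the code is not taken from the linked answer.

```python
def multisplit_sketch(s, seps):
    # Apply each separator in turn to every piece produced so far.
    res = [s]
    for sep in seps:
        pieces = []
        for chunk in res:
            pieces.extend(chunk.split(sep))
        res = pieces
    return res
```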
This function will return the input string if it finds there are no nested operators inside (i.e. when the number of delimiters found is < 2).
Parameters : | s : str
delims : tuple of two strings
odd_operator : str
even_operator : str
disable : list of int, optional
|
---|---|
Returns : | s : str
|
Find nested quotes within strings and, if necessary, replace them with the proper nesting (i.e. outer quotes use ``...'' while inner quotes use `...').
Parameters : | s : str
disable : list of int, optional
|
---|---|
Returns : | s : str
|
Remove the LaTeX-based formatting elements from a string so that a sorting function can use alphanumerical sorting on the string.
Parameters : | s : str
|
---|---|
Returns : | p : str
|
Notes
Currently purify_string() does not allow LaTeX markup such as \'i to refer to the Unicode character which is correctly written as \'\i. Add functionality to allow that?
Translate LaTeX-markup special characters to their Unicode equivalents.
Parameters : | s : str
|
---|---|
Returns : | s : str
|
From an “options train” [...|...|...], find the first fully defined block in the train.
A Bibulous-type bibliography style template string contains grammatical features called options trains, of the form [...|...|...]. Each “block” in the train (divided from the others by a | symbol) contains fields which, if defined, replace the entire options train in the returned string.
Parameters : | bst_template_str : str
variables : list of str
bibentry : dict
undefstr : str
|
---|---|
Returns : | arg : str
|
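The selection rule can be sketched as below. The `<name>` syntax for template variables, the function name `first_defined_block`, and the treatment of an empty block as always defined are assumptions for this example; nested options trains are not handled.

```python
import re

def first_defined_block(train, entry, undefstr='???'):
    # Strip the enclosing [ ] and split the train into its blocks.
    blocks = train.strip()[1:-1].split('|')
    for block in blocks:
        names = re.findall(r'<([^<>]+)>', block)
        # A block qualifies when every variable in it has a non-empty value.
        if all(entry.get(n) for n in names):
            return re.sub(r'<([^<>]+)>', lambda m: entry[m.group(1)], block)
    return undefstr
```

With an entry that has an author but no editor, `'[<editor>, ed.|<author>|]'` resolves to the author block; a train in which no block is fully defined yields `undefstr`.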
Take a BibTeX string representing a single person’s name and parse it into its first, middle, last, etc pieces.
BibTeX accepts three name formats: “First von Last”, “von Last, First”, and “von Last, Jr, First”. These three categories can be separated by counting the number of commas that appear.
Parameters : | namestr : str
disable : list of int, optional
|
---|---|
Returns : | namedict : dict
|
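The comma-counting step can be sketched as follows for BibTeX's three name forms (“First von Last”, “von Last, First”, “von Last, Jr, First”). This simplified version ignores brace protection, middle names, and “von” prefixes, and the name `split_name_by_commas` is hypothetical.

```python
def split_name_by_commas(namestr):
    # Decide the name form from the number of commas.
    parts = [p.strip() for p in namestr.split(',')]
    if len(parts) == 1:
        # "First Last": treat the final word as the last name.
        words = parts[0].split()
        first = ' '.join(words[:-1])
        last = words[-1] if words else ''
        return {'first': first, 'last': last, 'suffix': ''}
    if len(parts) == 2:
        # "Last, First"
        return {'first': parts[1], 'last': parts[0], 'suffix': ''}
    # "Last, Jr, First"
    return {'first': parts[2], 'last': parts[0], 'suffix': parts[1]}
```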
From the middle name of a single person, check if any of the names should be placed into the “prefix” and move them there.
Parameters : | namedict : dict
|
---|---|
Returns : | namedict : dict
|
Given a bibliography entry’s edition number, format it as an ordinal (i.e. “1st”, “2nd” instead of “1”, “2”) in the way that it will appear on the formatted page.
Parameters : | bibentry : dict
disable : list of int, optional
|
---|---|
Returns : | editionstr : str
|
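A sketch of English ordinal formatting, assuming the entry stores its edition under an `edition` key and that non-numeric editions (such as “Second”) are passed through unchanged. The name `edition_ordinal` is hypothetical.

```python
def edition_ordinal(bibentry):
    # Turn a numeric edition such as "2" into "2nd"; leave other text alone.
    edition = str(bibentry.get('edition', '')).strip()
    if not edition.isdigit():
        return edition
    n = int(edition)
    if 10 <= n % 100 <= 20:
        suffix = 'th'               # 11th, 12th, 13th, ...
    else:
        suffix = {1: 'st', 2: 'nd', 3: 'rd'}.get(n % 10, 'th')
    return edition + suffix
```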
Write a bibliography database dictionary into a .bib file.
Parameters : | filename : str
bibdata : dict
|
---|
Given a string containing the “pages” field of a bibliographic entry, figure out the start and end pages.
Parameters : | pages_str : str
citekey : str, optional
disable : list of int, optional
|
---|---|
Returns : | startpage : str
endpage : str
|
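A minimal sketch of page-range parsing, assuming the range separator is one or more hyphens or a Unicode dash. Abbreviated end pages (such as “1234--56”) and comma-separated lists are not handled; the name `parse_pages` is hypothetical.

```python
import re

def parse_pages(pages_str):
    # Split once on a run of hyphens, en dashes, or em dashes.
    parts = re.split(r'[-\u2013\u2014]+', pages_str.strip(), maxsplit=1)
    startpage = parts[0].strip()
    endpage = parts[1].strip() if len(parts) > 1 else ''
    return startpage, endpage
```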
Given a string containing either a single “name” > “abbreviation” pair or a list of such pairs, parse the string into a dictionary of names and abbreviations.
Parameters : | abbrevstr : str
|
---|---|
Returns : | nameabbrev_dict : dict
|
Given a key that matches an already-present key in the input dictionary, generate a new key by appending zeros to the key string.
Parameters : | sortkey : str
sortdict : dict
|
---|---|
Returns : | newkey : str
|
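The zero-appending rule is simple enough to sketch directly; the name `unique_sortkey` is hypothetical.

```python
def unique_sortkey(sortkey, sortdict):
    # Append zeros until the key no longer collides with an existing one.
    newkey = sortkey
    while newkey in sortdict:
        newkey += '0'
    return newkey
```

Because the zeros are appended at the end, the new key sorts immediately after the key it collided with.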
Remove elements from a Python script which provide the most egregious security flaws; also replace some identifiers with their correct namespace representation.
Parameters : | line : str
|
---|---|
Returns : | filtered : str
|
Check if an input string represents an integer value. Although a trivial function, it will be useful for user scripts.
Parameters : | s : str
|
---|---|
Returns : | is_integer : bool
|
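One common way to make this check is to attempt the conversion and catch the failure, as sketched below under the hypothetical name `looks_like_integer`.

```python
def looks_like_integer(s):
    # True when int() accepts the string, False otherwise.
    try:
        int(s)
    except (TypeError, ValueError):
        return False
    return True
```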
Print a warning message, with the option to disable any given message.
Parameters : | msg : str
disable : list of int, optional
|
---|
Bibdata is a class to hold all data related to a bibliography database, a citation list, and a style template.
To initialize the class, either call it with the filename of the “.aux” file containing the relevant file locations (for the “.bib” database files and the “.bst” template files) or simply call it with a list of all filenames to be used (“.bib”, “.bst” and “.aux”). The output file (the LaTeX-formatted bibliography) is assumed to have the same filename root as the “.aux” file, but with “.bbl” as its extension.
Attributes
abbrevs | dict | The list of abbreviations given in the bibliography database file(s). The dictionary keys are the abbreviations, and the values are their full forms. |
bibdata | dict | The database of bibliography entries and fields derived from parsing the bibliography database file(s). |
bstdict | dict | The style template for formatting the bibliography. The dictionary keys are the entrytypes, with the dictionary values their string template. |
citedict | dict | The dictionary of citation keys and their corresponding numerical order of citation. |
debug | bool | Whether to turn on debugging features. |
filedict | dict | The dictionary of filenames associated with the bibliographic data. The dictionary consists of keys bib, bst, aux, tex, and bbl. The first two are lists of filenames, while the others contain only a single filename. |
filename | str | (For error messages and debugging) The name of the file currently being parsed. |
i | int | (For error messages and debugging) The line of the file currently being parsed. |
options | dict | The dictionary containing the various option settings from the style template (BST) files. |
abbrevkey_pattern | compiled regular expression object | The regex used to search for abbreviation keys. |
anybrace_pattern | compiled regular expression object | The regex used to search for curly braces { or }. |
anybraceorquote_pattern | compiled regular expression object | The regex used to search for curly braces or for double-quotes, i.e. {, }, or ". |
endbrace_pattern | compiled regular expression object | The regex used to search for an ending curly brace, i.e. ‘}’. |
quote_pattern | compiled regular expression object | The regex used to search for a double-quote, i.e. ". |
startbrace_pattern | compiled regular expression object | The regex used to search for a starting curly brace, {. |
Methods
parse_bibfile(filename) | Parse a “.bib” file to generate a dictionary representing a bibliography database. |
parse_bibentry(entrystr, entrytype) | Given a string representing the entire contents of the BibTeX-format bibliography entry, parse the contents and place them into the bibliography preamble string, the set of abbreviations, and the bibliography database dictionary. |
parse_bibfield(entrystr) | For a given string representing the raw contents of a BibTeX-format bibliography entry, parse the contents into a dictionary of key:value pairs corresponding to the field names and field values. |
parse_auxfile(filename[, debug]) | Read in an “.aux” file and convert the \citation{} entries found there into a dictionary of citekeys and citation order number. |
parse_bstfile(filename) | Convert a Bibulous-type bibliography style template into a dictionary. |
write_bblfile([filename, write_preamble, ...]) | Given a bibliography database, bibdata, a dictionary containing the citations called out, citedict, and a bibliography style template, bstdict, write the LaTeX-format file for the formatted bibliography. |
create_citation_list() | Create the list of citation keys, sorted into the proper order. |
format_bibitem(citekey[, debug]) | Create the “\bibitem{...}” string to insert into the “.bbl” file. |
generate_sortkey(citekey) | From a bibliography entry and the formatting template options, generate a sorting key for the entry. |
create_namelist(key, nametype) | Deconstruct the bibfile string following “author = ...” (or “editor = ...”), and create a new field authorlist or editorlist that is a list of dictionaries (one dict for each person). |
format_namelist(namelist, nametype) | Format a list of dictionaries (one dict for each person) into a long string, with the format according to the directives in the bibliography style template. |
insert_crossref_data(entrykey[, fieldname]) | Insert crossref info into a bibliography database dictionary. |
write_citeextract(outputfile[, debug]) | Extract a sub-database from a large bibliography database, with the former containing only those entries cited in the .aux file. |
write_authorextract(searchname[, ...]) | Extract a sub-database from a large bibliography database, with the former containing only those entries citing the given author/editor. |
replace_abbrevs_with_full(fieldstr, resultstr) | Given an input str, locate the abbreviation key within it and replace the abbreviation with its full form. |
Deconstruct the bibfile string following “author = ...” (or “editor = ...”), and create a new field authorlist or editorlist that is a list of dictionaries (one dict for each person).
Parameters : | key : str
nametype : str, {‘author’,’editor’}
|
---|
Create the “\bibitem{...}” string to insert into the “.bbl” file.
This is the workhorse function of Bibulous. For a given key, find the corresponding entry in the bibliography database. From the entry’s entrytype, look up the relevant template in bstdict and start replacing template variables with formatted elements of the database entry. Once all template variables have been replaced, formatting of that entry is complete.
This function is also where we compile any scripts present in the BST files.
Parameters : | citekey : str
|
---|---|
Returns : | itemstr : str
|
Format a list of dictionaries (one dict for each person) into a long string, with the format according to the directives in the bibliography style template.
Parameters : | namelist : list of dict
nametype : str, {‘author’, ‘editor’}
|
---|---|
Returns : | namestr : str
|
From a bibliography entry and the formatting template options, generate a sorting key for the entry.
Parameters : | citekey : str
|
---|---|
Returns : | sortkey : str
|
Insert crossref info into a bibliography database dictionary.
Loop through a bibliography database dictionary and, for each entry which has a “crossref” field, locate the crossref entry and insert any missing bibliographic information into the main entry’s fields.
Parameters : | entrykey : str
fieldname : str, optional
|
---|---|
Returns : | foundit : bool
|
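The per-entry step can be sketched as below: fields missing from the citing entry are copied from the entry named in its “crossref” field. The name `fill_from_crossref`, the in-place update, and the meaning of the returned flag (whether the cross-referenced entry was located) are assumptions for this example.

```python
def fill_from_crossref(bibdata, entrykey):
    # Copy fields from the cross-referenced entry without overwriting existing ones.
    entry = bibdata[entrykey]
    parent = bibdata.get(entry.get('crossref'))
    if parent is None:
        return False                # no crossref field, or its target is missing
    for field, value in parent.items():
        entry.setdefault(field, value)
    return True
```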
Read in an “.aux” file and convert the \citation{} entries found there into a dictionary of citekeys and citation order number.
Parameters : | filename : str
|
---|
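A rough sketch of building the citation-order dictionary from .aux text. Numbering from 1 and keeping the first occurrence of a repeated key are assumptions, as is the name `citation_order`.

```python
import re

def citation_order(aux_text):
    # Map each citekey to the order in which it is first cited.
    citedict = {}
    for keys in re.findall(r'\\citation\{([^}]*)\}', aux_text):
        # A single \citation{} may hold several comma-separated keys.
        for key in keys.split(','):
            key = key.strip()
            if key and key not in citedict:
                citedict[key] = len(citedict) + 1
    return citedict
```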
Given a string representing the entire contents of the BibTeX-format bibliography entry, parse the contents and place them into the bibliography preamble string, the set of abbreviations, and the bibliography database dictionary.
Parameters : | entrystr : str
entrytype : str
|
---|
For a given string representing the raw contents of a BibTeX-format bibliography entry, parse the contents into a dictionary of key:value pairs corresponding to the field names and field values.
Parameters : | entrystr : str
|
---|
Parse a “.bib” file to generate a dictionary representing a bibliography database.
Parameters : | filename : str
|
---|
Convert a Bibulous-type bibliography style template into a dictionary.
The resulting dictionary consists of keys which are the various entrytypes, and values which are the template strings. In addition, any formatting options are stored in the “options” key as a dictionary of option_name:option_value pairs.
Parameters : | filename : str
|
---|
Given an input str, locate the abbreviation key within it and replace the abbreviation with its full form.
Once the abbreviation key is found, remove it from the “fieldstr” and add the full form to the “resultstr”.
Parameters : | fieldstr : str
resultstr : str
|
---|---|
Returns : | fieldstr : str
resultstr : str
end_of_field : bool
|
Extract a sub-database from a large bibliography database, with the former containing only those entries citing the given author/editor.
Parameters : | searchname : str or dict
outputfile : str, optional
|
---|
Given a bibliography database, bibdata, a dictionary containing the citations called out, citedict, and a bibliography style template, bstdict, write the LaTeX-format file for the formatted bibliography.
Start with the preamble, then loop over the citations, formatting each entry in turn, and put \end{thebibliography} at the end when done.
Parameters : | filename : str, optional
write_preamble : bool, optional
write_postamble : bool, optional
bibsize : str, optional
|
---|
Extract a sub-database from a large bibliography database, with the former containing only those entries cited in the .aux file.
Parameters : | filedict : str
outputfile : str, optional
|
---|