Pygments
The full Pygments API
This page describes the Pygments API.
High-level API
Functions from the pygments module:
- def lex(code, lexer):
- Lex code with the lexer (must be a Lexer instance) and return an iterable of tokens. Currently, this only calls lexer.get_tokens().
- def format(tokens, formatter, outfile=None):
- Format the token stream tokens (an iterable of tokens) with the formatter (must be a Formatter instance). The result is written to outfile, or if that is None, returned as a string.
- def highlight(code, lexer, formatter, outfile=None):
- This is the most high-level highlighting function. It combines lex and format in one function.
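The relationship between the three functions can be sketched as follows. This is a minimal illustration, not part of the original reference; the choice of PythonLexer, HtmlFormatter and the sample code string is an assumption made for the example.

```python
import io

import pygments
from pygments.formatters import HtmlFormatter
from pygments.lexers import PythonLexer

code = 'print("hello")\n'  # sample input, chosen for illustration
lexer = PythonLexer()
formatter = HtmlFormatter()

# Two steps: lex() yields (tokentype, value) pairs, format() renders them.
tokens = list(pygments.lex(code, lexer))
two_step = pygments.format(tokens, formatter)

# One step: highlight() combines lex() and format().
one_step = pygments.highlight(code, lexer, formatter)

# With an outfile, the result is written to it instead of being returned.
buffer = io.StringIO()
returned = pygments.highlight(code, lexer, formatter, outfile=buffer)
```

Passing a file object as outfile is useful for large inputs, since the formatted text does not have to be held in a single string.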
Functions from pygments.lexers:
- def get_lexer_by_name(alias, **options):
Return an instance of a Lexer subclass that has alias in its aliases list. The lexer is given the options at its instantiation.
Will raise pygments.util.ClassNotFound if no lexer with that alias is found.
- def get_lexer_for_filename(fn, **options):
Return a Lexer subclass instance that has a filename pattern matching fn. The lexer is given the options at its instantiation.
Will raise pygments.util.ClassNotFound if no lexer for that filename is found.
- def get_lexer_for_mimetype(mime, **options):
Return a Lexer subclass instance that has mime in its mimetype list. The lexer is given the options at its instantiation.
Will raise pygments.util.ClassNotFound if no lexer for that mimetype is found.
- def guess_lexer(text, **options):
Return a Lexer subclass instance that's guessed from the text in text. For that, the analyse_text() method of every known lexer class is called with the text as argument, and the lexer which returned the highest value will be instantiated and returned.
pygments.util.ClassNotFound is raised if no lexer thinks it can handle the content.
- def guess_lexer_for_filename(filename, text, **options):
As guess_lexer(), but only lexers which have a pattern in filenames or alias_filenames that matches filename are taken into consideration.
pygments.util.ClassNotFound is raised if no lexer thinks it can handle the content.
- def get_all_lexers():
Return an iterable over all registered lexers, yielding tuples in the format:
(longname, tuple of aliases, tuple of filename patterns, tuple of mimetypes)
New in Pygments 0.6.
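A short sketch of the lexer lookup functions described above. The alias, filename, MIME type and sample text are assumptions picked for illustration; which lexer a guess returns depends on the installed Pygments version, so no specific result is claimed for the guessing call.

```python
from pygments.lexers import (
    get_all_lexers,
    get_lexer_by_name,
    get_lexer_for_filename,
    get_lexer_for_mimetype,
    guess_lexer,
)
from pygments.util import ClassNotFound

# Options given as keyword arguments are passed to the lexer constructor.
by_alias = get_lexer_by_name("python", stripall=True)
by_filename = get_lexer_for_filename("setup.py")
by_mimetype = get_lexer_for_mimetype("text/x-python")

# Guessing calls analyse_text() on the known lexer classes.
guessed = guess_lexer("#!/usr/bin/env python\nprint('hi')\n")

# Unknown names raise ClassNotFound rather than returning None.
lookup_error = None
try:
    get_lexer_by_name("no-such-alias")
except ClassNotFound as exc:
    lookup_error = str(exc)

# Each registry entry is a 4-tuple as described above.
longname, aliases, filename_patterns, mimetypes = next(iter(get_all_lexers()))
```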
Functions from pygments.formatters:
- def get_formatter_by_name(alias, **options):
Return an instance of a Formatter subclass that has alias in its aliases list. The formatter is given the options at its instantiation.
Will raise pygments.util.ClassNotFound if no formatter with that alias is found.
- def get_formatter_for_filename(fn, **options):
Return a Formatter subclass instance that has a filename pattern matching fn. The formatter is given the options at its instantiation.
Will raise pygments.util.ClassNotFound if no formatter for that filename is found.
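Formatter lookup works the same way as lexer lookup. The sketch below assumes the builtin "html" alias and the "*.tex" filename pattern of the LaTeX formatter; the output filename is hypothetical.

```python
from pygments.formatters import get_formatter_by_name, get_formatter_for_filename
from pygments.util import ClassNotFound

# Options are handed to the formatter at instantiation.
html_formatter = get_formatter_by_name("html", full=True, title="Example")

# The formatter is chosen from the filename pattern of the intended output file.
tex_formatter = get_formatter_for_filename("output.tex")

formatter_error = None
try:
    get_formatter_by_name("no-such-formatter")
except ClassNotFound as exc:
    formatter_error = str(exc)
```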
Functions from pygments.styles:
- def get_style_by_name(name):
Return a style class by its short name. The names of the builtin styles are listed in pygments.styles.STYLE_MAP.
Will raise pygments.util.ClassNotFound if no style of that name is found.
- def get_all_styles():
Return an iterable over all registered styles, yielding their names.
New in Pygments 0.6.
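As a brief illustration of the style functions, the sketch below looks up the builtin "default" style and hands the resulting class to a formatter through the style option. The name "nosuchstyle" is a made-up value used only to trigger the error.

```python
from pygments.formatters import HtmlFormatter
from pygments.style import Style
from pygments.styles import get_all_styles, get_style_by_name
from pygments.util import ClassNotFound

# get_style_by_name() returns a class, not an instance.
style_class = get_style_by_name("default")

# Formatters accept either a style name or a style class.
formatter = HtmlFormatter(style=style_class)

style_names = list(get_all_styles())

style_error = None
try:
    get_style_by_name("nosuchstyle")
except ClassNotFound as exc:
    style_error = str(exc)
```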
Lexers
A lexer (derived from pygments.lexer.Lexer) has the following functions:
- def __init__(self, **options):
The constructor. Takes a **keywords dictionary of options. Every subclass must first process its own options and then call the Lexer constructor, since it processes the stripnl, stripall and tabsize options.
An example looks like this:
    def __init__(self, **options):
        self.compress = options.get('compress', '')
        Lexer.__init__(self, **options)
As these options must all be specifiable as strings (due to the command line usage), there are various utility functions available to help with that, see Option processing.
- def get_tokens(self, text):
This method is the basic interface of a lexer. It is called by the highlight() function. It must process the text and return an iterable of (tokentype, value) pairs from text.
Normally, you don't need to override this method. The default implementation processes the stripnl, stripall and tabsize options and then yields all tokens from get_tokens_unprocessed(), with the index dropped.
- def get_tokens_unprocessed(self, text):
This method should process the text and return an iterable of (index, tokentype, value) tuples where index is the starting position of the token within the input text.
This method must be overridden by subclasses.
- def analyse_text(text):
- A static method which is called for lexer guessing. It should analyse the text and return a float in the range from 0.0 to 1.0. If it returns 0.0, the lexer will not be selected as the most probable one; if it returns 1.0, it will be selected immediately.
For a list of known tokens have a look at the Tokens page.
A lexer can also have the following attributes (in fact, all of them are mandatory except alias_filenames), which are used by the builtin lookup mechanism.
- name
- Full name for the lexer, in human-readable form.
- aliases
- A list of short, unique identifiers that can be used to look up the lexer from a list, e.g. using get_lexer_by_name().
- filenames
- A list of fnmatch patterns that match filenames which contain content for this lexer. The patterns in this list should be unique among all lexers.
- alias_filenames
- A list of fnmatch patterns that match filenames which may or may not contain content for this lexer. This list is used by the guess_lexer_for_filename() function to determine which lexers are then included in guessing the correct one. That means that e.g. every lexer for HTML and a template language should include *.html in this list.
- mimetypes
- A list of MIME types for content that can be lexed with this lexer.
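The methods and attributes above can be put together in a toy lexer. This is a minimal sketch under stated assumptions: the class HashCommentLexer, its alias, filename pattern and MIME type are all hypothetical, and the lexer is not registered with the builtin lookup mechanism. It treats lines starting with "#" as comments and everything else as plain text.

```python
from pygments.lexer import Lexer
from pygments.token import Comment, Text


class HashCommentLexer(Lexer):
    """Hypothetical lexer: '#' lines are comments, the rest is text."""

    name = "HashComment"
    aliases = ["hashcomment"]
    filenames = ["*.hashc"]
    alias_filenames = []
    mimetypes = ["text/x-hashcomment"]

    def get_tokens_unprocessed(self, text):
        # Yield (index, tokentype, value); index is the token's start offset.
        index = 0
        for line in text.splitlines(keepends=True):
            is_comment = line.lstrip().startswith("#")
            yield index, Comment.Single if is_comment else Text, line
            index += len(line)

    def analyse_text(text):
        # The Lexer metaclass turns this into a static method.
        return 0.5 if text.lstrip().startswith("#") else 0.0


lexer = HashCommentLexer()
source = "# note\nvalue\n"
unprocessed = list(lexer.get_tokens_unprocessed(source))
tokens = list(lexer.get_tokens(source))  # same tokens, index dropped
score = HashCommentLexer.analyse_text(source)
```

Only get_tokens_unprocessed() is overridden here; the inherited get_tokens() applies the stripnl, stripall and tabsize options before delegating to it.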
Formatters
A formatter (derived from pygments.formatter.Formatter) has the following functions:
- def __init__(self, **options):
As with lexers, this constructor processes options and then must call the base class __init__.
The Formatter class recognizes the options style, full and title. It is up to the formatter class whether it uses them.
- def get_style_defs(self, arg=''):
This method must return statements or declarations suitable to define the current style for subsequent highlighted text (e.g. CSS classes in the HTMLFormatter).
The optional argument arg can be used to modify the generation and is formatter dependent (it is standardized because it can be given on the command line).
This method is called by the -S command-line option, the arg is then given by the -a option.
- def format(self, tokensource, outfile):
This method must format the tokens from the tokensource iterable and write the formatted version to the file object outfile.
Formatter options can control how exactly the tokens are converted.
A formatter must have the following attributes that are used by the builtin lookup mechanism. (New in Pygments 0.7.)
- name
- Full name for the formatter, in human-readable form.
- aliases
- A list of short, unique identifiers that can be used to look up the formatter from a list, e.g. using get_formatter_by_name().
- filenames
- A list of fnmatch patterns that match filenames for which this formatter can produce output. The patterns in this list should be unique among all formatters.
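A formatter following this interface might look like the sketch below. The class TokenNameFormatter, its alias and its filename pattern are hypothetical names invented for the example; it writes one line per token with the token type and the repr of its value, and defines no style at all.

```python
import pygments
from pygments.formatter import Formatter
from pygments.token import Token


class TokenNameFormatter(Formatter):
    """Hypothetical formatter: one 'tokentype<TAB>repr(value)' line per token."""

    name = "Token names"
    aliases = ["tokennames"]
    filenames = ["*.tokens"]

    def __init__(self, **options):
        # The base class processes the style, full and title options.
        Formatter.__init__(self, **options)

    def get_style_defs(self, arg=""):
        # This output format has no style definitions.
        return ""

    def format(self, tokensource, outfile):
        for tokentype, value in tokensource:
            outfile.write(f"{tokentype}\t{value!r}\n")


formatter = TokenNameFormatter(title="demo")
token_stream = [(Token.Name, "x"), (Token.Text, "\n")]
output = pygments.format(token_stream, formatter)
```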
Option processing
The pygments.util module has some utility functions usable for option processing:
- class OptionError
- This exception will be raised by all option processing functions if the type or value of the argument is not correct.
- def get_bool_opt(options, optname, default=None):
Interpret the key optname from the dictionary options as a boolean and return it. Return default if optname is not in options.
The valid string values for True are 1, yes, true and on, the ones for False are 0, no, false and off (matched case-insensitively).
- def get_int_opt(options, optname, default=None):
- As get_bool_opt, but interpret the value as an integer.
- def get_list_opt(options, optname, default=None):
- If the key optname from the dictionary options is a string, split it at whitespace and return it. If it is already a list or a tuple, it is returned as a list.
- def get_choice_opt(options, optname, allowed, default=None):
- If the key optname from the dictionary options is not in the sequence allowed, raise an OptionError; otherwise return it. New in Pygments 0.8.
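The helpers above are typically applied to the options dictionary a lexer or formatter receives. A small sketch, assuming an options dictionary with made-up keys and string values as they would arrive from the command line:

```python
from pygments.util import (
    OptionError,
    get_bool_opt,
    get_choice_opt,
    get_int_opt,
    get_list_opt,
)

# Hypothetical options, all given as strings.
options = {"linenos": "yes", "tabsize": "4", "filters": "a b", "mode": "fast"}

linenos = get_bool_opt(options, "linenos", False)
tabsize = get_int_opt(options, "tabsize", 0)
filters = get_list_opt(options, "filters", [])
mode = get_choice_opt(options, "mode", ["fast", "slow"], "slow")

# A missing key falls back to the default.
missing = get_bool_opt(options, "missing", False)

# An unrecognized value raises OptionError.
option_error = None
try:
    get_bool_opt({"linenos": "maybe"}, "linenos")
except OptionError as exc:
    option_error = str(exc)
```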