gobyexample/vendor/pygments/doc/docs/formatterdevelopment.rst
2016-12-27 08:10:47 -08:00


.. -*- mode: rst -*-

========================
Write your own formatter
========================

As well as creating :doc:`your own lexer <lexerdevelopment>`, writing a new
formatter for Pygments is easy and straightforward.

A formatter is a class that is initialized with some keyword arguments (the
formatter options) and that must provide a `format()` method.
Additionally, a formatter should provide a `get_style_defs()` method that
returns the style definitions from the style in a form usable for the
formatter's output format.

Quickstart
==========

The most basic formatter shipped with Pygments is the `NullFormatter`. It just
sends the value of a token to the output stream:

.. sourcecode:: python

    from pygments.formatter import Formatter

    class NullFormatter(Formatter):
        def format(self, tokensource, outfile):
            for ttype, value in tokensource:
                outfile.write(value)

As you can see, the `format()` method is passed two parameters: `tokensource`
and `outfile`. The first is an iterable of ``(token_type, value)`` tuples,
the latter a file-like object with a `write()` method.

Because the formatter is that basic, it doesn't override the `get_style_defs()`
method.
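
As a rough usage sketch (assuming Pygments is installed; the sample source
string and the hand-made token list are made up for illustration), such a
formatter can be passed to `highlight()`, or its `format()` method can be
called directly with any iterable of ``(token_type, value)`` tuples:

.. sourcecode:: python

    import io

    from pygments import highlight
    from pygments.formatter import Formatter
    from pygments.lexers import PythonLexer
    from pygments.token import Token


    class NullFormatter(Formatter):
        def format(self, tokensource, outfile):
            for ttype, value in tokensource:
                outfile.write(value)


    code = 'print("Hello World")\n'

    # highlight() runs the lexer and hands the token stream to format();
    # without an output file it returns the formatted text
    result = highlight(code, PythonLexer(), NullFormatter())

    # format() can also be called directly with hand-made tokens and
    # any object that has a write() method
    buf = io.StringIO()
    NullFormatter().format([(Token.Name, 'x'), (Token.Text, '\n')], buf)

Because the formatter writes every token value unchanged, ``result`` should
contain the original source text again.
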

Styles
======

Styles aren't instantiated, but their metaclass provides some class functions
so that you can access the style definitions easily.

Styles are iterable and yield tuples in the form ``(ttype, d)`` where `ttype`
is a token and `d` is a dict with the following keys:

``'color'``
    Hexadecimal color value (e.g. ``'ff0000'`` for red) or `None` if not
    defined.

``'bold'``
    `True` if the value should be bold

``'italic'``
    `True` if the value should be italic

``'underline'``
    `True` if the value should be underlined

``'bgcolor'``
    Hexadecimal color value for the background (e.g. ``'eeeeee'`` for light
    gray) or `None` if not defined.

``'border'``
    Hexadecimal color value for the border (e.g. ``'0000aa'`` for a dark
    blue) or `None` for no border.

Additional keys might appear in the future; formatters should ignore all keys
they don't support.
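
A minimal sketch of walking over a style, assuming Pygments is installed and
using its built-in ``'default'`` style as an arbitrary example. Only the keys
described above are read, so additional keys are ignored automatically:

.. sourcecode:: python

    from pygments.styles import get_style_by_name

    # get_style_by_name() returns the style class itself, not an instance
    style = get_style_by_name('default')

    for ttype, d in style:
        # skip token types for which the style defines nothing visible
        if not (d['color'] or d['bold'] or d['italic'] or d['underline']):
            continue
        print(ttype, d['color'], d['bold'], d['italic'], d['underline'])
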

HTML 3.2 Formatter
==================

For a more complex example, let's implement an HTML 3.2 formatter. We don't
use CSS but inline markup (``<u>``, ``<font>``, etc). Because this isn't good
style, this formatter isn't in the standard library ;-)

.. sourcecode:: python

    from pygments.formatter import Formatter

    class OldHtmlFormatter(Formatter):

        def __init__(self, **options):
            Formatter.__init__(self, **options)

            # create a dict of (start, end) tuples that wrap the
            # value of a token so that we can use it in the format
            # method later
            self.styles = {}

            # iterating over the style yields (token, style) pairs,
            # where style is a dict of the parsed style values
            for token, style in self.style:
                start = end = ''
                # colors are specified in hex without a leading '#':
                # 'RRGGBB'
                if style['color']:
                    start += '<font color="#%s">' % style['color']
                    end = '</font>' + end
                if style['bold']:
                    start += '<b>'
                    end = '</b>' + end
                if style['italic']:
                    start += '<i>'
                    end = '</i>' + end
                if style['underline']:
                    start += '<u>'
                    end = '</u>' + end
                self.styles[token] = (start, end)

        def format(self, tokensource, outfile):
            # lastval is a string we use for caching
            # because it's possible that a lexer yields a number
            # of consecutive tokens with the same token type.
            # to minimize the size of the generated html markup we
            # try to join the values of same-type tokens here
            lastval = ''
            lasttype = None

            # wrap the whole output with <pre>
            outfile.write('<pre>')

            for ttype, value in tokensource:
                # if the token type doesn't exist in the stylemap
                # we try it with the parent of the token type
                # eg: parent of Token.Literal.String.Double is
                # Token.Literal.String
                while ttype not in self.styles:
                    ttype = ttype.parent
                if ttype == lasttype:
                    # the current token type is the same as in the
                    # last iteration. cache it
                    lastval += value
                else:
                    # not the same token as last iteration, but we
                    # have some data in the buffer. wrap it with the
                    # defined style and write it to the output file
                    if lastval:
                        stylebegin, styleend = self.styles[lasttype]
                        outfile.write(stylebegin + lastval + styleend)
                    # set lastval/lasttype to current values
                    lastval = value
                    lasttype = ttype

            # if something is left in the buffer, write it to the
            # output file, then close the opened <pre> tag
            if lastval:
                stylebegin, styleend = self.styles[lasttype]
                outfile.write(stylebegin + lastval + styleend)
            outfile.write('</pre>\n')

The comments should explain it. Again, this formatter doesn't override the
`get_style_defs()` method. If we had used CSS classes instead of
inline HTML markup, we would need to generate the CSS first. For that
purpose the `get_style_defs()` method exists.

Generating Style Definitions
============================

Some formatters like the `LatexFormatter` and the `HtmlFormatter` don't
output inline markup but reference either macros or CSS classes. Because
the definitions of those are not part of the output, the `get_style_defs()`
method exists. It is passed one parameter (whether and how it is used
is up to the formatter) and has to return a string or ``None``.
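
The following is only a sketch of what such a method could look like for a
CSS-based formatter; it is not how the `HtmlFormatter` is implemented. The
class name `CssSketchFormatter` is hypothetical, and treating the parameter
as a CSS selector prefix is an assumption made for this example. The short
class names are taken from the `STANDARD_TYPES` mapping in `pygments.token`:

.. sourcecode:: python

    from pygments.formatter import Formatter
    from pygments.token import STANDARD_TYPES


    class CssSketchFormatter(Formatter):
        """Hypothetical formatter; only get_style_defs() is sketched."""

        def get_style_defs(self, arg=''):
            # assumption of this sketch: `arg` is an optional CSS
            # selector that is put in front of every rule
            prefix = arg + ' ' if arg else ''
            lines = []
            for ttype, d in self.style:
                # STANDARD_TYPES maps token types to short class names;
                # token types without a short name are skipped
                css_class = STANDARD_TYPES.get(ttype)
                if not css_class:
                    continue
                rules = []
                if d['color']:
                    rules.append('color: #%s' % d['color'])
                if d['bgcolor']:
                    rules.append('background-color: #%s' % d['bgcolor'])
                if d['bold']:
                    rules.append('font-weight: bold')
                if d['italic']:
                    rules.append('font-style: italic')
                if d['underline']:
                    rules.append('text-decoration: underline')
                if rules:
                    lines.append('%s.%s { %s }'
                                 % (prefix, css_class, '; '.join(rules)))
            return '\n'.join(lines)


    css = CssSketchFormatter(style='default').get_style_defs('.highlight')

The returned string is meant to be placed in a stylesheet next to the
formatted output; a matching `format()` method would then wrap each token
value in an element carrying the same class name.
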