
.. -*- mode: rst -*-

========================
Write your own formatter
========================

As well as creating :doc:`your own lexer <lexerdevelopment>`, writing a new
formatter for Pygments is easy and straightforward.

A formatter is a class that is initialized with some keyword arguments (the
formatter options) and that must provide a `format()` method.
Additionally, a formatter should provide a `get_style_defs()` method that
returns the style definitions from the style in a form usable for the
formatter's output format.



Quickstart
==========

The most basic formatter shipped with Pygments is the `NullFormatter`. It just
sends the value of a token to the output stream:

.. sourcecode:: python

    from pygments.formatter import Formatter

    class NullFormatter(Formatter):
        def format(self, tokensource, outfile):
            for ttype, value in tokensource:
                outfile.write(value)

As you can see, the `format()` method is passed two parameters: `tokensource`
and `outfile`. The first is an iterable of ``(token_type, value)`` tuples,
the latter a file-like object with a `write()` method.

Because the formatter is that basic, it doesn't override the `get_style_defs()`
method.

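
To see the `format()` contract in isolation, the following sketch feeds a
hand-built token stream to a formatter-like class and captures the output in
an `io.StringIO`. It deliberately avoids importing Pygments so that it runs
anywhere: `PlainNullFormatter` and the string token types are stand-ins
invented for this illustration, not Pygments names. With Pygments installed,
the token stream would come from a lexer instead.

```python
import io


class PlainNullFormatter:
    """Stand-in for a Formatter subclass (hypothetical, no Pygments needed)."""

    def format(self, tokensource, outfile):
        # Ignore the token type and write each value unchanged.
        for ttype, value in tokensource:
            outfile.write(value)


# Hand-built (token_type, value) pairs. Real token types are Token objects;
# plain strings are used here only as placeholders.
tokens = [
    ("Keyword", "print"),
    ("Punctuation", "("),
    ("String", "'hi'"),
    ("Punctuation", ")"),
    ("Text", "\n"),
]

outfile = io.StringIO()
PlainNullFormatter().format(tokens, outfile)
result = outfile.getvalue()
```

Any object with a `write()` method works as `outfile`, which is why an
in-memory buffer is enough for experimenting.
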


Styles
======

Styles aren't instantiated, but their metaclass provides some class functions
so that you can access the style definitions easily.

Styles are iterable and yield tuples in the form ``(ttype, d)`` where `ttype`
is a token and `d` is a dict with the following keys:

``'color'``
    Hexadecimal color value (e.g. ``'ff0000'`` for red) or `None` if not
    defined.

``'bold'``
    `True` if the value should be bold.

``'italic'``
    `True` if the value should be italic.

``'underline'``
    `True` if the value should be underlined.

``'bgcolor'``
    Hexadecimal color value for the background (e.g. ``'eeeeee'`` for light
    gray) or `None` if not defined.

``'border'``
    Hexadecimal color value for the border (e.g. ``'0000aa'`` for a dark
    blue) or `None` for no border.

Additional keys might appear in the future; formatters should ignore all keys
they don't support.

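
As a sketch of how a formatter might consume these entries while ignoring
keys it does not know, the snippet below turns each style dict into a CSS
declaration string. The `STYLE_ENTRIES` data, the string token names, the
`'future_key'` entry, and the `to_css` helper are all made up for
illustration; in Pygments the pairs would come from iterating a style class.

```python
def to_css(d):
    """Translate one style dict into a CSS declaration string.

    Only the documented keys are read, so unknown keys are ignored.
    """
    parts = []
    if d.get('color'):
        parts.append('color: #%s' % d['color'])
    if d.get('bgcolor'):
        parts.append('background-color: #%s' % d['bgcolor'])
    if d.get('bold'):
        parts.append('font-weight: bold')
    if d.get('italic'):
        parts.append('font-style: italic')
    if d.get('underline'):
        parts.append('text-decoration: underline')
    if d.get('border'):
        parts.append('border: 1px solid #%s' % d['border'])
    return '; '.join(parts)


# Hypothetical (ttype, d) pairs; 'future_key' simulates a key added later.
STYLE_ENTRIES = [
    ('Keyword', {'color': '0000aa', 'bold': True, 'italic': False,
                 'underline': False, 'bgcolor': None, 'border': None,
                 'future_key': 'ignored'}),
    ('Comment', {'color': '888888', 'bold': False, 'italic': True,
                 'underline': False, 'bgcolor': None, 'border': None}),
]

css = {ttype: to_css(d) for ttype, d in STYLE_ENTRIES}
```

Reading keys by name, rather than unpacking the whole dict, is what keeps
such code working when new keys appear.
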


HTML 3.2 Formatter
==================

For a more complex example, let's implement an HTML 3.2 formatter. We don't
use CSS but inline markup (``<u>``, ``<font>``, etc). Because this isn't good
style, this formatter isn't in the standard library ;-)

.. sourcecode:: python

    from pygments.formatter import Formatter

    class OldHtmlFormatter(Formatter):

        def __init__(self, **options):
            Formatter.__init__(self, **options)

            # create a dict of (start, end) tuples that wrap the
            # value of a token so that we can use it in the format
            # method later
            self.styles = {}

            # iterating over the style class yields (token, style dict)
            # pairs that contain the parsed style values.
            for token, style in self.style:
                start = end = ''
                # each style is a dict with the keys described above;
                # colors are readily specified in hex: 'RRGGBB'
                if style['color']:
                    start += '<font color="#%s">' % style['color']
                    end = '</font>' + end
                if style['bold']:
                    start += '<b>'
                    end = '</b>' + end
                if style['italic']:
                    start += '<i>'
                    end = '</i>' + end
                if style['underline']:
                    start += '<u>'
                    end = '</u>' + end
                self.styles[token] = (start, end)

        def format(self, tokensource, outfile):
            # lastval is a string we use for caching
            # because it's possible that a lexer yields a number
            # of consecutive tokens with the same token type.
            # to minimize the size of the generated html markup we
            # try to join the values of same-type tokens here
            lastval = ''
            lasttype = None

            # wrap the whole output with <pre>
            outfile.write('<pre>')

            for ttype, value in tokensource:
                # if the token type doesn't exist in the stylemap
                # we try it with the parent of the token type
                # eg: parent of Token.Literal.String.Double is
                # Token.Literal.String
                while ttype not in self.styles:
                    ttype = ttype.parent
                if ttype == lasttype:
                    # the current token type is the same as in the last
                    # iteration. cache it
                    lastval += value
                else:
                    # not the same token as last iteration, but we
                    # have some data in the buffer. wrap it with the
                    # defined style and write it to the output file
                    if lastval:
                        stylebegin, styleend = self.styles[lasttype]
                        outfile.write(stylebegin + lastval + styleend)
                    # set lastval/lasttype to current values
                    lastval = value
                    lasttype = ttype

            # if something is left in the buffer, write it to the
            # output file, then close the opened <pre> tag
            if lastval:
                stylebegin, styleend = self.styles[lasttype]
                outfile.write(stylebegin + lastval + styleend)
            outfile.write('</pre>\n')

The comments should explain it. Again, this formatter doesn't override the
`get_style_defs()` method. If we had used CSS classes instead of
inline HTML markup, we would need to generate the CSS first. The
`get_style_defs()` method exists for that purpose.

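
The buffering done with `lastval` and `lasttype` in the formatter can also be
expressed with `itertools.groupby`, which groups consecutive items that share
a key. This is only an alternative sketch, not how the example formatter or
Pygments itself does it; the `merge_tokens` helper and the string token types
are assumptions made for illustration.

```python
from itertools import groupby
from operator import itemgetter


def merge_tokens(tokensource):
    """Yield (ttype, value) pairs with consecutive same-type values joined."""
    # groupby only groups *adjacent* items with an equal key, which matches
    # the "same token type as the last iteration" check in the formatter.
    for ttype, group in groupby(tokensource, key=itemgetter(0)):
        yield ttype, ''.join(value for _, value in group)


# Placeholder token stream; real token types are Token objects.
tokens = [('Name', 'foo'), ('Name', 'bar'), ('Text', ' '), ('Name', 'baz')]
merged = list(merge_tokens(tokens))
```

In a real formatter the parent-type fallback would have to run before the
grouping, so that tokens mapped to the same styled parent are merged too.
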


Generating Style Definitions
============================

Some formatters like the `LatexFormatter` and the `HtmlFormatter` don't
output inline markup but reference either macros or CSS classes. Because
the definitions of those are not part of the output, the `get_style_defs()`
method exists. It is passed one parameter (whether and how it is used
is up to the formatter) and has to return a string or ``None``.

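
A minimal sketch of what such a method could look like for a CSS-based
formatter follows. Everything in it is illustrative: `CssSketchFormatter`,
its `rules` mapping of CSS class names to declarations, and the choice to
treat the parameter as a selector prefix are assumptions, not the behavior of
any particular shipped formatter.

```python
class CssSketchFormatter:
    """Hypothetical formatter-like class; does not depend on Pygments."""

    def __init__(self, rules):
        # rules maps a CSS class name to a declaration string.
        self.rules = rules

    def get_style_defs(self, arg=''):
        # Treat the single parameter as an optional selector prefix.
        prefix = arg + ' ' if arg else ''
        lines = ['%s.%s { %s }' % (prefix, name, decl)
                 for name, decl in sorted(self.rules.items())]
        # The definitions are returned as one string, as described above.
        return '\n'.join(lines)


formatter = CssSketchFormatter({'k': 'font-weight: bold',
                                'c': 'color: #888888'})
style_defs = formatter.get_style_defs('.highlight')
```

The returned string would then be written to a stylesheet separately from the
highlighted output that references those class names.
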