This document is a guide to the texopic Python module. It explains what to expect from the module and how to use it.
More details about the language can be found on the front page.
The following code emits every token and character that texopic finds in the stream.
```python
import texopic

for token in texopic.read_file("index.text"):
    print token
```
This is the most primitive part of texopic. It doesn't load other modules by design. It allows you to implement completely custom logic over the language.
There are several helpers you should use though, because they help a great deal in generating texopic documents.
The generic parts of the document generator can be found in the texopic.generic module. The key piece in this system is the environment object.
The following program forms a generator from the generic module. We go through it piece by piece.
```python
import texopic
from texopic.generic import Env, verbatim

def main():
    group = texopic.read_file("guide.text")
    for line in env.vcall(group, document=None):
        print line

env = Env()

@env.define("paragraph")
def env_paragraph(context, group):
    context.emit(verbatim(group))

@env.define("preformat")
def env_preformat(context, string):
    context.emit(string)

if __name__=='__main__':
    main()
```
The verbatim function returns a string that is as close to the verbatim version of the input group as it can be. It is meant for retrieving URLs from the macro groups.
Internally it is used to retrieve the identifier inside #begin and #end clauses.
The environment object, Env, works as a namespace for the generator. Environment objects can be stacked to create namespace cakes for customizing generators.
Env.define can be used to define functions in the namespace. At minimum the namespace must contain the "paragraph" and "preformat" functions. The rest of it describes behavior for macros.
Functions labelled like #macroname and :macroname let the user customize the behavior of the respective macro or #begin/#end block. This is explained further in Customizing macros.
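The stacking idea can be sketched without texopic itself. The class below, SketchEnv, is a hypothetical stand-in written for this guide and not the library's Env: it only shows how a define decorator could register functions and how a lookup could fall through to a parent environment. The optional arity argument mirrors calls such as env.define("#italic", 1) used elsewhere in this guide; the lookup method is an assumption.

```python
class SketchEnv(object):
    """Hypothetical stand-in for Env; not texopic's implementation."""

    def __init__(self, parent=None):
        self.parent = parent
        self.table = {}

    def define(self, name, arity=None):
        # Returns a decorator that stores the function under `name`.
        def decorator(func):
            self.table[name] = (func, arity)
            return func
        return decorator

    def lookup(self, name):
        # Walk from the topmost layer of the cake towards the base layer.
        env = self
        while env is not None:
            if name in env.table:
                return env.table[name]
            env = env.parent
        raise KeyError(name)


base = SketchEnv()

@base.define("paragraph")
def base_paragraph(context, group):
    return "<p>%s</p>" % group

@base.define("#bold", 1)
def base_bold(context, group):
    return "<b>%s</b>" % group

# A second layer overrides one entry and inherits the rest.
custom = SketchEnv(base)

@custom.define("#bold", 1)
def custom_bold(context, group):
    return "<strong>%s</strong>" % group

func, arity = custom.lookup("#bold")
print(func(None, "hi"))                             # <strong>hi</strong>
print(custom.lookup("paragraph")[0](None, "text"))  # <p>text</p>
```

The base layer stays untouched, so several customized generators can share it.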
Env also has the functions .hcall and .vcall that the user can call. They have to be discussed alongside some functions from Context.
Context is a helper for customizing the behavior of the Vertical/Horizontal stack machine of the generator. It contains a .document object that can be chosen directly.
If one of the functions attempts to shift into vertical mode when that is impossible, it raises a Suspend() exception. The Suspend() is caught by the code that evaluates custom macros, which then writes the macro in literal form into the document as a backup measure.
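A minimal sketch of this fallback, assuming nothing about texopic's internals: the Suspend class defined here, the evaluate_macro helper and the way the literal form is rebuilt are all hypothetical and exist only for illustration.

```python
class Suspend(Exception):
    """Hypothetical: raised when a macro cannot run in the current mode."""


def heading_macro(vertical_allowed, text):
    # A macro that needs vertical mode; it gives up when that is impossible.
    if not vertical_allowed:
        raise Suspend()
    return "<h2>%s</h2>" % text


def evaluate_macro(name, func, vertical_allowed, text):
    # Try the custom macro; on Suspend fall back to the literal source form.
    try:
        return func(vertical_allowed, text)
    except Suspend:
        return "#%s{%s}" % (name, text)


print(evaluate_macro("section", heading_macro, True, "Intro"))   # <h2>Intro</h2>
print(evaluate_macro("section", heading_macro, False, "Intro"))  # #section{Intro}
```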
Context.emit(value) runs the machine into vertical mode and appends the value to the vertical list.
Context.next_group(builder) starts a new horizontal mode that builds with the builder function once finished.
Context.end_group() forces the current mode to stop.
Context.in_cake(name) returns True or False, depending on whether the name is in the stack 'cake'.
The Context.block property retrieves the topmost block from the stack 'cake'. This cake term is explained later.
Context.get_cake() retrieves an object that allows the customization to empty topmost portions of the stack cake.
Context.hcall(group) starts a new context with the same document and the same env. The context starts in horizontal mode and cannot switch to vertical. Returns a horizontal list.
Context.vcall(group) starts a new context with the same document and the same env. The context starts in vertical mode. Returns a vertical list.
Env.hcall(group, document) and Env.vcall(group, document) behave the same as those Context functions, except that you can use them to select the environment and the document. In fact the Context functions are shorthands for context.env.*call(group, context.document).
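The shorthand relationship can be pictured as below. RecordingEnv and SketchContext are hypothetical stand-ins that only record which arguments get passed along; they do not evaluate any texopic input.

```python
class RecordingEnv(object):
    """Hypothetical stand-in for Env; records calls instead of evaluating."""

    def hcall(self, group, document):
        return ("horizontal", group, document)

    def vcall(self, group, document):
        return ("vertical", group, document)


class SketchContext(object):
    """Hypothetical stand-in for Context."""

    def __init__(self, env, document):
        self.env = env
        self.document = document

    def hcall(self, group):
        # Shorthand: reuse this context's env and document.
        return self.env.hcall(group, self.document)

    def vcall(self, group):
        return self.env.vcall(group, self.document)


context = SketchContext(RecordingEnv(), document="doc")
group = ["some", "group"]
print(context.vcall(group) == context.env.vcall(group, context.document))  # True
```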
Texopic has a pretty printing module that implements the algorithm described by the Stanford University report CS-TR-79-770.
Pretty printing is not the focus of the module, so here's just a sample program that illustrates what it can do:
```python
from texopic.printer import Scanner
from StringIO import StringIO

for margin in [20, 10, 80]:
    scan = Scanner(StringIO(), margin)
    scan("(").left().blank("", 2)
    scan.left()
    scan("hello").blank(" ", forceable=False)("world")
    scan.right()
    scan.blank(", ", 2)
    scan("second").blank(" ", 2)("line")
    scan.blank(" ")(")").right()
    scan.finish()
    print scan.printer.fd.getvalue()
```
Console output:
```
cheery@ruttunen:~/Documents/texopic$ python scratch.py
( hello world second line )
( hello world second line )
(hello world, second line)
cheery@ruttunen:~/Documents/texopic$
```
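To give a feel for what groups, blanks and a margin do, here is a much simplified layout routine. It is not the streaming algorithm from the report and not texopic.printer: it is a recursive sketch that breaks every blank of a group whenever the group does not fit on the rest of the line. The names blank, flat_width and layout, and the nested-list document shape, are assumptions made for this example.

```python
def blank(text=" ", indent=0):
    # A possible break point: `text` when flat, a newline when broken.
    return ("blank", text, indent)


def flat_width(item):
    # Width of an item if it were printed on a single line.
    if isinstance(item, str):
        return len(item)
    if isinstance(item, tuple):
        return len(item[1])
    return sum(flat_width(child) for child in item)


def layout(doc, margin):
    out = []
    state = {"column": 0}

    def emit(text):
        out.append(text)
        state["column"] += len(text)

    def visit(item, base, broken):
        if isinstance(item, str):
            emit(item)
        elif isinstance(item, tuple):
            if broken:
                indent = base + item[2]
                out.append("\n" + " " * indent)
                state["column"] = indent
            else:
                emit(item[1])
        else:
            # A list is a group: print it flat if it fits on the rest of
            # the line, otherwise break every blank directly inside it.
            start = state["column"]
            fits = start + flat_width(item) <= margin
            for child in item:
                visit(child, start, not fits)

    visit(doc, 0, False)
    return "".join(out)


doc = ["(", blank("", 2),
       ["hello", blank(" "), "world"], blank(", ", 2),
       ["second", blank(" "), "line"], blank(""), ")"]

for margin in [20, 10, 80]:
    print(layout(doc, margin))
```

With a margin of 80 the whole group fits and is printed as (hello world, second line); with narrower margins the outer group is broken first and the inner groups only when they still do not fit.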
The texopic.html module makes it easy to generate valid HTML markup.
```python
from texopic import html

body = html.Block([
    html.Node('a', ["hello"], {
            "href": html.URL("//example.org")
        },
        extra=['disabled'],
        space_sensitive=False, # True if tag is 'pre'
        slash=True # if True end element with />
                   # if False end element with >
    ),
    html.Raw("<script>alert(1);</script>")
])
print html.stringify(body, margin=30)
```
Output:
```
cheery@ruttunen:~/Documents/texopic$ python scratch.py
<a href="//example.org" disabled>hello
</a><script>alert(1);</script>
```
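The general idea of building markup from nodes instead of concatenating strings can be sketched as follows. This is not the texopic.html implementation, and whether the module escapes exactly these characters is an assumption; escape, Raw and render are hypothetical names for this example. Escaping here only keeps the generated markup well-formed.

```python
def escape(text):
    # Escape the characters that are special in HTML text and attributes.
    return (text.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace('"', "&quot;"))


class Raw(object):
    """Hypothetical marker for markup that is passed through untouched."""

    def __init__(self, markup):
        self.markup = markup


def render(tag, children, attrs=None):
    # Attribute values and plain-text children are escaped; Raw is not.
    attr_text = "".join(
        ' %s="%s"' % (name, escape(value))
        for name, value in sorted((attrs or {}).items()))
    body = "".join(
        child.markup if isinstance(child, Raw) else escape(child)
        for child in children)
    return "<%s%s>%s</%s>" % (tag, attr_text, body, tag)


print(render("a", ["1 < 2 & 3"], {"href": "//example.org"}))
print(render("p", [Raw("<b>bold</b>"), " & plain"]))
```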
Although you can identify URLs in the markup for validation purposes, the Texopic html module doesn't come with validation of URLs or XSS prevention.
XSS prevention while generating markup from unsafe sources is a futile attempt, because HTML can be interpreted in vastly different ways today. Whitelisting simply cannot account for yet another markup language strapped on top of HTML that introduces new notation for evaluating code from markup.
Texopic nopes out of doing XSS-prevention for now. It is impossible to do it properly given the current circumstances.
```python
from texopic.toc import Toc

toc = Toc()

print toc.entry(0, "hello", link=None)
print toc.entry(1, "test")
print toc.entry(0, "world", link="sample")

print toc.data
```
Output:
```
cheery@ruttunen:~/Documents/texopic$ python scratch.py
('1', '1')
('1.1', '1.1')
('2', 'sample')
[('1', '1', 'hello'), ('1.1', '1.1', 'test'), ('2', 'sample', 'world')]
cheery@ruttunen:~/Documents/texopic$
```
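The numbering scheme visible in that output can be reproduced with a few lines. SketchToc below is a hypothetical re-creation inferred from the printed results only; it is not the code of texopic.toc and may differ from it in cases the output does not show, such as skipping a level.

```python
class SketchToc(object):
    """Hypothetical re-creation of the numbering seen in the output above."""

    def __init__(self):
        self.counters = []
        self.data = []

    def entry(self, level, title, link=None):
        # Drop counters of deeper levels, then open missing levels at zero.
        del self.counters[level + 1:]
        while len(self.counters) <= level:
            self.counters.append(0)
        self.counters[level] += 1
        number = ".".join(str(n) for n in self.counters)
        # Without an explicit link the section number doubles as the link.
        if link is None:
            link = number
        self.data.append((number, link, title))
        return number, link


toc = SketchToc()
print(toc.entry(0, "hello"))                 # ('1', '1')
print(toc.entry(1, "test"))                  # ('1.1', '1.1')
print(toc.entry(0, "world", link="sample"))  # ('2', 'sample')
print(toc.data)
```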
```python
from texopic.default_html_env import env
from texopic.generic import Env

# Stack a fresh environment on top of the default one.
env = Env(env)
# add your customizations here.
```
The default_html_env uses the earlier html module to create HTML document fragments. It implements several useful macros listed below:
```
#title
#section
#section{link}
#subsection
#subsection{link}
#bold{group}
#comment
#begin{comment}
#begin{itemize}
#begin{enumerate}
#item
#image{url}
#image{url}{alt}
#href{url}
#href{url}{desc}
```
You need to install pygments to run this example.
```python
from texopic.default_html_env import env
from texopic.generic import Env, process, verbatim
from texopic import html
from texopic.toc import Toc
import sys
import texopic

class Document(object):
    def __init__(self):
        self.title = None
        self.description = None
        self.toc = Toc()

template = """<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0"/>
{0}</head>
<body>
{1}</body>
</html>"""

style = """
body { max-width: 75ex }
body > pre { border: 1px solid #cfcfcf; padding: 1em 4ex }
h2 > .ref { visibility: hidden; text-decoration: underline; }
h2:hover > .ref { visibility: visible !important }
h3 > .ref { visibility: hidden; text-decoration: underline; }
h3:hover > .ref { visibility: visible !important }
.sourcetable pre { margin: 0; }
.sourcetable .linenos {
    padding-left: 1ex;
    padding-right: 1ex;
    border-right: 1px solid black;
}
""".strip()

def main():
    group = texopic.read_file(sys.argv[1])
    document = Document()
    head = html.Block([])
    body = html.Block(env.vcall(group, document))
    if document.title is not None:
        head.append(html.Node('title', [document.title]))
    if document.description is not None:
        head.append(html.Node('meta', None, {
            'name':'description',
            'content':document.description,
        }, slash=False))
    head.append(html.Node('style', [html.Raw(style)]))
    head.append(html.Node('link', None, {
        "rel": "stylesheet",
        "type": "text/css",
        "href": html.URL("pygments-style.css"),
    }))
    print template.format(
        html.stringify(head),
        html.stringify(body))

env = Env(env)

@env.define("#include", 1)
def env_include(context, path):
    path = verbatim(path) # TODO: make source file relative.
    group = texopic.read_file(path)
    process(context, group)

@env.define("#description", 0)
def env_description(context):
    @context.next_group
    def _build_description_(context, group):
        context.document.description = verbatim(group)

from pygments import highlight
from pygments.lexers import get_lexer_by_name
from pygments.formatters import HtmlFormatter

@env.define("#sample", 0)
def env_sample(context):
    @context.next_group
    def _build_sample_(context, group, code=""):
        name = verbatim(group).strip() or "python"
        lexer = get_lexer_by_name(name, stripall=True)
        formatter = HtmlFormatter(linenos=True, cssclass="source")
        context.emit(html.Raw(highlight(code, lexer, formatter)))
    _build_sample_.capture_pre = True # heheehe.

@env.define("#include_code", 2)
def env_include_python_code(context, lexer_name, path):
    name = verbatim(lexer_name).strip() or "python"
    path = verbatim(path) # TODO: make source file relative?
    with open(path, "r") as fd:
        code = fd.read()
    lexer = get_lexer_by_name(name, stripall=True)
    formatter = HtmlFormatter(linenos=True, cssclass="source")
    context.emit(html.Raw(highlight(code, lexer, formatter)))

if __name__=="__main__":
    main()
```
By now you likely have a good idea of how to extend Texopic with your own macros. To make it clear, here are a few complete samples of macros in Texopic.
An ordinary macro just replaces itself with some content and starts a horizontal mode. This happens if you return something from the macro.
```python
@env.define("#italic", 1)
def env_italic(context, group):
    return html.Node('i', group)
```
The above code handles the #italic{group} macro.
Segments are paragraph-level constructs meant for customizing the behavior of horizontal lists. The simplest such construct just writes out a differently formatted horizontal list.
```python
@env.define("#claim", 0)
def env_claim(context):
    @context.next_group
    def _build_claim_(context, group):
        context.emit(html.Node("p", group, {"class":"claim"}))
```
Using a macro such as this #claim always starts a new horizontal mode and creates a horizontal list.
```python
@env.define(":enumerate", 0)
def env_begin_itemize(context):
    block = Itemize()
    def _build_(context, cake):
        block.item(cake)
        context.emit(html.Node('ol', block.data))
    return {"build": _build_, "block": block}

@env.define("#item", 0)
def env_item(context):
    if isinstance(context.mode.block, Itemize):
        context.block.item(context.get_cake())
    else:
        raise Suspend()

class Itemize(object):
    def __init__(self):
        self.data = []

    def item(self, cake):
        if cake.is_empty and len(self.data) == 0:
            pass # the first #item
        elif cake.is_group:
            self.data.append(html.Node('li', cake.as_group()))
        else:
            self.data.append(html.Node('li', cake.as_list()))
```
These macros parse input:
```
#begin{enumerate}
#item X
#item Y
#item Z
#end{enumerate}
```
And produce an ordered list with the items X, Y and Z.
This is what the term 'cake stack' refers to. The block macros push a vertical mode and allow nested vertical lists to be created.
```python
@env.define("#sample", 0)
def env_sample(context):
    @context.next_group
    def _build_sample_(context, group, code=""):
        lexer_name = verbatim(group).strip()
        pass # do some formatting for code here.
    _build_sample_.capture_pre = True
```
This parses #sample lexer_name ## and is meant for controlling how preformatted blocks are interpreted.
Clean makefiles aren't difficult to write. Get a guide if you don't know how.
```make
all: index.html guide.html

# This is for updating the page. You shouldn't have to do it.
# http://lea.verou.me/2011/10/easily-keep-gh-pages-in-sync-with-master/
sync:
	git checkout gh-pages
	git rebase master
	git push origin gh-pages
	git checkout master

guide.html: guide.text LICENSE.md texopic2html.py

%.html: %.text
	python texopic2html.py $< > $@
```
Avoid cyclic dependencies and you're all right. In practice this means you shouldn't write make rules that depend on the rules coming before them.
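What a cyclic dependency looks like can be checked mechanically. The following sketch is not part of texopic or of make: find_cycle is a hypothetical helper doing a depth-first search over a hand-written copy of the targets and prerequisites from the makefile above, with the pattern rule expanded for the two pages.

```python
def find_cycle(rules):
    """Return one dependency cycle as a list of targets, or None."""
    visiting, done = [], set()

    def visit(target):
        if target in done:
            return None
        if target in visiting:
            # Reached a target that is still being expanded: a cycle.
            return visiting[visiting.index(target):] + [target]
        visiting.append(target)
        for dep in rules.get(target, []):
            cycle = visit(dep)
            if cycle is not None:
                return cycle
        visiting.pop()
        done.add(target)
        return None

    for target in sorted(rules):
        cycle = visit(target)
        if cycle is not None:
            return cycle
    return None


rules = {
    "all": ["index.html", "guide.html"],
    "index.html": ["index.text"],
    "guide.html": ["guide.text", "LICENSE.md", "texopic2html.py"],
}
print(find_cycle(rules))   # None

# A deliberately broken variant: the source now depends on its own output.
broken = dict(rules)
broken["guide.text"] = ["guide.html"]
print(find_cycle(broken))  # ['guide.html', 'guide.text', 'guide.html']
```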
As a useful tip, the following command installs this package with a symlink so that changes to the sources will be immediately available to the users of the package.
```
$ pip install -e .
```
Python's module packaging is a useful link for anyone who wants to make their own packages for Python.
It is common for enthusiastic people to come and optimize other people's code because they believe certain things are the right way to do it or more efficient.
The outcome of unbenchmarked optimizations is unclear code and no benefits. Therefore you should attach, along with your commits, a benchmark that lets others verify that your optimization works.
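A benchmark does not need to be elaborate. The sketch below uses the standard timeit module; join_loop and join_builtin are hypothetical placeholders for "the code before" and "the code after" and have nothing to do with texopic's sources. It first checks that both versions return the same result and only then times them.

```python
import timeit


def join_loop(words):
    # Baseline: repeated string concatenation.
    result = ""
    for word in words:
        result += word
    return result


def join_builtin(words):
    # Candidate optimization: a single str.join call.
    return "".join(words)


words = ["texopic"] * 1000

# An optimization that changes the result is not an optimization.
assert join_loop(words) == join_builtin(words)

for func in (join_loop, join_builtin):
    seconds = timeit.timeit(lambda: func(words), number=200)
    print("%-12s %.4f s for 200 runs" % (func.__name__, seconds))
```

The timings depend on the machine and the interpreter, which is exactly why they should be measured rather than assumed.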
The programming profession doesn't lack people ready to work just for the sake of labour.
If you write an automated test for something or employ an automated test framework on this code, you are expected to explain what you are doing and why.
MIT License
Copyright (c) 2016 Henri Tuhola
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.