author | Tero Marttila <terom@fixme.fi> |
Mon, 10 Jan 2011 17:51:08 +0200 | |
changeset 53 | 06dad873204d |
parent 30 | 97d5d37333d2 |
permissions | -rw-r--r-- |
30
97d5d37333d2
markup: Implement a Markup class using python-markdown to parse a simplified variant of markdown into a document tree
Tero Marttila <terom@fixme.fi>

from markdown import *

# root tag
DOC_TAG = 'root'

class Markup (object) :
    """
        Custom implementation of markdown.Markdown, that supports direct etree access, and has a more limited set of output element types.

        <root> :
            <p> :
                text

            <h1>/<h2>/<h3>/.. :
                text

            <ul>/<ol> :
                <li> :
                    text/<p>
                    <p>
                    ...

        text :
            Currently no inline markup yet, just pure text
    """

    def __init__ (self) :
        """
            Set up the parser.
        """

        ## Block parsing
        self.parser = blockparser.BlockParser()

        # internal block parsing, doesn't generate any elements
        self.parser.blockprocessors['empty'] = blockprocessors.EmptyBlockProcessor(self.parser)

        # nested ol/ul and li
        self.parser.blockprocessors['indent'] = blockprocessors.ListIndentProcessor(self.parser)

        # h1,h2,h3 etc
        self.parser.blockprocessors['hashheader'] = blockprocessors.HashHeaderProcessor(self.parser)
        self.parser.blockprocessors['setextheader'] = blockprocessors.SetextHeaderProcessor(self.parser)

        # ol/ul
        self.parser.blockprocessors['olist'] = blockprocessors.OListProcessor(self.parser)
        self.parser.blockprocessors['ulist'] = blockprocessors.UListProcessor(self.parser)

        # remaining things as paragraphs
        self.parser.blockprocessors['paragraph'] = blockprocessors.ParagraphProcessor(self.parser)

        ## Inline patterns
        self.inlinePatterns = odict.OrderedDict()

        # XXX: none for now

        ## Tree processors
        self.treeprocessors = odict.OrderedDict()
        self.treeprocessors["inline"] = treeprocessors.InlineProcessor(self)

        # No postprocessors; we don't generate HTML

    def _normalize_input (self, source) :
        """
            Normalize the given input before processing.
        """

        # strip the STX/ETX control characters that markdown uses as internal placeholders
        source = source.replace(STX, "").replace(ETX, "")
        # unify line endings and make sure the input ends with a blank line
        source = source.replace("\r\n", "\n").replace("\r", "\n") + "\n\n"
        # collapse whitespace-only lines into empty lines
        source = re.sub(r'\n\s+\n', '\n\n', source)
        source = source.expandtabs(TAB_LENGTH)

        return source

    def parse (self, text) :
        """
            Parse the given plaintext markup, returning an etree.Element(DOC_TAG)

                text        - the unicode input
        """

        # normalize
        text = self._normalize_input(text)

        # as lines
        lines = text.split("\n")

        # parse
        root = self.parser.parseDocument(lines).getroot()

        # process tree
        for treeprocessor in self.treeprocessors.values() :
            newRoot = treeprocessor.run(root)

            # compare against None explicitly; an Element without children is falsy
            if newRoot is not None :
                root = newRoot

        # fix up the root
        root.tag = DOC_TAG

        # ok
        return root

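
The following is a minimal standalone sketch, not part of the changeset, of two things the listing describes: the input normalization steps, and the tree shape documented in the class docstring. It uses only the standard library. The values of `STX`, `ETX` and `TAB_LENGTH` are assumptions (the module takes them from `from markdown import *`), the tree is built by hand rather than by the parser, and `normalize_input` and `outline` are hypothetical helper names.

```python
import re
import xml.etree.ElementTree as etree

# Assumed values; the listing gets these names from `from markdown import *`.
STX = "\u0002"
ETX = "\u0003"
TAB_LENGTH = 4
DOC_TAG = "root"


def normalize_input(source):
    """Standalone copy of the steps in Markup._normalize_input."""
    # drop the placeholder control characters
    source = source.replace(STX, "").replace(ETX, "")
    # unify line endings and end the input with a blank line
    source = source.replace("\r\n", "\n").replace("\r", "\n") + "\n\n"
    # turn whitespace-only lines into empty lines
    source = re.sub(r"\n\s+\n", "\n\n", source)
    return source.expandtabs(TAB_LENGTH)


def outline(element, depth=0):
    """Return one indented line per element: the tag, plus its text if any."""
    text = (element.text or "").strip()
    lines = ["  " * depth + element.tag + (": " + text if text else "")]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


# Hand-built tree in the documented shape: <root> holding <h1>, <p>, <ul>/<li>.
root = etree.Element(DOC_TAG)
etree.SubElement(root, "h1").text = "Title"
etree.SubElement(root, "p").text = "Some text"
ul = etree.SubElement(root, "ul")
etree.SubElement(ul, "li").text = "item"

normalized = normalize_input("Title\r\n=====\r\n \t \r\n\tbody")
tree_outline = outline(root)

print(repr(normalized))
print("\n".join(tree_outline))
```

A caller of `Markup.parse` would walk the returned element the same way `outline` does here, since the result is a plain etree element rather than serialized HTML.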