Pwikip is a Creole 1.0 Compliant** plus Creole Additions wikitext parser. It includes a productive selection of pragmas and extensions that facilitate the efficient development, deployment and maintenance of a wikitext-based HTML project.
The project is written in Linux shell script and requires GNU Bash because it contains a few bashisms: if [[ regex ]] where a case statement won't do, and ${shellvar/substring/replacement} where inserting shellvars with sed, and having to escape sed's special characters, is cumbersome. A more portable grep could replace the former, but grep is not a built-in and would unacceptably impact performance, so, with Bash already required, there's no reason not to use the latter as well.
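The two bashisms can be illustrated with a minimal sketch. This is not Pwikip's own code: the function names, the heading pattern and the @TITLE@ placeholder are all hypothetical and exist only to show the constructs.

```shell
# Requires Bash. The pattern and the @TITLE@ placeholder are hypothetical.

# [[ string =~ regex ]]: a regex test without spawning grep.
is_heading() {
    local re='^={1,6}[^=]'   # one to six '=' followed by something else
    [[ $1 =~ $re ]]
}

# ${var/substring/replacement}: substitute a value without sed, so the
# characters that are special to sed need no escaping.
fill_template() {
    local template=$1 value=$2 result
    # Quoting "$value" keeps an '&' in it literal on Bash 5.2 and later.
    result=${template/@TITLE@/"$value"}
    printf '%s\n' "$result"
}
```

For example, fill_template '<title>@TITLE@</title>' 'Q&A: a/b' passes the slash and ampersand through untouched, where a sed s/// command would need both escaped.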
I have not implemented the conversion of free-standing URLs into links and do not intend to, because the increased processing and resultant decrease in performance are not justified by what amounts to isolating and fixing URLs with missing enveloping brackets.
The latest version 0.2.9 manages to fully construct this entire page in 0m12.254s.
The wikitext source code of this page and of all the pages that constitute this website is listed here if you are interested in seeing how a pwikip project is constructed (you may have to set the page's character encoding to Unicode UTF-8 in your browser).
**//Bold italic text.//**
**//Bold italics to end-of-line.
Bold italics should **//be\\able//** to cross broken lines.
Bold italics should **//be
able//** to cross multiple lines.
**//Unacceptable but automatically corrected.**//
Sample output:
Bold italic text.
Bold italics to end-of-line.
Bold italics should be able to cross broken lines.
Bold italics should be able to cross multiple lines.
Unacceptable but automatically corrected.
Italic Bold
Sample input:
//**Italic bold text.**//
//**Italic bold to end-of-line.
Italic bold should //**be\\able**// to cross broken lines.
Italic bold should //**be
able**// to cross multiple lines.
//**Unacceptable but automatically corrected.//**
Sample output:
Italic bold text.
Italic bold to end-of-line.
Italic bold should be able to cross broken lines.
Italic bold should be able to cross multiple lines.
Unacceptable but automatically corrected.
Headings
Creole Compliant : the rules state that parsing within headings is optional, but phrasing content is permitted within a heading, so I support it.
The specification doesn't mention anything about the format of links created from the heading text, so I've implemented the same format that the http://wikicreole.org/ website uses: invalid characters are removed, then any lowercase character immediately to the right of whitespace is converted to uppercase, and finally all whitespace is stripped.
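The three steps can be sketched in Bash as follows. This is an illustration rather than Pwikip's actual code: the function name heading_anchor is hypothetical, and "invalid characters" is assumed here to mean anything that is neither alphanumeric nor whitespace.

```shell
# Requires Bash 4 or later for ${var^^}. heading_anchor is a hypothetical name.
heading_anchor() {
    local text=$1 out='' ch prev='' i
    # Step 1: remove "invalid" characters (assumption: anything that is
    # not alphanumeric and not whitespace).
    text=${text//[^[:alnum:][:space:]]/}
    for ((i = 0; i < ${#text}; i++)); do
        ch=${text:i:1}
        if [[ $ch == [[:space:]] ]]; then
            # Step 3: whitespace is never copied to the output.
            prev=space
            continue
        fi
        # Step 2: uppercase a lowercase character that follows whitespace.
        if [[ $prev == space && $ch == [[:lower:]] ]]; then
            ch=${ch^^}
        fi
        out+=$ch
        prev=char
    done
    printf '%s\n' "$out"
}
```

Under these assumptions, heading_anchor 'Ordered Lists' gives OrderedLists, and a heading that starts in lowercase keeps its first letter as it is because nothing precedes it.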
This is a line of text which will be accompanied by the following line.
This is another line of text which will join the previous line in a paragraph.
This is yet another line of text which will become a paragraph on its own.
Sample output:
This is a line of text which will be accompanied by the following line. This is another line of text which will join the previous line in a paragraph.
This is yet another line of text which will become a paragraph on its own.
* Level 1 item 1 with a URL: [[http://www.woden.org.uk|Woden]]
* Level 1 item 2
with two additional lines of text underneath
which remain part of the same list item.
** **Level 2** item 1 with bold
*** Level 3 item 1
*** Level 3 //item 2// with italics
**** Level 4 item 1
***** Level 5 item 1 with\\multiple\\line\\\\breaks
***** Level 5 item 2 with code: {{{for f in `ls`; do echo $f; done}}}
**** Level 4 item 2
** Level 2 item 2
**** Level 4 item 3 - an unacceptable two level increase parsed as a line of bold text.
Level 1 item 2
with two additional lines of text underneath
which remain part of the same list item.
Level 2 item 1 with bold
Level 3 item 1
Level 3 item 2 with italics
Level 4 item 1
Level 5 item 1 with multiple line
breaks
Level 5 item 2 with code: for f in `ls`; do echo $f; done
Level 4 item 2
Level 2 item 2
** Level 4 item 3 - an unacceptable two level increase parsed as a line of bold text.
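The rule behind the last item, that a list may only deepen by one level at a time, can be sketched as a check on the number of leading markers. This is a guess at the logic rather than Pwikip's code: the function name accepts_list_item is hypothetical and it only handles the * marker.

```shell
# Requires Bash. accepts_list_item is a hypothetical name.
# Usage: accepts_list_item LINE CURRENT_DEPTH
# Succeeds if LINE starts with a run of '*' that is at most one level
# deeper than CURRENT_DEPTH (0 means no list is open).
accepts_list_item() {
    local line=$1 current_depth=$2
    local re='^(\*+)'
    [[ $line =~ $re ]] || return 1          # not a list line at all
    local depth=${#BASH_REMATCH[1]}         # number of leading '*'
    (( depth <= current_depth + 1 ))        # reject jumps of two or more
}
```

With the sample above, '**** Level 4 item 3' arrives while the list is at depth 2, so the check fails and the line is left to be parsed as ordinary text.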
Ordered Lists
Sample input:
# Level 1 item 1 with a URL: [[http://www.woden.org.uk|Woden]]
# Level 1 item 2
with two additional lines of text underneath
which remain part of the same list item.
## **Level 2** item 1 with bold
### Level 3 item 1
### Level 3 //item 2// with italics
#### Level 4 item 1
##### Level 5 item 1 with\\multiple\\line\\\\breaks
##### Level 5 item 2 with code: {{{for f in `ls`; do echo $f; done}}}
#### Level 4 item 2
## Level 2 item 2
#### Level 4 item 3 - an unacceptable two level increase parsed as a line of monospaced text.
{{{
if (x != NULL) {
for (i = 0; i < size; i++) {
if (x[i] > 0) {
x[i]--;
}}}
}}}
{{{if (a>b) { b = a; }}}}
Sample output:
if (x != NULL) {
for (i = 0; i < size; i++) {
if (x[i] > 0) {
x[i]--;
}}}
if (a>b) { b = a; }
Escape Character
Creole Compliant** : I'm not currently implementing the escaping of HTML entities.
Sample input:
~# Not an unordered list item, ~__not underlined~__, ~## not monospaced ~##.
{{{~^^Ignored {inside~^^}}}} {{{~,,inline nowiki~,,}}} and »»»~##HTML~##«««.
~**Not bold~**, http://www.users.waitrose.com/~thunor/, ~// not italic ~//.
~[[Not an internal URL]], ~[[http://wikicreole.org/|not an external URL]].
This is a tilde escaping itself ~~__which then doesn't escape underlined.
This is a tilde alone ~ and ~these aren't~ ~escaping~ anything ~
{{{
~**Ignored inside a ~##nowiki block.
}}}
»»»
~//Ignored inside an ~~^^HTML block.
«««
~| Not | a | table |
~----
Sample output:
# Not an unordered list item, __not underlined__, ## not monospaced ##.
~^^Ignored {inside~^^}~,,inline nowiki~,, and ~##HTML~##.
**Not bold**, http://www.users.waitrose.com/~thunor/, // not italic //.
[[Not an internal URL]], [[http://wikicreole.org/|not an external URL]].
This is a tilde escaping itself ~which then doesn't escape underlined.
This is a tilde alone ~ and these aren't~ escaping~ anything ~
##Monospaced text.##
##Monospaced to end-of-line.
Monospace should ##be\\able## to cross broken lines.
Monospace should ##be
able## to cross multiple lines.
^^Superscripted text.^^
^^Superscripted to end-of-line.
Superscript should ^^be\\able^^ to cross broken lines.
Superscript should ^^be
able^^ to cross multiple lines.
Sample output:
Superscripted text.
Superscripted to end-of-line.
Superscript should be able to cross broken lines.
Superscript should be able to cross multiple lines.
,,Subscripted text.,,
,,Subscripted to end-of-line.
Subscript should ,,be\\able,, to cross broken lines.
Subscript should ,,be
able,, to cross multiple lines.
__Underlined text.__
__Underlined to end-of-line.
Underline should __be\\able__ to cross broken lines.
Underline should __be
able__ to cross multiple lines.
The specification doesn't mention anything about indented nowiki blocks but I've implemented them.
Sample input:
:This **paragraph** is indented to level 1.
::This paragraph is //indented// to level 2.
:::This {{{paragraph}}} is indented to level 3.
::::This paragraph is [[indented]] to level 4.
:::::This __paragraph__ is indented to level 5.
:{{{
This **nowiki** block is indented to level 1.
}}}
::{{{
This nowiki block is //indented// to level 2.
}}}
:::{{{
This {{{nowiki}}} block is indented to level 3.
}}}
::::{{{
This nowiki block is [[indented]] to level 4.
}}}
:::::{{{
This __nowiki__ block is indented to level 5.
}}}
Sample output:
This **nowiki** block is indented to level 1.
This nowiki block is //indented// to level 2.
This {{{nowiki}}} block is indented to level 3.
This nowiki block is [[indented]] to level 4.
This __nowiki__ block is indented to level 5.
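Working out the indentation level amounts to counting the colons at the start of a line, for paragraphs and nowiki blocks alike. A minimal sketch, with hypothetical function names that are not taken from Pwikip itself:

```shell
# Requires Bash. Both function names are hypothetical.

# Print the number of leading ':' characters on a line.
indent_level() {
    local line=$1
    local colons=${line%%[!:]*}   # strip everything from the first non-colon
    printf '%s\n' "${#colons}"
}

# Succeed if the line opens an indented nowiki block, e.g. ':::{{{'.
opens_indented_nowiki() {
    local line=$1
    local colons=${line%%[!:]*}
    [[ -n $colons && ${line#"$colons"} == '{{{' ]]
}
```

For ':::{{{' the first function prints 3 and the second succeeds, while a line such as ':::This {{{paragraph}}} is indented to level 3.' is reported as level 3 but not as a nowiki opener.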
HTML Entities
HTML entities can be inserted into a page by simply entering them as is.
How could you not want this feature! Useful for inserting JavaScript amongst many other things.
The characters I've chosen are guillemets, which can be reached [with Linux] using AltGr+z and AltGr+x on a UK keyboard, or AltGr+[ and AltGr+] on a US keyboard.
Sample input:
»»»
<p><strong>This</strong> <em>is</em> <code>a</code> <a href="block">block</a><br>
<u>of</u> <sup>raw</sup> <sub>HTML</sub> <tt>code</tt>.</p>
«««
Inline HTML: »»»<input type="button" value="A Button">«««
»»»
<script language="JavaScript" type="text/javascript">
var d=new Date();
document.write("<p>" + d + "<\/p>");
</script>
<noscript><p>JavaScript is disabled in your browser.</p></noscript>
«««
Some of these are [unintentionally] similar to MoinMoin's which means I'm on the right wavelength, but mostly they are custom.
#ftpput
By default this will ftp-put the target HTML file on completion, although you can specify any number of files instead. The ftp account information is read either from the ~/.pwikiprc file or from the command-line at start-up.
#html and #htmlend
Inserts the HTML tags <html> or </html> into the page. An open html tag will be closed automatically when the page ends, so the closing tag is optional.
Examples:
#html
#htmlend
#head and #headend
Inserts the HTML tags <head> or </head> into the page. An open head tag will be closed automatically if a body tag is opened, in which case the closing tag is optional.
Examples:
#head
#headend
#charset
Specifies the page's character encoding. If required, this must go inside the head section.
Examples:
#charset UTF-8
#charset ISO-8859-1
#title
Specifies the page's title. This is mandatory inside a head section but it can be blank.
#body and #bodyend
Inserts the HTML tags <body> or </body> into the page. An open body tag will be closed automatically when the page ends, so the closing tag is optional.
Examples:
#body
#bodyend
#include
Specifies one or more files to include. The file(s) must contain HTML, not wikitext markup.
This is a wikitext comment which won't be written to the target file.
Example:
#comment This is a wikitext comment which won't be written to the target file.
Command-Line Arguments
Most of these have rcfile equivalents (see ~/.pwikiprc below).
Pwikip Creole 1.0 Compliant** Wikitext Parser version 0.2.9
Usage: build [OPTION] [file or folder]
  -b, --beep               beep when finished
  -c, --comment            comment the output
      --enable-ftp=state   enable/disable ftp
      --ftp-host=address   set ftp host address
      --ftp-port=number    set ftp port number
      --ftp-root=folder    set ftp base folder
      --ftp-username=user  supply ftp username
      --ftp-password=pass  supply ftp password
  -r, --recursive          recurse subfolders
Pass [./]path/to/folder to process all files within a folder
Pass [./]path/to/file to process one file only
~/.pwikiprc
This is Pwikip's rcfile -- created on first run -- which contains options comparable to the command-line options above. If you wish to store your ftppassword within the rcfile then do so, and Pwikip will obfuscate it the next time it's run.
## Simply uncomment and set the options you want.
#optionbeep=1
#optioncomment=1
#optionenableftp=1
#ftphost=
#ftpport=
#ftproot=
#ftpusername=
#ftppassword=
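A script could read a single option from a file in this key=value format along the following lines. This is only a sketch: the function name read_rc_option is hypothetical, it is not how Pwikip itself reads the file, and it makes no attempt to handle the obfuscated ftppassword.

```shell
# Requires Bash. read_rc_option is a hypothetical name.
# Usage: read_rc_option FILE KEY
# Prints the value of the first uncommented KEY=value line and succeeds,
# or fails if the key is absent or still commented out.
read_rc_option() {
    local file=$1 key=$2 line
    while IFS= read -r line || [[ -n $line ]]; do
        # Skip blank lines and lines starting with '#'.
        [[ -z $line || $line == \#* ]] && continue
        if [[ $line == "$key="* ]]; then
            printf '%s\n' "${line#*=}"   # everything after the first '='
            return 0
        fi
    done < "$file"
    return 1
}
```

Called as read_rc_option ~/.pwikiprc ftphost, it fails for as long as the ftphost line still carries its leading #.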