[Newbies] PEG Rule to handle nested WikiText Templates.

gettimothy gettimothy at zoho.com
Thu Sep 5 19:07:11 UTC 2019

Hi Folks.

I am writing a parser grammar with the XTreams package http://squeaksource.com/@BPfsPY0nJbH9IXBW/wqWrIinC

to handle Wikimedia wikitext https://en.wikipedia.org/wiki/Help:Wikitext#Text_formatting

My grammar looks like this so far:

'Body <- (   Template / Flow )*

LineCharacter <- [^\n]

Flow <-  Bold / Italic / BoldItalic / LinkShort / LinkFull / LineCharacter / Template

Italic <-  "''''" Flow{"''''"}

Bold <-  "''''''" Flow{"''''''"}

BoldItalic <- "''''''''''''" Flow{"''''''''''''"}

LinkShort <- "[" .{&[>\]]} "]"

LinkFull <- "[" Flow{">"} .{"]"}

Whitespace	<-	[\s\t\n\r]*

Template <-  "{{"  Template "}}" /   "{{"  Flow{"}}"}

Heading4    <-    Whitespace "==== " Flow{" ====\n"}


Italic and Bold work just peachy.

Template works for simple Templates like:

{{reflist}}

My Actor callback, for now, just wraps it in a span like this:

<span style="text-decoration: underline">reflist</span>

Things go south when Templates are nested, for example in Infoboxes, which are widely used in wiki markup.

Here is a truncated version of my output:

<span style="text-decoration: underline">Infobox Italian comune   <---This is where the PEGActor callback started to wrap the outermost template.

| name                = Elmas

| official_name       = Comune di Elmas

| area_code           = 070

| website             = {{official website|http://www.comune.elmas.ca.it/</span>    <---here is the nested Template. As you can see, the grammar ended the outermost <span></span> here and never recursed into the inner template.

| footnotes           =

}}  <---here is where the outermost template should end.
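
The truncation above is what one would expect from any rule that consumes text after "{{" until the first "}}" it meets. A minimal Python sketch of that behaviour follows. It is only an illustration, not Xtreams code: the function name first_close_match and the shortened infobox sample are assumptions made up for the example.

```python
def first_close_match(text: str, start: int = 0) -> str:
    """Hypothetical stand-in for a rule that stops at the FIRST '}}'.

    It finds the next '{{' and returns everything up to and including
    the nearest '}}', with no notion of nesting depth.
    """
    open_at = text.index("{{", start)
    close_at = text.index("}}", open_at + 2)
    return text[open_at:close_at + 2]


# Shortened stand-in for the infobox in the output above (assumed markup).
sample = (
    "{{Infobox Italian comune\n"
    "| website = {{official website|http://www.comune.elmas.ca.it/}}\n"
    "| footnotes =\n"
    "}}"
)

matched = first_close_match(sample)
print(matched)
```

The match ends at the inner template's closing braces, so the "| footnotes" line and the real closing "}}" are left outside it, which mirrors where the </span> landed in the output.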

So...I need a rule that will cover this.

Template <-  "{{"  Template "}}" /   "{{"  Flow{"}}"}

should be what?
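
One common shape for such a rule in generic PEG notation is Template <- "{{" (Template / !"}}" .)* "}}", meaning: inside the braces, try a nested Template first, otherwise consume any single character that does not begin a "}}". Whether and how that translates to the Xtreams dialect used above is not verified here. The Python sketch below only illustrates the recursion; parse_template and the sample string are hypothetical names and data invented for the example.

```python
def parse_template(text: str, pos: int = 0):
    """Parse one template starting at text[pos]; return (node, next_pos).

    node is a list whose items are plain strings or nested lists,
    one nested list per inner template.
    """
    if not text.startswith("{{", pos):
        raise ValueError("expected '{{' at offset %d" % pos)
    pos += 2
    parts = []
    buffer = []
    while pos < len(text):
        if text.startswith("}}", pos):
            # Closing braces of THIS template: flush text and return.
            if buffer:
                parts.append("".join(buffer))
            return parts, pos + 2
        if text.startswith("{{", pos):
            # Nested template: flush pending text, then recurse.
            if buffer:
                parts.append("".join(buffer))
                buffer = []
            inner, pos = parse_template(text, pos)
            parts.append(inner)
        else:
            # Any other character is plain template content.
            buffer.append(text[pos])
            pos += 1
    raise ValueError("unterminated template")


# Assumed sample: an outer template containing one inner template.
sample = "{{Infobox|website = {{official website|http://example.org/}}|footnotes =}}"
tree, end = parse_template(sample)
print(tree)
```

Because the recursive call consumes the inner "}}" itself, the outer loop only returns when it reaches its own closing braces.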

Thank you for your help.