Simple Mark Up
SMU is Simple Markup, and is an API for text conversion, any format to any format, from case conversion to Markdown to [insert yours]. Any text to any text.
SMU is not a textutil like SED or AWK or TR, but simply a few PHP algorithms to apply regular expression conversions to input text.
Here is the SMU 1.4.2 code archive. But first read the SMU README.
Here is an example that converts text to uppercase:
'/(.*)/' => function($s){return strtoupper($s);},
);
include 'smu.php';
echo simple_markup($argv[1]);
Now, of course, there already exist several ways to do such a thing. (So this is a bit redundant.) K&R had a toupper
example, for example. But that is not that. And that is just an example, lame though it be, of the SMU way.
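For the curious, here is a minimal sketch of how a dataset like that could be applied. It is not SMU's actual code: the helper name apply_rules, the variable $rules, and the assumption that each callback receives the regex match array (as preg_replace_callback passes it) are all made up for illustration.

```php
<?php
// Hypothetical dataset: regex pattern => callback taking the match array.
// (SMU's real dataset layout and callback signature may differ.)
$rules = array(
    '/(.*)/' => function ($m) { return strtoupper($m[1]); },
);

// Hypothetical helper, not SMU's simple_markup(): run every rule in order.
function apply_rules(array $rules, string $text): string
{
    foreach ($rules as $pattern => $callback) {
        // preg_replace_callback() returns null on a regex error; keep the text then.
        $text = preg_replace_callback($pattern, $callback, $text) ?? $text;
    }
    return $text;
}

echo apply_rules($rules, "hello world"), "\n"; // prints HELLO WORLD
```

The point of the sketch is that the loop never changes. Only the array does.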
Here Is The Thing
Say you have a Blog where you write "Posts" (a TEXTAREA FORM in which you type stuff that gets "saved" and then viewed by yer viewers as yer latest post, right?), and the source is Markdown, which means that writing this in yer "post a post to my blog" HTML input form:
All I got was *Coal*!
Gets to be this in yer blog:
<article> <h2>My Latest Post!<span>December 25, 2022</span></h2> <p>All I got was <strong>Coal</strong>!</p> </article>
Or whatever. I hope you get my meaning. The thing is, whether one is using a WordPress or Joomla or Drupal or other shit to "Post shit" to yer blog, or one is using (some program to do same locally) to locally create shtuff to FTP to yer site... it's all the same!
One writes basic text in basic format, and whether you use "stars" to mark a bold word or yer mouse to select and click "bold", yer marking up text for HTML!
Okay. The Raven on the roof is caw-ing for me to get to the point...
Let us say that you want to extend Markdown to adopt an extension like:
In this sentence, #make this uppercase#.
That's like this:
In this sentence, *make this bold*.
Okay? Kinda simple. Right? Okay. But here's the thing: how do you do that?
Say you use Markdown.pl. Where in that code do you modify it to support the Uppercase thingy? Say you use NODE or Python or whatever. How do you change the code to support Uppercase?
Go do that right now. I'll wait.
It's About Time
Actually, I mean it is about change. ("About time" just sounds a bit nicer.)
I have had a motto for a while now, and it's like:
No, I am not about to even try to change any markdown code's code to support that uppercase extension. For two reasons: 1) it will be very difficult and time consuming; 2) any changes made will interfere with all future versions and updates.
Of course, point one is probably just me. And point two is only if you are using a Node/Java/Python/Ruby et al. program/library.
A big turd splattered on my window and as I shook my fist at the seagull outside it screeched back, "Get to the point!"
Okay. Okay.
For SMU's markdown dataset, I just add a line in the data like this:
'/#(.*)#/U' => function($m){return strtoupper($m[1]);},
We just changed the data, not the code
Two obvious things there. The regular expression does not account for escapes (e.g. "\#escapes\#"), and the data relies on PHP and on having to know what $m is (does anyone read API documentation? more on that elsewhere).
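On the escapes point, here is one way a regex could be made to skip backslash-escaped markers. This is a sketch under assumptions, not part of the SMU dataset: the names $upper_rules and apply_upper_rules are hypothetical, and it does not try to handle an escaped backslash sitting right before a "#".

```php
<?php
// Hypothetical escape-aware version of the uppercase rule.
$upper_rules = array(
    // '#...#' becomes UPPERCASE, ignoring any '#' with a backslash before it.
    '/(?<!\\\\)#(.+?)(?<!\\\\)#/' => function ($m) { return strtoupper($m[1]); },
    // Afterwards, turn each escaped '\#' into a plain '#'.
    '/\\\\#/' => function ($m) { return '#'; },
);

// Hypothetical helper: apply the rules in the order they are listed.
function apply_upper_rules(array $rules, string $text): string
{
    foreach ($rules as $pattern => $callback) {
        $text = preg_replace_callback($pattern, $callback, $text) ?? $text;
    }
    return $text;
}

echo apply_upper_rules($upper_rules, 'In this sentence, #make this uppercase#.'), "\n";
// prints: In this sentence, MAKE THIS UPPERCASE.
```

Order matters here: the unescape rule has to come last, or the first rule would see plain "#" characters where the escapes used to be.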
This idea ain't new, of course. A similar model of doing things is for Templates. There is a good example of the "change the data to change the code" way of programming in Pygments with the pygmentize stylers; they are basically just data. And that is a widely used/implemented concept.
Back To The Code
As mentioned, the "data" here is a PHP array, and this data can have "code" by way of closures. But it's still data in most senses of the term. (If you disagree, bring it up with PHP.)
But it is PHP code as data. Meaning, if one were to port SMU to, say, Perl, one would also have to convert any SMU datasets to a Perl-compatible format.
I only mention that (as that's what everybody's got to do when porting a program from one language to another) to get to what can be done to eliminate it. Real DATA.
Like, say there was this:
* bold
` code
** emphasis
# UPPER
And say that that meant the basis of word-based text markup to HTML tag markup conversion rules...
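To make that a bit more concrete, here is a rough sketch of turning such a table into regex rules. Everything in it is an assumption for illustration: the one-pair-per-line layout, the name-to-tag mapping in $tags, treating UPPER as a call to strtoupper, and the function name table_to_rules. None of it is SMU's format.

```php
<?php
// The table from the text, assumed to be one "marker name" pair per line.
$table = "* bold\n` code\n** emphasis\n# UPPER";

// Assumed meaning of each name: an HTML tag, or a PHP function to call.
$tags  = array('bold' => 'strong', 'code' => 'code', 'emphasis' => 'em');
$funcs = array('UPPER' => 'strtoupper');

// Hypothetical converter: plain-text table in, pattern => callback rules out.
function table_to_rules(string $table, array $tags, array $funcs): array
{
    $pairs = array();
    foreach (preg_split('/\R/', trim($table)) as $line) {
        list($marker, $name) = preg_split('/\s+/', trim($line), 2);
        $pairs[$marker] = $name;
    }
    // Longest markers first, so '**' is tried before '*'.
    uksort($pairs, function ($a, $b) { return strlen($b) - strlen($a); });

    $rules = array();
    foreach ($pairs as $marker => $name) {
        $q = preg_quote($marker, '/');
        $pattern = '/' . $q . '(.+?)' . $q . '/';
        if (isset($funcs[$name])) {
            $f = $funcs[$name];
            $rules[$pattern] = function ($m) use ($f) { return $f($m[1]); };
        } elseif (isset($tags[$name])) {
            $tag = $tags[$name];
            $rules[$pattern] = function ($m) use ($tag) { return "<$tag>{$m[1]}</$tag>"; };
        }
        // Names with no known meaning are skipped.
    }
    return $rules;
}

$text = 'All I got was *Coal*!';
foreach (table_to_rules($table, $tags, $funcs) as $pattern => $callback) {
    $text = preg_replace_callback($pattern, $callback, $text);
}
echo $text, "\n"; // prints: All I got was <strong>Coal</strong>!
```

The table itself has no PHP in it, so a Perl port would only need its own version of table_to_rules, not a rewritten dataset.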
- Maybe it was tolower, but I think you get my meaning.
- But I did choose the word "probably", as in, actually, probably not just me.
- But a regex can be made to do so.
- Though "templating systems" just move the complexity to a, umm, "templating system".
- I wish they took one more step and allowed for multiple "tokens" files. sigh