Managing Preformatted Text Content

I would like to start by saying I have a tendency to over-think problems. This all started as a very quick project to add a blog feature to my content management system. Simple, right? Well, hold your horses, let me explain.

When storing data in the database (MySQL, in my case), I have to encode every text column, since text can contain special characters such as single quotes, double quotes, HTML tags, etc. I do this with the encoding and decoding functions provided by PHP: htmlspecialchars before a column is written to the database, and htmlspecialchars_decode after it is read back. To encode, the syntax is htmlspecialchars($value, ENT_QUOTES); to decode, it is htmlspecialchars_decode($value, ENT_QUOTES). Please note that the ENT_QUOTES flag converts both single and double quotation marks, so you won't have a problem storing the data in your database.
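As a minimal sketch of that round trip, the snippet below encodes a value and decodes it again. The sample string and the variable names are made up for illustration, and the database write and read in between are left out.

```php
<?php
// Hypothetical value typed by a user: it contains both kinds of quotes and a tag.
$value = "It's a <b>\"bold\"</b> move";

// Encode before the value is written to the database.
$encoded = htmlspecialchars($value, ENT_QUOTES);
echo $encoded, "\n";
// It&#039;s a &lt;b&gt;&quot;bold&quot;&lt;/b&gt; move

// Decode after the value is read back.
$decoded = htmlspecialchars_decode($encoded, ENT_QUOTES);
var_dump($decoded === $value); // bool(true)
```

Note that htmlspecialchars also encodes the ampersand itself, which matters later on.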

For my content management system I use TinyMCE to add and edit text. It is a very efficient and useful tool. Pre-version 5, TinyMCE is free. TinyMCE provides a way to include preformatted text in your content: you have to configure it to use the codesample/code feature. Once it is properly configured, you can insert preformatted text.

I create my content and, using TinyMCE, insert preformatted text here and there, then store the data. The data is stored correctly with the help of htmlspecialchars. Later I need to edit the content that contains the preformatted text, again with TinyMCE. The data is retrieved and, with the help of htmlspecialchars_decode, returned properly, except that when the preformatted text is displayed, all the ‘coded’ information is missing.

If you take a close look at what is happening, the preformatted text is still there, but it is now treated as HTML code: &lt; and &gt; are converted to < and >, so the TinyMCE editor can't tell the difference and treats the data as HTML too.
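The effect can be reproduced outside the browser. This is only a sketch: the content string is invented, and html_entity_decode is used here as a stand-in for the single round of entity decoding that happens when the content is loaded into the editor, which is an assumption about where the conversion takes place.

```php
<?php
// Hypothetical content as it comes back from the database after decoding.
$content = '<pre><code>&lt;b&gt;bold&lt;/b&gt;</code></pre>';

// Assumption: html_entity_decode() approximates the one round of entity
// decoding applied when the content is handed to the editor.
$seen_by_editor = html_entity_decode($content, ENT_QUOTES);

echo $seen_by_editor, "\n";
// <pre><code><b>bold</b></code></pre>
```

The sample that was meant to show the tag as text now contains a real tag, so the editor renders it instead of displaying it.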

This is where I over-thought the problem, and this is the part that eluded me for the longest time. The preformatted text needs an extra ampersand included in its content to be handled properly. So an entity such as &lt; needs an extra ampersand in front of it, something like &&lt;, with only one & being removed. However, browsers cannot handle that, so you have to trick them by using the special character for an ampersand. This turned out to be easier than I thought: before displaying the preformatted text in TinyMCE, every ampersand (&) inside the HTML code tag needs to be converted to &amp;, so that &lt; becomes &amp;lt;. That is all there is to it.
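To see why one extra level of escaping is enough, here is a small sketch on the body of a single code sample. The sample text is hypothetical, and html_entity_decode is again only an approximation of the decoding step on the browser side.

```php
<?php
// Hypothetical body of a code sample, as TinyMCE wrote it.
$sample = '&lt;b&gt;bold&lt;/b&gt;';

// Escape every ampersand so each entity survives one round of decoding.
$escaped = str_replace('&', '&amp;', $sample);
echo $escaped, "\n";
// &amp;lt;b&amp;gt;bold&amp;lt;/b&amp;gt;

// Assumption: html_entity_decode() approximates the browser's decoding step.
$after_browser = html_entity_decode($escaped, ENT_QUOTES);
var_dump($after_browser === $sample); // bool(true)
```

After that one round of decoding the editor receives &lt; and &gt; again, which it shows as text rather than as tags.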

The data that has preformatted text needs to be modified using the following function:

    function convert_preformatted_text($haystack)
    {
        $needle_start_code = "<code>";
        $needle_end_code = "</code>";
        // Inside a code sample every ampersand becomes &amp;.
        $search = array("&");
        $replace = array("&amp;");

        // Nothing to do when the content has no code sample. Compare with
        // false explicitly: strpos() returns 0 for a match at the very start.
        if (strpos($haystack, $needle_start_code) === false) return $haystack;

        $offset = 0;
        $needle = $needle_start_code;
        $result = '';
        $len = strlen($haystack);

        while ($offset < $len) {
            $strpos = strpos($haystack, $needle, $offset);
            if ($strpos === false) {
                // No further tag: copy the remainder unchanged and stop.
                $result .= substr($haystack, $offset);
                $offset = $len;
            } else {
                // Text between the previous tag and the one just found.
                $text = substr($haystack, $offset, $strpos - $offset);
                switch ($needle) {
                    case $needle_start_code: // text outside a code sample
                        $result .= $text;
                        $needle = $needle_end_code;
                        break;
                    case $needle_end_code: // text inside a code sample
                        $result .= str_replace($search, $replace, $text);
                        $needle = $needle_start_code;
                        break;
                }
                $offset = $strpos;
            }
        }
        return $result;
    }

Here is an example of how I used it to allow this page to be properly shown in TinyMCE:

    public function tinymce_record()
    {
        $this->record['blog_content'] = convert_preformatted_text($this->record['blog_content']);
        return $this->record;
    }
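
For comparison, the same idea can be sketched with a regular expression instead of a manual scan. This is not the function I use: escape_code_ampersands is a hypothetical name, and the pattern assumes code samples are never nested. Unlike the literal "<code>" search, it also matches a code tag that carries attributes.

```php
<?php
// Alternative sketch: escape ampersands inside every <code>...</code>
// element with a single regular expression.
function escape_code_ampersands(string $html): string
{
    // Group 1: opening tag (attributes allowed), 2: body, 3: closing tag.
    $result = preg_replace_callback(
        '~(<code\b[^>]*>)(.*?)(</code>)~s',
        function (array $m): string {
            return $m[1] . str_replace('&', '&amp;', $m[2]) . $m[3];
        },
        $html
    );
    // preg_replace_callback() returns null on error; fall back to the input.
    return $result ?? $html;
}

$html = '<p>A &amp; B</p><pre><code class="language-php">&lt;?php echo 1; ?&gt;</code></pre>';
echo escape_code_ampersands($html), "\n";
// <p>A &amp; B</p><pre><code class="language-php">&amp;lt;?php echo 1; ?&amp;gt;</code></pre>
```

Text outside the code sample, such as the &amp; in the first paragraph, is left untouched.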