WordPress – avoiding wpautop method
Posted by Shaked
Long time ha?
So I`m working on a WordPress plugin for a while, and I found some common issues with plugins that need to be integrated with WordPress's posts.
One of the main issues is the wpautop method.
Description
From WordPress's website:
Changes double line-breaks in the text into HTML paragraphs (<p>...</p>).
WordPress uses it to filter the content and the excerpt.
Now lets dig a bit more inside ha?
function wpautop($pee, $br = 1) {
if ( trim($pee) === '' )
return '';
$pee = $pee . "\n"; // just to make things a little easier, pad the end
$pee = preg_replace('|<br />\s*<br />|', "\n\n", $pee);
// Space things out a little
$allblocks = '(?:table|thead|tfoot|caption|col|colgroup|tbody|tr|td|th|div|dl|dd|dt|ul|ol|li|pre|select|option|form|map|area|blockquote|address|math|style|input|p|h[1-6]|hr|fieldset|legend|section|article|aside|hgroup|header|footer|nav|figure|figcaption|details|menu|summary)';
$pee = preg_replace('!(<' . $allblocks . '[^>]*>)!', "\n$1", $pee);
$pee = preg_replace('!(</' . $allblocks . '>)!', "$1\n\n", $pee);
$pee = str_replace(array("\r\n", "\r"), "\n", $pee); // cross-platform newlines
if ( strpos($pee, '<object') !== false ) {
$pee = preg_replace('|\s*<param([^>]*)>\s*|', "<param$1>", $pee); // no pee inside object/embed
$pee = preg_replace('|\s*</embed>\s*|', '</embed>', $pee);
}
$pee = preg_replace("/\n\n+/", "\n\n", $pee); // take care of duplicates
// make paragraphs, including one at the end
$pees = preg_split('/\n\s*\n/', $pee, -1, PREG_SPLIT_NO_EMPTY);
$pee = '';
foreach ( $pees as $tinkle )
$pee .= '<p>' . trim($tinkle, "\n") . "</p>\n";
$pee = preg_replace('|<p>\s*</p>|', '', $pee); // under certain strange conditions it could create a P of entirely whitespace
$pee = preg_replace('!<p>([^<]+)</(div|address|form)>!', "<p>$1</p></$2>", $pee);
$pee = preg_replace('!<p>\s*(</?' . $allblocks . '[^>]*>)\s*</p>!', "$1", $pee); // don't pee all over a tag
$pee = preg_replace("|<p>(<li.+?)</p>|", "$1", $pee); // problem with nested lists
$pee = preg_replace('|<p><blockquote([^>]*)>|i', "<blockquote$1><p>", $pee);
$pee = str_replace('</blockquote></p>', '</p></blockquote>', $pee);
$pee = preg_replace('!<p>\s*(</?' . $allblocks . '[^>]*>)!', "$1", $pee);
$pee = preg_replace('!(</?' . $allblocks . '[^>]*>)\s*</p>!', "$1", $pee);
if ($br) {
$pee = preg_replace_callback('/<(script|style).*?<\/\\1>/s', '_autop_newline_preservation_helper', $pee);
$pee = preg_replace('|(?<!<br />)\s*\n|', "<br />\n", $pee); // optionally make line breaks
$pee = str_replace('<WPPreserveNewline />', "\n", $pee);
}
$pee = preg_replace('!(</?' . $allblocks . '[^>]*>)\s*<br />!', "$1", $pee);
$pee = preg_replace('!<br />(\s*</?(?:p|li|div|dl|dd|dt|th|pre|td|ul|ol)[^>]*>)!', '$1', $pee);
if (strpos($pee, '<pre') !== false)
$pee = preg_replace_callback('!(<pre[^>]*>)(.*?)</pre>!is', 'clean_pre', $pee );
$pee = preg_replace( "|\n</p>$|", '</p>', $pee );
return $pee;
}
You may find more at http://core.trac.wordpress.org/browser/tags/3.3.1/wp-includes/formatting.php
As you can see, this function changes a lot of things in WordPress's posts, such as:
- Adding \n after (almost) each HTML tag
- Fixing new lines to cross platform newlines (\r\n to \n)
- Changing the <object> tag
- Adding <p> tags.
- Cleaning empty <pre> tags
So I marked the most important issue about this function "Adding <p> tags".
Why is it so important for me?
As I wrote before it seems that one of the biggest problems for developers that develop WordPress plugins is the wpautop plugin. When they are trying to modify the user's posts, they always need to struggle with the <p> tags and sometimes they don't know how to handle them right.