Generating thumbnails on the fly with WordPress

One advantage of Drupal’s image handling over WordPress’s: WordPress generates its resized images at upload time only, whereas Drupal generates a thumbnail automatically on page load if it doesn’t already exist.

You could put this function into your WP template. Given an image URL $image_url, it checks whether the thumbnail at $thumb_url exists. If it doesn’t, it generates a thumbnail of size $xdim x $ydim.

function maybe_generate_userarticlethumb($thumb_url, $image_url, $xdim, $ydim) {

  /* Map both URLs to filesystem paths (assumes the files live under DOCUMENT_ROOT on this host) */
  $search  = array($_SERVER["HTTP_HOST"], 'http://', 'https://');
  $replace = array($_SERVER["DOCUMENT_ROOT"], '', '');
  $imagepath = str_replace($search, $replace, $image_url);
  $thumbpath = str_replace($search, $replace, $thumb_url);

  /* Checking the local path is quicker than fetching the URL with file_get_contents */
  if (!file_exists($thumbpath)) {
    $image = wp_get_image_editor($imagepath);
    if ( ! is_wp_error( $image ) ) {
       /* Use any of WP's image manipulation functions here */
       $image->resize( $xdim, $ydim, true );
       $image->save($thumbpath);
    }
  }
}

Then just call maybe_generate_userarticlethumb($thumb_url, $image_url, $xdim, $ydim) in your template, and print $thumb_url where you need to.

Target Safari and Chrome with specific CSS

Sometimes you need to apply CSS rules specifically to WebKit browsers such as Safari and Chrome. One example: non-standard fonts rendered using @font-face or an embed service such as Typekit or Google webfonts sometimes display thinner than in other browsers.

You can deal with this using a media query:

@media screen and (-webkit-min-device-pixel-ratio:0) {
   body { font-weight: 400; }
}

Thanks to phrappe.com for this useful tidbit!

Searching for large files on your server

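Here’s a command that walks the root / folder recursively, lists every file and directory that uses 1GB or more, and sorts the list with the largest first. The -a flag makes du report individual files as well as directory totals, and the error redirect sits on du so that permission errors are hidden.

cd / ; du -ah 2>/dev/null | grep '^[0-9]\+\.\?[0-9]\?G' | sort -nrk 1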
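
If you only care about individual files rather than directory totals, find can do the size filtering itself. This is a minimal sketch, assuming GNU find (for the M and G size suffixes). The function name big_files is made up for illustration.

```shell
# List regular files under a directory that are larger than a threshold,
# biggest first. Output is "size-in-KB <tab> path".
#   $1 = directory to search
#   $2 = find-style size test, e.g. +1G or +500M (GNU find syntax assumed)
big_files() {
    # -xdev keeps find on one filesystem, so /proc and other mounts are skipped
    find "$1" -xdev -type f -size "$2" -exec du -k {} + 2>/dev/null | sort -nr
}

# Example: big_files / +1G
```

Because the size test is passed in as an argument, the same function works for smaller thresholds on a single account, eg big_files /home/someuser +100M (the path is a placeholder).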

Thanks to Phil from WiredTree for sharing that info, I’m just putting it down here so I can remember it in future :)

Redirect rules to change underscore to hyphen

Unfortunately Apache’s mod_rewrite only allows nine backreferences ($1 to $9), so a single rule can replace at most eight underscores, and that is the maximum number of underscore changes this code allows. We use this mainly within WordPress sites, so here is the entire WP .htaccess file to show where to position the code.

# BEGIN WordPress

RewriteEngine On
RewriteBase /

# BEGIN underscore to hyphen redirect code
RewriteRule ^([^_]*)_([^_]*)_([^_]*)_([^_]*)_([^_]*)_([^_]*)_([^_]*)_([^_]*)_([^_]*)$ /$1-$2-$3-$4-$5-$6-$7-$8-$9 [R=301,L]
RewriteRule ^([^_]*)_([^_]*)_([^_]*)_([^_]*)_([^_]*)_([^_]*)_([^_]*)_([^_]*)$ /$1-$2-$3-$4-$5-$6-$7-$8 [R=301,L]
RewriteRule ^([^_]*)_([^_]*)_([^_]*)_([^_]*)_([^_]*)_([^_]*)_([^_]*)$ /$1-$2-$3-$4-$5-$6-$7 [R=301,L]
RewriteRule ^([^_]*)_([^_]*)_([^_]*)_([^_]*)_([^_]*)_([^_]*)$ /$1-$2-$3-$4-$5-$6 [R=301,L]
RewriteRule ^([^_]*)_([^_]*)_([^_]*)_([^_]*)_([^_]*)$ /$1-$2-$3-$4-$5 [R=301,L]
RewriteRule ^([^_]*)_([^_]*)_([^_]*)_([^_]*)$ /$1-$2-$3-$4 [R=301,L]
RewriteRule ^([^_]*)_([^_]*)_([^_]*)$ /$1-$2-$3 [R=301,L]
RewriteRule ^([^_]*)_([^_]*)$ /$1-$2 [L,R=301]
# END underscore to hyphen redirect code

RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]


# END WordPress
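
Typing out eight near-identical rules by hand is error-prone, so here is a minimal sketch of a shell function that prints them instead. The name gen_underscore_rules is hypothetical and not part of Apache or WordPress. The output follows the same pattern as the rules above, longest first.

```shell
# Print the underscore-to-hyphen RewriteRules, longest pattern first.
#   $1 = maximum number of underscores to handle (default 8; $9 is the
#        last backreference mod_rewrite offers, so 8 is the ceiling)
gen_underscore_rules() {
    n=${1:-8}
    while [ "$n" -ge 1 ]; do
        pattern='([^_]*)'
        target='/$1'
        i=2
        # n underscores need n+1 capture groups
        while [ "$i" -le $((n + 1)) ]; do
            pattern="${pattern}_([^_]*)"
            target="${target}-\$${i}"
            i=$((i + 1))
        done
        printf 'RewriteRule ^%s$ %s [R=301,L]\n' "$pattern" "$target"
        n=$((n - 1))
    done
}

# Example: gen_underscore_rules 8 > underscore-rules.txt
```

The longest pattern has to come first only for readability here: each pattern matches an exact number of underscores, so the rules do not overlap.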

Migrating static HTML sites into a WordPress multisite

1. Create a multisite account for the site you are about to import

We set up our multisite using subdomains. This is useful for us, as some of our sites there really are subdomains of the main site. If you use subfolders, some of the steps below might be different. Unfortunately we haven’t tested them for that case.

2. Import the site

We are using the HTML Import plugin. You can import blog posts and pages in separate import runs. For pages, it will keep the hierarchy.

3. Clean up superfluous HTML

Sometimes there is unneeded content that appears on every page. For example, our HTML sites were scraped from Plone sites, so there was lots of template cruft at the beginning and end of every post (plus the title was inside the CSS class that served as our ‘body’ class, so it ended up inside the main content too). So here’s a little PHP script you can modify for your own ends. Note that you will also need to download the htmLawed script and place it in the same folder as the script below.

 2,
'tidy' => 1,
'elements' => $elements,
'cdata' => 1,
'comment' => 1,
'deny_attribute' => 'align'
);

mysql_select_db($dbname);

$doc = new DOMDocument();

$query = mysql_query("SELECT ID, post_date, post_title, post_name, post_content, post_type, post_parent FROM " . POSTTABLE . " WHERE post_type IN ('post', 'page') AND post_status = 'publish'");

while($post = mysql_fetch_object($query)) {

if(TESTINGSINGLE == false || $post->ID == TESTID) {
print "\n\n" . '**opening ' . $post->ID . ' - name: ' . $post->post_name . "\n";
}

$oldpost = $post->post_content;
$doc->loadHTML($oldpost);

$newtabledom = new DOMDocument;
$xpath = new DOMXPath($doc);

$newtabledom = $doc;
$pagepath = new DOMXPath($newtabledom);

// Remove content inside certain classes inside page
// The below elements are leftover elements from Plone
// See XPath documentation for more details of how to make queries
$toremove= $pagepath->query("//h1[@class='documentFirstHeading'] | //p[@class='documentDescription'] | //div[@class='documentDescription'] | //div[@class='documentByLine'] | //div[@class='documentActions'] | //div[@id='relatedItems'] | //div[@class='discussion'] | //a[@id='documentContent'] | //a[@class='link-parent']");

foreach ($toremove as $entry) {
$entry->parentNode->removeChild($entry);
}

// We try and change classes of images to use the WordPress floated classes
// The script detects classes (or parent div classes) that have the words
// left right or center and renames them
$imgtags = $doc->getElementsByTagName('img');
foreach($imgtags as $child) {
// getAttribute() returns an empty string when the attribute is missing
$linkclass = $child->getAttribute('class');
$alignclass = $child->getAttribute('align');
$linkfile = $child->getAttribute('src');

if(strpos($linkclass, 'left') !== false || strpos($alignclass, 'left') !== false) {
$child->setAttribute( 'class' , 'alignleft' );
}
else if (strpos($linkclass, 'right') !== false || strpos($alignclass, 'right') !== false) {
$child->setAttribute( 'class' , 'alignright' );

}
else if (strpos($linkclass, 'centre') !== false || strpos($linkclass, 'center') !== false
|| strpos($alignclass, 'centre') !== false || strpos($alignclass, 'center') !==false ) {
$imageinfo = @getimagesize(SITEROOT . $linkfile);
$child->setAttribute( 'class' , 'aligncenter' );
}
else {

// get parent and grandparent
$parent = $child->parentNode;
$grandparent = $parent ? $parent->parentNode : null;

// Only element nodes carry attributes; anything else counts as empty
$parentclass = ($parent instanceof DOMElement) ? $parent->getAttribute('class') : '';
$parentalign = ($parent instanceof DOMElement) ? $parent->getAttribute('align') : '';
$grandparentclass = ($grandparent instanceof DOMElement) ? $grandparent->getAttribute('class') : '';
$grandparentalign = ($grandparent instanceof DOMElement) ? $grandparent->getAttribute('align') : '';

if (strpos($parentclass, 'left') !== false || strpos($parentalign, 'left') !== false) {
print "\n\n" . '** left parent ' . $linkfile . " - class: " . $parentclass . " - align: " . $parentalign. "\n";
$child->setAttribute( 'class' , 'alignleft' );
}
else if (strpos($parentclass, 'right') !== false || strpos($parentalign, 'right') !== false) {
print "\n\n" . '** right parent ' . $linkfile . " - class: " . $parentclass . " - align: " . $parentalign. "\n";
$child->setAttribute( 'class' , 'alignright' );
}
else if (strpos($parentclass, 'centre') !== false || strpos($parentclass, 'center') !== false
|| strpos($parentalign, 'centre') !== false || strpos($parentalign, 'center') !== false) {
print "\n\n" . '** centred parent ' . $linkfile . " - class: " . $parentclass . " - align: " . $parentalign. "\n";
$imageinfo = @getimagesize($linkfile);
$child->setAttribute( 'class' , 'aligncenter' );
}
else if (strpos($grandparentclass, 'left') !== false || strpos($grandparentalign, 'left') !== false) {
print "\n\n" . '** left grandparent ' . $linkfile . " - class: " . $grandparentclass . " - align: " . $grandparentalign. "\n";
$child->setAttribute( 'class' , 'alignleft' );
}
else if (strpos($grandparentclass, 'right') !== false || strpos($grandparentalign, 'right') !== false) {
print "\n\n" . '** right grandparent ' . $linkfile . " - class: " . $grandparentclass . " - align: " . $grandparentalign. "\n";
$child->setAttribute( 'class' , 'alignright' );
}
else if (strpos($grandparentclass, 'centre') !== false || strpos($grandparentclass, 'center') !== false
|| strpos($grandparentalign, 'centre') !== false || strpos($grandparentalign, 'center') !== false) {
print "\n\n" . '** centred grandparent ' . $linkfile . " - class: " . $grandparentclass . " - align: " . $grandparentalign. "\n";
$imageinfo = @getimagesize($linkfile);
$child->setAttribute( 'class' , 'aligncenter' );
}
}
}

// Replace underscores with dashes inside relative links
// we are excluding ../ links for now - too complicated
$atags = $doc->getElementsByTagName('a');
foreach($atags as $child) {

$linkhref = $child->getAttribute('href');

if (!(substr($linkhref, 0, 4) == 'http' || substr($linkhref, 0, 1) == '/' || substr($linkhref, 0, 3) == '../')) {

$parent = mysql_fetch_object(mysql_query("SELECT post_name FROM " . POSTTABLE . " WHERE ID = " . (int) $post->post_parent));

// print "\n\n" . 'parent name: ' . $parent->post_name . "\n";
if(strpos($linkhref, '_') !== FALSE && (strpos($post->post_name, '-') !== FALSE || strpos($parent->post_name, '-') !== FALSE) ) {
print "\n\n" . 'link: ' . $linkhref . "\n";

$linkhref = str_replace('_', '-', $linkhref);
$linkhref = preg_replace('/--+/', '-', $linkhref);

print "\n\n" . 'changed internal link: ' . $linkhref . "\n";

$child->setAttribute( 'href' , $linkhref);
}
}

}

// Output HTML from query documents
$newtablehtml = $newtabledom->saveHTML();

// Text rewriting
// This can be modified to your needs
$newtablehtml = str_replace('[...]', '', $newtablehtml);
$newtablehtml = str_replace('/index.html"', '"', $newtablehtml);
$newtablehtml = str_replace('/"', '"', $newtablehtml);
$newtablehtml = str_replace('https://my.', 'http://www.', $newtablehtml);

// Sometimes there are encoding issues which need dealing with
$newtablehtml = str_replace("\xC2\xA0", '', $newtablehtml); // UTF-8 non-breaking space
$newtablehtml = str_replace('Â', '', $newtablehtml);
$newtablehtml = str_replace('„', '', $newtablehtml);
$newtablehtml = str_replace('â€&#8482', "'", $newtablehtml);
$newtablehtml = str_replace("‘", "'", $newtablehtml);
$newtablehtml = str_replace("’", "'", $newtablehtml);
$newtablehtml = str_replace("“", "'", $newtablehtml);
$newtablehtml = str_replace("”", "'", $newtablehtml);
$newtablehtml = str_replace("–", " - ", $newtablehtml);
$newtablehtml = str_replace("—", " - ", $newtablehtml);
$newtablehtml = str_replace("’", "'", $newtablehtml);
$newtablehtml = str_replace("“", "", $newtablehtml);
$newtablehtml = str_replace("”", "", $newtablehtml);

// Sometimes the old page still contains html doctype
// inside the content tag
if (strpos($newtablehtml, '') !== 0) {
$newtablehtml = str_replace('', '', $newtablehtml);
}

// Remove empty paragraphs
$newtablehtml = preg_replace("#<p[^>]*>(\s|&nbsp;?)*</p>#", '', $newtablehtml);

// Now run htmLawed to clean up
$newtablehtml = htmLawed($newtablehtml, $config);

// Normalise post titles in all caps
if (strtoupper($post->post_title) == $post->post_title) {
  $post->post_title = ucwords(strtolower($post->post_title));
}

if(strlen($newtablehtml) > 30) {

  // Post name exists - save new post content only
  if(strlen(trim($post->post_name)) > 0) {
    if(TESTINGSINGLE == false || $post->ID == TESTID) {
      $query2 = "UPDATE " . POSTTABLE . " SET post_content = '" . mysql_real_escape_string($newtablehtml) . "', post_title = '" . mysql_real_escape_string($post->post_title) . "' WHERE ID = ". $post->ID;
      mysql_query($query2);
    }
  }
  // Need to generate post name from title
  else {
    $postname = strtolower(sanitize_file_name($post->post_title));
    if(TESTINGSINGLE == false || $post->ID == TESTID) {
      $query2 = "UPDATE " . POSTTABLE . " SET post_content = '" . mysql_real_escape_string($newtablehtml) . "', post_name ='" . mysql_real_escape_string($postname) . "', post_title = '" . mysql_real_escape_string($post->post_title) . "' WHERE ID = ". $post->ID;
      mysql_query($query2);
    }
  }
}
else {
  // Delete posts with very little or no content, unless they have child pages
  $query3 = mysql_query("SELECT ID FROM " . POSTTABLE . " WHERE post_type IN ('post', 'page') AND post_status = 'publish' AND post_parent = " . $post->ID);
  if(!mysql_fetch_object($query3)) {
    if(TESTINGSINGLE == false || $post->ID == TESTID) {
      mysql_query("DELETE FROM " . POSTTABLE . " WHERE ID = ". $post->ID);
    }
  }
}
}

// Taken from the WP function
function sanitize_file_name( $filename ) {
  $filename_raw = $filename;
  $special_chars = array("?", "[", "]", "/", "\\", "=", "<", ">", ":", ";", ",", "'", "\"", "&", "$", "#", "*", "(", ")", "|", "~", "`", "!", "{", "}", "–", "—","—", chr(0));
  $filename = str_replace($special_chars, '', $filename);
  $filename = preg_replace('/[\s-]+/', '-', $filename);
  $filename = trim($filename, '.-_');
  $entities = array("%e2", "%80", "%9c", "%9d", "%94", "%a0", "%93", "%99");
  $filename = str_replace($entities, '', $filename);

  // Append a counter until no existing post or page uses the name
  $unique = 0;
  $i = 0;
  while (!$unique) {
    $query = mysql_query("SELECT post_name FROM " . POSTTABLE . " WHERE post_type IN ('post', 'page') AND post_name = '" . $filename . "'");
    if(mysql_fetch_object($query)) {
      print('***not unique - ' . $filename);
      $filename = $filename . '-' . $i;
      $i = $i + 1;
    }
    else {
      $unique = 1;
    }
  }
  return $filename;
}
?>

4. Create redirections from one site to another

The HTML Import plugin also generates a very nice .htaccess file we can use as the basis for our redirects. Unfortunately we can’t use this directly in the .htaccess file for WordPress multisites, as the same .htaccess file is used across all sites. Fortunately we can use the Redirection plugin, which lets WordPress handle the redirections instead of Apache.

For static sites, we often need to cater for the case where the URL ends in / as well as /index.html. So we need to rewrite our redirects a little. Here’s a little shell script you can run. Copy the generated .htaccess file to your desktop, save it as htaccess and run

# specify your original domain here - ie the one that occurs first in the .htaccess rule
# (no spaces around = in a shell assignment)
DOMAIN=http://www.domain.com
# Replace tabs with spaces - much easier to deal with
expand -t1 htaccess > htaccess1
# this replaces your domain with ^/ - makes the remaining patterns much easier to target
# (double quotes so $DOMAIN is expanded; # as delimiter in case the domain contains a hyphen)
sed "s#$DOMAIN/#^/#g" htaccess1 > htaccess2
# Use RedirectMatch
sed 's/Redirect/RedirectMatch 301/g' htaccess2 > htaccess3
# Strip the mod_rewrite flags, as RedirectMatch doesn't accept them
# (the brackets are escaped so sed matches them literally, not as a character class)
sed 's/\[R=301,NC,L\]//g' htaccess3 > htaccess4
# This tacks on a regex that will handle / and index.html at the end of a URL
# note: this is for the case where your static URLs in .htaccess don't end in
# either / or /index.html
sed 's- http://-(?:/index.html|/)?$ http://-g' htaccess4 > htaccess_new

# comment the above line and uncomment one of these if your static URLs
# end in / (1st one) or /index.html (2nd one)
# sed 's-/ http://-(?:/index.html|/)?$ http://-g' htaccess4 > htaccess_new
# sed 's-/index.html-(?:/index.html|/)?$-g' htaccess4 > htaccess_new

Test it out on one line, and then see if it works.
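
To make that single-line test easier, the same steps can be chained into one pipeline that reads from stdin, so you can pipe one rule through it and look at the result. This is a minimal sketch that assumes the generated rules look like "Redirect old-url new-url [R=301,NC,L]" and that the old URLs end in neither / nor /index.html. The function name convert_redirects is made up for illustration.

```shell
# Convert "Redirect <old-url> <new-url> [R=301,NC,L]" lines on stdin into
# RedirectMatch rules that also accept a trailing / or /index.html.
#   $1 = original domain, e.g. http://www.domain.com
convert_redirects() {
    expand -t1 |
    sed -e "s#$1/#^/#g" \
        -e 's/^Redirect /RedirectMatch 301 /' \
        -e 's/ *\[R=301,NC,L\]//g' \
        -e 's- http://-(?:/index.html|/)?$ http://-g'
}

# Example: head -n 1 htaccess | convert_redirects http://www.domain.com
```

Once a single rule looks right, run the whole file through it: convert_redirects http://www.domain.com < htaccess > htaccess_new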

Drupal Views: selecting number of items via argument

Steps:
1. Add the argument Global: Null (it might work with other arguments as well, but this seemed the least intrusive).
2. Under ‘Specify validation criteria’, select ‘PHP code’ and add this:

$view->set_items_per_page($argument);
return true;

You don’t need to change anything else in the argument – however in the ‘More’ section it might be useful for future reference to change the argument name from ‘Global: Null’ to something like ‘Number of items’

This is especially useful for views you would like to include in nodes using the Viewfield or Insert View module, where you would like editors to be able to add views with the number of items they want without having access to the view structure.

Useful command line snippets for investigating site downtime

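Look at incoming hosts. This can be good for seeing if there is some kind of system abuse (eg a DDoS attack). The command counts the connections to port 80 per remote IP address and shows the busiest ones first.

netstat -plan | grep :80 | awk '{print $5}' | cut -d: -f 1 | sort | uniq -c | sort -nrk 1 | head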
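
If you run this often, the counting half of the pipeline can be wrapped in a small function. This is a minimal sketch; count_remote_hosts is a made-up name, and it assumes netstat-style input where the fifth column is the remote address as ip:port.

```shell
# Count connections per remote IP from netstat-style lines on stdin
# and print the busiest hosts first.
#   $1 = how many hosts to show (default 10)
count_remote_hosts() {
    awk '{print $5}' | cut -d: -f1 | sort | uniq -c | sort -nr | head -n "${1:-10}"
}

# Example: netstat -plan | grep :80 | count_remote_hosts 5
```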

Isolate Apache processes by account owner – sometimes it is Apache processes belonging to a particular site that are consuming memory.

lsof | grep '^httpd' | grep '/home/'

Using this command in conjunction with top should point out the culprit account.

Hat tip: our hosts WiredTree. This little bit of info came out of working through a recent issue with them.

Batch renaming files using command-line find and replace

This turned out to be quite easy:

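rename 'old text' 'new text' files-to-rename

So for example to change spaces to underscores in a folder of jpegs, you can cd to the folder and type

rename ' ' '_' *.jpg

Note that this form of rename only replaces the first occurrence of the text in each filename, so names with several spaces need more than one run. I did see some other syntax for this online which looked more like sed, and it does depend on the Linux distribution: Red Hat (which I was using at the time) ships the util-linux rename shown here, while Debian-based systems ship a Perl version that takes a sed-style expression.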
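
If you would rather not depend on which rename your distribution ships, a plain shell loop does the same job and replaces every space in one pass. This is a minimal sketch; the function name spaces_to_underscores is made up for illustration.

```shell
# Rename files in a directory, replacing all spaces in each name with underscores.
#   $1 = directory
#   $2 = filename pattern, quoted so the shell doesn't expand it early, e.g. '*.jpg'
spaces_to_underscores() {
    for f in "$1"/$2; do
        [ -e "$f" ] || continue          # pattern matched nothing
        base=$(basename "$f")
        new=$(printf '%s' "$base" | tr ' ' '_')
        # -n refuses to overwrite an existing file with the new name
        [ "$base" = "$new" ] || mv -n -- "$f" "$1/$new"
    done
}

# Example: spaces_to_underscores . '*.jpg'
```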

Separating a wordpress multisite into individual installations

We had 2 large sites on the same WordPress multisite, but then we decided to split it up for the following reasons:

  • From a sysadmin point of view, it’s always better to keep large sites in their own account
  • Easier from a debugging point of view – the simpler the setup, the easier it is to find issues
  • A lot of plugins (eg Broken Link Checker) don’t work with multisites. Also, it’s our experience that many other plugins (eg caching) work less well on a multisite than on their own installation.

So we decided to split a WP multisite with 2 sites: www.default.org, the default site, and www.addon.org, which was added as a multisite subdomain (addon.default.org) and then mapped to www.addon.org using the Domain Mapping plugin.

Extracting the database for the addon site:

  • Save a copy of your db somewhere, because we will first extract the db for www.addon.org, before returning to the original db to extract default.org. To extract the db for addon.org, we edited the tables using phpMyAdmin.
  • Empty any tables that can be reindexed (eg for the Relevanssi or Broken Link Checker plugins)
  • Remove all wp_ tables (these store the data for www.default.org) that have a matching wp_2_ table (these store the data for www.addon.org). Be very careful here: a few tables, eg wp_users and wp_usermeta, are shared by both installations and so have no wp_2_ equivalent. These need to be kept.
  • Then you can rename all wp_2_ tables to wp_ – for each table, select ‘Structure’ and then ‘Operations’ (maybe there is a faster way to do this via script, but this took me only 20 mins)
  • Edit wp-config.php and comment out the multisite variables and the sunrise variable associated with the Domain Mapping plugin
  • In wp-content folder, remove the sunrise.php file. Also it is probably a good idea to remove any cache files in this directory and temporarily get rid of any caching plugins.
  • In our newly renamed wp_options table change the siteurl and home variables from addon.default.org to www.addon.org, and change the upload_path variable from wp-content/blogs.dir/2/files to wp-content/uploads
  • Now we need to do some find and replace in our newly renamed wp_posts table: we need to change all instances of wp-content/blogs.dir/2/files to wp-content/uploads and also all instances of addon.default.org to www.addon.org. So in the SQL tab you can run the following commands:
    update wp_posts set post_content = replace(post_content, 'wp-content/blogs.dir/2/files', 'wp-content/uploads');
    update wp_posts set post_content = replace(post_content, 'addon.default.org', 'www.addon.org');
    It does no harm to also change the guid entry, just for the sake of cleanliness:
    update wp_posts set guid = replace(guid, 'wp-content/blogs.dir/2/files', 'wp-content/uploads');
    update wp_posts set guid = replace(guid, 'addon.default.org', 'www.addon.org');
    You might also need to do a search and replace in any other database tables added by other plugins, eg the Redirection plugin.
  • Finally, delete the uploads folder (again, make sure it is backed up somewhere) and rename the wp-content/blogs.dir/2/files folder to wp-content/uploads
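
The table-renaming step in the list above could probably be scripted rather than clicked through in phpMyAdmin. This is a minimal sketch, assuming the wp_2_ prefix used in this example. The helper gen_rename_sql is hypothetical, and mydb in the usage comment is a placeholder database name.

```shell
# Read table names on stdin (one per line) and print a RENAME TABLE statement
# for each wp_2_* table, mapping it to the matching wp_* name.
# Tables without the wp_2_ prefix are ignored.
gen_rename_sql() {
    sed -n 's/^wp_2_\(.*\)$/RENAME TABLE `wp_2_\1` TO `wp_\1`;/p'
}

# Example:
#   mysql -N -e 'SHOW TABLES' mydb | gen_rename_sql > rename.sql
```

Review the generated statements before running them, and only run them once the old wp_ tables with matching names have been removed, otherwise the renames will fail because the target names already exist.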