Migrating static HTML sites into a WordPress multisite

1. Create a multisite account for the site you are about to import

We set up our multisite using subdomains – this is useful for us as some of our sites there really are subdomains of the main site. If you use subfolders, some of the steps below might be different – unfortunately we haven’t tested them for that case.

2. Import the site

We are using the HTML Import plugin. You can import blog posts and pages in separate import runs. For pages, it will keep the hierarchy.

3. Clean up superfluous HTML

Sometimes there is unneeded content that appears on every page. For example, our HTML sites were scraped from Plone sites, so there was lots of template cruft at the beginning and end of every post (plus the title was inside the element carrying the CSS class that served as our ‘body’ class, so it ended up inside the main content too). So here’s a little PHP script you can modify for your own ends. Note that you will also need to download the htmLawed script and place it in the same folder as the script below.

<?php
// NOTE: the top of this script (htmLawed include, constants, database
// connection and the start of the config array) is not preserved in this
// copy. Everything down to $config is a reconstruction using placeholder
// values that you must replace with your own.
require_once 'htmLawed.php';

define('POSTTABLE', 'wp_2_posts'); // placeholder: posts table of the imported site
define('TESTINGSINGLE', false);    // true = only process the post with ID TESTID
define('TESTID', 0);               // placeholder post ID for test runs

// Placeholder credentials. The old mysql_* functions were removed in PHP 7,
// so this version uses mysqli.
$db = new mysqli('localhost', 'db_user', 'db_password', 'db_name');
$db->set_charset('utf8');

// Placeholder: the original list of allowed elements is not preserved.
// '*' lets htmLawed keep every element it knows about.
$elements = '*';

$config = array(
    'tidy' => 1,
    'elements' => $elements,
    'cdata' => 1,
    'comment' => 1,
    'deny_attribute' => 'align'
);

// Collect HTML parser warnings quietly instead of printing them
libxml_use_internal_errors(true);

// Map a class/align attribute pair to a WordPress alignment class ('' if none)
function wp_align_class($class, $align) {
    $hints = $class . ' ' . $align;
    if (strpos($hints, 'left') !== false) {
        return 'alignleft';
    }
    if (strpos($hints, 'right') !== false) {
        return 'alignright';
    }
    if (strpos($hints, 'centre') !== false || strpos($hints, 'center') !== false) {
        return 'aligncenter';
    }
    return '';
}

// Read the class and align attributes of a node ('' for anything that is not an element)
function class_and_align($node) {
    if ($node instanceof DOMElement) {
        return array($node->getAttribute('class'), $node->getAttribute('align'));
    }
    return array('', '');
}

$query = $db->query("SELECT ID, post_parent, post_date, post_title, post_name, post_content, post_type FROM " . POSTTABLE . " WHERE post_type IN ('post', 'page') AND post_status = 'publish'");

while ($post = $query->fetch_object()) {

    if (TESTINGSINGLE && $post->ID != TESTID) {
        continue;
    }
    print "\n\n" . '**opening ' . $post->ID . ' - name: ' . $post->post_name . "\n";

    // Parse the old post content (reconstructed step)
    $doc = new DOMDocument();
    if (trim($post->post_content) !== '') {
        $doc->loadHTML($post->post_content);
    }
    $pagepath = new DOMXPath($doc);

    // Remove content inside certain classes inside page
    // The below elements are leftover elements from Plone
    // See XPath documentation for more details of how to make queries
    $toremove = $pagepath->query("//h1[@class='documentFirstHeading'] | //p[@class='documentDescription'] | //div[@class='documentDescription'] | //div[@class='documentByLine'] | //div[@class='documentActions'] | //div[@id='relatedItems'] | //div[@class='discussion'] | //a[@id='documentContent'] | //a[@class='link-parent']");

    foreach ($toremove as $entry) {
        $entry->parentNode->removeChild($entry);
    }

    // We try and change classes of images to use the WordPress floated classes
    // The script detects classes (or parent/grandparent classes) that have the
    // words left, right or center and renames them
    foreach ($doc->getElementsByTagName('img') as $img) {
        $linkfile = $img->getAttribute('src');
        $parent = $img->parentNode;
        $levels = array(
            'image' => $img,
            'parent' => $parent,
            'grandparent' => $parent ? $parent->parentNode : null,
        );
        foreach ($levels as $label => $node) {
            list($class, $align) = class_and_align($node);
            $newclass = wp_align_class($class, $align);
            if ($newclass !== '') {
                if ($label !== 'image') {
                    print "\n\n" . '** ' . $newclass . ' from ' . $label . ' ' . $linkfile . " - class: " . $class . " - align: " . $align . "\n";
                }
                $img->setAttribute('class', $newclass);
                break;
            }
        }
    }

    // Look up the slug of the parent page (if any), once per post
    $parentname = '';
    if ($post->post_parent) {
        $parentresult = $db->query("SELECT post_name FROM " . POSTTABLE . " WHERE ID = " . (int) $post->post_parent);
        if ($parentresult && ($parent = $parentresult->fetch_object())) {
            $parentname = $parent->post_name;
        }
    }

    // Replace underscores with dashes inside relative links
    // we are excluding ../ links for now - too complicated
    foreach ($doc->getElementsByTagName('a') as $a) {
        $linkhref = $a->getAttribute('href');

        if (substr($linkhref, 0, 4) == 'http' || substr($linkhref, 0, 1) == '/' || substr($linkhref, 0, 3) == '../') {
            continue;
        }

        if (strpos($linkhref, '_') !== false && (strpos($post->post_name, '-') !== false || strpos($parentname, '-') !== false)) {
            print "\n\n" . 'link: ' . $linkhref . "\n";

            $linkhref = str_replace('_', '-', $linkhref);
            $linkhref = preg_replace('/--+/', '-', $linkhref);

            print "\n\n" . 'changed internal link: ' . $linkhref . "\n";

            $a->setAttribute('href', $linkhref);
        }
    }

    // Output HTML from query documents
    $newtablehtml = $doc->saveHTML();

    // Text rewriting
    // This can be modified to your needs
    $newtablehtml = str_replace('[...]', '', $newtablehtml);
    $newtablehtml = str_replace('/index.html"', '"', $newtablehtml);
    $newtablehtml = str_replace('/"', '"', $newtablehtml);
    $newtablehtml = str_replace('https://my.', 'http://www.', $newtablehtml);

    // Sometimes there are encoding issues which need dealing with
    // (the first search string is assumed to be a non-breaking space;
    // it shows up as a plain space in this copy)
    $newtablehtml = str_replace("\xC2\xA0", '', $newtablehtml);
    $newtablehtml = str_replace('Â', '', $newtablehtml);
    $newtablehtml = str_replace('„', '', $newtablehtml);
    $newtablehtml = str_replace('â€&#8482', "'", $newtablehtml);
    $newtablehtml = str_replace(array("‘", "’", "“", "”"), "'", $newtablehtml);
    $newtablehtml = str_replace(array("–", "—"), " - ", $newtablehtml);

    // Sometimes the old page still contains html doctype
    // inside the content tag. The literal strings used here are not
    // preserved; this regex is a reconstruction that strips any doctype plus
    // the <html>/<head>/<body> wrappers that saveHTML() adds.
    $newtablehtml = preg_replace('#<!DOCTYPE[^>]*>|</?(html|head|body)\b[^>]*>#i', '', $newtablehtml);

    // Remove empty paragraphs
    $newtablehtml = preg_replace("#<p[^>]*>(\s|&nbsp;?)*</p>#", '', $newtablehtml);

    // Now run htmLawed to clean up
    $newtablehtml = htmLawed($newtablehtml, $config);

    // Normalise post titles in all caps
    if (strtoupper($post->post_title) == $post->post_title) {
        $post->post_title = ucwords(strtolower($post->post_title));
    }

    if (strlen($newtablehtml) > 30) {
        $sets = "post_content = '" . $db->real_escape_string($newtablehtml) . "', post_title = '" . $db->real_escape_string($post->post_title) . "'";

        // No post name yet - generate one from the title
        if (strlen(trim($post->post_name)) == 0) {
            $postname = strtolower(sanitize_file_name($db, $post->post_title));
            $sets .= ", post_name = '" . $db->real_escape_string($postname) . "'";
        }

        $db->query("UPDATE " . POSTTABLE . " SET " . $sets . " WHERE ID = " . (int) $post->ID);
    }
    else {
        // Delete posts with very little or no content, unless they have child pages
        $children = $db->query("SELECT ID FROM " . POSTTABLE . " WHERE post_type IN ('post', 'page') AND post_status = 'publish' AND post_parent = " . (int) $post->ID);
        if (!$children->fetch_object()) {
            $db->query("DELETE FROM " . POSTTABLE . " WHERE ID = " . (int) $post->ID);
        }
    }
}

// Adapted from the WP function of the same name
function sanitize_file_name($db, $filename) {
    $special_chars = array("?", "[", "]", "/", "\\", "=", "<", ">", ":", ";", ",", "'", "\"", "&", "$", "#", "*", "(", ")", "|", "~", "`", "!", "{", "}", "–", "—", chr(0));
    $filename = str_replace($special_chars, '', $filename);
    $filename = preg_replace('/[\s-]+/', '-', $filename);
    $filename = trim($filename, '.-_');
    $entities = array("%e2", "%80", "%9c", "%9d", "%94", "%a0", "%93", "%99");
    $filename = str_replace($entities, '', $filename);

    // Append a counter until the name is not already taken
    $candidate = $filename;
    $i = 0;
    while ($db->query("SELECT post_name FROM " . POSTTABLE . " WHERE post_type IN ('post', 'page') AND post_name = '" . $db->real_escape_string($candidate) . "'")->fetch_object()) {
        print('***not unique - ' . $candidate);
        $candidate = $filename . '-' . $i;
        $i = $i + 1;
    }
    return $candidate;
}
?>

4. Create redirections from one site to another

The HTML Import plugin also generates a very nice .htaccess file we can use as the basis for our redirects. Unfortunately we can’t use this directly in the .htaccess file for WordPress multisites, as the same .htaccess file is used across all sites. Fortunately we can use the Redirection plugin, which lets WordPress handle the redirections instead of Apache.

For static sites, we often need to cater for the case where the URL ends in / as well as /index.html. So we need to rewrite our redirects a little – here’s a little shell script you can run. Copy the generated .htaccess file to your desktop (as htaccess) and run:

# specify your original domain here - ie the one that occurs first in the .htaccess rule
DOMAIN=http://www.domain.com
# Replace tabs with spaces - much easier to deal with
expand -t1 htaccess > htaccess1
# this replaces your domain with ^/ - makes the rest of the rule much easier to target
# (double quotes so that $DOMAIN is expanded by the shell)
sed "s|$DOMAIN/|^/|g" htaccess1 > htaccess2
# Use RedirectMatch
sed 's/Redirect/RedirectMatch 301/g' htaccess2 > htaccess3
# Drop the mod_rewrite-style flags, as RedirectMatch doesn't handle them
# (the brackets must be escaped, otherwise sed treats them as a character class)
sed 's/\[R=301,NC,L\]//g' htaccess3 > htaccess4
# This tacks on a regex that will handle / and index.html at the end of a URL
# note: this is for the case where your static URLs in .htaccess don't end in
# either / or /index.html
sed 's- http://-(?:/index.html|/)?$ http://-g' htaccess4 > htaccess_new

# comment the above line and uncomment one of these if your static URLs
# end in / (1st one) or /index.html (2nd one)
# sed 's-/ http://-(?:/index.html|/)?$ http://-g' htaccess4 > htaccess_new
# sed 's-/index.html-(?:/index.html|/)?$-g' htaccess4 > htaccess_new

Test it out on a single line first, and check that the redirect works before converting the whole file.
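
To try the rewrite on one rule without touching any files, the same steps can be wrapped in a function that reads rules on standard input. This is only a sketch under assumptions: it assumes the generated rules are tab-separated lines in the shape of the sample below (old URL, new URL, mod_rewrite-style flags), and the function name rewrite_rules and the host new.example.org are made up for illustration.

```shell
#!/bin/sh
# Hypothetical old domain; replace with the one in your own rules.
DOMAIN=http://www.domain.com

# Reads generated redirect rules on stdin, writes RedirectMatch rules on stdout.
rewrite_rules() {
  expand -t1 |                                        # each tab becomes one space
    sed "s|$DOMAIN/|^/|g" |                           # old domain becomes the ^/ anchor
    sed 's/Redirect/RedirectMatch 301/g' |            # switch directive
    sed 's/\[R=301,NC,L\]//g' |                       # drop the mod_rewrite flags
    sed 's- http://-(?:/index.html|/)?$ http://-g' |  # allow a trailing / or /index.html
    sed 's/ *$//'                                     # trim trailing spaces
}

# Sample rule (assumed format): Redirect<TAB>old URL<TAB>new URL<TAB>flags
printf 'Redirect\thttp://www.domain.com/about\thttp://new.example.org/about\t[R=301,NC,L]\n' |
  rewrite_rules
```

For the sample rule this prints a single RedirectMatch line whose pattern is ^/about(?:/index.html|/)?$ so /about, /about/ and /about/index.html all lead to the new URL.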

Drupal Views: selecting number of items via argument

Steps to reproduce:
1. Add argument Global:Null (it might work with other arguments as well, but this seemed the least intrusive)
2. Go to ‘Specify validation criteria’, select ‘PHP code’ and add this:

return true;

You don’t need to change anything else in the argument – however in the ‘More’ section it might be useful for future reference to change the argument name from ‘Global: Null’ to something like ‘Number of items’

This is especially useful for views you would like to include in nodes using the Viewfield or Insert View module, where you would like editors to be able to add views with the number of items they want without having access to the view structure.

Useful command line snippets for investigating site downtime

Look at incoming hosts – can be good for seeing if there is some kind of system abuse (eg a DDoS attack)

netstat -plan | grep :80 | awk '{print $5}' | cut -d: -f 1 | sort | uniq -c | sort -nrk 1 | head
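
Broken into stages, the one-liner reads as follows. This is a sketch for experimenting with the pipeline away from a live server: the function name top_remote_hosts and the sample lines (which use documentation-only IP addresses) are made up, and it assumes IPv4 addresses in the usual netstat column layout.

```shell
#!/bin/sh
# Reads netstat-style lines on stdin and prints "count address" for the ten
# remote addresses that appear most often on lines mentioning port 80.
top_remote_hosts() {
  grep ':80' |          # keep lines that mention port 80
    awk '{print $5}' |  # 5th column: remote address:port
    cut -d: -f1 |       # strip the port (IPv4 only)
    sort | uniq -c |    # count identical addresses
    sort -nrk1 |        # busiest first
    head                # top ten
}

# Made-up sample standing in for `netstat -plan` output
top_remote_hosts <<'EOF'
tcp 0 0 192.0.2.10:80 198.51.100.7:51000 ESTABLISHED 123/httpd
tcp 0 0 192.0.2.10:80 198.51.100.7:51001 ESTABLISHED 123/httpd
tcp 0 0 192.0.2.10:80 203.0.113.5:40000 TIME_WAIT -
tcp 0 0 192.0.2.10:22 203.0.113.9:50022 ESTABLISHED 99/sshd
EOF
```

On a server you would feed it the real thing: netstat -plan | top_remote_hosts. Note that grep ':80' also matches ports such as 8080, just as the original one-liner does.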

Isolate Apache processes by account owner – sometimes it is Apache processes belonging to a particular site that are consuming memory.

lsof | grep '^httpd' | grep '/home/'

Using this command in conjunction with top should point out the culprit account.
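
When the lsof output is long, a tally per account makes the heavy user easier to spot. A minimal sketch, assuming each account lives in /home/<account> (as on our servers); the function name httpd_files_by_account and the sample lines are made up for illustration.

```shell
#!/bin/sh
# Reads lsof-style lines on stdin and counts, per /home/<account> directory,
# how many files are held open by processes named httpd.
httpd_files_by_account() {
  grep '^httpd' |               # Apache processes only
    grep -o '/home/[^/ ]*' |    # keep just the /home/<account> part
    sort | uniq -c | sort -nr   # most open files first
}

# Made-up sample standing in for `lsof` output
httpd_files_by_account <<'EOF'
httpd 1001 nobody cwd DIR 8,1 4096 2 /home/alice/public_html
httpd 1001 nobody 5r REG 8,1 100 3 /home/alice/public_html/index.php
httpd 1002 nobody 5r REG 8,1 100 4 /home/bob/public_html/index.php
sshd 900 root cwd DIR 8,1 4096 2 /home/alice
EOF
```

On a live server: lsof | httpd_files_by_account.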

Hat tip: our hosts Wiredtree – this little bit of info came in response to working out a recent issue

Batch renaming files using command-line find and replace

This turned out to be quite easy, using the rename command:

rename 'old text' 'new text' files-to-rename

So for example, to change spaces to underscores in a folder of JPEGs, you can cd to the folder and type

rename ' ' '_' *.jpg

Note that this version of rename (the one from util-linux) only replaces the first match in each file name, so names with several spaces need more than one pass. I did see some other syntax for this online which looked more like sed, so it depends on the Linux distribution (I was using Red Hat at the time).
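
Where the command is missing or takes a different syntax, a plain shell loop does the same job on any distribution and handles every space in one pass. A minimal sketch; the function name spaces_to_underscores is made up, and it deliberately skips a file rather than overwrite an existing one.

```shell
#!/bin/sh
# Usage: spaces_to_underscores DIR EXT   (e.g. spaces_to_underscores . jpg)
# Renames every DIR/*.EXT so that spaces in the name become underscores.
spaces_to_underscores() (
  cd "$1" || exit 1
  for f in *."$2"; do
    [ -e "$f" ] || continue               # the glob matched nothing
    new=$(printf '%s' "$f" | tr ' ' '_')  # all spaces, not just the first
    [ "$f" = "$new" ] && continue         # nothing to change
    if [ -e "$new" ]; then
      echo "skipping $f: $new already exists" >&2
      continue
    fi
    mv -- "$f" "$new"
  done
)
```

The body runs in a subshell (note the parentheses), so the cd does not change the directory of the calling shell.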

Separating a WordPress multisite into individual installations

We had 2 large sites on the same WordPress multisite, but then we decided to split it up for the following reasons:

  • From a sysadmin point of view, it’s always better to keep large sites in their own account
  • Easier from a debugging point of view – the simpler the setup, the easier it is to find issues
  • A lot of plugins (eg Broken Link Checker) don’t work with multisites. Also, it’s our experience that many other plugins (eg caching plugins) work less well on a multisite than they do on their own installation.

So…we decided to take a WP multisite with 2 sites (www.default.org – the default site, and www.addon.org, which was added as a multisite subdomain addon.default.org and then mapped to www.addon.org using the Domain Mapping plugin) and split it into two standalone installations.

Extracting the database for the addon site:

  • Save a copy of your db somewhere, because we will first extract the db for www.addon.org, before returning to the original db to extract default.org. To extract the db for addon.org, we edited the tables using phpMyAdmin.
  • Empty any tables that can be reindexed (eg those of the Relevanssi or Broken Link Checker plugins)
  • Remove all wp_ tables (these store the data for www.default.org) that have a matching wp_2_ table (these store the data for www.addon.org). Be very careful here, as a few tables that were shared by both installations, eg wp_users and wp_usermeta, need to be kept – these have no wp_2_ equivalent.
  • Then you can rename all wp_2_ tables to wp_ – for each table, select ‘Structure’ and then ‘Operations’ (maybe there is a faster way to do this via script, but this took me only 20 mins)
  • Edit wp-config.php and comment out the multisite variables and the sunrise variable associated with the Domain Mapping plugin
  • In wp-content folder, remove the sunrise.php file. Also it is probably a good idea to remove any cache files in this directory and temporarily get rid of any caching plugins.
  • In our newly renamed wp_options table change the siteurl and home options from addon.default.org to www.addon.org, and change the upload_path option from wp-content/blogs.dir/2/files to wp-content/uploads
  • Now we need to do some find and replace in our newly renamed wp_posts table: we need to change all instances of wp-content/blogs.dir/2/files to wp-content/uploads and also all instances of addon.default.org to www.addon.org. So in the SQL tab you can do the following commands:
    update wp_posts set post_content = replace(post_content,'wp-content/blogs.dir/2/files','wp-content/uploads');
    update wp_posts set post_content = replace(post_content,'addon.default.org','www.addon.org');
    It does no harm to also change the guid entry, just for the sake of cleanliness:
    update wp_posts set guid = replace(guid,'wp-content/blogs.dir/2/files','wp-content/uploads');
    update wp_posts set guid = replace(guid,'addon.default.org','www.addon.org');
    You might also need to do search and replace in any other database tables added by other plugins, eg the Redirection plugin.
  • Finally, delete the uploads folder (again, make sure it is backed up somewhere) and rename the wp-content/blogs.dir/2/files folder to wp-content/uploads
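
For the table-renaming step, the statements can be generated rather than clicked through one by one. This is a sketch under assumptions: the function name rename_statements is made up, it expects one table name per line on standard input, and it only prints SQL so that you can read it before running anything.

```shell
#!/bin/sh
# Usage: rename_statements OLD_PREFIX NEW_PREFIX < table-names
# Prints one RENAME TABLE statement for every table that starts with OLD_PREFIX.
rename_statements() {
  awk -v old="$1" -v new="$2" 'index($0, old) == 1 {
    printf "RENAME TABLE `%s` TO `%s%s`;\n", $0, new, substr($0, length(old) + 1)
  }'
}

# Example with three table names; wp_users has no wp_2_ prefix and is left alone.
printf 'wp_2_posts\nwp_2_options\nwp_users\n' | rename_statements wp_2_ wp_
```

The list of names can come from SHOW TABLES in the SQL tab or the mysql client, and the printed statements can be pasted back into the SQL tab. As in the manual steps, the old wp_ tables with a wp_2_ counterpart must already have been removed, otherwise the renames will fail.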



WordPress-like summaries and Teasers in Drupal 7 and CKEditor

Over the past few versions of Drupal, there have been quite a few attempts to create a good way of creating page summaries/teasers/excerpts/extracts that you can use on the front page to encourage people to read the article.

The version that comes out of the box with Drupal 7 works quite nicely if no text editors are added, but our editors have found it rather confusing when used in combination with an editor like CKEditor. When a summary is included and selected, 2 text editors suddenly pop up on the page!

So we have been looking for ways to make the experience a little easier for people.

A good template to follow in this regard is WordPress. They have 2 different ways of creating summaries:

1) A ‘post excerpt’ field, which is good for 1-line summaries to be used in sidebar blocks and so forth. We will call this the short summary.

2) A ‘break’ button which you can press to separate the first couple of paragraphs from the rest (we will call this the long summary). This is good for a main blog page, to let people read the first couple of paragraphs, and then press ‘read more’ to read the rest.

We will try to emulate that here.

Out of the Box steps

So these are steps you can take without having to resort to code. They’ll take you 90% of the way there.

1. Disable the ‘Add/edit summary’ link

Go to Structure » Content types and click the ‘Manage fields‘ link for the content type. Edit the Body field and uncheck “Summary input”.

2. Add the Drupal break button in CKEditor

Go to Configuration » Content authoring » CKEditor. In the ‘Editor Appearance’ section, enable the ‘Drupal Break’ plugin, and add the button (a red dashed line) to the toolbar. This will enable ‘long summaries’ to be created.

3. Create a separate ‘Long text’ field for your ‘short summaries’

You can make this field ‘plain text’ input and limit it to 2 or 3 rows. You can then use this field in shorter Views listings to be placed in sidebars or on the front page.

After that, what’s left?

1. Views doesn’t recognise the <!--break--> tag. The CKEditor button inserts a <!--break--> tag to distinguish between the summary and the rest of the page. However, neither the Views ‘trimmed’ nor the ‘summary or trimmed’ display mode recognises it, which is odd as Drupal’s default trimming does. So the trick here is to populate the node summary on node save:

/* Implements hook_form_BASE_FORM_ID_alter() for the node form. */
function mymodule_form_node_form_alter(&$form, &$form_state) {
  // Not sure if we need this, if we have input disabled
  $form['body']['und'][0]['summary']['#access'] = FALSE;

  $form['#submit'][] = 'mymodule_node_form_submit';
}

function mymodule_node_form_submit($form, &$form_state) {
  // Store everything before the <!--break--> tag as the summary
  $form_state['values']['body']['und'][0]['summary'] = text_summary($form_state['values']['body']['und'][0]['value'], 3, 600);
}

The last 2 arguments of the text_summary function are the text format ID (which should probably be the same as the format of the body itself) and the trim length, which won’t matter in our case as text_summary looks for the <!--break--> tag before doing any trimming. With this addition, we can then use the long summary in our views, as the Body field in ‘Summary or trimmed’ mode.
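
The order of those two checks is the important part, and it can be modelled in a few lines of shell. This is only a toy model, not Drupal’s code: the function name summary_of is made up, and the fallback here is a plain character cut, whereas Drupal looks for a sensible place to stop.

```shell
#!/bin/sh
# Usage: summary_of TEXT SIZE
# Prints everything before the first <!--break--> if there is one,
# otherwise the first SIZE characters of TEXT.
summary_of() {
  body=$1
  size=$2
  delim='<!--break-->'
  case $body in
    *"$delim"*)
      summary=${body%%"$delim"*}   # text before the first delimiter
      printf '%s\n' "$summary"
      ;;
    *)
      printf '%s\n' "$body" | cut -c1-"$size"
      ;;
  esac
}

summary_of 'Intro paragraph.<!--break-->Rest of the page.' 600
summary_of 'No break tag in this text' 8
```

With a break tag present the size argument is never consulted, which is why the 600 in the call above makes no difference to our summaries.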

2. The page break button is not as precise as we would like. The break button is quite careful not to break into things like divs (for fear of leaving unclosed tags), but if you’re using it on an old site with lots of legacy code (like we often are), this means that often when you press the break button, the <!--break--> tag ends up at the bottom of the page! Views has an option to close unclosed tags, so we no longer need to worry about them in the plugin itself. So here’s a version of the CKEditor plugin that is less choosy about positioning: just replace the /plugins/drupalbreaks/plugin.js in the ckeditor module. Just make sure to enable that option in your views. Also try not to use it inside things like tables 🙂

/*
Copyright (c) 2003-2011, CKSource - Frederico Knabben. All rights reserved.
For licensing, see LICENSE.html or http://ckeditor.com/license
*/

/**
 * @file Plugin for inserting Drupal teaser and page breaks.
 */
( function() {
    CKEDITOR.plugins.add( 'drupalbreaks',
    {
        requires  : [ 'fakeobjects', 'htmldataprocessor' ],

        init : function( editor )
        {
            // Add the styles that renders our fake objects.
            editor.addCss(
                'img.cke_drupal_pagebreak,img.cke_drupal_break' +
                '{' +
                'background-image: url(' + CKEDITOR.getUrl( this.path + 'images/pagebreak.gif' ) + ');' +
                'background-position: center center;' +
                'background-repeat: no-repeat;' +
                'clear: both;' +
                'display: block;' +
                'float: none;' +
                'width: 100%;' +
                'border-top: #999999 1px dotted;' +
                'border-bottom: #999999 1px dotted;' +
                'height: 5px;' +
                '}' +
                'img.cke_drupal_break' +
                '{' +
                'border-top: #FF0000 1px dotted;' +
                'border-bottom: #FF0000 1px dotted;' +
                '}'
            );

            // Register the toolbar buttons.
            editor.ui.addButton( 'DrupalBreak',
            {
                label : Drupal.t('Insert Teaser Break'),
                icon : this.path + 'images/drupalbreak.png',
                command : 'drupalbreak'
            } );

            editor.addCommand( 'drupalbreak',
            {
                exec : function()
                {
                    // There should be only one <!--break--> in document. So, look
                    // for an image with class "cke_drupal_break" (the fake element).
                    var images = editor.document.getElementsByTag( 'img' );
                    for ( var i = 0, len = images.count() ; i < len ; i++ )
                    {
                        var img = images.getItem( i );
                        if ( img.hasClass( 'cke_drupal_break' ) )
                        {
                            if ( confirm( Drupal.t( 'The document already contains a teaser break. Do you want to proceed by removing it first?' ) ) )
                            {
                                img.remove();
                                break;
                            }
                            else
                                return;
                        }
                    }

                    insertComment( 'break' );
                }
            } );

            editor.ui.addButton( 'DrupalPageBreak',
            {
                label : Drupal.t( 'Insert Page Break' ),
                icon : this.path + 'images/drupalpagebreak.png',
                command : 'drupalpagebreak'
            } );

            editor.addCommand( 'drupalpagebreak',
            {
                exec : function()
                {
                    insertComment( 'pagebreak' );
                }
            } );

            // This function effectively inserts the comment into the editor.
            function insertComment( text )
            {
                // Create the fake element that will be inserted into the document.
                // The trick is declaring it as an <hr>, so it will behave like a
                // block element (and in effect it behaves much like an <hr>).
                if ( !CKEDITOR.dom.comment.prototype.getAttribute ) {
                    CKEDITOR.dom.comment.prototype.getAttribute = function() {
                        return '';
                    };
                    CKEDITOR.dom.comment.prototype.attributes = { align : '' };
                }
                var fakeElement = editor.createFakeElement( new CKEDITOR.dom.comment( text ), 'cke_drupal_' + text, 'hr' );

                // This is the trick part. We can't use editor.insertElement()
                // because we need to put the comment directly at <body> level.
                // We need to do range manipulation for that.

                // Get a DOM range from the current selection.
                var range = editor.getSelection().getRanges()[0],
                    elementsPath = new CKEDITOR.dom.elementPath( range.getCommonAncestor( true ) ),
                    element = ( elementsPath.block && elementsPath.block.getParent() ) || elementsPath.blockLimit,
                    hasMoved;

                // If we're not in <body> go moving the position to after the
                // elements until reaching it. This may happen when inside tables,
                // lists, blockquotes, etc.
                // (Disabled in this version, so the break stays where the cursor is.)
                /*while ( element && element.getName() != 'body' )
                {
                    range.moveToPosition( element, CKEDITOR.POSITION_AFTER_END );
                    hasMoved = 1;
                    element = element.getParent();
                }*/

                // Split the current block.
                if ( !hasMoved )
                    range.splitBlock( 'p' );

                // Insert the fake element into the document.
                range.insertNode( fakeElement );

                // Now, we move the selection to the best possible place following
                // our fake element.
                var next = fakeElement;
                //while ( ( next = next.getNext() ) && !range.moveToElementEditStart( next ) )
                //{}

                range.select();
            }
        },

        afterInit : function( editor )
        {
            // Adds the comment processing rules to the data filter, so comments
            // are replaced by fake elements.
            editor.dataProcessor.dataFilter.addRules(
            {
                comment : function( value )
                {
                    if ( !CKEDITOR.htmlParser.comment.prototype.getAttribute ) {
                        CKEDITOR.htmlParser.comment.prototype.getAttribute = function() {
                            return '';
                        };
                        CKEDITOR.htmlParser.comment.prototype.attributes = { align : '' };
                    }

                    if ( value == 'break' || value == 'pagebreak' )
                        return editor.createFakeParserElement( new CKEDITOR.htmlParser.comment( value ), 'cke_drupal_' + value, 'hr' );

                    return value;
                }
            } );
        }
    } );
} )();

Adventures in Drupal 7 Caching

(This post will be updated as we get a clearer idea of what is going on and how all the pieces fit together)

We have made quite a few sites in Drupal 7, the largest being http://www.srichinmoycentre.org. Drupal has evolved into a really powerful framework that can enable you to bring together all kinds of data in a variety of imaginative ways. However, you also need to know quite a bit about caching to get the site to load at speeds that people expect nowadays.

There are many different caching mechanisms that one can use with Drupal – this post aims to make clear how they can all fit together.

Out of the box caching mechanisms:

Page caching for anonymous users: This basically means that the entire page HTML is stored in the cache_page table in the database, and rendering the page requires just one trip rather than the multifarious queries normally required to build a page.

Block caching: If you enable this, then blocks will be cached for both anonymous and authenticated users. (Note that if you are not logged in and visit a page that has been cached for anonymous users, then the block content in that page will be cached no matter what – see this post). If you have a module that implements hook_node_grants, then block caching is disabled to prevent people possibly seeing links in cached blocks to pages they shouldn’t be seeing. However, this seems to be overridden by using the Drupal 7 block_cache_alter module (need to test this further). Note that block caches are flushed every time content is created or updated on the site – you can cache per page, role or user but there doesn’t seem to be much finesse in terms of minimum lifetimes etc.

Not out of the box

Views – Views can generate some pretty long queries, for example some of ours query 10 tables. So caching these is a must to have the site render quickly for authenticated users (as with block caching, the HTML output of these views is cached for anonymous users). For example, our feed displays have a caching lifetime of 1 hour, and many of our other content displays have a lifetime of 6 hours. Note that for block displays, Views offers options both for caching the view content and the surrounding block. We’re still evaluating in which cases this is useful and in which cases it could be counterproductive (setting and flushing caches can be quite expensive processes).

Caching customised content – It is very easy to cache customised content using the cache_get and cache_set functions (for a very good guide, see here). You can set minimum cached lifetimes. As with views, we are still evaluating if it is a good idea to enable block caching for customised blocks whose content has already been cached in this manner.

Block cache alter – Right now, we are using it to cache the content of custom blocks. It seems to work straight out of the box in D7, even though we have modules with node access restrictions enabled (namely Domain Access). Right now we are mainly using it with user-created custom blocks (ie those using the block module).

Boost – This module creates static HTML pages and serves them instead of making Drupal/PHP/MySQL do the work. As of writing, the D7 version is workable and we are using it on several production sites. It seems to use some of Drupal’s native settings, so we’re still trying to see what’s what there. It’s not yet as full featured as the Drupal 6 version. I think the delay is because there is a push to try and integrate the settings into some kind of centralised caching API (see the pluggable caching section below). We have had issues with it on a few occasions, for example caching white screens (see this issue). In one instance on another site, I had to disable it because the cache wasn’t being flushed effectively on node update. However, on both those occasions we were using different hosting setups to normal.

Pluggable caching

Drupal 7 has a pluggable caching backend. Basically what this means is that you can specify in your settings.php that instead of using Drupal’s database cache_* tables, the cache will instead be handled by another module. You can even configure different caching mechanisms to replace different cache tables.

Memcache/APC – Object based caching, which is faster than database caching by many multiples. You can configure different modules to handle different caches – for example one recommended variant is to use APC to handle caches which do not change often like ‘cache’ and ‘cache_bootstrap’, Memcache to handle caches which can get large or change often like ‘cache_field’ and ‘cache_menu’ and leave Drupal’s database to handle ‘cache_filter’. These do require Memcached/APC and their associated PHP PECL libraries to be installed on the server, but if you are on a managed VPS, your hosting people might do it for you.

Authcache – In its default state, this module will cache pages in the Drupal cache_page table for authenticated users just as the native caching does for anonymous users. However, this means pages will be served the same way for each role no matter what. Authcache offers a little bit of latitude to customise the user experience, eg a special page variable to print the current user name. Right now we aren’t using it, as the D7 version hasn’t yet developed a similar variable to print tabs.

You can use this module in combination with other cache backends eg memcache to speed up caching for authenticated users.




Wired Tree Hosting Review

In the past few years we have tried many hosting companies with a mixture of results. Unfortunately poor quality of hosting / support forced us to keep moving. Finding Wired Tree was a real boon as it helped stabilise our hosting environment and we are very happy to recommend their services.

Features of Wired Tree.

  • Technical Support is really first class. It is fast, responsive and has helped solve many problems.
  • The support staff are knowledgeable and polite. It really is great to have such a good quality backup. You also have a greater feeling of confidence.
  • Wired Tree are also happy to go the extra mile and help in instances where other hosting companies may shrug their shoulders.
  • After using several hosting companies, Wired Tree is the best by quite a long way.
  • Uptime is excellent. There have been a few short periods of scheduled downtime, but we were warned about these in advance. Overall, it is so much better than previous experiences.

We have two managed hybrid servers, which have substantial available memory and disk space for many Drupal, static and WordPress sites.

(This is not a sponsored post and is without affiliate links.)

We hope they continue to grow and at the same time maintain that considerate personal touch you sometimes feel big companies lack.

Changing Stored Passwords on Firefox (Mac) and Windows

Changing stored passwords in Firefox is fairly simple. For those with Windows PCs, see the caveat at the bottom.

Click on Firefox (top left) – then click on Preferences.


Click on the ‘Security’ tab and click on the ‘saved passwords’ button on the bottom right. A new window will pop up (see below) – to see passwords, press the ‘Show passwords’ button on the bottom right.

Note: if you have a lot of stored passwords, you can find the password for a particular site just by typing the URL of the site in the ‘Search’ field at the top.

Stored Passwords on Windows

1. Go to – Tools – Options


2. In security choose – Saved Passwords and then view passwords





Firefox: Changing Passwords on PC – Important Caveat

It’s true that there’s a simple procedure within Firefox for changing passwords or deleting them altogether. But there’s an important caveat: people who have had a Windows PC for many years may have installed numerous versions of Firefox. Some of these versions leave behind old passwords, which are a security vulnerability.

There’s a very good utility called SIW or System Information For Windows. It’s free and lightweight, and can be installed portably.


After installing it, run it and scroll down the left side. Click on the item labeled “Passwords” or “Secrets.” This might reveal a bunch of old Firefox logins and passwords left behind from previous installs.

If so, locate folders which look something like this:

Documents and Settings\Foo\Application Data\Mozilla\Profiles
Documents and Settings\Foo\Application Data\Mozilla\Firefox\Profiles

Of special concern are these files, buried a couple of levels deep in the Profiles folders:


Delete them and SIW should no longer find any Firefox passwords. Of course, don’t delete your current passwords!

Note: If SIW still finds old Firefox passwords, that probably means you didn’t delete the above files for each user.

Tip: Always use a master password in Firefox, or your site passwords will be easy to hack.