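Delete inactive Drupal users – SQL query

Delete users who have no role assigned, have created no content, and have not logged in since May 2014 (Unix timestamp 1400000000). Accounts that have never logged in (login = 0) also match. The table names assume the Drupal 6/7 schema. The second query removes the authmap rows left behind by the deleted users: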

DELETE u.* FROM users u LEFT JOIN users_roles r on u.uid = r.uid LEFT JOIN node n ON u.uid = n.uid WHERE r.rid IS NULL AND n.nid IS NULL AND u.login < 1400000000 AND u.uid > 1;
DELETE a.* FROM authmap a LEFT JOIN users u ON a.uid = u.uid WHERE u.uid IS NULL;
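
The hard-coded cutoff can also be computed rather than looked up. The following is a minimal shell sketch, not part of the original queries: the helper name cutoff_timestamp is made up for this example, and it approximates a year as 365 days.

```shell
#!/bin/sh
# cutoff_timestamp YEARS [NOW]
# Prints the Unix timestamp YEARS years before NOW (default: current time).
# Hypothetical helper; a year is approximated as 365 days.
cutoff_timestamp() {
  years=$1
  now=${2:-$(date +%s)}
  echo $(( now - years * 365 * 24 * 60 * 60 ))
}

# Example: count the matching accounts before deleting anything
# drush sql-query "SELECT COUNT(*) FROM users u LEFT JOIN users_roles r ON u.uid = r.uid LEFT JOIN node n ON u.uid = n.uid WHERE r.rid IS NULL AND n.nid IS NULL AND u.login < $(cutoff_timestamp 3) AND u.uid > 1;"
```

Running the SELECT COUNT(*) variant first shows how many accounts the DELETE would remove.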

Export all 404 errors stored in Drupal database log

A drush command to export a text file with all 404s on your site and how often they occur:

drush sql-query "SELECT count(*) as num_rows, message from watchdog WHERE type = 'page not found' group by message order by num_rows desc;" --result-file=404s.txt

WordPress: get all posts with no audio files attached

This query lists the title and GUID of every post that has no audio attachment:

SELECT p.post_title, p.guid
FROM wp_posts p
WHERE p.post_type = 'post' AND
NOT EXISTS (SELECT *
FROM wp_posts a
WHERE p.ID = a.post_parent
AND a.post_mime_type LIKE '%audio%');

Generating thumbnails on the fly with WordPress

One advantage of Drupal’s image resizing over WordPress’s: WordPress generates resized images at upload time only, whereas Drupal generates a missing thumbnail automatically on page load.

You could put this function into your WP template. Given an image URL $image_url, it checks whether the thumbnail at $thumb_url exists. If it doesn’t, it generates a thumbnail of size $xdim x $ydim:

function maybe_generate_userarticlethumb($thumb_url, $image_url, $xdim, $ydim) {

  /* Map both URLs to filesystem paths under the document root */
  $search  = array($_SERVER["HTTP_HOST"], 'http://', 'https://');
  $replace = array($_SERVER["DOCUMENT_ROOT"], '', '');
  $imagepath = str_replace($search, $replace, $image_url);
  $thumbpath = str_replace($search, $replace, $thumb_url);

  /* Checking the local file is quicker than fetching the URL with file_get_contents */
  if (!file_exists($thumbpath)) {
    $image = wp_get_image_editor($imagepath);
    if ( ! is_wp_error( $image ) ) {
       /* Use any of WP's image manipulation functions here */
       $image->resize( $xdim, $ydim, true );
       $image->save($thumbpath);
    }
  }
}

Then just call maybe_generate_userarticlethumb($thumb_url, $image_url, $xdim, $ydim) in your template, and print $thumb_url where you need to.

Target Safari and Chrome with specific CSS

Sometimes you need to apply CSS rules that only affect WebKit-based browsers such as Safari and Chrome. One example: non-standard fonts rendered using @font-face or an embed service such as Typekit or Google webfonts sometimes display thinner than in other browsers.

You can deal with this using a media query:

@media screen and (-webkit-min-device-pixel-ratio:0) {
   body { font-weight: 400; }
}

Thanks to phrappe.com for this useful tidbit!

Searching for large files on your server

Here’s a command that recursively checks everything under the root folder / for directories whose total size is 1GB or more and sorts the list, largest first. Add -a to du to include individual files as well.

cd / ; du -h 2>/dev/null | grep '^[0-9]\+\.\?[0-9]\?G' | sort -nrk 1

Thanks to Phil from WiredTree for sharing that info, I’m just putting it down here so I can remember it in future 🙂
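
If you want individual files rather than directory totals, find can do the filtering directly. This is a rough sketch of an alternative, not the command Phil shared; the function name list_large_files is made up for the example.

```shell
#!/bin/sh
# list_large_files DIR SIZE
# Prints regular files under DIR that are larger than SIZE, largest first.
# SIZE uses find's -size syntax: 8192c (bytes) is POSIX, while suffixes
# such as 500M or 1G need GNU or BSD find.
list_large_files() {
  dir=$1
  size=$2
  # -xdev keeps find on one filesystem, so other mounts are not descended into
  # du -k prints each file's disk usage in KiB, which sort -rn can order
  find "$dir" -xdev -type f -size +"$size" -exec du -k {} + 2>/dev/null |
    sort -rn
}

# Example: list_large_files / 1G
```

Because find only hands matching files to du, the sort runs over a short list instead of every directory on the disk.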

Redirect rules to change underscore to hyphen

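Unfortunately, Apache’s mod_rewrite only provides nine backreferences ($1 to $9), so a single rule can replace at most eight underscores; URLs with more underscores than that are not redirected by this code. The rules have no conditions, so they also apply to requests for existing files whose names contain underscores. We are using this mainly within WordPress sites, so here is the entire WordPress .htaccess file to show where to position the code.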

# BEGIN WordPress
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /

# BEGIN underscore to hyphen redirect code
RewriteRule ^([^_]*)_([^_]*)_([^_]*)_([^_]*)_([^_]*)_([^_]*)_([^_]*)_([^_]*)_([^_]*)$ /$1-$2-$3-$4-$5-$6-$7-$8-$9 [R=301,L]
RewriteRule ^([^_]*)_([^_]*)_([^_]*)_([^_]*)_([^_]*)_([^_]*)_([^_]*)_([^_]*)$ /$1-$2-$3-$4-$5-$6-$7-$8 [R=301,L]
RewriteRule ^([^_]*)_([^_]*)_([^_]*)_([^_]*)_([^_]*)_([^_]*)_([^_]*)$ /$1-$2-$3-$4-$5-$6-$7 [R=301,L]
RewriteRule ^([^_]*)_([^_]*)_([^_]*)_([^_]*)_([^_]*)_([^_]*)$ /$1-$2-$3-$4-$5-$6 [R=301,L]
RewriteRule ^([^_]*)_([^_]*)_([^_]*)_([^_]*)_([^_]*)$ /$1-$2-$3-$4-$5 [R=301,L]
RewriteRule ^([^_]*)_([^_]*)_([^_]*)_([^_]*)$ /$1-$2-$3-$4 [R=301,L]
RewriteRule ^([^_]*)_([^_]*)_([^_]*)$ /$1-$2-$3 [R=301,L]
RewriteRule ^([^_]*)_([^_]*)$ /$1-$2 [L,R=301]
# END underscore to hyphen redirect code

RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>

# END WordPress
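
To get a feel for which request paths the rules above would redirect, and where to, the matching logic can be mimicked in a few lines of shell. This is only an illustrative sketch and not something Apache runs; the function name hyphenate_path is made up for the example.

```shell
#!/bin/sh
# hyphenate_path PATH
# Mimics the rewrite rules: PATH is the request path without its leading
# slash, as mod_rewrite sees it in .htaccess. Prints the redirect target, or
# returns 1 when no rule would match (no underscores, or more than eight).
hyphenate_path() {
  path=$1
  # Count the underscores; tr -d ' ' strips the padding some wc versions add
  count=$(printf '%s' "$path" | tr -cd '_' | wc -c | tr -d ' ')
  if [ "$count" -ge 1 ] && [ "$count" -le 8 ]; then
    printf '/%s\n' "$(printf '%s' "$path" | tr '_' '-')"
  else
    return 1
  fi
}

# Example: hyphenate_path my_old_page    prints /my-old-page
```

A path with nine or more underscores makes the function return 1, which mirrors the fact that none of the eight rules matches it.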

Migrating static HTML sites into a WordPress multisite

1. Create a multisite account for the site you are about to import

We set up our multisite using subdomains – this is useful for us as some of our sites there really are subdomains of the main site. If you use subfolders, some of the steps below might be different – unfortunately we haven’t tested them for that case.

2. Import the site

We are using the HTML Import plugin. You can import blog posts and pages in separate import runs. For pages, it will keep the hierarchy.

3. Clean up superfluous HTML

Sometimes there is unneeded content that appears on every page. For example, our HTML sites were scraped from Plone sites, so there was lots of template cruft at the beginning and end of every post (plus the title was inside the CSS class that served as our ‘body’ class, so it ended up inside the main content too). So here’s a little PHP script you can modify for your own ends. Note that you will also need to download the htmLawed script and place it in the same folder as the script below. The script uses the legacy mysql_* functions, which were removed in PHP 7, so it needs PHP 5 or a port to mysqli or PDO.

 2,
'tidy' => 1,
'elements' => $elements,
'cdata' => 1,
'comment' => 1,
'deny_attribute' => 'align'
);

mysql_select_db($dbname);

$doc = new DOMDocument();

$query = mysql_query("SELECT ID, post_date, post_title, post_name, post_content, post_type FROM " . POSTTABLE . " WHERE post_type IN ('post', 'page') AND post_status = 'publish'");

while($post = mysql_fetch_object($query)) {

if(TESTINGSINGLE == false || $post->ID == TESTID) {
print "\n\n" . '**opening ' . $post->ID . ' - name: ' . $post->post_name . "\n";
}

$oldpost = $post->post_content;
$doc->loadHTML($oldpost);

$newtabledom = new DOMDocument;
$xpath = new DOMXPath($doc);

$newtabledom = $doc;
$pagepath = new DOMXPath($newtabledom);

// Remove content inside certain classes inside page
// The below elements are leftover elements from Plone
// See XPath documentation for more details of how to make queries
$toremove= $pagepath->query("//h1[@class='documentFirstHeading'] | //p[@class='documentDescription'] | //div[@class='documentDescription'] | //div[@class='documentByLine'] | //div[@class='documentActions'] | //div[@id='relatedItems'] | //div[@class='discussion'] | //a[@id='documentContent'] | //a[@class='link-parent']");

foreach ($toremove as $entry) {
$entry->parentNode->removeChild($entry);
}

// We try and change classes of images to use the WordPress floated classes
// The script detects classes (or parent div classes) that have the words
// left right or center and renames them
$imgtags = $doc->getElementsByTagName('img');
foreach($imgtags as $child) {
// getAttribute() returns an empty string when the attribute is missing
$linkclass = $child->getAttribute('class');
$alignclass = $child->getAttribute('align');
$linkfile = $child->getAttribute('src');

if(strpos($linkclass, 'left') !== false || strpos($alignclass, 'left') !== false) {
$child->setAttribute( 'class' , 'alignleft' );
}
else if (strpos($linkclass, 'right') !== false || strpos($alignclass, 'right') !== false) {
$child->setAttribute( 'class' , 'alignright' );

}
else if (strpos($linkclass, 'centre') !== false || strpos($linkclass, 'center') !== false
|| strpos($alignclass, 'centre') !== false || strpos($alignclass, 'center') !==false ) {
$imageinfo = @getimagesize(SITEROOT . $linkfile);
$child->setAttribute( 'class' , 'aligncenter' );
}
else {

// get parent and grandparent; getAttribute() is only available on element nodes
$parent = $child->parentNode;
$grandparent = $parent ? $parent->parentNode : null;

$parentclass = ($parent instanceof DOMElement) ? $parent->getAttribute('class') : '';
$parentalign = ($parent instanceof DOMElement) ? $parent->getAttribute('align') : '';
$grandparentclass = ($grandparent instanceof DOMElement) ? $grandparent->getAttribute('class') : '';
$grandparentalign = ($grandparent instanceof DOMElement) ? $grandparent->getAttribute('align') : '';

if (strpos($parentclass, 'left') !== false || strpos($parentalign, 'left') !== false) {
print "\n\n" . '** left parent ' . $linkfile . " - class: " . $parentclass . " - align: " . $parentalign. "\n";
$child->setAttribute( 'class' , 'alignleft' );
}
else if (strpos($parentclass, 'right') !== false || strpos($parentalign, 'right') !== false) {
print "\n\n" . '** right parent ' . $linkfile . " - class: " . $parentclass . " - align: " . $parentalign. "\n";
$child->setAttribute( 'class' , 'alignright' );
}
else if (strpos($parentclass, 'centre') !== false || strpos($parentclass, 'center') !== false
|| strpos($parentalign, 'centre') !== false || strpos($parentalign, 'center') !== false) {
print "\n\n" . '** centred parent ' . $linkfile . " - class: " . $parentclass . " - align: " . $parentalign. "\n";
$imageinfo = @getimagesize($linkfile);
$child->setAttribute( 'class' , 'aligncenter' );
}
else if (strpos($grandparentclass, 'left') !== false || strpos($grandparentalign, 'left') !== false) {
print "\n\n" . '** left grandparent ' . $linkfile . " - class: " . $grandparentclass . " - align: " . $grandparentalign. "\n";
$child->setAttribute( 'class' , 'alignleft' );
}
else if (strpos($grandparentclass, 'right') !== false || strpos($grandparentalign, 'right') !== false) {
print "\n\n" . '** right grandparent ' . $linkfile . " - class: " . $grandparentclass . " - align: " . $grandparentalign. "\n";
$child->setAttribute( 'class' , 'alignright' );
}
else if (strpos($grandparentclass, 'centre') !== false || strpos($grandparentclass, 'center') !== false
|| strpos($grandparentalign, 'centre') !== false || strpos($grandparentalign, 'center') !== false) {
print "\n\n" . '** centred grandparent ' . $linkfile . " - class: " . $grandparentclass . " - align: " . $grandparentalign. "\n";
$imageinfo = @getimagesize($linkfile);
$child->setAttribute( 'class' , 'aligncenter' );
}
}
}

// Replace underscores with dashes inside relative links
// we are excluding ../ links for now - too complicated
$atags = $doc->getElementsByTagName('a');
foreach($atags as $child) {

// getAttribute() returns an empty string for anchors without an href
$linkhref = $child->getAttribute('href');

if (!(substr($linkhref, 0, 4) == 'http' || substr($linkhref, 0, 1) == '/' || substr($linkhref, 0, 3) == '../')) {

// Look up the parent post's name (empty if the post has no parent)
$parent = mysql_fetch_object(mysql_query("SELECT post_name FROM " . POSTTABLE . " WHERE ID = (SELECT post_parent FROM " . POSTTABLE . " WHERE ID = " . (int) $post->ID . ")"));
$parentname = $parent ? $parent->post_name : '';

// print "\n\n" . 'parent name: ' . $parentname . "\n";
if(strpos($linkhref, '_') !== FALSE && (strpos($post->post_name, '-') !== FALSE || strpos($parentname, '-') !== FALSE) ) {
print "\n\n" . 'link: ' . $linkhref . "\n";

$linkhref = str_replace('_', '-', $linkhref);
$linkhref = preg_replace('/--+/', '-', $linkhref);

print "\n\n" . 'changed internal link: ' . $linkhref . "\n";

$child->setAttribute( 'href' , $linkhref);
}
}

}

// Output HTML from query documents
$newtablehtml = $newtabledom->saveHTML();

// Text rewriting
// This can be modified to your needs
$newtablehtml = str_replace('[...]', '', $newtablehtml);
$newtablehtml = str_replace('/index.html"', '"', $newtablehtml);
$newtablehtml = str_replace('/"', '"', $newtablehtml);
$newtablehtml = str_replace('https://my.', 'http://www.', $newtablehtml);

// Sometimes there are encoding issues which need dealing with
$newtablehtml = str_replace('&Acirc;&nbsp;', '', $newtablehtml);
$newtablehtml = str_replace('&Acirc;', '', $newtablehtml);
$newtablehtml = str_replace('&#132;', '', $newtablehtml);
$newtablehtml = str_replace('&acirc;&#8364;&#8482;', "'", $newtablehtml);
$newtablehtml = str_replace("&#145;", "'", $newtablehtml);
$newtablehtml = str_replace("&#146;", "'", $newtablehtml);
$newtablehtml = str_replace("&#147;", "'", $newtablehtml);
$newtablehtml = str_replace("&#148;", "'", $newtablehtml);
$newtablehtml = str_replace("&#150;", " - ", $newtablehtml);
$newtablehtml = str_replace("&#151;", " - ", $newtablehtml);
$newtablehtml = str_replace("&acirc;&#128;&#153;", "'", $newtablehtml);
$newtablehtml = str_replace("&acirc;&#128;&#156;", "", $newtablehtml);
$newtablehtml = str_replace("&acirc;&#128;&#157;", "", $newtablehtml);

// Sometimes the old page still contains html doctype
// inside the content tag
// (the search string is missing from this listing, so a generic
// doctype pattern is assumed here)
$newtablehtml = preg_replace('/<!DOCTYPE[^>]*>/i', '', $newtablehtml);

// Remove empty paragraphs
$newtablehtml = preg_replace("#<p[^>]*>(\s|&nbsp;?)*</p>#", '', $newtablehtml);

// Now run htmLawed to clean up
$newtablehtml = htmLawed($newtablehtml, $config);

// Normalise post titles in all caps
if (strtoupper($post->post_title) == $post->post_title) {
  $post->post_title = ucwords(strtolower($post->post_title));
}

if(strlen($newtablehtml) > 30) {
  // Post name exists - save new post content only
  if(strlen(trim($post->post_name)) > 0) {
    if(TESTINGSINGLE == false || $post->ID == TESTID) {
      $query2 = "UPDATE " . POSTTABLE . " SET post_content = '" . mysql_real_escape_string($newtablehtml) . "', post_title = '" . mysql_real_escape_string($post->post_title) . "' WHERE ID = ". $post->ID;
      mysql_query($query2);
    }
  }
  // Need to generate post name from title
  else {
    $postname = strtolower(sanitize_file_name($post->post_title));
    if(TESTINGSINGLE == false || $post->ID == TESTID) {
      $query2 = "UPDATE " . POSTTABLE . " SET post_content = '" . mysql_real_escape_string($newtablehtml) . "', post_name ='" . mysql_real_escape_string($postname) . "',post_title = '" . mysql_real_escape_string($post->post_title) . "' WHERE ID = ". $post->ID;
      mysql_query($query2);
    }
  }
}
else {
  // Delete posts with v little or no content, unless they have child pages
  $query3 = mysql_query("SELECT ID FROM " . POSTTABLE . " WHERE post_type IN ('post', 'page') AND post_status = 'publish' AND post_parent = " . $post->ID);
  if(!mysql_fetch_object($query3)) {
    if(TESTINGSINGLE == false || $post->ID == TESTID) {
      mysql_query("DELETE FROM " . POSTTABLE . " WHERE ID = ". $post->ID);
    }
  }
}
}

// Taken from the WP function
function sanitize_file_name( $filename ) {
  $special_chars = array("?", "[", "]", "/", "\\", "=", "<", ">", ":", ";", ",", "'", "\"", "&", "$", "#", "*", "(", ")", "|", "~", "`", "!", "{", "}", "–", "—","—", chr(0));
  $filename = str_replace($special_chars, '', $filename);
  $filename = preg_replace('/[\s-]+/', '-', $filename);
  $filename = trim($filename, '.-_');
  $entities = array("%e2", "%80", "%9c", "%9d", "%94", "%a0", "%93", "%99");
  $filename = str_replace($entities, '', $filename);

  // Append a counter until no other post or page uses the name
  $base = $filename;
  $unique = 0;
  $i = 0;
  while (!$unique) {
    $query = mysql_query("SELECT post_name FROM " . POSTTABLE . " WHERE post_type IN ('post', 'page') AND post_name = '" . mysql_real_escape_string($filename) . "'");
    if(mysql_fetch_object($query)) {
      print('***not unique - ' . $filename);
      $filename = $base . '-' . $i;
      $i = $i + 1;
    }
    else {
      $unique = 1;
    }
  }
  return $filename;
}
?>

4. Create redirections from one site to another

The import plugin also generates a very nice .htaccess file we can use as the basis for our redirects. Unfortunately we can’t use this directly in the .htaccess file for WordPress multisites, as the same .htaccess file is used across all sites. Fortunately we can use the Redirection plugin, which lets WordPress handle the redirections instead of Apache.

For static sites, we often need to cater for the case where the URL ends in / as well as /index.html. So we need to rewrite our redirects a little – here’s a little shell script you can run. Copy the generated .htaccess file to your desktop, rename it to htaccess, and run

# specify your original domain here - i.e. the one that occurs first in the .htaccess rule
DOMAIN=http://www.domain.com
# Replace tabs with spaces - much easier to deal with
expand -t1 htaccess > htaccess1
# this replaces your domain with ^/ - makes it much easier to target the rest of the URL
# (double quotes so that $DOMAIN is expanded, | as delimiter in case the domain contains a hyphen)
sed "s|$DOMAIN/|^/|g" htaccess1 > htaccess2
# Use RedirectMatch
sed 's/Redirect/RedirectMatch 301/g' htaccess2 > htaccess3
# We won't keep the mod_rewrite flags as RedirectMatch doesn't handle them
# (the brackets are escaped so that sed treats them literally)
sed 's/\[R=301,NC,L\]//g' htaccess3 > htaccess4
# This tacks on a regex that will handle / and index.html at the end of a URL
# note: this is for the case where your static URLs in .htaccess don't end in
# either / or /index.html
sed 's- http://-(?:/index.html|/)?$ http://-g' htaccess4 > htaccess_new

# comment the above line and uncomment one of these if your static URLs
# end in / (1st one) or /index.html (2nd one)
# sed 's-/ http://-(?:/index.html|/)?$ http://-g' htaccess4 > htaccess_new
# sed 's-/index.html-(?:/index.html|/)?$-g' htaccess4 > htaccess_new

Test it out on a single rule first and check that the redirect works before importing the rest.
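
One way to try the conversion on a single line is to chain the same steps into a pipeline that reads from standard input. This is a sketch under a few assumptions: the function name convert_redirects and both domains are placeholders, and the sample line assumes the tab-separated "Redirect old new [flags]" layout the steps above are written for.

```shell
#!/bin/sh
# Placeholder for the original domain, i.e. the first URL in each rule
DOMAIN=http://www.domain.com

# convert_redirects
# Reads generated Redirect lines on stdin and prints RedirectMatch rules
# that also match a trailing / or /index.html.
convert_redirects() {
  expand -t1 |
    sed "s|$DOMAIN/|^/|g" |
    sed 's/Redirect/RedirectMatch 301/g' |
    sed 's/ *\[R=301,NC,L\]//g' |
    sed 's- http://-(?:/index.html|/)?$ http://-g'
}

# Try it on one made-up line
printf 'Redirect\thttp://www.domain.com/about_us\thttp://new.example.org/about-us\t[R=301,NC,L]\n' |
  convert_redirects
```

For the sample line this prints RedirectMatch 301 ^/about_us(?:/index.html|/)?$ http://new.example.org/about-us, which can be pasted into the redirect tool on its own before converting the whole file.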