Chris Nizzardini, Salt Lake City Utah, Web Developer Specializing in LAMP+Ajax Since 2006

My Blog

Here is my awesome blog. You can find information on programming, linux, documentation, tips for code and database optimization, my thoughts and rants, and whatever else I feel like sharing. Feel free to contribute to the blog by posting comments and asking questions.

Archive for 2010

Creating a Spider with Python Mechanize to Monitor Your Website

Posted by chris on September 15th, 2010 Comments (2)

Program a Python Web Spider Bot with MechanizeEvery since the SeoMoz conference in Seattle I’ve been wanting to write a spider to do some of the interesting things the speakers mentioned. Well I haven’t gotten an opportunity to write any SEO apps yet, but I did get a chance to take my first crack at Python using the Mechanize spider. Mechanize is fast and light-weight parsing large HTML documents in fractions of a second. There were PHP options I could have used. One was phpdig. I could have also written a spider from scratch. But after 4 years of developing in PHP I wanted to try something new. Enter python + mechanize for monitoring websites.

The opportunity came after some unfortunate circumstances lead to downtime at work related to a data center migration. Needing a way to go beyond simple service monitoring for things like HTTP/Apache and MySQL I built upon mechanize to test for specific HTML tags within the document body.

Read the rest of this entry »

In Programming (, , , )

Setting up NTP on CentOS

Posted by chris on September 9th, 2010 Comments(0)

Wow has it been this long since I did a Linux post? Here’s a quickie. I needed to setup NTP on a newly migrated server to resolve some wacky time card issues. First install ntp (should already be installed), turn the service on with chkconfig, create a backup of your timezone file, set your timezone to your local area, perform an ntp update, ensure the service is started, and last using chkconfig verify that it has the proper run levels (2-5) so it starts up on a server reboot.

yum install ntp
chkconfig ntpd on
mv /etc/localtime /etc/localtime.bak
ln -s /usr/share/zoneinfo/America/Denver /etc/localtime
ntpdate pool.ntp.org
/etc/init.d/ntpd start
chkconfig --list

You may also want to sync your hardware clock in your BIOS to your system clock.

See:
http://www.cyberciti.biz/faq/howto-install-ntp-to-synchronize-server-clock/

http://www.wikihow.com/Change-the-Timezone-in-Linux

http://www.forum.psoft.net/showthread.php?t=13307

In Linux, Software (, )

Random Musings on MySQL Order By Optimizations

Posted by chris on September 9th, 2010 Comments (3)

Was attempting to optimize a query for some additional functionality we are going to be releasing soon and sadly to say I failed. I was hoping to get the query execution time down to 0.01 seconds, but the best I could muster was a measly 0.03 seconds.

The goal of the query is to pull all unanswered questions (note: the design of the database predates me by over 5 years), sort them by a specific product (stored as a varchar), and to next include the products sibling products (all hair dryers for instance). The fact that the products model must be a varchar and is not numeric already increases execution time and the answered column is a varchar as well when it should really by a boolean or tinyint. Those things are out of my control though.

The first thought that comes to my head is using an IF statement to return a 1 or a 0 based on matching the product_id value and then ordering by the ‘relevance’ column. This executes in 0.0323 sec.

/* Showing rows 0 - 29 (43 total, Query took 0.0323 sec) */
SELECT
	discussion.*, if(discussion.product_id = 'y',1,0) as relevance 
FROM
	discussion 
WHERE
    (discussion.category_id = '1' OR discussion.product_id = 'y') 
    AND
    discussion.type = 1 
	AND
	discussion.status = 20 
    AND 
    discussion.answered = 'no' 
ORDER BY
	relevance DESC

Next I try some strange conditional ORDER BY with a SWITCH statement mixed in. Kinda ugly and kinda slower at 0.0568 sec.

/* Showing rows 0 - 29 (43 total, Query took 0.0568 sec) */
SELECT
	discussion.* 
FROM
	discussion 
WHERE
    (discussion.category_id = '1' OR discussion.product_id = 'y') 
    AND
    discussion.type = 1 
	AND
	discussion.status = 20 
    AND 
    discussion.answered = 'no' 
ORDER BY CASE WHEN
	discussion.product_id = 'y' THEN 1 ELSE 0 END DESC

Worst yet is the UNION executing in 0.0666 sec.

/* Showing rows 0 - 29 (43 total, Query took 0.0666 sec) */
(SELECT
	discussion.* 
FROM
	discussion 
WHERE
    discussion.product_id = 'y'  
    AND
    discussion.type = 1 
	AND
	discussion.status = 20 
    AND 
    discussion.answered = 'no' 
ORDER BY
	relevance DESC)
UNION
(SELECT
	discussion.* 
FROM
	discussion 
WHERE
    (discussion.category_id = '1' AND discussion.product_id != 'y') 
    AND
    discussion.type = 1 
	AND
	discussion.status = 20 
    AND 
    discussion.answered = 'no' 
ORDER BY
	relevance DESC)

Is there a better way to do what I’m attempting?

In SQL (, )

Debugging Your WebSite with Google Analytics Custom Reports

Posted by chris on August 28th, 2010 Comments (2)

I’m in Seattle for the SeoMoz Pro Training and figured I should do a blog post meshing SEO with debugging and development. I’m pretty excited about the SeoMoz Pro Training. My SEO skills are no where near my programming skills so this is an awesome opportunity to improve those skills, meet some of the leaders in the SEO field, and see Seattle for my first time.

Problem: Testing your site in multiple browser versions

Lets face it. Browser testing is a bitch. There are 3 versions of IE to test, Safari on Windows and Mac, Firefox, and now Chrome. It really sucks. I absolutely hate it. Recently, while doing a routine testing of a clients site I found an error causing a critical page to not render correctly in Internet Explorer 6. Needless to say this was negatively impacting the clients conversion rates.

Digging In with Analytics

Using Google Analytics I discovered that on this clients site, Internet Explorer (all versions) makes up about half of the total visits. Within IE about 10% of those visitors are still using version 6. So about 5% (about 800 total visits per month) of the clients total users are on Internet Explorer 6. When you consider this clients conversion rate is around 25% that comes up to a potential loss of 200 conversions per month! Digging in deeper I found that the IE6 conversion rate was only at about 7%. Using a quick CSS hack I was able to fix the problem (I used a B.S. IE CSS hack setting zoom:1 on a problematic CSS class).

Creating a Custom Report to Find Browser Errors

Go into custom reports and create a new custom report. Call this something like Browser Conversion Report and add the following metrics: Total Goal Starts, Total Goal Completions, and Goal Conversion Rate (these are found under the Goals metric). Now you will need to add some Dimensions so add Browser and then as a drill-down add Browser Version (both found under System).

Google Analytics Custom Reports

Preview the report now.

Google Analytics

Now drill down by clicking on a Browser type.

Google Analytics Browser Report

So did the the bug fix work? Absolutely, and now we have a report to back it up. The clients IE6 conversion rate is sitting at around 25% since fixing the CSS issue. Now we’ve created a way to easily monitor our conversion rates among browsers. You can get even more creative with this by adding in Operating System types, connection speeds, or whatever your geeky heart desires.

I’d love to blog all day, but I think its time to go see what Seattle has to offer.

In Seo, Software (, )

Sages Cafe Is Underwhelming

Posted by chris on August 24th, 2010 Comments(0)

I tried a vegan eat out popular to Salt Lake City the other night. I’d heard good things about this place from vegan and non-vegans alike. While I’m not completely upset with the meal I received it was very underwhelming. First impressions were very good, even though Sages Cafe sits in the heart of downtown Salt Lake it seemed a bit “off the beaten path” to me. I enjoyed the older style building and the vegetation outside gave the building a cool feeling on a warm day. I found the decor to be quaint and loved the exposed brick.

Our service was adequate as we were seated within 5 minutes despite the place being moderately busy. The establishment was clean and well kept and we had a nice table next to a window overlooking the street. We started off with the Bruschetta for an appetizer and some locally brewed beers. The Bruschetta was great consisting of vegan cheese sitting atop bread sprinkled with almond flakes and seasonings. For beers we ordered a Squatters India Pale Ale and a Pinkus Hefe Weizen. We took a moment to decide on our main course and this was not because there were too many options to choose. In fact the menu was too small. I suppose when you cut out two food groups your menu tends to get a bit smaller. What was there seemed to be very healthy and they had a reasonable selections of beer and wine.

We settled on the Picadillo Vegetariano and the Nut Burger (I knew this was a bad idea but a client had told my girlfriend it was fantastic). The Nut Burger was nothing more than a conglomeration of nut paste resting between some wheat bread. Might as well have gotten a PB&J. It was dry, dull, chalky, and tasteless. The Picadillo Vegetariano was a stir fry consisting of a fresh sauté of tomato, bell pepper, onion, carrot, tofu, black beans, brown rice and Caribbean spices topped with California black olives, served with cilantro-lime vinaigrette tossed greens. The tofu was billed as “vegan chicken” but it was obviously just standard tofu that anyone could make. The stir fry lacked the flavoring you would expect from a restaurant.

It’s difficult for me to gauge this from the Vegan perspective as I’m not Vegan nor Vegetarian and have never been to this type of restaurant. The service is decent and the decor is good, but for $45.00 I felt we would have left filled up. Worst of all the food was simply not filling, when we left our plates were as empty as our stomachs.

In Beer and Stuff (, )

MySQL INNER JOIN on Multiple Parameters and Condtions

Posted by chris on August 19th, 2010 Comments(0)

Wow. This blog has gotten a bit lopsided as of late on the MySQL side of stuff. A symptom of all the MySQL wizardry I’ve been doing and learning at my new job. I found out today that you can add multiple parameters/conditions to an INNER JOIN statement and I feel kind of dumb for not knowing this before. Using multiple parameters on your inner join you can remove the need for nasty sub queries!

Take this query for instance:

SELECT DISTINCT
    wp_posts.*, wp_users.user_nicename, 
	((SELECT meta_value FROM wp_postmeta meta WHERE meta.meta_key = 'views' AND meta.post_id = wp_posts.ID)*1) as views
FROM
	article_to_categories
INNER JOIN
    wp_posts ON wp_posts.ID = article_to_categories.wp_post_id
INNER JOIN
    wp_users ON wp_users.ID = wp_posts.post_author
WHERE
    article_to_categories.categories_id = $id
ORDER BY
    views DESC

On a production server this query is sadly executing in 0.1 seconds. This is due almost entirely to that nasty little sub select in there. In my opinion, sub selects should be avoided at all cost. They are a symptom of at best an unoptimized query and at a worst a poorly designed database. Sometimes I’m a guilty party. Our goal is to get rid of the sub query:

SELECT meta_value FROM wp_postmeta meta WHERE meta.meta_key = 'views' AND meta.post_id = wp_posts.ID

We can do this by specifying multiple parameters and conditions on our join statement.

SELECT DISTINCT
    wp_posts.*, wp_users.user_nicename, (wp_postmeta.meta_value*1) as views 
FROM
    article_to_categories
INNER JOIN
	wp_posts ON wp_posts.ID = article_to_categories.wp_post_id 
INNER JOIN
	wp_users ON wp_users.ID = wp_posts.post_author 
INNER JOIN 
	wp_postmeta ON (wp_postmeta.post_id = wp_posts.ID AND wp_postmeta.meta_key = 'views') 
WHERE
    article_to_categories.categories_id = $id
ORDER BY
	views DESC

Notice that when joining on wp_postmeta I specify that I want all records matching wp_posts.ID and that match the meta_key views. By moving away from the sub select I was able to trim the query execution time to 0.03 seconds. Thats a massive improvement especially for a query that is run in bulk!

INNER JOIN 
	wp_postmeta ON (wp_postmeta.post_id = wp_posts.ID AND wp_postmeta.meta_key = 'views')

I’ve got a feeling I’ll need to go back through some of my older code and introduce some needed query optimizations. How about you?

In SQL (, , )

What is PDO Error HY093: Debug/Fix MySql PHP PDO Error HY093

Posted by chris on August 11th, 2010 Comments (2)

I’ve been working with PDO at my new job and encountered this HY093 error. Infact that was the only information I received back from my PDO object. Usually there is a lot more information. My guess is that generally PDO is passing back the MySQL error directly from the MySQL server. However, HY093 is strictly an error issued by PDO.

In this case I received the HY093 error because I had improperly binded a data type to a variable.

I was using the form:

$db->prepare("SELECT * FROM tbl WHERE name = :name");
$pdo->bindValue(':name',$phpVar,PARAM_STR);
$pdo->execute();

Notice on line 2 I used “PARAM_STR” this should be PDO::PARAM_STR likewise had this been an integer it would have been PDO:PARAM_INT. Hopefully that helps some people.

Drop me a comment if this helped your or if you feel I missed something, thanks for reading.

In Programming, SQL (, , , )

How to Create a WordPress Plugin

Posted by chris on July 30th, 2010 Comment(1)

I recently had the opportunity to create a wordpress plugin for my company’s website. This is a simple plugin that associates blog posts to product pages on our shopping cart based on word press tags. The plugin simply displays links to the products page (based on the tags) on the wordpress post page for a given blog post.

First create a file in the wordpress plugins folder and add the following comments:

/*
Plugin Name: Article to Categories 
Plugin URI: http://www.foo.com/
Description: ...
Version: 1.0
Author: ...
Author URI: http://www.cnizz.com
License: ...
*/

WordPress will read in this information on the plugin page. Next we are going to hook into the core wordpress operations. Prior to this project I didn’t think much of wordpress. I viewed it as a great blogging platform with really horrid looking code, but the WP developers have a done a great job creating a really robust API. This hook will tell wordpress to call the show_category_pages_meta_box() function when it loads the admin menu. Now we need to define that function.

add_action('admin_menu','show_category_pages_meta_box');

This function defines a new meta box with an element ID of myplugin_sectionid. The title is Article to Categories and it will be displayed on post pages only. When loaded it will call show_category_pages() which will actually echo out the text to appear in the new meta box.

function show_category_pages_meta_box(){
	add_meta_box( 'myplugin_sectionid', 'Article to Categories', 'show_category_pages', 'post', 'side', 'high');
}

Finally we need to define show_category_pages() and populate our new meta box.

function show_category_pages(){
echo 'Some awesome product pages!';
}

Creating this wordpress plugin was pretty fun and I hope I get a change to do it again in the future.

Resources:
WordPress: Writing a Plugin
Plugin API
Action Reference

In Programming, Software (, , , )

Use MySQL HAVING to Get Past Error 1111 Invalid Use of Group Function

Posted by chris on June 22nd, 2010 Comments(0)

I was working on some new reporting features for a client this evening when I encountered MySQL Error 1111 Invalid Use of Group Function. I was a bit befuddled by this as I was simply using a SUM total in my WHERE clause. Of course after a bit of googling I found that I cannot use an aggregate function such as SUM or COUNT as a WHERE clause.

Here is my original query.

SELECT SQL_CALC_FOUND_ROWS 
  city.name AS city, state.abbrev AS state_abbrev, zip.zipcode, zip.city_id, COUNT(zip.zipcode) AS zip_count, SUM(assigned.is_primary) AS primary_count, SUM(assigned.is_backup) AS backup_count 
FROM 
  tbl_zipcode zip 
INNER JOIN 
  tbl_city city ON city.city_id = zip.city_id 
INNER JOIN 
  tbl_state state ON state.state_id = city.state_id 
LEFT JOIN 
  tbl_affiliate_assigned_location assigned ON assigned.zipcode = zip.zipcode 
WHERE 
  zip_count > primary_count 
GROUP BY 
  city.city_id 
ORDER BY 
  COUNT(zip.zipcode) DESC, city.name ASC 
LIMIT 0,50

I fixed this using a HAVING clause on the GROUP BY.

SELECT SQL_CALC_FOUND_ROWS 
  city.name AS city, state.abbrev AS state_abbrev, zip.zipcode, zip.city_id, COUNT(zip.zipcode) AS zip_count, SUM(assigned.is_primary) AS primary_count, 
  SUM(assigned.is_backup) AS backup_count 
FROM 
  tbl_zipcode zip 
INNER JOIN 
  tbl_city city ON city.city_id = zip.city_id 
INNER JOIN 
  tbl_state state ON state.state_id = city.state_id 
LEFT JOIN 
  tbl_affiliate_assigned_location assigned ON assigned.zipcode = zip.zipcode 
GROUP BY 
  city.city_id HAVING zip_count > primary_count 
ORDER BY 
  COUNT(zip.zipcode) DESC, city.name ASC 
LIMIT 0,50

The HAVING clause seems to be one of those things I use so rarely that I forget about it just before I need it again. Hopefully since I wrote on a blog on it this time it sticks around in my memory bank a bit longer. Hope this helped someone. I also found out that you can have multple HAVING clauses using the following syntax.

HAVING zip_count >  primary_count OR primary_count IS NULL
In SQL (, , , )

MySQL InnoDb inserts are slow, really slow!

Posted by chris on June 12th, 2010 Comments (6)

So I was doing some MySQL database optimization tests this morning when I got my database engines mixed up. Thats when I stumbled on something horrible….InnoDb is really slow when it comes to inserts! Like really slow.

I performed 10,000 inserts on MyIsam and it executed in about 1/10th of a second. The same inserts on an InnoDb table took over 21 seconds! I was seriously dumbfounded by this. I knew MyIsam would be faster, but I figured it would be only nominally faster.

Out of curiosity I did a single insert with MyIsam which executed in 0.000762939453125 seconds whereas the same insert in InnoDb executed in 0.032792091369629 seconds. I must say this is a bummer for anyone using InnoDb in a high-insert environment. I’ll have to see what the differences are on updates, selects, deletes, and joins.

In SQL (, , )