Chris Nizzardini, Salt Lake City Utah, Web Developer Specializing in LAMP+Ajax Since 2006

My Blog

Here is my awesome blog. You can find information on programming, linux, documentation, tips for code and database optimization, my thoughts and rants, and whatever else I feel like sharing. Feel free to contribute to the blog by posting comments and asking questions.

Posts Tagged ‘mysql’

MySQL Optimization Tip: Checking for Differences in Queries.

Posted by chris on November 2nd, 2011 Comments(0)

I’ve been going through the MySQL slow query log at my new job to optimize our applications performance. It’s a bit more scary making changes when you’re the “new guy” because there are a lot of “gotchas” you haven’t quite figured out in the new platform. Well one of the queries that I wanted to optimize lead me to get rid of some needless LEFT JOINS and trim down the amount of fields returned in the result set. Aside from testing the application as a whole after making my modification I needed a fast way to verify the exact same data was being returned for all 50,000 rows. One option was to append each field to a string and then convert that into an MD5 hash for each variation of the query. That seemed half assed too me so I decided to export the results to a CSV file then run a diff against the two from the linux shell. Read the rest of this entry »

In SQL (, , )

Unscientific Benchmarking of Type Casting, is_numeric, and regex in PHP

Posted by chris on October 5th, 2011 Comments(0)

I performed some unscientific PHP benchmarks today pitting casting to an integer against is_numeric against preg_replace. The point was to see which is the fastest way of quickly cleaning user input before passing to a SQL query to prevent against SQL injection. Obviously this would only work on database fields that are integers. To be fair I should’ve benchmarked mysql_real_escape_string in the same way, but I’m sure its a loser in this test. The test is run in a for loop 100,000 times. None of the code being used here is very expensive to begin with — but here we go:

Casting to (INT)

0.017745018005371

Using is_numeric()

0.028823852539062

Using preg_replace()

0.087189197540283

        $start = microtime(true);
        /*for($i=0;$i<100000;$i++){
                $v = (INT) "$i";
        }*/
        /*for($i=0;$i<100000;$i++){
                is_numeric("$i");
        }*/
        for($i=0;$i<100000;$i++){
                preg_replace('/\D/','',"$i");
        }
        $end = microtime(true);
 
        echo "\nTime: ".($end-$start)."\n";
In Programming (, , , , , , , , )

MySQL Slow Query Log – Analyzing, Parsing, and Optimizing Slow Queries

Posted by chris on March 31st, 2011 Comments (3)

This is a follow up post to my blog on enabling and disabling the MySQL slow query log without restarting MySQL.  In this post I’ll explain some methods for analyzing and parsing information out of the MySQL slow query log and then optimizing the slow queries. The slow query log is an absolute mess.  Go ahead just open it and you will see what I mean…

Read the rest of this entry »

In SQL (, , , )

MySQL Temporary Tables Example – Optimizing PHP Applications

Posted by chris on November 24th, 2010 Comments (7)

Shortly after starting a new job as Web Application Developer with an e-commerce company I was tasked with rewriting a legacy application. After analyzing and flow charting the current application I found numerous performance penalties, bloated code, linear programming (non-OOP), and many other areas for optimization. Even after refactoring the code and removing these performance barriers the application was a bit sluggish. Though I had improved overall application execution time by 90% I still knew there was more I could do. This is where temporary tables came into play.

MySQL Temporary Tables have the same functionality as standard disk-based tables except they exist in memory. Since memory is not long term storage, they are temporary tables, hence the name. Operating in memory makes working with these tables fast, your only limit is the amount of memory/swap space available to you.

Read the rest of this entry »

In Programming, SQL (, , , , , )

MySQL Trigger – How To, Example, Tutorial, and Syntax

Posted by chris on November 12th, 2010 Comments (3)

A MySQL trigger defines a course of action for MySQL to take when data is changed within a table. These changes can be inserts, updates, or deletes made to a table. Support for triggers was added to MySQL 5.0.2. An example of a trigger would be updating inventory each time a product is purchased.

In this example I’ll use a MySQL trigger to compute a numerical hash of a varchar field everytime the varchar is updated or when a new row is inserted. Our example table will have a simple auto-incrementing primary key, the products_model column as a varchar (because models can contain dashes and even letters in this case), and model_hash which will be a numerical representation of the products model using a hashing function.

Read the rest of this entry »

In SQL (, )

Enable or Disable MySQL 5.0 Slow Query Log Dynamically Without a Server Restart

Posted by chris on October 28th, 2010 Comments (2)

Okay so you can’t really do this dynamically without a mysqld server restart, but there is a little trick that can do this. The log-slow-queries option is either set in /etc/my.cnf or it isn’t and changing that does require a server restart. Of course this absolutely sucks on production systems. You either have to do it during business hours or log on late at night to do it. If you’re like me, both those options sound pretty bad.

There is a way to effectively disable the MySQL slow query log dynamically without restarting the MySQL daemon. Go ahead and enable your slow query log and set whatever long_query_time time you desire. Our setting currently looks like this:

Read the rest of this entry »

In SQL (, )

Random Musings on MySQL Order By Optimizations

Posted by chris on September 9th, 2010 Comments (3)

Was attempting to optimize a query for some additional functionality we are going to be releasing soon and sadly to say I failed. I was hoping to get the query execution time down to 0.01 seconds, but the best I could muster was a measly 0.03 seconds.

The goal of the query is to pull all unanswered questions (note: the design of the database predates me by over 5 years), sort them by a specific product (stored as a varchar), and to next include the products sibling products (all hair dryers for instance). The fact that the products model must be a varchar and is not numeric already increases execution time and the answered column is a varchar as well when it should really by a boolean or tinyint. Those things are out of my control though.

The first thought that comes to my head is using an IF statement to return a 1 or a 0 based on matching the product_id value and then ordering by the ‘relevance’ column. This executes in 0.0323 sec.

/* Showing rows 0 - 29 (43 total, Query took 0.0323 sec) */
SELECT
	discussion.*, if(discussion.product_id = 'y',1,0) as relevance 
FROM
	discussion 
WHERE
    (discussion.category_id = '1' OR discussion.product_id = 'y') 
    AND
    discussion.type = 1 
	AND
	discussion.status = 20 
    AND 
    discussion.answered = 'no' 
ORDER BY
	relevance DESC

Next I try some strange conditional ORDER BY with a SWITCH statement mixed in. Kinda ugly and kinda slower at 0.0568 sec.

/* Showing rows 0 - 29 (43 total, Query took 0.0568 sec) */
SELECT
	discussion.* 
FROM
	discussion 
WHERE
    (discussion.category_id = '1' OR discussion.product_id = 'y') 
    AND
    discussion.type = 1 
	AND
	discussion.status = 20 
    AND 
    discussion.answered = 'no' 
ORDER BY CASE WHEN
	discussion.product_id = 'y' THEN 1 ELSE 0 END DESC

Worst yet is the UNION executing in 0.0666 sec.

/* Showing rows 0 - 29 (43 total, Query took 0.0666 sec) */
(SELECT
	discussion.* 
FROM
	discussion 
WHERE
    discussion.product_id = 'y'  
    AND
    discussion.type = 1 
	AND
	discussion.status = 20 
    AND 
    discussion.answered = 'no' 
ORDER BY
	relevance DESC)
UNION
(SELECT
	discussion.* 
FROM
	discussion 
WHERE
    (discussion.category_id = '1' AND discussion.product_id != 'y') 
    AND
    discussion.type = 1 
	AND
	discussion.status = 20 
    AND 
    discussion.answered = 'no' 
ORDER BY
	relevance DESC)

Is there a better way to do what I’m attempting?

In SQL (, )

MySQL INNER JOIN on Multiple Parameters and Condtions

Posted by chris on August 19th, 2010 Comments(0)

Wow. This blog has gotten a bit lopsided as of late on the MySQL side of stuff. A symptom of all the MySQL wizardry I’ve been doing and learning at my new job. I found out today that you can add multiple parameters/conditions to an INNER JOIN statement and I feel kind of dumb for not knowing this before. Using multiple parameters on your inner join you can remove the need for nasty sub queries!

Take this query for instance:

SELECT DISTINCT
    wp_posts.*, wp_users.user_nicename, 
	((SELECT meta_value FROM wp_postmeta meta WHERE meta.meta_key = 'views' AND meta.post_id = wp_posts.ID)*1) as views
FROM
	article_to_categories
INNER JOIN
    wp_posts ON wp_posts.ID = article_to_categories.wp_post_id
INNER JOIN
    wp_users ON wp_users.ID = wp_posts.post_author
WHERE
    article_to_categories.categories_id = $id
ORDER BY
    views DESC

On a production server this query is sadly executing in 0.1 seconds. This is due almost entirely to that nasty little sub select in there. In my opinion, sub selects should be avoided at all cost. They are a symptom of at best an unoptimized query and at a worst a poorly designed database. Sometimes I’m a guilty party. Our goal is to get rid of the sub query:

SELECT meta_value FROM wp_postmeta meta WHERE meta.meta_key = 'views' AND meta.post_id = wp_posts.ID

We can do this by specifying multiple parameters and conditions on our join statement.

SELECT DISTINCT
    wp_posts.*, wp_users.user_nicename, (wp_postmeta.meta_value*1) as views 
FROM
    article_to_categories
INNER JOIN
	wp_posts ON wp_posts.ID = article_to_categories.wp_post_id 
INNER JOIN
	wp_users ON wp_users.ID = wp_posts.post_author 
INNER JOIN 
	wp_postmeta ON (wp_postmeta.post_id = wp_posts.ID AND wp_postmeta.meta_key = 'views') 
WHERE
    article_to_categories.categories_id = $id
ORDER BY
	views DESC

Notice that when joining on wp_postmeta I specify that I want all records matching wp_posts.ID and that match the meta_key views. By moving away from the sub select I was able to trim the query execution time to 0.03 seconds. Thats a massive improvement especially for a query that is run in bulk!

INNER JOIN 
	wp_postmeta ON (wp_postmeta.post_id = wp_posts.ID AND wp_postmeta.meta_key = 'views')

I’ve got a feeling I’ll need to go back through some of my older code and introduce some needed query optimizations. How about you?

In SQL (, , )

What is PDO Error HY093: Debug/Fix MySql PHP PDO Error HY093

Posted by chris on August 11th, 2010 Comments (2)

I’ve been working with PDO at my new job and encountered this HY093 error. Infact that was the only information I received back from my PDO object. Usually there is a lot more information. My guess is that generally PDO is passing back the MySQL error directly from the MySQL server. However, HY093 is strictly an error issued by PDO.

In this case I received the HY093 error because I had improperly binded a data type to a variable.

I was using the form:

$db->prepare("SELECT * FROM tbl WHERE name = :name");
$pdo->bindValue(':name',$phpVar,PARAM_STR);
$pdo->execute();

Notice on line 2 I used “PARAM_STR” this should be PDO::PARAM_STR likewise had this been an integer it would have been PDO:PARAM_INT. Hopefully that helps some people.

Drop me a comment if this helped your or if you feel I missed something, thanks for reading.

In Programming, SQL (, , , )

Use MySQL HAVING to Get Past Error 1111 Invalid Use of Group Function

Posted by chris on June 22nd, 2010 Comments(0)

I was working on some new reporting features for a client this evening when I encountered MySQL Error 1111 Invalid Use of Group Function. I was a bit befuddled by this as I was simply using a SUM total in my WHERE clause. Of course after a bit of googling I found that I cannot use an aggregate function such as SUM or COUNT as a WHERE clause.

Here is my original query.

SELECT SQL_CALC_FOUND_ROWS 
  city.name AS city, state.abbrev AS state_abbrev, zip.zipcode, zip.city_id, COUNT(zip.zipcode) AS zip_count, SUM(assigned.is_primary) AS primary_count, SUM(assigned.is_backup) AS backup_count 
FROM 
  tbl_zipcode zip 
INNER JOIN 
  tbl_city city ON city.city_id = zip.city_id 
INNER JOIN 
  tbl_state state ON state.state_id = city.state_id 
LEFT JOIN 
  tbl_affiliate_assigned_location assigned ON assigned.zipcode = zip.zipcode 
WHERE 
  zip_count > primary_count 
GROUP BY 
  city.city_id 
ORDER BY 
  COUNT(zip.zipcode) DESC, city.name ASC 
LIMIT 0,50

I fixed this using a HAVING clause on the GROUP BY.

SELECT SQL_CALC_FOUND_ROWS 
  city.name AS city, state.abbrev AS state_abbrev, zip.zipcode, zip.city_id, COUNT(zip.zipcode) AS zip_count, SUM(assigned.is_primary) AS primary_count, 
  SUM(assigned.is_backup) AS backup_count 
FROM 
  tbl_zipcode zip 
INNER JOIN 
  tbl_city city ON city.city_id = zip.city_id 
INNER JOIN 
  tbl_state state ON state.state_id = city.state_id 
LEFT JOIN 
  tbl_affiliate_assigned_location assigned ON assigned.zipcode = zip.zipcode 
GROUP BY 
  city.city_id HAVING zip_count > primary_count 
ORDER BY 
  COUNT(zip.zipcode) DESC, city.name ASC 
LIMIT 0,50

The HAVING clause seems to be one of those things I use so rarely that I forget about it just before I need it again. Hopefully since I wrote on a blog on it this time it sticks around in my memory bank a bit longer. Hope this helped someone. I also found out that you can have multple HAVING clauses using the following syntax.

HAVING zip_count >  primary_count OR primary_count IS NULL
In SQL (, , , )