Chris Nizzardini, Salt Lake City Utah, Web Developer Specializing in LAMP+Ajax Since 2006

My Blog

Here is my awesome blog. You can find information on programming, linux, documentation, tips for code and database optimization, my thoughts and rants, and whatever else I feel like sharing. Feel free to contribute to the blog by posting comments and asking questions.

SQL

Optimize MySQL Queries – Fast Inserts With Multiple Rows

Posted by chris on May 31st, 2010 Comment(1)

I was programming some code that needed to do a lot of inserts a few months back. I hypothesized that creating a big SQL insert statement would be faster than executing a bunch of small insert statements. I didn’t have time to create a test, but I wanted to go back and test this theory at some point. The idea was that even though there is a penalty for looping through an array and accessing a variable to build the SQL statement that it would still be faster than asking PHP to send a bunch of tiny querries.

My test server is my local machine running Ubuntu 10.04, dual 1.80Ghz core Intel processors with 2 GB of memory. Its running the latest stable release of PHP5, Apache2, and MySql5. I used the standard php mysql_connect function, not mysqli. The database engine used in this test was MyIsam. I performed 1000 inserts in the first test, I then altered the code to build one giant insert. I restarted the apache and mysql server after the first scenario.

Read the rest of this entry »

In Programming, SQL (, , , )

MySQL engines, InnoDb versus MyISAM for web developers

Posted by chris on February 23rd, 2010 Comments (3)

Let me start out by saying I think MyISAM sucks. I hate it. It’s the default MySQL database engine, but its non-relational so a lot of people just start using it without exploring the other options. Hey thats okay, I did the same thing until some smart guys over at my last job introduced be to InnoDb. MyISAM is probably the best way to go for newer web developers just trying to cut their teeth on web application development. At some point its time to a pick up a book and learn how InnoDb can save you time, save you headaches, reduce the amount of code you write, and make the world a better place (okay that last one is a reach).

The best part about InnoDB is that its relational. Its transaction-safe too, but I’ll just focus on the relation side of things for now. What is a relation? A relation joins two tables together on a common value. Typically this is a parent-child relationship known as a one-to-many, but it can be a one-to-one relation too. Lets look at three tables.

tbl_profile
—————
profile_id
profile_name

tbl_profile_setting
———————–
profile_setting_id
profile_setting_name

tbl_profile_has_setting
—————————
profile_id
profile_setting_id

We have a profile table for storing whatever, then a profile can have settings. It doesn’t really matter what these settings are for the purpose of this article, but a profile can have multiple of these settings. You could have this same structure in MyISAM, but you would have to store the relations in your code. Your code is prone to errors. It happens, in fact is happens enough that I try to write as little code as possible. My goal is to leverage as much pre-written code as possible, because its been reviewed by more people and if I’m using that code I likely trust the source. InnoDB is an awesome example of this. Its widely deployed and written by people with more skills than me. No inferiority-complex here, thanks InnoDB.

For creating a relation. We’ll use phpMyAdmin. MySQL Administrator works great too and if your nutty enough you can look up the SQL for doing it in the mysql command line console. Go into the tbl_profile_has_setting table. In the structure tab you will see a link called Relation View. Click on this. You’ll notice a drop down next profile_id and profile_setting_id (these will only appear if you made these primary keys). You’ll need to create indexes on these two columns in the tbl_profile_has_setting table as well. Select the tbl_profile.profile_id and tbl_profile_setting.profile_setting_id for their respective columns. For the On Delete drop down select cascade.

What you’ve just done is create relations that have the following rules enforced by the database engine.

  1. When you delete a profile, its corresponding record(s) in tbl_profile_has_settings is deleted automajically
  2. When you delete a profile setting, its corresponding record(s) in tbl_profile_has_settings is deleted automajically
  3. When you add a record to tbl_profile_has_setting the profile_id and profile_setting_id must exist in their respective tables

Guess what, you don’t have to verify that the setting exists anymore when inserting into tbl_profile_has_setting and you don’t even need to worry about the profile existing. MySQL will return an error if these rules are violated. You now have referential integrity, clean data, and happy reports. You made all this possible just by creating the relation. So what did the cascade option do? That created rule 1 and 2 above. The auto-delete. Cascade should be used wisely as it can have devastating consequences (you records are automatically deleted), but when you implement a cascade this is normally what you want.

So why doesn’t everyone use InnoDb over MyISAM. There are several reasons:

  1. You need to be more knowledgeable to use it. This isn’t just throw data in grab it out anymore. It takes more thought and for bigger projects you’ll want to create ER diagrams to flowchart out your database.
  2. Performance penalty. Since you’ve offloaded the work to the database engine your database now runs slower. I scoff when people use this as an argument against InnoDb. If your application has gotten to be so successful that InnoDb is the sole reason of your slow down then congratulations, not many people are as successful as you. Plus InnoDb operates in a lower level language that is faster than the PHP code you are writing. Also most of your slowdowns in PHP web applications can be attributed to poorly written queries, bad database design to begin with, and lack of innovation to come up with solutions to improve speed.
  3. Harded to backup. Yes you can still use the mysqldump to backup your data, but you can’t copy the actaul database file like you could with MyIsam. This is a crappy form of database backup anyways. If you’re big enough to wear the mysqldump is no longer a sane method of doing backups then just stop being cheap and go buy the enterprise software to manage your data. Your data is important to you right?

Hope this helps some people and I hope it offends some people as well. This is one of those things that I cannot find common ground on, its debated often between me and co-workers. On a side note, don’t let your domain expire while on vacation. You’ll lose your SERPs fast.

In Programming, SQL (, , )

SQL_CALC_FOUND_ROWS – Get Total Rows in MySQL Query

Posted by chris on August 13th, 2009 Comments(0)

The MySQL SQL_CALC_FOUND_ROWS function is a nice way to return how many rows were returned in the query. There has been a lot of discussion in the PHP.net entry on mysql_num_rows regarding this function. The debate centers around whether its more effecient to use MySQLs built in functionality or whether its more effecient to run the same query again using the COUNT() function.

For me, its hard to determine which way is better. Usually its better to leverage your database engine than code. There is not an easy way to tell how database cache plays into this either. I feel using SQL_CALC_FOUND_ROWS is the better option, it eliminates a few extra lines of code, and prevents you from having to update multiple queries. Whether there is a performance penalty in either case is debatable, if its even noticeable…

1
SELECT SQL_CALC_FOUND_ROWS * FROM tbl_customer WHERE entry_date > '2009-01-01';

In a separate query run this (I believe this is connection dependent, so they must be run in conjunction with each other within the life of the same connection).

1
SELECT FOUND_ROWS() AS totalRows;
In SQL (, , )

MySQL WITH ROLLUP for Easy Automatic Grouped Total Columns

Posted by chris on June 4th, 2009 Comments(0)

Using the WITH ROLLUP modifier in queries using GROUP BY will add an additional row to the result set which sums all columns. This prevents you from having you to write code which adds each column in your programming language.

Description from the Mysql Reference Manual

The GROUP BY clause allows a WITH ROLLUP modifier that causes extra rows to be added to the summary output. These rows represent higher-level (or super-aggregate) summary operations. ROLLUP thus allows you to answer questions at multiple levels of analysis with a single query. It can be used, for example, to provide support for OLAP (Online Analytical Processing) operations.

Example

1
2
3
4
5
6
7
8
SELECT 
	DATE(dateTime) AS order_date, COUNT(*) AS shipments, SUM(shipping_total) AS shipping_total, SUM(hasShipAmt) AS hasShipAmt, SUM(shipping_total)-SUM(hasShipAmt) AS revenue  
FROM 
	tbl_my_orders 
WHERE 
	dateTime BETWEEN '$startDate' AND '$endDate 23:59:59' 
GROUP BY 
	order_date WITH ROLLUP
In SQL (, , , )

MS SQL Show Tables in a Database (Microsoft SQL)

Posted by chris on February 11th, 2009 Comments (3)

In MS SQL use the following SQL to show all tables in the selected database. This is the equivalent of MySQLs show all tables.

MS SQL queries for show table:

1
SELECT * FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_TYPE = 'BASE TABLE';

Using this query syntax you can display all tables in a MS SQL database. Hmmm… I think I like the MySQL way for showing all tables better:

1
SHOW ALL TABLES;

Drop me a comment if this helped your or if you feel I missed something, thanks for reading.

In SQL (, )

preventing sql injection attacks on mysql, php, lamp servers

Posted by chris on February 3rd, 2009 Comments(0)

I was recently auditing a clients website and was astounded when I was able to access any customers account by disabling JavaScript, using a wild card ‘%’ operator as the username, and guessing passwords. Unfortunately the developer of the customer login used a query that looked like this and did not clean any of the data once received by the server:

1
SELECT * FROM the_customer_table WHERE the_username LIKE "'.$_POST['the_username_field'].'" AND password="'.$_POST['the_password_field'].'"

Using the wild card operator it would return any username where the password matched whatever I passed in such as “password”, “happiness”, and some dirty words that I threw in for fun. The exploit could have been very bad actually as certain accounts have gift certificates associated with them. A hacker could have written a simple brute force script in any language and went to town compiling a list of valid accounts and then ordered hundreds to thousands of dollars worth of product. The exploit was patched by altering the query and additional adding an add_slashes() to all POST data for the login.

A proper login query should look like this:

1
2
3
4
5
6
7
8
$valid=false;
$sql = "SELECT the_username,the_password FROM the_customer_table WHERE the_username ="'.mysql_real_escape_string($_POST['the_username_field']).'" LIMIT 1";
$result = mysql_query($sql);
$row = mysql_fetch_assoc($result);
if($row['the password']==$_POST['the_password_field']){
$valid=true;
}
return $valid;

This will verify first that a valid customer account exists, then it will test the actual password against what was supplied by the end-user. This of course should occur after the POST data has been cleansed against SQL injection hacks and JavaScript hacks. Never use LIKE when you are searching for an exact match either and always use a LIMIT 1 when you are only expecting 1 record to be returned or 1 record to be affected.

In SQL ()

mysql removing characters from end of value

Posted by chris on December 4th, 2008 Comments(0)

Here’s a quick MySQL trick a co-worker helped me figure out today. This roughly equates to PHPs substr($var,0,-2) functionality.

1
SUBSTRING(field, 1, (LENGTH(field) - 2))

So if the fields value was 12345xx this would change the returned value to 12345.

In SQL ()

moving innodb mysql tables & database to a new server

Posted by chris on November 11th, 2008 Comments(0)

A standard mysqldump will not work for moving InnoDB tables to a new server. It will create the tables and the structure, but will fail on inserting the data. Why? If you have any foreign key constraints mysqldump is not “smart” enough to insert the rows containing the primary keys before the rows using those keys as foreign keys. You will get MySQL error 150: Foreign key constraint is incorrectly formed.

There are two methods two get around this:

Read the rest of this entry »

In SQL ()

Using perror to Help Debug Mysql Errors

Posted by chris on November 4th, 2008 Comments(0)

Print a description for a system error code or an error code from a MyISAM/ISAM/BDB table handler. Example, type “perror 150″ in the linux shell.

1
2
# perror 150
MySQL error code 150: Foreign key constraint is incorrectly formed
In Linux, SQL (, , )

Using MySQL Functions in WHERE Clauses Breaks Speed Gains from Indexing?

Posted by chris on August 8th, 2008 Comments(0)

I was conversing with a co-worker today and asked him to take a look at some queries of mine hoping he could determine why they were so sluggish. I had already ruled out a bad inner join, sub select, and lack of indexes. The query looked at a months worth of data using something like:

1
WHERE DATE(date_time) '2008-08-08'

For whatever reason this will cause MySQL (at least MySQL 4) to ignore the indexing on the date_time field. Changing your query to this:

1
WHERE date_time BETWEEN '2008-08-08 00:00:00' AND '2008-08-08 23:59:59'

Can save significant execution time. In my tests on a table with over 300,000 records this dropped by query execution time from roughly 4 seconds to .4 seconds, 10x faster! Using mysql functions in your select clause does not seem to negatively impact execution time.

In SQL (, , )