Chris Nizzardini, Salt Lake City Utah, Web Developer Specializing in LAMP+Ajax Since 2006

My Blog

Here is my awesome blog. You can find information on programming, Linux, documentation, tips for code and database optimization, my thoughts and rants, and whatever else I feel like sharing. Feel free to contribute to the blog by posting comments and asking questions.

Posts Tagged ‘mysql’

MySQL InnoDb inserts are slow, really slow!

Posted by chris on June 12th, 2010 Comments (6)

So I was doing some MySQL database optimization tests this morning when I got my database engines mixed up. That’s when I stumbled on something horrible: InnoDB is really slow when it comes to inserts! Like really slow.

I performed 10,000 inserts on MyISAM and they executed in about 1/10th of a second. The same inserts on an InnoDB table took over 21 seconds! I was seriously dumbfounded by this. I knew MyISAM would be faster, but I figured it would be only marginally faster.

Out of curiosity I did a single insert with MyISAM, which executed in 0.000762939453125 seconds, whereas the same insert in InnoDB executed in 0.032792091369629 seconds. I must say this is a bummer for anyone using InnoDB in a high-insert environment. I’ll have to see what the differences are on updates, selects, deletes, and joins.
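
A likely explanation, which the test above does not isolate, is that each insert ran as its own transaction: with autocommit on (the MySQL default), InnoDB commits after every statement and, with default settings, flushes its log to disk each time, whereas MyISAM has no transactions to commit. Grouping the inserts into one transaction is the usual way to check this. Here is a minimal sketch, assuming PDO; the table name insert_test is made up, and the demo uses an in-memory SQLite database only so the script runs without a server, so its timing says nothing about InnoDB. Point the DSN at a MySQL database with an InnoDB table to repeat the comparison.

```php
<?php
// Sketch: run many inserts inside one transaction so the engine commits
// once instead of once per row. The table name insert_test is hypothetical.

function insertInOneTransaction(PDO $pdo, array $names): int
{
    $stmt = $pdo->prepare('INSERT INTO insert_test (name) VALUES (?)');
    $pdo->beginTransaction();
    try {
        foreach ($names as $name) {
            $stmt->execute([$name]);
        }
        $pdo->commit();
    } catch (Throwable $e) {
        // Undo the partial batch, then let the caller see the error.
        if ($pdo->inTransaction()) {
            $pdo->rollBack();
        }
        throw $e;
    }
    return count($names);
}

// Demo connection: in-memory SQLite, used only to make the sketch runnable.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE insert_test (id INTEGER PRIMARY KEY, name TEXT)');

$start = microtime(true);
$inserted = insertInOneTransaction($pdo, array_map('strval', range(1, 10000)));
printf("%d rows in %.4f seconds\n", $inserted, microtime(true) - $start);
```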

In SQL

Optimize MySQL Queries – Fast Inserts With Multiple Rows

Posted by chris on May 31st, 2010 Comment(1)

A few months back I was writing some code that needed to do a lot of inserts. I hypothesized that building one big SQL insert statement would be faster than executing a bunch of small insert statements. I didn’t have time to create a test then, but I wanted to go back and test this theory at some point. The idea was that even though there is a penalty for looping through an array to build the SQL statement, it would still be faster than asking PHP to send a bunch of tiny queries.

My test server is my local machine running Ubuntu 10.04, with dual 1.80 GHz core Intel processors and 2 GB of memory. It’s running the latest stable releases of PHP 5, Apache 2, and MySQL 5. I used the standard PHP mysql_connect function, not mysqli. The database engine used in this test was MyISAM. I performed 1000 inserts in the first test, then altered the code to build one giant insert. I restarted the Apache and MySQL servers after the first scenario.
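
To make the "one giant insert" idea concrete, here is a rough sketch of how such a statement could be built. It is not the code used in the test: it assumes PDO-style placeholders rather than the mysql_* functions, and the table and column names (tbl_test, name, score) are invented. Table and column names are pasted into the SQL as-is, so they must come from your own code, never from user input.

```php
<?php
// Sketch: build one multi-row INSERT with placeholders instead of sending
// one INSERT per row. Table and column names are hypothetical.

function buildMultiRowInsert(string $table, array $columns, array $rows): array
{
    if ($rows === []) {
        throw new InvalidArgumentException('No rows to insert');
    }
    // One "(?, ?, ...)" group per row.
    $group = '(' . implode(', ', array_fill(0, count($columns), '?')) . ')';
    $sql = sprintf(
        'INSERT INTO %s (%s) VALUES %s',
        $table,
        implode(', ', $columns),
        implode(', ', array_fill(0, count($rows), $group))
    );
    // Flatten the rows into one parameter list, in row order.
    $params = [];
    foreach ($rows as $row) {
        foreach ($row as $value) {
            $params[] = $value;
        }
    }
    return [$sql, $params];
}

[$sql, $params] = buildMultiRowInsert(
    'tbl_test',
    ['name', 'score'],
    [['alice', 10], ['bob', 20]]
);
echo $sql, "\n";
// INSERT INTO tbl_test (name, score) VALUES (?, ?), (?, ?)
```

The pair can then be run with `$pdo->prepare($sql)->execute($params)`. For very large batches, keep in mind that MySQL limits statement size through max_allowed_packet, so the rows may need to be sent in chunks.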

Read the rest of this entry »

In Programming, SQL

MySQL engines, InnoDb versus MyISAM for web developers

Posted by chris on February 23rd, 2010 Comments (5)

Let me start out by saying I think MyISAM sucks. I hate it. It’s the default MySQL database engine, but it doesn’t enforce relations between tables, and a lot of people just start using it without exploring the other options. Hey, that’s okay, I did the same thing until some smart guys over at my last job introduced me to InnoDB. MyISAM is probably the best way to go for newer web developers just trying to cut their teeth on web application development. At some point it’s time to pick up a book and learn how InnoDB can save you time, save you headaches, reduce the amount of code you write, and make the world a better place (okay, that last one is a reach).

The best part about InnoDB is that it enforces relations through foreign keys. It’s transaction-safe too, but I’ll just focus on the relation side of things for now. What is a relation? A relation ties two tables together on a common value. Typically this is a parent-child relationship known as a one-to-many, but it can be a one-to-one relation too. Let’s look at three tables.

tbl_profile
-----------
profile_id
profile_name

tbl_profile_setting
-------------------
profile_setting_id
profile_setting_name

tbl_profile_has_setting
-----------------------
profile_id
profile_setting_id

We have a profile table for storing whatever, and a profile can have settings. It doesn’t really matter what these settings are for the purpose of this article, but a profile can have several of them. You could have this same structure in MyISAM, but you would have to enforce the relations in your code. Your code is prone to errors. It happens; in fact it happens enough that I try to write as little code as possible. My goal is to leverage as much pre-written code as possible, because it’s been reviewed by more people and if I’m using that code I likely trust the source. InnoDB is an awesome example of this. It’s widely deployed and written by people with more skills than me. No inferiority complex here, thanks InnoDB.

To create a relation, we’ll use phpMyAdmin. MySQL Administrator works great too, and if you’re nutty enough you can look up the SQL for doing it in the mysql command line console. Go into the tbl_profile_has_setting table. In the Structure tab you will see a link called Relation View. Click on this. You’ll notice a drop-down next to profile_id and profile_setting_id (these will only appear if you made these primary keys). You’ll need to create indexes on these two columns in the tbl_profile_has_setting table as well. Select tbl_profile.profile_id and tbl_profile_setting.profile_setting_id for their respective columns. For the On Delete drop-down, select cascade.

What you’ve just done is create relations that have the following rules enforced by the database engine.

  1. When you delete a profile, its corresponding record(s) in tbl_profile_has_setting are deleted automagically
  2. When you delete a profile setting, its corresponding record(s) in tbl_profile_has_setting are deleted automagically
  3. When you add a record to tbl_profile_has_setting, the profile_id and profile_setting_id must exist in their respective tables

Guess what: you don’t have to verify that the setting exists anymore when inserting into tbl_profile_has_setting, and you don’t even need to worry about the profile existing. MySQL will return an error if these rules are violated. You now have referential integrity, clean data, and happy reports. You made all this possible just by creating the relation. So what did the cascade option do? It created rules 1 and 2 above: the auto-delete. Cascade should be used wisely as it can have devastating consequences (your records are automatically deleted), but when you implement a cascade this is normally what you want.
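
For anyone nutty enough to skip phpMyAdmin, the same relations can be sketched in SQL and run from PHP. This is a rough equivalent, not the exact statements phpMyAdmin generates: the constraint names fk_phs_profile and fk_phs_setting and the connection details are made up, and it assumes all three tables already use InnoDB.

```php
<?php
// Sketch: foreign keys with ON DELETE CASCADE for tbl_profile_has_setting.
// Constraint names are invented for this example; all three tables are
// assumed to be InnoDB already.
$statements = [
    'ALTER TABLE tbl_profile_has_setting
        ADD CONSTRAINT fk_phs_profile
        FOREIGN KEY (profile_id) REFERENCES tbl_profile (profile_id)
        ON DELETE CASCADE',
    'ALTER TABLE tbl_profile_has_setting
        ADD CONSTRAINT fk_phs_setting
        FOREIGN KEY (profile_setting_id) REFERENCES tbl_profile_setting (profile_setting_id)
        ON DELETE CASCADE',
];

// Runs each statement in order; PDO throws on the first failure when the
// connection uses PDO::ERRMODE_EXCEPTION.
function applyStatements(PDO $pdo, array $statements): void
{
    foreach ($statements as $sql) {
        $pdo->exec($sql);
    }
}

// Usage (DSN, user, and password are placeholders):
// $pdo = new PDO('mysql:host=localhost;dbname=mydb', 'user', 'password',
//     [PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION]);
// applyStatements($pdo, $statements);
```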

So why doesn’t everyone use InnoDB over MyISAM? There are several reasons:

  1. You need to be more knowledgeable to use it. This isn’t just throw data in, grab it out anymore. It takes more thought, and for bigger projects you’ll want to create ER diagrams to map out your database.
  2. Performance penalty. Since you’ve offloaded the work to the database engine, your database now runs slower. I scoff when people use this as an argument against InnoDB. If your application has gotten to be so successful that InnoDB is the sole reason for your slowdown, then congratulations, not many people are as successful as you. Plus InnoDB is written in a lower-level language that is faster than the PHP code you are writing. Also, most of the slowdowns in PHP web applications can be attributed to poorly written queries, bad database design to begin with, and a lack of innovation in coming up with solutions to improve speed.
  3. Harder to back up. Yes, you can still use mysqldump to back up your data, but you can’t copy the actual database file like you could with MyISAM. That is a crappy form of database backup anyway. If you’re big enough that mysqldump is no longer a sane method of doing backups, then just stop being cheap and go buy the enterprise software to manage your data. Your data is important to you, right?

Hope this helps some people and I hope it offends some people as well. This is one of those things that I cannot find common ground on; it’s debated often between me and my co-workers. On a side note, don’t let your domain expire while on vacation. You’ll lose your SERPs fast.

In Programming, SQL

How To Write a Page Controller in PHP for Dynamic Content

Posted by chris on February 6th, 2010 Comments (2)

This how-to covers creating a dynamic content system. It’s well known that a site like Wikipedia doesn’t have an HTML file for each article. That would be insanity. It would be nearly impossible to display the file tree in an IDE, and cumbersome to search through even with an OS that has a slick file system and powerful shell like Linux. Trust me, I worked on a site that created a unique page for each product on their site (they’ve since gotten with the times). So how can web browsers access a page like http://en.wikipedia.org/wiki/Mike_Tyson when that file doesn’t exist? The application uses a combination of server-side code, database storage, and Apache .htaccess magic. Here’s how to do this.

Apache HTACCESS
This is the most important part of serving dynamic content. The .htaccess file is what makes the magic happen. A user requests http://en.wikipedia.org/wiki/Mike_Tyson and Apache goes to process the request. Normally Apache would respond with a 404 error page because the file does not exist, but if it sees an .htaccess file in the directory, it will follow the rules we define there. Our rule tells Apache that if the requested file is not found, it should hand the request to some other file. We will call this file mycontroller.php (because it’s the controller in our Model-View-Controller setup). Below is some example code to get you started:

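# Disable directory listings and allow symlinks to be followed
Options -Indexes
Options +FollowSymLinks
DirectoryIndex index.php
ErrorDocument 404 /404.php

<IfModule mod_rewrite.c>
  RewriteEngine on
  # Only rewrite when the request is not an existing file or directory
  RewriteCond %{REQUEST_FILENAME} !-f
  RewriteCond %{REQUEST_FILENAME} !-d
  RewriteCond %{REQUEST_URI} !=/favicon.ico
  # Hand everything else to the controller, keeping the query string
  RewriteRule ^(.*)$ mycontroller.php [L,QSA]
</IfModule>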

Recap:
1. We have /public_html/mydomain.com/wiki/.htaccess
This overrides the Apache web server’s default behavior for this directory.

2. We rewrite the request to /public_html/mydomain.com/wiki/mycontroller.php
This contains the server-side code that will handle our request for the Mike Tyson article.

The Database
Going into detail on this topic is beyond the scope of this article, but you’ll need some sort of database management system to store your article on Mike Tyson and the thousands of other articles. Of course there are other options like an XML file, but a database such as MySQL is the sanest approach for most sites.

Server Side Code
You’ll need some sort of server-side code running, whether it’s ASP, JSP, or PHP. I’m a bit partial to PHP so let’s roll with that. In mycontroller.php your code might look something like this:

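// Assumes a MySQL connection was already opened with mysql_connect().
// The mysql_* functions need PHP 5; they were removed in PHP 7.
$uriArr = explode('/', $_SERVER['REQUEST_URI']);
$article = urldecode($uriArr[2]); // for /wiki/Mike_Tyson this is "Mike_Tyson"

// NOTE: $article goes into the query unescaped to keep the example short;
// escape it (or use a prepared statement) in real code.
$sql = "SELECT * FROM article WHERE name = '$article'";
$result = mysql_query($sql);
if ($result && mysql_num_rows($result) == 1) {
	$page = mysql_fetch_assoc($result);
	header('HTTP/1.1 200 OK');
	header('Connection: close');
	include_once 'mytemplate.php';
	exit;
} else {
	// Send the 404 status and render the error page directly; a Location
	// header here would turn the response into a redirect.
	header('HTTP/1.1 404 Not Found');
	header('Connection: close');
	include_once '404.php';
	exit;
}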

This asks our database for any records it has on the requested article, Mike_Tyson. If a row is returned then we know we’ve found our article. We tell the browser that this is a 200 OK response. We store the database record in an array called $page, then include a file called mytemplate.php (this file is never requested directly by the browser). Our mytemplate.php file will look for this variable and begin populating the article. Let’s say $page contains the following data: title, body, image, and references. The mytemplate.php file might look something like this:

// $page is the database row set by mycontroller.php before this file is included.
$title = $page['title'];
$body = $page['body'];
$image = $page['image'];
$references = $page['references'];
echo "<html><head><title>$title</title></head><body>";
echo "<h1>$title</h1>";
echo "<div class=\"mainImage\">$image</div>";
echo "<p>$body</p>";
echo "<p>$references</p>";
echo "</body></html>";

Sweet! We can use the same template for a bunch of different articles without having to create multiple files. Now if the user had requested the following URL: http://en.wikipedia.org/wiki/Mike_TysonIsEvil, we wouldn’t have an article on that. So instead the code would tell the browser this is a 404 error and show the 404.php page.

This is an oversimplified version of a dynamic content system, but it would work. If I were developing one of these on a professional level it would be complete with objects to handle requests, string cleaners to protect against SQL injection and XSS attacks, error logging, and the works! Drop me a comment if this helped you out, or if you have questions or something to add. Thanks for reading.
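
As a taste of what those "string cleaners" might look like, here is a small sketch of the same lookup with a prepared statement and escaped output. It is only one way to do it, not the code from a production system: it assumes PDO instead of the mysql_* functions, the helper names (slugFromUri, findArticle, e) are made up, and the demo uses an in-memory SQLite database purely so it runs on its own.

```php
<?php
// Sketch: article lookup with a prepared statement plus HTML escaping.
// Helper names are hypothetical; swap the SQLite DSN for a mysql: one.

function slugFromUri(string $uri): string
{
    // "/wiki/Mike_Tyson?x=1" -> "Mike_Tyson"
    $path = parse_url($uri, PHP_URL_PATH);
    if (!is_string($path)) {
        return '';
    }
    $parts = explode('/', trim($path, '/'));
    return urldecode($parts[1] ?? '');
}

function findArticle(PDO $pdo, string $name): ?array
{
    // The value is bound as a parameter, never pasted into the SQL text.
    $stmt = $pdo->prepare('SELECT * FROM article WHERE name = ?');
    $stmt->execute([$name]);
    $row = $stmt->fetch(PDO::FETCH_ASSOC);
    return $row === false ? null : $row;
}

function e(string $value): string
{
    // Escape text before echoing it into HTML.
    return htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
}

$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE article (name TEXT, title TEXT, body TEXT)');
$pdo->exec("INSERT INTO article VALUES ('Mike_Tyson', 'Mike Tyson', 'Boxer')");

$page = findArticle($pdo, slugFromUri('/wiki/Mike_Tyson'));
echo $page === null ? "404\n" : '<h1>' . e($page['title']) . "</h1>\n";

// A hostile name is treated as data, not SQL, so it simply matches nothing.
var_dump(findArticle($pdo, "x' OR '1'='1")); // NULL
```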

In Programming, SEO

SQL_CALC_FOUND_ROWS – Get Total Rows in MySQL Query

Posted by chris on August 13th, 2009 Comments(0)

SQL_CALC_FOUND_ROWS is a nice way to find out how many rows a query matched. It is a SELECT option rather than a function: add it to a query, and a follow-up call to FOUND_ROWS() returns how many rows the query would have returned without its LIMIT clause. There has been a lot of discussion in the PHP.net entry on mysql_num_rows regarding this. The debate centers around whether it’s more efficient to use MySQL’s built-in functionality or to run the same query again using the COUNT() function.

For me, it’s hard to determine which way is better. Usually it’s better to leverage your database engine than your own code. There is no easy way to tell how the database cache plays into this either. I feel SQL_CALC_FOUND_ROWS is the better option: it eliminates a few extra lines of code and saves you from having to keep two queries in sync. Whether there is a performance penalty in either case is debatable, if it’s even noticeable…

SELECT SQL_CALC_FOUND_ROWS * FROM tbl_customer WHERE entry_date > '2009-01-01' LIMIT 10;

Then run this as a separate query (I believe FOUND_ROWS() is connection dependent, so the two queries must be run one after the other within the life of the same connection).

SELECT FOUND_ROWS() AS totalRows;
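
From PHP, the two queries could be wrapped together roughly like this. Treat it as a sketch under assumptions: it uses PDO rather than the mysql_* functions, the function name fetchPageWithTotal is made up, and it needs a real MySQL connection to do anything. Note also that MySQL 8.0.17 and later mark SQL_CALC_FOUND_ROWS and FOUND_ROWS() as deprecated, so on newer servers a separate COUNT(*) query is the safer bet.

```php
<?php
// Sketch: fetch one page of rows plus the total match count, using the same
// connection for both queries. The function name is hypothetical.

function fetchPageWithTotal(PDO $pdo, string $since, int $limit): array
{
    // $limit is an int, so appending it to the SQL text is safe.
    $stmt = $pdo->prepare(
        'SELECT SQL_CALC_FOUND_ROWS * FROM tbl_customer'
        . ' WHERE entry_date > ? LIMIT ' . $limit
    );
    $stmt->execute([$since]);
    $rows = $stmt->fetchAll(PDO::FETCH_ASSOC);

    // Must run right after the SELECT, on the same connection.
    $total = (int) $pdo->query('SELECT FOUND_ROWS()')->fetchColumn();

    return [$rows, $total];
}

// Usage (DSN, user, and password are placeholders):
// $pdo = new PDO('mysql:host=localhost;dbname=mydb', 'user', 'password');
// [$rows, $total] = fetchPageWithTotal($pdo, '2009-01-01', 10);
```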

In SQL

MySQL WITH ROLLUP for Easy Automatic Grouped Total Columns

Posted by chris on June 4th, 2009 Comments(0)

Using the WITH ROLLUP modifier in queries that use GROUP BY adds an extra row to the result set containing the aggregates across all groups, in other words a grand total. This saves you from having to write code that adds up each column in your programming language.

Description from the Mysql Reference Manual

The GROUP BY clause allows a WITH ROLLUP modifier that causes extra rows to be added to the summary output. These rows represent higher-level (or super-aggregate) summary operations. ROLLUP thus allows you to answer questions at multiple levels of analysis with a single query. It can be used, for example, to provide support for OLAP (Online Analytical Processing) operations.

Example

SELECT
	DATE(dateTime) AS order_date,
	COUNT(*) AS shipments,
	SUM(shipping_total) AS shipping_total,
	SUM(hasShipAmt) AS hasShipAmt,
	SUM(shipping_total) - SUM(hasShipAmt) AS revenue
FROM
	tbl_my_orders
WHERE
	dateTime BETWEEN '$startDate' AND '$endDate 23:59:59'
GROUP BY
	order_date WITH ROLLUP
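
On the PHP side you still need to tell the total row apart from the daily rows. MySQL marks the rollup row by setting the grouped column to NULL, so a small helper can split the result. This is a sketch with a made-up function name (splitRollup) and hand-written sample rows standing in for a fetched result set; if the grouped column can legitimately be NULL in your data, this check is ambiguous.

```php
<?php
// Sketch: separate the per-day rows from the WITH ROLLUP total row.
// The rollup row is the one whose grouped column is NULL.

function splitRollup(array $rows, string $groupColumn): array
{
    $details = [];
    $total = null;
    foreach ($rows as $row) {
        if ($row[$groupColumn] === null) {
            $total = $row;
        } else {
            $details[] = $row;
        }
    }
    return [$details, $total];
}

// Hand-written rows standing in for a fetched result set.
$rows = [
    ['order_date' => '2009-06-01', 'shipments' => 2],
    ['order_date' => '2009-06-02', 'shipments' => 3],
    ['order_date' => null,         'shipments' => 5],
];

[$details, $total] = splitRollup($rows, 'order_date');
printf("%d days, %d shipments in total\n", count($details), $total['shipments']);
// prints "2 days, 5 shipments in total"
```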

In SQL

Using perror to Help Debug Mysql Errors

Posted by chris on November 4th, 2008 Comments(0)

perror prints a description for a system error code or an error code from a MyISAM/ISAM/BDB table handler. For example, type "perror 150" in the Linux shell.

# perror 150
MySQL error code 150: Foreign key constraint is incorrectly formed

In Linux, SQL

Using MySQL Functions in WHERE Clauses Breaks Speed Gains from Indexing?

Posted by chris on August 8th, 2008 Comments(0)

I was conversing with a co-worker today and asked him to take a look at some queries of mine, hoping he could determine why they were so sluggish. I had already ruled out a bad inner join, a sub-select, and a lack of indexes. The query looked at a month’s worth of data using something like:

WHERE DATE(date_time) = '2008-08-08'

Wrapping the column in a function like this causes MySQL (at least MySQL 4) to ignore the index on the date_time field, since it has to compute DATE() for every row before it can compare. Changing your query to this:

WHERE date_time BETWEEN '2008-08-08 00:00:00' AND '2008-08-08 23:59:59'

Can save significant execution time. In my tests on a table with over 300,000 records this dropped my query execution time from roughly 4 seconds to 0.4 seconds, 10x faster! Using MySQL functions in your SELECT list does not seem to negatively impact execution time.
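
If the dates come from PHP, a small helper can produce the range so the query never wraps the column in a function. This is a sketch with invented names (dayRange, tbl_log). It uses a half-open range, from midnight up to but not including the next midnight, as a variation on the BETWEEN form above that also avoids the 23:59:59 edge.

```php
<?php
// Sketch: turn a calendar date into a half-open [start, end) range so the
// WHERE clause compares the bare column and can use its index.

function dayRange(string $date): array
{
    $start = new DateTimeImmutable($date . ' 00:00:00');
    $end = $start->modify('+1 day');
    return [$start->format('Y-m-d H:i:s'), $end->format('Y-m-d H:i:s')];
}

[$from, $to] = dayRange('2008-08-08');

// tbl_log is a hypothetical table name; bind $from and $to as parameters.
$sql = 'SELECT * FROM tbl_log WHERE date_time >= ? AND date_time < ?';
echo $sql, "\n", $from, ' .. ', $to, "\n";
// 2008-08-08 00:00:00 .. 2008-08-09 00:00:00
```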

In SQL

Mod-Log-SQL – Storing Apache Access Logs in a MySql Database

Posted by chris on October 27th, 2007 Comments(0)

mod_log_sql is an awesome way of getting away from those old log files, and it’s really handy for both web development and system administration. It’s been a while since I’ve posted and this is something I’ve never done before, so here we go. If you do any kind of parsing of your Apache access log and run a relatively high-traffic website (the one I’m doing this for, mp3crib.com, averages over 5,000 hits a day and some days gets over 10,000), you will begin eating up huge amounts of CPU and memory (if storing the information in an array). Lately it’s gotten so bad that my PHP script fails due to memory exhaustion. I can’t have that. I heard somewhere that databases are better than flat files, go figure. If I knew a lower-level language then I would of course write my parser in that…but I don’t. So luckily libapache2-mod-log-sql exists.
Read the rest of this entry »

In Linux, SQL