MySQL engines, InnoDb versus MyISAM for web developers February 23, 2010
Posted by chris in : SQL, php , add a commentLet me start out by saying I think MyISAM sucks. I hate it. It’s the default MySQL database engine, but its non-relational so a lot of people just start using it without exploring the other options. Hey thats okay, I did the same thing until some smart guys over at my last job introduced be to InnoDb. MyISAM is probably the best way to go for newer web developers just trying to cut their teeth on web application development. At some point its time to a pick up a book and learn how InnoDb can save you time, save you headaches, reduce the amount of code you write, and make the world a better place (okay that last one is a reach).
The best part about InnoDB is that its relational. Its transaction-safe too, but I’ll just focus on the relation side of things for now. What is a relation? A relation joins two tables together on a common value. Typically this is a parent-child relationship known as a one-to-many, but it can be a one-to-one relation too. Lets look at three tables.
tbl_profile
—————
profile_id
profile_name
tbl_profile_setting
———————–
profile_setting_id
profile_setting_name
tbl_profile_has_setting
—————————
profile_id
profile_setting_id
We have a profile table for storing whatever, then a profile can have settings. It doesn’t really matter what these settings are for the purpose of this article, but a profile can have multiple of these settings. You could have this same structure in MyISAM, but you would have to store the relations in your code. Your code is prone to errors. It happens, in fact is happens enough that I try to write as little code as possible. My goal is to leverage as much pre-written code as possible, because its been reviewed by more people and if I’m using that code I likely trust the source. InnoDB is an awesome example of this. Its widely deployed and written by people with more skills than me. No inferiority-complex here, thanks InnoDB.
For creating a relation. We’ll use phpMyAdmin. MySQL Administrator works great too and if your nutty enough you can look up the SQL for doing it in the mysql command line console. Go into the tbl_profile_has_setting table. In the structure tab you will see a link called Relation View. Click on this. You’ll notice a drop down next profile_id and profile_setting_id (these will only appear if you made these primary keys). You’ll need to create indexes on these two columns in the tbl_profile_has_setting table as well. Select the tbl_profile.profile_id and tbl_profile_setting.profile_setting_id for their respective columns. For the On Delete drop down select cascade.
What you’ve just done is create relations that have the following rules enforced by the database engine.
- When you delete a profile, its corresponding record(s) in tbl_profile_has_settings is deleted automajically
- When you delete a profile setting, its corresponding record(s) in tbl_profile_has_settings is deleted automajically
- When you add a record to tbl_profile_has_setting the profile_id and profile_setting_id must exist in their respective tables
Guess what, you don’t have to verify that the setting exists anymore when inserting into tbl_profile_has_setting and you don’t even need to worry about the profile existing. MySQL will return an error if these rules are violated. You now have referential integrity, clean data, and happy reports. You made all this possible just by creating the relation. So what did the cascade option do? That created rule 1 and 2 above. The auto-delete. Cascade should be used wisely as it can have devastating consequences (you records are automatically deleted), but when you implement a cascade this is normally what you want.
So why doesn’t everyone use InnoDb over MyISAM. There are several reasons:
- You need to be more knowledgeable to use it. This isn’t just throw data in grab it out anymore. It takes more thought and for bigger projects you’ll want to create ER diagrams to flowchart out your database.
- Performance penalty. Since you’ve offloaded the work to the database engine your database now runs slower. I scoff when people use this as an argument against InnoDb. If your application has gotten to be so successful that InnoDb is the sole reason of your slow down then congratulations, not many people are as successful as you. Plus InnoDb operates in a lower level language that is faster than the PHP code you are writing. Also most of your slowdowns in PHP web applications can be attributed to poorly written queries, bad database design to begin with, and lack of innovation to come up with solutions to improve speed.
- Harded to backup. Yes you can still use the mysqldump to backup your data, but you can’t copy the actaul database file like you could with MyIsam. This is a crappy form of database backup anyways. If you’re big enough to wear the mysqldump is no longer a sane method of doing backups then just stop being cheap and go buy the enterprise software to manage your data. Your data is important to you right?
Hope this helps some people and I hope it offends some people as well. This is one of those things that I cannot find common ground on, its debated often between me and co-workers. On a side note, don’t let your domain expire while on vacation. You’ll lose your SERPs fast.
mysql get total rows in query with sql_calc_found_rows August 13, 2009
Posted by chris in : SQL , add a commentThe MySQL SQL_CALC_FOUND_ROWS function is a nice way to return how many rows were returned in the query. There has been a lot of discussion in the PHP.net entry on mysql_num_rows regarding this function. The debate centers around whether its more effecient to use MySQLs built in functionality or whether its more effecient to run the same query again using the COUNT() function.
For me, its hard to determine which way is better. Usually its better to leverage your database engine than code. There is not an easy way to tell how database cache plays into this either. I feel using SQL_CALC_FOUND_ROWS is the better option, it eliminates a few extra lines of code, and prevents you from having to update multiple queries. Whether there is a performance penalty in either case is debatable, if its even noticeable…
SELECT SQL_CALC_FOUND_ROWS * FROM tbl_customer WHERE entry_date > '2009-01-01';
In a separate query run this (I believe this is connection dependent, so they must be run in conjunction with each other within the life of the same connection).
SELECT FOUND_ROWS() as totalRows;
mysql with rollup for a easy grouped total columns in result set June 4, 2009
Posted by chris in : SQL , add a commentUsing the WITH ROLLUP modifier in queries using GROUP BY will add an additional row to the result set which sums all columns. This prevents you from having you to write code which adds each column in your programming language.
Description from the Mysql Reference Manual
The GROUP BY clause allows a WITH ROLLUP modifier that causes extra rows to be added to the summary output. These rows represent higher-level (or super-aggregate) summary operations. ROLLUP thus allows you to answer questions at multiple levels of analysis with a single query. It can be used, for example, to provide support for OLAP (Online Analytical Processing) operations.
Example
SELECT DATE(dateTime) AS order_date, count(*) as shipments, sum(shipping_total) as shipping_total, sum(hasShipAmt) as hasShipAmt, sum(shipping_total)-sum(hasShipAmt) AS revenue FROM tbl_my_orders WHERE dateTime BETWEEN '$startDate' AND '$endDate 23:59:59' GROUP BY order_date WITH ROLLUP
preventing sql injection attacks on mysql, php, lamp servers February 3, 2009
Posted by chris in : SQL , add a commentI was recently auditing a clients website and was astounded when I was able to access any customers account by disabling JavaScript, using a wild card ‘%’ operator as the username, and guessing passwords. Unfortunately the developer of the customer login used a query that looked like this and did not clean any of the data once received by the server:
SELECT * FROM the_customer_table WHERE the_username LIKE "'.$_POST['the_username_field'].'" AND password="'.$_POST['the_password_field'].'"
Using the wild card operator it would return any username where the password matched whatever I passed in such as “password”, “happiness”, and some dirty words that I threw in for fun. The exploit could have been very bad actually as certain accounts have gift certificates associated with them. A hacker could have written a simple brute force script in any language and went to town compiling a list of valid accounts and then ordered hundreds to thousands of dollars worth of product. The exploit was patched by altering the query and additional adding an add_slashes() to all POST data for the login.
A proper login query should look like this:
$valid=false;
$sql = "SELECT the_username,the_password FROM the_customer_table WHERE the_username ="'.$_POST['the_username_field'].'" LIMIT 1";
$result = mysql_query($sql);
$row = mysql_fetch_assoc($result);
if($row['the password']==$_POST['the_password_field']){
$valid=true;
}
return $valid;
This will verify first that a valid customer account exists, then it will test the actual password against what was supplied by the end-user. This of course should occur after the POST data has been cleansed against SQL injection hacks and JavaScript hacks. Never use LIKE when you are searching for an exact match either and always use a LIMIT 1 when you are only expecting 1 record to be returned or 1 record to be affected.
mysql removing characters from end of value December 4, 2008
Posted by chris in : SQL , add a commentHere’s a quick MySQL trick a co-worker helped me figure out today. This roughly equates to PHPs substr($var,0,-2) functionality.
SUBSTRING(field, 1, (LENGTH(field) - 2))
So if the fields value was 12345xx this would change the returned value to 12345.
moving innodb mysql tables & database to a new server November 11, 2008
Posted by chris in : SQL , add a commentA standard mysqldump will not work for moving InnoDB tables to a new server. It will create the tables and the structure, but will fail on inserting the data. Why? If you have any foreign key constraints mysqldump is not “smart” enough to insert the rows containing the primary keys before the rows using those keys as foreign keys. You will get MySQL error 150: Foreign key constraint is incorrectly formed.
There are two methods two get around this:
adding an innodb mysql foreign key with phpmyadmin November 11, 2008
Posted by chris in : SQL , add a commentIdeally you should be using MySQLs InnoDB engine in the database layer of your web applications. InnoDB is a “better” database engine and therefor a “better” web development practice. InnoDB allows for referential integrity and makes it easier to have a normalized database. In MyISAM you have to do this in your code, one little error and your database can get out of whack. In some cases MyISAM is the way to go. For instance, your client already has a MyISAM based system or you are developing an application that may be running on multiple platforms where you have no idea what the system administrator may have installed. MyISAM is the default storage engine for MySQL and is widely used, but InnoDB is the new kid on the block and should see widespread adoption in the coming years.
Below is a look at how to setup InnoDB foreign keys with phpMyAdmin. I will also provide the SQL statements used as well.
how to create a table with a datetime stamp in sql (mysql) November 6, 2008
Posted by chris in : SQL , add a commentThis one is easy and often comes in handy for logs in your web application. If you are using phpMyAdmin just create your table as your normally would and set the field that you would like to be a datetime stamp as a type of ‘timestamp’ and you’re done. Otherwise if you are doing this in the mysql shell use the following SQL statement as an example:
CREATE TABLE 'test' ( 'id' TINYINT NOT NULL , 'time' TIMESTAMP NOT NULL , PRIMARY KEY ( `'id' ) ) TYPE = MYISAM ;
Remeber when using a timestamp it will update the timestamp everytime the record is modified.
using perror to help debug mysql errors November 4, 2008
Posted by chris in : SQL, linux , add a commentPrint a description for a system error code or an error code from a MyISAM/ISAM/BDB table handler. Example, type “perror 150″ in the linux shell.
# perror 150 MySQL error code 150: Foreign key constraint is incorrectly formed
using functions in mysql where clauses breaks speed gains from indexing? August 8, 2008
Posted by chris in : SQL , add a commentI was conversing with a co-worker today and asked him to take a look at some queries of mine hoping he could determine why they were so sluggish. I had already ruled out a bad inner join, sub select, and lack of indexes. The query looked at a months worth of data using something like:
WHERE DATE(date_time) '2008-08-08'
For whatever reason this will cause MySQL (at least MySQL 4) to ignore the indexing on the date_time field. Changing your query to this:
WHERE date_time BETWEEN '2008-08-08 00:00:00' AND '2008-08-08 23:59:59'
Can save significant execution time. In my tests on a table with over 300,000 records this dropped by query execution time from roughly 4 seconds to .4 seconds, 10x faster! Using mysql functions in your select clause does not seem to negatively impact execution time.