The Full Monty- Part 3

October 21, 2011

This is one of the more boring parts, but you must lay a good foundation.

MySQL Setup:

All the resources must be in place before you configure Pacemaker. The default location for MySQL data is /var/lib/mysql.  You will be moving this to the DRBD file system in /data/mysql.   The default MySQL configuration is in /etc/my.cnf.  I move it to /data/mysql so updates will move with system failovers.   To keep life simple I use symlinks to point from /var/lib/mysql and /etc/my.cnf to their new locations.
Install the MySQL build of your choice on DB1, DB2 and DB3.

# rpm -Uhv http://www.percona.com/downloads/percona-release/percona-release-0.0-1.x86_64.rpm
# yum -y install Percona-Server-server-51.x86_64 Percona-Server-client-51.x86_64 \
 Percona-Server-shared-51.x86_64 xtrabackup.x86_64
# service mysql start
# mysql -e "CREATE FUNCTION fnv1a_64 RETURNS INTEGER SONAME 'libfnv1a_udf.so'"
# mysql -e "CREATE FUNCTION fnv_64 RETURNS INTEGER SONAME 'libfnv_udf.so'"
# mysql -e "CREATE FUNCTION murmur_hash RETURNS INTEGER SONAME 'libmurmur_udf.so'"
# service mysql stop
# chkconfig mysql off

On DB1, confirm the DRBD file system is mounted, then move the MySQL data onto it.

# df
Filesystem      1K-blocks    Used  Available Use% Mounted on
/dev/md2         11903664 1633608    9655620  15% /
/dev/md1           505508  218159     261250  46% /tmp
/dev/md0           256586   25967     217371  11% /boot
tmpfs             2037412   16348    2021064   1% /dev/shm
/dev/drbd1       30477552  176200   28753152   1% /data

On DB1 and DB3

# mkdir /data/mysql
# cp -prv /var/lib/mysql/* /data/mysql
# mv /var/lib/mysql /var/lib/mysql-empty
# ln -s /data/mysql /var/lib/mysql
# touch /data/mysql/my.cnf
# ln -sf /data/mysql/my.cnf /etc/my.cnf
# chown -R mysql.mysql /data/mysql
# chmod 644 /data/mysql/my.cnf

On DB2, make it ready for a failover.

# mv /var/lib/mysql /var/lib/mysql-empty
# ln -sf /data/mysql /var/lib/mysql
# ln -sf /data/mysql/my.cnf /etc/my.cnf

Configure MySQL (/data/mysql/my.cnf)

I have a documented my.cnf file I start with. You need to change the server-id number on each server.  The minimum settings you will need are:

[mysqld]
 log-bin=mysql-bin
 server-id=1
 innodb_flush_log_at_trx_commit=2
 sync_binlog=1
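
Since the data now lives on the DRBD volume, it can help to pin the paths explicitly. Here is a sketch of a slightly fuller my.cnf, assuming the directory layout above (the socket path matches the private .my.cnf we create later):

[mysqld]
 datadir=/data/mysql
 socket=/data/mysql/mysql.sock
 log-bin=mysql-bin
 server-id=1
 innodb_flush_log_at_trx_commit=2
 sync_binlog=1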

On DB1

# service mysql start

Now we need to add a couple of users before we install the replicator.  We also need to set a password for the root user.  The root password is blank, so just hit return.

If you have skip-name-resolve set, you will need to substitute IP addresses for the host names.

# mysql -h localhost -u root -p
  mysql> DROP USER ''@'db.grennan.com';
  mysql> DROP USER ''@'localhost';
  mysql> GRANT ALL on *.* TO 'root'@'%' IDENTIFIED by 'P@ssw0rd' with GRANT option;
  mysql> GRANT ALL on *.* TO 'root'@'localhost' IDENTIFIED by 'P@ssw0rd' with GRANT option;
  mysql> GRANT ALL ON *.* TO 'tungsten'@'127.0.0.1' IDENTIFIED BY 'P@ssw0rd' WITH GRANT OPTION;
  mysql> GRANT ALL ON *.* TO 'tungsten'@'localhost' IDENTIFIED BY 'P@ssw0rd' WITH GRANT OPTION;
  mysql> GRANT ALL ON *.* TO 'tungsten'@'db.grennan.com' IDENTIFIED BY 'P@ssw0rd' WITH GRANT OPTION;
  mysql> GRANT ALL ON *.* TO 'tungsten'@'db2.grennan.com' IDENTIFIED BY 'P@ssw0rd' WITH GRANT OPTION;
  mysql> GRANT ALL ON *.* TO 'tungsten'@'db3.grennan.com' IDENTIFIED BY 'P@ssw0rd' WITH GRANT OPTION;
  mysql> flush privileges;

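Before moving on, it is worth a quick sanity check from DB2 and DB3 that the tungsten account really can reach DB1 over the network. Something like this (using the example password from the grants above):

# mysql -h db.grennan.com -u tungsten -pP@ssw0rd -e "SELECT @@hostname"
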
Now that we can access MySQL from any of the servers, we need to prepare each to act as a master and a replica.  We can fix DB1 and then copy all of the data tables to the other servers.

  mysql> reset master; reset slave;
  mysql> exit
# service mysql stop
# rsync -rog --delete /data/mysql  root@db2:/data
# rsync -rog --delete /data/mysql  root@db3:/data
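
A dry run is a cheap way to confirm the copies are complete before starting anything on the other nodes. This is just a sketch; the -n flag makes rsync list what would change without changing anything:

# rsync -rogn --delete /data/mysql  root@db2:/data
# rsync -rogn --delete /data/mysql  root@db3:/data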

Remember we put the my.cnf file in the /data/mysql directory.  We need to make sure each has a unique server_id number.

# vi /etc/my.cnf
 server_id = 1
# ssh db2
# vi /etc/my.cnf
 server_id = 2
# exit
# ssh db3
# vi /etc/my.cnf
 server_id = 3
# exit

Rather than type the password for MySQL each time we connect, we can set up a private .my.cnf to prevent this. This should connect you to the master (RW) server from each host.

# vi ~/.my.cnf
[client]
 user=root
 password=P@ssw0rd
 host=db.grennan.com
 socket=/data/mysql/mysql.sock
# scp .my.cnf db2:.
# scp .my.cnf db3:.
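
To prove the private .my.cnf works, run a query from each host without typing a password; @@hostname and @@server_id tell you which server actually answered:

# mysql -e "SELECT @@hostname, @@server_id"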


Next up, configuring Heartbeat and making failover work!


What can I DROP?

September 30, 2011

So you have inherited a MySQL server, and no one knows which applications use which databases.   Space is very tight, and every night the transaction logs almost fill your hard disk.  The server was once used as a replication server, and it seems every database the company uses is duplicated here.

You could do a full dump, drop everything and wait for the requests to restore. (This might be OK on a test database.)

We could ask MySQL for all the databases that have tables that have been updated in the last 30 days.

$ mysql information_schema -e  \
"select DISTINCT TABLE_SCHEMA as NAME from TABLES
  WHERE UPDATE_TIME IS NOT NULL and UPDATE_TIME > NOW() - INTERVAL 30 DAY"  \
> Databases

But this only gives a list of databases that have changed.  What about tables that are only read from?

Bin-logs don’t help because they, too, only record updates.  What we really need is a long snapshot of the general log. But starting the general log will just fill your disk space even more.

As long as all queries are remote we can use tcpdump.

Filter all the MySQL connections for  ‘FROM tablenames’.

$ tcpdump -i eth1 -s 0 -l -w - dst port 3306 | strings | \
  sed -n 's/\(.*\)\(from.*\)/\2/p' | awk '{print $2}' > TableList
$ cat TableList | sort | uniq > ShortList

Some of the queries may reference database.tablename, so we can extract the database names and add them to the Databases list.

$ cat TableList | sed -n 's/\./ /p' | awk '{print $1}' | sort | uniq >> Databases

Now for the queries that only use table names. We need to find the database each belongs to.  This isn’t completely accurate because the same table name may exist in more than one database.

$ for x in `cat TableList | sort | uniq` ;
  do
    mysql information_schema -e "select TABLE_SCHEMA from TABLES where TABLE_NAME='$x'\G" \
    | grep TABLE | awk '{print $2}' ;
  done | sort | uniq >> Databases

To create the final list of databases that should not be dropped:

$ sort Databases | uniq > KEEPTHESE ; cat KEEPTHESE
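
One way to turn KEEPTHESE into a candidate drop list (a sketch using the files built above) is to diff it against every database on the server. Review the result by hand, and obviously never drop mysql or information_schema:

$ mysql -N -e "SHOW DATABASES" | sort > AllDatabases
$ comm -23 AllDatabases KEEPTHESE > DropCandidates
$ cat DropCandidates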

Now you can do the deed with some confidence you’re not removing anything being used.


P.S.  Did I tell you? Never do anything you can’t reverse.   Make a BACKUP FIRST!

The Full Monty- Part 2

August 15, 2011

Installing DRBD on CentOS 5.6.

In Part 1 I went through the process of preparing a number of CentOS 5.6 servers. Now we’ll make the services they perform more stable.

High Availability (HA)

I’ll be presenting two ways to provide redundant data and highly available services. First, Pacemaker with DRBD will duplicate your data at the disk partition level and watch for failures. Should the hardware fail, Pacemaker will take all the needed steps to start MySQL on the Hot Stand By (HSB). This is not perfect. Should someone run ‘rm *’ or drop a database, DRBD will duplicate the loss on the HSB.

In another part, I’ll use Tungsten Replicator. It offers a set of features that surpass the built-in MySQL replication. The community version of Tungsten has global transaction IDs, so even with many slaves, turning a slave into the master is easy. Tungsten Replicator is not an HA service; you have to fail over to a new master manually. Tungsten Enterprise (if you have money) solves all these issues. With the tools the enterprise version supplies you can easily migrate the master role to any slave.

HA with Pacemaker:

Neither Red Hat nor CentOS supplies Pacemaker packages. Red Hat supports its own proprietary clustering suite. CentOS is stuck trying to maintain compatibility with Red Hat while still giving you a high-availability system. CentOS does supply heartbeat and openais, but not Pacemaker. Thankfully Red Hat helps out by supporting the Fedora project, and in turn Fedora provides an EPEL repository for Red Hat 5.

The Pacemaker packages in Fedora’s EPEL directories are built against some additional packages that don’t exist on vanilla RHEL/CentOS installs. For more information on EPEL, see http://fedoraproject.org/wiki/EPEL/FAQ. So before installing Pacemaker, you will first need to tell the machine how to find the EPEL packages Pacemaker depends on. To do this, download and install the EPEL package that matches your RHEL/CentOS version.

LINBIT is the primary maintainer of DRBD and offers a product and service portfolio for exactly what we are building here. They have produced a video that takes you through this same process using the DRBD management console. I’m going to take you through the same process by hand. I hope this way you will better understand the management touch points.

DRBD Installation:

I build two computers alike in every way (clones) and use DRBD to sync the data partitions on each. CentOS provides the packages needed.

 # wget http://www.clusterlabs.org/rpm/epel-5/clusterlabs.repo  
 # mv clusterlabs.repo /etc/yum.repos.d  
 # rpm -Uvh http://download.fedora.redhat.com/pub/epel/5/i386/epel-release-5-4.noarch.rpm  
 # yum -y install pacemaker.x86_64 heartbeat.x86_64 drbd83.x86_64 kmod-drbd83.x86_64
 # /sbin/chkconfig --add heartbeat  
 # depmod  
 # modprobe drbd
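
Before touching the configuration, confirm the module actually loaded; something like:

 # lsmod | grep drbd
 # cat /proc/drbd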

DRBD Configuration:

On both machines (db1 and db2), edit the DRBD configuration file. The host names must be the same as those returned by the ‘hostname’ command. You may also need to edit the host name in /etc/sysconfig/network. You will need to edit the host names, IP addresses and shared-secret below to match your environment.

To replace the shared-secret, select at least half of the characters from the “63 random alpha-numeric characters” line of Gibson Research’s Ultra High Security Passwords page.

 # vi /etc/drbd.conf
 include "drbd.d/global.conf";
 include "drbd.d/*.res";
# vi /etc/drbd.d/global.conf
global { usage-count yes; }
common { startup { degr-wfc-timeout 0; }
net { cram-hmac-alg sha1; shared-secret R4x2alEkxtIg2kzbXqUL6l4uoTI7Ab7Qt; }
disk { on-io-error detach; } }
# hostname
 db1.grennan.com
# vi /etc/drbd.d/db.res
resource db { protocol C; syncer { rate 10M; }
on db1.grennan.com { device /dev/drbd0; disk /dev/md3; address 192.168.4.1:7788; flexible-meta-disk internal; }
on db2.grennan.com { device /dev/drbd0; disk /dev/md3; address 192.168.4.2:7788; flexible-meta-disk internal; } }
 # scp -r /etc/drbd.d db2:/etc
 # scp /etc/drbd.conf db2:/etc
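
A quick syntax check on both nodes saves head-scratching later. ‘drbdadm dump’ parses the configuration and prints it back, so any typo shows up as an error:

 # drbdadm dump db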

Manage the DRBD processes:

If the disk was formatted during the OS install you may need to erase the ext3 file system info on both DB1 and DB2.

 # umount /data
 # dd if=/dev/zero of=/dev/md3 count=2048

Write the DRBD meta data on both DB1 and DB2.

# drbdadm create-md db

On >>> DB1 only

  # drbdadm adjust db 
  # drbdsetup /dev/drbd0 primary -o
  # service drbd start

On DB2

  # service drbd start

Did you miss the >>> DB1 only

WAIT until the sync process completes.

Back on DB1 create the file system and mount it.

  # mkfs -j /dev/drbd0
  # tune2fs -c -1 -i 0 /dev/drbd0
  # mkdir /data
  # mount -o rw /dev/drbd0 /data

Don’t forget to make the /data directory on DB2.

  # mkdir /data
  # cat /proc/drbd
version: 8.3.8 (api:88/proto:86-94)
 GIT-hash: d78846e52224fd00562f7c225bcc25b2d422321d build by mockbuild@builder10.centos.org, 2010-06-04 08:04:09
1: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r----
 ns:2433920 nr:0 dw:0 dr:2433920 al:0 bm:148 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:28530088
 [==>.................] sync'ed: 21.9% (27860/30236)M delay_probe: 757
 finish: 4:57:11 speed: 1,248 (6,404) K/sec
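
Rather than re-running cat by hand, you can watch the sync tick along:

  # watch -n 10 cat /proc/drbd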

In the next post I’ll go through installing MySQL in preparation for configuring Pacemaker. Then I’ll show you how to test failover.


PostGIS 1.5 in Postgresql 9.0 install on CentOS 5.6

July 14, 2011

I love short and concise install instructions. I know this is a MySQL blog, but our good friend PostgreSQL has a great GIS library. This is what I learned upgrading our PostGIS system to GIS 1.5. Much thanks to Jeremy Tunnell for giving this document its start.

Start with CentOS 5.6 x86_64 basic install.

Add the PostgreSQL Yum repository to your system.

 $ wget http://yum.pgrpms.org/reporpms/9.0/pgdg-centos-9.0-2.noarch.rpm
 $ rpm -i pgdg-centos-9.0-2.noarch.rpm

Another location for these is DAG. I have not tried those, so your results may vary.

You will need to exclude the packages CentOS provides by adding two lines to the BASE and UPDATE sections of /etc/yum.repos.d/CentOS-Base.repo. They are:

exclude=postgresql*
exclude=geos*
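
For example, the [base] section of CentOS-Base.repo would end up looking something like this (the [updates] section gets the same treatment). Note that yum honors only the last exclude= line it sees in a section, so it is safest to combine both patterns on one line:

[base]
name=CentOS-$releasever - Base
mirrorlist=http://mirrorlist.centos.org/?release=$releasever&arch=$basearch&repo=os
exclude=postgresql* geos*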

Now you are ready to install the needed packages. This includes proj version 4 and geos version 3.

 $ yum install postgresql90-contrib.x86_64
 $ yum install postgis90.x86_64
 $ yum install postgresql90-server
 $ yum install proj
 $ yum install geos
 $ yum install php-pear
 $ yum install php-devel

GEOS

There may be some dependencies you will have to work through. If you are using PDO and haven’t installed the pgsql PDO drivers, you can do it now:

 $ pecl install pdo_pgsql

PL/pgSQL

Now you are ready to initialize the database files and start PostgreSQL.

 $ service postgresql-9.0 initdb
 $ service postgresql-9.0 start

Now you can change to the postgres user and begin installing the functions for GIS. You have to start with defining the language.

 $ su - postgres
 $ psql
 # create language plpgsql ;
 # \q

Now you can create your database and add the GIS functions calls to it.

 $ createdb geos
 $ psql -d geos -f /usr/pgsql-9.0/share/contrib/postgis-1.5/postgis.sql
 $ psql -d geos -f /usr/pgsql-9.0/share/contrib/postgis-1.5/spatial_ref_sys.sql

Now you can verify the install.

 $ psql geos
 # select postgis_full_version();
                                             postgis_full_version
---------------------------------------------------------------------------------------------------------
 POSTGIS="1.5.2" GEOS="3.2.2-CAPI-1.6.2" PROJ="Rel. 4.7.1, 23 September 2009" LIBXML="2.6.26" USE_STATS
(1 row)

For more on using PostGIS check out Jeremy’s “PostGIS part 2”.


automysqlbackup 2.5.5

July 12, 2011

I spent my day doing updates to the automysqlbackup script.  Here is some of what I’ve added over the last year.

The bug number fixes are from SourceForge.  https://sourceforge.net/tracker/?atid=628964&group_id=101066&func=browse

# 2.5.5 MTG – (2011-07-21)
#    – Bug – Typo Ureadable Unreadable config file line 424 – ID: 3316825
#    – Bug – Change “#!/bin/bash” to “#!/usr/bin/env bash” – ID: 3292873
#    – Bug – problem with excludes – ID: 3169562
#    – Bug – Total disk space on symbolic links – ID: 3064547
#    – Added DEBUG option to only print the commands that will be executed.
#    – Bug – WHICH command didn’t work if there was a WHICH alias.
# VER 2.5.4: MTG – (2011-01-28)
#    – fixed bug in rsync process.
#    – Added the ability to backup only a single table by naming the table like
#      database.table
# VER 2.5.2-02:  MTG – (2010-12-29)
#    – Added file permission settings (chmod) to directory and file creation points.
# VER 2.5.2-01:  MTG – (2010-11-06)
#    – Added ‘-R’ to the mysqldump options to include stored procedures in the backup
#      by default. (suggested by Zack Evans)
# VER 2.5.2:  MTG – (2010-11-04)
#    – Added option to archive (rsync) the local backup files to a remote location
#      using the COPYDIR variable.
#    – Added option to copy files into a directory based on the hostname using the
#      variable HOSTNAME.  This allows the script to be run from a shared storage directory
#      (SMB, NFS, NetApp) with each server’s data kept in a separate place.
#    – Added option to backup all database schemas only using variable FULLSCHEMA.
#    – Added option to backup the MySQL configuration file, my.cnf, and remove files older than seven
#      days from the BACKUPDIR directory.
#    – Added --master-data=2 and --single-transaction to include a comment with the master server’s
#      binary log coordinates. If used, the CHANGE_MASTER_TO line must be uncommented.
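
Putting the new options together, the relevant lines of a backup configuration might read something like this. The variable names come from the changelog above, but the values are only illustrative, so check the comments at the top of the script for the exact format:

COPYDIR=/mnt/backups            # rsync the local backup files to this location (illustrative path)
HOSTNAME=`hostname`             # keep each server's backups in its own directory on shared storage
FULLSCHEMA=yes                  # also dump all database schemas only
DEBUG=no                        # set to yes to print commands instead of executing them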

The project is at https://sourceforge.net/projects/automysqlbackup/

You can download my version at http://www.mysqlfanboy.com/Files/automysqlbackup.sh

Email me or leave a comment if you have any trouble.

mg_hot_replace_table.pl

May 1, 2010

Do you have MyISAM tables you reload with new data?

Do your queries, using that table, get blocked because the table is locked?

Do the waiting queries create idle connections slowing down the table load?

Do you wish you could just replace the table?

Years ago I was told you can replace CSV tables by simply replacing the CSV file. I figured this would also be true of a MyISAM file, and it is. I use this Perl script to replace MyISAM tables holding forecast and current-observation weather data. The processing and tables are created on another computer; weather forecasting is CPU and database expensive. I then copy (rsync) the files to the production system and run this script.

#!/usr/bin/perl
################################################################################
################################################################################
# mg_hot_replace_table.pl - Hot Replace a MySQL table.
#
# 2010-05-01 Mark Grennan - Weather Decision Technologies Inc.
#
################################################################################
################################################################################
 use DBI;
 use File::Copy;
 use Getopt::Long;
 use Pod::Usage;
################################################################################
# These items need to be modified for your needs.
################################################################################
 $mydir="/data/mysql";           # MySQL data directory
 $stage="$mydir/stage";          # Directory containing the NEW data
 # database information
 $db="point_forecast";           # Database that containes the table to be replaced.
 $tb="uv_hourly";                # Table being replaced.
 $host="127.0.0.1";              # IP or FQDN of the MySQL server (localhost)
 $port="3306";                   # Port used by localhost (3306)
 $userid="dba";                  # User with access to database.
 $passwd="dbap@sswd";            # Password for the user.
################################################################################
# take command line options
################################################################################
 $help=0;
 &GetOptions("mydir=s" => \$mydir, "stage=s" => \$stage, "u=s" => \$userid, "p=s" => \$passwd, "port=i" => \$port, "db=s" => \$db, "tb=s" => \$tb, 'help|?' => \$help);
 if ($help) {
 &usage;
 exit 1;
 }

################################################################################
# main
################################################################################

#&check_table;
print "Connecting to database $db.$tb on $host:$port\n";
# make connection to database
$connectionInfo="DBI:mysql:$db:$host:$port";
$dbh = DBI->connect($connectionInfo,$userid,$passwd)
 or die "Can't connect to $host:$port: $DBI::errstr";
&lock_table;
&flush_table;
&move_files;
&flush_table;
&unlock_table;                  # release the write lock before disconnecting
$sth->finish()
 or die "Couldn't execute statement: " . $sth->errstr;
print "Disconnecting from MySQL.\n";
$dbh->disconnect        # disconnect from database
 or die "Couldn't disconnect: " . $dbh->errstr;

print "DONE\n";
exit 0;

################################################################################
# Subroutines
################################################################################
sub lock_table {                # Lock table to prevent READs
 print "Locking Table\n";
 $query = "LOCK TABLES $tb WRITE;";
 $sth = $dbh->prepare($query);
 $sth->execute()
 or die "Couldn't execute statement: " . $sth->errstr;
}
sub unlock_table {              # Un-Lock table to allow use.
 print "UnLocking Table\n";
 $query = "UNLOCK TABLES;";
 $sth = $dbh->prepare($query);
 $sth->execute()
 or die "Couldn't execute statement: " . $sth->errstr;
}
sub flush_table {               # Flush table cache data.
 print "Flushing Table\n";
 $query = "FLUSH TABLE $tb";
 $sth = $dbh->prepare($query);
 $sth->execute()
 or die "Couldn't execute statement: " . $sth->errstr;
}
sub move_files {                        # Move new data to table files being used.
 print "Moving Table Files\n";
#    `rm -f /data/mysql/$db/$tb.frm /data/mysql/$db/$tb.MYD /data/mysql/$db/$tb.MYI`;
#    `mv $stage/$tb.frm  $stage/$tb.MYD $stage/$tb.MYI /data/mysql/$db`;
 if (-e "$stage/$tb.frm" and -e "$stage/$tb.MYD" and -e "$stage/$tb.MYI" ) {
 move("$stage/$tb.frm", "$mydir/$db");
 move("$stage/$tb.MYD", "$mydir/$db");
 move("$stage/$tb.MYI", "$mydir/$db");
 } else {
 print "There is a file missing! Files NOT moved.\n";
 }
}
sub check_table {
 $status = `mysqlcheck $stage $tb`;
 print "Status is  $status";
 if ($status) {
 print "New data table is BAD!\n";
 exit 2;
 }
}
sub usage ($) {
 my ($message)= @_;
 if ( $message )
 {
 print STDERR "$message\n";
 }
 # Usage text below is a minimal reconstruction; the original post was truncated here.
 print STDERR "Usage: mg_hot_replace_table.pl [--mydir dir] [--stage dir] [--db database] [--tb table]\n";
 print STDERR "       [-u user] [-p password] [--port port] [--help]\n";
}
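
A typical invocation, with the option names from the GetOptions block above and the example values from the configuration section, would look like:

# ./mg_hot_replace_table.pl --db point_forecast --tb uv_hourly --stage /data/mysql/stage -u dba -p 'dbap@sswd'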

Jonathan Schwartz leaves Sun/Oracle

April 29, 2010

I read in his blog that Jonathan Schwartz is leaving Sun/Oracle.  Jonathan was the CEO of Sun Microsystems, Inc.

Jonathan has been the force at Sun driving open source.  His voice will not be heard at Oracle, so I’m wondering if this will change the way projects like MySQL, OpenSolaris and OpenOffice are managed.

MySQL master/slave support merged into Linux-HA

April 21, 2010

(Re-posted from Florian’s blog.)

MySQL replication support for the Pacemaker cluster manager (the stuff that we explained in this webinar) has made it into the Linux-HA resource agents default branch. If you are interested in testing — and you should! — please read the extended announcement. Feedback is extremely welcome on the linux-ha-dev mailing list.

We are expecting to release this as part of resource-agents 1.0.4, in late May/early June.

Developer Tips using MySQL

April 19, 2010

Application developers ask me, “How do you optimize MySQL?”  I do lots of things that don’t really relate to a developer; for instance, I analyze what percentage of queries are being served from the cache.  What a developer can do to optimize the SQL they write is a different question.   So here is a quick list of things application developers should know about MySQL.

Explain will analyze your query.

This example shows the possible indexes (keys) that could be used and the index that was selected.  2,262 rows were examined and then sorted (Using filesort), and one record was returned (LIMIT 1).

mysql> explain SELECT 5/9*(temp_F-32) as t, 5/9*(dewpt_F-32) as td, speed_mps as spd, dir
    -> FROM metar_nc
    -> WHERE stn='KLDM' and date_time > '2010-02-12 17:45'
    -> ORDER BY ABS( date_time - CAST('2010-02-12 18:00:00' as datetime) ) limit 1;
+----+-------------+----------+------+-----------------------+------+---------+-------+------+-----------------------------+
| id | select_type | table    | type | possible_keys         | key  | key_len | ref   | rows | Extra                       |
+----+-------------+----------+------+-----------------------+------+---------+-------+------+-----------------------------+
|  1 | SIMPLE      | metar_nc | ref  | PRIMARY,stn,date_time | stn  | 8       | const | 2262 | Using where; Using filesort |
+----+-------------+----------+------+-----------------------+------+---------+-------+------+-----------------------------+

Using profiling can give you even more information.  Don’t forget to turn it off with ‘set profiling=0’ when you are done.

mysql> set profiling=1;
mysql> SELECT ... ;   -- run the query you want to profile here
mysql> show profile;
+--------------------+----------+
| Status             | Duration |
+--------------------+----------+
| starting           | 0.000110 |
| Opening tables     | 0.000014 |
| query end          | 0.000004 |
| freeing items      | 0.000008 |
| logging slow query | 0.000002 |
| cleaning up        | 0.000003 |
+--------------------+----------+
6 rows in set (0.01 sec)
mysql> set profiling=0;


Indexing Basics

Avoiding disk reads is the name of the game.  Indexes are presorted and small.  Two or three disk reads of an index can point to a large amount of data.

  • MySQL can only use prefixes of the index.
    mysql> SELECT AVG(age) FROM user GROUP BY city;
    This is a complex query that needs to scan all the rows. You can make it traverse shorter indexes by adding an index on (city,age).
  • Index (A,B) can be used for
    WHERE A=5 , WHERE A=6 AND B=5 , WHERE A=7 AND B>5
    It can’t be used for WHERE B=6, because the leading column A is missing.
  • Only equality/list conditions allow second key part usage
    WHERE A=5 AND B>6 – will use 2 key parts
    WHERE A IN (1,2) AND B=2 – will use 2 key parts
  • A>5 AND B=2 will use 1 key part only
    The B=2 will be checked while reading the row/index only
  • A=5 ORDER BY B – will use the index
    A>5 ORDER BY B – will NOT use the index
  • For simple cross-reference look-ups, add the data to the index to skip the data read.
    SELECT name FROM user WHERE login='Jack123';

If this is a very common part of your code, make the index (login,name).  When the index is read, the data is in memory. But don’t add every column: you’ll just double the disk space and magnify the disk access.
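
As a sketch, using the table and columns from the example above, the covering index and a check that it is being used might look like:

mysql> ALTER TABLE user ADD INDEX login_name (login, name);
mysql> EXPLAIN SELECT name FROM user WHERE login='Jack123';

If the Extra column of the EXPLAIN shows “Using index”, the query was answered from the index alone, with no data-row read.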

More Tips

  1. Check that all tables have PRIMARY KEYs on columns with high cardinality. Primary keys must be unique.
    A column like `gender` has low cardinality (selectivity); an id column (INT, auto-increment) is a good candidate to become a primary key.
  2. All joins (inner, outer, ‘,’) should have indexes.
  3. Fields you often search on (appear frequently in WHERE, ORDER BY or GROUP BY clauses) need indexes.
    But don’t add too many: the worst thing you can do is to add an index on every column of a table.
  4. Don’t use DISTINCT when you have or could use GROUP BY
  5. Open the connection to the server just before you are going to use it.  Unless you are using a persistent connection library, don’t open a database connection and then run minutes of calculations before making your query.  You may find your connection has “gone away” before you make your query.
  6. When you index many columns, create a hash column (see the sketch after this list). Then your query will look like:
    SELECT *
    FROM table
    WHERE hash_column = MD5( CONCAT(col1, col2) )
    AND col1='aaa' AND col2='bbb';
  7. Use less RAM by declaring columns only as large as they need to be to hold the values stored in them.
    Use CHAR type when possible (instead of VARCHAR, BLOB or TEXT) — when values of a column have constant length: MD5-hash (32 symbols) or ICAO or IATA airport code (4 and 3 symbols). This is also true for indexes.  If only the last 4 symbols are unique, index only that part.
  8. Use SQL_NO_CACHE when you are SELECT-ing frequently updated data or large sets of data.  This way you will not kick good data out of the cache.
  9. Avoid wildcards at the start of LIKE queries.  (LIKE ‘%find%’).  Finding ‘1234find’ in 10000 records requires up to 40,000 searches.
  10. Normalizing redundant data is good, but don’t split a table just because it has many columns.
  11. Think of storing user session data (or any non-critical / high-access data) in a MEMORY table — it’s very fast.
  12. Divide complex queries into several simpler ones — they have more chances to be cached, so will be quicker.
  13. A column must be declared as NOT NULL if it really is. This speeds up table traversing.
  14. If you usually retrieve rows in the same order like expr1, expr2, …, run ALTER TABLE … ORDER BY expr1, expr2, … to optimize the table.
  15. Don’t use a PHP loop to fetch rows from the database one by one just because you can — use IN instead, e.g.
    SELECT *
    FROM `table`
    WHERE `id` IN (1,7,13,42);
  16. Reuse your database connections.  Opening a new connection to the database will add one or more seconds to your query.
    In PHP use mysql_pconnect() to open a persistent connection with mod_php. Perl provides persistent connections with Apache::DBI with mod_perl. Python does not have persistent connections in mod_python.  But you can maintain them in your application.  (http://www.modpython.org/FAQ/faqw.py)
  17. When inserting data, insert only those values that differ from the defaults. This reduces the query parsing time.
  18. Use INSERT DELAYED or INSERT LOW_PRIORITY (for MyISAM) to write to your change log table.
    Also, if it’s MyISAM, you can add DELAY_KEY_WRITE=1 option — this makes index updates faster because they are not flushed to disk until the table is closed.
  19. For your web application, images and other binary assets should normally be stored as files. That is, store only a reference to the file rather than the file itself in the database.
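
For tip 6, here is a sketch of building the hash column, with the placeholder names used above. Keep the column current from your application (or a trigger) whenever col1 or col2 change:

mysql> ALTER TABLE `table` ADD COLUMN hash_column CHAR(32) NOT NULL DEFAULT '';
mysql> UPDATE `table` SET hash_column = MD5(CONCAT(col1, col2));
mysql> ALTER TABLE `table` ADD INDEX (hash_column);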

Mark Grennan

MySQL Conference 2010

April 19, 2010

The big question was “Where is Oracle taking MySQL?”

Edward Screven spoke to assure everyone that Oracle will continue to grow the open source side of the company, and gave several examples of the ways MySQL has already been improved over the last year. Oracle has integrated the MySQL and InnoDB teams, resulting in performance increases of up to 35 percent for MySQL databases operating with several hundred concurrent connections. Screven stated Hot Backup had been a separate product but will now be rolled into the enterprise edition of MySQL, and InnoDB will be the default engine in the next update.

Screven said, “I expect that core features will end up in community edition. There will be some value-add, like monitoring or backup, that make sense in the enterprise edition.” I take this to mean Oracle will continue to make MySQL more marketable. To me, this ensures MySQL will not be going away.

