Percona Live – MySQL Conference

April 25, 2013

Mark GrennanThis is my takeaway from this years MySQL conference in San Francisco.

The community believes Oracle has been a good steward of MySQL and delivered some great code in 5.6. Studies from 451 Research show MySQL is on the decline and a lot of this has to do with the ownership my Oracle and growth of no-SQL.

Big data, Cloud, hadoop and MongoDB are still the on the lips of the “JSON Generation”.  Marty Weiner and Yashh Nelapati (Pinterest) will tell you to start in the Cloud (Amazon) and grow until you can’t stand it. Chris Schneider (Ning.com) will tell you there are lots of tools to make hadoop easy to integrator.

As a DBA and System Administrator I see that computer systems are shrinking and CPU densities have reach the point that racking and installing is dead.  Linux Torvald said “If we are still using the same operating systems and using computers in the same way in 50 or 100 years from now then we have done something wrong“.  In only five years I think we are starting to see this.  We still have the LAMP stack but  it’s installed for us by scripts on computers we’ve never touched.

There was about a 15% growth in attendants this year and maybe a 10% decrease in vendor participation.   This week TokuDB v7 went open source. Still, I feel the community is suffering. Some faces you always expect to see where not there, like Sheeri who was gone “… because I opted to stay home after a crazy year of travel…” and may not be back because “until there is an acceptable Code of Conduct“.

Everyone had fun at the SkySQL booth having their likness drawn by Doug Shannon of  EventToons  and Julian Cash returned this year to take everyone s photo.

I learned a lot and had fun.  I’d like to thank ScaleArc for sponsoring my trip this year.  I’ll be posting more about ScaleArc soon.

 

Ten years of MySQL developer conference

April 23, 2013

IMG_1640[2]

This is the tenth year for the MySQL developer conference and my third.

I’ll be posting some news about iDB From ScaleArc. Oracle is back this year. Everyone is fighting with developers wanting to move away from MySQL. And there is lots of talk about 5.6 and working in virtual environments.

 

 

Making a MySQL slave with ZERO downtime

April 10, 2013

It’s about time for this year’s MySQL (Percona) conference in San Francisco.  I haven’t written anything in some time.  So, I thought I’d write something we can talk about should you meet me.

You need to make a slave of your MySQL server.

What if, you’re bonus was tied to database up time or maybe the systems group will not grant you any down time?  You’re ONE Master DB server is so big, coping tables takes hours. Even an ‘scp’ sync takes too long.  99.999% uptime is hard to achieve.

Percona and their tools are your friend… mostly. Xtrabackup will create a ready to use copy of all your databases, document your my.cnf and your current Master Position…  unless it doesn’t.

Xtrabackup uses it’s own XtraDB engine to copy the innodb tables so it doesn’t block master transactions. (Very cool) But that just copies the innodb tables.  All the other files/tables are just copied (cp-ed).

What if, your local disk space is more than 50% full?  You will need to copy it to a different system.  Maybe that space is on a NFS server and maybe that service is mounted with automount.

Now you learn ‘cp’ is not reliable.  (Timeouts? SAN problems? I’m a DBA not a sysadmin and have you read the bastard operator from hell? )

You learn, Xtrabackup can use ‘scp’.  Now all will go…  better.  Now it only fails about one out of six times.  You learn a week later when the developers tell you they are getting odd results half the time.

Odd thing, if you start a slave with all the Innodb data intact and all database (directories) but not all the tables, the replication process will not complain.  It will just ignore the missing tables like they were “black holes”.

So you add a backup to your backup script.  (See Below, the part in RED). You use your “fixed” backup script to create a slave. It only takes six hours for the slave to catch up with the master.  All goes well…  for about a week.

 

#!/bin/bash
export HOST="db01"
export BKDIR="/mnt/archive/xtrabackup"

if [ ! -d $BKDIR/$HOST ]
then
/bin/mkdir $BKDIR/$HOST
fi

#
# archive Old backup
#
export DT=`/bin/date +20%y-%m-%d_%H-%M-%S`
cd $BKDIR/$HOST/`/bin/date +20%y`*
tar zcf $BKDIR/$HOST/$HOST-$DT.tgz ./*

#
# Remove old backups
#
/bin/find $BKDIR/$HOST/$HOST-* -mtime +1 -exec rm {} -Rf \;

#
# Remove last backup
#
rm -Rf $BKDIR/$HOST/`/bin/date +20%y`*

#
# Create backup
#
export DT=`/bin/date +20%y-%m-%d_%H-%M-%S`
/usr/bin/innobackupex --rsync $BKDIR/$HOST

#
# Check for missing files
#
export d=`ls $BKDIR/$HOST | grep ^20`
cd /data/mysql
for f in `find . -regex ".*\.\(frm\|MYD\|MYI\|MAD|\MAI\|MRG\|TRG\|TRN\|ARM\|ARZ\|CSM\|CSV\|opt\|par\)" -printf "%P\n"` ;
do
if [ ! -f $BKDIR/$HOST/$d/$f ]
then
echo coping $f
rsync $f $BKDIR/$HOST/$d/$f
fi
done
/usr/bin/xtrabackup --prepare --target-dir=$BKDIR/$HOST/$d
echo DONE!

 

You check and… All the innodb tables are all there.  Only a few “less” critical myisam tables are missing.  You take care of the missing tables with mysqldump. The app guys check again for missing data and NO JOY.   You forgot to restart the mysql slave after you duped/copied the tables and the replication engine doesn’t know about them.

 

Again Percona to the rescue. You use PT-Table-Checksum to verify all the tables are in sync with the master. The DIFFS column show all 0 (zeros).  All is good… or not.  Again you find out from the developers that “some data is missing”.  (They don’t tell you the database or table of course.)

Time for some “old school” testing.

$ for x in `mysql -h db01 -e "show databases"`; do mysql -h db01 $x -e "show tables"; done > bar
$ for x in `mysql -e "show databases"`; do mysql $x -e "show tables"; done > foo
$ diff foo bar
99a100,101
>; forecast_20130318_1705
>; forecast_20130318_2002
105a108
>; forecast_gfs_new

 

Sure enough, you turn up missing and added tables because you’re copy wasn’t quite a “Point in time”.  The developers like to do table renames with their “load data from file” processes.  The myisam data was changing between the time you started your backup and the time you created your slave.

So a few mysqldumps, tables drops, restarts and you got it.


Do you have weeks like this?

 

 

In place replacement / upgrade – Redhat and Centos

November 13, 2012

For many years I have recommended startup companies install and work with CentOS until they need and can afford RedHat Enterprise.  The idea, because the two are binary compatible, the applications the company uses and develops will migrate between the two without changes.  I have even said you can do an inplace migration between them.

I’ve been challenged to prove an inplace migration is possible.  This is the results.

There are a couple packages that define a CentOS/Redhat system.

  • The centos/redhat-release package

This package hold the release files, product GPG keyes and repository config files.

  • redhat-logos

This file hold the proprietary artwork owned by the vendor.

You might have packages installed not provided by your vendor.  You can list all the vendor names of installed packages with this:

rpm -qa --qf "%{VENDOR} \n" | sort | uniq

Dell Inc
CentOS

In this example “Dell Inc” packages are also installed. You can list the packages installed for a single vendor with this command:

rpm -qa --qf "%{NAME} %{VENDOR} \n" | grep "Dell Inc" | cut -d ' ' -f 1 | sort

 

Migrating CentOS to Redhat

You’re starting with an existing CentOS system.  I’m using a CentOS 6.2 x86_64 minimum install not updated.  I’m migrating to Redhat Enterprise Server 6.3 x86_64.

Why a minimum install?  Because this is what I recommend for application servers like MySQL.

This process is simplest if you are moving to the next minor release. A migration to the same release (6.2 CentOS to 6.2 RHEL) is possible but will require you manual replace more packages.  I can not recommend a migration to the next major release.  Although a migration from 5.8 to 6.0 might work, configuration files and sometimes whole subsystems are replaced.

To do an update you need a repository to update from.  You can use a local Redhat repository if you have one, or you can use a Redhat CD.

mkdir -p /mnt/cdrom
mount /dev/cdrom /mnt/cdrom

mv /etc/yum.repo.d/CentOS* /root
vi /etc/yum.repo.d/rhel.repo

Type this into the file.

[rhel]
name=redhat enterprise linux
baseurl=file:/mnt/cdrom
gpgcheck=0

You need to remove just a few files if they exist.

yum clean all
rpm -e --nodeps centos-release
rpm -e --nodeps centos-release-cr
rpm -e --nodeps centos-indexhtml
rpm -e --nodeps redhat-logos
rpm -e --nodeps yum-plugin-fastestmirror
rpm -e --nodeps yu

Now you can install them back from the CD repository.

yum -y install redhat-release redhat-logos yum-rhn-plugin

Then update to the latest release.

yum update

Now you need to register/entitle  the server with Redhat.

rm /etc/yum.repo.d/rhel.repo

REBOOT

To purge your system of the last of the CentOS packages (now that you are registered) do:

yum reinstall $(rpm -qa --qf "%{NAME} %{VENDOR} \n" | grep "CentOS" | cut -d ' ' -f 1 | sort)

You may still have a few packages left.  The kernel doesn’t get replace unless you force it and some packages are specific to CentOS.  The boot loader for example.  (Do you care?)  In my test I still had:

kernel
libxml2
libxml2-python
plymouth
plymouth-core-libs
plymouth-scripts

How you have upgraded to the next release but you still need to update bug fixes from the Redhat repository.

yum update

With all updates, you should go through your system and look for *.rpm.save and *.rpm.new files.  These are created because of changes made to configuration files on your system.  If the OS thinks it can’t safely replace your .conf file with it’s you get a .rpm.new.  If the OS knows it has to replace your .conf file for the application to work you’ll get a .rpm.save.  Either way you should review the differences and correct what’s needed.

find / -name*.rpm.save’
find / -name*.rpm.new’

 

Full install Test

To really test this process I did a FULL install with 1,858 packages.  I installed CentOS 6.2 and selected every package I could. I didn’t select virtualization because I was running the test in VMware. I also didn’t select the mainframe option. (Anyone want to lend me a mainframe?)

The full install went almost as simply as the minimum install except, I had to uninstall firefox before I did the ‘yum update’.

rpm -e --nodeps firefox

Of the 1858 packages (not including dependencies) 1077 were updates done by the ‘yum update’ command. This left 1325 packages that were the same release as CentOS.  You could leave them in place.  I found the system worked fine.  I tested many of the stock applications  in X-windows without issues.

The purge of CentOS files with “yum reinstall” also went well.  It found another 1229 packages leaving 97 packages still manufactured by CentOS.  Most of the remaining packages where packages that had Redhat equivalents but the package names didn’t match.  For example:

CentOS Redhat
at-spi.x86_64 at-spi-1.28.1-2.el6.x86_64
at-spi-python.x86_64 at-spi-python-1.28.1.-2.el6.x86_64
redhat-bookmarks redhat-bookmarks-6.1.el6
redhat-lsb-graphics redhat-lsb-graphics-4.0-3.el6
system-config-date system-config-date-1.9.60-1.el6

You can go through by hand and replace these if you are just have to get rid of CentOS 100% or your crazy.  Again, the system works just fine without replacing these files.

Some packages not have a replacement on the CD. Most are like OpenOffice. If you need such a thing you might have just installed the CentOS version anyway.  Maybe these are available in the EPEL repo?

CentOS
autocorr-en ?
ekiga ?
farsight2 ?
gssdp ?
gupng ?
hyphen ?
hyphen-en ?
iwl6000g2b-firmware ?
libnice ?
libpurple ?
libtextcat ?
lpsolve ?
mythes-en ?
opal ?
openchange ?
openoffice ?
pidgin ?
ptlib ?
samba4-libs ?

Cruft

Maybe for my next trick I’ll build a list of files from a CentOS to Redhat install and a straight from CD Redhat install and compare them to see if there are any crumbs left over.

 

In place upgrade – Redhat to Centos

These instructions are from the CentOS Wiki – http://wiki.centos.org/HowTos/MigrationGuide

To migrate from RHEL6 to CentOS 6 fetch the latest versions of the following packages for your architecture and put them in a temporary directory. Rather than fetching packages with wget, as shown below, the CentOS-6 installation media can be mounted and packages installed from it.
For CentOS 6.2 x86_64

yum remove rhnlib abrt-plugin-bugzilla redhat-release-notes*
rpm -e --nodeps redhat-release redhat-indexhtml
cd /tmp
wget http://mirror.centos.org/centos/6/os/x86_64/Packages/centos-release-6-2.el6.centos.7.x86_64.rpm
wget http://mirror.centos.org/centos/6/os/x86_64/Packages/centos-indexhtml-6-1.el6.centos.noarch.rpm
wget http://mirror.centos.org/centos/6/os/x86_64/Packages/yum-3.2.29-22.el6.centos.noarch.rpm
wget http://mirror.centos.org/centos/6/os/x86_64/Packages/yum-plugin-fastestmirror-1.1.30-10.el6.noarch.rpm
rpm -Uvh *.rpm
cd ..
yum clean all
yum upgrade

One might want to download and add a few more look-and-feel items, or load them from the CentOS DVD:

rpm -Uvh redhat-bookmarks-6-1.el6.centos.noarch.rpm redhat-logos-60.0.14-10.el6.centos.noarch.rpm \
redhat-menus-14.0.0-3.el6.noarch.rpm redhat-rpm-config-9.0.3-34.el6.noarch.rpm

With all updates, you should go through your system and look for *.rpm.save and *.rpm.new files.  These are created because of changes made to configuration files on your system.  If the OS thinks it can’t safely replace your .conf file with it’s you get a .rpm.new.  If the OS knows it has to replace your .conf file for the application to work you’ll get a .rpm.save.  Either way you should review the differences and correct what’s needed.

find / -name*.rpm.save’
find / -name*.rpm.new’

CentOS Redhat
at-spi.x86_64 at-spi-1.28.1-2.el6.x86_64
at-spi-python.x86_64 at-spi-python-1.28.1.-2.el6.x86_64
redhat-bookmarks redhat-bookmarks-6.1.el6
redhat-lsb-graphics redhat-lsb-graphics-4.0-3.el6
system-config-date system-config-date-1.9.60-1.el6

 

 

 

 

.

ScaleArc’s iDB Reviewed

August 28, 2012

Stop coding and start scaling!

At the April MySQL conference in San Francisco (2012) I discovered iDB by ScaleArc and was excited by what it can do. After three months of testing and working with ScaleArc I purchased it.

What I do

The company I work for (WDT Inc.) supplies weather data. From a database point of view, we take in data feeds from many sources with “Ingest” applications. All are processed (forecast applications) and provided to many “Out flow” applications. Current conditions are processed into forecasts data, RADAR images, lightning and alerts are mapped into images to be used by governments and individual customers (iPhones). So, we have many applications work with many databases written in many languages.

Coding applications to split read and writes for replication and sharding have been the answer for scaling database access for years. Many companies are working on ways improve replication. The problem with these approaches is they all require code changes. Having been a firewall developer (ipchains), I’ve always thought a proxy was the answer. Think F5-BigIP for databases.

Applications written to work with one MySQL server can’t scale without re-coding.  If read/write splitting was all iDB did I’d be happy, but iDB can also monitors the transactions, creates metrics and cache transactions.

Testing

ScaleArc provided me with a VM version and two 30 day licenses. I installed iDB on two VMware servers along with several types of web applications. (some staging for production) The primary application was an in-house Java applications to process weather alerts. Other application included WordPress, MediaWiki, Cacti and Thinkup.

I also ran sysbench and hammerora against iDB. I used two and three MySQL server, with master / master / slave replication between them.

Most test “Servers” where VMware virtual machines running CentOS 6.0 x64. Some tests were run on production class MySQL servers while iDB ran in a VM.

Installation

iDB is designed as an appliance. You can purchase it as hardware, software or a VM. You can have ScaleArc pre-configure iDB for your network. I skipped this, so needed to use a live Linux CD to pop the root password and changed the network configuration.

iDB uses a web interface. The setup process is very intuitive. My installation went well until I tried to install the license key. iDB would not take the key. I turned the problem into ScaleArc support to see how they would respond. I also began to dig into the code and in about two hours solved the problem. I found the startup script was calling the wrong program name.

Within a three days, ScaleArc sent a system engineer, in house, to help with my problem and complete the installation. (I already had it working but I let ScaleArc do it their way to test their support.) The engineer upgraded to a newer version of the iDB server and installed the license key.

Security

As a part of digging into ScaleArc’s code, I check iDB’s security. iDB has two methods to authenticate a user. They are, pass through and off load. The default, off load, authenticates the client/application/user itself. Meaning, iDB connects to the database server with a password you give it for that database user. This allows iDB to answer requests from cached queries without connecting to MySQL.

Pass through, as the name implies, just passes the authentication protocol from the client to the server. In this mode iDB can not cache the queries.

Load Balancing

iDB lets you setup “Clusters” of servers. You can designate each database server as either Read/Write or Read only.
You can define a cluster as either Round Robin or Dynamic load balancing.

Caching

iDB analyzes the traffic passing through it and summarizes it to create analytic data. You can then create expressions to cache queries.

SELECT type FROM `db`\.`settings` WHERE name\=.*

iDB plots the summarized queries and makes it easy to setup caching rules.

Inconveniences

To load balance you need supply iDB with the ID and password of every database user. I’m sure you keep these save somewhere? What about that one, very old, production program that guy (who left two years ago) wrote in Fortran?

MySQL stores a sha1 of each user’s password. After some study I found a way for a proxy to recover the sha1 of the password used by the client. I wrote this up and presented it to ScaleArc. I call it “Transparent Authentication”. ScaleArc has agreed to build this into iDB. I hope to test this soon.

The console doesn’t alway report the current settings. I turned on Access Control List for the cluster and set access to the local LAN only. This creates iptables setting in the underlying Linux server. After changing the access the console didn’t show the iptables setting left. I had to turn off the ACL all together to stop the iptables setting that were left.

Security Issue

I found a number of security problem with ScaleArc. Not all the web pages where protected by iDB security code. I found many test and support scripts in the GUI application that leaked information. I also found a rather scary program that leaked system files. I reported all to ScaleArc.  These bugs have been fixed.

Some Application Issues

I found some applications, like WordPress, report “database issues” if they can’t write to the database right away. It seems the first thing these applications do is log the connection. They do not continue to show the page even though the database can be read.

Hammerora Test

All OLTP test ran without a flaw. What more can I say.

SysBench test

This test was to review the latency created by iDB. The SysBench, iDB and MySQL all ran on different VMs. All VMs ran on the same hardware. The first test was run direct to the MySQL server and the second through iDB. The MySQL server was not load balanced.

Transactions per second

Average transaction time in milliseconds

Conclusion

ScaleArc’s iDB works! Unlike the MySQL Proxy I could not make iDB fail. Application have no idea they were going through a proxy. Queries went directed to read or write servers as needed. Replication was not affected.

Although the heart of iDB is solid, I did have some trouble with the user interface. I believe iDB may have started life as a command line program. The web UI is good but need work.

ScaleArc support is 100% the are very helpful and willing to listen to their customer. I’ve found their support has continued to be great after purchase.

When ScaleArc complete the “Transparent Authentication” iDB will truly be a drop-in scaling solution.

 

 

Indexed CSV

July 26, 2012

Why is there no index for CSV files?  Indexes are very simple.

If the first column of your CSV file is in sorted order you can do a binary search to find your data.  But what if you need to find data in the second or third column?

If you have a separate index file pointing to the first byte of each line you could seek to that position in the CSV file and get your data.  Given a file with only your needed column and the byte offset and length of the line you can search the index to find the pointer to the position in the CSV file.

SAMPLE DATA
“record”,”title”,”data”
1,”test1”,”data1”
2,”test3”,”data3”
3,”test2”,”data2”

SAMPLE INDEX
“test1”,24,17
“test2”,58,17
“test3”,41,17
“title”,0,23

Here is a simple perl program to create just such an index.

#!/usr/bin/perl
# csvindex.pl

use strict;
use warnings;

use Text::CSV_XS;

my $file = $ARGV[0];

my $csv = Text::CSV_XS->new ({
binary    => 1,
auto_diag => 1,
sep_char  => ','    # not really needed as this is the default
});

open(my $data, '<:encoding(utf8)', $file) or die "Could not open '$file'\n";

my $head = 0;
my $tail = 0 ;

while (my $fields = $csv->getline( $data )) {
$tail = tell $data ;
printf("\"%s\",%d,%d\n",$fields->[$ARGV[1]], $head, $tail-$head);
$head = $tail;
}

This command create an index of the second column (number start with 0).

$ perl csvindex.pl 2009.csv 1 | sort > 2009.cidx

A binary search can be done with the linux ‘look’ command.

$ look \"OK\" 2009.cidx
"OK",2134,59

Here is a simple program to seek into a file and print ‘n’ number of characters.

#!/usr/bin/perl
# csvseek.pl
use strict;
use warnings;

use Text::CSV_XS;

my $file = $ARGV[0];

open(my $data, '<:encoding(utf8)', $file) or die "Could not open '$file'\n";

seek $data, $ARGV[1], 0 ;

my $line = "" ;

read $data, $line, $ARGV[2];
chomp($line);

print $line . "\n" ;

Here is how it is used.

$ perl csvseek.pl  2009.csv 2134 59
OKLAHOMA,OK,40,"4,765","1,251","2,160","1,024",6,"9,206",

I’d like to see these tools added to CSVKit.

What I’d really like to see is this index/s added to MySQL’s CSV engine.  Would anyone like to help me get this done?

Install HandlerSocket into CentOS 6.2 MySQL

July 2, 2012

If you don’t know what HandlerSocket is read Yoshinori Matsunobu’s blog post.

Because I’m starting with a minimum install of CentOS 6.2,  You’ll need to make sure you have a few utilities and development tools installed:

yum install git perl openssl-clients wget telnet lsof  
yum install gcc gcc-c++ libtool make openssl-devel perl-DBI perl-DBD-MySQL.x86_64

Installing the MySQL source

You’ll also need the MySQL source and a few more supporting packages.

wget http://vault.centos.org/6.2/os/Source/SPackages/mysql-5.1.52-1.el6_0.1.src.rpm 
yum install rpm-build gperf readline-devel ncurses-devel time perl-Time-HiRes 
rpm -i mysql-5.1.52-1.el6_0.1.src.rpm

Now you can make sure MySQL compiles.

cd /root/rpmbuild/SPECS 
rpmbuild -ba mysql.spec --define='runselftest 0'

If everything builds, you’ll have a fresh RPM in /root/rpmbuild/RPMS/x86_64.  If you’re installed MySQL is not the save version as the source you’ll need to install the compiled version.

cd /root/rpmbuild/RPMS/x86_64  
rpm -i mysql-5.1.52-1.el6.1.x86_64.rpm mysql-server-5.1.52-1.el6.1.x86_64.rpm mysql-libs-5.1.52-1.el6.1.x86_64.rpm

Building the HandlerSocket Plugin

With all the MySQL source in place,  we need the HandlerSocket source.

cd ~ mkdir Downloads   
cd Downloads  
git clone https://github.com/ahiguti/HandlerSocket-Plugin-for-MySQL.git  
cd HandlerSocket-Plugin-for-MySQL 
./autogen.sh  
./configure --with-mysql-source=/root/rpmbuild/BUILD/mysql-5.1.52 --with-mysql-bindir=/usr/bin 
make  
make install

To tell MySQL about the plug-in, add these instructions to your /etc/my.cnf file under the [mysqld] header.

plugin-load=handlersocket.so
loose_handlersocket_port = 9998
 # the port number to bind to (for read requests)
loose_handlersocket_port_wr = 9999
 # the port number to bind to (for write requests)
loose_handlersocket_threads = 16
 # the number of worker threads (for read requests)
loose_handlersocket_threads_wr = 1
 # the number of worker threads (for write requests)
 # open_files_limit = 65535
 # to allow handlersocket accept many concurrent
 # connections, make open_files_limit as large as
 # possible.

And, restart MySQL.

/etc/rc.d/mysqld restart

Run mysql and check for the plugin.

mysql> show plugins;
+---------------+--------+----------------+------------------+---------+
| Name          | Status | Type           | Library          | License |
+---------------+--------+----------------+------------------+---------+
| binlog        | ACTIVE | STORAGE ENGINE | NULL             | GPL     |
| partition     | ACTIVE | STORAGE ENGINE | NULL             | GPL     |
| CSV           | ACTIVE | STORAGE ENGINE | NULL             | GPL     |
| MEMORY        | ACTIVE | STORAGE ENGINE | NULL             | GPL     |
| InnoDB        | ACTIVE | STORAGE ENGINE | NULL             | GPL     |
| MyISAM        | ACTIVE | STORAGE ENGINE | NULL             | GPL     |
| MRG_MYISAM    | ACTIVE | STORAGE ENGINE | NULL             | GPL     |
| handlersocket | ACTIVE | DAEMON | handlersocket.so         | BSD     |
+---------------+--------+----------------+------------------+---------+

You should also see two new ports for MySQL in an lsof command.

lsof -i -P
COMMAND     PID    USER   FD   TYPE DEVICE SIZE NODE NAME
portmap    3297     rpc    3u  IPv4   9500       UDP *:111
portmap    3297     rpc    4u  IPv4   9501       TCP *:111 (LISTEN)
sshd       3641    root    3u  IPv6  11654       TCP *:22 (LISTEN)
heartbeat  3928  nobody    7u  IPv4  12165       UDP *:37988
heartbeat  3928  nobody    8u  IPv4  12166       UDP *:694
heartbeat  3929  nobody    7u  IPv4  12165       UDP *:37988
heartbeat  3929  nobody    8u  IPv4  12166       UDP *:694
sshd      18054    root    3u  IPv6 409453       TCP db2.grennan.com:22->192.168.2.11:59037 (ESTABLISHED) 
mysqld    19426   mysql   15u  IPv4 411158       TCP *:9998 (LISTEN)  
mysqld    19426   mysql   33u  IPv4 411175       TCP *:9999 (LISTEN)
mysqld    19426   mysql   36u  IPv4 411179       TCP *:3306 (LISTEN)

If ports 9998 and 9999 don’t show up.  Make sure SELinux is not running.  (sestatus)

The telnet test

Given you have a simple database like this.  Maybe in your test database…

CREATE TABLE user (
 user_id INT UNSIGNED PRIMARY KEY,
 user_name VARCHAR(50),
 user_email VARCHAR(255),
 created DATETIME
) ENGINE=InnoDB;

And some data

insert into test.user (user_id, user_name, user_email, created) values (1, "mark", "mark@grennan.com", now());
insert into test.user (user_id, user_name, user_email, created) values (2, "jim", "jim@grennan.com", now());

You can telnet to localhost and talk to HandlerSocket with a simple protocol.  (Type the part in bold and press enter.)

# telnet localhost 9998
Trying ::1…
telnet: connect to address ::1: Connection refused
Trying 127.0.0.1…
Connected to localhost.
Escape character is ‘^]’.
P 0 test user PRIMARY user_name,user_email,created
0 1
0 = 1 1
0 3 mark mark@grennan.com 2012-07-02 15:47:04

 

And there you have it.  Happiness is a new plug-in.

 

New Toy – MemSQL

June 25, 2012

We have a new toy to play with. MemSQL is a “Lock Free”, Just In TIme (GIJ) SQL compler (C++) that is MySQL client compatible. It was developed by two former Facebook developers, Eric Frenkiel and Nikita Shamgunov,  and is said to be the world’s fastest.

I gave it a 30min review.  The install was incredibly simple. I use mysqldump to dump a WordPress tables directly into MemSQL and all when well until I  tried to use the website.  MemSQL is NOT a MySQL drop in replacement. But, they don’t say it is.

MemSQL is ACID compliment and might fit well in a small application needing high read write transactions. (Think stock trading.) There is no descriptions on how to scale. Documentation is thin but good enough for testing. It is not open source.

MemSQL has thrown down the gontlet gauntlet with the a claim of being the world’s fastest.  They don’t describe how the testing was done.  What version of MySQL where they comparing it with?

I’m looking forward to some more technical review.   I’ll report more if I can find the time.

.

MySQL Security

June 18, 2012

A deep dive into the code

MySQL security is like an onion, it has many layers. MySQL check the user source location (IP) and password. MySQL can also check a user’s SSL Certificate credentials. MySQL may or may not give a user access to a set of databases, tables, functions or views based on the user’s Access Control List.

I’ve been reviewing ScaleArc’s iDB for some time.  The use of a proxy complicates the authentication process even more.  This has led to me reviewing all of MySQL Security.  I’ll blog about what I’ve found as I dive into the code.

The Login Handshake

In version 4.0 and before:

  • The server sends a random string to the client, in scramble_buff.
  • The client encrypts the scramble_buff value using the hash of a password that the user has entered. This happens in sql/password.c:scramble() function.
  • The client sends the encrypted scramble_buff value to the server.
  • The server encrypts the original random string using a value in the mysql database, mysql.user.Password.
  • The server compares its encrypted random string to what the client sent in scramble_buff.
  • If they are the same, the password is okay.

In this protocol, snooping on the wire doesn’t reveal the password. But note the problem – if the client doesn’t know the password, but knows a hash of it (as stored in mysql.user.Password) it can connect to the server. In other words, the hash of a password is the real password; if one can get the value of mysql.user.Password – he can connect to the server.

In version 4.1 and Later

Passwords are stored in the mysql.user table as double sha1 strings. sha1(sha1(‘P@ssw0rd’).  The MySQL sha1 function outputs a hex value.  In the server’s mysql.user table the password is stored with a ‘*’ pre-pended to the string to indicate the password is post version 4.1. In SQL this is:

SELECT CONCAT( ‘*’,sha1(unhex(sha1(“P@ssw0rd”))));

The MySQL server keeps this hash in the mysql.user table.  I’ll use the password ‘P@ssw0rd’ in this example.

Server_Password = SHA1(SHA1('P@ssw0rd')) = gjKhKYpJ9xDb7gszDELuyCXUGQo=

To start the process the user types their password into the MySQL client and connects to the server.

Password = P@ssw0rd
The server check the user IP and makes up a random code (scramble) to send it to the client. (Handshake Initialisation Packet)  The scramble length is 8 bytes in versions before 3.23 and 20 thereafter.

scramble = 1a2b3c4d

When the client receive the handshake initialization packet, it hashes (sha1)the plain text password (stage1).

stage1_hash = SHA1(PW) = Ib0S3Bg/dA7nbye3jrOcitlyp1c=

Client create a token with the scramble and the stage1_hash.

token = SHA1(scramble + SHA1(stage1_hash) xor stage1_hash = 2I4FYp26k+PT6ivw5fCCl82LqSQ=

Client sends the user ID and token to server.  (Client Authentication Packet)

The server reverses the token. With the scramble it sent and its password hash from the mysql.user table, the server recovers the hash of the user’s password.

stage1 Prime = token xor SHA1(scramble + Server_Password) = Ib0S3Bg/dA7nbye3jrOcitlyp1c=

The server re-hashes the recovered stage1_hash just recovered so it can be compared it with the double hash in the mysql.user table.

code = SHA1(stage2_prime) = gjKhKYpJ9xDb7gszDELuyCXUGQo=

and compares it to the server’s stored password.

Does the code            gjKhKYpJ9xDb7gszDELuyCXUGQo=
equal the server_pw   gjKhKYpJ9xDb7gszDELuyCXUGQo=

NOTE:  It is this memory memcmp() in the check_scramble function, that is at the heart of the authentication bypass found by Sergei golubchik and released just days after I discovered the same issue. I believe TCPCOPY may suffer from the same issue.

Can the password be Brute Forced?

As a bird on the wire, can you recover the user’s password? Do you need to?

If you have every possible scrambles and the sha1 of every possible password, you could calculate and create a table of the results.

sha1(scramble(n) + sha1(n)) xor sha1(n)

You could then compare the transmitted token with table to produce the password.

The scramble is eight (8) bytes of “printable” characters (33).  33^8 = 1406408618241 = ≈ 1.4T  Or its twenty (20) bytes of “printable” characters. 33^20 = 2345734188103679287078463273601

The password keyspace for only seven (7) characters is (mixalpha-numeric-symbol32-space#1-7) 70576641626495 ≈ 2 ^ 46.0043  So, given this rainbow table of sha1 hashes you must multiply the combinations.

scramble * sha1 = 1977985201462558877934081  for an 8 byte scramble and scramble * sha1 = 1.6555404114481058399928719625084e+44  for 20.

Using hardware to calculate 400,000,000 sha1 keys per second, it will take 15,680,375 years to create such a table for an eight byte scramble and  13,124,210,516,933,867,960,369,672,458 years for a twenty byte scramble.

But wait! We know the scramble. The server sent it to the client so we only need to calculate, so we only need to calculate:

sha1(scramble + sha1(n)) xor sha1(n)

With this reduced number, n to 70,576,641,626,495.  Again at the rate of 400,000,000 per second,  the eight byte scramble is reduced to 2 days.  The size of the scramble doesn’t matter.

As a bird on the wire, if you get lucky, you might observe the administrator issuing the command “SELECT * from mysql.user;” If this happens you have everything you need to calculate the stage1_hash password and connect without the plain text version.

Size Matters

Passwords work because their entropy (randomness ) and length together create search space.

To brute force every possible eight character password, like “P@ssw0rd”, would take about two (2) years at 400,000,000 per second.  The password “P.@.s.s.w.0.r.d.” would take 3,535,000 years.  Both have the about same entropy but the second creates more space.

Here is a good calculator to demonstrate. https://www.grc.com/haystack.htm

The password still has to be process with sha1.  You could just skip the password all together and just calculate all the possible values for a sha1. (8.02 thousand trillion trillion centuries @ 1Bcps)

It is believed the NSA has a database “Rainbow Table” of ever upper/lower-number-symbol two twelve (12) characters.  With such a table any twelve character or less password (in ASCII) would only be a 8ms lookup.

Man in the Middle

All of the above is based on being a bird on the wire. However, once the authentication is complete, the only thing managing access  is the TCP session.  It is possible a third party could inject commands into this session and issue commands.

A true Man In the Middle (MITM) or proxy, could simply pass the authentication through and then change or modify the message between the client and the server and issue commands of it’s own without the clients or servers knowledge. There is no way for the client to know he has connected to the proxy and not the server.  The server would believe all commands where issued by the client.

What could a MITM do? If the authenticated user has access to the mysql.user table it’s game over.  The MITM only needs see one attempt of a user to access the database to calculate that user’s stage1_hash.  If the client is a replication user a third party could connect and take a bin-log feed and watch all your updates.

But wait there’s more

For my next post I’ll dive into the host authentication (Firewall like function) of MySQL security.  Could there be a compare like authentication like bypass problem there?

 

 

 

.

Percona Tease Kit

April 16, 2012
Percona you know I love you.  You have the largest brains working with MySQL. Your toolkit (formerly Maatkit and Aspersa) is a real gift to the MySQL community. But stop teasing us with webinars about what it can do and show us how to use it.

The Percona ToolKit needs a cookbook. The documentation is reference only, very thin and total void of useful examples. You’re beginning to frustrate the masses.

Here is what’s happening.  My MySQL server crashes or stalls ever few months.  I searched the Goog and find PT-Stalk :

“watches for a trigger condition to become true, and then collects data to help in diagnosing problems. It is designed to run as a daemon with root privileges, so that you can diagnose intermittent problems that you cannot observe directly. You can also use it to execute a custom command, or to gather the data on demand without waiting for the trigger to happen.”

I attend your webinar.  I even attend the Percona Live conference in San Francisco to talke with the developer in person and still I get little more then:

pt-stalk [OPTIONS] [-- MYSQL OPTIONS]


??!!

Anyone else feel the same?

I’m going to take a deep dive into the Percona ToolKit code and try to bring you some working examples (Cut and Paste).  I’m going to start with PT-STALK because it fills a need.  I’ll pick the next in the line of useful tools that need more documentation.

If you would like be to document a Percona tool you need, leave me a comment.

 



Powered by Wordpress and MySQL. Theme by Shlomi Noach, openark.org
Creative Commons License
MySQL Fan Boy by Mark Grennan is licensed under a Creative Commons Attribution-Share Alike 3.0 United States License.