What can I DROP?

September 30, 2011

So you have inherited a MySQL server and no one know what applications are use what databases.   Space is every tight and every night transactions logs almost fill your hard disk.  The server was once used as a replication server and it seems every database the company uses is duplicated here.

You could do a full dump, drop everything and wait for the requests to restore. (This might be OK on a test database.)

We could ask MySQL for all the databases that have tables that have been updated in the last 30 days.

$ mysql information_schema -e  \
"select DISTINCT TABLE_SCHEMA as NAME from TABLES
  WHERE UPDATE_TIME IS NOT NULL and UPDATE_TIME > NOW() - INTERVAL 30 DAY"  \
> Databases

But this only gives a list of databases that have changed.  What about tables that are only read from?

Bin-Logs don’t help because they too only have updates.  What we really need is a long snapshot of the general log. But, starting the general log will just fill your disk space even more.

As long as all queries are remote we can use tcpdump.

Filter all the MySQL connections for  ‘FROM tablenames’.

$ tcpdump -i eth1 -s 0 -l -w - dst port 3306 | strings | \
 sed -n 's/\(.*\)\(from.*\)/\2/p' | awk '{print $2}' > TableList
 $ cat TableList | sort | uniq > ShortList

Some of the queries may reference databases.tablename so we can filter out the database names and add them to your Database list.

$ cat TableList | sed -n 's/\./ /p' | awk '{print $1}' | sort | uniq >> Databases

Now for the queries that only use table names. We need to find the database each belongs to.  This isn’t completely accurate because the table name may be in more then one databases.

$  for x in `cat TableList | sort | uniq` ;
 do
 mysql information_schema -e "select TABLE_SCHEMA from TABLES where TABLE_NAME='$x'\G"  \
 | grep TABLE | awk '{print $2}' ;
 done | sort | uniq >> Databases

To create the final list of database that should not be dropped.

$ sort Databases | uniq > KEEPTHESE ; cat KEEPTHESE

Now you can do the deed with some confidence your not removing anything being used.

 

 

 

 

P.S.  Did I tell you, Never do anything you can’t reverse.   Make a BACKUP FIRST!

Tweet

posted in HOW TO by admin

Follow comments via the RSS Feed | Leave a comment | Trackback URL

2 Comments to "What can I DROP?"

  1. Baron Schwartz wrote:

    Two ways that are even more accurate:

    1. Use pt-query-digest with tcpdump and –group-by=tables.

    2. Use Percona Server and look at INFORMATION_SCHEMA.TABLE_STATISTICS to see what tables are used.

  2. admin wrote:

    Yup – Took me some time to work out the HOWTO but that works!

    sudo /usr/sbin/tcpdump -i eth1 port 3306 -s 65535 -x -n -q -tttt -c 100000 | pt-query-digest –group-by=tables –type=tcpdump > report.txt

Leave Your Comment

 



Powered by Wordpress and MySQL. Theme by Shlomi Noach, openark.org
Creative Commons License
MySQL Fan Boy by Mark Grennan is licensed under a Creative Commons Attribution-Share Alike 3.0 United States License.
HOME