Splitting Huge MySQL dump for easy restore

The open source RDBMS MySQL provides a great tool mysqdump to dump database.  Its the official sql dump utitlity for MySQL database. It makes the life of dba so easy that he can backup and restore the database within just two commands.  But sometimes due to the lacking of the infrastructure you can not dump and restore that easily. Specially when you are dealing with huge amount of data. Our database grows over time. Few hundred GBs are quite common. But if you run a software for long it might get in to Tera byte range. The problem starts when size is this huge.

Backing up and restoring such huge database is not easy.  It takes a lot of time to both backup and restore. Not to mention, if something breaks up the operation the output will most time corrupted. For backup operation the output  is a dump file. This will be corrupted if backup is interrupted. Just think if it breaks at 99%. It feels like pulling your hair! Think if it was a restore operation. Your database will be corrupted. Usually we do restore on production server. If the restore fails, that’ll be a disaster.

To come round this, Its better to backup and restore part by part. This way your database will less likely be corrupted. Here is some ways to achieve this,

  1. Split the database by tables. When backing up, split the backup operation by tables.  Just do some balance on the table size and then backup on each table. For example, If you have 3 tables 10 GB each and 2 tables 30 GB each then you can backup tables in 3 groups.
    • Group 1. All the 3 tables 10 GBs each totaling 30 GB
      • mysqldump  database1 table1 table2 table3 > table1-3.sql
    • Group 2. One of the two 30 GB tables.
      • mysqldump  database1 table4 > table4.sql
    • Group 3. One of the two 30 GB tables.
      • mysqldump  database1 table5 > table5.sql
  2. Split the big tables by rows. Not all tables in the database are big. Rather we always have some tables which are big enough to cause headache for the DBA. Think about the tweeter. Assuming a simple database schema, we can say all tables cumulatively not bigger than the table that holds the tweets. In our previous example table4 and table5 is bigger.  For such big tables we can split the dump by rows. mysqldump provides a great option called “–where”. By using this option you can split your table on any condition which is supported by “where” sql clause. So the number of ways you can split the table is infinite. Here is some strategies,
    1. auto_increment. If your table has auto column  you can split by any number of chunks.  See the examples.
      1. Split the odd and even rows
        mysqldump --where "id%2=0" database1 table4 > table4_even.sql
        mysqldump --where "id%2=1" database1 table4 > table4_odd.sql
      2. Using any number
        mysqldump --where "id%4=0" database1 table5 > table5_0.sql
        mysqldump --where "id%4=1" database1 table5 > table5_1.sql
        mysqldump --where "id%4=2" database1 table5 > table5_2.sql
        mysqldump --where "id%4=3" database1 table5 > table5_3.sql
    2. limit clause. Even you are using where clause, with a little hack you can use limit clause. This allows true partition of the tables.  For example if you have 18000 rows in a table you can dump in two parts like this,
      mysqldump --where "1 LIMIT 0, 10000" database1 table5 > table5_part_1.sql
      mysqldump --where "1 LIMIT 10000, 10000" database1 table5 > table5_part_2.sql
    3. Others. As the –where switch allows any sql condition you can use any criteria. You can split a user table by sex (male and female), by age group, by any enum column you are using, by date span etc. Some examples,
      mysqldump --where "sex='M'" database1 user > user_m.sql
      mysqldump --where "sex='F'" database1 user > user_f.sql
      mysqldump --where "account='ACTIVE'" database1 user > user_active.sql
      mysqldump --where "account='INACTIVE'" database1 user > user_inactive.sql
      mysqldump --where "year(date) <= 2008" database1 payment > payment_prior_2009.sql
      mysqldump --where "year(date) > 2008" database1 payment > payment_2009_and_after.sql

Determine whether a color is bright or dark

For a site recently I have been working, I had to determine whether a background color is dark or bright. My target was to watermark it with contrast color. If its too bright, I’d use black. For dark color I’d use white. Its so simple to think. But not easy to find.

The main Idea was to find the Luminance of a color (capital Y) . If the luminance is enough high water mark it with dark color and vice versa. This will make a contrast so watermark will always visible be.

To find the Luminance of a color (Y) this forumula is used

Y = 0.2126 R + 0.7152 G + 0.0722 B

Here R, G and B are the Red, Green and Blue component of a color.  Here the constant values are determined by the contribution factor of intensity perceived by human eye. See the wiki. Now if Y is higher than 128 then its brighter color and you need to use dark color for water marking and use bright color for Y is less than 128.

Convert Little endian to Big endian in PHP or vice versa

In PHP you might have to convert the endianness of a number. PHP does not provide any function for this even though it has function for almost everything.

So I wrote a function for this,

function chbo($num) {
    $data = dechex($num);
    if (strlen($data) <= 2) {
        return $num;
    }
    $u = unpack("H*", strrev(pack("H*", $data)));
    $f = hexdec($u[1]);
    return $f;
}

Usage:

php > echo var_dump(5254071951610216, chbo(5254071951610216448));
int(5254071951610216)
int(20120214104648)
php > echo var_dump(2147483648, chbo(2147483648));
int(2147483648)
int(128)

Note: this function changes the byte order. If your machines byte-order is little-endian, this function will change it to big-endian. If your machines byte-order is big-endian, it will change the number to big-endian.

All x86 and x86_64 are little-endian. ARM can be both.  More can be found on this wiki article