Tuesday, January 28, 2014

Use your Index even with a varchar || char

I recently noticed a post on the forums.mysql.com site : How to fast search in 3 millions record?
The example given used a LIKE '%eed'

That will not taken advantage of an index and will do an full table scan.
Below is an example using the world database, so not 3 million records but just trying to show how it works.

> explain select * from City where Name LIKE '%dam' \G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: City
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 4188
        Extra: Using where
1 row in set (0.01 sec)

[world]> select count(*) FROM City;
+----------+
| count(*) |
+----------+
|     4079 |
+----------+

> select * from City where Name LIKE '%dam';
+------+------------------------+-------------+----------------+------------+
| ID   | Name                   | CountryCode | District       | Population
+------+------------------------+-------------+----------------+------------+
|    5 | Amsterdam              | NLD         | Noord-Holland  |     731200 |
|    6 | Rotterdam              | NLD         | Zuid-Holland   |     593321 |
| 1146 | Ramagundam             | IND         | Andhra Pradesh |     214384 |
| 1318 | Haldwani-cum-Kathgodam | IND         | Uttaranchal    |     104195 |
| 2867 | Tando Adam             | PAK         | Sind           |     103400 |
| 3122 | Potsdam                | DEU         | Brandenburg    |     128983 |
+------+------------------------+-------------+----------------+------------+
To show the point further

> explain select * from City where Name LIKE '%dam%' \G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: City
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 4188
        Extra: Using where
1 row in set (0.00 sec)

> select * from City where Name LIKE '%dam%';
+------+------------------------+-------------+----------------+------------+
| ID   | Name                   | CountryCode | District       | Population |
+------+------------------------+-------------+----------------+------------+<
|    5 | Amsterdam              | NLD         | Noord-Holland  |     731200 |
|    6 | Rotterdam              | NLD         | Zuid-Holland   |     593321 |
|  380 | Pindamonhangaba        | BRA         | São Paulo      |     121904 |<
|  625 | Damanhur               | EGY         | al-Buhayra     |     212203 |
| 1146 | Ramagundam             | IND         | Andhra Pradesh |     214384 |
| 1318 | Haldwani-cum-Kathgodam | IND         | Uttaranchal    |     104195 |
| 1347 | Damoh                  | IND         | Madhya Pradesh |      95661 |
| 2867 | Tando Adam             | PAK         | Sind           |     103400 |
| 2912 | Adamstown              | PCN         | –              |         42 |
| 3122 | Potsdam                | DEU         | Brandenburg    |     128983 |
| 3177 | al-Dammam              | SAU         | al-Sharqiya    |     482300 |
| 3250 | Damascus               | SYR         | Damascus       |    1347000 |
+------+------------------------+-------------+----------------+------------+<
12 rows in set (0.00 sec)

The explain output above shows that no indexes are being used.

CREATE TABLE `City` (
  `ID` int(11) NOT NULL AUTO_INCREMENT,
  `Name` char(35) NOT NULL DEFAULT '',
  `CountryCode` char(3) NOT NULL DEFAULT '',
  `District` char(20) NOT NULL DEFAULT '',
  `Population` int(11) NOT NULL DEFAULT '0',
  PRIMARY KEY (`ID`),
  KEY `CountryCode` (`CountryCode`),
  CONSTRAINT `city_ibfk_1` FOREIGN KEY (`CountryCode`) REFERENCES `Country` (`Code`)
) ENGINE=InnoDB
<

So for grins let us put a key on the varchar field. Notice I do not put a key on the entire range just the first few characters. This is of course dependent on your data.

> ALTER TABLE City ADD KEY name_key(`Name`(5));
Query OK, 0 rows affected (0.54 sec)

CREATE TABLE `City` (
  `ID` int(11) NOT NULL AUTO_INCREMENT,
  `Name` char(35) NOT NULL DEFAULT '',
  `CountryCode` char(3) NOT NULL DEFAULT '',
  `District` char(20) NOT NULL DEFAULT '',
  `Population` int(11) NOT NULL DEFAULT '0',
  PRIMARY KEY (`ID`),
  KEY `CountryCode` (`CountryCode`),
  KEY `name_key` (`Name`(5)),
  CONSTRAINT `city_ibfk_1` FOREIGN KEY (`CountryCode`) REFERENCES `Country` (`Code`)
) ENGINE=InnoDB 

SO will this even matter?

> explain select * from City where Name LIKE '%dam' \G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: City
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 4188
        Extra: Using where
1 row in set (0.00 sec)

No it will not matter, because of the LIKE '%dam' will force a full scan regardless.

> EXPLAIN select * from City where Name LIKE '%dam' AND CountryCode = 'IND' \G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: City
         type: ref
possible_keys: CountryCode
          key: CountryCode
      key_len: 3
          ref: const
         rows: 341
        Extra: Using index condition; Using where
1 row in set (0.00 sec)

Notice the difference in the explain output above. This query is using an index. It is not using the name as the index but it is using an index. So how can you take advantage of the varchar index?

> EXPLAIN select * from City where Name LIKE 'Ra%'  \G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: City
         type: range
possible_keys: name_key
          key: name_key
      key_len: 5
          ref: NULL
         rows: 35
        Extra: Using where
1 row in set (0.00 sec)

 The above query will use the name_key index.

The point being is that  you have to be careful how you write your SQL query and ensure that you run explains to find the best index of choice for your query. 

2 comments:

  1. If your main goal is to search by suffixes, then consider using REVERSE(), as discussed in
    http://forums.mysql.com/read.php?10,604600,604834#msg-604834

    ReplyDelete

@AnotherMySQLDBA