Quantcast
Channel: Severalnines - MariaDB
Viewing all articles
Browse latest Browse all 327

Database Design 101: Partitions in MySQL

$
0
0

In this blog post we are going to be discussing one of the most widely used features of MySQL - partitions.

What is Partitioning?

In MySQL, partitioning is a database design technique in which a database splits data into multiple tables, but still treats the data as a single table by the SQL layer. Simply put, when you partition a table, you split it into multiple sub-tables: partitioning is used because it improves the performance of certain queries by allowing them to only access a portion of the data thus making them faster. I/O operations can also be improved because data and indexes can be split across many disk volumes.

There are two types of partitioning: horizontal and vertical. Horizontal partitioning involves putting different rows into different tables, vertical partitioning on the other hand involves creating tables with fewer columns and using additional tables to store the remaining columns.

How Does Partitioning Work?

  • When SELECT queries are used, the partitioning layer opens and locks partitions, the query optimizer determines if any of the partitions can be pruned, then the partitioning layer forwards the handler API calls to the storage engine that handles the partitions.
  • When INSERT queries are used, the partitioning layer opens and locks partitions, determines which partition should the row belong to, then forwards the row to that partition.
  • When DELETE queries are used, the partitioning layer opens and locks partitions, determines which partition contains the row, then deletes the row from that partition.
  • When UPDATE queries are used, the partitioning layer opens and locks partitions, figures out which partition contains the row, fetches the row and modifies it, then determines which partition should contain the new row, forwards the row to the new partition with an insertion request, then forwards the deletion request to the original partition.

When Should You Use Partitioning?

In general, partitioning is useful when:

  • You have a lot of data that you need to query through.
  • Your tables are too big to fit in memory.
  • Your tables contain historical data and new data is added into the newest partition.
  • You think that you will need to distribute the contents of a table across different storage devices.
  • You think you will need to restore individual partitions.

If one or more of the scenarios described above describe your situation, partitioning may help. Before partitioning your data though, keep in mind that MySQL partitions have their own limitations:

  • Partitioning expressions do not permit the use of stored procedures, stored functions, user-defined functions (UDFs) or plugins, and with limited support for SQL functions. You also cannot use declared or stored variables.
  • Partitioned tables cannot contain or be referenced by foreign keys.
  • There is a limit of 1,024 partitions per table (starting from MariaDB 10.0.4, tables can contain a maximum of 8,192 partitions).
  • A table can only be partitioned if the storage engine supports partitioning.
  • The query cache is not aware of partitioning or partition pruning.
  • All partitions must use the same storage engine.
  • FullTEXT indexes is not supported
  • Temporary tables cannot be partitioned

The options above should help you make up your mind whether partitioning is an option for you or not.

Partitioning Types

If you decide to use partitions, keep in mind that you have a number of partitioning types to choose from. We will briefly cover your options below, then dive deeper into them:

  • Partitioning by RANGE can help you to partition rows based on column values falling within a given range.
  • Partitioning by LIST can help you to partition rows based on the membership of column values in a given list.
  • Partitioning by HASH can help you to partition rows based on a value returned by a user-defined expression.
  • Partitioning by KEY can help you to partition rows based on a hashing function provided by MySQL.

Partitioning by RANGE

Partitioning by RANGE is one of the most popular forms of partitioning MySQL tables. When you partition a table by RANGE, you partition the table in such a way that each partition contains a certain number of rows that fall within a given range. To define a partition, define the name of it, then tell it which values it should hold - to partition a table by range, add a PARTITION BY RANGE statement. For example, if you would want to name your partition p0 and make it hold every value that is less than 5, you would need to make sure that your query contains PARTITION p0 VALUES LESS THAN (5). Here’s an example of a partitioned table:

CREATE TABLE sample_table (

id INT(255) NOT NULL AUTO_INCREMENT PRIMARY KEY,

column_name VARCHAR(255) NOT NULL DEFAULT ‘’

...

) PARTITION BY RANGE (column_name) (

PARTITION p0 VALUES LESS THAN (5),

PARTITION p1 VALUES LESS THAN (10),

PARTITION p2 VALUES LESS THAN (15),

PARTITION p3 VALUES LESS THAN (20),

...

);

You can also define a partition that holds all of the values that do not fall in certain ranges like so:

PARTITION p5 VALUES LESS THAN MAXVALUE

The above partition is named p5 and it holds all values other partitions do not - MAXVALUE represents a value that is always higher than the largest possible value. You can also use functions by defining your partitions like so:

PARTITION BY RANGE (YEAR(date)) (

    PARTITION p0 VALUES LESS THAN (2000),

    PARTITION p1 VALUES LESS THAN (2010),

    PARTITION p2 VALUES LESS THAN (2020),

    PARTITION p3 VALUES LESS THAN MAXVALUE

);

In this case, all values that are less than 2000 are stored in the partition p0, all values that are less than 2010 are stored in the partition p1, all values that are less than 2020 are stored in the partition p2 and all values that do not fall in any of these ranges are stored in the partition p3.

Partitioning by LIST

Partitioning MySQL tables by LIST is similar to partitioning by RANGE - the main difference of partitioning tables by LIST is that when tables are partitioned by LIST each partition is defined and selected based on the membership of a column value in a set of value lists rather than a range of values. Partitioning by LIST can be useful when you know that, for example, you have data that can be divided into multiple smaller sets of data (say, regions). Suppose that you have a store that has 4 franchises: one in the central part of town, second in the north, third in the east, fourth in the west. You can partition a table in such a way that data belonging to a certain franchise would be stored in a partition dedicated to that franchise:

PARTITION BY LIST(store) (

PARTITION central VALUES IN (1,3,5),

PARTITION north VALUES IN (2,4,7),

PARTITION east VALUES IN (8,9),

PARTITION west VALUES IN (10,11)

);

Partitioning by HASH

Partitioning MySQL tables by HASH can be a way to make sure that data across partitions is distributed evenly. If you are partitioning your tables by HASH you only need to specify how many partitions you need your data to be divided into - the rest is taken care of MySQL. You can use partitioning by HASH by adding the following statement to CREATE TABLE:

PARTITION BY HASH(id)

PARTITIONS 5;

Replace 5 with the number that specifies into how many partitions you need your data to be divided into - the default number is 1.

MySQL also supports partitioning by LINEAR HASH - linear hashing differs from regular hashing because linear hashing utilizes a linear powers-of-two algorithm. To partition tables by a LINEAR HASH, replace PARTITION BY HASH with PARTITION BY LINEAR HASH.

Partitioning by KEY

Partitioning MySQL tables by KEY is similar to partitioning MySQL tables by HASH - in this case, the hashing function for key partitioning is supplied by the MySQL server. Any columns that are used as the partitioning key must comprise the entire table’s primary key or at least be a part of the table’s primary key. If no column name is specified as the partitioning key, the primary key will be used. If there is no primary key, but there is a unique key, the unique key will be used instead. For example, following statements are both valid, even though the first statement does not even specify the partitioning key:

CREATE TABLE demo_table (
id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
title VARCHAR(255) NOT NULL DEFAULT ''
)
PARTITION BY KEY()
PARTITIONS 2;
CREATE TABLE demo_table (
id INT NOT NULL AUTO_INCREMENT PRIMARY KEY
)
PARTITION BY KEY(id)
PARTITIONS 5;

Conclusion

To summarize, partitions can be helpful if you have a lot of data, your tables are too big to fit in memory or they contain historical data. Partitions can also be useful if you think that you will need to distribute the contents of a table across different storage mediums, also if you want to have the option to delete or restore individual partitions.

However, do keep in mind that partitions in MySQL have their own disadvantages. One of the major disadvantages of partitioning is that it will make your tables bigger - you cannot gain speed without compromising on space. If you have a very large set of data this can be a pretty big problem.


Viewing all articles
Browse latest Browse all 327

Trending Articles