MariaDB Server was originally derived from MySQL and has therefore inherited its pluggable storage engine architecture. Different storage engines have different characteristics in terms of performance but also features and possibilities. This allows users to pick the right tool for the job instead of using the same storage engine no matter what is the purpose of the data, what are the requirements regarding data storage and how the data should be accessed. In this blog post we would like to look at the options available in MariaDB and discuss potential use cases for the different storage engines available.
What is a Storage Engine?
First, though, let’s take a look at what is the storage engine? MariaDB consists of multiple layers that operate together. SQL is parsed by one of them, then MariaDB reaches out for data, using a common API. Under the hood there is a storage engine that contains the data and it reacts to the requests for data, extracts the data and makes it available to MariaDB.
In short, MariaDB sends a request for a row and it is all up to the storage engine to retrieve it and send it back. MariaDB does not care how exactly the row is stored or how it is going to be retrieved, it is all up to the implementation within the storage engine. Storage engines may also implement different features. Transactions are being handled also entirely on the storage engine’s side. That’s why some of the support transactions and some do not. With this architecture it is possible to write different storage engines, dedicated to solving different problems.
Storage Engines in MariaDB Server
MariaDB comes with a set of storage engines. You can check which ones are available through a simple command:
MariaDB [(none)]> SHOW STORAGE ENGINES;
+--------------------+---------+-------------------------------------------------------------------------------------------------+--------------+------+------------+
| Engine | Support | Comment | Transactions | XA | Savepoints |
+--------------------+---------+-------------------------------------------------------------------------------------------------+--------------+------+------------+
| MRG_MyISAM | YES | Collection of identical MyISAM tables | NO | NO | NO |
| CSV | YES | Stores tables as CSV files | NO | NO | NO |
| Aria | YES | Crash-safe tables with MyISAM heritage. Used for internal temporary tables and privilege tables | NO | NO | NO |
| SEQUENCE | YES | Generated tables filled with sequential values | YES | NO | YES |
| MEMORY | YES | Hash based, stored in memory, useful for temporary tables | NO | NO | NO |
| MyISAM | YES | Non-transactional engine with good performance and small data footprint | NO | NO | NO |
| PERFORMANCE_SCHEMA | YES | Performance Schema | NO | NO | NO |
| InnoDB | DEFAULT | Supports transactions, row-level locking, foreign keys and encryption for tables | YES | YES | YES |
+--------------------+---------+-------------------------------------------------------------------------------------------------+--------------+------+------------+
8 rows in set (0.000 sec)
As you can see, there are many of them, we will cover the most important ones.
InnoDB
InnoDB, obviously, is THE storage engine. Transactional, built to deal with OLTP traffic, can provide really great performance. It is the default engine used in MariaDB and, unless you know what you are doing, you probably want to stick to it for your database.
MyISAM
MyISAM is one of the “original” storage engines available in MySQL and then MariaDB. It is not transactional, making it not ideal for the replication setups and, well, most of other environments as well. It is still very fast engine, especially regarding index access, making it suitable for read-only workloads that won’t be affected by locking of INSERTs and overall fragility of MyISAM.
Aria
Aria is an engine created for MariaDB as a replacement for MyISAM. It is not transactional but it is crash-safe making it way more reliable. Currently it is used for system and temporary tables but it can also be used instead of MyISAM for workloads requiring fast, read-only access to data.
Memory
This is an all-in-memory engine that is typically used for temporary in-memory tables. It is not persistent but might work for some read-only workloads.
CSV
This storage engine is designed to store data in a file as comma-separated values. It is not the most used storage engine, it’s very specialized but it still can be used to easily extract data from MariaDB into any other database software as well as Excel or similar software.
Storage Engines in MariaDB Enterprise Server
MariaDB Enterprise Server comes with a couple additional storage engines over what is available in the community edition. Let’s take a look at them as well.
ColumnStore
This is a dedicated storage engine for analytical workload. Thanks to the specific way of storing the data it makes it faster to retrieve large volumes of data, frequently needed for reporting. This might be the storage engine of your choosing for OLAP (OnLine Analytical Processing) workloads.
S3
S3 engine allows you to access data located in S3. It is a non-transactional engine intended to give users the option to archive data in the S3. Read only access is available after the table is created.
Spider
Spider engine lets you connect multiple MariaDB databases across the network, creating a sharded storage. It is transactional and it makes it easier for users to scale out by splitting the data across numerous MariaDB Enterprise Servers, distributing the traffic and workload among them.
MyRocks
MyRocks is a storage engine developed in Facebook, it is intended to reduce the write amplification and minimize the wear out of SSD drives. It is a transactional engine which should handle OLTP workload quite well, especially workloads typical for social media websites. MyRocks comes with pretty good compression, better than InnoDB, which can help to significantly reduce expenses on storage if the dataset becomes too large for InnoDB to handle properly.
Conclusion
As you can see, there are numerous options provided by both MariaDB Enterprise and Community Server regarding the way in which data can be stored. There are storage engines that excel in read-only workloads, OLAP or large datasets. It is up to the user to pick a good fit. Please keep in mind that, when in doubt, you can always stick to InnoDB, which provides quite good performance in general and should be more than enough for the majority of the cases. It is for those edge cases where you may need to look for something more suitable.