Webinar: Become a MongoDB DBA - Scaling and Sharding

Join us for our third ‘How to become a MongoDB DBA’ webinar on Tuesday, November 15th! In this webinar we will uncover the secrets and caveats of MongoDB scaling and sharding.

Become a MongoDB DBA - Scaling and Sharding

MongoDB offers read and write scaling out of the box: adding secondary nodes will increase your potential read capacity, while adding shards will increase your potential write capacity. However, adding a new shard doesn’t necessarily mean it will be used. Choosing the wrong shard key may also cause uneven data distribution.

There is more to scaling than simply adding nodes and shards. Factors to take into account include indexing, shard re-balancing, replication lag, capacity planning and consistency.

In this webinar, learn how to plan your scaling strategy up front and how to avoid ending up with unusable secondary nodes and shards. Finally, we’ll show you how to leverage ClusterControl’s MongoDB scaling capabilities and have ClusterControl manage your shards.

Date, Time & Registration

Europe/MEA/APAC

Tuesday, November 15th at 09:00 GMT / 10:00 CET (Germany, France, Sweden)

Register Now

North America/LatAm

Tuesday, November 15th at 09:00 Pacific Time (US) / 12:00 Eastern Time (US)

Register Now

Agenda

  • What are the differences in read and write scaling with MongoDB
  • Read scaling considerations with MongoDB
  • MongoDB read preference explained
  • How sharding works in MongoDB
  • Adding new shards and balancing data
  • How to scale and shard MongoDB using ClusterControl
  • Live Demo

Speaker

Art van Scheppingen is a Senior Support Engineer at Severalnines. He’s a pragmatic database expert with over 16 years’ experience in web development. He previously worked at Spil Games as Head of Database Engineering, where he maintained a broad view of the whole database environment: from MySQL to MongoDB, Vertica to Hadoop, and from Sphinx Search to SOLR. He regularly presents his work and projects at various conferences (Percona Live, MongoDB Open House, FOSDEM) and related meetups.

We look forward to “seeing” you there!

This session is based upon the experience we have using MongoDB and implementing it for our database infrastructure management solution, ClusterControl. For more details, read through our ‘Become a MongoDB DBA’ blog series.


High Availability on a Shoestring Budget - Deploying a Minimal Two Node MySQL Galera Cluster

We regularly get questions about how to set up a Galera cluster with just 2 nodes. The documentation clearly states you should have at least 3 Galera nodes to avoid network partitioning. But there are some valid reasons for considering a 2 node deployment, e.g., if you want to achieve database high availability but have a limited budget to spend on a third database node. Or perhaps you are running Galera in a development/sandbox environment and prefer a minimal setup.

Galera implements a quorum-based algorithm to select a primary component through which it enforces consistency. The primary component needs to have a majority of votes, so in a 2 node system, there would be no majority resulting in split brain. Fortunately, it is possible to add a garbd (Galera Arbitrator Daemon), which is a lightweight stateless daemon that can act as the odd node. Arbitrator failure does not affect the cluster operations and a new instance can be reattached to the cluster at any time. There can be several arbitrators in the cluster.

ClusterControl has support for deploying garbd on non-database hosts.

Normally, a Galera cluster needs at least three hosts to be fully functional; however, at deploy time, two nodes suffice to create a primary component. Here are the steps:

  1. Deploy a Galera cluster of two nodes,
  2. After the cluster has been deployed by ClusterControl, add garbd on the ClusterControl node.

You should end up with the below setup:

Deploy the Galera Cluster

Go to the ClusterControl deploy wizard to deploy the cluster.

Even though ClusterControl warns you a Galera cluster needs an odd number of nodes, only add two nodes to the cluster.

Deploying a Galera cluster will trigger a ClusterControl job which can be monitored at the Jobs page.

Install Garbd

Once deployment is complete, install garbd on the ClusterControl host. You will find it under Manage -> Load Balancer:

Installing garbd will trigger a ClusterControl job which can be monitored at the Jobs page. Once completed, you can verify garbd is running with a green tick icon at the top bar:
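Besides the green tick, you can confirm from either of the database nodes that the arbitrator has joined the cluster. A minimal check (garbd counts as a full member for quorum purposes):

SHOW GLOBAL STATUS LIKE 'wsrep_cluster_size';
-- Expected value: 3 (two database nodes plus the arbitrator)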

That’s it. Our minimal two-node Galera cluster is now ready!

Join our live webinar on how to scale and shard MongoDB

We’re live next Tuesday, November 15th, with our webinar ‘Become a MongoDB DBA - Scaling and Sharding’!

Join us and learn about the three components necessary for MongoDB sharding. We’ll also share a checklist of read scaling considerations, as well as tips & tricks for finding the right shard key for MongoDB.

Overall, we’ll discuss how to plan your MongoDB scaling strategy up front and how to prevent ending up with unusable secondary nodes and shards. And we’ll look at how to leverage ClusterControl’s MongoDB scaling and shards management capabilities.

Sign up below!

Date, Time & Registration

Europe/MEA/APAC

Tuesday, November 15th at 09:00 GMT / 10:00 CET (Germany, France, Sweden)
Register Now

North America/LatAm

Tuesday, November 15th at 09:00 Pacific Time (US) / 12:00 Eastern Time (US)
Register Now

Agenda

  • What are the differences in read and write scaling with MongoDB
  • Read scaling considerations with MongoDB
  • MongoDB read preference explained
  • How sharding works in MongoDB
  • Adding new shards and balancing data
  • How to scale and shard MongoDB using ClusterControl
  • Live Demo

Speaker

Art van Scheppingen is a Senior Support Engineer at Severalnines. He’s a pragmatic database expert with over 16 years’ experience in web development. He previously worked at Spil Games as Head of Database Engineering, where he maintained a broad view of the whole database environment: from MySQL to MongoDB, Vertica to Hadoop, and from Sphinx Search to SOLR. He regularly presents his work and projects at various conferences (Percona Live, MongoDB Open House, FOSDEM) and related meetups.

We look forward to “seeing” you there!

This session is based upon the experience we have using MongoDB and implementing it for our database infrastructure management solution, ClusterControl. For more details, read through our ‘Become a MongoDB DBA’ blog series.

Deploying and Monitoring MySQL and MongoDB clusters in the cloud with NinesControl

NinesControl is a new service from Severalnines which helps you deploy MySQL Galera and MongoDB clusters in the cloud. In this blog post we will show you how you can easily deploy and monitor your databases on AWS and DigitalOcean.

Deployment

At the time of writing, NinesControl supports two cloud providers - Amazon Web Services and DigitalOcean. Before you attempt to deploy, you first need to configure access credentials for the cloud you’d like to run on. We covered this topic in a previous blog post.

Once that’s done, you should see the credentials defined for the chosen cloud provider in the “Cloud Accounts” tab.

You’ll see the screen below, as you do not have any clusters running yet:

You can click on “Deploy your first cluster” to start your first deployment. You will be presented with a screen like the one below - you can pick the cluster type you’d like to deploy and set configuration settings like port, data directory and password. You can also set the number of nodes in the cluster and which database vendor you’d like to use.

For MongoDB, the deployment screen is fairly similar with some additional settings to configure.

Once you are done here, it’s time to move to the second step - picking the credentials to use to deploy your cluster. You can pick either DigitalOcean or Amazon Web Services, and then whichever credentials you have added to NinesControl. In our example we have just a single credential, but it’s perfectly fine to have more than one credential per cloud provider.

Once you’ve made your choice, proceed to the third and final step, in which you will pick what kind of VMs you’d like to use. This screen differs between AWS and DigitalOcean.

If you picked AWS, you will have the option to choose the operating system and VM size. You also need to pick the VPC in which you will deploy, and the subnet which will be used by your cluster. If you don’t see anything in the drop-down lists, you can click the “[Add]” buttons and NinesControl will create a VPC and subnet for you. Finally, you need to set the volume size of the VMs. After that, you can trigger the deployment.

DigitalOcean uses a slightly different screen layout, but the idea is similar - you need to pick a region, an operating system and a droplet size.

Once you are done, click on “Deploy cluster” to start deployment.

The status of the deployment will be shown in the cluster list. You can also click on a status bar to see the full log of a deployment. Whenever you’d like to deploy another cluster, click on the “Deploy cluster” button.

Monitoring

Once deployment completes, you’ll see a list of your clusters.

When you click on one of them, you’ll see a list of nodes in the cluster and cluster-wide metrics.

Of course, the metrics are cluster-dependent. Above is what you will see on a MySQL/MariaDB Galera cluster. MongoDB will present you with different graphs and metrics:

When you click on a node, you will be redirected to host statistics of that particular node - CPU, network, disk, RAM usage - all of those very important basics which tell you about node health:

As you can see, NinesControl not only allows you to deploy Galera and MongoDB clusters in a fast and efficient way but it also collects important metrics for you and shows them as graphs.

Give it a try and let us know what you think.

New whitepaper - the DevOps Guide to database backups for MySQL and MariaDB

This week we’re happy to announce that our new DevOps Guide to Database Backups for MySQL & MariaDB is now available for download (free)!

This guide discusses in detail the two most popular backup utilities available for MySQL and MariaDB, namely mysqldump and Percona XtraBackup.

It covers topics such as how database features like binary logging and replication can be leveraged in backup strategies, and it provides best practices that can be applied to high availability topologies in order to make database backups reliable, secure and consistent.

Ensuring that backups are performed, so that a database can be restored if disaster strikes, is a key operational aspect of database management. The DBA or System Administrator is usually the responsible party to ensure that the data is protected, consistent and reliable. Ever more crucially, backups are an important part of any disaster recovery strategy for businesses.

So if you’re looking for insight into how to perform database backups efficiently, want to understand the impact of the storage engine on MySQL or MariaDB backup procedures, or need some tips & tricks on MySQL / MariaDB backup management … our new DevOps Guide has you covered.

Planets9s - Download our new DevOps Guide to Database Backups for MariaDB & MySQL

Welcome to this week’s Planets9s, covering all the latest resources and technologies we create around automation and management of open source database infrastructures.

Download our new DevOps Guide to Database Backups for MariaDB & MySQL

Check out our free whitepaper on database backups, which discusses in detail the two most popular backup utilities available for MySQL and MariaDB, namely mysqldump and Percona XtraBackup. If you’re looking for insight into how to perform database backups efficiently, want to understand the impact of the storage engine on MySQL or MariaDB backup procedures, or need some tips & tricks on MySQL / MariaDB backup management … our new DevOps Guide has you covered.

Download the whitepaper

Tips and Tricks: Receive email notifications from ClusterControl

Did you know that apart from receiving notifications when things go wrong, you can also receive digest emails for less critical notifications from ClusterControl? As SysAdmins and DBAs, we need to be notified whenever something critical happens to our database. But would it not be nicer if we were informed upfront, and still had time to perform pre-emptive maintenance and retain high availability?  With this new blog post, find out how to enable and set up your email notifications in ClusterControl according to your needs.

Read the blog

Getting social with Severalnines

As we begin to wrap up 2016 and plan all the exciting things for next year, we wanted to take a moment to encourage you to follow and engage with us on our social channels. We produce plenty of content and have a lot more planned for 2017. To make sure you don’t miss out on any of it, we’d love it if you would follow us so we can keep you up to date and interact more directly with you.

Get social

That’s it for this week! Feel free to share these resources with your colleagues and follow us in our social media channels.

Have a good end of the week,

Jean-Jérôme Schmidt
Planets9s Editor
Severalnines AB

The choice of MySQL storage engine and its impact on backup procedures

MySQL offers multiple storage engines to store its data, with InnoDB and MyISAM being the most popular ones. Each storage engine implements a set of features required for a particular type of workload and, as a result, works differently from the others. Since data is stored inside the storage engine, we need to understand how storage engines work to determine the best backup tool. In general, MySQL backup tools perform a special operation in order to retrieve consistent data - they either lock the tables or establish a transaction isolation level that guarantees the data read is unchanged.

MyISAM/Aria

MyISAM was the default storage engine for MySQL versions prior to 5.5.5. It is based on the older ISAM code but has many useful extensions. The major deficiency of MyISAM is the absence of transactions support. Aria is another storage engine with MyISAM heritage and is a MyISAM replacement in all MariaDB distributions. The main difference is that Aria is crash safe, whereas MyISAM is not. Being crash safe means that an Aria table can recover from unexpected failures in a much better way than a MyISAM table can. In most circumstances, backup operations for MyISAM and Aria are almost identical.

MyISAM uses table-level locking. It stores indexes in one file and data in another. MyISAM tables are generally more compact on disk than InnoDB tables. Given the table-level locking and the lack of transaction support, the recommended way to back up MyISAM tables is to acquire the global read lock with FLUSH TABLES WITH READ LOCK (FTWRL), making MySQL temporarily read-only, or to use the LOCK TABLES statement explicitly. Without that, MyISAM backups will be inconsistent.
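A minimal sketch of such a backup window (the file copy or mysqldump runs from a second session while this connection holds the lock):

FLUSH TABLES WITH READ LOCK;  -- closes all tables and makes the server read-only
-- From another session: copy the MyISAM table files or run mysqldump now,
-- while this connection keeps holding the global read lock.
UNLOCK TABLES;                -- releases the lock, writes resume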

InnoDB/XtraDB

InnoDB is the default storage engine for MySQL and MariaDB. It provides the standard ACID-compliant transaction features, along with foreign key support and row-level locking.

Percona’s XtraDB is an enhanced version of the InnoDB storage engine for MySQL and MariaDB. It features some improvements that make it perform better in certain situations. It is backwards compatible with InnoDB, so it can be used as a drop-in replacement.

There are a number of key components in InnoDB that directly influence the behaviour of backup and restore operations:

  • Transactions
  • Crash recovery
  • Multiversion concurrency control (MVCC)

Transactions

InnoDB does transactions. A transaction will never be completed unless each individual operation within the group is successful (COMMIT). If any operation within the transaction fails, the entire transaction will fail and any changes will be undone (ROLLBACK).

The following example shows a transaction in MySQL (assuming autocommit is off):

BEGIN;
UPDATE account.saving SET balance = (balance - 10) WHERE id = 2;
UPDATE account.current SET balance = (balance + 10) WHERE id = 2;
COMMIT;

A transaction starts with a BEGIN and ends with a COMMIT or ROLLBACK. In the above example, if the MySQL server crashes after the first UPDATE statement completed (line 2), that update would be rolled back and the balance value won’t change for this transaction. The ability to rollback is vital when performing crash recovery, as explained in the next section.

Crash Recovery

InnoDB maintains a transaction log, also called redo log. The redo log is physically represented as a set of files, typically named ib_logfile0 and ib_logfile1. The log contains a record of every change to InnoDB data. When InnoDB starts, it inspects the data files and the transaction log, and performs two steps:

  1. Applies committed transaction log entries to the data files.
  2. Performs an undo operation (rollback) on any transactions that modified data but did not commit.

The rollback is performed by a background thread, executed in parallel with transactions from new connections. Until the rollback operation is completed, new connections may encounter locking conflicts with recovered transactions. In most situations, even if the MySQL server was killed unexpectedly in the middle of heavy activity, the recovery process happens automatically. No action is needed from the DBA.

Percona Xtrabackup utilizes InnoDB crash recovery functionality to prepare the internally inconsistent backup (the binary copy of MySQL data directory) into a consistent and usable database again.

MVCC

InnoDB is a multiversion concurrency control (MVCC) storage engine, which means many versions of a single row can exist at the same time. Because of this, unlike MyISAM, InnoDB does not require a global read lock to get a consistent read. Instead, it relies on isolation, one of its ACID-compliant transaction properties. Isolation is the “I” in the acronym ACID - the isolation level determines the ability of a transaction to read/write data that is accessed by other transactions.

In order to get a consistent snapshot of InnoDB tables, one could simply start a transaction with the REPEATABLE READ isolation level. In REPEATABLE READ, a read view is created at the start of the transaction, and this read view is held open for the duration of the transaction. For example, if you execute a SELECT statement at 6 AM, come back to the open transaction at 6 PM and run the same SELECT, you will see the exact same result set that you saw at 6 AM. This is part of the MVCC capability, and it is accomplished using row versioning and UNDO information.
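A minimal sketch of this, reusing the account.saving table from the transaction example above:

SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ;
START TRANSACTION WITH CONSISTENT SNAPSHOT;
SELECT SUM(balance) FROM account.saving;  -- the 6 AM result
-- ... other sessions commit changes in the meantime ...
SELECT SUM(balance) FROM account.saving;  -- 6 PM: same result as at 6 AM
COMMIT;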

A logical backup tool like mysqldump uses this approach to generate a consistent backup of InnoDB tables without an explicit table lock that would make the MySQL server read-only; mysqldump’s --single-transaction option wraps the dump in exactly this kind of REPEATABLE READ transaction.

MEMORY

The MEMORY storage engine (formerly known as HEAP) creates special-purpose tables with contents that are stored in memory. Because the data is vulnerable to crashes, hardware issues, or power outages, only use these tables as temporary work areas or read-only caches for data pulled from other tables.

Due to the transient nature of data in MEMORY tables (data is not persisted to disk), only a logical backup is capable of backing up these tables. Backup in physical format is not possible.
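If you are unsure whether any such tables exist, a query along these lines lists all MEMORY tables:

SELECT table_schema, table_name
  FROM information_schema.tables
 WHERE engine = 'MEMORY'
   AND table_schema NOT IN ('information_schema', 'performance_schema');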

That’s it for today, but you can read more about backups in our whitepaper - The DevOps Guide to Database Backups for MySQL and MariaDB.

Online schema change for MySQL & MariaDB - comparing GitHub’s gh-ost vs pt-online-schema-change

Database schema change is one of the most common activities that a MySQL DBA has to tackle. No matter if you use MySQL Replication or Galera Cluster, direct DDLs are troublesome and, sometimes, not feasible to execute. Add the requirement to perform the change while all databases are online, and it can get pretty daunting.

Thankfully, online schema tools are there to help DBAs deal with this problem. Arguably, the most popular of them is Percona’s pt-online-schema-change, which is part of Percona Toolkit.

It has been used by MySQL DBAs for years and is proven as a flexible and reliable tool. Unfortunately, not without drawbacks.

To understand these, we need to understand how it works internally.

How does pt-online-schema-change work?

Pt-online-schema-change works in a very simple way. It creates a temporary table with the desired new schema - for instance, with an added index or a removed column. Then, it creates triggers on the old table - those triggers are there to mirror changes that happen on the original table to the new table. Changes are mirrored for the whole duration of the schema change process. If a row is added to the original table, it is also added to the new one. Likewise, if a row is modified or deleted on the old table, the change is also applied to the new table. Then, a background process of copying data (using LOW_PRIORITY INSERT) between the old and new table begins. Once the data has been copied, RENAME TABLE is executed to rename “yourtable” to “yourtable_old” and “yourtable_new” to “yourtable”. This is an atomic operation, and in case something goes wrong, it is possible to recover the old table.
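For illustration, the mirroring triggers look roughly like the simplified sketch below (pt-online-schema-change generates its own trigger names and full column lists; the names here are illustrative):

CREATE TRIGGER pt_osc_ins AFTER INSERT ON yourtable FOR EACH ROW
  REPLACE INTO yourtable_new (id, col1) VALUES (NEW.id, NEW.col1);
CREATE TRIGGER pt_osc_upd AFTER UPDATE ON yourtable FOR EACH ROW
  REPLACE INTO yourtable_new (id, col1) VALUES (NEW.id, NEW.col1);
CREATE TRIGGER pt_osc_del AFTER DELETE ON yourtable FOR EACH ROW
  DELETE IGNORE FROM yourtable_new WHERE yourtable_new.id = OLD.id;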

The process described above has some limitations. For starters, it is not possible to reduce the overhead of the tool to zero. Pt-online-schema-change gives you the option to define the maximum allowed replication lag and, if that threshold is crossed, it stops copying data between the old and new table. It is also possible to pause the background process entirely. The problem is that this only covers the background process of running INSERTs. It is not possible to reduce the overhead caused by the fact that every operation on “yourtable” is duplicated in “yourtable_new” through triggers. If you removed the triggers, the old and new table would go out of sync without any means to sync them again. Therefore, when you run pt-online-schema-change on your system, it always adds some overhead, even when paused or throttled. How big that overhead is depends on how many writes hit the table undergoing the schema change.

Another issue is, again, caused by triggers - precisely by the fact that, to create triggers, one has to acquire a lock on MySQL’s metadata. This can become a serious problem if you have highly concurrent traffic or use longer transactions. Under such load, it may be virtually impossible (and we’ve seen such databases) to use pt-online-schema-change, because it is not able to acquire the metadata lock to create the required triggers. Additionally, the attempt to acquire the metadata lock can also block further transactions, basically grinding all database operations to a halt.

Yet another problem is foreign keys - unfortunately, there is no simple way of handling them. Pt-online-schema-change gives you two methods to approach this issue; neither of them is really good. The main issue here is that a foreign key of a given name can only refer to a single table, and it sticks to it - even if you rename the referred-to table, the foreign key will follow the change. This leads to a problem: after RENAME TABLE, the foreign key will point to ‘yourtable_old’, not ‘yourtable’.

One workaround is to not use:

RENAME TABLE `yourtable` TO `yourtable_old`, `yourtable_new` TO `yourtable`;

Instead, use a two step approach:

DROP TABLE `yourtable`; RENAME TABLE `yourtable_new` TO `yourtable`;

This poses a serious problem: if, for some reason, RENAME TABLE doesn’t work, there’s no going back, as the original table has already been dropped.

Another approach would be to create a second foreign key, under a different name, which refers to ‘yourtable_new’. After RENAME TABLE, it will point to ‘yourtable’, which is exactly what we want. The thing is, you need to execute a direct ALTER to create such a foreign key - which rather defeats the point of using an online schema change tool: avoiding direct alters. If the altered table is large, such an operation is not feasible on Galera Cluster (a cluster-wide stall caused by TOI) or in a MySQL replication setup (slave lag induced by the serialized ALTER).
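A sketch of this second approach, with illustrative names (child_table, fk_parent2 and parent_id are not from the example above):

-- Direct ALTER, executed before the cut-over:
ALTER TABLE child_table
  ADD CONSTRAINT fk_parent2 FOREIGN KEY (parent_id)
      REFERENCES yourtable_new (id);
-- After RENAME TABLE, fk_parent2 follows the table as it becomes
-- `yourtable`; the old foreign key can then be dropped.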

As you can see, while being a useful tool, pt-online-schema-change has serious limitations which you need to be aware of before you use it. If you use MySQL at scale, limitations may become a serious motivation to do something about it.

Introducing GitHub’s gh-ost

Motivation alone is not enough - you also need resources to create a new solution. GitHub recently released gh-ost, their take on online schema change. Let’s take a look at how it compares to Percona’s pt-online-schema-change and how it can be used to avoid some of its limitations.

To better understand the difference between the two tools, let’s take a look at how gh-ost works.

Gh-ost creates a temporary table with the altered schema, just like pt-online-schema-change does - it uses the “_yourtable_gho” naming pattern. It executes INSERT queries which use the following pattern to copy data from the old to the new table:

insert /* gh-ost `sbtest1`.`sbtest1` */ ignore into `sbtest1`.`_sbtest1_gho` (`id`, `k`, `c`, `pad`)
      (select `id`, `k`, `c`, `pad` from `sbtest1`.`sbtest1` force index (`PRIMARY`)
        where (((`id` > ?)) and ((`id` < ?) or ((`id` = ?)))) lock in share mode

As you can see, it is a variation of INSERT INTO new_table SELECT * FROM old_table. It uses the primary key to split the data into chunks and then works through them one by one.

In pt-online-schema-change, ongoing traffic is handled using triggers. Gh-ost uses a triggerless approach - it reads binary logs to track and apply the changes which have happened since gh-ost started to copy data. It connects to one of the hosts - by default one of the slaves - pretends to be a slave itself, and asks for binary logs.

This behavior has a couple of repercussions. First of all, network traffic is increased compared to pt-online-schema-change - not only does gh-ost have to copy table data, it also has to pull binary logs.

It also requires binary logs in row-based format for full data consistency - if you use statement-based or mixed replication, gh-ost won’t work in your setup. As a workaround, you can create a new slave, enable log_slave_updates and set it to store events in row format. Reading binary logs from a slave is, in fact, the default way in which gh-ost operates - it makes perfect sense, as pulling binary logs adds some overhead, and if you can avoid putting additional overhead on the master, you most likely want to. Of course, if your master uses the row-based replication format, you can force gh-ost to connect to it and get the binary logs there.
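A quick pre-flight check on the host gh-ost will read binary logs from might look like this:

SHOW GLOBAL VARIABLES LIKE 'binlog_format';      -- must be ROW
SHOW GLOBAL VARIABLES LIKE 'log_slave_updates';  -- should be ON when reading from a slave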

What is good about this design is that you don’t have to create triggers, which, as we discussed, could become a serious problem or even a blocker. What is also great is that you can always stop parsing binary logs - it’s as if you just ran STOP SLAVE. You have the binlog coordinates, so you can easily resume from the same position later on. This makes it possible to stop practically all operations executed by gh-ost: not only the background process of copying data from the old to the new table, but also any load related to keeping the new table in sync with the old one. This is a great feature in a production environment - pt-online-schema-change requires constant monitoring, as you can only estimate the additional load on the system. Even if you pause it, it will still add some overhead and, under heavy load, this overhead may result in an unstable database. With gh-ost, on the other hand, you can just pause the whole process and the workload pattern will go back to what you are used to seeing - no additional load whatsoever related to the schema change. This is really great - it means you can start the migration at 9am when you start your day, and stop it at 5pm when you leave the office. You can be sure that you won’t get paged late at night because a paused schema change process is not actually 100% paused and is causing problems for your production systems.

Unfortunately, gh-ost is not without drawbacks. For starters, foreign keys: pt-online-schema-change does not provide any good way of altering tables which contain them, but it is still way ahead of gh-ost, which does not support foreign keys at all - at the moment of writing, that is; this may change in the future. Triggers: gh-ost, at the moment of writing, does not support them at all. The same is true for pt-online-schema-change - this was a limitation of pre-5.7 MySQL, where you couldn’t have more than one trigger of a given type defined on a table (and pt-online-schema-change had to create them for its own purposes). Even though the limitation is removed in MySQL 5.7, pt-online-schema-change still does not support tables with triggers.

One of the main limitations of gh-ost is, definitely, the fact that it does not support Galera Cluster. This is because of how gh-ost performs the table switch - it uses LOCK TABLES, which does not work well with Galera. As of now, there is no known fix or workaround for this issue, which leaves pt-online-schema-change as the only option for Galera Cluster.

These are probably the most important limitations of gh-ost, but there are more. The minimal row image is not supported (which makes your binlogs grow larger), and JSON and generated columns in 5.7 are not supported. The migration key must not contain NULL values, and there are limitations around mixed-case table names. You can find more details on all requirements and limitations of gh-ost in its documentation.

In our next blog post we will take a look at how gh-ost operates in practice, how you can test your changes, and how to perform the actual migration. We will also discuss throttling of gh-ost.


Planets9s - Online schema change for MySQL & MariaDB, MySQL storage engine & backups … and more

Welcome to this week’s Planets9s, covering all the latest resources and technologies we create around automation and management of open source database infrastructures.

Online schema change for MySQL & MariaDB: GitHub’s gh-ost & pt-online-schema-change

Online schema changes are unavoidable, as any DBA will know. While there are tools such as Percona’s pt-online-schema-change to assist, they do not come without drawbacks. However, there is a new kid on the block: GitHub has released an online schema change tool called gh-ost. This post by Krzysztof Ksiazek, Senior Support Engineer at Severalnines, looks at how gh-ost compares to pt-online-schema-change, and how it can be used to address some of its limitations.

Read the blog

The choice of MySQL storage engine and its impact on backup procedures

As you will know, MySQL offers multiple storage engines to store its data, with InnoDB and MyISAM being the most popular ones. And this has an impact on how you design and run your backup procedures. Since data is stored inside the storage engine, we need to understand how the storage engines work to determine the best backup tool. This post by Ashraf Sharif, System Support Engineer at Severalnines, provides the necessary insight into these topics and recommendations on how best to proceed.

Read the blog

Want an easy way to deploy & monitor Galera Cluster in the cloud?

If you haven’t seen it yet, we’ve recently launched a new tool that allows you to easily deploy and monitor Galera Clusters on the AWS and DigitalOcean clouds. NinesControl allows quick, easy, point-and-click deployment and monitoring of standalone or clustered SQL and NoSQL databases. Each provisioned deployment is automated, repeatable and completes in minutes. It also provides real-time monitoring, self-healing and automatic recovery features. Find out more and get started via the link below.

Check out NinesControl

That’s it for this week! Feel free to share these resources with your colleagues and follow us in our social media channels.

Have a good end of the week,

Jean-Jérôme Schmidt
Planets9s Editor
Severalnines AB

How to Perform Efficient Backup for MySQL and MariaDB

All backup methods have their pros and cons. They also affect database workloads differently. Your backup strategy will depend upon the business requirements, the environment you operate in, and the resources at your disposal. Backups are usually planned according to your restoration requirements. Data loss can be full or partial, and you do not always need to recover the whole dataset. In some cases, you might just want to do a partial recovery by restoring missing tables or rows. In that case, you will need a combination of Percona Xtrabackup, mysqldump and binary logs to cover the different scenarios.

Performing a backup on MySQL or MariaDB is not that hard, but to be efficient, we do need to understand the effects of each and every procedure. It also depends on a number of factors like storage engine, recovery objective, dataset and delta size, storage capability and capacity, security as well as high availability design and architecture.

One of the most important things in performing a backup is to make sure you get a consistent backup. Backing up non-transactional tables like MyISAM and MEMORY requires the tables to be locked to guarantee consistency; this can be done using the global read lock (FLUSH TABLES WITH READ LOCK). Consequently, the global lock will temporarily make the server read-only. For InnoDB, locking is unnecessary and other DML operations are allowed to execute while the backup is running.

In terms of backup size, if you have limited storage space backed by an outdated disk subsystem, compression is your friend. Compression is a CPU-intensive process and can directly impact the performance of your MySQL server. However, if it can be scheduled during periods of low traffic, compression can save you a lot of space. It is a tradeoff between processing power and storage space, and it reduces the risk of a server crash caused by a full disk.

If your database workload is write-intensive, you might find the difference in size (delta) between the two latest full backups to be fairly big, for example 1GB for a 10GB dataset per day. Performing regular full backups on databases with this kind of workload will likely introduce performance degradation, and it might be more efficient to perform incremental backups. Ultimately, this kind of workload will bring the database to a state where the backup size is rapidly growing and physical backup might be the only way to go.

When creating an encrypted backup, one thing to keep in mind is that it usually takes more time to recover: the backup has to be decrypted prior to any recovery activities. With a large dataset, this could introduce delays to the RTO. On the other hand, if you are using a private key for encryption, make sure to store the key in a safe place. If the private key is lost, the backup will be useless and unrecoverable. If the key is stolen, all backups created with the same key are compromised, as they are no longer secured.

It is common nowadays to have a high availability setup using either MySQL Replication or MySQL/MariaDB Galera Cluster. It is not necessary to backup all members in the replication chain or cluster. Since all nodes are expected to hold the same data (unless the dataset is sharded across different nodes), it is recommended to perform backup on only one node (or one per shard).

Taking MySQL backups on a dedicated backup server will simplify your backup plans. A dedicated backup server is usually an isolated slave connected to the production servers via asynchronous replication. A good backup server has plenty of disk space for backup storage, with the ability to take storage snapshots. Since it uses loosely-coupled asynchronous replication, it is unlikely to cause additional overhead on the production database. However, this server can become a single point of failure, with the risk of inconsistent backups if it regularly lags behind.

As we have seen, there are quite a few things to consider in order to make efficient backups of MySQL and MariaDB. Each of the mentioned points is discussed in depth, together with example use-cases and best practices, in our latest whitepaper - The DevOps Guide to Database Backups for MySQL and MariaDB.

How to perform online schema changes on MySQL using gh-ost

In the previous blog post, we discussed how gh-ost works internally and how it compares to Percona’s pt-online-schema-change. Today we’d like to focus on operations - how can we test a schema change with gh-ost to verify it can be executed without any issues? And how do we go ahead and perform the actual schema change?

Testing migration

Ensuring that a migration will go smoothly is one of the most important steps in the whole schema change process. If you value your data, then you definitely want to avoid any risk of data corruption or partial data transformation. Let’s see how gh-ost allows you to test your migration.

gh-ost gives you numerous ways to test. First of all, you can execute a no-op migration by skipping the --execute flag. Let’s look at an example - we want to add a column to a table.

root@ip-172-30-4-235:~# ./gh-ost --host=172.30.4.235 --user=sbtest --password=sbtest --database=sbtest1 --table=sbtest1 --alter="ADD COLUMN x INT NOT NULL DEFAULT '0'" --chunk-size=2000 --max-load=Threads_connected=20

Here we pass access details like the user, password, database and the table to alter. We also define what change needs to be applied. Finally, we define the chunk size for the background copy process and what we treat as max load. The max load can reference different MySQL status counters (not all of them make sense) - we used Threads_connected, but we could use, for example, Threads_running. Once the threshold is crossed, gh-ost starts to throttle writes.

# Migrating `sbtest1`.`sbtest1`; Ghost table is `sbtest1`.`_sbtest1_gho`
# Migrating ip-172-30-4-4:3306; inspecting ip-172-30-4-235:3306; executing on ip-172-30-4-235
# Migration started at Tue Dec 20 14:00:45 +0000 2016
# chunk-size: 2000; max-lag-millis: 1500ms; max-load: Threads_connected=20; critical-load: ; nice-ratio: 0.000000

Next, we see information about the migration - which table we alter and which table is used as the ghost (temporary) table. Gh-ost creates two tables: the one with the _gho suffix is the temporary table with the new schema, and it is the target of the data copying process. The second table, with the _ghc suffix, stores migration logs and status. We can also see a couple of other defaults - the maximum acceptable lag is 1500 milliseconds (1.5 seconds); gh-ost can work with an external lag query to achieve up to millisecond granularity for lag control. If you don’t set the --replication-lag-query flag, Seconds_Behind_Master from SHOW SLAVE STATUS will be used, which has a granularity of one second.
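For illustration, a sub-second lag query passed via --replication-lag-query could look like the sketch below; it assumes a pt-heartbeat-style meta.heartbeat table (an assumption, not part of this setup) which a daemon keeps updating on the master:

-- Returns the replication lag in (fractional) seconds:
SELECT UNIX_TIMESTAMP(NOW(6)) - MAX(UNIX_TIMESTAMP(ts)) AS lag
  FROM meta.heartbeat;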

# throttle-additional-flag-file: /tmp/gh-ost.throttle
# Serving on unix socket: /tmp/gh-ost.sbtest1.sbtest1.sock

Here we have information about the throttle flag file - creating it will automatically trigger throttling in gh-ost. We also have a Unix socket file, which can be used to control gh-ost’s configuration at runtime.

Copy: 0/0 100.0%; Applied: 0; Backlog: 0/100; Time: 1s(total), 0s(copy); streamer: binlog.000042:102283; State: migrating; ETA: due
CREATE TABLE `_sbtest1_gho` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `k` int(10) unsigned NOT NULL DEFAULT '0',
  `c` char(120) NOT NULL DEFAULT '',
  `pad` char(60) NOT NULL DEFAULT '',
  `x` int(11) NOT NULL DEFAULT '0',
  PRIMARY KEY (`id`),
  KEY `k_1` (`k`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 MAX_ROWS=1000000

Finally, we have information about progress - nothing interesting here as we ran a no-op change. We also have information about the schema of the target table.

Now that we have tested the no-op change, it’s time for some more real-life tests. Again, gh-ost gives you an option to verify that everything goes as planned. What we can do is use one of our database replicas to run the change on, and verify it went fine. Gh-ost will stop replication for us as soon as the change completes, to ensure that we can compare the data in the old and new tables. It’s not so easy to compare tables with different schemas, so we may want to start with a change which doesn’t actually change anything. For example:

ALTER TABLE … ENGINE=InnoDB;

Let’s run this migration to verify that gh-ost actually does its job correctly:

root@ip-172-30-4-235:~# ./gh-ost --host=172.30.4.235 --user=sbtest --password=sbtest --database=sbtest1 --table=sbtest1 --alter="ENGINE=InnoDB" --chunk-size=2000 --max-load=Threads_connected=20 --test-on-replica --execute

Once it’s done, you will see your slave in the following state.

mysql> \P grep Running
PAGER set to 'grep Running'
mysql> SHOW SLAVE STATUS\G
             Slave_IO_Running: No
            Slave_SQL_Running: No
      Slave_SQL_Running_State:
1 row in set (0.00 sec)

Replication has been stopped so no new changes are being added.

mysql> SHOW TABLES FROM sbtest1;
+-------------------+
| Tables_in_sbtest1 |
+-------------------+
| _sbtest1_gho      |
| sbtest1           |
+-------------------+
2 rows in set (0.00 sec)

The ghost table has been left for you to look into. Since we ran a no-op alter, we can compare both tables to verify that the whole process worked flawlessly. There are a couple of methods to do that. You can, for example, dump the table contents via SELECT … INTO OUTFILE and then compare the md5 sums of both dump files. You can also use the CHECKSUM TABLE command in MySQL:

mysql> CHECKSUM TABLE sbtest1.sbtest1, sbtest1._sbtest1_gho EXTENDED;
+----------------------+-----------+
| Table                | Checksum  |
+----------------------+-----------+
| sbtest1.sbtest1      | 851491558 |
| sbtest1._sbtest1_gho | 851491558 |
+----------------------+-----------+
2 rows in set (9.27 sec)

As long as checksums are identical (no matter how you calculated them), you should be safe to assume that both tables are identical and the migration process went fine.

Performing an actual migration

Once we have verified that gh-ost can execute our schema change correctly, it’s time to actually execute it. Keep in mind that you may need to manually drop the old tables that were created by gh-ost while testing the migration. You can also use the --initially-drop-ghost-table and --initially-drop-old-table flags to ask gh-ost to do it for you. The final command is exactly the same as the one we used to test our change, we just add --execute to it.

./gh-ost --host=172.30.4.235 --user=sbtest --password=sbtest --database=sbtest1 --table=sbtest1 --alter="ADD COLUMN x INT NOT NULL DEFAULT '0'" --chunk-size=2000 --max-load=Threads_connected=20 --execute

Once started, we’ll see a summary of the job. The main change is that the “migrating” host points to our master, 172.30.4.4, and we use one of the slaves, 172.30.4.235, to look for binary logs.

# Migrating `sbtest1`.`sbtest1`; Ghost table is `sbtest1`.`_sbtest1_gho`
# Migrating ip-172-30-4-4:3306; inspecting ip-172-30-4-235:3306; executing on ip-172-30-4-235
# Migration started at Fri Dec 23 19:18:00 +0000 2016
# chunk-size: 2000; max-lag-millis: 1500ms; max-load: Threads_connected=20; critical-load: ; nice-ratio: 0.000000
# throttle-additional-flag-file: /tmp/gh-ost.throttle
# Serving on unix socket: /tmp/gh-ost.sbtest1.sbtest1.sock

We can also see progress messages printed by gh-ost:

Copy: 0/9982267 0.0%; Applied: 0; Backlog: 7/100; Time: 4s(total), 0s(copy); streamer: binlog.000074:808522953; State: migrating; ETA: N/A
Copy: 0/9982267 0.0%; Applied: 538; Backlog: 100/100; Time: 5s(total), 1s(copy); streamer: binlog.000074:808789786; State: migrating; ETA: N/A
Copy: 0/9982267 0.0%; Applied: 1079; Backlog: 100/100; Time: 6s(total), 2s(copy); streamer: binlog.000074:809092031; State: migrating; ETA: N/A
Copy: 0/9982267 0.0%; Applied: 1580; Backlog: 100/100; Time: 7s(total), 3s(copy); streamer: binlog.000074:809382067; State: migrating; ETA: N/A
Copy: 0/9982267 0.0%; Applied: 2171; Backlog: 84/100; Time: 8s(total), 4s(copy); streamer: binlog.000074:809718243; State: migrating; ETA: N/A
Copy: 4000/9982267 0.0%; Applied: 2590; Backlog: 33/100; Time: 9s(total), 5s(copy); streamer: binlog.000074:810697550; State: migrating; ETA: N/A
Copy: 12000/9982267 0.1%; Applied: 3006; Backlog: 5/100; Time: 10s(total), 6s(copy); streamer: binlog.000074:812459945; State: migrating; ETA: N/A
Copy: 28000/9982267 0.3%; Applied: 3348; Backlog: 12/100; Time: 11s(total), 7s(copy); streamer: binlog.000074:815749963; State: migrating; ETA: N/A
Copy: 46000/9982267 0.5%; Applied: 3736; Backlog: 0/100; Time: 12s(total), 8s(copy); streamer: binlog.000074:819054426; State: migrating; ETA: N/A
Copy: 60000/9982267 0.6%; Applied: 4032; Backlog: 4/100; Time: 13s(total), 9s(copy); streamer: binlog.000074:822321562; State: migrating; ETA: N/A
Copy: 78000/9982267 0.8%; Applied: 4340; Backlog: 12/100; Time: 14s(total), 10s(copy); streamer: binlog.000074:825982397; State: migrating; ETA: N/A
Copy: 94000/9982267 0.9%; Applied: 4715; Backlog: 0/100; Time: 15s(total), 11s(copy); streamer: binlog.000074:829283130; State: migrating; ETA: N/A
Copy: 114000/9982267 1.1%; Applied: 5060; Backlog: 24/100; Time: 16s(total), 12s(copy); streamer: binlog.000074:833357982; State: migrating; ETA: 17m19s
Copy: 130000/9982267 1.3%; Applied: 5423; Backlog: 16/100; Time: 17s(total), 13s(copy); streamer: binlog.000074:836654200; State: migrating; ETA: 16m25s

From those we can see how many rows were copied, how many events have been applied from the binary logs, whether there is a backlog of binlog events to apply, how long the whole process and the copying of data have taken, the binlog coordinates where gh-ost is looking for new events, the state of the job (migrating, throttled, etc.) and the estimated time to complete the process.

It is important to remember that the number of rows to copy is just an estimate, based on the EXPLAIN output for:

SELECT * FROM yourschema.yourtable;

You can see it in the ‘rows’ column of the EXPLAIN output below, and in the gh-ost status output:

mysql> EXPLAIN SELECT * FROM sbtest1.sbtest1\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: sbtest1
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 9182788
     filtered: 100.00
        Extra: NULL
1 row in set, 1 warning (0.00 sec)
Copy: 0/9182788 0.0%; Applied: 0; Backlog: 0/100; Time: 1m15s(total), 0s(copy); streamer: binlog.000111:374831609; State: migrating; ETA: N/A
Copy: 0/9182788 0.0%; Applied: 0; Backlog: 100/100; Time: 1m20s(total), 5s(copy); streamer: binlog.000111:374945268; State: throttled, lag=33.166494s; ETA: N/A
Copy: 0/9182788 0.0%; Applied: 0; Backlog: 100/100; Time: 1m25s(total), 10s(copy); streamer: binlog.000111:374945268; State: throttled, lag=2.766375s; ETA: N/A
Copy: 0/9182788 0.0%; Applied: 1907; Backlog: 100/100; Time: 1m30s(total), 15s(copy); streamer: binlog.000111:375777140; State: migrating; ETA: N/A
Copy: 0/9182788 0.0%; Applied: 4543; Backlog: 100/100; Time: 1m35s(total), 20s(copy); streamer: binlog.000111:376924495; State: migrating; ETA: N/A

If you are interested in precise numbers, you can use the --exact-rowcount flag. With it, gh-ost will execute SELECT COUNT(*) FROM yourtable;, making sure that the number of rows is calculated precisely.

After some time, gh-ost should complete the change, leaving the old table with a _del suffix (_yourtable_del). In case something went wrong, you can still recover the old data and then, using binary logs, replay any events which are missing. Obviously, it’s not the cleanest or fastest way to recover, but it has been made possible - we’d surely take it over data loss.

What we described above is the default way in which gh-ost performs a migration - read binary logs from a slave, inspect the table on a slave and execute the changes on the master. This way we minimize the extra load put on the master. If you’d like to execute all your changes on the master, that is possible, as long as your master uses the row-based replication format.

To execute our change on the master, we need to run gh-ost as below. We use our master’s IP in the --host flag. We also use the --allow-on-master flag to tell gh-ost that we are going to run the whole process on the master only.

./gh-ost --host=172.30.4.4 --user=sbtest --password=sbtest --database=sbtest1 --table=sbtest1 --alter="ADD COLUMN x INT NOT NULL DEFAULT '0'" --chunk-size=2000 --max-load=Threads_connected=20 --allow-on-master --execute

As you can clearly see, gh-ost gives you numerous ways to ensure the schema change is performed smoothly and safely. We cannot stress enough how important it is for a DBA to have a way to test every operation. Flexibility is also very welcome - the default behavior of reducing load on the master makes perfect sense, but it is good that gh-ost still allows you to execute everything on the master only.

In the next blog post, we are going to discuss some safety measures that come with gh-ost. Namely, we will talk about its throttling mechanism and ways to perform runtime configuration changes.

Tips and Tricks - How to shard MySQL with ProxySQL in ClusterControl

Having too large a (write) workload on a master is dangerous. If the master collapses and a failover happens to one of its slave nodes, the slave node could collapse under the write pressure as well. To mitigate this problem you can shard horizontally across more nodes.

Sharding increases the complexity of data storage though, and very often, it requires an overhaul of the application. In some cases, it may be impossible to make changes to an application. Luckily there is a simpler solution: functional sharding. With functional sharding, you move a schema or table to another master, thus relieving the master of the workload of those schemas or tables.

In this Tips & Tricks post, we will explain how you can functionally shard your existing master and offload some of its workload to another master. We will use ClusterControl, MySQL replication and ProxySQL to make this happen, and the whole exercise should not take longer than 15 minutes. Mission impossible? :-)

The example database

In our example we have a serious issue with the workload on our simple order database, accessed by the so_user. The majority of the writes happen on two tables: orders and order_status_log. Every change to an order will write to both the orders table and the status log table.

CREATE TABLE `orders` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `customer_id` int(11) NOT NULL,
  `status` varchar(14) DEFAULT 'created',
  `total_vat` decimal(15,2) DEFAULT '0.00',
  `total` decimal(15,2) DEFAULT '0.00',
  `created` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  `updated` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
CREATE TABLE `order_status_log` (
  `orderId` int(11) NOT NULL,
  `status` varchar(14) DEFAULT 'created',
  `changeTime` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  `logline` text,
  PRIMARY KEY (`orderId`, `status`, `changeTime` )
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
CREATE TABLE `customers` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `firstname` varchar(15) NOT NULL,
  `surname` varchar(80) NOT NULL,
  `address` varchar(255) NOT NULL,
  `postalcode` varchar(6) NOT NULL,
  `city` varchar(50) NOT NULL,
  `state` varchar(50) NOT NULL,
  `country` varchar(50) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

What we will do is move the order_status_log table to another master.

As you might have noticed, there is no foreign key defined on the order_status_log table. A foreign key simply would not work across functional shards. Joining the order_status_log table with any other table would no longer work either, as it will physically be on a different server than the other tables. And if you write transactional data to multiple tables, a rollback will only work on one of the masters. If you wish to retain these properties, you should consider using homogeneous sharding instead, where related data is kept grouped together in the same shard.

Installing the Replication setups

First, we will install a replication setup in ClusterControl. The topology in our example is really basic: we deploy one master and one replica:

But you could import your own existing replication topology into ClusterControl as well.

After the setup has been deployed, deploy the second setup:

While waiting for the second setup to be deployed, we will add ProxySQL to the first replication setup:

Adding the second setup to ProxySQL

After ProxySQL has been deployed, we can connect to it via the command line and see its currently configured servers and settings:

MySQL [(none)]> select hostgroup_id, hostname, port, status, comment from mysql_servers;
+--------------+-------------+------+--------+-----------------------+
| hostgroup_id | hostname    | port | status | comment               |
+--------------+-------------+------+--------+-----------------------+
| 20           | 10.10.36.11 | 3306 | ONLINE | read server           |
| 20           | 10.10.36.12 | 3306 | ONLINE | read server           |
| 10           | 10.10.36.11 | 3306 | ONLINE | read and write server |
+--------------+-------------+------+--------+-----------------------+
MySQL [(none)]> select rule_id, active, username, schemaname, match_pattern, destination_hostgroup from mysql_query_rules;
+---------+--------+----------+------------+---------------------------------------------------------+-----------------------+
| rule_id | active | username | schemaname | match_pattern                                           | destination_hostgroup |
+---------+--------+----------+------------+---------------------------------------------------------+-----------------------+
| 100     | 1      | NULL     | NULL       | ^SELECT .* FOR UPDATE                                   | 10                    |
| 200     | 1      | NULL     | NULL       | ^SELECT .*                                              | 20                    |
| 300     | 1      | NULL     | NULL       | .*                                                      | 10                    |
+---------+--------+----------+------------+---------------------------------------------------------+-----------------------+

As you can see, ProxySQL has been configured with the ClusterControl default read/write splitter for our first cluster. Any basic select query will be routed to hostgroup 20 (read pool) while all other queries will be routed to hostgroup 10 (master). What is missing here is the information about the second cluster, so we will add the hosts of the second cluster first:

MySQL [(none)]> INSERT INTO mysql_servers VALUES (30, '10.10.36.13', 3306, 'ONLINE', 1, 0, 100, 10, 0, 0, 'Second repl setup read server'), (30, '10.10.36.14', 3306, 'ONLINE', 1, 0, 100, 10, 0, 0, 'Second repl setup read server');
Query OK, 2 rows affected (0.00 sec) 
MySQL [(none)]> INSERT INTO mysql_servers VALUES (40, '10.10.36.13', 3306, 'ONLINE', 1, 0, 100, 10, 0, 0, 'Second repl setup read and write server');
Query OK, 1 row affected (0.00 sec)

After this we need to load the servers to ProxySQL runtime tables and store the configuration to disk:

MySQL [(none)]> LOAD MYSQL SERVERS TO RUNTIME;
Query OK, 0 rows affected (0.00 sec)
MySQL [(none)]> SAVE MYSQL SERVERS TO DISK;
Query OK, 0 rows affected (0.01 sec)

As ProxySQL handles authentication for the clients as well, we need to add the so_user user to ProxySQL to allow the application to connect through ProxySQL:

MySQL [(none)]> INSERT INTO mysql_users (username, password, active, default_hostgroup, default_schema) VALUES ('so_user', 'so_pass', 1, 10, 'simple_orders');
Query OK, 1 row affected (0.00 sec)
MySQL [(none)]> LOAD MYSQL USERS TO RUNTIME;
Query OK, 0 rows affected (0.00 sec)
MySQL [(none)]> SAVE MYSQL USERS TO DISK;
Query OK, 0 rows affected (0.00 sec)

Now we have added the second cluster and the user to ProxySQL. Keep in mind that normally in ClusterControl the two clusters are considered two separate entities. ProxySQL will remain part of the first cluster; even though it is now configured for the second cluster, it will only be displayed under the first cluster.

Mirroring the data

Keep in mind that mirroring queries in ProxySQL is still a beta feature, and it doesn’t guarantee the mirrored queries will actually be executed. We have found it to work fine within the boundaries of this use case. Also, there are (better) alternatives to our example here, where you would make use of a restored backup on the new cluster and replicate from the master until you make the switch. We will describe this scenario in a follow-up Tips & Tricks blog post.

Now that we have added the second cluster, we need to create the simple_orders database, the order_status_log table and the appropriate users on the master of the second cluster:

mysql> create database simple_orders;
Query OK, 1 row affected (0.01 sec)
mysql> use simple_orders;
Database changed
mysql> CREATE TABLE `order_status_log` (
  `orderId` int(11) NOT NULL,
  `status` varchar(14) DEFAULT 'created',
  `changeTime` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  `logline` text,
  PRIMARY KEY (`orderId`, `status`, `changeTime` )
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Query OK, 0 rows affected (0.00 sec)
mysql> create user 'so_user'@'10.10.36.15' identified by 'so_pass';
Query OK, 0 rows affected (0.00 sec)
mysql> grant select, update, delete, insert on simple_orders.* to 'so_user'@'10.10.36.15';
Query OK, 0 rows affected (0.00 sec)

This enables us to start mirroring the queries executed against the first cluster onto the second cluster. This requires an additional query rule to be defined in ProxySQL:

MySQL [(none)]> INSERT INTO mysql_query_rules (rule_id, active, username, schemaname, match_pattern, destination_hostgroup, mirror_hostgroup, apply) VALUES (50, 1, 'so_user', 'simple_orders', '(^INSERT INTO|^REPLACE INTO|^UPDATE|INTO TABLE) order_status_log', 20, 40, 1);
Query OK, 1 row affected (0.00 sec)
MySQL [(none)]> LOAD MYSQL QUERY RULES TO RUNTIME;
Query OK, 1 row affected (0.00 sec)

In this rule, ProxySQL will match everything that writes to the order_status_log table and, in addition to routing it to its normal destination, mirror it to hostgroup 40 (the write server of the second cluster).

Now that we have started mirroring the queries, the backfill of the data from the first cluster can take place. You can use the timestamp of the first entry in the new order_status_log table to determine when mirroring started.

Once the data has been backfilled, we can reconfigure ProxySQL to perform all actions on the order_status_log table on the second cluster. This is a two-step approach: first, add new rules to move the read queries to the second cluster’s read servers, with a separate rule sending the SELECT … FOR UPDATE queries to its write server. Then, modify our mirroring rule to stop mirroring and write only to the second cluster.

MySQL [(none)]> INSERT INTO mysql_query_rules (rule_id, active, username, schemaname, match_pattern, destination_hostgroup, apply) VALUES (70, 1, 'so_user', 'simple_orders', '^SELECT .* FROM order_status_log', 30, 1), (60, 1, 'so_user', 'simple_orders', '^FROM order_status_log .* FOR UPDATE', 40, 1);
Query OK, 2 rows affected (0.00 sec)
MySQL [(none)]> UPDATE mysql_query_rules SET destination_hostgroup=40, mirror_hostgroup=NULL WHERE rule_id=50;
Query OK, 1 row affected (0.00 sec)

And don’t forget to activate and persist the new query rules:

MySQL [(none)]> LOAD MYSQL QUERY RULES TO RUNTIME;
Query OK, 1 row affected (0.00 sec)
MySQL [(none)]> SAVE MYSQL QUERY RULES TO DISK;
Query OK, 0 rows affected (0.05 sec)
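
To double-check what is actually in effect, the runtime tables reflect the active configuration - a quick sanity check could look like this (output omitted):

MySQL [(none)]> SELECT rule_id, active, match_pattern, destination_hostgroup, mirror_hostgroup FROM runtime_mysql_query_rules ORDER BY rule_id;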

After this final step we should see the workload drop on the first cluster, and increase on the second cluster. Mission possible and accomplished. Happy clustering!

How to use the ClusterControl Query Monitor for MySQL, MariaDB and Percona Server


The MySQL database workload is determined by the number of queries that it processes. There are several situations from which MySQL slowness can originate. A first possibility is queries that do not use proper indexing. When a query cannot make use of an index, the MySQL server has to use more resources and time to process it. By monitoring queries, you have the ability to pinpoint the SQL code that is the root cause of a slowdown.

By default, MySQL provides several built-in tools to monitor queries, namely:

  • Slow Query Log - Captures queries that exceed a defined threshold, or queries that do not use indexes.
  • General Query Log - Captures all queries executed on a MySQL server.
  • SHOW FULL PROCESSLIST statement (or through the mysqladmin command) - Monitors live queries currently being processed by the MySQL server.
  • PERFORMANCE_SCHEMA - Monitors MySQL Server execution at a low level.

There are also open-source tools out there that can achieve similar results, such as mtop and Percona’s pt-query-digest.

How ClusterControl monitors queries

ClusterControl does not only monitor your hosts and database instances, it also monitors your database queries. It gets the information in two different ways:

  • Queries are retrieved from PERFORMANCE_SCHEMA
  • If PERFORMANCE_SCHEMA is disabled or unavailable, ClusterControl will parse the content of the Slow Query Log

ClusterControl starts reading from the PERFORMANCE_SCHEMA tables immediately when the query monitor is enabled, and the following tables are used by ClusterControl to sample the queries:

  • performance_schema.events_statements_summary_by_digest
  • performance_schema.events_statements_current
  • performance_schema.threads
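
For illustration, sampling the digest table manually with a query along these lines (not necessarily the exact statement ClusterControl issues) shows the kind of aggregated data available:

mysql> SELECT DIGEST_TEXT, COUNT_STAR,
    ->        ROUND(AVG_TIMER_WAIT/1000000000, 2) AS avg_latency_ms
    -> FROM performance_schema.events_statements_summary_by_digest
    -> ORDER BY SUM_TIMER_WAIT DESC LIMIT 5;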

In older versions of MySQL (5.5), having PERFORMANCE_SCHEMA (P_S) enabled might not be an option since it can cause significant performance degradation. With MySQL 5.6 the overhead is reduced, and even more so in 5.7. P_S offers great introspection of the server at an overhead of a few percent (1-3%). If the overhead is a concern, ClusterControl can parse the Slow Query Log remotely to sample queries. Note that no agents are required on your database servers. It uses the following flow:

  1. Start slow log (during MySQL runtime).
  2. Run it for a short period of time (a second or couple of seconds).
  3. Stop log.
  4. Parse log.
  5. Truncate log (ClusterControl creates new log file).
  6. Go to 1.

As you can see, ClusterControl does the above trick when pulling and parsing the Slow Query log to overcome the problems with offsets. The drawback of this method is that the continuous sampling might miss some queries during steps 3 to 5. Hence, if continuous query sampling is vital for you and part of your monitoring policy, the best way is to use P_S. If enabled, ClusterControl will automatically use it.

The collected queries are hashed and digested (normalized, averaged, counted and sorted) and then stored in ClusterControl.

Enabling Query Monitoring

As mentioned earlier, ClusterControl monitors MySQL queries in two ways:

  • Fetch the queries from PERFORMANCE_SCHEMA
  • Parse the content of the MySQL Slow Query Log

Performance Schema (Recommended)

First of all, if you would like to use Performance Schema, enable it on all MySQL servers (MySQL/MariaDB v5.5.3 and later). Enabling this requires a MySQL restart. Add the following line to your MySQL configuration file:

performance_schema = ON

Then, restart the MySQL server. For ClusterControl users, you can use the configuration management feature at Manage -> Configurations -> Change Parameter and perform a rolling restart at Manage -> Upgrades -> Rolling Restart.
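
Once the server is back up, a quick way to confirm it is active:

mysql> SHOW GLOBAL VARIABLES LIKE 'performance_schema';
+--------------------+-------+
| Variable_name      | Value |
+--------------------+-------+
| performance_schema | ON    |
+--------------------+-------+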

Once enabled, ensure at least events_statements_current is enabled:

mysql> SELECT * FROM performance_schema.setup_consumers WHERE NAME LIKE 'events_statements%';
+--------------------------------+---------+
| NAME                           | ENABLED |
+--------------------------------+---------+
| events_statements_current      | YES     |
| events_statements_history      | NO      |
| events_statements_history_long | NO      |
+--------------------------------+---------+

Otherwise, run the following statement to enable it:

UPDATE performance_schema.setup_consumers SET ENABLED = 'YES' WHERE NAME = 'events_statements_current';

MySQL Slow Query Log

If Performance Schema is disabled, ClusterControl will default to the Slow Query Log. In that case, you don’t have to do anything up front, since the log can be turned on and off dynamically at runtime via SET statements.
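
ClusterControl handles this for you, but for reference, enabling the Slow Query Log manually at runtime would look something like this (the threshold values are illustrative):

mysql> SET GLOBAL slow_query_log = 'ON';
mysql> SET GLOBAL long_query_time = 0.5;
mysql> SET GLOBAL log_queries_not_using_indexes = 'ON';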

The Query Monitoring function must be toggled to on under ClusterControl -> Query Monitor -> Top Queries. ClusterControl will monitor queries on all database nodes under this cluster:

Click on “Settings” and configure “Long Query Time” and toggle “Log queries not using indexes” to On. If you have defined the two parameters (long_query_time and log_queries_not_using_indexes) inside my.cnf and you would like to use those values instead, toggle “MySQL Local Query Override” to On. Otherwise, ClusterControl will use the values configured in this Settings dialog.

Once enabled, you just need to wait a couple of minutes before you can see data under Top Queries and Query Histogram.

How ClusterControl visualizes the queries

Under the Query Monitor tab, you should see the following three items:

  • Top Queries

  • Running Queries

  • Query Histogram

We’ll have a quick look at these here, but remember that you can always find more details in the ClusterControl documentation.

Top Queries

Top Queries is an aggregated list of all your top queries running on all the nodes of your cluster. The list can be ordered by “Occurrence” or “Execution Time”, to show the most common or slowest queries respectively. You don’t have to log in to each of the servers to see the top queries. The UI provides an option to filter based on MySQL server.

If you are using the Slow Query log, only queries that exceed the “Long Query Time” will be listed here. If the data is not populated correctly and you believe that there should be something in there, it could be:

  • ClusterControl did not collect enough queries to summarize and populate data. Try to lower the “Long Query Time”.
  • You have configured Slow Query Log options in the my.cnf of the MySQL server, but “MySQL Local Query Override” is turned off. If you really want to use the values you defined inside my.cnf, turn it on, and you probably also need to lower the long_query_time value so ClusterControl can calculate more accurate results.
  • You have another ClusterControl node pulling the Slow Query log as well (in case you have a standby ClusterControl server). Only allow one ClusterControl server to do this job.

The “Long Query Time” value can be specified to a resolution of microseconds, for example 0.000001 (1 × 10⁻⁶ seconds). The following shows a screenshot of what’s under Top Queries:

Clicking on each query will show its execution plan, similar to the output of the EXPLAIN command:

Running Queries

Running Queries provides an aggregated view of the queries currently running across all nodes in the cluster, similar to the SHOW FULL PROCESSLIST command in MySQL. You can stop a running query by killing the connection that started it. The process list can be filtered by host.

Use this feature to monitor live queries currently running on MySQL servers. By clicking on each row that contains “Info”, you can see the extended information containing the full query statement and the query plan:

Query Histogram

The Query Histogram shows you queries that are outliers. An outlier is a query that takes longer than a normal query of its type. Use this feature to filter out the outliers for a certain time period. This feature depends on the Top Queries feature above: if Query Monitoring is enabled and Top Queries are captured and populated, the Query Histogram will summarize these and provide a filter based on timestamp.

That’s all folks! Monitoring queries is as important as monitoring your hosts or MySQL instances, to make sure your database is performing well.

Online schema change with gh-ost - throttling and changing configuration at runtime

$
0
0

In previous posts, we gave an overview of gh-ost and showed you how to test your schema changes before executing them. One important feature of all schema change tools is their ability to throttle themselves. An online schema change requires copying data from the old table to a new one and, no matter what else you do in addition to that, it is an expensive process which may impact database performance.

Throttling in gh-ost

Throttling is crucial to ensure that normal operations continue to perform smoothly. As we discussed in a previous blog post, gh-ost allows you to stop all of its activity, which makes it much less intrusive. Let’s see how it works and to what extent it is configurable.

First things first - what does gh-ost monitor? As we know, by default, gh-ost uses a master to execute writes, and a slave to track changes in binary logs. The master, obviously, will not give us any information about replication lag, but a slave will - that’s where gh-ost gets its data on slave lag. Of course, a single slave is not necessarily representative of the whole replication chain, so it is possible to define a list of slaves to check replication lag on via the --throttle-control-replicas variable. All you need to do is pass a comma-separated list of IPs and gh-ost will track lag on all of them. You can define the maximum lag acceptable to you using --max-lag-millis. Once that threshold has been passed, gh-ost will stop its activity and allow the slaves to catch up with the master.

The main problem is that, right now, gh-ost uses multiple methods of lag calculation, which makes things somewhat unclear. The documentation also does not fully explain how things work internally. Let’s take a look at how gh-ost operates right now.

As we mentioned, there are multiple methods used to calculate lag. First of all, gh-ost generates an internal heartbeat in its _ghc table.

mysql> SELECT * FROM sbtest1._sbtest1_ghc LIMIT 1\G
*************************** 1. row ***************************
         id: 1
last_update: 2016-12-27 13:36:37
       hint: heartbeat
      value: 2016-12-27T13:36:37.139851335Z
1 row in set (0.00 sec)

It is used to calculate lag on the slave/replica on which gh-ost operates and from which it reads binary logs. Then there are the replicas listed in --throttle-control-replicas. Those, by default, have their lag tracked using SHOW SLAVE STATUS and Seconds_Behind_Master. This data has a granularity of one second.

The problem is that sometimes one second of lag is already too much for the application to handle, which is why one of the very important features of gh-ost is its ability to detect sub-second lag. On the replica where gh-ost operates, gh-ost’s heartbeat supports sub-second granularity via the --heartbeat-interval-millis variable. The remaining replicas, though, are not covered this way - for those, there is an option to use an external heartbeat solution such as pt-heartbeat and calculate slave lag using --replication-lag-query.

Unfortunately, when we put it all together, it didn’t work as expected - sub-second lag was not calculated correctly by gh-ost. We decided to contact Shlomi Noach, who’s leading the gh-ost project, to get more insight into how gh-ost handles sub-second lag detection. What you will read below is the result of this conversation, showing how it will be done in the future, in the “right” way.

gh-ost, at this moment, inserts heartbeat data into its _*_ghc table. This makes any external heartbeat generator redundant and, as a result, --replication-lag-query is deprecated and will soon be removed. Once it is removed, gh-ost’s internal heartbeat will be used across the whole replication topology.

If you want to check for lag with sub-second granularity, you will need to configure --heartbeat-interval-millis and --max-lag-millis correctly, ensuring that heartbeat-interval-millis is set to a lower value than max-lag-millis - that’s all. You can, for example, tell gh-ost to insert a heartbeat every 100 milliseconds (heartbeat-interval-millis) and then test whether lag is less than, let’s say, 500 milliseconds (max-lag-millis). Of course, lag will be checked on all replicas defined in --throttle-control-replicas. You can see updated documentation related to the lag checking process here: https://github.com/github/gh-ost/blob/3bf64d8280b7cd639c95f748ccff02e90a7f4345/doc/subsecond-lag.md

Please keep in mind that this is how gh-ost will operate when you use it in version v1.0.34 or later.
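
Put together, a sub-second throttling setup along those lines might look like this (the replica addresses are hypothetical, and the remaining migration flags are omitted):

gh-ost \
  --heartbeat-interval-millis=100 \
  --max-lag-millis=500 \
  --throttle-control-replicas="10.0.0.2:3306,10.0.0.3:3306" \
  ...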

We need to mention, for the sake of completeness, one more setting - nice-ratio. It is used to define how aggressive gh-ost should be in copying the data. It basically tells gh-ost how long it should pause after each row copy operation. If you set it to 0, no pause will be added. If you set it to 0.5, the whole process of copying rows will take 150% of the original time. If you set it to 1, it will take twice as long (200%). It works, but it is pretty hard to adjust the ratio so that the original workload is not affected. As long as you can use sub-second lag throttling, that is the way to go.

Runtime configuration changes in gh-ost

Another very useful feature of gh-ost is its ability to handle runtime configuration changes. When it starts, it listens on a unix socket, whose location you can choose through --serve-socket-file. By default it is created in the /tmp directory, with a name derived from the schema and table gh-ost works upon. An example would be: /tmp/gh-ost.sbtest1.sbtest1.sock

Gh-ost can also listen on a TCP port, but for that you need to pass --serve-tcp-port.

Knowing this, we can manipulate some of the settings. The best way to learn what we can change would be to ask gh-ost about it. When we send the ‘help’ string to the socket, we’ll get a list of available commands:

root@ip-172-30-4-235:~# echo help | nc -U /tmp/gh-ost.sbtest1.sbtest1.sock
available commands:
status                               # Print a detailed status message
sup                                  # Print a short status message
chunk-size=<newsize>                 # Set a new chunk-size
nice-ratio=<ratio>                   # Set a new nice-ratio, immediate sleep after each row-copy operation, float (examples: 0 is aggressive, 0.7 adds 70% runtime, 1.0 doubles runtime, 2.0 triples runtime, ...)
critical-load=<load>                 # Set a new set of max-load thresholds
max-lag-millis=<max-lag>             # Set a new replication lag threshold
replication-lag-query=<query>        # Set a new query that determines replication lag (no quotes)
max-load=<load>                      # Set a new set of max-load thresholds
throttle-query=<query>               # Set a new throttle-query (no quotes)
throttle-control-replicas=<replicas> # Set a new comma delimited list of throttle control replicas
throttle                             # Force throttling
no-throttle                          # End forced throttling (other throttling may still apply)
unpostpone                           # Bail out a cut-over postpone; proceed to cut-over
panic                                # panic and quit without cleanup
help                                 # This message

As you can see, there is a bunch of settings to change at runtime - we can change the chunk size, as well as the max-load thresholds (which, when crossed, cause gh-ost to throttle) and the critical-load thresholds. You can also change the settings related to throttling: nice-ratio, max-lag-millis, replication-lag-query and throttle-control-replicas. You can as well force throttling by sending the ‘throttle’ string to gh-ost, or immediately stop the migration by sending ‘panic’.
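
For example, using the same socket as above, you could tighten the lag threshold at runtime, then force and later release throttling:

root@ip-172-30-4-235:~# echo "max-lag-millis=500" | nc -U /tmp/gh-ost.sbtest1.sbtest1.sock
root@ip-172-30-4-235:~# echo throttle | nc -U /tmp/gh-ost.sbtest1.sbtest1.sock
root@ip-172-30-4-235:~# echo no-throttle | nc -U /tmp/gh-ost.sbtest1.sbtest1.sock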

Another setting worth mentioning is unpostpone. Gh-ost allows you to postpone the cut-over process. As you know, gh-ost creates a temporary table using the new schema, and then fills it with data from the old table. Once all data has been copied, it performs a cut-over and replaces the old table with the new one. You may want to be around to monitor things when gh-ost performs this step - in case something goes wrong. In that case, you can use --postpone-cut-over-flag-file to define a file which, as long as it exists, will postpone the cut-over process. You can then create that file and be sure that gh-ost won’t swap tables until you let it, by removing the file. Still, if you’d like to go ahead and force the cut-over without having to find and remove the postpone file, you can send the ‘unpostpone’ string to gh-ost and it will immediately perform the cut-over.

We are coming to the end of this post. Throttling is a critical part of any online schema change process (or any database-heavy process, for that matter) and it is important to understand how to do it right. Yet, even with throttling, some additional load is unavoidable. That’s why, in our next blog post, we will try to assess the impact of running gh-ost on the system.

Announcing ClusterControl 1.4 - the MySQL Replication & MongoDB Edition


Today we are pleased to announce the 1.4 release of ClusterControl - the all-inclusive database management system that lets you easily deploy, monitor, manage and scale highly available open source databases in any environment; on-premise or in the cloud.

This release contains key new features for MongoDB and MySQL Replication in particular, along with performance improvements and bug fixes.

Release Highlights

For MySQL

MySQL Replication

  • Enhanced multi-master deployment
  • Flexible topology management & error handling
  • Automated failover

MySQL Replication & Load Balancers

  • Deploy ProxySQL on MySQL Replication setups and monitor performance
  • HAProxy Read-Write split configuration support for MySQL Replication setups

Experimental support for Oracle MySQL Group Replication

  • Deploy Group Replication Clusters

And support for Percona XtraDB Cluster 5.7

Download ClusterControl

For MongoDB

MongoDB & sharded clusters

  • Convert a ReplicaSet to a sharded cluster
  • Add or remove shards
  • Add Mongos/Routers

More MongoDB features

  • Step down or freeze a node
  • New Severalnines database advisors for MongoDB

Download ClusterControl

View release details and resources

ClusterControl
Single Console for Your Entire Database Infrastructure
Find out what else is new in ClusterControl

New MySQL Replication Features

ClusterControl 1.4 brings a number of new features to better support replication users. You are now able to deploy a multi-master replication setup in active - standby mode. One master will actively take writes, while the other one is ready to take over writes should the active master fail. From the UI, you can also easily add slaves under each master and reconfigure the topology by promoting new masters and failing over slaves.

Topology reconfigurations and master failovers are not usually possible in case of replication problems, for instance errant transactions. ClusterControl will check for such issues before any failover or switchover happens. The admin can also define whitelists and blacklists of slaves that may (or may not) be promoted to master. This makes it easier for admins to manage their replication setups and make topology changes when needed.

Deploy ProxySQL on MySQL Replication clusters and monitor performance

Load balancers are an essential component in database high availability. With this new release, we have extended ClusterControl with the addition of ProxySQL, created for DBAs by René Cannaò, himself a DBA trying to solve issues when working with complex replication topologies. Users can now deploy ProxySQL on MySQL Replication clusters with ClusterControl and monitor its performance.

By default, ClusterControl deploys ProxySQL in read/write split mode - your read-only traffic will be sent to slaves while your writes will be sent to a writable master. ProxySQL will also work together with the new automatic failover mechanism. Once failover happens, ProxySQL will detect the new writable master and route writes to it. It all happens automatically, without any need for the user to take action.

MongoDB & sharded clusters

MongoDB is the rising star among open source databases, and extending our support for it has brought sharded clusters in addition to replica sets. This meant we had to retrieve more metrics for our monitoring, add advisors, and provide consistent backups for sharding. With this latest release, you can now convert a ReplicaSet cluster to a sharded cluster, add or remove shards from a sharded cluster, as well as add Mongos/routers to a sharded cluster.

New Severalnines database advisors for MongoDB

Advisors are mini programs that provide advice on specific database issues and we’ve added three new advisors for MongoDB in this ClusterControl release. The first one calculates the replication window, the second watches over the replication window, and the third checks for un-sharded databases/collections. In addition to this we also added a generic disk advisor. The advisor verifies if any optimizations can be done, like noatime and noop I/O scheduling, on the data disk that is being used for storage.

There are a number of other features and improvements that we have not mentioned here. You can find all details in the ChangeLog.

We encourage you to test this latest release and provide us with your feedback. If you’d like a demo, feel free to request one.

Thank you for your ongoing support, and happy clustering!

PS.: For additional tips & tricks, follow our blog: http://www.severalnines.com/blog/


Automating MySQL Replication with ClusterControl 1.4.0 - what’s new


With the recent release of ClusterControl 1.4.0, we added a bunch of new features to better support MySQL replication users. In this blog post, we’ll give you a quick overview of the new features.

Enhanced multi-master deployment

A simple master-slave replication setup is usually good enough in a lot of cases, but sometimes, you might need a more complex topology with multiple masters. With 1.4.0, ClusterControl can help provision such setups. You are now able to deploy a multi-master replication setup in active - standby mode. One of the masters will actively take writes, while the other one is ready to take over writes should the active master fail. You can also easily add slaves under each master, right from the UI.

Enhanced flexibility in replication topology management

With support for multi-master setups comes improved support for managing replication topology changes. Do you want to re-slave a slave off the standby master? Do you want to create a replication chain, with an intermediate master in-between? Sure! You can use a new job for that: “Change Replication Master”. Just go to one of the nodes and pick that job (not only on the slaves, you can also change replication master for your current master, to create a multi-master setup). You’ll be presented with a dialog box in which you can pick the master from which to slave your node off. As of now, only GTID-enabled replication is supported, both Oracle and MariaDB implementations.

Replication error handling

You may ask - what about issues like errant transactions, which can be a serious problem for MySQL replication? Well, for starters, ClusterControl always sets slaves to read_only mode, so only a superuser can create an errant transaction. It may still happen, though. That’s why we added replication error handling in ClusterControl.

Errant transactions are common, and they are handled separately - they are checked for before any failover or switchover happens. The user can then fix the problem before triggering a topology change once more. If, for some reason (high availability, for example), a user wants to perform a failover anyway, no matter whether it is safe or not, it can be done by setting:

replication_stop_on_error=0

This is set in the cmon configuration file of the replication setup ( /etc/cmon.d/cmon_X.cnf, where X is the cluster ID of the replication setup). In such cases, failover will be performed even if there’s a possibility that replication will break.

To handle such cases, we added experimental support for slave rebuilding. If you enable replication_auto_rebuild_slave in the cmon configuration and if your slave is marked as down with the following error in MySQL:

Got fatal error 1236 from master when reading data from binary log: 'The slave is connecting using CHANGE MASTER TO MASTER_AUTO_POSITION = 1, but the master has purged binary logs containing GTIDs that the slave requires.'

ClusterControl will attempt to rebuild the slave using data from the master. Such a setting may be dangerous, as the rebuilding process induces an increased load on the master; it may also be that your dataset is very large and a regular rebuild is not an option - that’s why this behavior is disabled by default. Feel free to try it out, though, and let us know what you think about it.
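
For reference, enabling the experimental rebuild goes into the same cmon configuration file as the earlier setting:

replication_auto_rebuild_slave=1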

Automated failover

Handling replication errors is not enough to maintain high availability with MySQL replication - you need also to handle crashes of MySQL instances. Until now, ClusterControl alerted the user and let her perform a manual failover. With ClusterControl version 1.4.0 comes support for automated failover handling. It is enough to have cluster recovery enabled for your replication cluster and ClusterControl will try to recover your replication cluster in the best way possible. You must explicitly enable "Cluster Auto Recovery" in the UI in order for automatic failover to be activated.

Once a master failure is detected, ClusterControl starts to look for the most up-to-date slave available. Once it’s been found, ClusterControl checks the remaining slaves and looks for additional, missing transactions. If such transactions are found on some of the slaves, the master candidate is configured to replicate from each of those slaves and apply any missing transactions.

If, for any reason, you’d rather not wait for a master candidate to get all missing transactions (maybe because you are 100% sure there won’t be any), you can disable this step by enabling the replication_skip_apply_missing_txs setting in the cmon configuration.

For MariaDB setups, the behavior is different - ClusterControl picks the most advanced slave and promotes it to become master.

Getting missing transactions is one thing. Applying them is another. ClusterControl, by default, does not fail over to a slave if the slave has not applied all missing transactions - you could lose data. Instead, it will wait indefinitely to allow slaves to catch up. Of course, if the master candidate becomes up to date, ClusterControl will fail over immediately after. This behavior can be configured using the replication_failover_wait_to_apply_timeout setting in the cmon configuration file. The default value (-1) prevents any failover if the master candidate is lagging behind. If you’d like to execute the failover anyway, you can set it to 0. You can also set a timeout in seconds: the amount of time ClusterControl will wait for the master candidate to catch up before performing the failover.

Once a master candidate is brought up to date, it is promoted to master and the remaining slaves are slaved off it. The exact process differs depending on which host failed (the active or standby master in a multi-master setup) but the final outcome is that all slaves are again replicating from the working master. Combined with proxies such as HAProxy, ProxySQL or MaxScale, this lets you build an environment where a master failure is handled in an automated and transparent way.

Additional control over failover behavior is granted through the replicaton_failover_whitelist and replicaton_failover_blacklist lists in the cmon configuration file. These let you configure a list of slaves which should be treated as candidates to become master, and a list of slaves which should never be promoted to master by ClusterControl. There are numerous reasons why you may want to use those variables. Maybe you have some backup or OLAP/reporting slaves which are not suitable to become a master? Maybe some of your slaves use weaker hardware, or they are located in a different datacenter? In that case, you can prevent them from being promoted by adding them to the replicaton_failover_blacklist variable.

Likewise, maybe you want to limit promotion to a particular set of hosts which are the closest to the current master? Or maybe you use a master - master, active - passive setup and want only your standby master to be considered for promotion? Then specify the IPs of the master candidates in the replicaton_failover_whitelist variable. Please keep in mind that a restart of the cmon process is required to reload such configuration. By executing cmon --help-config on the controller, you will get more detailed information about these (and other) parameters.
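
Putting the failover-related settings together, the relevant part of a cmon configuration file might look like this (parameter names as used in this post; IPs and values purely illustrative):

# /etc/cmon.d/cmon_X.cnf, where X is the cluster ID
replication_failover_wait_to_apply_timeout=60
replicaton_failover_whitelist=10.0.1.10,10.0.1.11
replicaton_failover_blacklist=10.0.1.20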

Finally, you might want to restore replication manually. If you do not want ClusterControl to perform any automated failover in your replication topology, you can disable cluster recovery from the ClusterControl UI.

So, there are lots of good stuff to try out here for MySQL replication users. Do give it a try, and let us know how we’re doing.

The choice of MySQL storage engine and its impact on backup procedures


MySQL offers multiple storage engines to store its data, with InnoDB and MyISAM being the most popular ones. Each storage engine implements a more specific set of features required for a type of workload and, as a result, works differently from other engines. Since data is stored inside the storage engine, we need to understand how storage engines work to determine the best backup tool. In general, MySQL backup tools perform a special operation in order to retrieve consistent data - they either lock the tables or establish a transaction isolation level that guarantees the data read is unchanged.

MyISAM/Aria

MyISAM was the default storage engine for MySQL versions prior to 5.5.5. It is based on the older ISAM code but has many useful extensions. The major deficiency of MyISAM is its absence of transaction support. Aria is another storage engine with MyISAM heritage and is the MyISAM replacement in all MariaDB distributions. The main difference is that Aria is crash safe, whereas MyISAM is not. Being crash safe means that an Aria table can recover from unexpected failures much better than a MyISAM table can. In most circumstances, backup operations for MyISAM and Aria are almost identical.

MyISAM uses table-level locking. It stores indexes in one file and data in another. MyISAM tables are generally more compact on disk than InnoDB tables. Given the table-level locking and lack of transaction support, the recommended way to back up MyISAM tables is to acquire the global read lock using FLUSH TABLES WITH READ LOCK (FTWRL), making MySQL read-only temporarily, or to use the LOCK TABLES statement explicitly. Without that, MyISAM backups will be inconsistent.

InnoDB/XtraDB

InnoDB is the default storage engine for MySQL and MariaDB. It provides the standard ACID-compliant transaction features, along with foreign key support and row-level locking.

Percona’s XtraDB is an enhanced version of the InnoDB storage engine for MySQL and MariaDB. It features some improvements that make it perform better in certain situations. It is backwards compatible with InnoDB, so it can be used as a drop-in replacement.

There are a number of key components in InnoDB that directly influences the behaviour of backup and restore operation:

  • Transactions
  • Crash recovery
  • Multiversion concurrency control (MVCC)

Transactions

InnoDB does transactions. A transaction will never be completed unless each individual operation within the group is successful (COMMIT). If any operation within the transaction fails, the entire transaction will fail and any changes will be undone (ROLLBACK).

The following example shows a transaction in MySQL (assuming autocommit is off):

BEGIN;
UPDATE account.saving SET balance = (balance - 10) WHERE id = 2;
UPDATE account.current SET balance = (balance + 10) WHERE id = 2;
COMMIT;

A transaction starts with a BEGIN and ends with a COMMIT or ROLLBACK. In the above example, if the MySQL server crashes after the first UPDATE statement completed (line 2), that update would be rolled back and the balance value won’t change for this transaction. The ability to rollback is vital when performing crash recovery, as explained in the next section.

Crash Recovery

InnoDB maintains a transaction log, also called redo log. The redo log is physically represented as a set of files, typically named ib_logfile0 and ib_logfile1. The log contains a record of every change to InnoDB data. When InnoDB starts, it inspects the data files and the transaction log, and performs two steps:

  1. Applies committed transaction log entries to the data files.
  2. Performs an undo operation (rollback) on any transactions that modified data but did not commit.

The rollback is performed by a background thread, executed in parallel with transactions from new connections. Until the rollback operation is completed, new connections may encounter locking conflicts with recovered transactions. In most situations, even if the MySQL server was killed unexpectedly in the middle of heavy activity, the recovery process happens automatically. No action is needed from the DBA.

Percona XtraBackup utilizes the InnoDB crash recovery functionality to prepare an internally inconsistent backup (the binary copy of the MySQL data directory) back into a consistent and usable database.

MVCC

InnoDB is a multiversion concurrency control (MVCC) storage engine which means many versions of a single row can exist at the same time. Due to this nature, unlike MyISAM, InnoDB does not require a global read lock to get a consistent read. It utilizes its ACID-compliant transaction component called isolation. Isolation is the “i” in the acronym ACID - the isolation level determines the capabilities of a transaction to read/write data that is accessed by other transactions.

In order to get a consistent snapshot of InnoDB tables, one can simply start a transaction with the REPEATABLE READ isolation level. In REPEATABLE READ, a read view is created at the start of the transaction and held open for its duration. For example, if you execute a SELECT statement at 6 AM and come back to the still-open transaction at 6 PM, running the same SELECT will return the exact same result set you saw at 6 AM. This is part of the MVCC capability, and it is accomplished using row versioning and UNDO information.
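
A minimal illustration, reusing the account schema from the transaction example above:

mysql> SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ;
mysql> START TRANSACTION WITH CONSISTENT SNAPSHOT;
mysql> SELECT SUM(balance) FROM account.saving; -- same result for the life of the transaction
mysql> COMMIT;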

Logical backup tools like mysqldump use this approach to generate a consistent backup of InnoDB without an explicit table lock that would make the MySQL server read-only.
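
A sketch of such a backup (the database name is hypothetical; --single-transaction is what establishes the consistent snapshot):

$ mysqldump --single-transaction mydb > mydb.sql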

MEMORY

The MEMORY storage engine (formerly known as HEAP) creates special-purpose tables with contents that are stored in memory. Because the data is vulnerable to crashes, hardware issues, or power outages, only use these tables as temporary work areas or read-only caches for data pulled from other tables.

Due to the transient nature of data from MEMORY tables (data is not persisted to disk), only logical backup is capable of backing up these tables. Backup in physical format is not possible.

That’s it for today, but you can read more about backups in our whitepaper - The DevOps Guide to Database Backups for MySQL and MariaDB.

Online schema change for MySQL & MariaDB - comparing GitHub’s gh-ost vs pt-online-schema-change


Database schema change is one of the most common activities that a MySQL DBA has to tackle. No matter if you use MySQL Replication or Galera Cluster, direct DDLs are troublesome and sometimes not feasible to execute. Add the requirement to perform the change while all databases are online, and it can get pretty daunting.

Thankfully, online schema tools are there to help DBAs deal with this problem. Arguably, the most popular of them is Percona’s pt-online-schema-change, which is part of Percona Toolkit.

It has been used by MySQL DBAs for years and is proven as a flexible and reliable tool. Unfortunately, not without drawbacks.

To understand these, we need to understand how it works internally.

How does pt-online-schema-change work?

Pt-online-schema-change works in a very simple way. It creates a temporary table with the desired new schema - for instance, if we added an index, or removed a column from a table. Then, it creates triggers on the old table - those triggers are there to mirror changes that happen on the original table to the new table. Changes are mirrored during the schema change process. If a row is added to the original table, it is also added to the new one. Likewise if a row is modified or deleted on the old table, it is also applied on the new table. Then, a background process of copying data (using LOW_PRIORITY INSERT) between old and new table begins. Once data has been copied, RENAME TABLE is executed to rename “yourtable” into “yourtable_old” and “yourtable_new” into “yourtable”. This is an atomic operation and in case something goes wrong, it is possible to recover the old table.

The process described above has some limitations. For starters, it is not possible to reduce the overhead of the tool to zero. Pt-online-schema-change gives you an option to define the maximum allowed replication lag and, if that threshold is crossed, it stops copying data between the old and new tables. It is also possible to pause the background process entirely. The problem is that we are talking only about the background process of running INSERTs. It is not possible to reduce the overhead caused by the fact that every operation on “yourtable” is duplicated in “yourtable_new” through triggers. If you removed the triggers, the old and new tables would go out of sync with no means to sync them again. Therefore, when you run pt-online-schema-change on your system, it always adds some overhead, even if it is paused or throttled. How big the overhead is depends on how many writes hit the table undergoing the schema change.
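
For reference, a lag-throttled invocation might look something like this (a sketch; database, table, alter statement and thresholds are all hypothetical):

$ pt-online-schema-change \
    --alter "ADD COLUMN notes TEXT" \
    --max-lag 1 --check-interval 5 \
    --max-load Threads_running=50 \
    D=mydb,t=yourtable --execute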

Another issue is, again, caused by triggers - precisely by the fact that, to create triggers, one has to acquire a lock on MySQL’s metadata. This can become a serious problem if you have highly concurrent traffic or if you use longer transactions. Under such load, it may be virtually impossible (and we’ve seen such databases) to use pt-online-schema-change, because it is not able to acquire the metadata lock needed to create the required triggers. Additionally, the attempt to acquire the metadata lock can also block further transactions, basically grinding all database operations to a halt.

Yet another problem is foreign keys - unfortunately, there is no simple way of handling them. Pt-online-schema-change gives you two methods to approach this issue, neither of which is really good. The main issue here is that a foreign key of a given name can only refer to a single table, and it sticks to it - even if you rename the referred table, the foreign key will follow the change. This leads to the problem: after RENAME TABLE, the foreign key will point to ‘yourtable_old’, not ‘yourtable’.

One workaround is to not use:

RENAME TABLE `yourtable` TO `yourtable_old`, `yourtable_new` TO `yourtable`;

Instead, use a two step approach:

DROP TABLE `yourtable`; RENAME TABLE `yourtable_new` TO `yourtable`;

This poses a serious problem: if, for some reason, RENAME TABLE doesn’t work, there is no going back, as the original table has already been dropped.

Another approach would be to create a second foreign key, under a different name, which refers to ‘yourtable_new’. After RENAME TABLE, it will point to ‘yourtable’, which is exactly what we want. The thing is, you need to execute a direct ALTER to create such a foreign key - which rather defeats the point of using an online schema change tool, namely avoiding direct alters. If the altered table is large, such an operation is not feasible on Galera Cluster (cluster-wide stall caused by TOI) or in a MySQL replication cluster (slave lag induced by the serialized ALTER).

As you can see, while being a useful tool, pt-online-schema-change has serious limitations which you need to be aware of before you use it. If you use MySQL at scale, limitations may become a serious motivation to do something about it.

Introducing GitHub’s gh-ost

Motivation alone is not enough - you also need resources to create a new solution. GitHub recently released gh-ost, their take on online schema change. Let’s take a look at how it compares to Percona’s pt-online-schema-change and how it can be used to avoid some of its limitations.

To better understand the difference between the two tools, let’s take a look at how gh-ost works.

Gh-ost creates a temporary table with the altered schema, just like pt-online-schema-change does - it uses the “_yourtable_gho” naming pattern. It executes INSERT queries of the following form to copy data from the old to the new table:

insert /* gh-ost `sbtest1`.`sbtest1` */ ignore into `sbtest1`.`_sbtest1_gho` (`id`, `k`, `c`, `pad`)
      (select `id`, `k`, `c`, `pad` from `sbtest1`.`sbtest1` force index (`PRIMARY`)
        where (((`id` > ?)) and ((`id` < ?) or ((`id` = ?)))) lock in share mode

As you can see, it is a variation of INSERT INTO new_table SELECT * FROM old_table. It uses the primary key to split the data into chunks and then works through them.

In pt-online-schema-change, ongoing traffic is handled using triggers. Gh-ost uses a triggerless approach - it uses binary logs to track and apply the changes that have happened since gh-ost started to copy data. It connects to one of the hosts - by default one of the slaves - simulates being a slave itself, and asks for binary logs.

This behavior has a couple of repercussions. First of all, network traffic is increased compared to pt-online-schema-change: not only does gh-ost have to copy data, it also has to pull binary logs.

It also requires binary logs in row-based format for full data consistency - if you use statement-based or mixed replication, gh-ost won’t work in your setup. As a workaround, you can create a new slave, enable log_slave_updates and set it to store events in row format, as sketched below. Reading data from a slave is, in fact, the default way in which gh-ost operates - it makes perfect sense, as pulling binary logs adds some overhead, and if you can avoid additional overhead on the master, you most likely want to do that. Of course, if your master uses the row-based replication format, you can force gh-ost to connect to it and get the binary logs from there.
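
Preparing such a dedicated slave boils down to a few my.cnf settings (a sketch; a server restart is required for them to take effect):

[mysqld]
log_bin           = mysql-bin
log_slave_updates = ON
binlog_format     = ROW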

What is good about this design is that you don’t have to create triggers, which, as we discussed, could become a serious problem or even a blocker. What is also great is that you can always stop parsing binary logs - it’s as if you just ran STOP SLAVE. You have the binlog coordinates, so you can easily restart from the same position later on. This makes it possible to stop practically all operations executed by gh-ost - not only the background process of copying data from the old to the new table, but also any load related to keeping the new table in sync with the old one. This is a great feature in a production environment - pt-online-schema-change requires constant monitoring, as you can only estimate the additional load on the system. Even if you pause it, it will still add some overhead and, under heavy load, this overhead may result in an unstable database. With gh-ost, on the other hand, you can just pause the whole process and the workload pattern will go back to what you are used to seeing - no additional load whatsoever related to the schema change. This is really great - it means you can start the migration at 9am when you begin your day, and stop it at 5pm when you leave the office. You can be sure that you won’t get paged late at night because a paused schema change process turned out not to be 100% paused, and is causing problems for your production systems.

Unfortunately, gh-ost is not without drawbacks. For starters, foreign keys. Pt-online-schema-change does not provide any good way of altering tables which contain foreign keys. It is still way better than gh-ost as gh-ost does not support foreign keys at all. At the moment of writing, that is - it may change in the future. Triggers - gh-ost, at the moment of writing, does not support triggers at all. The same is true for pt-online-schema-change - it was a limitation of pre-5.7 MySQL where you couldn’t have more than one trigger of a given type defined in a table (and pt-online-schema-change had to create them for its own purposes). Even if the limitation is removed in MySQL 5.7, pt-online-schema-change still does not support tables with triggers.

One of the main limitations of gh-ost is, definitely, the fact that it does not support Galera Cluster. It is because of how gh-ost performs the table switch - it uses LOCK TABLE, which does not work well with Galera. As of now, there is no known fix or workaround for this issue, and this leaves pt-online-schema-change as the only option for Galera Cluster.

These are probably the most important limitations of gh-ost, but there are more of them. The minimal row image is not supported (which makes your binlogs grow larger), and JSON and generated columns in 5.7 are not supported. The migration key must not contain NULL values, and there are limitations when it comes to mixed-case table names. You can find more details on all the requirements and limitations of gh-ost in its documentation.

In our next blog post we will take a look at how gh-ost operates, how you can test your changes and how to perform it. We will also discuss throttling of gh-ost.

Planets9s - Online schema change for MySQL & MariaDB, MySQL storage engine & backups … and more


Welcome to this week’s Planets9s, covering all the latest resources and technologies we create around automation and management of open source database infrastructures.

Online schema change for MySQL & MariaDB: GitHub’s gh-ost & pt-online-schema-change

Online schema changes are unavoidable, as any DBA will know. While there are tools such as Percona’s pt-online-schema-change to assist, it does not come without drawbacks. However, there is a new kid on the block: GitHub released an online schema change tool called gh-ost. This post by Krzysztof Ksiazek, Senior Support Engineer at Severalnines, looks at how gh-ost compares to pt-online-schema-change, and how it can be used to address some limitations.

Read the blog

The choice of MySQL storage engine and its impact on backup procedures

As you will know, MySQL offers multiple storage engines to store its data, with InnoDB and MyISAM being the most popular ones. And this has an impact on how you design and run your backup procedures. Since data is stored inside the storage engine, we need to understand how the storage engines work to determine the best backup tool. This post by Ashraf Sharif, System Support Engineer at Severalnines, provides the necessary insight into these topics and recommendations on how best to proceed.

Read the blog

Want an easy way to deploy & monitor Galera Cluster in the cloud?

If you haven’t seen it yet, we’ve recently launched a new tool that allows you to easily deploy and monitor Galera Clusters on the AWS and DigitalOcean clouds. NinesControl allows quick, easy, point-and-click deployment and monitoring of standalone or clustered SQL and NoSQL databases. Each provisioned database is automatic, repeatable and completes in minutes. It also provides real-time monitoring, self-healing and automatic recovery features. Find out more and get started via the link below.

Check out NinesControl

That’s it for this week! Feel free to share these resources with your colleagues and follow us in our social media channels.

Have a good end of the week,

Jean-Jérôme Schmidt
Planets9s Editor
Severalnines AB

How to Perform Efficient Backup for MySQL and MariaDB


All backup methods have their pros and cons. They also affect database workloads differently. Your backup strategy will depend upon the business requirements, the environment you operate in and resources at your disposal. Backups are usually planned according to your restoration requirement. Data loss can be full or partial, and you do not always need to recover the whole dataset. In some cases, you might just want to do a partial recovery by restoring missing tables or rows. In this case, you will need a combination of Percona Xtrabackup, mysqldump and binary logs to cover the different cases.

Performing a backup on MySQL or MariaDB is not that hard, but to be efficient, we do need to understand the effects of each and every procedure. It also depends on a number of factors like storage engine, recovery objective, dataset and delta size, storage capability and capacity, security as well as high availability design and architecture.

One of the most important things in performing a backup is to make sure you get a consistent backup. Backing up non-transactional tables like MyISAM and MEMORY requires the tables to be locked to guarantee consistency; this can be done using the global read lock (FLUSH TABLES WITH READ LOCK). Consequently, the global lock will temporarily make the server read-only. For InnoDB, locking is unnecessary and other DML operations are allowed to execute while the backup is running.

In terms of backup size, if you have limited storage space backed by an outdated disk subsystem, compression is your friend. Compression is a CPU-intensive process and can directly impact the performance of your MySQL server. However, if it can be scheduled during periods of low traffic, compression can save you a lot of space. It is a tradeoff between processing power and storage space, and it reduces the risk of a server crash caused by a full disk.
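
For example, a streamed, compressed physical backup with Percona XtraBackup might look like this (the paths are hypothetical):

$ innobackupex --stream=xbstream /tmp | gzip > /backups/full_$(date +%F).xbstream.gz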

If your database workload is write-intensive, you might find the difference in size (delta) between the two latest full backups to be fairly big, for example 1GB for a 10GB dataset per day. Performing regular full backups on databases with this kind of workload will likely introduce performance degradation, and it might be more efficient to perform incremental backups. Ultimately, this kind of workload will bring the database to a state where the backup size is rapidly growing and physical backup might be the only way to go.

When creating an encrypted backup, one thing to keep in mind is that it usually takes more time to recover. The backup has to be decrypted prior to any recovery activities. With a large dataset, this could introduce some delays to the RTO. On the other hand, if you are using a private key for encryption, make sure to store the key in a safe place. If the private key is lost, the backup will be useless and unrecoverable. If the key is stolen, all backups created with the same key are compromised, as they are no longer secured.

It is common nowadays to have a high availability setup using either MySQL Replication or MySQL/MariaDB Galera Cluster. It is not necessary to backup all members in the replication chain or cluster. Since all nodes are expected to hold the same data (unless the dataset is sharded across different nodes), it is recommended to perform backup on only one node (or one per shard).

Taking a MySQL backup on a dedicated backup server will simplify your backup plans. A dedicated backup server is usually an isolated slave connected to the production servers via asynchronous replication. A good backup server has plenty of disk space for backup storage, with the ability to take storage snapshots. Since it uses loosely-coupled asynchronous replication, it is unlikely to cause additional overhead on the production database. However, this server might become a single point of failure, with the risk of inconsistent backups if the backup server regularly lags behind.

As we have seen, there are quite a few things to consider in order to make efficient backups of MySQL and MariaDB. Each of the mentioned points are discussed in depth, together with example use-cases and best practices in our latest whitepaper - The DevOps Guide to Database Backups for MySQL and MariaDB.
