Automated recovery scenarios for lazy MySQL DBA

This is a translation of a Korean blog post I wrote in 2017, so some details may differ slightly from the current MySQL 8.0.
http://gywn.net/2017/01/automated-recovery-scenarios-for-lazy-mysql-dba/

Overview

Once you use MySQL for a while, replication is not something you can easily get away from. It means you end up managing multiple servers with the same data, which in turn means a lot of data recovery work. This is especially true in environments where data grows explosively. Moreover, if you restore from an early-morning backup, you still have to apply all the changes accumulated since that backup, and that data synchronization takes a lot of time. In this post I will talk simply about how to automate recovery in such an environment.

Prepare

The environment is a plain “Linux + MySQL + Xtrabackup” stack, nothing unusual:

CentOS 6.8
MySQL 5.7.14
Xtrabackup-2.4.3
pv

Replication is configured with classic binlog-position-based replication rather than GTID, and I pull the data over SCP so that it arrives without any errors.
This means the replication nodes must exchange id_rsa.pub keys in advance so that data can be pulled without password authentication. In other words, you need to be able to access each server without a password:

server1$ ssh mysql@server2
server2$ ssh mysql@server1
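
A minimal sketch of that key exchange (run the mirror of this on server2 toward server1; paths and users are the usual OpenSSH defaults, adjust as needed):

# on server1, as the mysql user
ssh-keygen -t rsa                 # creates ~/.ssh/id_rsa and id_rsa.pub if missing
ssh-copy-id mysql@server2         # appends id_rsa.pub to server2's authorized_keys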

Restore Scenario

The details depend on the characteristics of the service and on personal preference, but the most important thing is recovery efficiency. Of course, minimal service impact comes even before efficiency, so you need to control the data-transfer traffic. In general, there are two recovery cases to consider.


  1. restore from master data
    
A case that adds a slave using the master's data. The master's binlog position at backup time becomes the configuration information for the new slave.
  2. restore from slave data
    
A case that adds another identical slave using data from an existing slave. The Slave Status information of the slave that provides the data becomes the configuration information for the new slave.

Before proceeding, note that this post explains the automatic recovery scenario itself, not my personal script; I skip details such as validation checks and password encryption.
As background knowledge for the recovery, read the Xtrabackup manual below first. I am sure it will be a great help.

https://www.percona.com/doc/percona-xtrabackup/2.4/howtos/setting_up_replication.html

case 1) restore from master data

As mentioned earlier, this is the case where we pull data from the master and configure a slave directly beneath it, arranging the data as in the image below.

Restore_From_Master_Data

1) Pulling data

In the final stage of Xtrabackup, while non-InnoDB data is being copied, a global lock is taken for the duration of the copy to guarantee data consistency unless you give the --no-lock option. In other words, it affects the service.

Write the innobackupex command directly in the shell and finish everything in one shot, as shown below. The streamed backup data is throttled with pv (50 MB per second here, so even a weak server can keep up).

ssh -o StrictHostKeyChecking=no mysql@${TARGET_HOST} \
 "innobackupex \
 --host='127.0.0.1' \
 --user='backupuser' \
 --password='backuppass' \
 --no-lock \
 --stream=xbstream \
 /data/backup | pv --rate-limit 50000000" 2> innobackupex.log \
 | xbstream -x 2> xbstream.log

You can check the two logs, innobackupex.log and xbstream.log, to validate the received backup data. If innobackupex.log ends with “completed OK!”, the data arrived normally.
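
For scripting, a quick sanity check might look like this (a sketch; it just fails fast unless the log ends with “completed OK!”):

tail -n 1 innobackupex.log | grep -q "completed OK!" || { echo "backup failed" >&2; exit 1; }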

2) Apply redo log

Assuming the previous step completed normally, the next step applies the data that changed while the backup was being taken and transferred. Here too, innobackupex.log ending with “completed OK!” shows that the log has been applied normally.

innobackupex --apply-log . 2>> innobackupex.log

3) configuration of slave

Finally, extract the master's position to configure the actual slave. The position is read from the “xtrabackup_binlog_pos_innodb” file created by the apply-log step, where the master's binary log position is recorded tab-separated, as below.

mysql-bin.000001     481

I extracted the log file and position with sed, as below.

MASTER_LOG_FILE=`sed -r "s/^(.*)\s+([0-9]+)/\1/g" xtrabackup_binlog_pos_innodb`
MASTER_LOG_POS=`sed -r "s/^(.*)\s+([0-9]+)/\2/g" xtrabackup_binlog_pos_innodb`

Now the master server is determined and the binary log position for the replication configuration is defined. Let's configure the slave as shown below.

echo "change master to
 master_host='${TARGET_HOST}',
 master_user='repl',
 master_password='replpass',
 master_log_file='${MASTER_LOG_FILE}',
 master_log_pos=${MASTER_LOG_POS}; 
 start slave;"\
| mysql -uroot -pxxxxxx
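
A quick way to confirm that replication actually started (a sketch) is to check that both threads report “Yes”:

mysql -uroot -pxxxxxx -e 'show slave status\G' | grep -E 'Slave_(IO|SQL)_Running:'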

With that, a new slave has been configured from the master's data without much fuss, as shown below.

Restore_From_Master_Data_Final

case 2) restore from slave data

The second case uses data from a currently running slave to add a new slave. The data flow is as follows: pull data from a normally running slave and, based on that slave's replication information, configure an additional slave in the same topology.

Restore_From_Slave_Data

1) Pulling data

This time, as before, the backup stream from the slave is extracted directly into xbstream, with throughput controlled by pv.
Unlike the previous case, specify the --slave-info option as shown below so that the slave's replication position is recorded. (With this option, the slave status is written into the backup.)

ssh -o StrictHostKeyChecking=no mysql@${TARGET_HOST} \
 "innobackupex \
 --host='127.0.0.1' \
 --user=backupuser \
 --password='backuppass' \
 --slave-info \
 --stream=xbstream \
 /data/backup | pv --rate-limit 50000000" 2> innobackupex.log \
 | xbstream -x 2> xbstream.log

Similarly, check the two logs, innobackupex.log and xbstream.log, to validate the backup. If “innobackupex.log” ends with “completed OK!” and xbstream.log is empty, the stream was extracted properly.

2) Apply redo log

As before, apply the data that changed while the backup was being taken and transferred. Again, “innobackupex.log” ending with “completed OK!” means it was handled normally.

innobackupex --apply-log . 2>> innobackupex.log

3) configuration of slave

The slave position now has to be defined a little differently from the master case. This time, the binary log position must be read from the “xtrabackup_slave_info” file created by the apply-log step. Instead of a tab-separated position, it is stored as a ready-made CHANGE MASTER statement, in the following form.

CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.000001', MASTER_LOG_POS=481

I extracted the log file and position with sed, as below.

MASTER_LOG_FILE=`sed -r "s/(.*)MASTER_LOG_FILE='(.*)', MASTER_LOG_POS=([0-9]+)/\2/g" xtrabackup_slave_info`
MASTER_LOG_POS=`sed -r "s/(.*)MASTER_LOG_FILE='(.*)', MASTER_LOG_POS=([0-9]+)/\3/g" xtrabackup_slave_info`

Getting the binary log position this far is a big step. The remaining problem is: where do we get the address of the master server this slave actually replicates from? Of course, this could be extracted from a separate monitoring tool, or the master information could be pushed somewhere on a regular basis. The way I chose is to log in to the target slave that provided the data, read its Slave Status directly, and retrieve the master server it is currently pointing at.

MASTER_HOST=`mysql -urepl -preplpass -h ${TARGET_HOST} -e 'show slave status\G' | grep 'Master_Host: ' | sed -r 's/\s*(Master_Host: )//g'|sed -r 's/\s*//g'`

One complaint: when xtrabackup writes the “xtrabackup_slave_info” file, why doesn't it also record the master host the slave is pointing at?

Now the master server is known and the binary log position for the replication configuration is defined, so let's configure the slave as below. Note that master_host is the actual master found above, not the target host we pulled the data from.

echo "change master to
 master_host='${MASTER_HOST}',
 master_user='repl',
 master_password='replpass',
 master_log_file='${MASTER_LOG_FILE}',
 master_log_pos=${MASTER_LOG_POS}; 
 start slave;"\
| mysql -uroot -pxxxxxx

The new slave has been successfully configured from the slave's data, as shown below.

Restore_From_Slave_Data_Final

Conclusion

So far, we've looked at how to build slaves by leveraging existing data. As I said at the start, I couldn't show the complete scripts here. Of course, the whole thing is only a few hundred lines of simple scripting, and it is certainly not the only correct answer.

You don't have to pull the data over scp; you can stream it through the nc utility instead. And instead of the classic binlog position, it is even simpler to solve with GTID.
My solution is just one case among many, nothing more.
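
For example, the nc alternative might look roughly like this (a sketch: the port and ${RESTORE_HOST} are placeholders, and listen flags vary by netcat flavor, some need “-l -p 9999”):

# on the restoring server: listen and unpack the stream
nc -l 9999 | xbstream -x 2> xbstream.log

# on the source server: stream the backup into nc instead of over ssh
innobackupex --host='127.0.0.1' --user='backupuser' --password='backuppass' \
 --no-lock --stream=xbstream /data/backup | pv --rate-limit 50000000 | nc ${RESTORE_HOST} 9999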

A little trick to express MySQL read-only clearly in Grafana

Overview

I have mentioned PMM (Percona Monitoring and Management) before; this time I will talk about a little trick using Prometheus queries and Grafana: how to express read-only ON/OFF clearly in Grafana. 🙂

Definition

This topic is about labeling servers according to MySQL's READ-ONLY setting, but in my opinion it is also what separates master from slave. Of course, you can run a service with a different MySQL topology, such as a chained (M)-(M/S)-(S) structure. Personally, I think the simplest configuration is the most powerful when preparing for all kinds of failures. In my definition, a slave is an instance with READ_ONLY turned ON.

  • Master (READ_ONLY: OFF)

    The instance that is the subject of data changes
    An (M/S) node in the middle of a chain like (M)-(M/S)-(S) is also recognized as a master

  • Slave (READ_ONLY: ON)

    An instance where data cannot be changed

The exact query may differ depending on your own definitions, but I want to be able to display servers classified by READ_ONLY ON/OFF, as well as all instances at once.

Grafana Variable

In MySQL, READ-ONLY is only ever ON or OFF, but here I add the value 2 to mean all servers.

  • var_read_only : 0 (READ-ONLY  OFF)
  • var_read_only : 1 (READ-ONLY  ON)
  • var_read_only : 2 (ALL)

A new variable is added in the Variables tab of the dashboard Settings in Grafana, as a Custom type with the values 2, 0, 1, as shown below. The reason for starting with 2 is simply to make it the default value.

grafana-read-only-variable1

However, the problem is that the variable select box that is created shows 2, 0, 1, which is not intuitive. Wouldn't it be better if the select box showed 2 -> ALL, 0 -> OFF, 1 -> ON? (Of course, it works either way.)

This does not seem to be possible (yet) in Grafana's UI settings, so it requires a little manipulation after exporting from Grafana, followed by a re-import. Go to “Share Dashboard” > “Export”, open the result as a JSON file, and you will see that the “text” fields are set to the same values as “value”, as shown below.

{
  "allValue": null,
  "current": {
    "text": "2", <== here
    "value": "2"
  },
  "hide": 0,
  "includeAll": false,
  "label": "Read-only",
  "multi": false,
  "name": "read_only",
  "options": [
    {
      "selected": true,
      "text": "2",
      "value": "2"
    },
    {
      "selected": false,
      "text": "0", <== here
      "value": "0"
    },
    {
      "selected": false,
      "text": "1", <== here
      "value": "1"
    }
  ],
  "query": "2,0,1",
  "type": "custom"
}

Replace the “text” values above with intuitive phrases as below, then import the dashboard again.

{
  "allValue": null,
  "current": {
    "text": "ALL", <== here
    "value": "2"
  },
  "hide": 0,
  "includeAll": false,
  "label": "Read-only",
  "multi": false,
  "name": "read_only",
  "options": [
    {
      "selected": true,
      "text": "ALL", <== here
      "value": "2"
    },
    {
      "selected": false,
      "text": "OFF", <== here
      "value": "0"
    },
    {
      "selected": false,
      "text": "ON", <== here
      "value": "1"
    }
  ],
  "query": "2,0,1",
  "type": "custom"
}

Finally, the Grafana screen shows intuitive phrases. It isn't easy to control, but anyway the variable setup is complete!

grafana-read-only-variable2

Prometheus Query

Now that Grafana has the basis to pass whatever value we want, this time we'll write the Prometheus query that looks at the data. I won't explain every Prometheus query here; please visit the official Prometheus manual once and read the query syntax. (Here)
My concern is grouping the results by READ-ONLY ON/OFF. By the earlier definition, READ_ONLY 0 means master, 1 means slave, and 2 means all servers. Based on the value passed from Grafana, we will select only the data we want.

First, extract the current system load with a Prometheus query, using max_over_time to take the maximum value over one minute, as shown below. This returns results for all instances.

max_over_time(node_load1[1m])

Add a condition on the READ-ONLY status value to narrow the results to READ-ONLY ON or OFF. Here, the “on (instance)” clause specifies which label the two metrics are matched on; in SQL terms, it is roughly the JOIN condition.

max_over_time(node_load1[1m]) 
  and on (instance) (mysql_global_variables_read_only == $read_only)

But there is one problem here. If a real READ-ONLY value (0 or 1) is selected, the query runs without issue, but when you want to see master and slave at the same time, i.e. all servers, the query does not work.

So here is another trick. What if a value of 2 matched everything, while 0 and 1 still went through the READ-ONLY filter? By tying the two with an “or” condition, the first arithmetic expression can handle the ALL case, and when it does not match, the READ-ONLY part takes over. All we need is a metric that exists unconditionally on every instance.

We can use “up{job='linux'}”. Its value is 0 or 1, and we can assume the exporter always exists. In the expression $read_only^2-3, the result is negative when $read_only is 0 or 1 (0^2-3 = -3, 1^2-3 = -2), so the up condition never matches; only when $read_only is 2 does it become positive (2^2-3 = 1), matching every instance. That leads to the query below.

max_over_time(node_load1[1m]) 
  and on (instance) (
      up{job='linux'} <= $read_only^2-3 
      or mysql_global_variables_read_only == $read_only
  )

With this, when 2 is selected, everything is matched by the first up condition and the mysql_global_variables_read_only part at the back is never consulted (“or” only fills in where the preceding result is missing). When the filter value is 0 or 1, the up condition produces nothing, so the mysql_global_variables_read_only value is checked instead, and we finally extract exactly the data we want.

Result

As a test, I configured VMs locally, set up MySQL servers in the following structure, and started the PMM exporters. read_only is set to OFF on node01 and node03, since they play the role of masters for data changes.

node01 : Master
  └ node02 : Slave
node03 : Master (Single)

Case1. ALL Instance

All servers are displayed, including the pmm-server instance.

pmm_result_readonly_all

Case2. READ-ONLY: OFF

As intended, only instances where READ-ONLY is OFF are shown in the graph. Note that only MySQL instances with the READ-ONLY property are shown, so the pmm-server instance drops out of the target list.

pmm_result_readonly_off

Case3. READ-ONLY: ON

Finally, READ-ONLY: ON shows only the slaves. As before, the pmm-server is excluded.

pmm_result_readonly_on

Conclusion

Here is what I did:

  • Intuitive notation for READ-ONLY via the dashboard export/import trick in Grafana
  • A Prometheus query trick to filter in two steps (ALL, READ-ONLY ON/OFF)

This may look simple and easy to an experienced hand, but if you need to see system resource status at a glance across dozens of mixed master/slave servers, sorting by role can be a big deal.

However, since this notation depends on the database's READ-ONLY value, another notation is needed when the DB is not running. I think the recording rule feature in Prometheus is a good way to solve this problem.
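
For instance, a recording rule could keep the last observed role around for a while even if the DB goes down (a sketch with assumed rule and metric names, not a tested configuration):

groups:
- name: readonly_recording
  rules:
  # remember the highest read_only value seen in the last hour, so the
  # master/slave classification survives short DB outages
  - record: instance:mysql_read_only:last1h
    expr: max_over_time(mysql_global_variables_read_only[1h])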

Good luck~ 🙂

Introduction of pmm-ruled

I personally believe that reliable service starts with monitoring and ends with tuning.

PMM is a very useful monitoring tool; I believe a super-stable service mostly comes from good service monitoring.

Anyway, when I first saw PMM I was very impressed, because I had been thirsty for a customizable database monitoring solution. However, while deeply analyzing the structure and functions of PMM, I found that some monitoring features still needed to be added.

Why is this monitoring feature so important?

A monitoring system aims to show the current indicators and status of the database properly. In this respect, I would give PMM a good score.

However, sending alerts to the administrator when a problem occurs (or before it does) is also a very important monitoring feature.

The problem was that PMM had no effective solution for this, and I needed my banking service to keep running without any further problems.

PMM Alerting

As you know, there are two ways to send alerts in PMM.

The first is to take advantage of Grafana’s Alerting feature.

You can control it from the UI, and it's intuitive.

https://www.percona.com/blog/2017/01/23/mysql-and-mongodb-alerting-with-pmm-and-grafana/
https://www.percona.com/blog/2017/02/02/pmm-alerting-with-grafana-working-with-templated-dashboards/

Unfortunately, Grafana alerting does not support template variables properly, so I would have to set up an alert for each metric individually. That is tiring, and not a suitable solution when expanding to hundreds of nodes.

The second is to use Prometheus alert rules.

Though not intuitive, this lets you specify thresholds for many nodes at once.

https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/

However, there is no way to set a dynamic threshold per node; only fixed values are supported.
Even if I want different thresholds depending on the server's purpose (e.g. production/development), there is no way to achieve that without a complicated alert rule configuration in Prometheus.

PMM-ruled

I needed a “dynamic rule manager”, so I built one around a fun idea.

Grafana alerting, as I said before, was difficult to apply, so I chose the second method, Prometheus alerting. As I mentioned a little while ago, Prometheus is a time-series database: in order to perform an operation on two pieces of data, both must exist in the same time range.

threshold-rule-metric

In other words, to configure a different threshold per server, the threshold itself must also be scraped into Prometheus periodically as a kind of metric.

PMM-ruled is an alerting threshold exporter.

That alone is not enough. Something must manage these alerting rules. If node A should receive an alert when CPU usage is over 10% and node B when it is over 20%, this information must also be stored and managed somewhere.

When monitoring dozens or hundreds of nodes with a single PMM, I did not want to manage threshold files by hand, one by one. (In fact, it would be impossible for me alone.)
Alert thresholds must be manageable per “group” or per “instance”, and I also wanted the ability to skip alerting.

alert-thresolds

If an instance-level rule exists, it takes priority for that instance. If there is no instance-level rule, the group rule takes priority. If neither instance nor group specifies a threshold, the default threshold in the alert rule is used for that node. The threshold values are layered, as in the image above.

PMM-ruled is an alerting rule management API server.

An alert rule defined with a fixed value looks like this:

groups:
- name: alert_rules
  rules:
  - alert: mysql_agent_up
    expr:  up{job="mysql"} == 0
    for: 10s
    labels: 
      level: critical
    annotations: 
      summary: MySQL Agent Down
      description: MySQL Agent Down

With pmm-ruled, the threshold becomes a metric of its own and can be managed dynamically per server, as shown below.

groups:
- name: alert_rules
  rules:
  - alert: mysql_agent_up
    expr:  up{job="mysql"} == on (instance) group_left (level, name, gname) (alert_rule_threshold{name="mysql_agent_up", level="critical"})
    for: 10s
    labels: 
      level: critical
    annotations: 
      summary: MySQL Agent Down
      description: MySQL Agent Down
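
To make that join concrete, here is a hypothetical illustration of what the exported threshold metric could look like in Prometheus exposition format (label values and ports are made up, not actual pmm-ruled output):

# one threshold sample per instance; “== on (instance) group_left …” then
# compares each instance's up value against its own threshold (0 = agent down)
alert_rule_threshold{name="mysql_agent_up", level="critical", gname="production", instance="node01:42002"} 0
alert_rule_threshold{name="mysql_agent_up", level="critical", gname="develop", instance="node02:42002"} 0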

PMM-ruled is an alerting rule file generator.

In addition, it has more features such as Prometheus recording rule management and snapshot rule management, which inserts current state values into MySQL (to show the current node status in a single table).

The figure below shows the overall configuration of pmm-ruled.

pmm-ruled-overview

Interesting?

I built this in April 2018, and since then I have been monitoring dozens of MySQL nodes with just a single PMM server. I am currently collaborating with Percona after contributing the entire pmm-ruled source code.

You can start this project from percona-lab repository.

https://github.com/percona-lab/pmm-ruled

I would also like to keep the original source, so I forked it into a separate project.

https://github.com/gywndi/pmm-ruled

Next time, I will talk about how to port pmm-ruled into PMM.

Thanks.

MySQL Sequence Automatically Initialized

Overview

MySQL does not have an object like Oracle's sequence. The first question from developers with a lot of Oracle experience is usually about sequences.

For simple sequence issuance, you can create a function and issue a sequence as shown in the link below. (Simple and efficient!)
https://www.percona.com/blog/2008/04/02/stored-function-to-generate-sequences/

Sometimes, however, I receive interesting requirements such as the following.

Initialize sequence every 00:00

I looked for the most efficient way to solve it and built a sequence that handles this “interesting requirement” myself. It can be solved easily using the technique from the blog post I shared before.
https://gywn.blog/2019/01/31/using-affected-rows-in-mysql/

Goals

As stated above, I need a sequence that is reset at 00:00 every day. In general, if you solve this in business logic (not in the database), it is implemented in the following way.

begin;

select seq, last_date, 
       current_date curr
  from seq_table 
 where seq_name = 'xxx' 
for update;

## if curr == last_date ##
  update seq_table set 
     seq = seq + 1
  where seq_name = 'xxx';
## else ##
  update seq_table set 
     seq = 1,
     last_date = current_date()
  where seq_name = 'xxx';
## end if ##
commit;

This means four queries must be sent just to update a single row, which is burdensome when network latency is high. (Think about it: with 5 ms latency, it takes at least 20 ms to issue one sequence.) And because of the contention on that particular row, branching through application business logic makes performance even worse.

If that doesn't matter to you, you can use it as is, but it is not a pretty solution to me. Is there a more efficient way? Let's look at a simple way to solve this with just a single operation.

Solution

First, let’s create a table to save the last sequence value.

create table `seq` (
  `name` varchar(20) not null,
  `curr_date` date not null,
  `val` int(10) not null,
  primary key  (`name`)
) engine = myisam;

Second, let’s create the sequence as below.

delimiter //
drop function if exists seq //
create function seq(seq_name char (3)) returns int
begin
  insert into seq (name, curr_date, val) values (seq_name, current_date(), last_insert_id(1))
      on duplicate key update 
         val = if(curr_date = current_date(), last_insert_id(val+1), last_insert_id(1)),
         curr_date = current_date();
  return last_insert_id();
end
//
delimiter ;
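
The trick that makes this work in a single statement is the expression form of LAST_INSERT_ID(): calling last_insert_id(expr) returns expr and also remembers it as the connection's next last_insert_id() value. So the INSERT ... ON DUPLICATE KEY UPDATE above both updates the counter and stashes the new value, and the function's RETURN just reads it back. A quick demonstration:

mysql> select last_insert_id(42);   -- returns 42 and remembers it
mysql> select last_insert_id();     -- returns 42 again for this connection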

The sequence is issued normally, with the expected result. 🙂

mysql> select seq('a'), seq('a'), seq('a');
+----------+----------+----------+
| seq('a') | seq('a') | seq('a') |
+----------+----------+----------+
|        1 |        2 |        3 |
+----------+----------+----------+
1 row in set (0.00 sec)

mysql> select seq('a'), seq('a'), seq('a');
+----------+----------+----------+
| seq('a') | seq('a') | seq('a') |
+----------+----------+----------+
|        4 |        5 |        6 |
+----------+----------+----------+
1 row in set (0.01 sec)

To test the day rollover, change the session timestamp as shown below.

mysql> select current_date();
+----------------+
| current_date() |
+----------------+
| 2019-03-03     |
+----------------+
1 row in set (0.00 sec)

mysql> set timestamp = unix_timestamp('2019-03-04');
Query OK, 0 rows affected (0.00 sec)

mysql> select current_date();
+----------------+
| current_date() |
+----------------+
| 2019-03-04     |
+----------------+
1 row in set (0.00 sec)

At this point, the sequence is issued with initialized values, as shown below. You no longer have to run four queries to issue the next sequence value.

mysql> select seq('a'), seq('a'), seq('a');
+----------+----------+----------+
| seq('a') | seq('a') | seq('a') |
+----------+----------+----------+
|        1 |        2 |        3 |
+----------+----------+----------+
1 row in set (0.00 sec)

Cautions

  1. Using the InnoDB storage engine is helpful if you can issue the sequence in a separate transaction.
    • InnoDB : 11,564 op/s
    • MyISAM : 8,813 op/s
  2. With semi-sync replication, long node-to-node latency impacts sequence issuance regardless of the storage engine. (Semi-sync involves the IO thread between master and slave.)

Conclusion

I made a simple sequence that is automatically initialized every day. Depending on the service, some teams implement complex business logic just to reset this value.

Here, I simply compared the sequence's last issue date with the current date, decided whether to reset, and issued the next value.

It's very easy and simple, but useful, so I'm sharing it!! 🙂

Using “Affected Rows” in MySQL

Overview

Have you ever thought about using “Affected Rows” in your application?

As you all know, “Affected Rows” is the number of rows actually affected by a DML statement. It is usually used to check how many rows were changed by an insert/update/delete.

In this post I would like to talk about how to use “Affected Rows” to solve various requirements in a fun way.

Affected rows in MySQL

As I mentioned before, “Affected Rows” in MySQL counts the rows that have really changed.
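
The demo below assumes a minimal two-column table, something along these lines (the DDL isn't shown in the original post):

create table test (i int, j int);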

mysql> insert into test values (1,1);
Query OK, 1 row affected (0.00 sec)

mysql> update test set j = 1;
Query OK, 0 rows affected (0.01 sec)
Rows matched: 1 Changed: 0 Warnings: 0

mysql> update test set j = 2;
Query OK, 1 row affected (0.00 sec)
Rows matched: 1 Changed: 1 Warnings: 0

This looks like nothing special in your application, but if you think a little differently, you can get a free lunch: fewer queries. Let me explain. 🙂

Service requirements

Suppose you have received the following requirements from the service planner:

  • Reset count every day at 00:00
  • Participate only 5 times in a certain cycle

Normally, this would be implemented by bundling queries into a transaction, as shown below. (This is just pseudocode.)


try{
    execute("BEGIN");
    row = execute("SELECT * FROM user_event WHERE user_id = 100 FOR UPDATE");

    // 1. Check Time
    if(last_applied_time == CURRENT_DATE){

        // 2. Check Count
        if(apply_count < 5){
            execute("UPDATE user_event SET apply_count = apply_count + 1 WHERE user_id = 100");
        }
    }else{

        // 3. Reset Count
        execute("UPDATE user_event SET apply_count = 1 WHERE user_id = 100");
    }
    execute("COMMIT");
}catch(Exception e){
    execute("ROLLBACK");
}

Of course, this is not wrong, but I want to solve this service requirement in a more fun way with “affected rows”.

My own solution

Let’s create the following table before I explain it.

CREATE TABLE `user_event` (
  `user_id` int(11) NOT NULL,
  `apply_count` int(11) NOT NULL,
  `last_applied_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (`user_id`)
) ENGINE=InnoDB

It's just a very simple table holding each user's event apply count and the last applied time (timestamp). Now, let's build up the query step by step to satisfy the planner's requirements.

First, the user has no row yet, so we can start with a plain INSERT, as below.

INSERT INTO user_event (user_id, apply_count) VALUES (1, 1);

Second, let's add more meaning to the query so that the count is incremented every time the user participates in the event.

INSERT INTO user_event (user_id, apply_count) VALUES (1, 1)
    ON DUPLICATE KEY UPDATE apply_count = apply_count + 1

By the way, the planner requires the counter to be reset every day: the apply_count should only be incremented if the last entry date is the same as today's date. To satisfy the date requirement (1. Check Time), branch with if() as follows.

INSERT INTO user_event (user_id, apply_count) VALUES (1, 1) 
    ON DUPLICATE KEY UPDATE
       apply_count = if(date(last_applied_time) = current_date, apply_count + 1, 1)

Finally, let's set the participation limit: “apply_count” is only incremented when the current value is less than 5. So we arrive at the final query, as below.

INSERT INTO user_event (user_id, apply_count) VALUES (1, 1) 
    ON DUPLICATE KEY UPDATE
       apply_count = if(date(last_applied_time) = current_date, if(apply_count < 5, apply_count + 1, apply_count), 1)

Now we have solved these complex requirements in just a single statement. Let's do some fun tests. :-)

I'm not patient enough to wait hours for the date to change 🙂, so I tweaked the query to reset every 10 seconds instead of daily, as shown below.

INSERT INTO user_event (user_id, apply_count) VALUES (1, 1) 
    ON DUPLICATE KEY UPDATE
       apply_count = if(timestampdiff(second, last_applied_time, now()) < 10, if(apply_count < 5, apply_count + 1, apply_count), 1)

And we get the pretty results below. Note that the application can determine whether the user's participation was counted (1 or more) or restricted (0) from the affected-rows value returned with the query result.

1. First Insert

1 row affected means there was no target row in the table, so this was a plain first INSERT.

mysql> INSERT INTO user_event .. ON DUPLICATE KEY UPDATE .. 
Query OK, 1 row affected (0.00 sec)

mysql> select * from user_event;
+---------+-------------+---------------------+
| user_id | apply_count |   last_applied_time |
+---------+-------------+---------------------+
|       1 |           1 | 2018-03-25 23:05:38 |
+---------+-------------+---------------------+
1 row in set (0.00 sec)

2. Second Insert in 10s

2 rows affected means the target row already exists and was updated via the duplicate-key path. Does it make sense so far?

mysql> INSERT INTO user_event .. ON DUPLICATE KEY UPDATE .. 
Query OK, 2 rows affected (0.00 sec)

mysql> select * from user_event;
+---------+-------------+---------------------+
| user_id | apply_count |   last_applied_time |
+---------+-------------+---------------------+
|       1 |           2 | 2018-03-25 23:05:41 |
+---------+-------------+---------------------+
1 row in set (0.00 sec)

3. Insert more than 5 times

Affected rows is returned as 2 while counting up; once apply_count reaches 5, 0 is returned.

mysql> INSERT INTO user_event .. ON DUPLICATE KEY UPDATE .. 
Query OK, 0 rows affected (0.00 sec)

mysql> select * from user_event;
+---------+-------------+---------------------+
| user_id | apply_count |   last_applied_time |
+---------+-------------+---------------------+
|       1 |           5 | 2018-03-25 23:05:46 |
+---------+-------------+---------------------+
1 row in set (0.00 sec)

4. After 10 seconds

After the 10-second window has passed, affected rows returns to 2 and apply_count is reset to 1. An affected-rows value of 2 means the user's apply_count was changed successfully, whether by reset or by increment.

mysql> INSERT INTO user_event .. ON DUPLICATE KEY UPDATE .. 
Query OK, 2 rows affected (0.00 sec)

mysql> select * from user_event;
+---------+-------------+---------------------+
| user_id | apply_count |   last_applied_time |
+---------+-------------+---------------------+
|       1 |           1 | 2018-03-25 23:05:56 |
+---------+-------------+---------------------+
1 row in set (0.00 sec)

We don't need to execute another query to get the result; at the application level we can distinguish the status and show a variety of messages to users. (No transaction, no multiple queries, just one query: a free lunch box.)

In other words, to summarize, the following behavior can be expected.

Query OK, 1 row affected (0.00 sec)  <= Insert 1 (First)
Query OK, 2 rows affected (0.00 sec) <= Update to 2
Query OK, 2 rows affected (0.00 sec) <= Update to 3
Query OK, 2 rows affected (0.00 sec) <= Update to 4
Query OK, 2 rows affected (0.00 sec) <= Update to 5
Query OK, 0 rows affected (0.00 sec) <= No action
Query OK, 2 rows affected (0.00 sec) <= Update to 1 (Initialized)

However, in Java you need to add the JDBC parameter useAffectedRows=true to get this behavior. (Check! https://bugs.mysql.com/bug.php?id=39352)
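
In case it helps, here is a minimal Connector/J sketch (connection details are placeholders) that reads the affected-rows value from executeUpdate():

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class ApplyEvent {
    public static void main(String[] args) throws Exception {
        // useAffectedRows=true makes MySQL report changed rows, not matched rows
        String url = "jdbc:mysql://127.0.0.1:3306/test?useAffectedRows=true";
        try (Connection conn = DriverManager.getConnection(url, "user", "pass");
             PreparedStatement ps = conn.prepareStatement(
                 "INSERT INTO user_event (user_id, apply_count) VALUES (?, 1) "
                 + "ON DUPLICATE KEY UPDATE apply_count = "
                 + "if(date(last_applied_time) = current_date, "
                 + "if(apply_count < 5, apply_count + 1, apply_count), 1)")) {
            ps.setInt(1, 100);
            int affected = ps.executeUpdate();
            // 1 = first insert, 2 = counted (incremented or reset), 0 = restricted
            System.out.println(affected > 0 ? "accepted" : "limit reached");
        }
    }
}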

Conclusion

If the data is not actually changed, affected rows is 0. This behavior depends on the DBMS, and is (probably) specific to MySQL.

However, if you understand and use this behavior, you can solve certain kinds of problems much more comfortably. Besides limiting an event apply count for a certain period, it can also be useful for keeping summarized data on a daily or monthly basis. If you think about it, you can easily implement an auto-initialized sequence using affected rows. I already have something roughly built, and I will write it up next time.

This is my first post in English, and it was tough.
Thanks, have a nice day~ 🙂