Impact of bulk row update queries on the binary log

Overview

I suddenly wondered what bulk row update queries do to the binary log.

For example, if a table of several tens of gigabytes were updated at once and the whole change were sent as a single update rows event, the target database would run out of memory (OOM). Extremely dangerous.

Let's talk briefly about how MySQL handles this.

Bulk row update

In MySQL (with ROW-based logging), the data actually written to the DB by an INSERT query is written to the binlog as-is.

If you use a syntax like insert into .. select .., all of this data is recorded in the binlog in the form of an array. This means that multiple rows are written into one write rows event.

So, using mysql-binlog-connector-java, I simply printed out the row count carried by each WriteRowsEvent and UpdateRowsEvent.

Each event delivers its data as <Serializable[]> (write) or <Serializable[],Serializable[]> (update, a before/after pair).

// Write row event processing
WriteRowsEventData eventData = (WriteRowsEventData) event.getData();
System.out.println("WriteRowsEventData>>"+eventData.getRows().size());

// Update row event processing
UpdateRowsEventData eventData = (UpdateRowsEventData) event.getData();
System.out.println("UpdateRowsEventData>>"+eventData.getRows().size());

I simply checked the row count.

mysql> CREATE TABLE `binlog_test` (
    ->   `i` int NOT NULL AUTO_INCREMENT,
    ->   `v` text,
    ->   PRIMARY KEY (`i`)
    -> );

mysql> INSERT INTO `binlog_test` (v) VALUES (repeat(md5(rand()), 10));
Query OK, 1 row affected (0.00 sec)

mysql> INSERT INTO `binlog_test` (v) SELECT repeat(md5(rand()), 10) FROM binlog_test;
Query OK, 1 row affected (0.00 sec)
Records: 1  Duplicates: 0  Warnings: 0

mysql> INSERT INTO `binlog_test` (v) SELECT repeat(md5(rand()), 10) FROM binlog_test;
Query OK, 2 rows affected (0.00 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql> INSERT INTO `binlog_test` (v) SELECT repeat(md5(rand()), 10) FROM binlog_test;
Query OK, 4 rows affected (0.00 sec)
Records: 4  Duplicates: 0  Warnings: 0

mysql> INSERT INTO `binlog_test` (v) SELECT repeat(md5(rand()), 10) FROM binlog_test;
Query OK, 8 rows affected (0.00 sec)
Records: 8  Duplicates: 0  Warnings: 0

mysql> INSERT INTO `binlog_test` (v) SELECT repeat(md5(rand()), 10) FROM binlog_test;
Query OK, 16 rows affected (0.00 sec)
Records: 16  Duplicates: 0  Warnings: 0

mysql> INSERT INTO `binlog_test` (v) SELECT repeat(md5(rand()), 10) FROM binlog_test;
Query OK, 32 rows affected (0.01 sec)
Records: 32  Duplicates: 0  Warnings: 0

mysql> INSERT INTO `binlog_test` (v) SELECT repeat(md5(rand()), 10) FROM binlog_test;
Query OK, 64 rows affected (0.00 sec)
Records: 64  Duplicates: 0  Warnings: 0

Each statement re-inserts as many rows as the table currently holds, so the row count doubles each time. (1 -> 1 -> 2 -> .. -> 64)

WriteRowsEventData>>1
======================
WriteRowsEventData>>1
======================
WriteRowsEventData>>2
======================
WriteRowsEventData>>4
======================
WriteRowsEventData>>8
======================
WriteRowsEventData>>16

## divided ##
======================
WriteRowsEventData>>25
WriteRowsEventData>>7
======================
WriteRowsEventData>>25
WriteRowsEventData>>25
WriteRowsEventData>>14

Interestingly, up to 16 rows the insert is transmitted as a single write rows event, as intended. From the 32-row case on, multiple write rows events are transmitted, split into chunks of roughly 25 rows each.

mysql> update binlog_test set v = repeat(md5(rand()), 10);

This time, approximately 12 rows fit into a single update rows event. This is because an update rows event additionally includes the before image of each row (the <Serializable[],Serializable[]> format).

UpdateRowsEventData>>12
UpdateRowsEventData>>12
UpdateRowsEventData>>12
UpdateRowsEventData>>12
UpdateRowsEventData>>12
UpdateRowsEventData>>12
UpdateRowsEventData>>12
UpdateRowsEventData>>12
UpdateRowsEventData>>12
UpdateRowsEventData>>12
UpdateRowsEventData>>8
======================

If you roughly calculate the size of one event, it comes out to about 8KB. This threshold is controlled by --binlog-row-event-max-size. Note that 'binlog_row_event_max_size' is a read-only variable, so you need to set it in my.cnf before starting the MySQL server.

Specify the maximum size of a row-based binary log event, in bytes. Rows are grouped into events smaller than this size if possible. The value should be a multiple of 256. The default is 8192.

binlog_row_event_max_size

You might ask whether a single operation becomes unstable when its data is split across events like this. However, the events are recorded in the binlog in the form below and processed together.

In other words, there is no problem, because the data is still processed within a single transaction.

query event(begin)
table map event
> update row event (12)
> update row event (12)
> update row event (12)
> update row event (12)
> update row event (12)
> update row event (8)
xid event (commit)

Conclusion

This is something I checked out of personal curiosity. Since I work with the binlog frequently, this content is meaningful to me.

In a situation where a single row exceeds 8KB and 5-6 such rows are frequently updated at a time, you can receive binlog data more efficiently by raising binlog_row_event_max_size.

Where the default 8KB binlog_row_event_max_size would deliver each row as its own update rows event, one update rows event can then deliver them together as an array. (As long as memory allows, of course.)
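Since the variable is read-only at runtime, raising it means adding a line like the following to my.cnf before restarting the server (65536 here is just an example value, not a recommendation):

```ini
[mysqld]
# Must be a multiple of 256; the default is 8192
binlog_row_event_max_size = 65536
```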

Hope this helps. 🙂

Use MySQL as a Cache server – Memcached plugin

Overview

In this post, I will talk about MySQL InnoDB Memcached plugin.

Although there are many other posts about this kind of subject, I focus on the goal of applying the actual service. It also contains its own philosophy of manipulating the flow of data well.

So, let's get started!

Cache layer?

As the traffic of a service grows to monstrous levels, it becomes difficult to serve it with the DB alone.

Of course, most small and medium-sized workloads can be served with a single database, but with giga-scale traffic there is a performance limit to fetching static, unchanging data from the database every time.

More than anything, a database has limited resources; monster-scale traffic simply cannot all be handled by the DB alone.

So, back when disk I/O was poor, I remember handling frequently queried data with a memory-based DB such as ALTIBASE in order to serve this kind of workload.

There are many purposes for a cache layer, but I would like to highlight the two cases below:

  • Queuing or preprocessing writes in the cache and persisting them asynchronously.
  • Serving data that is queried very frequently but rarely changes, after loading it into memory.

The latter is what I will be talking about here: data with a very high query frequency but few changes.

Let me add one more requirement, which makes this a really interesting story: show only data that has been fully processed and committed. (like READ-COMMITTED!!)

MySQL Memcached plugin

There are many different cache stores, but the reason I was interested in the MySQL memcached plugin is simple.

It can access InnoDB data directly, skipping the query parsing and optimizer steps. In the past, I was interested in HandlerSocket in MariaDB, which has a similar concept.

Ref> https://dev.mysql.com/doc/refman/8.0/en/innodb-memcached-intro.html

Oh, of course, it is possible to allocate separate cache memory and control the memcached policies inside MySQL. But in an environment with better-than-expected random I/O, such as SSDs, we already have a cache memory area: the InnoDB buffer pool.

So that option did not really appeal to me. (Of course, since InnoDB is a B-Tree, you cannot compare its raw performance with a hash structure that normally looks up data in O(1).)

Direct access to InnoDB data? This can also be interpreted differently: as a non-volatile caching layer. That is, a caching layer with an infinite TTL and no separate expire time!! I could see new possibilities.

Configuration

You may refer to the Oracle documentation (innodb-memcached-setup), but today I will configure the memcached plugin purely for use over the memcache protocol.

First, load the memcached-related schema as shown below to set up the InnoDB plugin. 🙂 Here, $MYSQL_HOME means the home directory where MySQL is installed, which will vary depending on your system configuration.

$ mysql -uroot < $MYSQL_HOME/share/innodb_memcached_config.sql

I will set up a separate token table as shown below.

## Create table
mysql> CREATE DATABASE `memcache_test`;
mysql> CREATE TABLE `memcache_test`.`token` (
    ->  `id` varchar(32) NOT NULL,
    ->  `token` varchar(128) NOT NULL,
    ->  PRIMARY KEY (`id`)
    ->);

Since data changes will be performed through SQL, configure the policies as follows so that only the GET operation is allowed through memcached.

## Policy to allow only GET operation
mysql> update cache_policies set 
    ->   get_policy = 'innodb_only', 
    ->   set_policy = 'disabled', 
    ->   delete_policy='disabled', 
    ->   flush_policy = 'disabled';

## Delete the existing policy, then add a new token policy
## (containers columns: name, db_schema, db_table, key_columns, value_columns, flags, cas_column, expire_time_column, unique_idx_name_on_key)
mysql> delete from containers;
mysql> insert into containers values ('token', 'memcache_test', 'token', 'id', 'token', 0,0,0, 'PRIMARY');

Now, modify the setting information so that the memcached plugin can see the table created above, and finally upload the plugin.

## Start InnoDB memcached plugin
mysql> INSTALL PLUGIN daemon_memcached soname "libmemcached.so"; 

After going through this process, you have an environment that can directly access DB data using the memcached protocol. Port 11211 is opened by default, and you can fetch data directly from the InnoDB data area by performing a GET operation with the memcache protocol on this port.

In other words, data changed with SQL can be read directly over memcache. (You can also map several table columns into a single memcache VALUE with delimiters, but I will skip that here.)

Performance

A non-volatile caching layer? Of course, performance is a concern. No matter how promising it looks, it is not a good solution if response time or stability is poor in a caching system that handles a huge amount of reads. So I compared InnoDB data lookups through a simple PK-based single-row query with lookups through the memcached protocol.

Let's configure the environment for a simple performance test: generate about 1 million rows of test data in the token table as shown below.

## Generate test data
mysql> insert ignore into token 
    -> select md5(rand()), concat(uuid(), md5(rand())) from dual;
mysql> insert ignore into token 
    -> select md5(rand()), concat(uuid(), md5(rand())) from token;
... repeat ...

mysql> select count(*) from token;
+----------+
| count(*) |
+----------+
|   950435 |
+----------+

I did a simple test. SQL vs memcached!!

## SQL ##
SELECT * FROM token WHERE id = ${id}

## Memcache ##
get @@token.${id}

Using the full set of generated ID values, I measured system resources while running approximately 70,000 queries per second against each interface in a random access pattern.

To evaluate stability as well, I ran a stress test with continuous traffic of 70,000 requests per second for about 10 days. As a result, not a single error occurred, and each operation showed a response time of about 0.3ms to 0.5ms. At this level, it can be used as a cache in production. 🙂

+-----+----------+
| ms  | seconds  |
+-----+----------+
| 0.2 |       99 |
| 0.3 |   661736 |
| 0.4 |   162582 |
| 0.5 |     5686 |
| 0.6 |     1769 |
| 0.7 |     1004 |
| 0.8 |      576 |
| 0.9 |      310 |
| 1.0 |      159 |
| 1.1 |       77 |
| 1.2 |       29 |
| 1.3 |       12 |
| 1.4 |        4 |
| 1.5 |        1 |
+-----+----------+

I even pushed the stress to 140,000 GET operations per second. There was no problem with the service at all, and average CPU utilization was about 15%.. it was in a very stable state. Amazing!

Monitoring

No matter how great a service is, it is useless without a tool to check its health in real time, right? So I needed to find the most effective monitoring method. Personally, I prefer to collect metrics using Prometheus.

The reason I prefer Prometheus is simple.

There are so many ready-made exporters, and if I want something, I can easily extend one to suit my own taste! And with Prometheus, a time-series-based data store, I can collect monitoring metric information really efficiently!

It is no exaggeration to say that Prometheus is optimized for metric collection.

Of course, the memcached exporter officially released by Prometheus has a problem collecting InnoDB memcached information. Stats lines with an extra field in the stats settings result are treated as an abnormal pattern, and memcached is reported as not running normally. If something doesn't work, I just fix it. That's the fun of open source.

// AS-IS ======================
stats := map[string]string{}
for err == nil && !bytes.Equal(line, resultEnd) {
    s := bytes.Split(line, []byte(" "))
    if len(s) != 3 || !bytes.HasPrefix(s[0], resultStatPrefix) {
        return fmt.Errorf("memcache: unexpected stats line format %q", line)
    }
    stats[string(s[1])] = string(bytes.TrimSpace(s[2]))
    line, err = rw.ReadSlice('\n')
    if err != nil {
        return err
    }
}

// TO-BE ======================
stats := map[string]string{}
for err == nil && !bytes.Equal(line, resultEnd) {
    s := bytes.Split(line, []byte(" "))
    if len(s) == 3 {
        stats[string(s[1])] = string(bytes.TrimSpace(s[2]))
    } else if len(s) == 4 {
        stats[string(s[1])] = string(bytes.TrimSpace(s[2])) + "-" + string(bytes.TrimSpace(s[3]))
    } else {
        return fmt.Errorf("memcache: unexpected stats line format %q", line)
    }
    line, err = rw.ReadSlice('\n')
    if err != nil {
        return err
    }
}

By modifying just a few lines, this unusual format was handled easily.

From now on, I can collect metrics and send alerts through prometheus, and check real-time metrics through Grafana! I’ve got a powerful monitoring tool.

Beyond physical memory

With only the contents above, this is already perfectly usable in production as a cache layer.

However, I imagined possibilities beyond this.

As you all know, one of MySQL's powerful features is data replication. If you replicate the master's data reliably and put the replicas to good use for the service, you can naturally spread READ traffic as shown below. 🙂

Since all nodes have the same data, the same result is obtained no matter which node serves the read. In other words, as shown in the figure above, cache data is normally served per group, but if a specific node fails there is no service impact at all, even if the data is pulled from another nearby MySQL node.

Moreover, since each application reads only as much data as it needs, there is no reason for all of the replicated data to fit in the InnoDB buffer pool. In other words, non-volatile cache data that never disappears is naturally distributed and served.

One more thing! If you take full advantage of MySQL replication, you cannot overlook the binary log. The binary log is a file that stores the history of data changes in MySQL, and replication works by replaying it as-is on the slave (applying the change history from the master exactly).

In ROW format, the changed data itself is included, and with a FULL row image both the before and after data are included, so you can imagine an interesting CDC structure for caching, as shown below.

When data changes occur in cache-related tables (such as member information) on the master DB, a replicator can detect them, convert the data into JSON, and put it into the MySQL-backed cache.

The related project is at the URL below. However, it is an early-stage project, so it still needs quite a bit of work.
>> https://github.com/gywndi/uldra-binlog-json-transfer

If this happens, data from the service (red) DB on the left is converted to JSON as naturally as flowing water, without any manual manipulation, and stored in the non-volatile caching layer. The application can then serve caching reads asynchronously based on the JSON-converted data.

Of course, I have described this lightly here, but it will require a lot of defensive logic to handle side effects such as replication lag.

Conclusion

I do not claim that what I have discussed here is the best solution for every service.

However, an alternative such as the InnoDB memcached plugin offered quite a suitable solution to the problem at hand, and I wanted to share the process of exploring new possibilities by combining it with MySQL's distinctive features.

The goal is not to have a system without failures, but to create a robust system with a low probability of failure, a system that can be restored quickly even if it fails!!

To that end, I took this opportunity to work through the solutions I had imagined by combining the strengths of each system. 🙂

Thanks for reading this long post. ^^

Create your own Exporter in Go !

Overview

Hi, it's a very hot summer in Korea. Today I want to talk about an interesting and exciting topic: making your own exporter in the Go language.

The idea is a simple program: register a query, and the exporter exposes that query's result as metrics. Some of you may still be unfamiliar with what an Exporter is.

I will explain about Exporter step by step in today’s post.

Exporter?

You can think of an Exporter as an HTTP server that a time-series database like Prometheus scrapes. Prometheus periodically calls a specific URL on the exporter and saves the resulting metrics as time series.

There are many exporters out there already.

Typically, there is mysqld_exporter, one of Prometheus's official projects, and the mysqld_exporter that Percona forks and additionally maintains. Besides these, there are node_exporter for monitoring Linux nodes, memcached_exporter, and so on.

For reference, you can see various exporters here.
>> https://exporterhub.io

What I am going to present on my blog is the process of adding my own new exporter to this ecosystem. Let's go!

Creating a Go project

Exporter can be implemented in various languages, but today I will implement it with Go.

Personally, I find Go very convenient in terms of distribution and compatibility. I will omit the Go installation and environment configuration here.

$ cd ~/go/src

$ mkdir -p query-exporter-simple

$ cd query-exporter-simple

$ go mod init
go: creating new go.mod: module query-exporter-simple

$ ls -al
total 8
drwxr-xr-x   3 chan  staff   96  7 12 13:33 .
drwxr-xr-x  12 chan  staff  384  7 12 13:33 ..
-rw-r--r--   1 chan  staff   38  7 12 13:33 go.mod

$ cat go.mod
module query-exporter-simple

go 1.16
 

Although it is a bare-bones project, everything is now ready to make your own exporter. From now on, packages are managed with "go mod".

Try Empty Exporter

Now, let’s start making the Exporter in earnest.

First, as a taster, let's make an empty Exporter with no real function: it simply exposes the exporter version.

We start by reading OS parameters using flags. "bind" is the HTTP binding address used when the Exporter starts.

package main

import (
    "flag"
)

func main() {
    // =====================
    // Get OS parameter
    // =====================
    var bind string
    flag.StringVar(&bind, "bind", "0.0.0.0:9104", "bind")
    flag.Parse()
}
 

Next, register a Collector and run the Exporter as an HTTP server. A Collector is conceptually a worker that gathers information, and it implements Prometheus's Collector interface.

package main

import (
    "flag"
    "net/http"

    "github.com/prometheus/client_golang/prometheus"
    "github.com/prometheus/client_golang/prometheus/promhttp"
    "github.com/prometheus/common/version"
    log "github.com/sirupsen/logrus"
)

func main() {
    // =====================
    // Get OS parameter
    // =====================
    var bind string
    flag.StringVar(&bind, "bind", "0.0.0.0:9104", "bind")
    flag.Parse()

    // ========================
    // Regist handler
    // ========================
    log.Infof("Regist version collector - %s", "query_exporter")
    prometheus.Register(version.NewCollector("query_exporter"))

    // Regist http handler
    log.Infof("HTTP handler path - %s", "/metrics")
    http.HandleFunc("/metrics", func(w http.ResponseWriter, r *http.Request) {
        h := promhttp.HandlerFor(prometheus.Gatherers{
            prometheus.DefaultGatherer,
        }, promhttp.HandlerOpts{})
        h.ServeHTTP(w, r)
    })

    // start server
    log.Infof("Starting http server - %s", bind)
    if err := http.ListenAndServe(bind, nil); err != nil {
        log.Errorf("Failed to start http server: %s", err)
    }
}
 

Since the packages used by the source code do not yet exist in the project, numerous errors will occur.

So, fetch the related packages with "go mod vendor" as shown below. They are placed under the vendor directory.

$ go mod vendor
go: finding module for package github.com/prometheus/common/version
go: finding module for package github.com/prometheus/client_golang/prometheus
go: finding module for package github.com/sirupsen/logrus
go: finding module for package github.com/prometheus/client_golang/prometheus/promhttp
go: found github.com/prometheus/client_golang/prometheus in github.com/prometheus/client_golang v1.11.0
go: found github.com/prometheus/client_golang/prometheus/promhttp in github.com/prometheus/client_golang v1.11.0
go: found github.com/prometheus/common/version in github.com/prometheus/common v0.29.0
go: found github.com/sirupsen/logrus in github.com/sirupsen/logrus v1.8.1

$ ls -al
total 112
drwxr-xr-x   6 chan  staff    192  7 13 10:26 .
drwxr-xr-x  12 chan  staff    384  7 12 13:33 ..
-rw-r--r--   1 chan  staff    169  7 13 10:26 go.mod
-rw-r--r--   1 chan  staff  45722  7 13 10:26 go.sum
-rw-r--r--   1 chan  staff   1163  7 13 10:34 main.go
drwxr-xr-x   6 chan  staff    192  7 13 10:26 vendor

If you start the Exporter server, the server will be run on port 9104 (the port specified by default in flag).

$ go run .
INFO[0000] Regist version collector - query_exporter
INFO[0000] HTTP handler path - /metrics
INFO[0000] Starting http server - 0.0.0.0:9104

If you want to change the port, pass the bind OS parameter as below, and the server will run on that port.

$ go run . --bind=0.0.0.0:9105
INFO[0000] Regist version collector - query_exporter
INFO[0000] HTTP handler path - /metrics
INFO[0000] Starting http server - 0.0.0.0:9105

Even though it is an empty Exporter, you can see that a lot of information is already exposed. (Most of it is about the Go runtime itself.)

$ curl 127.0.0.1:9104/metrics
# HELP go_gc_duration_seconds A summary of the pause duration of garbage collection cycles.
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 0
go_gc_duration_seconds{quantile="0.25"} 0

.. skip ..

# HELP go_threads Number of OS threads created.
# TYPE go_threads gauge
go_threads 7
# HELP query_exporter_build_info A metric with a constant '1' value labeled by version, revision, branch, and goversion from which query_exporter was built.
# TYPE query_exporter_build_info gauge
query_exporter_build_info{branch="",goversion="go1.16.5",revision="",version=""} 1

At the very bottom is the query_exporter_build_info metric, collected by the Collector we registered in the previous section. This is the moment we created a new Exporter that collects version information!

Creating an Exporter in earnest

We made an empty Exporter that exposes only the exporter version. Easy, right? 🙂

From now on, I'm going to implement a Collector that collects the information we really need from the database and serves the result via HTTP GET.

1. Configuration format (YAML)

As I said before, I want to expose the result of a registered query as exporter metrics. To do this, we need the queries to execute as well as connection information for the target instance.

Let's set it up in the format below: MySQL connection information and the queries to run. It will expose two metrics as results: "process count by host" and "process count by user".

dsn: test:test123@tcp(127.0.0.1:3306)/information_schema
metrics:
  process_count_by_host:
    query: "select user, 
                  substring_index(host, ':', 1) host, 
                  count(*) sessions 
            from information_schema.processlist
            group by 1,2 "
    type: gauge
    description: "process count by host"
    labels: ["user","host"]
    value: sessions
  process_count_by_user:
    query: "select user, count(*) sessions 
            from information_schema.processlist 
            group by 1 "
    type: gauge
    description: "process count by user"
    labels: ["user"]
    value: sessions

I defined the above YAML as the following Go struct.

type Config struct {
    DSN     string
    Metrics map[string]struct {
        Query       string
        Type        string
        Description string
        Labels      []string
        Value       string
        metricDesc  *prometheus.Desc
    }
}

Here, metricDesc *prometheus.Desc can be understood as the specification used for a Prometheus metric: it defines the metric name, description, and labels, while the metric type (Counter/Gauge) is kept separately in the config.

Read the YAML file as below, and load the setting information into the structure defined above.

var b []byte
var config Config
if b, err = ioutil.ReadFile("config.yml"); err != nil {
    log.Errorf("Failed to read config file: %s", err)
    os.Exit(1)
}

// Load yaml
if err := yaml.Unmarshal(b, &config); err != nil {
    log.Errorf("Failed to load config: %s", err)
    os.Exit(1)
}

In this way, we can now put the necessary information in the Config structure and use it to implement the desired implementation.

package main

import (
    "flag"
    "io/ioutil"
    "net/http"
    "os"

    "github.com/ghodss/yaml"
    "github.com/prometheus/client_golang/prometheus"
    "github.com/prometheus/client_golang/prometheus/promhttp"
    "github.com/prometheus/common/version"
    log "github.com/sirupsen/logrus"
)

var config Config

func main() {
    var err error
    var configFile, bind string
    // =====================
    // Get OS parameter
    // =====================
    flag.StringVar(&configFile, "config", "config.yml", "configuration file")
    flag.StringVar(&bind, "bind", "0.0.0.0:9104", "bind")
    flag.Parse()

    // =====================
    // Load config & yaml
    // =====================
    var b []byte
    if b, err = ioutil.ReadFile(configFile); err != nil {
        log.Errorf("Failed to read config file: %s", err)
        os.Exit(1)
    }

    // Load yaml
    if err := yaml.Unmarshal(b, &config); err != nil {
        log.Errorf("Failed to load config: %s", err)
        os.Exit(1)
    }

    // ========================
    // Regist handler
    // ========================
    log.Infof("Regist version collector - %s", "query_exporter")
    prometheus.Register(version.NewCollector("query_exporter"))

    // Regist http handler
    log.Infof("HTTP handler path - %s", "/metrics")
    http.HandleFunc("/metrics", func(w http.ResponseWriter, r *http.Request) {
        h := promhttp.HandlerFor(prometheus.Gatherers{
            prometheus.DefaultGatherer,
        }, promhttp.HandlerOpts{})
        h.ServeHTTP(w, r)
    })

    // start server
    log.Infof("Starting http server - %s", bind)
    if err := http.ListenAndServe(bind, nil); err != nil {
        log.Errorf("Failed to start http server: %s", err)
    }
}

// =============================
// Config config structure
// =============================
type Config struct {
    DSN     string
    Metrics map[string]struct {
        Query       string
        Type        string
        Description string
        Labels      []string
        Value       string
        metricDesc  *prometheus.Desc
    }
}

2. Implement Collector

The highlight of today's post is implementing a Collector that collects the desired information from the database.

Everything implemented so far only serves results over HTTP. The Collector is what actually connects to the database, executes the registered queries, and delivers the specified metrics based on the results.

type QueryCollector struct{}

// Describe prometheus describe
func (e *QueryCollector) Describe(ch chan<- *prometheus.Desc) {
}

// Collect prometheus collect
func (e *QueryCollector) Collect(ch chan<- prometheus.Metric) {
}

As mentioned earlier, a Collector is conceptually a worker that gathers information, implemented as a structure satisfying Prometheus's Collector interface. In other words, if you want to create your own Collector, you must implement the two methods defined by the prometheus.Collector interface: Describe and Collect.

Register the Collector defined as below.

func main(){
    .. skip ..
    // ========================
    // Regist handler
    // ========================
    log.Infof("Regist version collector - %s", "query_exporter")
    prometheus.Register(version.NewCollector("query_exporter"))
    prometheus.Register(&QueryCollector{})
    .. skip ..
}

Both the version Collector registered earlier and the newly added QueryCollector are now registered. When an HTTP request arrives at "/metrics", both Collectors are executed concurrently.

2-1. Create the Describe function

This is the part that defines the specification of each metric. Strictly speaking, it is not necessary to define the metric specification here, but it is useful if you consider creating and operating multiple Collectors. This method is executed only once, when the Collector is registered with prometheus.Register.

func (e *QueryCollector) Describe(ch chan<- *prometheus.Desc) {
    for metricName, metric := range config.Metrics {
        metric.metricDesc = prometheus.NewDesc(
            prometheus.BuildFQName("query_exporter", "", metricName),
            metric.Description,
            metric.Labels, nil,
        )
        config.Metrics[metricName] = metric
        log.Infof("metric description for \"%s\" registered", metricName)
    }
}

Here, the specification of each metric is defined from the query-related settings loaded earlier.

  • prometheus.BuildFQName: the name of the metric
  • metric.Description: the metric description
  • metric.Labels: an array of label names; label values must be supplied in this same order later

If you look at the config.yml, each mapping will be as follows.

metrics:
  # metricName
  process_count_by_user:
    ## metric.Description
    description: "process count by user"
    ## metric.Labels
    labels: ["user"]

2-2. Create the Collect function

This is the part that connects to the DB, executes the registered SQL, and turns the results into metrics.

The execution result (rows) of each query is exposed as metric names and values, as shown in the figure above.

func (e *QueryCollector) Collect(ch chan<- prometheus.Metric) {

    // Connect to database
    db, err := sql.Open("mysql", config.DSN)
    if err != nil {
        log.Errorf("Connect to database failed: %s", err)
        return
    }
    defer db.Close()

    // Execute each queries in metrics
    for name, metric := range config.Metrics {

        // Execute query
        rows, err := db.Query(metric.Query)
        if err != nil {
            log.Errorf("Failed to execute query: %s", err)
            continue
        }

        // Get column info
        cols, err := rows.Columns()
        if err != nil {
            log.Errorf("Failed to get column meta: %s", err)
            continue
        }

        des := make([]interface{}, len(cols))
        res := make([][]byte, len(cols))
        for i := range cols {
            des[i] = &res[i]
        }

        // fetch database
        for rows.Next() {
            if err := rows.Scan(des...); err != nil {
                log.Errorf("Failed to scan row: %s", err)
                continue
            }
            data := make(map[string]string)
            for i, bytes := range res {
                data[cols[i]] = string(bytes)
            }

            // Metric labels
            labelVals := []string{}
            for _, label := range metric.Labels {
                labelVals = append(labelVals, data[label])
            }

            // Metric value
            val, _ := strconv.ParseFloat(data[metric.Value], 64)

            // Add metric
            switch strings.ToLower(metric.Type) {
            case "counter":
                ch <- prometheus.MustNewConstMetric(metric.metricDesc, prometheus.CounterValue, val, labelVals...)
            case "gauge":
                ch <- prometheus.MustNewConstMetric(metric.metricDesc, prometheus.GaugeValue, val, labelVals...)
            default:
                log.Errorf("Fail to add metric for %s: %s is not valid type", name, metric.Type)
                continue
            }
        }
        rows.Close()
    }
}

As you can see from labelVals, the label values must be passed in the same order as the Labels of the specification defined in Describe earlier. There are two metric types here, counter and gauge. Each type has the following meaning.

  • COUNTER: A value that only increases. In Prometheus, it is usually graphed through rate-of-change functions such as rate/irate.
ch <- prometheus.MustNewConstMetric(metric.metricDesc, prometheus.CounterValue, val, labelVals...)
  • GAUGE: A value that can increase or decrease, like a car’s fuel gauge. In general, it is used to record the current value of a metric as-is, such as a process count.
ch <- prometheus.MustNewConstMetric(metric.metricDesc, prometheus.GaugeValue, val, labelVals...)

The value exposed as the metric is taken from the query result column named by the value item in the configuration.
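The des/res indirection in the Collect code above deserves a closer look, since it is what lets one exporter handle any column set without knowing it at compile time. Here is a minimal, database-free sketch of the same pattern (scanTargets is a hypothetical helper; the assignments through the pointers stand in for what rows.Scan does internally):

```go
package main

import "fmt"

// scanTargets prepares destinations for rows.Scan when the number of
// columns is only known at runtime: des[i] is a *[]byte pointing at res[i],
// so whatever Scan writes through des lands in res.
func scanTargets(n int) (des []interface{}, res [][]byte) {
	res = make([][]byte, n)
	des = make([]interface{}, n)
	for i := range res {
		des[i] = &res[i]
	}
	return des, res
}

func main() {
	des, res := scanTargets(2)

	// Stand-in for rows.Scan(des...): write through the pointers.
	*(des[0].(*[]byte)) = []byte("event_scheduler")
	*(des[1].(*[]byte)) = []byte("1")

	// The values are now visible in res, ready to be mapped by column name.
	fmt.Println(string(res[0]), string(res[1])) // event_scheduler 1
}
```

Because every column is scanned as []byte and converted to a string, numeric columns arrive as text, which is why the Collect code parses them back with strconv.ParseFloat.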

QueryExporter Source

Here’s everything I’ve done so far:

package main

import (
    "database/sql"
    "flag"
    "io/ioutil"
    "net/http"
    "os"
    "strconv"
    "strings"

    "github.com/ghodss/yaml"
    _ "github.com/go-sql-driver/mysql"
    "github.com/prometheus/client_golang/prometheus"
    "github.com/prometheus/client_golang/prometheus/promhttp"
    "github.com/prometheus/common/version"
    log "github.com/sirupsen/logrus"
)

var config Config

const (
    collector = "query_exporter"
)

func main() {
    var err error
    var configFile, bind string
    // =====================
    // Get OS parameter
    // =====================
    flag.StringVar(&configFile, "config", "config.yml", "configuration file")
    flag.StringVar(&bind, "bind", "0.0.0.0:9104", "bind")
    flag.Parse()

    // =====================
    // Load config & yaml
    // =====================
    var b []byte
    if b, err = ioutil.ReadFile(configFile); err != nil {
        log.Errorf("Failed to read config file: %s", err)
        os.Exit(1)
    }

    // Load yaml
    if err := yaml.Unmarshal(b, &config); err != nil {
        log.Errorf("Failed to load config: %s", err)
        os.Exit(1)
    }

    // ========================
    // Regist handler
    // ========================
    log.Infof("Regist version collector - %s", collector)
    prometheus.Register(version.NewCollector(collector))
    prometheus.Register(&QueryCollector{})

    // Regist http handler
    log.Infof("HTTP handler path - %s", "/metrics")
    http.HandleFunc("/metrics", func(w http.ResponseWriter, r *http.Request) {
        h := promhttp.HandlerFor(prometheus.Gatherers{
            prometheus.DefaultGatherer,
        }, promhttp.HandlerOpts{})
        h.ServeHTTP(w, r)
    })

    // start server
    log.Infof("Starting http server - %s", bind)
    if err := http.ListenAndServe(bind, nil); err != nil {
        log.Errorf("Failed to start http server: %s", err)
    }
}

// =============================
// Config config structure
// =============================
type Config struct {
    DSN     string
    Metrics map[string]struct {
        Query       string
        Type        string
        Description string
        Labels      []string
        Value       string
        metricDesc  *prometheus.Desc
    }
}

// =============================
// QueryCollector exporter
// =============================
type QueryCollector struct{}

// Describe prometheus describe
func (e *QueryCollector) Describe(ch chan<- *prometheus.Desc) {
    for metricName, metric := range config.Metrics {
        metric.metricDesc = prometheus.NewDesc(
            prometheus.BuildFQName(collector, "", metricName),
            metric.Description,
            metric.Labels, nil,
        )
        config.Metrics[metricName] = metric
        log.Infof("metric description for \"%s\" registerd", metricName)
    }
}

// Collect prometheus collect
func (e *QueryCollector) Collect(ch chan<- prometheus.Metric) {

    // Connect to database
    db, err := sql.Open("mysql", config.DSN)
    if err != nil {
        log.Errorf("Connect to database failed: %s", err)
        return
    }
    defer db.Close()

    // Execute each queries in metrics
    for name, metric := range config.Metrics {

        // Execute query
        rows, err := db.Query(metric.Query)
        if err != nil {
            log.Errorf("Failed to execute query: %s", err)
            continue
        }

        // Get column info
        cols, err := rows.Columns()
        if err != nil {
            log.Errorf("Failed to get column meta: %s", err)
            continue
        }

        des := make([]interface{}, len(cols))
        res := make([][]byte, len(cols))
        for i := range cols {
            des[i] = &res[i]
        }

        // fetch database
        for rows.Next() {
            if err := rows.Scan(des...); err != nil {
                log.Errorf("Failed to scan row: %s", err)
                continue
            }
            data := make(map[string]string)
            for i, bytes := range res {
                data[cols[i]] = string(bytes)
            }

            // Metric labels
            labelVals := []string{}
            for _, label := range metric.Labels {
                labelVals = append(labelVals, data[label])
            }

            // Metric value
            val, _ := strconv.ParseFloat(data[metric.Value], 64)

            // Add metric
            switch strings.ToLower(metric.Type) {
            case "counter":
                ch <- prometheus.MustNewConstMetric(metric.metricDesc, prometheus.CounterValue, val, labelVals...)
            case "gauge":
                ch <- prometheus.MustNewConstMetric(metric.metricDesc, prometheus.GaugeValue, val, labelVals...)
            default:
                log.Errorf("Fail to add metric for %s: %s is not valid type", name, metric.Type)
                continue
            }
        }
        rows.Close()
    }
}

If the packages are missing, run “go mod vendor” to download the necessary packages.

Start the server and check the information collected by the actual exporter.

$ go run .
INFO[0000] Regist version collector - query_exporter
INFO[0000] metric description for "process_count_by_host" registerd
INFO[0000] metric description for "process_count_by_user" registerd
INFO[0000] HTTP handler path - /metrics
INFO[0000] Starting http server - 0.0.0.0:9104

If you query it with curl, you can see the per-user/host session counts defined in the settings.

$ curl 127.0.0.1:9104/metrics
# HELP go_gc_duration_seconds A summary of the pause duration of garbage collection cycles.
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 0
go_gc_duration_seconds{quantile="0.25"} 0

.. skip ..

# HELP query_exporter_build_info A metric with a constant '1' value labeled by version, revision, branch, and goversion from which query_exporter was built.
# TYPE query_exporter_build_info gauge
query_exporter_build_info{branch="",goversion="go1.16.5",revision="",version=""} 1
# HELP query_exporter_process_count_by_host process count by host
# TYPE query_exporter_process_count_by_host gauge
query_exporter_process_count_by_host{host="localhost",user="event_scheduler"} 1
query_exporter_process_count_by_host{host="localhost",user="test"} 1
# HELP query_exporter_process_count_by_user process count by user
# TYPE query_exporter_process_count_by_user gauge
query_exporter_process_count_by_user{user="event_scheduler"} 1
query_exporter_process_count_by_user{user="test"} 1

This is the moment your very own exporter is finally born! 🙂

Concluding..

This post got quite long. I included the source code in the body several times, which kept stretching the text.

Anyway, I’ve created my own exporter! It only implements the simple function of registering a query and exposing its result as a metric, but you can add more interesting elements on top of it as needed.

For reference, the source written above is organized in the following Git.
>> https://github.com/go-gywn/query-exporter-simple

When you need to monitor hundreds or thousands of servers from a single monitoring server, it is sometimes useful to manage metric collection this way. I also maintain a separate Query Exporter project (still MySQL-only for now), which adds parallel processing and timeouts on top of the project above.
>> https://github.com/go-gywn/query-exporter

It’s always been like this: if something doesn’t exist, just create it; if it does, use it well!

I hope you all have a nice summer.

Creating a resettable sequence in MySQL

Overview

When running a service, there are always requirements for a SEQUENCE. MySQL provides auto_increment per table by default, but that alone rarely satisfies every requirement in production.

In this situation, a sequence function is usually created and used, as in the blog post by Peter Zaitsev linked below. Of course, the table should use InnoDB rather than MyISAM to ensure stability under replication.

https://www.percona.com/blog/2008/04/02/stored-function-to-generate-sequences/

Based on this content, I would like to create a resettable sequence that resets at a specific point in time.

Preparations

Before we get started, let’s take a look at how we will process data.

First, let’s create a table to save the current sequence as shown below.

CREATE TABLE `t_sequence` (
  `name` varchar(100) NOT NULL,
  `seq_num` bigint(20) NOT NULL DEFAULT '0',
  `mtime` timestamp(6) NOT NULL DEFAULT CURRENT_TIMESTAMP(6) ON UPDATE CURRENT_TIMESTAMP(6),
  PRIMARY KEY (`name`)
) ENGINE=InnoDB

Unlike Peter’s blog, I will issue the sequence via UPSERT processing with the insert into .. on duplicate key update .. syntax.

insert into t_sequence (name, seq_num, mtime) values ('abc', last_insert_id(1), now(6)) 
    on duplicate key update 
       seq_num = last_insert_id(seq_num+1), 
       mtime = now(6);
return last_insert_id();

Here, let’s put the logic that resets to 1 at a specific point in time into the on duplicate key update clause. Below is an example of resetting every minute.

insert into t_sequence (name, seq_num, mtime) values ('abc', last_insert_id(1), now(6)) 
    on duplicate key update 
       seq_num = last_insert_id(if(mtime < date_format(now(6), '%Y-%m-%d %H:%i:00'), 1, seq_num+1)), 
       mtime = now(6);
return last_insert_id();

By adding a time check to the on duplicate key update clause, the sequence number is reset to 1 under the given condition.

mysql> insert into t_sequence (name, seq_num, mtime) values ('abc', last_insert_id(1), now(6))
    ->     on duplicate key update 
    ->        seq_num = last_insert_id(if(mtime < date_format(now(6), '%Y-%m-%d %H:%i:00'), 1, seq_num+1)), 
    ->        mtime = now(6);
Query OK, 2 rows affected (0.00 sec)

mysql> select now(), last_insert_id();
+---------------------+------------------+
| now()               | last_insert_id() |
+---------------------+------------------+
| 2021-06-21 12:31:58 |                6 |
+---------------------+------------------+
1 row in set (0.00 sec)

mysql> insert into t_sequence (name, seq_num, mtime) values ('abc', last_insert_id(1), now(6))
    ->     on duplicate key update 
    ->        seq_num = last_insert_id(if(mtime < date_format(now(6), '%Y-%m-%d %H:%i:00'), 1, seq_num+1)), 
    ->        mtime = now(6);
Query OK, 2 rows affected (0.00 sec)

mysql> select now(), last_insert_id();
+---------------------+------------------+
| now()               | last_insert_id() |
+---------------------+------------------+
| 2021-06-21 12:32:01 |                1 |
+---------------------+------------------+
1 row in set (0.00 sec)

Sequence function

Based on this content, let’s create the Function as below.

delimiter //
drop function if exists nextval//
create function nextval(in_name varchar(100), in_type char(1)) returns bigint
begin
  declare date_format varchar(20);
  SET date_format = (
    case 
      when in_type = 'M' then '%Y-%m-01 00:00:00' 
      when in_type = 'D' then '%Y-%m-%d 00:00:00' 
      when in_type = 'H' then '%Y-%m-%d %H:00:00' 
      when in_type = 'I' then '%Y-%m-%d %H:%i:00' 
      when in_type = 'S' then '%Y-%m-%d %H:%i:%S' 
      else '%Y-%m-%d 00:00:00'  
    end
  );
  insert into t_sequence (name, seq_num, mtime) values (in_name, last_insert_id(1), now(6)) 
      on duplicate key update 
         seq_num = last_insert_id(if(mtime < date_format(now(6), date_format), 1, seq_num+1)), 
         mtime = now(6);
  return last_insert_id();
end
//
delimiter ;

As the function body shows, 'M' resets the sequence every month, 'D' every day, 'H' every hour, 'I' every minute, and 'S' every second.
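Read as plain code, the CASE expression is just a lookup from type code to truncation pattern. A small Go sketch of the same mapping (periodFormat is a hypothetical name, not part of the SQL above):

```go
package main

import "fmt"

// periodFormat returns the MySQL DATE_FORMAT pattern that truncates the
// current time to the start of the reset period, matching the CASE in nextval().
func periodFormat(t byte) string {
	switch t {
	case 'M':
		return "%Y-%m-01 00:00:00" // monthly reset
	case 'D':
		return "%Y-%m-%d 00:00:00" // daily reset
	case 'H':
		return "%Y-%m-%d %H:00:00" // hourly reset
	case 'I':
		return "%Y-%m-%d %H:%i:00" // per-minute reset
	case 'S':
		return "%Y-%m-%d %H:%i:%S" // per-second reset
	default:
		return "%Y-%m-%d 00:00:00" // unknown codes fall back to daily
	}
}

func main() {
	fmt.Println(periodFormat('I')) // pattern used for a per-minute reset
}
```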

Below is an example of resetting every minute (the same example as above).

mysql> select nextval('abc', 'I') seq, now();
+------+---------------------+
| seq  | now()               |
+------+---------------------+
|    1 | 2021-06-21 12:40:42 |
+------+---------------------+
1 row in set (0.00 sec)

mysql> select nextval('abc', 'I') seq, now();
+------+---------------------+
| seq  | now()               |
+------+---------------------+
|    2 | 2021-06-21 12:40:52 |
+------+---------------------+
1 row in set (0.00 sec)

mysql> select nextval('abc', 'I') seq, now();
+------+---------------------+
| seq  | now()               |
+------+---------------------+
|    3 | 2021-06-21 12:40:56 |
+------+---------------------+
1 row in set (0.00 sec)

mysql> select nextval('abc', 'I') seq, now();
+------+---------------------+
| seq  | now()               |
+------+---------------------+
|    1 | 2021-06-21 12:41:00 |
+------+---------------------+
1 row in set (0.00 sec)

If necessary, you can put additional requirements into the insert into .. on duplicate key update .. statement of the function. 🙂

Performance

Environments

## Physical machine
Intel(R) Core(TM) i3-8100 CPU @ 3.60GHz(4core), 32G Memory

## MySQL variables
+--------------------------------+-------+
| Variable_name                  | Value |
+--------------------------------+-------+
| innodb_flush_log_at_trx_commit | 0     |
| sync_binlog                    | 0     |
+--------------------------------+-------+

Due to the structure of the function defined above, many locks occur on a specific row. This means network latency has a huge impact on query execution time, which directly lengthens how long the lock is held.

So I simply tested LOCAL vs REMOTE as follows; the local results should probably be considered the best-case performance.

TEST1. Localhost

Using mysqlslap, let’s throw random traffic.

$ time mysqlslap -utest      \
  --password=test123         \
  --concurrency=10           \
  --create-schema=test       \
  --iterations=1             \
  --number-of-queries=100000 \
  --query="select test.nextval('abc', 'H');"
mysqlslap: [Warning] Using a password on the command line interface can be insecure.
Benchmark
	Average number of seconds to run all queries: 5.979 seconds
	Minimum number of seconds to run all queries: 5.979 seconds
	Maximum number of seconds to run all queries: 5.979 seconds
	Number of clients running queries: 10
	Average number of queries per client: 10000
real	0m5.996s
user	0m0.915s
sys	0m1.709s

It was executed in 5.979 seconds, i.e. about 16,700 sequences issued per second!! Not bad.

TEST2. Remote host

This time, let’s run it in an environment with some network latency. In my environment the latency is around 0.4ms.

$ ping 10.5.5.11
PING 10.5.5.11 (10.5.5.11): 56 data bytes
64 bytes from 10.5.5.11: icmp_seq=0 ttl=64 time=0.404 ms

I tried generating the same test traffic as above in this environment.

$ time mysqlslap -utest      \
  --password=test123         \
  --host=10.5.5.11           \
  --concurrency=10           \
  --create-schema=test       \
  --iterations=1             \
  --number-of-queries=100000 \
  --query="select test.nextval('abc', 'H');"
mysqlslap: [Warning] Using a password on the command line interface can be insecure.
Benchmark
	Average number of seconds to run all queries: 7.191 seconds
	Minimum number of seconds to run all queries: 7.191 seconds
	Maximum number of seconds to run all queries: 7.191 seconds
	Number of clients running queries: 10
	Average number of queries per client: 10000

It was executed in 7.191 seconds, i.e. about 13,906 sequences issued per second!! Not bad either.

However, if the network latency were higher than this, the figures would certainly be worse. In any case, the sequence issuance performance itself seems adequate.

If each client is allocated its own sequence name, it seems usable without worrying about performance or locks. (A test in which 10 users each issue their own sequence:)

$ time mysqlslap -utest      \
  --password=test123         \
  --host=10.5.5.11           \
  --concurrency=10           \
  --create-schema=test       \
  --iterations=1             \
  --number-of-queries=100000 \
  --query="select test.nextval(concat('ab',floor(rand()*10)), 'H');"
mysqlslap: [Warning] Using a password on the command line interface can be insecure.
Benchmark
	Average number of seconds to run all queries: 5.702 seconds
	Minimum number of seconds to run all queries: 5.702 seconds
	Maximum number of seconds to run all queries: 5.702 seconds
	Number of clients running queries: 10
	Average number of queries per client: 10000

Conclusion

I started this with the idea of building, just for fun, a sequence that MySQL lacks, tailored to our service requirements.

Whether for a specific service or a personalized one, I have occasionally heard the requirement for a sequence that restarts from 1 at midnight.

Previously, such a case required taking a lock and deciding whether to reset based on the current sequence value. Here, it was solved with a single INSERT .. ON DUPLICATE KEY UPDATE .. statement.

In a similar way, you can build various functions for your own service.

Have a good day 🙂

Automated recovery scenarios for lazy MySQL DBA

This is a translation of a Korean blog post I wrote in 2017, so it may differ slightly from current MySQL 8.0 behavior.
http://gywn.net/2017/01/automated-recovery-scenarios-for-lazy-mysql-dba/

Overview

If you operate MySQL, you cannot easily get away from replication. That means managing multiple servers holding the same data, which in turn means a lot of data recovery work, especially in an environment where data grows explosively. Moreover, if you recover from an early-morning backup, you must then apply all the changes accumulated since dawn, so data synchronization consumes a lot of time. In this environment, let’s talk simply about how to automate recovery.

Prepare

There is no big issue as long as you have “Linux + MySQL + Xtrabackup”.

CentOS 6.8
MySQL 5.7.14
Xtrabackup-2.4.3
pv

Replication here is based on classic binlog positions rather than GTID. I use scp to pull the data, so you must exchange id_rsa.pub keys between the replication nodes in advance; that is, you need to be able to ssh between the servers without a password.

server1$ ssh mysql@server2
server2$ ssh mysql@server1

Restore Scenario

The scenario depends on the characteristics of the service and on personal preference, but the most important thing is recovery efficiency. Before efficiency, of course, there should be as little service impact as possible, which means you need to control the data transfer traffic. In general, there are two recovery cases to consider.


  1. restore from master data

    A case that adds a slave using the master’s data. The master’s binlog position becomes the configuration information for the new slave.

  2. restore from slave data

    A case that configures a new slave with data from an existing slave, i.e. adds another identical slave alongside the current one. The Slave Status of the slave being copied becomes the configuration information for the recovery slave.

Before proceeding: this explains the automatic recovery scenario itself rather than my personal script, and I skip details such as validation checks and password encryption. Some background knowledge on data recovery is assumed; read the xtrabackup manual below first, and I am sure it will be a great help.

https://www.percona.com/doc/percona-xtrabackup/2.4/howtos/setting_up_replication.html

case 1) restore from master data

As mentioned earlier, this is the case of receiving data from the master and configuring a slave directly beneath it, organizing the data as in the image below.

Restore_From_Master_Data

1) Pulling data

During the final phase of Xtrabackup, while non-InnoDB data is copied, a global lock is taken to guarantee data consistency unless you pass the --no-lock option. In other words, it affects the service.

Write the innobackupex part directly in the shell and complete everything in one shot, as shown below. The streamed backup data is throttled through pv (50 MB per second here, so even weaker servers can keep up).

ssh -o StrictHostKeyChecking=no mysql@${TARGET_HOST} \
 "innobackupex \
 --host='127.0.0.1' \
 --user='backupuser' \
 --password='backuppass' \
 --no-lock \
 --stream=xbstream \
 /data/backup | pv --rate-limit 50000000" 2> innobackupex.log \
 | xbstream -x 2> xbstream.log

You can inspect the two logs, innobackupex.log and xbstream.log, to validate the received backup data. If innobackupex.log ends with “completed OK!”, the data normally arrived intact.

2) Apply redo log

Assuming the previous step completed normally, the next step applies the changes accumulated while the data was being backed up and transferred. Again, if innobackupex.log ends with “completed OK!”, the log was applied normally.

innobackupex --apply-log . 2>> innobackupex.log

3) configuration of slave

Finally, we extract the master’s position to configure the actual slave. The position is taken from the xtrabackup_binlog_pos_innodb file created after the apply-log step. The master’s binary log position is recorded there, tab-separated, as shown below.

mysql-bin.000001     481

I extracted the log position with sed as below.

MASTER_LOG_FILE=`sed -r "s/^(.*)\s+([0-9]+)/\1/g" xtrabackup_binlog_pos_innodb`
MASTER_LOG_POS=`sed -r "s/^(.*)\s+([0-9]+)/\2/g" xtrabackup_binlog_pos_innodb`
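If you ever script this part in something other than shell, the same extraction is easy to do directly. A sketch in Go, under the assumption that the file holds a single tab-separated line (parseBinlogPos is a hypothetical helper):

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseBinlogPos splits a line such as "mysql-bin.000001\t481" from
// xtrabackup_binlog_pos_innodb into the binary log file name and position.
func parseBinlogPos(line string) (file string, pos int64, err error) {
	fields := strings.Fields(strings.TrimSpace(line))
	if len(fields) != 2 {
		return "", 0, fmt.Errorf("unexpected format: %q", line)
	}
	pos, err = strconv.ParseInt(fields[1], 10, 64)
	return fields[0], pos, err
}

func main() {
	file, pos, err := parseBinlogPos("mysql-bin.000001\t481")
	if err != nil {
		panic(err)
	}
	fmt.Println(file, pos) // mysql-bin.000001 481
}
```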

Now, the master server has also been determined, and the binary log position for the replication configuration is also defined. Let’s configure the slave as shown below.

echo "change master to
 master_host='${TARGET_HOST}',
 master_user='repl',
 master_password='replpass',
 master_log_file='${MASTER_LOG_FILE}',
 master_log_pos=${MASTER_LOG_POS}; 
 start slave;"\
| mysql -uroot -pxxxxxx

With this, a new slave is configured from the master’s data without much fuss, as shown below.

Restore_From_Master_Data_Final

case 2) restore from slave data

The second case uses data from a currently running slave server to add a new slave. The data flow is as follows: import data from a normally running slave, and configure an additional slave in the same topology based on that slave’s replication information.

Restore_From_Slave_Data

1) pulling data

This time, as before, the backup stream from the slave is extracted directly into xbstream, with throughput throttled through pv. Specify the --slave-info option as shown below so that the slave’s position is recorded, unlike in the previous case. (With this option, the slave state is written out during the backup.)

ssh -o StrictHostKeyChecking=no mysql@${TARGET_HOST} \
 "innobackupex \
 --host='127.0.0.1' \
 --user=backupuser \
 --password='backuppass' \
 --slave-info \
 --stream=xbstream \
 /data/backup | pv --rate-limit 50000000" 2> innobackupex.log \
 | xbstream -x 2> xbstream.log

Similarly, check the two logs, innobackupex.log and xbstream.log, to validate the backup data. innobackupex.log should end with “completed OK!”, and xbstream.log should be empty if the stream was extracted properly.

2) Apply redo log

While data is being backed up and transferred, it does the same as before to apply the changed data. This is also the end of the “innobackupex.log” “completed OK!” which is normally handled.

innobackupex --apply-log . 2>> innobackupex.log

3) configuration of slave

The slave position must now be defined a little differently from the master case. This time the binary log position is identified from the xtrabackup_slave_info file created after the apply-log step: it records the position of the master that the backed-up slave was replicating from, stored in the following form.

CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.000001', MASTER_LOG_POS=481

The file and position can be extracted with sed as below.

MASTER_LOG_FILE=`sed -r "s/(.*)MASTER_LOG_FILE='(.*)', MASTER_LOG_POS=([0-9]+)/\2/g" xtrabackup_slave_info`
MASTER_LOG_POS=`sed -r "s/(.*)MASTER_LOG_FILE='(.*)', MASTER_LOG_POS=([0-9]+)/\3/g" xtrabackup_slave_info`
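The two sed expressions can also be mirrored with a single regular expression. Here is a hedged Go sketch (parseSlaveInfo is a hypothetical helper) for the xtrabackup_slave_info line shown above:

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// slaveInfoRe captures the log file and position from a line like:
// CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.000001', MASTER_LOG_POS=481
var slaveInfoRe = regexp.MustCompile(`MASTER_LOG_FILE='([^']+)',\s*MASTER_LOG_POS=(\d+)`)

func parseSlaveInfo(line string) (file string, pos int64, err error) {
	m := slaveInfoRe.FindStringSubmatch(line)
	if m == nil {
		return "", 0, fmt.Errorf("no CHANGE MASTER position in %q", line)
	}
	pos, err = strconv.ParseInt(m[2], 10, 64)
	return m[1], pos, err
}

func main() {
	file, pos, err := parseSlaveInfo("CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.000001', MASTER_LOG_POS=481")
	if err != nil {
		panic(err)
	}
	fmt.Println(file, pos) // mysql-bin.000001 481
}
```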

Getting the binary log position this far is a big step. The remaining problem: where do we get the address of the master this slave actually replicates from? It could be extracted from a separate monitoring tool, or the master information could be pushed somewhere regularly. The way I chose was to connect to the target slave the data came from, look at its Slave Status directly, and read the master host it is currently pointing at.

MASTER_HOST=`mysql -urepl -preplpass -h ${TARGET_HOST} -e 'show slave status\G' | grep 'Master_Host: ' | sed -r 's/\s*(Master_Host: )//g'|sed -r 's/\s*//g'`

One complaint: when xtrabackup writes the xtrabackup_slave_info file, why doesn't it also record the master host the slave is pointing at?

Now the master server is known and the binary log position for the replication configuration is defined, so let's configure the slave as shown below. Note that master_host is the actual master found above, not the target host the data was pulled from.

echo "change master to
 master_host='${MASTER_HOST}',
 master_user='repl',
 master_password='replpass',
 master_log_file='${MASTER_LOG_FILE}',
 master_log_pos=${MASTER_LOG_POS}; 
 start slave;"\
| mysql -uroot -pxxxxxx

The configuration using data pulled from a slave succeeds, as shown below.

Restore_From_Slave_Data_Final

Conclusion

So far, we’ve looked at how to rebuild slaves by leveraging existing data. As stated at the start, I could not show you the whole script; it is only a few hundred lines of simple shell anyway, and it is certainly not the only correct answer.

You don’t have to pull the data with scp; it can also come through the nc utility. And instead of the classic binlog position, it is even simpler to solve with GTID.
My solution is just one case among many, nothing more.

A little trick to express MySQL read-only clearly in Grafana

Overview

I have mentioned PMM (Percona Monitoring and Management) before; this time I will talk about a little trick using Prometheus queries and Grafana: how to express read-only ON/OFF clearly in Grafana. 🙂

Definition

This topic is about labeling servers according to MySQL’s READ-ONLY setting value, which in my opinion also separates master from slave. Of course, you can run a service with a different MySQL topology, such as a chained (M)-(M/S)-(S) structure, but personally I believe the simplest configuration is the most robust when preparing for various failures. In my definition, a server with READ-ONLY set to ON is a slave.

  • Master (READ_ONLY: OFF)

    The instance where data changes are made
    In a chained topology like (M)-(M/S)-(S), the middle (M/S) is also recognized as a master

  • Slave (READ_ONLY: ON)

    An instance where data cannot be changed

Queries may differ depending on one’s definitions, but I want a representation that classifies servers by whether READ_ONLY is ON or OFF, or shows all instances together.

Grafana Variable

In MySQL, READ-ONLY means only ON/OFF, but here we have added a value of 2 to mean the entire server.

  • var_read_only : 0 (READ-ONLY  OFF)
  • var_read_only : 1 (READ-ONLY  ON)
  • var_read_only : 2 (ALL)

A new variable is added in the Variables tab of the dashboard settings, as a custom type with the values 2, 0, 1 as shown below. The list starts with 2 simply to make it the default value.

grafana-read-only-variable1

However, the resulting variable select box shows 2, 0, 1, which is not intuitive. Wouldn’t it be better if the select box showed 2 -> ALL, 0 -> OFF, 1 -> ON? (Of course, it works fine either way.)

This does not seem possible (yet) in Grafana’s UI settings, so a little manipulation is needed: export the dashboard, edit the JSON, and re-import it. Open “Share Dashboard” > “Export” as a JSON file; you’ll see the “text” fields are set to the same values as “value”, as shown below.

{
  "allValue": null,
  "current": {
    "text": "2", <== here
    "value": "2"
  },
  "hide": 0,
  "includeAll": false,
  "label": "Read-only",
  "multi": false,
  "name": "read_only",
  "options": [
    {
      "selected": true,
      "text": "2",
      "value": "2"
    },
    {
      "selected": false,
      "text": "0", <== here
      "value": "0"
    },
    {
      "selected": false,
      "text": "1", <== here
      "value": "1"
    }
  ],
  "query": "2,0,1",
  "type": "custom"
}

Replace the marked “text” fields with intuitive phrases as below, and import the dashboard again.

{
  "allValue": null,
  "current": {
    "text": "ALL", <== here
    "value": "2"
  },
  "hide": 0,
  "includeAll": false,
  "label": "Read-only",
  "multi": false,
  "name": "read_only",
  "options": [
    {
      "selected": true,
      "text": "ALL", <== here
      "value": "2"
    },
    {
      "selected": false,
      "text": "OFF", <== here
      "value": "0"
    },
    {
      "selected": false,
      "text": "ON", <== here
      "value": "1"
    }
  ],
  "query": "2,0,1",
  "type": "custom"
}
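If you have several variables to relabel, the JSON edit can be scripted instead of done by hand. Here is a minimal Python sketch; the structure follows the exported variable definition shown above, and the label mapping is of course specific to this example:

```python
import json

# Mapping from the stored value to the label we want to display.
LABELS = {"2": "ALL", "0": "OFF", "1": "ON"}

def relabel_read_only(variable_json: str) -> str:
    """Rewrite the 'text' fields of an exported Grafana variable definition."""
    var = json.loads(variable_json)
    for opt in var["options"]:
        opt["text"] = LABELS.get(opt["value"], opt["text"])
    var["current"]["text"] = LABELS.get(var["current"]["value"],
                                        var["current"]["text"])
    return json.dumps(var, indent=2)
```

Running this over the exported JSON produces exactly the edited version above, with the "value" fields untouched so the Prometheus queries keep working.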

Finally, the Grafana screen shows intuitive phrases. It was a bit fiddly, but the first step, the variable setup, is complete!

grafana-read-only-variable2

Prometheus Query

Now that Grafana can pass along whatever value we want, it is time to write a Prometheus query that uses it. I will not explain every Prometheus query here; please visit the official Prometheus manual and read through the query syntax. (Here)
What I care about is grouping the results by the READ-ONLY ON/OFF status. By the earlier definition, a READ_ONLY value of 0 means a master, 1 means a slave, and 2 means the entire server. Based on the value passed in from Grafana, we will select only the data we want.

First, extract the current system load with a Prometheus query, using max_over_time to take the maximum value over one minute, as shown below. This returns results for every instance.

max_over_time(node_load1[1m])

Next, add a condition on the READ-ONLY status value so that the results are filtered by READ-ONLY ON/OFF. Here, the "on (instance)" clause tells Prometheus which label to match the two metrics on; if this were SQL, it would roughly correspond to a JOIN condition.

max_over_time(node_load1[1m]) 
  and on (instance) (mysql_global_variables_read_only == $read_only)

But there is one problem here. If a specific READ-ONLY value is given, the query works without trouble, but when you want to look at master and slave at the same time, i.e. the entire server, the query does not work.

So here is a trick. We need an "or" condition: a first expression that matches every instance when the variable is 2, falling through to the READ-ONLY comparison otherwise. What we need for that first expression is a metric that exists unconditionally on every instance.

We can use up{job='linux'}. Its value is always 0 or 1, and we can assume the exporter is always present. In other words, we want an expression that is satisfied by every instance when $read_only is 2 and by none when it is 0 or 1. The expression $read_only^2 - 3 evaluates to 1 when $read_only is 2, and to a negative number (-3 or -2) when it is 0 or 1, which gives exactly that behavior, as in the query below.

max_over_time(node_load1[1m]) 
  and on (instance) (
      up{job='linux'} <= $read_only^2-3 
      or mysql_global_variables_read_only == $read_only
  )

With this, when $read_only is 2, every instance is matched by the first up condition and the mysql_global_variables_read_only part behind the "or" is never needed. When $read_only is 0 or 1, the up condition matches nothing, so the mysql_global_variables_read_only value is checked instead, and we finally extract exactly the data we want.
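The arithmetic can be sanity-checked outside Prometheus. This small Python sketch mirrors only the comparison logic of the query above (it is not real PromQL evaluation):

```python
def branch_taken(read_only_var: int, up: int = 1) -> str:
    """Which side of the 'or' matches for a given $read_only value."""
    threshold = read_only_var ** 2 - 3   # 2 -> 1, 1 -> -2, 0 -> -3
    if up <= threshold:
        return "up"          # ALL: every instance matches the up condition
    return "read_only"       # fall through to mysql_global_variables_read_only
```

Since up is always 0 or 1, only $read_only = 2 can ever satisfy `up <= $read_only^2 - 3`; for 0 and 1 the query always falls through to the read_only metric comparison.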

Result

As a test, I configured VMs locally, set up MySQL servers with the following structure, and started the PMM exporters. read_only is set to OFF on node01 and node03, because they play the master role for data changes.

node01 : Master
  └ node02 : Slave
node03 : Master (Single)

Case1. ALL Instance

All servers are displayed, including the pmm-server instance.

pmm_result_readonly_all

Case2. READ-ONLY: OFF

As intended, only instances where READ-ONLY is OFF are shown in the graph. Note that only MySQL instances, which have the READ-ONLY property, are displayed, so the pmm-server instance drops out of the target list.
pmm_result_readonly_off

Case3. READ-ONLY: ON

Finally, only the slaves, where READ-ONLY is ON, are shown. As before, pmm-server is not included.
pmm_result_readonly_on

Conclusion

Here is what I did:

  • An intuitive notation for READ-ONLY, using the dashboard export/import trick in Grafana
  • Prometheus query tricks to filter in two steps (ALL, READ-ONLY ON/OFF)

This may look simple to an expert, but if you have to grasp the resource status of a system with dozens of mixed master/slave servers at a glance, sorting them according to their roles can be a big deal.

However, since this notation depends on the database's READ-ONLY value, another approach is necessary when the DB is not running. I think Prometheus recording rules would be a good way to solve that problem.

Good luck~ 🙂

Introduction of pmm-ruled

I personally believe that reliable service starts with monitoring and ends with tuning.

PMM is a very useful monitoring tool. I believe a super stable service mostly comes from service monitoring.

Anyway, when I first saw PMM I was very impressed, because I had been thirsty for a customizable database monitoring solution. However, while deeply analyzing the structure and functions of PMM, I found that some features a monitoring solution needs are missing.

Why are monitoring features so important?

The goal of a monitoring system is to show the current indicators and status of the database properly. In this respect, I would give PMM a good score.

However, sending an alert to the administrator when a problem occurs (or before it does) is also a very important monitoring feature.

The problem was that PMM had no effective solution for this, and I needed my banking service to keep running without any further problems.

PMM Alerting

As you know, there are two ways you can send alerts in PMM.

The first is to take advantage of Grafana’s alerting feature.

You can control it from the UI, and it is intuitive.

https://www.percona.com/blog/2017/01/23/mysql-and-mongodb-alerting-with-pmm-and-grafana/
https://www.percona.com/blog/2017/02/02/pmm-alerting-with-grafana-working-with-templated-dashboards/

Unfortunately, Grafana alerting does not support template variables properly, so I would have to configure an alert for each metric one by one. This is tiresome and not a suitable solution when scaling to hundreds of nodes.

The second is to use the “alert rules” of Prometheus.

Though not intuitive, it is a way to specify thresholds for multiple nodes at once.

https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/

However, there is no way to set a dynamic threshold for each node; only fixed values are supported.
Even if I want to specify different thresholds depending on the purpose of the server (e.g. production/development), there is no way to do it without a complicated alert rule configuration in Prometheus.

PMM-ruled

I needed a “dynamic rule manager”, so I built one, starting from a funny idea.

Since Grafana alerting was difficult to apply, as I said before, I decided on the second method, Prometheus alerting. As I mentioned a little while ago, Prometheus is a time-series database: in order to perform an operation between two pieces of data, both must exist in the same time range.

threshold-rule-metric

In other words, to configure a different threshold value for each server, the threshold itself must be periodically scraped into Prometheus as a kind of metric.

PMM-ruled is the Alerting threshold exporter.

But that alone is not enough. There must be something that manages these alerting rules. If node A and node B should receive an alert when CPU usage is over 10% and 20% respectively, this information must also be stored and managed somewhere.

From the perspective of monitoring dozens or hundreds of nodes with a single PMM, I did not want to manage threshold files manually, one by one. (In fact, it is impossible to handle by myself.)
Alert thresholds must be manageable per “group” and per “instance”, and I also wanted the ability to skip alerting.

alert-thresolds

If an instance-level rule exists, it takes priority for that instance. If there is no instance-level rule, the group rule is applied. If neither an instance nor a group threshold is specified, the default threshold in the alert rule file is used for that node. The threshold values are layered, as in the image above.
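The layering described above can be sketched as a simple lookup chain. The names below are hypothetical; pmm-ruled itself stores these rules in MySQL rather than in dictionaries:

```python
def resolve_threshold(instance, group, instance_rules, group_rules, default):
    """Instance rule wins over group rule, which wins over the rule-file default."""
    if instance in instance_rules:
        return instance_rules[instance]
    if group in group_rules:
        return group_rules[group]
    return default
```

For example, an instance with its own rule gets that value, its group siblings get the group value, and everything else falls back to the default written in the alert rule file.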

PMM-ruled is the alerting rule managing API server.

An alert rule defined with a fixed threshold looks like this:

groups:
- name: alert_rules
  rules:
  - alert: mysql_agent_up
    expr:  up{job="mysql"} == 0
    for: 10s
    labels: 
      level: critical
    annotations: 
      summary: MySQL Agent Down
      description: MySQL Agent Down

With the threshold exported as a metric, the rule can be computed against that metric and managed dynamically per server, as shown below.

groups:
- name: alert_rules
  rules:
  - alert: mysql_agent_up
    expr:  up{job="mysql"} == on (instance) group_left (level, name, gname) (alert_rule_threshold{name="mysql_agent_up", level="critical"})
    for: 10s
    labels: 
      level: critical
    annotations: 
      summary: MySQL Agent Down
      description: MySQL Agent Down
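The `== on (instance) group_left` part compares each `up{job="mysql"}` sample against the threshold sample carrying the same instance label. A rough Python illustration of that vector matching (simplified; real PromQL also copies the group_left labels onto the result):

```python
def firing_instances(up_samples, thresholds):
    """Instances where up equals the per-instance threshold metric.

    up_samples / thresholds: dicts of instance -> value.
    Instances with no threshold sample produce no match, like PromQL.
    """
    return sorted(inst for inst, value in up_samples.items()
                  if inst in thresholds and value == thresholds[inst])
```

So a down agent (up == 0) fires only on instances whose threshold metric is 0, and instances without a threshold sample are silently skipped.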

PMM-ruled is the alerting rule file generator.

In addition, it has more features such as Prometheus recording rule management and snapshot rule management, which inserts current state values into MySQL (to show the current node status with just a single table).

The figure below shows the configuration of the whole pmm-ruled.
pmm-ruled-overview

Interesting?

I built this feature in April 2018 and have been monitoring dozens of MySQL nodes with only a single PMM server since then. I contributed the entire pmm-ruled source code and am currently collaborating with Percona.

You can start this project from the percona-lab repository.

https://github.com/percona-lab/pmm-ruled

I would like to keep the original source, so I forked it into a separate project.

https://github.com/gywndi/pmm-ruled

Next time, I will talk about how to integrate pmm-ruled into PMM.

Thanks.

Using “Affected Rows” in MySQL

Overview

Have you ever thought about using “Affected Rows” in your application?

As you all know, “Affected Rows” is the number of rows actually affected by a DML statement. It is usually used to check how many rows have been changed after a DML statement (insert/update/delete) is executed.

In this post I would like to talk about how to use “Affected Rows” to solve various requirements in a fun way.

Affected rows in MySQL

As I mentioned before, “Affected Rows” in MySQL counts the rows that have really changed.

mysql> insert into test values (1,1);
Query OK, 1 row affected (0.00 sec)

mysql> update test set j = 1;
Query OK, 0 rows affected (0.01 sec)
Rows matched: 1 Changed: 0 Warnings: 0

mysql> update test set j = 2;
Query OK, 1 row affected (0.00 sec)
Rows matched: 1 Changed: 1 Warnings: 0

This may look like nothing in your application, but if you think a little differently, you can get a free lunch and cut down the number of queries. Let me explain. 🙂

Service requirements

Suppose you have received the following requirements from the service planner:

  • Reset count every day at 00:00
  • Allow participation only 5 times within a certain cycle

Normally, it would be implemented by wrapping the logic in a transaction, as shown below. (This is just pseudo code.)


try {
    execute("BEGIN");
    row = execute("SELECT * FROM user_event WHERE user_id = 100 FOR UPDATE");

    // 1. Check time (fields come from the locked row)
    if (row.last_applied_time == CURRENT_DATE) {

        // 2. Check count
        if (row.apply_count < 5) {
            execute("UPDATE user_event SET apply_count = apply_count + 1 WHERE user_id = 100");
        }
    } else {

        // 3. Reset count
        execute("UPDATE user_event SET apply_count = 1 WHERE user_id = 100");
    }
    execute("COMMIT");
} catch (Exception e) {
    execute("ROLLBACK");
}

Of course, this is not wrong, but I want to solve this service requirement in a more fun way with “affected rows”.

My own solution

Let’s create the following table before I explain it.

CREATE TABLE `user_event` (
  `user_id` int(11) NOT NULL,
  `apply_count` int(11) NOT NULL,
  `last_applied_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (`user_id`)
) ENGINE=InnoDB

It is just a very simple table holding each user’s event apply count and the last applied time (a timestamp that updates automatically). Now, let’s build up the query step by step to suit the planner’s requirements.

First, since there is no data yet, we start with a plain INSERT, as below.

INSERT INTO user_event (user_id, apply_count) VALUES (1, 1);

Second, let’s give the query more meaning so that the count increases every time the user participates in the event.

INSERT INTO user_event (user_id, apply_count) VALUES (1, 1)
    ON DUPLICATE KEY UPDATE apply_count = apply_count + 1

By the way, the planner asked to reset the counter every day: if the last applied date is the same as today’s date, we increase apply_count; otherwise we reset it to 1. To satisfy this date requirement (1. Check Time), we branch with IF as follows.

INSERT INTO user_event (user_id, apply_count) VALUES (1, 1) 
    ON DUPLICATE KEY UPDATE
       apply_count = if(date(last_applied_time) = current_date, apply_count + 1, 1)

Next, let’s add the participation limit: apply_count is only incremented when the current value is less than 5. With that, we arrive at the final query, as below.

INSERT INTO user_event (user_id, apply_count) VALUES (1, 1) 
    ON DUPLICATE KEY UPDATE
       apply_count = if(date(last_applied_time) = current_date, if(apply_count < 5, apply_count + 1, apply_count), 1)

Now we have solved these complex requirements in a single statement. Then, let’s do some fun tests. :-)

I am not patient enough to wait hours for the date to change 🙂, so I changed the reset condition from a daily reset to every 10 seconds, as shown below.

INSERT INTO user_event (user_id, apply_count) VALUES (1, 1) 
    ON DUPLICATE KEY UPDATE
       apply_count = if(timestampdiff(second, last_applied_time, now()) < 10, if(apply_count < 5, apply_count + 1, apply_count), 1)

And we get pretty results, as below. Note that the application can tell whether the user’s participation was counted (1 or more) or restricted (0) from the affected-rows value returned with the query result.

1. First Insert

1 row affected means there was no target row in this table, so this is the first INSERT.

mysql> INSERT INTO user_event .. ON DUPLICATE KEY UPDATE .. 
Query OK, 1 row affected (0.00 sec)

mysql> select * from user_event;
+---------+-------------+---------------------+
| user_id | apply_count |   last_applied_time |
+---------+-------------+---------------------+
|       1 |           1 | 2018-03-25 23:05:38 |
+---------+-------------+---------------------+
1 row in set (0.00 sec)

2. Second Insert in 10s

2 rows affected means the target row already existed and was updated because of the unique (primary key) constraint. Does it make sense so far?

mysql> INSERT INTO user_event .. ON DUPLICATE KEY UPDATE .. 
Query OK, 2 rows affected (0.00 sec)

mysql> select * from user_event;
+---------+-------------+---------------------+
| user_id | apply_count |   last_applied_time |
+---------+-------------+---------------------+
|       1 |           2 | 2018-03-25 23:05:41 |
+---------+-------------+---------------------+
1 row in set (0.00 sec)

3. Insert more than 5 times

Affected rows is returned as 2 until apply_count reaches 5; after that, 0 is returned because the row no longer changes.

mysql> INSERT INTO user_event .. ON DUPLICATE KEY UPDATE .. 
Query OK, 0 rows affected (0.00 sec)

mysql> select * from user_event;
+---------+-------------+---------------------+
| user_id | apply_count |   last_applied_time |
+---------+-------------+---------------------+
|       1 |           5 | 2018-03-25 23:05:46 |
+---------+-------------+---------------------+
1 row in set (0.00 sec)

4. After 10 seconds

After the 10-second limit has passed, affected rows returns to 2 and apply_count is reset to 1. An affected-rows value of 2 means the user’s apply_count was changed successfully, regardless of whether it was an increment or a reset.

mysql> INSERT INTO user_event .. ON DUPLICATE KEY UPDATE .. 
Query OK, 2 rows affected (0.00 sec)

mysql> select * from user_event;
+---------+-------------+---------------------+
| user_id | apply_count |   last_applied_time |
+---------+-------------+---------------------+
|       1 |           1 | 2018-03-25 23:05:56 |
+---------+-------------+---------------------+
1 row in set (0.00 sec)

We do not need to execute another query to get the result; we can distinguish the result status and show a variety of messages to users at the application level. (No transaction, no extra queries, just one query: a free lunch box.)

In other words, to summarize, the following behavior can be expected.

Query OK, 1 row affected (0.00 sec)  <= Insert 1 (First)
Query OK, 2 rows affected (0.00 sec) <= Update to 2
Query OK, 2 rows affected (0.00 sec) <= Update to 3
Query OK, 2 rows affected (0.00 sec) <= Update to 4
Query OK, 2 rows affected (0.00 sec) <= Update to 5
Query OK, 0 rows affected (0.00 sec) <= No action
Query OK, 2 rows affected (0.00 sec) <= Update to 1 (Initialized)
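That sequence can be modeled with a small in-memory simulation of MySQL’s affected-rows semantics for this query. This is a sketch only: 1 = fresh insert, 2 = row updated, 0 = matched but unchanged, and the time-based reset condition is reduced to an explicit flag:

```python
def upsert(table, user_id, limit=5, reset=False):
    """Simulate INSERT .. ON DUPLICATE KEY UPDATE with useAffectedRows=true."""
    if user_id not in table:
        table[user_id] = 1
        return 1                      # fresh insert
    if reset:                         # stands in for the expired-time branch
        changed = table[user_id] != 1
        table[user_id] = 1
        return 2 if changed else 0    # 0 if the reset changes nothing
    if table[user_id] < limit:
        table[user_id] += 1
        return 2                      # row updated
    return 0                          # matched but unchanged

table = {}
results = [upsert(table, 1) for _ in range(6)]
results.append(upsert(table, 1, reset=True))   # time window expired
print(results)    # [1, 2, 2, 2, 2, 0, 2]
```

The printed sequence matches the server output above: one insert, four counted updates, one rejected attempt, then a reset.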

However, in Java you need to add the JDBC parameter useAffectedRows=true to get this behavior, because Connector/J returns “found rows” rather than changed rows by default. (Check https://bugs.mysql.com/bug.php?id=39352)

Conclusion

If nothing in the row changes, affected rows is 0. This behavior depends on the DBMS, so this trick is (probably) specific to MySQL; other databases may count affected rows differently.

However, if you understand and use this behavior, you can solve some kinds of problems more comfortably. Besides limiting the event “apply count” for a certain period, you may also need to keep “summarized data” on a daily or monthly basis. If you think about it, you can easily implement a self-initializing sequence using affected rows. I have already made a rough version of this, and I will write it up next time.

This is my first post in English, and it was tough.
Thanks, have a nice day~ 🙂