Two-Phase-Commit for Distributed In-Memory Caches (reference)

Part I

Reference: http://gridgain.blogspot.kr/2014/09/two-phase-commit-for-distributed-in.html

2-Phase-Commit is probably one of the oldest consensus protocols, and it is known for its deficiencies when it comes to handling failures: it may indefinitely block servers waiting in the prepare state. To mitigate this, the 3-Phase-Commit protocol was introduced, which adds better fault tolerance at the expense of an extra network round trip and higher latencies. I would like to extend these traditional quorum concepts to distributed in-memory caching, as they are particularly relevant to what we do at GridGain. GridGain also has 2-phase-commit transactions, but unlike disk-based persistent systems, GridGain does not need to add a 3rd phase to the commit protocol in order to preserve data consistency during failures.

We want to avoid the 3-Phase-Commit protocol because it adds an additional network round-trip and has a negative impact on latencies and performance.

In GridGain, the data is partitioned in memory, which in its purest form means that every key-value pair is assigned to a specific node within the cluster. If, for example, we have 100 keys and 4 nodes, then every node will cache 100 / 4 = 25 keys. This way, the more nodes we have, the more data we can cache. Of course, in real-life scenarios we also have to worry about failures and redundancy, so for every key-value pair we will have 1 primary copy and 1 or more backup copies. Let us now look at the 2-Phase-Commit protocol in more detail to see why we can avoid the 3rd commit phase in GridGain.
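
The partitioning scheme above can be sketched as follows. This is a minimal illustration assuming a static node list and a simple hash ring; real GridGain uses a pluggable affinity function over a dynamic topology, so the names and the hashing choice here are illustrative only.

```python
import zlib
from collections import Counter

# Hypothetical static cluster; a fixed list is enough to illustrate the idea.
NODES = ["node-0", "node-1", "node-2", "node-3"]

def primary_and_backup(key, nodes=NODES):
    """Map a key to its primary node by hashing onto the node list;
    the next node on the ring holds the single backup copy."""
    i = zlib.crc32(key.encode()) % len(nodes)
    return nodes[i], nodes[(i + 1) % len(nodes)]

# With 100 keys spread over 4 nodes, each node is primary for roughly 25 keys.
counts = Counter(primary_and_backup(f"key-{k}")[0] for k in range(100))
```

Adding a node to `NODES` automatically spreads the keys thinner, which is the "more nodes, more cache" property described above.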

Two-Phase-Commit Protocol

Generally, the 2-Phase-Commit protocol is initiated whenever an application has already made the decision to commit the transaction. The coordinator node sends a "Prepare" message to all the participating nodes holding primary copies (primary nodes), and the primary nodes, after acquiring the necessary locks, synchronously send the "Prepare" message to the nodes holding backup copies (backup nodes). Once every node votes "Yes", the "Commit" message is sent and the transaction gets committed.

Figure 1: 2-Phase-Commit Protocol For Distributed In-Memory Cache
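
The happy path of the protocol in Figure 1 can be sketched as below. The classes are hypothetical stand-ins, not the GridGain API, and the backup fan-out is folded into the primary's prepare step for brevity.

```python
class Node:
    """A participating primary node in the sketch."""
    def __init__(self, name):
        self.name = name
        self.store = {}    # committed key-value pairs
        self.pending = {}  # changes staged by Prepare, keyed by tx id

    def prepare(self, tx_id, changes):
        # Acquire locks / stage the changes; an in-memory node always votes Yes.
        self.pending[tx_id] = dict(changes)
        return "Yes"

    def commit(self, tx_id):
        self.store.update(self.pending.pop(tx_id))

class Coordinator:
    def two_phase_commit(self, tx_id, changes, primaries):
        # Phase 1: every primary prepares (and would fan out to its backups).
        votes = [n.prepare(tx_id, changes) for n in primaries]
        # Phase 2: once every vote is Yes, the Commit message is sent.
        if all(v == "Yes" for v in votes):
            for n in primaries:
                n.commit(tx_id)
            return "committed"
        return "rolled back"

nodes = [Node("n1"), Node("n2")]
result = Coordinator().two_phase_commit("tx-1", {"k": "v"}, nodes)
```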

Advantages of In-Memory

So, why does in-memory-only processing allow us to avoid the 3rd phase, as opposed to persistent disk-based transactions? The main reason is that an in-memory cache is purely volatile and does not need to worry about leaving any state behind in case of a crash. Contrast that with persistent architectures, where after a failure we need to worry about whether the state on disk was committed or not.

Another advantage of an in-memory-only distributed cache is that the only valid vote for the "Prepare" message is "Yes". There is really no valid reason for any cluster member to vote "No". This essentially means that the only reason a rollback should happen is if the coordinator node crashed before it was able to send the "Prepare" message to all the participating nodes. Now that we have the ground rules settled, let's analyze what happens in case of failures:

Backup Node Failures

If a backup node fails during either the "Prepare" phase or the "Commit" phase, then no special handling is needed; the data will still be committed on the nodes that are alive. GridGain will then, in the background, designate a new backup node, and the data will be copied there outside of the transaction scope.

Primary Node Failures

If a primary node fails before or during the "Prepare" phase, then the coordinator will designate one of the backup nodes to become the primary and retry the "Prepare" phase. If the failure happens before or during the "Commit" phase, then the backup nodes will detect the crash and send a message to the coordinator node to find out whether to commit or roll back. The transaction still completes, and the data within the distributed cache remains consistent.
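
The promote-and-retry step during the "Prepare" phase can be sketched like this, under the simplifying assumption that failure detection is just a boolean `alive` flag. These are hypothetical structures, not the GridGain implementation.

```python
class CacheNode:
    """Hypothetical cache node; `alive` stands in for failure detection."""
    def __init__(self, name, alive=True):
        self.name, self.alive, self.pending = name, alive, {}

    def prepare(self, tx_id, changes):
        self.pending[tx_id] = dict(changes)  # stage changes under lock
        return "Yes"                         # in-memory nodes only vote Yes

def prepare_with_failover(partition, tx_id, changes):
    """Retry Prepare, promoting backups to primary until a live node answers."""
    node = partition["primary"]
    while not node.alive:
        if not partition["backups"]:
            raise RuntimeError("all copies of the partition are lost")
        # Promote the first backup to primary and retry the Prepare.
        node = partition["primary"] = partition["backups"].pop(0)
    return node.prepare(tx_id, changes)

part = {"primary": CacheNode("p", alive=False), "backups": [CacheNode("b")]}
vote = prepare_with_failover(part, "tx-7", {"k": "v"})
```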

Coordinator Node Failures

Failure of the coordinator node is a little trickier. Without the coordinator node, neither the primary nodes nor the backup nodes know whether to commit the transaction or to roll it back. In this case the "Recovery Protocol" kicks in: every node participating in the transaction sends a message to every other participating node, asking whether the "Prepare" message was received. If at least one of the nodes replies "No", the transaction is rolled back; otherwise it is committed.

Note that once all the nodes have received the "Prepare" message, we can safely commit, since voting "No" is impossible during the "Prepare" phase, as stated above. It is possible that some nodes will have already committed the transaction by the time they receive the Recovery Protocol message from other nodes. For such cases, every node keeps a backlog of completed transaction IDs for a certain period of time. If there is no ongoing transaction with the given ID, the backlog is checked; if the transaction is not found in the backlog either, then it was never started, which means the failure occurred before the Prepare phase was completed and it is safe to roll back.

Since the data is volatile and leaves no state behind after a crash, it does not really matter if any of the primary or backup nodes crashed together with the coordinator node. The Recovery Protocol still works the same.

Figure 2: Recovery Protocol
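
The decision rule behind the Recovery Protocol condenses to a few lines. The structures here (a vote map plus a completed-transaction backlog) are hypothetical simplifications, not the actual GridGain message flow.

```python
def recovery_decision(tx_id, prepare_received, backlog):
    """Decide commit/rollback for tx_id after the coordinator crashes.

    prepare_received: node name -> did that node get the Prepare message?
    backlog: set of recently completed transaction IDs kept by each node.
    """
    if tx_id in backlog:
        # Some node already committed this transaction: commit everywhere.
        return "commit"
    if all(prepare_received.values()):
        # Every participant was prepared, and voting "No" on Prepare is
        # impossible, so it is safe to commit.
        return "commit"
    # At least one node never saw Prepare: the coordinator died before
    # finishing phase one, so it is safe to roll back.
    return "rollback"
```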

Conclusion

The advantage of the 2-Phase-Commit Recovery Protocol in GridGain over the 3-Phase-Commit protocol is that, in the absence of failures, the recovery protocol does not introduce any additional overhead, while the 3-phase-commit adds an additional synchronous network round trip to the transaction and therefore has a negative impact on performance and latencies. GridGain also supports modes where it works with various persistent stores, including transactional databases. In my next blog, I will cover the scenario where a cache sits on top of a transactional persistent store, and how the 2-Phase-Commit protocol works in that case.

Part II

Reference: http://java.dzone.com/articles/two-phase-commit-memory-caches

Generally, persistent disk-oriented systems require the additional 3rd phase in the commit protocol in order to ensure data consistency in case of failures. In Part I, I covered why the 2-Phase-Commit protocol (without the 3rd phase) is sufficient to handle failures for distributed in-memory caches. The explanation was based on the open source GridGain architecture; however, it can be applied to any in-memory distributed system.

In this blog we will cover the case where an in-memory cache serves as a layer on top of a persistent database. Here the database serves as the primary system of record, and the distributed in-memory cache is added for performance and scalability reasons, to accelerate reads and (sometimes) writes to the data. The cache must be kept consistent with the database, which means that a cache transaction must merge with the database transaction.

When we add a persistent store to an in-memory cache, our primary goal is to make sure that the cache remains consistent with the on-disk database at all times.

In order to keep the data consistent between memory and the database, data is automatically loaded on demand whenever a read happens and the data cannot be found in the cache. This behavior is called read-through. Alternatively, whenever a write operation happens, data is stored in the cache and is automatically persisted to the database. This behavior is called write-through. Additionally, there is also a mode called write-behind, which batches up the writes in memory and flushes them to the database in one bulk operation (we will not be covering this mode here).
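
Read-through and write-through can be sketched in a few lines; here a plain dict stands in for the persistent store, and the class names are hypothetical rather than any specific cache API.

```python
class WriteThroughCache:
    """Minimal read-through / write-through cache over a dict 'database'."""
    def __init__(self, database):
        self.db = database
        self.cache = {}

    def get(self, key):
        # Read-through: on a cache miss, load the value from the database.
        if key not in self.cache and key in self.db:
            self.cache[key] = self.db[key]
        return self.cache.get(key)

    def put(self, key, value):
        # Write-through: update the cache and persist synchronously.
        self.cache[key] = value
        self.db[key] = value

db = {"k2": "v2"}         # k2 exists only in the "database" initially
c = WriteThroughCache(db)
```

A write-behind variant would buffer `put` calls and flush them to `db` in batches instead of writing synchronously.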

Figure 1: Read-through operation for K2

When we add a persistent store to the 2-Phase-Commit protocol, in order to merge the cache and database transactions into one, the coordinator has to write the transactional changes to the database before it sends the "Commit" message to the other participants. This way, if the database transaction fails, the coordinator can still send the "Rollback" message to everyone involved, so that the cached data remains consistent with the database. The figure below illustrates this behavior.

Figure 2: Two-Phase-Commit with In-Memory-Cache and Database
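
The merged commit flow in Figure 2 can be sketched as follows: the database write sits between the "Prepare" votes and the "Commit" messages, so a database failure turns into a cache-wide rollback. All classes here are hypothetical illustrations, not a real cache or database API.

```python
class Participant:
    """Hypothetical cache node participating in the transaction."""
    def __init__(self):
        self.store, self.pending = {}, {}
    def prepare(self, tx_id, changes):
        self.pending[tx_id] = dict(changes)
        return "Yes"
    def commit(self, tx_id):
        self.store.update(self.pending.pop(tx_id))
    def rollback(self, tx_id):
        self.pending.pop(tx_id, None)

class FlakyDB:
    """Stand-in persistent store whose write can be forced to fail."""
    def __init__(self, fail=False):
        self.rows, self.fail = {}, fail
    def write(self, changes):
        if self.fail:
            raise IOError("database transaction failed")
        self.rows.update(changes)

class TxCoordinator:
    def __init__(self, database):
        self.db = database

    def commit_with_store(self, tx_id, changes, participants):
        votes = [p.prepare(tx_id, changes) for p in participants]
        if not all(v == "Yes" for v in votes):
            for p in participants:
                p.rollback(tx_id)
            return "rolled back"
        try:
            self.db.write(changes)      # database transaction comes first
        except Exception:
            for p in participants:      # DB failed: keep the cache consistent
                p.rollback(tx_id)
            return "rolled back"
        for p in participants:          # DB succeeded: Commit goes out
            p.commit(tx_id)
        return "committed"
```

Note that the design choice is exactly the one described above: because the coordinator learns the database outcome before sending "Commit", the cache can never hold data the database rejected.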
