Commit fb899d3

itisravi authored and nixpanic committed
add feature page for Automagic unsplit-brain in afr
Change-Id: I2bc990ee7d1de3a6d3c9d1c235741dd7efadec24
Signed-off-by: Ravishankar N <[email protected]>
Reviewed-on: http://review.gluster.org/14554
Reviewed-by: Pranith Kumar Karampuri <[email protected]>
Reviewed-by: Ashish Pandey <[email protected]>
Reviewed-by: Anuradha Talur <[email protected]>
Reviewed-by: Niels de Vos <[email protected]>
Tested-by: Niels de Vos <[email protected]>
1 parent 230e3a3 commit fb899d3

1 file changed: +132 -0 lines changed

@@ -0,0 +1,132 @@
# Automagic unsplit-brain by [ctime|mtime|size|majority]

## Summary
A new volume option, 'cluster.favorite-child-policy', is introduced which automatically resolves split-brains by
choosing a particular brick as the good copy based on the value (policy) set.

## Owners
Ravishankar N <[email protected]>

The patch is a rework of the one submitted by Richard Wareing from Facebook.

## Current status
* Patch merged in master: http://review.gluster.org/#/c/14026/
* Patch merged in 3.8: http://review.gluster.org/#/c/14535/

## Related Feature Requests and Bugs
* 3.8 BZ: https://bugzilla.redhat.com/show_bug.cgi?id=1339639
* Original BZ to which Facebook's patch was attached: https://bugzilla.redhat.com/show_bug.cgi?id=1262161

## Detailed Description
In a replicate volume, when a file ends up in split-brain, accessing it from the client results in an input/output error (EIO).
To resolve split-brains, the user/admin needs to use the gluster CLI commands or the virtual setfattr commands to choose a particular
copy of the file as the source and trigger a heal. Until such manual intervention happens, accessing the file fails with EIO.

With the 'cluster.favorite-child-policy' option, users can set a policy for AFR to automatically pick a source when a file ends up in split-brain and heal it.
This means they no longer get EIO when trying to access files, and the split-brains get resolved automatically. The available policies are:
* none: This is the default value. When set, there is no automatic resolution of split-brains.
* ctime: Selects the file with the highest ctime as the source.
* mtime: Selects the file with the highest mtime as the source.
* size: Selects the file with the biggest file size as the source.
* majority: Selects a file with identical mtime and size in more than half the number of bricks in the replica as the source.

This is a volume-wide option, i.e. the same policy is applied to all split-brained files of the volume.

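The policy descriptions above can be read as a small selection function. The following Python sketch is only an illustration of that reading, not AFR's actual implementation: the `Replica` record, the `pick_source` function and the tie-breaking rule (the first brick in the list wins) are assumptions made for this example.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Replica:
    """Hypothetical per-brick view of one split-brained file."""
    brick: str
    ctime: float
    mtime: float
    size: int


def pick_source(replicas: List[Replica], policy: str) -> Optional[Replica]:
    """Return the copy the given policy would favour, or None if no source is chosen."""
    if policy == "none" or not replicas:
        return None  # default: no automatic resolution
    if policy in ("ctime", "mtime", "size"):
        # max() returns the first maximal element, so ties go to the
        # earliest brick in the list (an assumption of this sketch).
        return max(replicas, key=lambda r: getattr(r, policy))
    if policy == "majority":
        # Group copies by (mtime, size) and require strictly more than half.
        counts = Counter((r.mtime, r.size) for r in replicas)
        key, votes = counts.most_common(1)[0]
        if votes * 2 > len(replicas):
            return next(r for r in replicas if (r.mtime, r.size) == key)
        return None
    raise ValueError(f"unknown policy: {policy!r}")


# Illustrative values only; the timestamps are made up.
copies = [
    Replica("brick1", ctime=100.0, mtime=100.0, size=1048576),
    Replica("brick2", ctime=90.0, mtime=90.0, size=1024),
]
print(pick_source(copies, "size").brick)
```

Under this reading, 'majority' cannot pick a source when only two copies exist and they differ, because neither copy is present on more than half of the bricks.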
## Benefit to GlusterFS
No manual intervention is required to fix split-brains.

## Scope

### Nature of proposed change
The code changes for handling the various policies of the option are made in AFR.

### Implications on manageability
A new volume option, 'cluster.favorite-child-policy', is introduced.

### Implications on presentation layer
None.

### Implications on persistence layer
None.

### Implications on 'GlusterFS' backend
None.

### Modification to GlusterFS metadata
None.

### Implications on 'glusterd'
Just the introduction of the volume option.

## How To Test
Create files in data/metadata split-brain, use the volume set command to set various policies, and check whether the split-brain heal happens according to the policy. The [.t file](https://github.com/gluster/glusterfs/blob/2f29065/tests/basic/afr/split-brain-favorite-child-policy.t) in the patch contains test cases.
Here is an example of how the volume option can be used:

### 1. A replica 2 volume that has '/file' in split-brain:
```
[root@dhcp42-116 ~]# gluster v heal testvol info
Brick 127.0.0.2:/brick/brick1
/file - Is in split-brain

Status: Connected
Number of entries: 1

Brick 127.0.0.2:/brick/brick2
<gfid:fa6f2ab2-722e-4cf3-9f75-662c70be3f58> - Is in split-brain

Status: Connected
Number of entries: 1
```

### 2. The file size in the backend is different:
```
[root@dhcp42-116 ~]# ll /brick/brick*/file
-rw-r--r-- 2 root root 1048576 May 30 12:59 /brick/brick1/file
-rw-r--r-- 2 root root 1024 May 30 12:58 /brick/brick2/file
```

### 3. Set the policy to heal based on bigger size:
```
[root@dhcp42-116 ~]# gluster volume set testvol cluster.favorite-child-policy size
volume set: success
```

### 4. Launch heal:
```
[root@dhcp42-116 ~]# gluster volume heal testvol
Launching heal operation to perform index self heal on volume testvol has been successful
Use heal info commands to check status
```

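Steps 3 and 4 could be scripted, for example from a test harness. The Python sketch below only wraps the two commands shown above; the function names and the injectable `run` parameter are hypothetical and exist purely to make the example testable without a cluster.

```python
import subprocess
from typing import Callable, List

# Policy names as listed in the Detailed Description.
POLICIES = ("none", "ctime", "mtime", "size", "majority")


def set_policy_cmd(volume: str, policy: str) -> List[str]:
    """Build the 'gluster volume set' command from step 3."""
    if policy not in POLICIES:
        raise ValueError(f"unknown policy: {policy!r}")
    return ["gluster", "volume", "set", volume,
            "cluster.favorite-child-policy", policy]


def heal_cmd(volume: str) -> List[str]:
    """Build the 'gluster volume heal' command from step 4."""
    return ["gluster", "volume", "heal", volume]


def apply_policy_and_heal(volume: str, policy: str,
                          run: Callable = subprocess.run) -> None:
    """Set the policy, then launch a heal; raises if either command fails."""
    # Both commands are built (and the policy validated) before anything runs.
    for cmd in (set_policy_cmd(volume, policy), heal_cmd(volume)):
        run(cmd, check=True)
```

Calling `apply_policy_and_heal("testvol", "size")` on a node with the gluster CLI installed would issue the same two commands as the transcript above.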
### 5. Check heal info output again to verify file has been healed:
```
[root@dhcp42-116 ~]# gluster v heal testvol info
Brick 127.0.0.2:/brick/brick1
Status: Connected
Number of entries: 0

Brick 127.0.0.2:/brick/brick2
Status: Connected
Number of entries: 0
```

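When this check is automated, the heal info text can be parsed rather than read by eye. The following Python sketch assumes the output keeps the layout shown in steps 1 and 5 (a `Brick ...` header, optional entry lines, then `Status:` and `Number of entries:` lines); `parse_heal_info` and `is_fully_healed` are hypothetical helper names.

```python
from typing import Dict, List


def parse_heal_info(output: str) -> Dict[str, List[str]]:
    """Map each brick named in a 'Brick ...' header to the entries listed under it."""
    entries: Dict[str, List[str]] = {}
    current = None
    for raw in output.splitlines():
        line = raw.strip()
        if line.startswith("Brick "):
            current = line[len("Brick "):]
            entries[current] = []
        elif current and line and not line.startswith(
                ("Status:", "Number of entries:")):
            # Anything else under a brick header is a pending entry.
            entries[current].append(line)
    return entries


def is_fully_healed(output: str) -> bool:
    """True if at least one brick was reported and none lists pending entries."""
    parsed = parse_heal_info(output)
    return bool(parsed) and all(not pending for pending in parsed.values())
```

Lines that appear before the first brick header, such as a shell prompt, are ignored by this parser.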
### 6. Check in the backend that the bigger file has been used as source:
```
[root@dhcp42-116 ~]# ll /brick/brick*/file
-rw-r--r-- 2 root root 1048576 May 30 12:59 /brick/brick1/file
-rw-r--r-- 2 root root 1048576 May 30 12:59 /brick/brick2/file
```

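Comparing sizes with `ll` shows that the copies now have the same length, not that their contents agree. A slightly stricter backend check, sketched here in Python, compares size and a SHA-256 digest of each copy; `fingerprint` and `copies_match` are hypothetical names, and the brick paths in the usage line are the ones from the example above.

```python
import hashlib
import os
from typing import Iterable, Tuple


def fingerprint(path: str) -> Tuple[int, str]:
    """Return (size in bytes, SHA-256 hex digest) for one backend copy."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read in 1 MiB chunks so large files are not loaded at once.
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return os.stat(path).st_size, digest.hexdigest()


def copies_match(paths: Iterable[str]) -> bool:
    """True if every given copy has the same size and content digest."""
    return len({fingerprint(p) for p in paths}) == 1
```

On the brick host this could be called as `copies_match(["/brick/brick1/file", "/brick/brick2/file"])`.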
## User Experience
New CLI volume option 'cluster.favorite-child-policy'.

## Dependencies
None.

## Documentation
ToDo.

## Status
Patch merged.

## Comments and Discussion
