We recently made two optimizations to our production Redis: scaling it out across multiple instances, and optimizing the storage of simple key-value pairs (converting strings to hashes).
Scaling Redis
The previous post covered installing Codis. But with long-running Pipeline operations and a high connection count, we frequently saw connection resets. That did not inspire confidence, and since we don't know Go, fixing it ourselves wasn't realistic in the short term.
So we looked for another way. Earlier on we had split data across Redis instances by business domain. Within a single domain, though, keys have to be distributed across instances by hash, and rolling that by hand means wrapping up a pile of plumbing.
The Jedis client ships with sharding support (Sharded), which maps each key to a Redis instance by hashing it onto a ring. The core of Sharded, excerpted:
public class Sharded<R, S extends ShardInfo<R>> {
    ...
    private void initialize(List<S> shards) {
        nodes = new TreeMap<Long, S>();
        for (int i = 0; i != shards.size(); ++i) {
            final S shardInfo = shards.get(i);
            if (shardInfo.getName() == null)
                for (int n = 0; n < 160 * shardInfo.getWeight(); n++) {
                    nodes.put(this.algo.hash("SHARD-" + i + "-NODE-" + n),
                            shardInfo);
                }
            else
                for (int n = 0; n < 160 * shardInfo.getWeight(); n++) {
                    nodes.put(
                            this.algo.hash(shardInfo.getName() + "*"
                                    + shardInfo.getWeight() + n), shardInfo);
                }
            resources.put(shardInfo, shardInfo.createResource());
        }
    }
    ...
    public S getShardInfo(byte[] key) {
        SortedMap<Long, S> tail = nodes.tailMap(algo.hash(key));
        if (tail.isEmpty()) {
            return nodes.get(nodes.firstKey());
        }
        return tail.get(tail.firstKey());
    }

    public S getShardInfo(String key) {
        return getShardInfo(SafeEncoder.encode(getKeyTag(key)));
    }
    ...
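To make the ring logic concrete, here is a minimal, dependency-free sketch of the same lookup: each shard gets 160 virtual nodes on a TreeMap, and a key is routed to the first node at or clockwise past its hash. The hash function below is a simple stand-in for the MurmurHash that Jedis actually uses, and the shard names are invented for illustration.

```java
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

public class RingSketch {
    private final TreeMap<Long, String> nodes = new TreeMap<>();

    public RingSketch(List<String> shards) {
        for (int i = 0; i < shards.size(); i++) {
            // 160 virtual nodes per shard (weight 1), mirroring Sharded above
            for (int n = 0; n < 160; n++) {
                nodes.put(hash("SHARD-" + i + "-NODE-" + n), shards.get(i));
            }
        }
    }

    public String getShard(String key) {
        SortedMap<Long, String> tail = nodes.tailMap(hash(key));
        // wrap around to the ring's first node when we fall off the end
        return tail.isEmpty() ? nodes.firstEntry().getValue() : tail.get(tail.firstKey());
    }

    // simple 64-bit polynomial hash; a stand-in for Jedis's MurmurHash
    private long hash(String s) {
        long h = 1125899906842597L;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);
        }
        return h;
    }
}
```

Because each physical shard owns many scattered virtual nodes, removing or adding one shard only remaps the keys that landed on its nodes rather than reshuffling everything.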
Usage is straightforward: read and write through ShardedJedis, whose API largely mirrors Jedis. The exception is cluster-wide operations such as keys/scan, which are unavailable:
public List<JedisShardInfo> getShards(String sValue) {
    String[] servers = sValue.split(",");
    List<JedisShardInfo> shards = new ArrayList<>();
    for (String server : servers) {
        Pair<String, Integer> hp = parseServer(server);
        shards.add(new JedisShardInfo(hp.getLeft(), hp.getRight(), Integer.MAX_VALUE));
    }
    return shards;
}

private ShardedJedisPool createRedisPool(String server) {
    return new ShardedJedisPool(new GenericObjectPoolConfig(), getShards(server));
}
If you do need keys, you can obtain every underlying Jedis instance via getAllShards and process each instance's keys yourself:
public Double zscore(String key, String member) {
    try (ShardedJedis redis = getRedis()) {
        return redis.zscore(key, member);
    }
}

public void expires(List<String> patterns, int seconds) {
    try (ShardedJedis shardedJedis = getRedis()) {
        Set<String> keys = new HashSet<>();
        for (Jedis redis : shardedJedis.getAllShards()) {
            for (String p : patterns) {
                keys.addAll(redis.keys(p)); // run KEYS on each individual instance to collect matches
            }
        }
        ShardedJedisPipeline pipeline = shardedJedis.pipelined();
        for (String key : keys) {
            pipeline.expire(key, seconds);
        }
        pipeline.sync();
    }
}
After splitting across multiple instances, the effect was significant: writes at peak hours are spread out, load is shared evenly, total usable memory scales with the instance count, and keys end up distributed roughly evenly (with --maxmemory-policy volatile-lru). Actual production figures:
[hadoop@hadoop-master1 redis]$ sh stat_cluster.sh
* [ ============================================================> ] 4 / 4
hadoop-master1:
# Memory
used_memory:44287785776
used_memory_human:41.25G
used_memory_rss:67458658304
used_memory_peak:67981990576
used_memory_peak_human:63.31G
used_memory_lua:33792
mem_fragmentation_ratio:1.52
mem_allocator:jemalloc-3.6.0
# Keyspace
db0:keys=72729777,expires=11967,avg_ttl=63510023
hadoop-master2:
# Memory
used_memory:50667945344
used_memory_human:47.19G
used_memory_rss:66036752384
used_memory_peak:64424543672
used_memory_peak_human:60.00G
used_memory_lua:33792
mem_fragmentation_ratio:1.30
mem_allocator:jemalloc-3.6.0
# Keyspace
db0:keys=100697581,expires=13426,avg_ttl=63509903
hadoop-master3:
# Memory
used_memory:56763389184
used_memory_human:52.87G
used_memory_rss:66324045824
used_memory_peak:64424546136
used_memory_peak_human:60.00G
used_memory_lua:33792
mem_fragmentation_ratio:1.17
mem_allocator:jemalloc-3.6.0
# Keyspace
db0:keys=94363547,expires=13544,avg_ttl=63505693
hadoop-master4:
# Memory
used_memory:54513952832
used_memory_human:50.77G
used_memory_rss:67257393152
used_memory_peak:64820124928
used_memory_peak_human:60.37G
used_memory_lua:33792
mem_fragmentation_ratio:1.23
mem_allocator:jemalloc-3.6.0
# Keyspace
db0:keys=83297543,expires=12418,avg_ttl=63507046
Finished processing 4 / 4 hosts in 298.89 ms
Storage Optimization
In practice we store a huge number of simple string key-value pairs, which is quite memory-hungry. Hashes (when internally encoded as a ziplist) use memory far more efficiently.
Note that only ziplist-encoded hashes save memory! A hash that has grown into the hashtable encoding wastes it.
The official docs compare plain key-value pairs with hashes as follows (setting aside per-key features in Redis); small hashes get a dedicated optimization:
a few keys use a lot more memory than a single key containing a hash with a few fields.
We use a trick.
But many times hashes contain just a few fields. When hashes are small we can instead just encode them in an O(N) data structure, like a linear array with length-prefixed key value pairs. Since we do this only when N is small, the amortized time for HGET and HSET commands is still O(1).
This does not only work well from the point of view of time complexity, but also from the point of view of constant times, since a linear array of key value pairs happens to play very well with the CPU cache (it has a better cache locality than a hash table).
The optimization hinges on two ziplist parameters, which trade CPU for memory. The default for entries is fine as-is; value is best kept below 254 bytes (once a ziplist entry reaches 254 bytes, the next entry's prevlen field, which records the previous entry's length, grows from 1 byte to 5 bytes).
hash-max-zipmap-entries 512 (hash-max-ziplist-entries for Redis >= 2.6)
hash-max-zipmap-value 64 (hash-max-ziplist-value for Redis >= 2.6)
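The prevlen rule behind the 254-byte advice can be written out directly. This is a sketch of the bookkeeping, not Redis's actual source: each ziplist entry stores the length of the previous entry so the list can be walked backwards, and that field is 1 byte for lengths below 254 but 5 bytes (a marker byte plus a 4-byte length) from 254 up.

```java
public class PrevLen {
    // Bytes needed by a ziplist entry's prevlen field, given the length in
    // bytes of the entry that precedes it.
    public static int prevLenBytes(int prevEntryLength) {
        return prevEntryLength < 254 ? 1 : 5;
    }
}
```

So one oversized value does not just cost its own bytes: it also forces the following entry to spend 4 extra bytes of bookkeeping.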
A few sample entries:
3:0dc46077dfaa4970a1ec9f38cfc29277fa9e1012.ime.galileo.baidu.com -> 1469584847
3:co4hk52ia0b1.5buzd.com -> 1468859527
1:119.84.110.82_39502 -> 1469666877
The original key content is not needed afterwards, and since keys containing full domain names are long, we simply MD5 each key. Estimating with 100 million key-value pairs: take the first 5 hex chars of the MD5 as the hash key, and the remaining 27 chars as the field inside that hash.
We scan the original Redis instance and write the converted key-value pairs into a new instance. The conversion code in Scala:
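The split described above can be sketched as follows; the class and method names are made up for illustration, but the digest layout (5-char hash key, 27-char field) matches the scheme:

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class KeySplit {
    // Returns { hashKey, field }: the first 5 hex chars of the MD5 pick one
    // of 16^5 ≈ 1M small hashes; the remaining 27 chars become the field.
    public static String[] split(String key) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] d = md.digest(key.getBytes(StandardCharsets.UTF_8));
            String hex = String.format("%032x", new BigInteger(1, d)); // zero-padded 32-char digest
            return new String[] { hex.substring(0, 5), hex.substring(5) };
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // MD5 is always available on the JVM
        }
    }
}
```

With ~100M keys over ~1M hashes, each hash holds on the order of 100 fields, keeping it within the ziplist thresholds above.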
import java.util.{List => JList}
import org.apache.commons.codec.digest.DigestUtils
import redis.clients.jedis._
import scala.collection.JavaConversions._

trait RedisUtils {
  def md5(data: String): String = {
    DigestUtils.md5Hex(data)
  }

  def Type(redis: Jedis, key: String) = redis.`type`(key)

  def scan(redis: Jedis)(action: JList[String] => Unit): Unit = {
    import scala.util.control.Breaks._
    var cursor = "0"
    breakable {
      while (true) {
        val res = redis.scan(cursor)
        action(res.getResult())
        cursor = res.getStringCursor
        if (cursor.equals("0")) {
          break
        }
      }
    }
  }

  def printInfo(redis: Jedis): Unit = {
    println(redis.info())
  }

  // Verification:
  // Print the *total* number of key-value pairs:
  // eval "local aks=redis.call('keys', '*'); local res=0; for i,r in ipairs(aks) do res=res+redis.call('hlen', r) end; return res" 0
  // Print the number of key-value pairs in *each* hash:
  // eval "local aks=redis.call('keys', '*'); local res={}; for i,r in ipairs(aks) do res[i]=redis.call('hlen', r) end; return res" 0
}

object RedisTransfer extends RedisUtils {
  def handle(key: String, value: String, tp: Pipeline): Unit = {
    val m5 = md5(key)
    tp.hset(m5.substring(0, 5), m5.substring(5), value)
  }

  def main(args: Array[String]) {
    val Array(sHost, sPort, tHost, tPort) = args
    val timeout = 60 * 1000
    val source = new Jedis(sHost, sPort.toInt, timeout)
    val sp = source.pipelined()
    val target = new Jedis(tHost, tPort.toInt, timeout)
    val tp = target.pipelined()
    scan(source) { keys =>
      // only handles records of type string
      val requests = for (key <- keys) yield Some((key, sp.get(key)))
      sp.sync()
      for (
        request <- requests;
        (key, resp) <- request
      ) {
        try {
          handle(key, resp.get(), tp)
        } catch {
          case e: Exception => println(s"fetch $key with exception, ${e.getMessage}")
        }
      }
    }
    tp.sync()
    printInfo(target)
    target.close()
    source.close()
  }
}
Since the data was transformed along the way, a clean before/after comparison is not possible, so I cannot claim an exact saving. But with this conversion in place, an instance that used to take 30G (roughly 300+ million keys) shrank to 15G.
Another Case
We also ran a test against the domain-name instance: 6.4 million key-value pairs occupying 707.29M of memory.
The first 4 chars of the MD5 serve as the hash key, producing 65536 hashes of roughly 100 kv pairs each.
With the original key kept as the field inside the hash:
Without raising ziplist_value, the converted hashes fall into the hashtable encoding: 939.6M.
With ziplist_value raised to 1024, they stay ziplist-encoded: 513.78M.
Using the full MD5 as the new field instead of the original key: 344.7M.
Using the last 28 chars of the MD5 as the new field: 259.09M.
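A quick sanity check of the bucketing arithmetic above (helper names are made up):

```java
public class BucketMath {
    // Number of buckets addressable by a hex prefix: each hex char is 4 bits.
    public static long buckets(int hexChars) {
        return 1L << (4 * hexChars);
    }

    // Average fields per hash when totalKeys are spread over those buckets.
    public static long avgFields(long totalKeys, int hexChars) {
        return totalKeys / buckets(hexChars);
    }
}
```

A 4-char prefix gives 16^4 = 65536 buckets, so 6.4M keys average just under 100 fields per hash, comfortably under the 512-entry ziplist limit; the 5-char prefix used for the 100M-key estimate gives 16^5 = 1048576 buckets at a similar ~100 fields each.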
For example:
MD5:
3:0dc46077dfaa4970a1ec9f38cfc29277fa9e1012.ime.galileo.baidu.com
1356de078028ddf266c962533760b27c
1356 -> hash( 3:0dc46077dfaa4970a1ec9f38cfc29277fa9e1012.ime.galileo.baidu.com -> 1469584847 )
1356 -> hash( 1356de078028ddf266c962533760b27c -> 1469584847 )
1356 -> hash( de078028ddf266c962533760b27c -> 1469584847 )
–END