Large-Scale Redis Operations Experience at Sina
@曾经的阿飞
rj03hou@gmail.com
2013-5-6

About me
DBA @ Sina
MySQL, Redis, HBase

Outline
Introduction to Redis
Redis applications
Operations experience

Introduction to Redis
REmote DIctionary Server
In-memory database, persisted to disk
Started in 2009 by @antirez
Open source

Storage & cache
Data types: string, hash, list, set, sorted set
Persistence
High performance
Expiration times

Persistence
rdb
aof

rdb
Forks a child process; relies on copy-on-write
  – rdbcompression yes
  – dbfilename r7700.rdb
  – save <seconds> <changes>

aof
AOF = append-only log file, analogous to the MySQL binlog
appendonly yes
appendfilename appendonly.aof
appendfsync [everysec | always | no]
no-appendfsync-on-rewrite yes
auto-aof-rewrite-percentage 0
auto-aof-rewrite-min-size 64mb
bgrewriteaof
Append
Rewrite
  – Fork a child process
  – The child iterates over all keys and writes them to a temporary file
  – The parent writes AOF updates made in the meantime to a buffer
  – The buffer is appended to the temporary file
  – The temporary file replaces the existing AOF file

bgrewriteaof
auto
  – auto-aof-rewrite-percentage 100
    • aof_current_size
    • aof_base_size
  – auto-aof-rewrite-min-size 64mb
crontab
Remote, centralized

Recovery
Recovery process
  – Only AOF enabled: load from the AOF
  – Both AOF and RDB enabled: load from the AOF
  – RDB enabled: load from the RDB

Hash recovery test

                Size (GB)   Recovery time   Recovery time per GB
  Memory        2.04        0               0
  rdb           0.65        19              29.23
  aof           11          508             46.18
  rewrite aof   4           194             48.50

[Charts: recovery time and size for memory, rdb, aof, and rewrite aof]

Redis applications

Current status
Instances: 1,500+
Total memory: 15 TB+
Requests: 200 billion+ per day

[Diagram: Web, MC x4, Queue, MySQL (Master/Slave x2)]

[Diagram: Web, Queue, Redis (Master/Slave x2)]

[Diagram: Web, Queue, Redis and MySQL (Master/Slave x2)]

Use cases
String
Hash
List
Set
Sorted set

Followers and following
Follower list
Following list
Mutual-follow list

Followers and following
Hash
  – Key: user_id
  – Field: friend_id
  – Value: added_time
Related operations
  – Follow: hset user_id friend_id added_time
  – Unfollow: hdel user_id friend_id
  – Get the time a user was followed: hget user_id friend_id
  – Get the following list: hgetall user_id

Evolution
Sharding
Problems
  – Fetching the following list is slow
  – CPU
  – hgetall becomes the bottleneck
[Diagram: Memcache, Redis]

Notifications
Hash
  – Key = uid
  – Field = appkey
  – Value = count

Notifications
List

Notifications
Set
  – Key = uid
  – Value = appkey

Notifications
[Diagram: Web, redis x4, Queue, Master/Slave x2, handlersocket]

Operations experience
Automation
Monitoring and alerting
Redis HA
Improvements
Pitfalls

Automation
Automated deployment
Automated scaling

[Diagram: front-end writes; Redis7901-1976 in IDC1, IDC2, and IDC3; information lookup]

Automated deployment

Automated scaling

Resource pool
[Diagram: server memory divided into App1 used, App2 used, reserved, and free]
Memory
Automated deployment
Automated migration
Automated scaling

Alerting
CPU
  – Per-core utilization
Load
Disk
  – Free space
  – Growth rate

Alerting
Connect
Replication
connected_clients
AOF
  – aof_current_size / aof_base_size

Alerting
Memory
  – Memory available on the server
  – Memory available to the instance
    • maxmemory
    • maxmemory-policy noeviction

Redis HA
Dual write
Replication

Dual write
[Diagram: Master and Master, each serving reads and writes]

Replication
[Diagram: Master takes writes and reads; read-only Slave takes reads; replication from Master to Slave]

Existing solutions
redis_failover
  – Multiple IDCs
Official Sentinel
  – Distributed
Dual write
  – Data consistency
  – Service recovery

redis_failover

Sentinel
[Diagram: Redis monitored by three Sentinels]

Redis HA
[Diagram: Zookeeper, DNSMasq, Client, Master Failover Logic, Slave Failover Logic, Manager, Agent, DNS, DNS Changer]

Lessons learned
string to hash
Instagram
1 million records
[Chart: memory usage for string, hash, and bucketed hash]

Lessons learned
hash-max-zipmap-entries 512
hash-max-zipmap-value 512
list-max-ziplist-entries 512
list-max-ziplist-value 64
set-max-intset-entries 512
zset-max-ziplist-entries 128
zset-max-ziplist-value 64
activerehashing yes

Lessons learned
NIC interrupts
Keep bgrewriteaof and bgsave runs apart
Ample disk space (128G * 2 * 3 = 768G)
RAID card

Improvements
Rediscounter
  – rdb + aof
  – aof position
  – Fixed-size rolling
redis
  – cron bgrewrite 473***
  – Automated upgrades
  – Slow queries
  – Network jitter causes the Slave to resynchronize

Q&A
@曾经的阿飞
rj03hou@gmail.com
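
The automatic bgrewriteaof settings (auto-aof-rewrite-percentage, auto-aof-rewrite-min-size) and the aof_current_size / aof_base_size alert both rest on the same growth ratio. The following is a minimal Python sketch of that check, assuming sizes are given in bytes. The function name `should_rewrite_aof` is hypothetical, and the logic is a simplified model of the documented behavior, not Redis source code.

```python
def should_rewrite_aof(aof_current_size, aof_base_size,
                       percentage=100, min_size=64 * 1024 * 1024):
    """Return True when an automatic AOF rewrite would be due.

    percentage models auto-aof-rewrite-percentage (0 disables the check),
    min_size models auto-aof-rewrite-min-size (64mb in the slides).
    """
    if percentage == 0:
        return False  # automatic rewrite switched off
    if aof_current_size <= min_size:
        return False  # file still too small to bother rewriting
    base = aof_base_size if aof_base_size > 0 else 1
    # Growth of the AOF since the last rewrite, in percent.
    growth = aof_current_size * 100 // base - 100
    return growth >= percentage
```

With percentage set to 0, as on the aof slide, the check never fires, which is consistent with driving bgrewriteaof from crontab instead.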
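
The load order on the recovery slide reduces to one rule: the AOF wins whenever it is enabled. A small Python sketch of that rule follows. The function name `startup_data_source` and its boolean parameters are hypothetical and only restate the three cases listed in the slides.

```python
def startup_data_source(aof_enabled, rdb_enabled):
    """Pick the file used to rebuild the dataset at startup."""
    if aof_enabled:
        return "aof"  # AOF takes precedence, even if an RDB file is also enabled
    if rdb_enabled:
        return "rdb"
    return None  # nothing to load; the instance starts empty
```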
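
The following-list model (one hash per user_id, with friend_id as the field and added_time as the value) can be illustrated without a Redis server. The sketch below is an in-memory stand-in, not a Redis client. The class `FollowStore` and its method names are hypothetical, and each method notes the Redis command from the slides that it mirrors.

```python
class FollowStore:
    """In-memory stand-in for one Redis hash per user (illustration only)."""

    def __init__(self):
        self._hashes = {}  # user_id -> {friend_id: added_time}

    def follow(self, user_id, friend_id, added_time):
        # Mirrors: hset user_id friend_id added_time
        self._hashes.setdefault(user_id, {})[friend_id] = added_time

    def unfollow(self, user_id, friend_id):
        # Mirrors: hdel user_id friend_id; True if a field was removed
        fields = self._hashes.get(user_id, {})
        removed = friend_id in fields
        fields.pop(friend_id, None)
        if not fields:
            self._hashes.pop(user_id, None)  # empty hashes disappear
        return removed

    def followed_at(self, user_id, friend_id):
        # Mirrors: hget user_id friend_id
        return self._hashes.get(user_id, {}).get(friend_id)

    def following(self, user_id):
        # Mirrors: hgetall user_id; work grows with the number of fields
        return dict(self._hashes.get(user_id, {}))
```

`following` copies every field of the hash, which is why the slides name hgetall as the bottleneck for users who follow many accounts.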
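
The string-to-hash lesson (attributed to Instagram in the slides) groups many small string keys into a smaller number of hashes, so that each hash stays under the hash-max-zipmap-entries limit and keeps its compact encoding. Below is a minimal Python sketch of the key mapping. The prefix, the bucket size of 512 (chosen to match the hash-max-zipmap-entries value in the slides), and the function names are assumptions, not the scheme actually deployed.

```python
def bucketed_field(prefix, item_id, bucket_size=512):
    """Map a numeric id to a (hash key, hash field) pair."""
    if bucket_size <= 0:
        raise ValueError("bucket_size must be positive")
    # Ids in the same range of bucket_size consecutive values share one hash.
    return f"{prefix}:{item_id // bucket_size}", str(item_id)


def bucket_count(n_ids, bucket_size=512):
    """Number of hashes needed for ids 0 .. n_ids - 1 (ceiling division)."""
    return -(-n_ids // bucket_size)
```

Instead of `set user:1155315 value`, a client would then issue `hset` with the returned key and field, and `hget` with the same pair to read the value back.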