Architecture Analysis of Stack Overflow, a Large-Scale .NET Web Site

Stack Overflow URL:
Current traffic: 95 million page views per month (over 3 million per day)
Current Alexa rank: 149
.NET technologies used: C#, Visual Studio 2010 Team Suite, ASP.NET 4, ASP.NET MVC 3, Razor, LINQ to SQL + raw SQL

The original English article follows, with the translator's glosses in parentheses.

A lot has happened since my first article on the Stack Overflow Architecture (2009-8-5). Contrary to the theme of that last article, which lavished attention on Stack Overflow's dedication to a scale-up strategy, Stack Overflow has both grown up and out in the last few years.

Stack Overflow has grown up by more than doubling in size to over 16 million users and multiplying its number of page views nearly 6 times to 95 million page views a month.

Stack Overflow has grown out by expanding into the Stack Exchange Network, which includes Stack Overflow, Server Fault, and Super User for a grand total of 43 different sites. That's a lot of fruitful multiplying going on.

What hasn't changed is Stack Overflow's openness about what they are doing. And that's what prompted this update. A recent series of posts talks a lot about how they've been handling their growth: Stack Exchange’s Architecture in Bullet Points, Stack Overflow’s New York Data Center, Designing For Scalability of Management and Fault Tolerance, Stack Overflow Search - Now 81% Less, Stack Overflow Network Configuration, Does Stack Overflow use caching and if so, how?, Which tools and technologies build the Stack Exchange Network?.

Some of the more obvious differences across time are:

Just More. More users, more page views, more data centers, more sites, more developers, more operating systems, more databases, more machines. Just a lot more of more.

Linux. Stack Overflow was known for its Windows stack; now they are using a lot more Linux machines for HAProxy (load balancing), Redis (NoSQL database), Bacula (backup system), Nagios (remote monitoring), logs, and routers. All support functions seem to be handled by Linux, which has required the development of parallel release processes.

Fault Tolerance. Stack Overflow is now being served by two different switches on two different internet connections, they've added redundant machines, and some functions have moved to a second data center.

NoSQL. Redis is now used as a caching layer for the entire network. There wasn't a separate caching tier before, so this is a big change, as is using a NoSQL database on Linux.

Unfortunately, I couldn't find any coverage on some of the open questions I had last time, like how they were going to deal with multi-tenancy across so many different properties, but there's still plenty to learn from. (Translator's note: multi-tenancy is a software architecture in which the software runs on the servers of a software-as-a-service provider and serves multiple client organizations, the tenants.) Here's a rollup of a few different sources:

The Stats

95 Million Page Views a Month
800 HTTP requests a second
180 DNS requests a second
55 Megabits per second
16 Million Users - Traffic to Stack Overflow grew 131% in 2010, to 16.6 million global monthly uniques.

Data Centers

1 Rack with Peak Internet in OR (Hosts our chat and Data Explorer)
2 Racks with Peer1 in NY (Hosts the rest of the Stack Exchange Network)

Hardware

10 Dell R610 IIS web servers (3 dedicated to Stack Overflow):
  o 1x Intel Xeon Processor E5640 @ 2.66 GHz Quad Core with 8 threads
  o 16 GB RAM
  o Windows Server 2008 R2
2 Dell R710 database servers:
  o 2x Intel Xeon Processor X5680 @ 3.33 GHz
  o 64 GB RAM
  o 8 spindles
  o SQL Server 2008 R2
2 Dell R610 HAProxy servers:
  o 1x Intel Xeon Processor E5640 @ 2.66 GHz
  o 4 GB RAM
  o Ubuntu Server
2 Dell R610 Redis servers:
  o 2x Intel Xeon Processor E5640 @ 2.66 GHz
  o 16 GB RAM
  o CentOS
1 Dell R610 Linux backup server running Bacula:
  o 1x Intel Xeon Processor E5640 @ 2.66 GHz
  o 32 GB RAM
1 Dell R610 Linux management server for Nagios and logs:
  o 1x Intel Xeon Processor E5640 @ 2.66 GHz
  o 32 GB RAM
2 Dell R610 VMware ESXi domain controllers:
  o 1x Intel Xeon Processor E5640 @ 2.66 GHz
  o 16 GB RAM
2 Linux routers
5 Dell PowerConnect switches

Dev Tools

C#: Language
Visual Studio 2010 Team Suite: IDE
Microsoft ASP.NET (version 4.0): Framework
ASP.NET MVC 3: Web Framework
Razor: View Engine
jQuery 1.4.2: Browser Framework
LINQ to SQL, some raw SQL: Data Access Layer
Mercurial and Kiln: Source Control (distributed version control system)
Beyond Compare 3: Compare Tool (file comparison tool)

Software and Technologies Used

Stack Overflow uses a WISC stack via BizSpark
Windows Server 2008 R2 x64: Operating System
SQL Server 2008 R2 running Microsoft Windows Server 2008 Enterprise Edition x64: Database
Ubuntu Server
CentOS
IIS 7.0: Web Server
HAProxy: for load balancing (a high-performance TCP/HTTP load balancer)
Redis: used as the distributed caching layer (a NoSQL database serving as the distributed cache)
CruiseControl.NET: for builds and automated deployment (a continuous integration tool for the .NET platform)
Lucene.NET: for search
Bacula: for backups (an open-source data backup system)
Nagios: (with n2rrd and drraw plugins) for monitoring (remote monitoring software that tracks system status and network information)
Splunk: for logs (a log analysis tool)
SQL Monitor: from Red Gate - for SQL Server monitoring
Bind: for DNS
Rovio: a little robot (a real robot) allowing remote developers to visit the office “vi