NLPR, CAS-IA 2006-3-21
No. 95, Zhongguancun East Road, Beijing 100080, China
Tel: +86-10-62554263

4.1
• […] NLP […]
• corpus: […]
• corpus linguistics: […]

4.2
• […] [Aijmer, 1991]
• […] [McEnery, 1996]
• […] [Crystal, 1991]
• “[…]” J. Thomas, G. Leech […] [1998]

4.3
• 1950s: […]
• 1957 to the 1980s: […]
  • 1957: Chomsky […]
• 1980s: […]
  • 1983, Lancaster: Lancaster-Oslo/Bergen Corpus (LOB): 500 texts of about 2,000 words each
  • Trésor de la Langue Française (TLF): 2000 […], 1.5 […]
  • The Helsinki Corpus of Historical English: 850-1720, 1600 […]
  • 1988: The International Corpus of English (ICE): 100 […], 1990, 1993, 18 […]
• 1981-1991 (11 […]): 480 […]; 1959-1980 (20 […]): 140 […]

4.4
• 1979: […] 527 […]
• 1983: […] 2000 […]
• […] (1983, […], 106 […])
• 1983: […] 182 […]
• 1991: […] 7000 […]
• 1992: […] 2600 […]; 1998: […] 2000 […], 1000 […], 8000 […]
• 1998: […] 1 […]

4.5
• heterogeneous: […] [2002]
• homogeneous: […] “[…]” TIPSTER
• systematic: […]
• specialized: […]
• […] [[…], 2003]
• “[…]” 20 […]
• C: […]
  E: Good morning.
  C: […]
  E: Could you give me a cup of coffee?
  ……
• C: […]1 […]2 !3
  E: Good2 morning1 .3
• 4 […] [2003]

4.6
• monitor corpus: […]
• […] [Leech, 1991]
• 100 […]; 1990s: 1000 […]; 2000 […]
• GB13000.1 […] 1997.12.5
• GB12200.1-90 […] 01 […] 1993
• GB/T12200.2-94 […] 02 […] 1994
• GB13715 […]
• […] [2003]

4.7
Brown Corpus
• 1960s: Francis and Kucera, Brown University; about 1 million words
• 1961 […]
• 15 categories, 500 samples of about 2,000 words each
• 1970s: Greene and Rubin, TAGGIT: 81 […], 3300 […], 77 […]

LLC (London-Lund Corpus of Spoken English)
• 1960s: Quirk […]
• 2000 […]
• Lund: Svartvik […]
• The Survey of Spoken English (SSE)
• SSE, 1981: London-Lund Corpus of Spoken English (LLC)
• 87 […], 5000 […], 50 […]
• 5 […]

Longman Corpus
• Longman Corpus Committee
• January 1981 - November 1990
• […] informative 60 […], imaginative 40 […]
• 10 […]
• 2800 […]

UPenn TreeBank (Pennsylvania) (~treebank/home.html)
• M. Marcus
• 1993: 300 […]
• 2000: 10 […], 4185 […]
• Tagged sentence: […]/PN […]/AD […]/VV […]/CD […]/M […]/JJ […]/NN […]/CC […]/NN […]/NN […]/PU
• Bracketed structure of the same sentence:

(IP (NP-SBJ (PN))
    (VP (ADVP (AD))
        (VP (VV))
        (NP-OBJ (QP (CD)
                    (CLP (M)))
                (NP (NP (ADJP (JJ)
                              (NP (NN)))
                        (CC)
                        (NP (NN) (NN))))))
    (PU))

[Figure: tree diagram of the bracketed structure above]

[…] corpus
• 1998: 2600 […]
• Tagged sample: […]/r […]/ns […]/r […]/a […]/u […]/m […]/a […]/n […]/u […]/n […]/c […]/d […]/a […]/w […]/d […]/d […]/v […]/v […]/n […]/w […]/n […]/n […]/n […]/d […]/d […]/d […]/v […]/v […]/vn […]/c […]/vn […]/w

Sinica Corpus
• 500 […]
• 1) […] 2) […] 3) […]

Chinese LDC
• 973 […] G1998030504
• […] 2.5-3.0 […] 500 […] 100 […] ……

LC-STAR (NLPR-Nokia)
• 12 […]
• 100M words […] 2500 […]
• [statistics table: 19% […]; remaining figures not recoverable]
• 38142 […] (proper names): 22,156; 19,930; 15,618; 7521

4.8
WordNet
• Princeton University, George A. Miller
• 95,600 […]: 51,500 […], 44,100 […]; 70,100 […]
• WordNet: 4 […]
• {x1, x2, …} […] {y1, y2, …} […] R […]; {y1, y2, …} […] {x1, x2, …} […] R […]
• 4 […]: synonymy; antonymy; hypernymy/[…]; meronymy/[…]
• 25 […]: [25 synsets, contents lost]
• 21000 […], 8400 […], 14 […]
• 19500 […], 10000 […]
• WordNet “[…]”
• 3 […]: (#1) […] (#2) […]

[…] (HowNet)
• 1988 […]
  (1) […] NLP […]
  (2) […]
  (3) […]
  (4) […]
• Relations (names lost; surviving marker symbols shown):
  (a) […]
  (b) […]
  (c) […]
  (d) […]
  (e) […]-[…]
  (f) […]-[…]
  (g) […]-[…]
  (h) […]/[…]/[…]-[…]: *
  (i) […]/[…]/[…]-[…]: $
  (j) […]-[…]: *
  (k) […]-[…]: @
  (l) […]-[…]: @
  (m) […]-[…]
  (n) […]-[…]
  (o) […]-[…]
  (p) […]: #
• Sample records:

NO.=000001
W_C=[…]
G_C=V
E_C=~[…]
W_E=buy
G_E=V
E_E=
DEF=buy|[…]

NO.=015492
W_C=[…]
G_C=V
E_C=~[…]
W_E=knit
G_E=V
E_E=
DEF=weave|[…]

(HNC) Hierarchical Network of Concepts
  (1) […]
  (2) […]

Summary
• WordNet
• […] (HowNet)

Exercises
1. […]
2. […] UPenn TreeBank […]

Thanks
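The bracketed structure shown for the UPenn TreeBank in section 4.7 can be read mechanically. The following is a minimal sketch, not the TreeBank project's own tooling: `parse_brackets` and `preterminals` are hypothetical helper names, and the tree string is the label-only skeleton from the slides (the words themselves are missing from this copy). It checks that the lowest-level labels of the tree, read left to right, reproduce the tag sequence of the tagged sentence.

```python
import re

# Label-only skeleton of the bracketed tree from section 4.7 (words are absent).
TREE = ("(IP(NP-SBJ(PN))(VP(ADVP(AD))(VP(VV))(NP-OBJ(QP(CD)(CLP(M)))"
        "(NP(NP(ADJP(JJ)(NP(NN)))(CC)(NP(NN)(NN))))))(PU))")


def parse_brackets(text):
    """Parse a bracketed tree into nested (label, children) tuples."""
    tokens = re.findall(r"[()]|[^\s()]+", text)
    pos = 0

    def peek():
        if pos >= len(tokens):
            raise ValueError("unbalanced brackets")
        return tokens[pos]

    def node():
        nonlocal pos
        if peek() != "(":
            raise ValueError("expected '('")
        pos += 1
        label = peek()
        if label in ("(", ")"):
            raise ValueError("missing node label")
        pos += 1
        children = []
        while peek() != ")":
            if peek() == "(":
                children.append(node())       # nested constituent
            else:
                children.append(tokens[pos])  # a word, when one is present
                pos += 1
        pos += 1                              # consume the closing ')'
        return (label, children)

    tree = node()
    if pos != len(tokens):
        raise ValueError("unexpected text after the tree")
    return tree


def preterminals(tree):
    """Return the labels of the lowest-level nodes, left to right."""
    label, children = tree
    subtrees = [c for c in children if isinstance(c, tuple)]
    if not subtrees:
        return [label]
    labels = []
    for child in subtrees:
        labels.extend(preterminals(child))
    return labels


tree = parse_brackets(TREE)
print(" ".join(preterminals(tree)))  # PN AD VV CD M JJ NN CC NN NN PU
```

The same parser accepts trees that still carry their words, for example `(NP (NN dog))`, because any non-bracket token after the label is stored as a leaf.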
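The HowNet sample records in section 4.8 are sequences of KEY=VALUE fields (NO., W_C, G_C, E_C, W_E, G_E, E_E, DEF). The sketch below shows one way such a record could be loaded. It rests on assumptions: one field per line, a DEF value made of comma-separated `english|chinese` items, and an optional leading marker drawn from the symbols that survive in the relation list (`*`, `$`, `@`, `#`). The helper names `parse_record` and `split_def` are hypothetical, and the Chinese-language values are left empty because they are missing from this copy.

```python
# Record NO.=000001 from section 4.8, with the Chinese-language values left empty.
RECORD = """\
NO.=000001
W_C=
G_C=V
E_C=
W_E=buy
G_E=V
E_E=
DEF=buy|
"""

# Marker symbols that appear in the relation list of section 4.8.
MARKERS = ("*", "$", "@", "#")


def parse_record(text):
    """Parse one KEY=VALUE-per-line record into a dict (insertion-ordered)."""
    record = {}
    for line in text.strip().splitlines():
        line = line.strip()
        if not line:
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"not a KEY=VALUE line: {line!r}")
        record[key] = value
    return record


def split_def(definition):
    """Split a DEF value into (marker, english, chinese) triples.

    Assumes comma-separated items of the form [marker]english|chinese.
    """
    items = []
    for item in definition.split(","):
        marker = item[0] if item.startswith(MARKERS) else ""
        english, _, chinese = item[len(marker):].partition("|")
        items.append((marker, english, chinese))
    return items


rec = parse_record(RECORD)
print(rec["NO."], rec["W_E"], rec["G_E"], split_def(rec["DEF"]))
```

Keeping the record as a plain dict preserves the field order of the source, which makes it easy to write a record back out in the same layout.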