spark-graphx: Spark Graph Computation


UC BERKELEY

GraphX: Graph Analytics in Spark
Ankur Dave, Graduate Student, UC Berkeley AMPLab
Joint work with Joseph Gonzalez, Reynold Xin, Daniel Crankshaw, Michael Franklin, and Ion Stoica


Machine Learning Landscape

[Figure: chart with axes "Model & Dependencies" and "Architecture". Labels in the first build: Large & Dense, Parameter Server, Graph-Parallel, Sparse, Small & Dense, MapReduce. Labels in the second build: Large & Dense, Parameter Server, Sparse, Small & Dense, Spark Dataflow Framework, GraphX.]


Graphs

• Social Networks
• Web Graphs
• User-Item Graphs


Graph Algorithms

• PageRank
• Triangle Counting
• Collaborative Filtering


Collaborative Filtering

[Figure: a Users x Products ratings matrix approximated by the product of a Users factor matrix f(i) and a Products factor matrix f(j).]

[Figure: bipartite graph of User Factors and Product Factors, with vertices f(1) to f(5) and rating edges r13, r14, r24, r25.]

$$f[i] = \arg\min_{w \in \mathbb{R}^d} \sum_{j \in \mathrm{Nbrs}(i)} \left( r_{ij} - w^T f[j] \right)^2 + \lambda \|w\|_2^2$$


The Graph-Parallel Pattern

[Figure only; no text survives.]


Many Graph-Parallel Algorithms

Collaborative Filtering
» Alternating Least Squares
» Stochastic Gradient Descent
» Tensor Factorization

Structured Prediction
» Loopy Belief Propagation
» Max-Product Linear Programs
» Gibbs Sampling

Semi-supervised ML
» Graph SSL
» CoEM

Community Detection
» Triangle-Counting
» K-core Decomposition
» K-Truss

Graph Analytics
» PageRank
» Personalized PageRank
» Shortest Path
» Graph Coloring

Classification
» Neural Networks


Modern Analytics

[Figure: analytics pipeline over raw Wikipedia XML. Stages: Raw Wikipedia (XML), Hyperlinks, PageRank, Top 20 Pages (columns: Title, PR), Link Table (Title, Link), Editor Table (Editor, Title), Editor Graph, Community Detection, User Community (User, Com.), Top Communities (Com., PR). Successive builds of the slide are titled "Tables" and "Graphs".]


The GraphX API


Property Graphs

Vertex Property:
• User Profile
• Current PageRank Value

Edge Property:
• Weights
• Relationships
• Timestamps


Creating a Graph (Scala)

    import org.apache.spark.graphx._
    import org.apache.spark.rdd.RDD

    type VertexId = Long

    // Vertex table: (id, property) pairs
    val vertices: RDD[(VertexId, String)] =
      sc.parallelize(List(
        (1L, "Alice"),
        (2L, "Bob"),
        (3L, "Charlie")))

    // Simplified view of the library's Edge type (shown for reference)
    class Edge[ED](
      val srcId: VertexId,
      val dstId: VertexId,
      val attr: ED)

    // Edge table: source id, destination id, edge property
    val edges: RDD[Edge[String]] =
      sc.parallelize(List(
        Edge(1L, 2L, "coworker"),
        Edge(2L, 3L, "friend")))

    val graph = Graph(vertices, edges)

[Figure: graph with vertices 1 (Alice), 2 (Bob), 3 (Charlie) and edges "coworker" (1 to 2) and "friend" (2 to 3).]


Graph Operations (Scala)

    class Graph[VD, ED] {
      // Table Views -----------------------
      def vertices: RDD[(VertexId, VD)]
      def edges: RDD[Edge[ED]]
      def triplets: RDD[EdgeTriplet[VD, ED]]

      // Transformations -------------------
      def mapVertices[VD2](f: (VertexId, VD) => VD2): Graph[VD2, ED]
      def mapEdges[ED2](f: Edge[ED] => ED2): Graph[VD, ED2]
      def reverse: Graph[VD, ED]
      def subgraph(epred: EdgeTriplet[VD, ED] => Boolean,
                   vpred: (VertexId, VD) => Boolean): Graph[VD, ED]

      // Joins -----------------------------
      def outerJoinVertices[U, VD2]
          (tbl: RDD[(VertexId, U)])
          (f: (VertexId, VD, Option[U]) => VD2): Graph[VD2, ED]

      // Computation -----------------------
      def mapReduceTriplets[A](
          sendMsg: EdgeTriplet[VD, ED] => Iterator[(VertexId, A)],
          mergeMsg: (A, A) => A): RDD[(VertexId, A)]


Built-in Algorithms (Scala)

      // Continued from previous slide
      def pageRank(tol: Double): Graph[Double, Double]
      def triangleCount(): Graph[Int, ED]
      def connectedComponents(): Graph[VertexId, ED]

      // ...and more: org.apache.spark.graphx.lib
    }

[Figure: illustrations labeled PageRank, Triangle Count, Connected Components.]


The triplets view

    class Graph[VD, ED] {
      def triplets: RDD[EdgeTriplet[VD, ED]]
    }

    class EdgeTriplet[VD, ED](
      val srcId: VertexId, val dstId: VertexId, val attr: ED,
      val srcAttr: VD, val dstAttr: VD)

[Figure: the Graph (1 Alice, 2 Bob, 3 Charlie; edges coworker, friend) mapped by "triplets" to an RDD with the rows below.]

    srcAttr    attr        dstAttr
    Alice      coworker    Bob
    Bob        friend      Charlie


The subgraph transformation

    class Graph[VD, ED] {
      def subgraph(epred: EdgeTriplet[VD, ED] => Boolean,
                   vpred: (VertexId, VD) => Boolean): Graph[VD, ED]
    }

    // Keep only the edges whose property is not "relative"
    graph.subgraph(epred = (edge) => edge.attr != "relative")

[Figure: input graph with vertices Alice, Bob, Charlie, David and edges relative, friend, coworker, relative. After "subgraph", the same four vertices remain with only the friend and coworker edges.]


The subgraph transformation

    class Graph[VD, ED] {
      def subgraph(epred: EdgeTriplet[VD, ED] => Boolean,
                   vpred: (VertexId, VD) => Boolean): Graph[VD, ED]
    }

    // Keep only the vertices whose name is not "Bob"
    graph.subgraph(vpred = (id, name) => name != "Bob")

[Figure: input graph with vertices Alice, Bob, Charlie and an edge label beginning "rela"; the source text is truncated here.]
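
The triplets view and the two subgraph calls above can be tried without a cluster. The following is a minimal sketch on plain Scala collections, not the GraphX implementation: LocalGraph and SubgraphSketch are hypothetical names, the vertex id 4 for David and the endpoints of the "relative" edge are assumptions (the slides do not give them), and the rule that an edge survives only when both of its endpoints pass vpred mirrors how GraphX documents subgraph.

```scala
object SubgraphSketch {
  type VertexId = Long

  // Local stand-ins for the GraphX types named on the slides.
  final case class Edge[ED](srcId: VertexId, dstId: VertexId, attr: ED)
  final case class EdgeTriplet[VD, ED](
      srcId: VertexId, dstId: VertexId, attr: ED, srcAttr: VD, dstAttr: VD)

  // Hypothetical in-memory graph; GraphX itself stores both tables as RDDs.
  final case class LocalGraph[VD, ED](
      vertices: Map[VertexId, VD], edges: List[Edge[ED]]) {

    // Join each edge with the properties of its two endpoints.
    def triplets: List[EdgeTriplet[VD, ED]] =
      edges.map { e =>
        EdgeTriplet(e.srcId, e.dstId, e.attr, vertices(e.srcId), vertices(e.dstId))
      }

    // Keep vertices passing vpred, and edges that pass epred
    // and whose two endpoints were both kept.
    def subgraph(
        epred: EdgeTriplet[VD, ED] => Boolean = (t: EdgeTriplet[VD, ED]) => true,
        vpred: (VertexId, VD) => Boolean = (id: VertexId, attr: VD) => true
    ): LocalGraph[VD, ED] = {
      val keptVertices = vertices.filter { case (id, attr) => vpred(id, attr) }
      val keptEdges = triplets.collect {
        case t if keptVertices.contains(t.srcId) &&
                  keptVertices.contains(t.dstId) && epred(t) =>
          Edge(t.srcId, t.dstId, t.attr)
      }
      LocalGraph(keptVertices, keptEdges)
    }
  }

  // Vertices 1 to 3 and the first two edges come from the slides;
  // David's id and the "relative" edge endpoints are assumed.
  val graph: LocalGraph[String, String] = LocalGraph(
    Map(1L -> "Alice", 2L -> "Bob", 3L -> "Charlie", 4L -> "David"),
    List(
      Edge(1L, 2L, "coworker"),
      Edge(2L, 3L, "friend"),
      Edge(3L, 4L, "relative")))

  val noRelatives: LocalGraph[String, String] =
    graph.subgraph(epred = t => t.attr != "relative")

  val withoutBob: LocalGraph[String, String] =
    graph.subgraph(vpred = (id, name) => name != "Bob")

  def main(args: Array[String]): Unit = {
    graph.triplets.foreach(t => println(s"${t.srcAttr} -[${t.attr}]-> ${t.dstAttr}"))
    println(noRelatives.edges)
    println(withoutBob.edges)
  }
}
```

In this sketch, dropping Bob also drops the coworker and friend edges, because each of them has Bob as an endpoint; only the assumed Charlie to David edge remains.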
