EFFICIENT MODE SELECTION FOR H.264 COMPLEXITY REDUCTION IN A BAYESIAN FRAMEWORK¹

M. Bystrom†, I. Richardson‡ and Y. Zhao‡
†Boston University, Boston, MA, USA
‡Robert Gordon University, Aberdeen, UK

ABSTRACT

In order to achieve a high compression ratio, the H.264/AVC standard has incorporated a large number of coding modes which must be evaluated during the coding process to determine the optimal rate-distortion tradeoff. The coding gains of H.264/AVC arise at the expense of significant coder complexity which may not be desired for mobile devices with limited battery life. One coder process that has been identified as having potential for achieving computation savings is the selection between skipping the coding of a macroblock and coding of the macroblock in one of the remaining coding modes. In low-motion subsequences, a large percentage of macroblocks are "skipped", that is, no coded data are transmitted for these macroblocks. By estimating which macroblocks are to be skipped during the coding process, significant savings in computation can be realized, since the coder then does not evaluate the rate-distortion costs of all candidate coding modes. In this work we place this skip versus code decision in a Bayesian framework. We use the rate-distortion cost difference between coding and skipping a macroblock as the single decision feature and determine an appropriate decision threshold following modeling of the cost difference's class-conditional PDFs. Finally, in order to further limit system complexity, we model the threshold's parameters as functions of application- and sequence-specific characteristics, namely, the quantization parameter and an activity factor. This results in a decision threshold that is only a function of these two characteristics, which are either known or easily measured. It is shown that this approach can result in a time savings of over 80% for low-motion sequences at a negligible decrease or, in certain cases, a slight increase in quality over a reference H.264 codec.

¹ This work was presented in part in Y. Zhao, M. Bystrom and I. Richardson, "A MAP Framework for Efficient Skip/Code Mode Decision in H.264", IEEE ICIP, Oct. 2006.

1. INTRODUCTION

1.1 Overview of H.264 Modes and Mode Selection

The ITU H.264 Advanced Video Coding standard [1] achieves significantly better compression than earlier standards, enabling high quality video on power-constrained handheld devices. However, compression is achieved at the expense of increased computational complexity [2]. In an H.264/AVC coded sequence, each macroblock (MB) can be coded in one of a large number of modes, many of which are typically evaluated before the appropriate coding mode is selected. For example, Table 1 summarizes the modes available for coding an MB using the H.264 Baseline Profile which supports Intra (I) and Inter (P) coded slices. These modes include a Skip mode, in which no further data is transmitted after the Skip indication, three classes of Intra coding modes, and four modes that use Inter prediction with up to four macroblock partitions. Each partition in an Inter predicted macroblock uses motion compensated prediction with a separate motion vector. The 8x8 partition size may be split further into one, two, or four sub-macroblock partitions, each with a separate motion vector. Further mode options are available in the Main and High Profiles. The optimum coding mode for a given macroblock depends on the statistics of the source video data, on the coding parameters, and on previous coding decisions.

Table 1: Summary of macroblock coding modes in I and P slices (H.264 Baseline Profile).

Mode                             Description
P_Skip                           No further data transmitted for this MB. A motion-compensated MB is reconstructed at the decoder using a motion vector predicted from neighboring MBs.
Intra_4x4                        Intra prediction of each 4x4 luma block. The prediction mode of each 4x4 block is signaled in the macroblock header.
Intra_16x16                      Intra prediction of complete macroblock. Twenty-four versions of this mode, depending on prediction and coding choices, are available.
I_PCM                            Direct transmission of coded image samples.
P_L0_16x16                       Inter prediction with one 16x16 luma partition.
P_L0_L0_16x8                     Inter prediction with two 16x8 luma partitions.
P_L0_L0_8x16                     Inter prediction with two 8x16 luma partitions.
P_8x8 and P_8x8ref0 (2 modes)    Inter prediction with four 8x8 luma partitions. Further sub-macroblock partitions (8x8, 8x4, 4x8 or 4x4) are signaled for each 8x8 partition.

In order to achieve good rate-distortion performance, the Rate Distortion Optimized (RDO) mode selection process [3] evaluates the distortion and rate of each candidate mode prior to selecting the mode for the current MB. In the Joint Model (JM) reference encoder, this is carried out by coding the macroblock in each of the possible modes and choosing the mode that minimizes a rate-distortion cost function [4]. The rate (R) and distortion (D) corresponding to each candidate mode are calculated using the process shown in Fig. 1. The source MB is encoded using intra or inter prediction, a forward transform, quantization, and source coding, to produce a sequence of R bits, where R indicates the rate associated with this particular candidate mode. The quantized coefficients are rescaled, inverse transformed and reconstructed, and the distortion, D, namely, the Sum of Squared Differences (SSD) between the source and reconstructed MBs, is calculated. The rate-distortion cost is calculated according to J = D + λR, where λ is a Lagrange multiplier. This is repeated for each candidate mode as illustrated by the following pseudocode:

Initialize best_mode and best_cost
For each candidate mode i {
    Calculate D and R and coded bitstream (as per Fig. 1)
    Calculate mode cost J_i = D + λR
    If J_i < best_cost {
        best_cost = J_i
        best_mode = i
        store coded bitstream for mode i
        store coding state for mode i (motion vectors, pa