2013 - BMMB 597D: Analyzing Next Generation Sequencing Data
Week 4, Lecture 8
István Albert
Bioinformatics Consulting Center, Penn State

Lecture Data
• Download and unpack the data for this lecture; it contains quite a few files
• Nucleotide sequences and protein sequences
• Some files hold a single sequence, others hold multiple sequences (records)

Blast Bit Scores (S)
A bit score corresponds to alignment quality: the higher the score, the better the alignment.
The formula uses a scoring matrix:
• similar/identical residues → increase the score
• gaps, non-similar residues → decrease the score
(The bit score is also normalized from the alignment score.)

Blast E (expect) value
• The number of hits one can expect to see by chance when searching a database of a particular size.
E = m * n * 2^(-S)
m, n are the sequence sizes, S is the bit score

Reminder on DNA to Protein translation
Each frame of translation usually leads to a different protein

List of BLAST+ programs

blastn - nucleotide vs nucleotide
• EST: an expressed sequence tag (EST) is a short sub-sequence of a cDNA sequence
• Find the best matches of a single EST nucleotide sequence (read) against a nucleotide-based EST collection
• You may use different search tasks (optimizations) within blastn, named (confusingly) blastn and megablast

Orientation

Understanding a blast report
We access a sequence reported in the blast report with the blastdbcmd command

Fine tuning blastn searches
Four different tasks are supported in blastn:
1. megablast - for very similar sequences (e.g., differences due to sequencing errors)
2. dc-megablast - typically used for inter-species comparisons
3. blastn - the traditional program used for inter-species comparisons
4. blastn-short - optimized for sequences shorter than 30 nucleotides

blastp - protein vs protein
• Find the best alignments of the gamma 2 globin protein against a list of globin proteins

blastx - nucleotide vs protein
• Looks for matches on both strands and 3 reading frames → 6 translations
• Align the nucleotide sequence of the fugu globin against the globin list

tblastn - protein vs nucleotide
• Align the protein sequence of the gamma 2 globin against the EST database

tblastx - nucleotide vs nucleotide (but aligned as proteins)
• Compare coding regions between more distant organisms
• Example: having only nucleotide sequences, compare the coding regions of a chicken with those of a fugu

Pairwise alignments
• Previously called bl2seq
• Folded into the individual tools under the parameters: -subject, -query

Homework 8
1. You may use the data provided with this lecture
2. Align a nucleotide sequence against a protein database. Extract the nucleotide sequence of the query that corresponds to the worst local alignment. (Hint: construct a blast database for the query, then use blastdbcmd.)
3. Extract and compare the nucleotide sequences that correspond to the subject and the query of a local alignment obtained via tblastx
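
Supplementary sketch for the E-value slide: the formula E = m * n * 2^(-S) can be tried out directly to see how quickly the expected number of chance hits falls as the bit score grows. This is a minimal illustration only. The function name `expected_hits` and the example sizes are hypothetical, and it is assumed here that m is the query length and n the database length; BLAST itself applies further length corrections that are not modeled.

```python
def expected_hits(m, n, bit_score):
    """Return E = m * n * 2^(-S) for sequence sizes m, n and bit score S."""
    return m * n * 2.0 ** (-bit_score)


# Hypothetical sizes, chosen only for illustration:
# a 500-base query searched against a 1,000,000-base database.
query_len = 500
db_len = 1_000_000

for s in (20, 40, 60):
    e = expected_hits(query_len, db_len, s)
    print(f"bit score {s}: E = {e:.3g}")
```

Because S appears in the exponent, each additional bit of score halves the E-value, while doubling the database size doubles it.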