Multi-protease Approach for the Improved Identification and Molecular Characterization of Small Proteins and Short Open Reading Frame-Encoded Peptides

pubs.acs.org/jpr | Article

Philipp T. Kaulich, Liam Cassidy, Jürgen Bartel, Ruth A. Schmitz, and Andreas Tholey*

Cite This: J. Proteome Res. 2021, 20, 2895–2903
https://doi.org/10.1021/acs.jproteome.1c00115
Received: February 9, 2021
Published: March 24, 2021
© 2021 The Authors. Published by American Chemical Society

ABSTRACT: The identification of proteins below approximately 70–100 amino acids in bottom-up proteomics is still a challenging task due to the limited number of peptides generated by proteolytic digestion. This includes the short open reading frame-encoded peptides (SEPs), which are a subset of the small proteins that were not previously annotated or that are alternatively encoded. Here, we systematically investigated the use of multiple proteases (trypsin, chymotrypsin, LysC, LysargiNase, and GluC) in GeLC–MS/MS analysis to improve the sequence coverage and the number of identified peptides for small proteins, with a focus on SEPs, in the archaeon Methanosarcina mazei. Combining the data of all proteases, we identified 63 small proteins and an additional 28 SEPs with at least two unique peptides, while only 55 small proteins and 22 SEPs could be identified using trypsin only. For 27 small proteins and 12 SEPs, complete sequence coverage was achieved. Moreover, for five SEPs, incorrectly predicted translation start points or potential in vivo proteolytic processing were identified, confirming the data of a previous top-down proteomics study of this organism. The results show clearly that a multi-protease approach improves the identification and molecular characterization of small proteins and SEPs. LC–MS data: ProteomeXchange PXD023921.

KEYWORDS: alternative open reading frames, bottom-up, LC–MS, peptidomics, small open reading frames, sORF, smORF, terminomics

■ INTRODUCTION

Short open reading frames (sORFs) have long been overlooked in genome annotation due to their small size and possible localization in uncommon regions of the genome, e.g., in non-coding sequences, untranslated regions, or out-of-frame within large open reading frames (ORFs).[1] Improved ORF prediction algorithms have led to an enormous number of predicted sORFs. For example, thousands of highly conserved sORFs have been discovered across various bacteria in the human microbiome.[2]

In the past few years, sORF gene products, called short open reading frame-encoded peptides (SEPs) or microproteins, have been identified in eukaryotes,[3] prokaryotes,[4,5] and archaea.[6] Some SEPs have been functionally characterized and fulfill important functions within cells, e.g., in cell division, protein folding, membrane transport, or embryonal development.[4,5]

In the literature, the definition of SEP is not consistent and, e.g., 50, 70, or 100 amino acids are considered as the upper limit for length.[4,7,8] In this study we defined small proteins according to the intermediate value of up to 70 amino acids (aa) in length, which included the vast majority of predicted SEPs from our model organism while not extending the range too much.

The identification of small proteins, including SEPs, by traditional methods of bottom-up proteomics is challenging because these methods have an inherent bias against small proteins. Thus, small proteins are underrepresented in most proteomics studies. Adapted methodologies have been developed for the enrichment of small or the depletion of large proteins, such as gel-based methods,[9,10] organic solvent depletion methods,[11] solid-phase extraction,[8,12,13] molecular weight cut-off filters,[14] size exclusion chromatography,[15] or adapted protein extraction methods.[16]

Moreover, compared to larger proteins, naturally, only a limited number of peptides can be generated by digesting small proteins. In bottom-up proteomics, a protein is usually considered identified when at least two peptides have been detected where one is proteotypic.[17] This strict criterion, however, is difficult to apply to small proteins and therefore, the identification based on only one proteotypic peptide is often accepted in SEP studies[18] despite the greater risk of it being a false positive. Besides extensive validation experiments, the validity of such identifications should also be confirmed by a manual inspection of those spectra. Adapted criteria for these "one-hit wonders" have been established, such as ion-series-based criteria, which are based on the number of consecutive fragment ions in MS2 spectra.[18] Recently, peptide-centric search engines[19] have been used to validate peptides identified by only a single peptide spectral match to ensure that the spectrum cannot be better explained by another modified peptide.[10,20]

However, more stringent identification criteria and improved sequence coverage are mandatory for this emerging field to reduce false-positive identification of SEPs. The predominantly used protease in bottom-up proteomics is trypsin. Due to the hydrolysis C-terminal of the basic amino acids lysine and arginine, the generated peptides are primarily doubly charged. However, peptides formed from the protein C-terminus are often singly charged and therefore underrepresented in proteomics studies.[21] Moreover, sequence regions with many, or without any, lysine or arginine residues cannot be covered using trypsin. Thus, the use of alternative proteases was investigated to overcome these limitations and increase the sequence coverage.[22] For example, the protease LysargiNase hydrolyses peptide bonds N-terminal of lysine and arginine, acting as a mirror protease of trypsin, and is therefore more suitable for C-terminal peptide identification.[21] A significant improvement of sequence coverage and number of peptides per protein was reported using alternative proteases in addition to trypsin (e.g., chymotrypsin, LysC, LysN, AspN, GluC, ArgC, and LysargiNase), especially for lowly abundant proteins.[22,23] Previously, it was shown that a combination of multiple proteases can increase the number of proteotypic peptides for proteins with a size below 25 kDa.[24] Bartel et al. used alternative proteases to trypsin, which reduced the spectral count for larger proteins and led to an improved identification of small proteins.[13] Moreover, the application of multiple proteases has been established for several specific issues, e.g., phosphoproteomics,[25] cross-linking mass spectrometry,[26] C-terminomics,[27] or paleoproteomics.[28] Several tools for the in silico prediction of a suitable protease have been introduced.[29,30] However, due to factors such as ionization efficiency or chromatographic co-elution, in silico-predicted peptides are not necessarily also identifiable in a real-life wet-lab experiment.

Here, we investigated the application of multiple proteases for the analysis of small proteins (≤70 aa), focusing on SEPs. We used the most common proteases in proteomics: trypsin (hydrolysis C-terminal to lysine and arginine), LysargiNase (N-terminal to lysine and arginine), chymotrypsin (C-terminal to phenylalanine, tryptophan, leucine, and tyrosine), GluC (C-terminal cleavage to glutamic acid in ammonium bicarbonate buffer), and LysC (C-terminal to lysine). The goal of the study is the identification of more peptides per small protein to apply stricter identification criteria and to increase the sequence coverage of small proteins and SEPs in the archaeon Methanosarcina mazei. For this, we used SDS-PAGE to enrich small proteins and applied different enzymes for in-gel digestion.[10]

■ MATERIAL AND METHODS

Chemicals

Trypsin, chymotrypsin, and GluC were from Promega (Madison, WI, USA), LysC from Wako Pure Chemical Corporation (Japan), and LysargiNase from Proteolysis Laboratory (Spanish Research Council, Barcelona, Spain). The cOmplete EDTA-free protease inhibitor cocktail was purchased from Roche (Penzberg, Germany). All other chemicals were from Sigma-Aldrich (Steinheim, Germany). Deionized water (18.2 MΩ·cm) was prepared by an arium 611VF system (Sartorius, Göttingen, Germany).

Samples

Cultivation of M. mazei was performed as described previously.[31] In brief, the archaeon was cultivated under anaerobic conditions at 37 °C (150 mM methanol, 40 mM acetate) under nitrogen starvation conditions (80% N2, 20% CO2). Cells were harvested (30 min, 3100 × g, 4 °C) when a turbidity at 600 nm of 0.6 was reached and washed with 100 mM triethylammonium bicarbonate (TEAB, pH 8.5). Cell lysis was performed in lysis buffer (10 mM Tris (pH 8.8), 1% SDS, cOmplete EDTA-free protease inhibitor cocktail) via freeze–thaw cycling and ultrasonic homogenizing (Sonoplus HD 2070, Bandelin, Berlin, Germany). After centrifugation (21,100 × g, 4 °C), the protein concentration of the supernatant was determined by the Pierce BCA protein assay kit (Thermo Fisher Scientific, Bremen, Germany).

In-Gel Digestion

In-gel digestion was performed as described earlier with slight modifications.[10] In brief, 50 μg aliquots of the M. mazei proteome were used for separation via discontinuous SDS-PAGE (4% stacking gel, 16% resolving gel, 0.1 cm thick). The samples were mixed 1:1 (v/v) with reducing Laemmli buffer (62 mM Tris–HCl (pH 6.8), 25% glycerin (v/v), 5% β-mercaptoethanol (v/v), 2% SDS (w/v), 0.01% bromophenol blue (w/v)) and separated using a Mini-PROTEAN Tetra Cell (Bio-Rad, Hercules, USA) in a Tris-glycine running buffer (0.3% Tris, 1.44% glycine, 0.1% SDS (w/v)). A constant voltage was applied (40 V, 15 min and 120 V, ∼80 min) until the running front reached the end of the gel. For each protease, three SDS-PAGE lanes were analyzed.

After electrophoresis, the gel was fixed for 2 h in a fixing solution (30% methanol, 10% acetic acid in MilliQ water (v/v)), changing the solution twice. Notably, no further protein staining was performed to ensure maximal recovery of small proteins.[10] Using a pre-stained protein marker, three bands below 20 kDa (Figure S1) were cut out and washed three times with 200 μL of 100 mM ammonium bicarbonate (ABC) (pH 8). The proteins were reduced with 10 mM dithiothreitol (200 μL, 1 h, 56 °C) and alkylated with 55 mM iodoacetamide (200 μL, 30 min, room temperature in the dark). The gel bands were washed with 100 mM ABC and 100% acetonitrile (200 μL each, 15 min, room temperature) and dried by vacuum centrifugation for 10 min (45 °C).

Digestion with the different proteases was performed separately on different gel lanes but in parallel to corresponding gel sections. The proteases (20 ng/μL in 50 mM TEAB) were added to the gel pieces (trypsin, LysC, chymotrypsin, and GluC: 100 ng; activated LysargiNase: 300 ng), and the volume was filled up to 100 μL with 50 mM ABC and 5% ACN. Digestion was performed overnight (16 h) at 37 °C. The peptides were eluted with 150 μL each of 60% ACN plus 1% TFA, and 100% ACN (15 min, 37 °C). After drying via vacuum centrifugation, the peptides were resuspended in 15 μL of the loading buffer (3% ACN, 0.1% TFA). For LC–MS/MS measurements, 5 μL was injected.

LC–MS/MS Analysis

A Dionex U3000 UHPLC system (Thermo, Dreieich, Germany), equipped with an Acclaim PepMap 100 column (2 μm, 75 μm × 500 mm) and coupled online to a Q Exactive Plus Orbitrap MS (Thermo, Bremen, Germany), was used. The peptides were separated across a 2 h gradient at a flow rate of 300 nL/min: 4% B for 2 min followed by a linear gradient to 20% B over 80 min and a linear gradient to 40% B over 40 min, an 8 min linear increase to 90% B, and 10 min at 90% B. Inter-run equilibration of the column was performed for 10 min at 4% eluent B. Eluent A was 0.05% formic acid (FA) and eluent B 80% ACN, 0.04% FA. The MS acquisition program consisted of a full-scan MS (resolution: 70,000, automatic gain control (AGC) target: 3e6, maximal injection time (IT): 50 ms) with the top 15 MS/MS acquisition of the most intense ions using a 1.4 m/z isolation window (resolution: 17,500, AGC target: 1e5, maximal IT: 100 ms). Ions of unassigned, +1, and >+6 charge states were excluded. For fragmentation, higher-energy collisional dissociation (HCD) was utilized with a normalized collision energy (NCE) of 27.5. Dynamic exclusion (30 s) and lock mass (445.12003 m/z) were enabled. All LC–MS data have been deposited to the ProteomeXchange Consortium via the PRIDE partner repository with the dataset identifier PXD023921.[32]

Database Search

The database search was performed with the Proteome Discoverer software package (V2.4; Thermo, Germany). The raw data of the different proteases were searched individually using the SequestHT algorithm node against a database containing the genome-derived proteome of the M. mazei strain Gö1 (UniProt ID: UP000000595, 12/09/2019), a list of predicted SEPs,[33,34] and common contaminants (cRAP). For database searches, we named the SEPs with arbitrary accession numbers that are assigned to the gene identifiers in Table S1. The precursor mass tolerance was 10 ppm. The fragment mass tolerance was 0.02 Da. The number of maximal allowed missed cleavages was optimized by multiple database searches and finally set to 2 (trypsin and LysC), 3 (GluC), or 5 (LysargiNase and chymotrypsin). Carbamidomethylation on cysteine was set as a static modification. The variable modifications were as follows: oxidation of methionine; deamidation of asparagine, glutamine, and arginine; cleavage of the start methionine; and N-terminal acetylation with and without methionine truncation. For the calculation of false discovery rates, Percolator was used.

Two different identification criteria were used: (i) strict criteria involving the identification of two high-confidence (FDR: <1%) peptides where one is proteotypic and (ii) ion series-based criteria involving the identification of at least one high-confidence unique peptide whose MS2 spectrum contains at least five consecutive b- or five consecutive y-ions.[18]

■ RESULTS AND DISCUSSION

To enrich the small proteins of M. mazei, the proteome was first separated via SDS-PAGE. Previously, we have shown that the analysis of gel bands corresponding to higher-molecular-weight proteins is advantageous for the identification of small proteins due to an unexpected running behavior.[10] Therefore, we excised three bands below 20 kDa (Figure S1) while focusing on proteins below 70 aa. After reduction and alkylation, the proteins were digested in-gel with different proteases (trypsin, LysargiNase, chymotrypsin, GluC, or LysC), and the extracted peptides were analyzed via LC–MS/MS (GeLC). All experiments were performed in triplicate.

In total, using the strict criteria, most protein and peptide groups were identified after tryptic digestion followed by LysC, chymotrypsin, GluC, and LysargiNase (Table S2). The high number of identified peptides after trypsin digestion compared with the other proteases is in agreement with earlier reports.[21–23] Merging all data, 2079 protein groups with 43,666 peptides were identified.

The molecular weight distribution of the identified proteins shows a maximum up to 20 kDa, but many larger proteins were also identified (Figure S2A). This is in agreement with previous studies[10,35] and is caused by two factors. First, some proteins can show unexpected migration behaviors on SDS-PAGE, for example, due to extreme amino acid compositions. Second, it is not straightforward to discriminate between different proteoforms, i.e., truncated and full-length proteins, which is still a severe limitation of bottom-up proteomics.[36] The distribution of the sequence coverages for different protein sizes shows that for bigger proteins, lower sequence coverages were obtained (Figure S2B). Therefore, it is assumed that the vast majority of the identified peptides from larger proteins (>20 kDa) are most likely generated from proteolytically processed forms.[10]

The distribution of the number of identified b- and y-ions in all MS/MS spectra in the different datasets is shown in Figure S3. As expected, the spectra of LysC and tryptic peptides display many y-ions due to the C-terminal positively charged amino acid. Conversely, the spectra of peptides generated by LysargiNase have mainly b-ions due to the N-terminal positively charged amino acid.[21] The spectra of GluC- and chymotrypsin-generated peptides showed a slight preference toward y-ions.

As an example, the MS2 spectrum of the C-terminal peptide from the SEP A01004 identified after chymotrypsin digestion is shown in Figure S4. Due to the high number of positively charged amino acids within the sequence, an almost complete series of both b- and y-ions was identified.

Identified Small Proteins

In this study, we defined small proteins as those containing 70 aa or fewer.[7] In bottom-up proteomics, compared to larger proteins, the identification of small proteins is hampered by the lower number of generated peptides. Therefore, here, in addition to the strict identification rule (two peptides, of which one is proteotypic), an ion series-based criterion for the identification of small proteins/SEPs has been applied, in which proteins passed if at least one unique peptide with five consecutive b- or five consecutive y-ions in the MS/MS spectrum was identified.[18] If not otherwise stated, we report small proteins and SEPs with an encoded protein size less than or equal to 70 aa, applying both criteria.

Most small proteins and SEPs (≤70 aa) were identified after digestion with trypsin followed by digestion with GluC, LysC, chymotrypsin, and LysargiNase (Table 1). The numbers approximately correlate with the number of proteins identified in total. Merging all data, 92 small proteins (24 SEPs included) were identified using the ion series-based criteria, of which 83 (20 SEPs included) also met the strict identification criteria. In the online accessible UniProt database of M. mazei (taxon identifier: 192952; without SEPs), 133 small proteins (≤70 aa) are listed. Therefore, 51 and 47% of these proteins were identified in this study using the ion series-based and strict identification criteria, respectively. Moreover, it should be taken into account that not all proteins are expected to be expressed under a single biological condition.

Table 1. Number of Identified Small Proteins (≤70 aa) and SEPs Using the Ion Series-Based and Strict Criteria (Ion Series-Based/Strict)

protease        # of small proteins   # of SEPs
trypsin         83/70                 27/22
LysC            63/47                 21/14
GluC            63/50                 17/14
chymotrypsin    53/50                 18/14
LysargiNase     38/36                 13/11
combined        92/83                 32/28

The overlap of the identifications is shown in Figure 1. Using the ion series-based criteria, in total nine, and under strict criteria, five proteins were exclusively identified with proteases other than trypsin. For example, the 70 aa-length protein Q8Q0J9 was identified after both chymotrypsin and GluC digestion with four unique peptides but not after trypsin digestion. The protein contains four lysine residues, of which two each are located in the N-terminal and C-terminal region of the protein, respectively. Thus, after in silico trypsin digestion, no identifiable peptide can be generated as the peptides are either too small and unspecific or too large, which could hamper the fragmentation efficiency as well as selection for MS/MS fragmentation (filter rejects: >+6 analytes). Furthermore, while possible, common search algorithms, such as SEQUEST, are not optimized for large peptide MS/MS analysis.[37]

Figure 1. Overlap of the identified SEPs after digestion with multiple proteases using (A) ion series-based and (B) strict criteria.

The value of the multi-protease approach was highlighted in the identification of eight proteins (Table S3), which were only identified according to the strict criteria (two unique peptides) when data from the various proteases were combined. These examples show that the use of multiple proteases is suitable for increasing the confidence of identification for small proteins/SEPs under the strict criteria, i.e., improving situations where trypsin alone is only capable of producing one peptide for possible identification.

Improved Number of Proteotypic Peptides per Protein and Sequence Coverage

For almost all identified small proteins, the number of peptides per protein, and thus their sequence coverage and confidence of identification, was increased using the multi-protease approach. The number of unique peptides per protein of the individual protease and combined (multi-protease) data is shown in Figure 2A. The lowest numbers of peptides per protein were identified following GluC and LysC digestions, which can be explained by the lower number of cleavage sites (C-terminal of only Glu and Lys, respectively) compared to the other proteases (trypsin, LysargiNase, and chymotrypsin). For the individual datasets, the number of proteotypic peptides per small protein ranged between 3 (LysC) and 8 (LysargiNase); after trypsin digestion, an average of 6 proteotypic peptides per protein was identified. When combining all datasets, the number of peptides per protein was increased up to an average of 15. Remarkably, five small proteins were identified with more than 50 unique peptides (Table S3).

Most of the identified peptides from the different protease digestions delivered complementary information (Figure S5). For example, the SEP A00191 was identified with 7 and 13 unique peptides after trypsin and chymotrypsin digestions, respectively. While peptides generated from trypsin digestion could not provide any information in regard to the C- and N-termini, the peptides produced following chymotrypsin digestion provided exactly this sequence information and resulted in 100% sequence coverage.

The number of identified canonical N- and C-termini of the small proteins (≤70 aa) in the different datasets is listed in Table S4. As expected, the number of identified C-termini is reduced using trypsin (30 using strict criteria) compared to the other proteases. Interestingly, in particular after chymotrypsin digestion, many C-terminal peptides were identified (41). One example is the SEP A00677, which only contains one lysine and no arginine. After tryptic digestion, only the N-terminal peptide was identified, and after chymotrypsin digestion, only the C-terminal peptide was identified (Figure S5), resulting in an overall sequence coverage of 100%.
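The complementarity argument above (a protein such as Q8Q0J9 that yields no usable tryptic peptide but is covered by other proteases) can be illustrated with a small in silico digestion. The following Python sketch is only an illustration under stated assumptions and not the authors' pipeline: the cleavage rules are the simplified ones quoted in the text (no proline rule), the missed-cleavage limits are those from the Database Search settings, the 7 to 30 residue window for a "detectable" peptide is an assumed placeholder, and the sequence `TOY` is invented.

```python
# Simplified cleavage rules as described in the text (no proline rule).
# side "C": the bond after the residue is cut; side "N": the bond before it is cut.
RULES = {
    "trypsin": ("KR", "C"),
    "LysargiNase": ("KR", "N"),
    "chymotrypsin": ("FWLY", "C"),
    "GluC": ("E", "C"),
    "LysC": ("K", "C"),
}

# Maximum missed cleavages per protease, taken from the Database Search settings.
MISSED = {"trypsin": 2, "LysC": 2, "GluC": 3, "LysargiNase": 5, "chymotrypsin": 5}


def digest(seq, protease):
    """Return half-open (start, end) spans of all peptides, including missed cleavages."""
    residues, side = RULES[protease]
    cuts = {0, len(seq)}
    for i, aa in enumerate(seq):
        if aa in residues:
            cuts.add(i + 1 if side == "C" else i)
    bounds = sorted(cuts)
    spans = []
    for i in range(len(bounds) - 1):
        last = min(i + 1 + MISSED[protease], len(bounds) - 1)
        for j in range(i + 1, last + 1):
            spans.append((bounds[i], bounds[j]))
    return spans


def covered_fraction(seq, proteases, min_len=7, max_len=30):
    """Fraction of residues inside at least one peptide of assumed detectable length.

    min_len and max_len are illustrative assumptions, not values from the study.
    """
    covered = set()
    for protease in proteases:
        for start, end in digest(seq, protease):
            if min_len <= end - start <= max_len:
                covered.update(range(start, end))
    return len(covered) / len(seq)


# Hypothetical 49-residue sequence: lysines only near both termini, none in between.
TOY = "MKAK" + "GSELDAFTNEVSWDPLQEAYTGDFSVNELTAGYDSIEPFQ" + "KLAKE"

if __name__ == "__main__":
    for combo in (["trypsin"], ["chymotrypsin"], ["trypsin", "chymotrypsin", "GluC"]):
        print(combo, round(covered_fraction(TOY, combo), 2))
```

For this invented sequence, every tryptic peptide falls outside the assumed length window, so trypsin alone covers nothing, whereas chymotrypsin covers every residue. Real identifiability also depends on ionization and chromatography, as the Introduction notes, so such a calculation gives only an upper bound on what an experiment can observe.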

Figure 2. Properties of the identified small proteins (≤70 aa) for the individual and combined datasets. (A) Number of unique peptides, shown as a boxplot. (B) Distribution of the sequence coverage, shown as a half violin plot.

Due to the high number of unique peptides, it is possible to discriminate between very similar small proteins. For example, the 68 aa-length conserved proteins Q8PX65, Q8PX66, and Q8PX67 show a very high sequence homology between 88 and 93% (Figure S6). A distinction of these three proteins is possible in the tryptic digestion on the basis of 7, 2, and 4 proteotypic peptides, respectively. After combining all datasets, the proteins were identified with 34, 18, and 25 unique peptides, respectively, which increases the confidence significantly. Moreover, for the protein Q8PX65, 100% sequence coverage could be achieved with proteotypic peptides.

An assessment of the sequence coverage obtained by both the individual proteases and the combined dataset of all proteases was performed (Figure 2B). The majority of the small proteins from the trypsin-digested samples have a sequence coverage between 50 and 80%. After combining all data, the vast majority of the small proteins are very well characterized, and the sequence coverage rises to above 90%. In total, 37 small proteins (45% of the strictly identified small proteins) were identified with 100% sequence coverage.

The marked increase in the sequence coverage of the combined data, compared with trypsin digestion only, is primarily due to the identification of sequence regions that are not identifiable with trypsin. By utilizing an additional protease, it is possible to identify peptides that span these critical regions. For example, the lysine- and arginine-rich C-terminal part of the SEP A00184 (residues RKPRENRY) was not identified here with trypsin, nor was it identified in any of our previous bottom-up studies.[9–11] With the alternative proteases, however, the C-terminal peptide was identified (Figure S5). Another example is the SEP A00073, which has a C-terminal sequence rich in lysine (KKKLAKEMKKK) and could only be identified after chymotrypsin digestion.

The improvement of sequence coverage using a combination of trypsin and one additional protease (chymotrypsin, GluC, LysC, or LysargiNase), compared to trypsin alone, is shown in Figure S7A–D. Only proteins identified after digestion with both trypsin and the second protease are included in these figures. The highest increase in average sequence coverage was obtained by a combination of trypsin and chymotrypsin digestion, which led to an average sequence coverage of 87% (n = 50); this corresponds to an increase of 16% compared to trypsin only. In comparison to trypsin only, the combination of GluC and trypsin led to an average increase of 12% (n = 61), the combination of LysargiNase and trypsin to an average increase of 11% (n = 43), and the combination of LysC and trypsin to an average increase of 8% (n = 67). Merging all data sets, an average sequence coverage of 85% (n = 77) was achieved, an increase of 20% compared to digestion with only trypsin (Figure 3).

Figure 3. Improved sequence coverage using multiple proteases compared to trypsin only. The percentage of the sequence covered exclusively by trypsin is shown in red, the percentage of overlap in black, and the percentage exclusively covered with proteases other than trypsin in blue. Only proteins identified with trypsin and at least one other protease are shown.

Table 2. Identified SEPs Using the Ion Series-Based or Strict Criteria (a)

acc.     MW (kDa)  # of UP  # of PSM  SC     ISB (b)  strict (b)  identified
A00053   7.11      9        32        85%    X        X           G/T/C/L
A00057   3.23      1        1         26%    X+                   T
A00068   10.37     12       41        64%    X        X           G/T/LA/C/L
A00070   6.80      9        37        100%   X        X           G/LA/C/L
A00073   7.07      22       324       52%    X        X           G/T/LA/C/L
A00075   7.85      24       173       100%   X        X           G/T/LA/C/L
A00081   6.62      26       266       93%    X        X           T/LA/C/L
A00101   6.87      1        1         9%     X+                   C
A00112   12.96     10       29        59%    X        X           T/LA/C/L
A00116   6.10      21       116       89%    X        X           G/T/LA/C/L
A00133   6.04      12       37        100%   X        X           G/T/C/L
A00156   7.70      2        3         32%    X        X+          T/L
A00157   5.88      8        18        100%   X        X+          G/T/LA/C
A00171   7.58      13       64        100%   X        X           G/T/LA/L
A00176   7.37      22       297       100%   X        X           G/T/LA/C/L
A00179   9.12      7        24        64%    X        X           G/T/LA/C/L
A00183   11.86     3        7         26%    X        X+          T/L
A00184   7.70      36       2648      100%   X        X           T/LA/C
A00186   7.14      8        95        100%   X        X           T/L
A00187   15.05     3        3         37%    X        X+          T/C
A00191   9.53      21       84        100%   X        X           T/C/L
A00192   8.03      23       336       100%   X        X           G/T/LA/C/L
A00196   49.06     3        15        7%     X        X           L
A00211   5.13      5        9         78%    X        X+          G/T/C/L
A00213   4.81      6        406       82%    X        X           G/T/LA/L
A00411   2.03      1        1         75%    X+                   L
A00677   2.95      2        3         96%             X+          T/C/L
A00944   2.83      2        2         74%             X+          G/LA
A00945   2.55      2        4         100%   X        X+          T/L
A00954   1.50      1        2         58%    X                    T
A01004   3.88      13       151       100%   X        X           G/T/LA/C/L
A01182   2.12      1        3         56%    X+                   T
A01343   6.95      10       37        80%    X        X           G/T/LA/C/L
A01358   2.65      1        4         35%    X+                   LA

(a) Abbreviations: acc, accession number; MW, molecular weight; UP, unique peptides; PSM, peptide spectral match; SC, sequence coverage; ISB, identified using the ion series-based criteria; strict, identified using the strict criteria; identified after GluC (G), trypsin (T), chymotrypsin (C), LysC (L), or LysargiNase (LA) digestion. The plus (+) indicates that the previously unidentified SEP was identified using the corresponding criterion for the first time in this study. The assignment of the arbitrary accession numbers to the gene identifiers is shown in Table S1.
(b) X indicates that the protein was identified using the corresponding identification criteria.

Identification of Mispredicted N-Termini and Signal Peptides

In order to identify non-canonical N-termini in the multi-protease dataset, we also searched using semi-enzymatic specificity. Interestingly, we identified a number of semi-specifically cleaved peptides that start with a methionine, which was annotated in the database as an internal methionine, or peptides starting with one amino acid C-terminal to that methionine (Figure S8A); note that chymotrypsin was excluded here due to its capability to hydrolyze peptide bonds C-terminal to methionine. In most cases, the identified terminus is located very close to the (not-identified) predicted N-terminus.

For example, the highly abundant protein methyl-coenzyme M reductase (Q8PXH7) was identified with a sequence coverage of 98%, with only the first four annotated amino acids not identified (MHEM). Moreover, 12 semi-specific peptide groups were identified starting with the fifth amino acid in the trypsin-, LysargiNase-, and LysC-digested samples. Further, the protein shows a high sequence similarity with other methyl-coenzyme M reductases that lack the first three amino acids. As N-terminal methionine cleavage is a common modification (in our datasets from 749 identified canonical N-termini, 413 (51%) showed a methionine truncation), we suggest that either the annotated N-terminus is incorrect or, potentially, the N-terminal peptide is cleaved in vivo. Overall, we identified 46 proteins with either an MXM or MXXM motif at the annotated N-terminus but with a non-specific peptide starting with or directly after the second methionine (Figure S8B). Interestingly, some of these non-canonical N-termini were acetylated at the N-terminus, supporting the hypothesis of mispredicted translation start points still present in the database.

The hypothesis of misannotated N-termini is further in agreement with findings in M. acetivorans, where several proteins annotated with an N-terminal MXM/MXXM motif were identified without the first amino acids via top-down mass spectrometry.[38] In addition, the aforementioned example, Q8PXH7, was also identified in our previous top-down analysis.[12] The multi-protease approach employed in this study further strengthens the previous prediction of a wrongly predicted translation start point.
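The MXM/MXXM screen described above can be written down compactly. The following Python sketch is a hedged illustration rather than the authors' workflow: the function names are hypothetical, positions are 0-based, and in the example only the first four residues (MHEM, as reported for Q8PXH7) come from the text while the remaining residues are invented.

```python
def alternative_starts(seq):
    """Return 0-based candidate start positions implied by an N-terminal MXM/MXXM motif.

    For each second methionine found at index 2 (MXM) or index 3 (MXXM), both the
    methionine itself and the residue directly after it are candidate N-termini.
    An empty list means the annotated N-terminus carries no such motif.
    """
    starts = set()
    if seq.startswith("M"):
        for met in (2, 3):  # MXM -> index 2, MXXM -> index 3
            if len(seq) > met and seq[met] == "M":
                starts.update((met, met + 1))
    return sorted(starts)


def supports_alternative_start(seq, peptide_start):
    """True if an observed semi-specific peptide begins at a candidate start site."""
    return peptide_start in alternative_starts(seq)


if __name__ == "__main__":
    # "MHEM" is taken from the text; "STAVKLG" is an invented continuation.
    example = "MHEM" + "STAVKLG"
    print(alternative_starts(example))
    # A peptide starting with the fifth amino acid has 0-based start index 4.
    print(supports_alternative_start(example, 4))
```

In this sketch a peptide beginning at the fifth residue of an MXXM protein, or at the fourth residue of an MXM protein, counts as support for a shifted start site. Whether that reflects a mispredicted start codon or in vivo processing cannot be decided from the position alone, which is why the text draws on N-terminal acetylation and the top-down data as additional evidence.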

The multi-protease approach also enabled us to confirm the proteolytic cleavage of the predicted signal peptides. The highly abundant major S-layer protein (Q8PVI7), which coats the surface of the archaeon, was identified with 257 unique peptides. The first 24 amino acid residues from the N-terminal region of Q8PV17 were not identified; however, 21 semi-specific peptides starting at position 25 were identified following digestion with trypsin, chymotrypsin, LysC, and GluC. The signal peptide truncation observed for Q8PV17 in this study has been reported previously;[39] however, such modifications are not always easily or readily decipherable within protein databases. As such, the ability of the multi-protease approach to clearly show such modifications is of value for the validation of signal peptides for which peptide-level evidence does not exist.

Identified SEPs

Within the population of the small proteins identified, as described above, we were particularly interested in peptides encoded by short open reading frames. Due to the inconsistent definition of the upper protein length limit for SEPs,[4,7,8] we also include eight identified SEPs that are longer than 70 aa and are not a subset of the small proteins described above. In total, 32 and 28 SEPs were identified using the ion series-based and strict criteria, respectively. The highest number of SEPs was identified using trypsin followed by LysC, chymotrypsin, GluC, and LysargiNase (Tables 1 and 2). All identifications are listed in the Supporting Information (Tables S3 and S5–S9).

Compared with our previous bottom-up studies,[9–11] the sequence coverage of most of these SEPs was significantly increased, and 12 SEPs were identified with full sequence coverage. Interestingly, a high number of canonical N-termini (19/17) and C-termini (21/21) were identified using the multi-protease approach.

This study provided, for the first time, strong peptide-level evidence of eight additional SEPs, which were identified using the strict criteria (Table 2). Notably, two of these SEPs were identified with 100% sequence coverage. In addition, five previously unidentified SEPs were identified under the ion series-based criteria (Table 2).

As described above for all small proteins, we identified several SEPs with N-termini deviating from the predicted sequence (Table 3). For example, the SEP A00081 has a predicted N-terminal MXM motif. Following digestion with trypsin, LysargiNase, and LysC, nine peptide groups were identified starting C-terminal to the internal methionine (i.e., starting at the fourth predicted amino acid of A00081). The corrected N-termini of the SEPs A00068, A00073, A00116, and A00167, which start with or directly after an internal methionine, are also supported by data derived from multiple proteases. Interestingly, the N-termini of SEPs A00116 and A00081 were acetylated. These data are in accordance with our earlier top-down proteomics study, which identified the SEPs A00068, A00073, A00081, and A00116 with N-termini different from those provided via the computational prediction.[12]

We inspected the identified SEPs for potential signal peptides by looking for non-predicted N-termini. The SEP A00112 was identified to have its N-terminus at amino acid position 25 (Table 3). Using the signal prediction tool SignalP-5.0,[40] we could predict a signal peptide only for A00112 but not for the other five proteins shown in Table 3. On the other hand, evidence for either an incorrectly predicted start site or the presence of signal peptide processing is higher for the remaining proteins since a higher number of semi-specifically cleaved putative N-terminal peptides were detected by multiple proteases and the sequence coverage of the potentially processed protein was complete.

Table 3. Mispredicted N-Termini of the Identified SEPs with Multiple Proteases (a)

(a) The identified sequence is underlined and shown in bold. Abbreviations: Acc, accession number; UP, unique peptides including the mis-predicted N-terminus.

■ CONCLUSIONS

While trypsin is the protease delivering the highest number of protein identifications, making it the gold standard in bottom-up proteomics, the additional use of other proteases is beneficial for the analysis of smaller proteins and short open reading frame-encoded peptides. In particular, proteins that are either rich or, vice versa, poor in basic residues can benefit from multi-protease approaches. Aside from the number of identifications, an even more important advantage is the completeness of the sequence coverage achievable with the combination of multi-protease-derived data. This makes it possible to obtain information about the protein N- and C-termini, information that is often lost in pure trypsin-based studies. With this information, proteolytic processing events, such as signal peptide cleavage, as well as errors in database annotations can be identified. Further, with the increased numbers of identified unique peptides, the confidence of identifications is significantly improved, which is again of high relevance for smaller proteins.

Our data show that the number of alternative proteases to be applied is not necessarily five; even with the combination of trypsin, chymotrypsin, and GluC, a significant gain of information could be achieved. On the other hand, proteases with specificities at similar amino acids, such as LysargiNase and LysC, still contribute but to a lesser extent. If the major aim of an analysis is to achieve a maximum number of identifications, then the application of two proteases (trypsin and chymotrypsin) was sufficient. However, when, additionally, an optimized sequence coverage (e.g., for molecular characterization) is the target of an analysis, a selection of, e.g., only two proteases does not a priori lead to the best results. For example, as shown in Figure S5, the SEP A00157 and the SEP A01004 can be unambiguously identified by applying trypsin only, or, in order to improve confidence, by a combination of trypsin and chymotrypsin. However, full sequence coverage for both SEPs was only achieved when an additional GluC digestion was performed. On the other hand, trypsin and GluC alone would have identified both but lack full sequence coverage of A01004.

We see a great potential of multi-protease approaches performed in parallel to top-down proteomics approaches. The latter directly provides information about the molecular setup of the entire proteoforms, including post-translational modifications; however, as the interpretation of intact protein data becomes more challenging with increasing protein masses, multiple peptide stretches identified by multi-protease-based bottom-up proteomics can serve as a useful support to interpret, or confirm, the top-down data. In addition, the use of different proteases has been shown to improve the determination of protein abundances, e.g., in label-free LC–MS experiments,[41] which could be a helpful tool for deciphering the biological function of SEPs in future studies.

■ ASSOCIATED CONTENT

Supporting Information

The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021

Liam Cassidy, Systematic Proteome Research & Bioanalytics, Institute for Experimental Medicine, Christian-Albrechts-Universität zu Kiel, Kiel 24105, Germany
Jürgen Bartel, Department of Microbial Proteomics, Institute of Microbiology, University of Greifswald, Greifswald 17489, Germany
Ruth A. Schmitz, Institute for General Microbiology, Christian-Albrechts-Universität zu Kiel, Kiel 24118, Germany

Complete contact information is available at: https://pubs.acs.org/10.1021/acs.jproteome.1c00115

Notes

The authors declare no competing financial interest.

■ ACKNOWLEDGMENTS

Funding was provided by the Deutsche Forschungsgemeinschaft (DFG) within the priority program SPP 2002, project Z1 (TH 872/10-1) and SCHM1052/20-1.

■ REFERENCES

(1) Samandi, S.; Roy, A. V.; Delcourt, V.; Lucier, J.-F.; Gagnon, J.; Beaudoin, M. C.; Vanderperre, B.; Breton, M.-A.; Motard, J.; Jacques, J.-F.; et al. Deep transcriptome annotation enables the discovery and
/acs.jproteome.1c00115.functionalcharacterizationofcrypticsmallproteins.eLife2017,6,No.e27860.Analyzedgelbands(FigureS1);propertiesofthe(2)Sberro,H.;Fremin,B.J.;Zlitni,S.;Edfors,F.;Greenfield,N.;identifiedproteinsaftercombiningalldatasets(FigureSnyder,M.P.;Pavlopoulos,G.A.;Kyrpides,N.C.;Bhatt,A.S.Large-S2);distributionoftheannotatedb-andy-ions(FigureScaleAnalysesofHumanMicrobiomesRevealThousandsofSmall,S3);MS/MSspectraoftheproteotypicpeptideusingNovelGenes.Cell2019,178,1245−1259.chymotrypsindigestion(FigureS4);complementarityof(3)Galindo,M.I.;Pueyo,J.I.;Fouix,S.;Bishop,S.A.;Couso,J.P.theidentifiedsequencepartsusingmultipleproteasesPeptidesencodedbyshortORFscontroldevelopmentanddefinea(FigureS5);discriminationofproteinswithsimilarneweukaryoticgenefamily.PLoSBiol.2007,5,No.e106.sequences(FigureS6);sequencecoverageusingthe(4)Storz,G.;Wolf,Y.I.;Ramamurthi,K.S.Smallproteinscannolongerbeignored.Annu.Rev.Biochem.2014,83,753−777.differentproteasescomparedtotrypsinonly(Figure(5)Orr,M.W.;Mao,Y.;Storz,G.;Qian,S.-B.AlternativeORFsandS7);identifiednon-canonicalN-terminiandMXM/smallORFs:Sheddinglightonthedarkproteome.NucleicAcidsRes.MXXMmotif(FigureS8);accessionnumber,associated2020,48,1029−1042.geneidentifier,andsequenceoftheidentifiedSEP(6)Prasse,D.;Thomsen,J.;deSantis,R.;Muntel,J.;Becher,D.;(TableS1);numberofidentifiedpeptidesandproteinsSchmitz,R.A.FirstdescriptionofsmallproteinsencodedbyspRNAsusingthestrictcriteria(TableS2);andidentifiedN-andinMethanosarcinamazeistrainGö1.Biochimie2015,117,138−148.C-terminiofsmallproteinsincludingSEPsusingthe(7)Zahn,S.;Kubatova,N.;Pyper,D.J.;Cassidy,L.;Saxena,K.;strictcriteria(TableS4)(PDF)Tholey,A.;Schwalbe,H.;Soppa,J.Biologicalfunctions,geneticandbiochemicalcharacterization,andNMRstructuredeterminationofOverviewofallidentifiedsmallproteinsandSEPafterthesmallzincfingerproteinHVO_2753fromHaloferaxvolcanii.combiningalldatasets(TableS3)andidentifiedsmallFEBSJ.2020,288,2042−2062.proteinsandSEPsofindividualdatasets(TableS5−S9)(8)Petruschke,H.;And
ers,J.;Stadler,P.F.;Jehmlich,N.;von(XLSX)Bergen,M.Enrichmentandidentificationofsmallproteinsinasimplifiedhumangutmicrobiome.J.Proteomics2020,213,103604.(9)Cassidy,L.;Prasse,D.;Linke,D.;Schmitz,R.A.;Tholey,A.■AUTHORINFORMATIONCombinationofBottom-up2D-LC-MSandSemi-top-downGelFree-LC-MSEnhancesCoverageofProteomeandLowMolecularWeightCorrespondingAuthorShortOpenReadingFrameEncodedPeptidesoftheArchaeonAndreasTholey−SystematicProteomeResearch&Methanosarcinamazei.J.ProteomeRes.2016,15,3773−3783.Bioanalytics,InstituteforExperimentalMedicine,Christian-(10)Kaulich,P.T.;Cassidy,L.;Weidenbach,K.;Schmitz,R.A.;Albrechts-UniversitätzuKiel,Kiel24105,Germany;Tholey,A.ComplementarityofDifferentSDS-PAGEGelStainingorcid.org/0000-0002-8687-6817;Phone:#49(431)MethodsfortheIdentificationofShortOpenReadingFrame-50030300;Email:a.tholey@iem.uni-kiel.de;Fax:#49EncodedPeptides.Proteomics2020,20,2000084.(431)50030308(11)Cassidy,L.;Kaulich,P.T.;Tholey,A.DepletionofHigh-Molecular-MassProteinsfortheIdentificationofSmallProteinsandAuthorsShortOpenReadingFrameEncodedPeptidesinCellularProteomes.J.ProteomeRes.2019,18,1725−1734.PhilippT.Kaulich−SystematicProteomeResearch&(12)Cassidy,L.;Helbig,A.O.;Kaulich,P.T.;Weidenbach,K.;Bioanalytics,InstituteforExperimentalMedicine,Christian-Schmitz,R.A.;Tholey,A.MultidimensionalseparationschemesAlbrechts-UniversitätzuKiel,Kiel24105,Germanyenhancetheidentificationandmolecularcharacterizationoflow2902https://doi.org/10.1021/acs.jproteome.1c00115J.ProteomeRes.2021,20,2895−2903

8JournalofProteomeResearchpubs.acs.org/jprArticlemolecularweightproteomesandshortopenreadingframe-encoded(31)Veit,K.;Ehlers,C.;Ehrenreich,A.;Salmon,K.;Hovey,R.;peptidesintop-downproteomics.J.Proteomics2021,230,103988.Gunsalus,R.P.;Deppenmeier,U.;Schmitz,R.A.Globaltranscrip-(13)Bartel,J.;Varadarajan,A.R.;Sura,T.;Ahrens,C.H.;Maaß,S.;tionalanalysisofMethanosarcinamazeistrainGö1underdifferentBecher,D.OptimizedProteomicsWorkflowfortheDetectionofnitrogenavailabilities.Mol.Genet.Genomics2006,276,41−55.SmallProteins.J.ProteomeRes.2020,19,4004−4018.(32)Vizcaíno,J.A.;Deutsch,E.W.;Wang,R.;Csordas,A.;(14)Greening,D.W.;Simpson,R.J.AcentrifugalultrafiltrationReisinger,F.;Ríos,D.;Dianes,J.A.;Sun,Z.;Farrah,T.;Bandeira,N.;strategyforisolatingthelow-molecularweight(≤25K)componentofetal.ProteomeXchangeprovidesgloballycoordinatedproteomicshumanplasmaproteome.J.Proteomics2010,73,637−648.datasubmissionanddissemination.Nat.Biotechnol.2014,32,223−(15)Harney,D.J.;Hutchison,A.T.;Su,Z.;Hatchwell,L.;226.Heilbronn,L.K.;Hocking,S.;James,D.E.;Larance,M.Small-protein(33)Jäger,D.;Sharma,C.M.;Thomsen,J.;Ehlers,C.;Vogel,J.;EnrichmentAssayEnablestheRapid,UnbiasedAnalysisofOver100Schmitz,R.A.DeepsequencinganalysisoftheMethanosarcinamazeiLowAbundanceFactorsfromHumanPlasma.Mol.Cell.ProteomicsGö1transcriptomeinresponsetonitrogenavailability.PNAS2009,2019,18,1899−1915.106,21878−21882.(16)Cardon,T.;Hervé,F.;Delcourt,V.;Roucou,X.;Salzet,M.;(34)Dar,D.;Prasse,D.;Schmitz,R.A.;Sorek,R.WidespreadFranck,J.;Fournier,I.OptimizedSamplePreparationWorkflowforformationofalternative3’UTRisoformsviatranscriptionterminationImprovedIdentificationofGhostProteins.Anal.Chem.2020,92,inarchaea.Nat.Microbiol.2016,1,16143.1122−1129.(35)Shirai,A.;Matsuyama,A.;Yashiroda,Y.;Hashimoto,A.;(17)Carr,S.;Aebersold,R.;Baldwin,M.;Burlingame,A.L.;Clauser,Kawamura,Y.;Arai,R.;Komatsu,Y.;Horinouchi,S.;Yoshida,M.K.;Nesvizhskii,A.TheneedforguidelinesinpublicationofpeptideGlobalanalysisofgelmobilityofproteinsanditsuseintargetandproteinidentificatio
ndata:WorkingGrouponPublicationidentification.J.Biol.Chem.2008,283,10745−10752.GuidelinesforPeptideandProteinIdentificationData.Mol.Cell.(36)Chen,D.;Geis-Asteggiante,L.;Gomes,F.P.;Ostrand-Proteomics2004,3,531−533.Rosenberg,S.;Fenselau,C.Top-DownProteomicCharacterizationof(18)Slavoff,S.A.;Mitchell,A.J.;Schwaid,A.G.;Cabili,M.N.;Ma,TruncatedProteoforms.J.ProteomeRes.2019,18,4013−4019.J.;Levin,J.Z.;Karger,A.D.;Budnik,B.A.;Rinn,J.L.;Saghatelian,A.(37)Cristobal,A.;Marino,F.;Post,H.;vandenToorn,H.W.P.;Peptidomicdiscoveryofshortopenreadingframe-encodedpeptidesMohammed,S.;Heck,A.J.R.TowardanOptimizedWorkflowforinhumancells.Nat.Chem.Biol.2013,9,59−64.Middle-DownProteomics.Anal.Chem.2017,89,3318−3325.(19)Wen,B.;Wang,X.;Zhang,B.PepQueryenablesfast,accurate,(38)Ferguson,J.T.;Wenger,C.D.;Metcalf,W.W.;Kelleher,N.L.andconvenientproteomicvalidationofnovelgenomicalterations.Top-downproteomicsrevealsnovelproteinformsexpressedinGenomeRes.2019,29,485−493.Methanosarcinaacetivorans.J.Am.Soc.MassSpectrom.2009,20,(20)Cao,X.;Khitun,A.;Na,Z.;Dumitrescu,D.G.;Kubica,M.;1743−1750.Olatunji,E.;Slavoff,S.A.ComparativeProteomicProfilingof(39)Francoleon,D.R.;Boontheung,P.;Yang,Y.;Kim,U.M.;UnannotatedMicroproteinsandAlternativeProteinsinHumanCellYtterberg,A.J.;Denny,P.A.;Denny,P.C.;Loo,J.A.;Gunsalus,R.Lines.J.ProteomeRes.2020,19,3418−3426.P.;OgorzalekLoo,R.R.S-layer,surface-accessible,andconcanavalin(21)Huesgen,P.F.;Lange,P.F.;Rogers,L.D.;Solis,N.;Eckhard,AbindingproteinsofMethanosarcinaacetivoransandMethanosarci-U.;Kleifeld,O.;Goulas,T.;Gomis-Rüth,F.X.;Overall,C.M.namazei.J.ProteomeRes.2009,8,1972−1982.LysargiNasemirrorstrypsinforproteinC-terminalandmethylation-(40)AlmagroArmenteros,J.J.;Tsirigos,K.D.;Sønderby,C.K.;siteidentification.Nat.Methods2015,12,55−58.Petersen,T.N.;Winther,O.;Brunak,S.;vonHeijne,G.;Nielsen,H.(22)Giansanti,P.;Tsiatsiani,L.;Low,T.Y.;Heck,A.J.R.SixSignalP5.0improvessignalpeptidepredictionsusingdeepneuralnetworks.Nat.Biotechnol.2019,37,420−423.alternativeproteasesformassspectromet
ry-basedproteomicsbeyond(41)Peng,M.;Taouatas,N.;Cappadona,S.;vanBreukelen,B.;trypsin.Nat.Protoc.2016,11,993−1006.Mohammed,S.;Scholten,A.;Heck,A.J.R.Proteasebiasinabsolute(23)Swaney,D.L.;Wenger,C.D.;Coon,J.J.Valueofusingproteinquantitation.Nat.Methods2012,9,524−525.multipleproteasesforlarge-scalemassspectrometry-basedproteo-mics.J.ProteomeRes.2010,9,1323−1329.(24)Müller,S.A.;Kohajda,T.;Findeiss,S.;Stadler,P.F.;Washietl,S.;Kellis,M.;vonBergen,M.;Kalkhof,S.Optimizationofparametersforcoverageoflowmolecularweightproteins.Anal.Bioanal.Chem.2010,398,2867−2881.(25)Linke,D.;Koudelka,T.;Becker,A.;Tholey,A.Identificationandrelativequantificationofphosphopeptidesbyacombinationofmulti-proteasedigestionandisobariclabeling.RapidCommun.MassSpectrom.2015,29,919−926.(26)Leitner,A.;Reischl,R.;Walzthoeni,T.;Herzog,F.;Bohn,S.;Förster,F.;Aebersold,R.Expandingthechemicalcross-linkingtoolboxbytheuseofmultipleproteasesandenrichmentbysizeexclusionchromatography.Mol.Cell.Proteomics2012,11,M111.014126.(27)Zhang,Y.;Li,Q.;Huang,J.;Wu,Z.;Huang,J.;Huang,L.;Li,Y.;Ye,J.;Zhang,X.AnApproachtoIncorporateMulti-EnzymeDigestionintoC-TAILSforC-TerminomicsStudies.Proteomics2018,18,1700034.(28)Lanigan,L.T.;Mackie,M.;Feine,S.;Hublin,J.-J.;Schmitz,R.W.;Wilcke,A.;Collins,M.J.;Cappellini,E.;Olsen,J.V.;Taurozzi,A.J.;etal.Multi-proteaseanalysisofPleistoceneboneproteomes.J.Proteomics2020,228,103889.(29)Miller,R.M.;Ibrahim,K.;Smith,L.M.ProteaseGuru:AToolforProteaseSelectioninBottom-UpProteomics.J.ProteomeRes.2021,DOI:10.1021/acs.jproteome.0c00954.(30)Maillet,N.RapidPeptidesGenerator:Fastandefficientinsilicoproteindigestion.NAR:GenomicsBioinf.2020,2,lqz004.2903https://doi.org/10.1021/acs.jproteome.1c00115J.ProteomeRes.2021,20,2895−2903
