您好,欢迎来到微智科技网。
搜索
您的当前位置:首页Proving the correctness of a complete microprocessor

Proving the correctness of a complete microprocessor

来源:微智科技网
ProvingtheCorrectnessofaCompleteMicroprocessor

ChristianJacobi,DanielKroening

Dept.14:ComputerScience,UniversityofSaarlandPostBox151150,D-66041Saarbruecken,Germany

email:cj,kroening@cs.uni-sb.de

Abstract.Thispaperpresentsstatusresultsofamicroprocessorverificationproject.Theauthorsverifyacomplete32-bitRISCmicroprocessorincludingthefloatingpointunitandthecontrollogicofthepipeline.Thepaperdescribesaformaldef-initionofa”correct”microprocessor.Thiscorrectnesscriterionisprovenforanimplementationusingformalmethods.AllproofsareverifiedmechanicallybymeansofthetheoremprovingsystemPVS.

1Introduction

Microprocessordesignisanerror-proneprocess.Withincreasingcomplexityofcurrentmicroprocessordesigns,formalverificationhasbecomecrucial.Inordertoachievecompletelyverifieddesigns,adjustingthedesignprocessitselfplaysanimportantrole:themorehigh-levelinformationonthedesignisavailable,thefastertheverificationcanbedone.

Theauthorsre-designedasimpleRISCprocessor,theDLX[1],withrespecttoverifiability.Thedesignincludesthecompletepipecontrolandforwardinglogic.Thefunctionunitsarefullyfeaturedincludingafloatingpointunit.Theyarenotabstractedbymeansofuninterpretedfunctions.Theproofsforthegluelogic,theALU,andfloat-ingpointunitareverifiedusingthetheoremprovingsystemPVS[2].

RelatedWorkRecentpapersshowthecorrectnessofcomplexdesignsorschedulersintheoremprovingsystemssuchasPVS.Hosabettuetal.[3]provebothsafetyandlivenessofTomasulo’salgorithmusingPVS.SwadaandHunt[4]provideanACL2[5]proofofacompletedesignimplementingaTomasuloschedulerwithreorderbuffer.

Henzingeretal.[6]verifyasimplepipelinedprocessorusingamodelchecker.McMillan[7]partlyautomatestheproofbyrefinementofTomasulo’salgorithmpre-sentedin[8]withthehelpofcompositionalmodelchecking.Thistechniqueisimprovedin[9]bytheoremprovingmethodstosupportanarbitraryregistersizeandnumberoffunctionunits.

Therearemanypublicationsontheverificationof(partsof)floatingpointunits.Bryantandhisgroupverifieddifferentfunctionunitsusingmodel-checking[10–12].AagaardandSegerverifiedamultiplierusingmodel-checkingcombinedwiththeoremproving[13].Claesenet.al.andO’Learyet.al.haveusedtheoremproverstoverify

anSRTintegerdivider[14],andanSRTintegersquarerootcircuit[15],respectively.Russinoffhasproventhecorrectnessofthemultiplication,divisionandsquarerootalgorithmsoftheAMDK7processor[16].Mostofthepublicationsciteddonotcoverdenormalnumbers.

ProjectStatusTheverificationofthepipelineandforwardinglogichasreachedahighlevelofautomation.However,theprocessofverifyingthefunctionunitsisnotauto-matedatall.Thefundamentalsofthefloatingpointmathematicsareverifiedalready.Theverificationoftheindividualfloatingpointcircuitsiswork-in-progress.

2TheSpecificationMachine

2.1HardwareModel

Boththespecificationdesignandthehardwarearemodeledasmathematicalmachine.Mathematicalmachinesareacommonmethodtomodelthebehaviorofarbitrarymi-croprocessorsystems.Forthispaper,thedefinitionofthemathematicalmachinefrom

whichconsistsofthe[17]isused:amathematicalmachineisatriple

followingcomponents:

isthesetofallpossibleconfigurationsof.Anelementofiscalledcon-figurationorstateofthemachine.

isaconfigurationof.–Theinitialconfiguration

mapsaconfigurationtoitssuccessor.–Thetransitionfunction–

Asequence

holds.

ofconfigurationsiscalledcomputationof

iff

NotationRegistersareusedinboththespecificationandtheimplementationofami-beafinitesetofregisters.Eachregistercancroprocessor.Let

.haveavaluewithinafinitedomain

Theconfigurationsetconsistsofthedomainsoftheregisters:

Theprojectionfunction

.Letbe

extractsthevalueofaregister

fromaconfiguration.

Letfor

bepartofacomputationofamathematicalmachine.

.Letbeashorthandforthefollowingprojectionon:

isashorthand

Inanalogytothat,let

transitionfunction:

beashorthandforthefollowingprojectiononastate

Asignalisdefinedasamappingfromthesetofconfigurationsintoanarbitrary

:domain

2.2DLXArchitecture

OurdesignimplementstheDLXarchitecture.TheDLXarchitecture[1]featuresaRISCinstructionsetincludedbothintegerandfloatingpointinstructions.Theintegercoreistakenfrom[17]andextendedbyafloatingpointregisterfile(FPR)andfloatingpointinstructionsasdescribedin[18].

2.3CorrectIEEEFloatingPointArithmetic

Ourprimarygoalistheverificationofacompleteprocessor.Thus,weformallyverifythecorrectnessofafloatingpointunit(FPU).Intheprocessorframework,theFPUisamulti-cyclefunctionunit,andcan(almost)beseenasablackbox.TheFPUsupportstheoperationsaddition,subtraction,multiplication,divisionandsquareroot.TheFPUhandlesnormalanddenormalnumbers,specialvalues,traps,andinterrupts.Thisisincontrasttomostpreviousresults,wheredenormalnumbers,trapsandinterruptsaredisregarded.

ThecorrectnesscriterionsfortheFPUaregivenbytheIEEEstandard754[19].ThestandardisinformalwhichmakesitunusablefortheformalverificationoftheFPU.OnethereforehastoformalizetheIEEEstandard;thisformalizationhastopreservethenotionofthestandard.Inherently,onecannotprovetheequivalenceoftheformalandtheinformalspecification.Theformalspecificationshavetoconvinceanybodyoftheircorrectness.WewillgivethespecificationoftheIEEEroundingmodeto

up,roundzeroarenot

ascomplicatedasthemodeto

WecallanIEEE-factoringsemi-representable,ifisrepresentable.Wecallareal(semi-)representable,ifis(semi-)representable.

Representablenumbersexactlycorrespondtotherepresentablenumbersasdefinedinthestandard.Inthefollowing,wewillonlyinvestigatesemi-representablefactorings.Inorderto“round”semi-representablefactoringstorepresentableones,onejusthastodecidewhetheronehastoroundtoinfinityornot.Thiscanbasicallybedonebya

.comparisonofwith

Weproceedwiththedefinitionoftheroundingfunction.Thestandarddefinesthe

roundingmodeto

nearestasfollows:

Nowwehaveadefinitionrelativelyclosetothehardwarebutfarawayfromthespecificationinthestandard.Ononehand,thisenablessimplerimplementationandverificationoftherounder,aswewillseeinsection3.5.Ontheotherhand,itisnotobviousthatthesedefinitionsconformtothetheIEEEstandard.Wegivethreetheoremswhichjustifythisclaim.Theorem1.Foranyreal,

issemi-representable.

Thenexttheoremstatesthattheresultoftheroundingfunctionindeedisanearestrepresentablenumber.

Theorem2.Foranyreal,andanysemi-representableIEEE-factoring

.holds

,it

Thethirdtheoremstatesthatanumberwithleastsignificantbitzeroischosenin

caseofatiebetweenthetwonearestrepresentablenumbers.Thus,wefirstboundthe

.Wethenshowthatthesignificantisevenifthemaxi-distancebetweenandmumdistanceisreached.

Theorem3.Foranyreal,itholds

and,then

.If

iseven.

Wewillgiveatheoreminsection3.5whichsimplifiestheverificationoftheround-ingunitbydecomposingitintosmallerparts.Thistheoremwillseemfairlyobvious

justbecauseweinvestedreasonableeffortinthedefinitionoftheroundingfunction.Incontrasttoourdefinition,theroundingresultin[18]isdefinedas“arepresentablenumberclosestto.Iftherearetwosuchnumbers,onechoosesthenumberwithevensignificant”.ThiscoincidesobviouslywiththeIEEEstandard.Nevertheless,itisimpracticaltoverifytherounderwiththisinformaldefinition.Theeffortwehavespentonthedefinitionoftheroundingfunctionpaysoffwhenverifyingthehardwareimplementation.

3ImplementingtheProcessor

3.1ForwardingandStallingLogic

Thedesignusesacommonfivestagepipelineaspresentedin[1,18].Thepipelinedma-chineisgeneratedbyanautomatictransformationfromasequentialpreparedmachineasdescribedin[17].

1

fullfull

fullfull

fullfull

Fig.1.Theregistersofa-stagepipeline.Thefunctions

to

representthedatapaths.

Ourdesignfeaturesacompletestallengine[21,18].Incontrasttothestallenginepresentedin[18],itallowsstallingallstagesindividually.Thestallengineistakenfrom[17]withsmallchanges:aclockenablesignalisnolongerused.Thefullbitsareupdatedineverycycleinstead(figure1).

Thetransitionfunctionforthefullbitsischangedaccordingly;thefullbitofeachstageissetiffthestageisupdatedorstalled.

full

Thecalculationofthesignalsandisnotchangedandtakenfrom[17].The

istheclockenablesignaloftheoutputregistersofstage:theregistersaresignal

updatediffthestageisfullandnotstalled:

full

Lemma3.Givenacycle,thevaluesoftheschedulingfunctionsoftwoadjoiningstagesareeitherequalorthevalueoftheschedulingfunctionofthelaterstageisgreaterbyone.

Lemma4.Thevaluesareequaliffthefullbitofthelaterstageisnotset.

Negatingbothsidesofthelastequationandapplyinglemma3resultsin:

ProofTheproofofthelemmasabovedependsonthestallengine.Itisaninvariant

.proofbyinduction.Lemma2forcycleisshownusinglemma4forcycle

.Lemma3forcycleisshownusinglemma2incycleandlemma4incycle

Lemma4isshownusinglemma2and3incycle.

Duetolackofspace,onlytheinductionstepforlemma2isshownhere:Theclaim

holdsbydefinition.Lethold.Forthecase,forthecase

andtheclaimisshowntheclaimfollowsfromthedefinitionof.For

,whichstatesthattheclaimisequivalentto.usinglemma4forcycle

signals.Thisistruebecauseofthedefinitionofthe

Theorem4isthenshownbyinductionon:theclaimisobviousforstageswhich

),thecorrectnessarenotupdatedinagivencycle.Ifthestageisupdated(i.e.,

ofthesevaluesisarguedbyshowingthecorrectnessoftheinputvaluesofthestage.Anexampleproofusingthelemmasabovefortheinstructionfetchstageisin[17].

3.3Liveness

oftheThelivenesscriterionisformalizedasfollows:foranygivenconfiguration

specificationmachine,weprovethattheimplementationmachinecalculatestheseval-holds.ueswithinafiniteamountoftime,i.e.,thereisafinitesuchthat

Theproofismadebyshowingthatanyactivestallsignalbecomesde-activewithinafiniteamountoftime.Thisisaproofbyinductiononthenumberofstagesbeginningwiththelaststage.

3.4IntegerUnit

Ourdesignfeaturesanintegerunit(ALU).Itsupportsaddition,subtraction,shiftandcompareoperations,andbit-wiseoperations(AND,OR,XOR).TheALUisverifiedcompletelywiththetheoremprovingsystemPVS.Thisincludesanarbitrary-sizedcarrylookaheadadder.However,theimplementationandtheproofforthecarrylookaheadadderisincludedonlyinordertoachievecompleteness.Inordertocreatehardware,apre-definedadderfromthevendorlibraryisused.

AUnpackA’BUnpack

Operand BuseNormShift

fs

B’SigRd

Mul / Div

Sqrt

PostNorm

Rounder

Result Bus

Add / Sub

ExpRd

Result Bus

Fig.2.ToplevelschematicsoftheFPUFig.3.Toplevelschematicsoftherounder

3.5FloatingPointUnit

Figure2showsthetop-levelschematicoftheFPU.TheprocessorfeedspackedIEEEnumbers[19]andintotheFPU.Theunpackercircuitconvertsthesenumbersintothefactoringformatdescribedinsection2.3.Dependingontheoperation,theoperandsandarefedintooneofthefunctionunits.ThelaststageroundstheresultoftheoperationtoarepresentableandpackedIEEEnumber,andplacestheresultontheresult-busoftheprocessor.

Thedesignispipelined,i.e.,thedesignincludesregisterswhichstoreintermediateresults.ThedivisioniscarriedoutusingtheNewton-Raphsonmethod.Thus,thefunc-tionunitformultiplicationanddivisioncontainsloopstofeedbackintermediateresultsforthenextNewton-Raphsoniteration.

Completehardwareschematicsatthegatelevelcanbefoundin[18].Wewillfocusontherounder.Wedemonstrateapartoftheverificationoftheroundingunitexemplary.Wegiveatheoremwhichdecomposestheroundingfunctionintothreesimplerfunc-tionswhichthenserveasabasisfortheimplementationoftherounder.Thethreefunc-andthepost-normalizationtionsarethenormalizationshift,thesignificantround

.Figure3showsadecompositionoftheroundinghardwareincorrespondingsub-circuits.Thesub-circuit“ExpRd”roundstoinfinity,ifanoverflowoccurs.Thispartisnotyetformalized.

wasdefinedastheuniqueIEEE-factoringwithForreals,

insection2.3.

Lemma5.Foranyreal1.2.3.

and

,itholds

iff

,

,and

.

Lemma6.Foranyfactoring(notnecessarilyIEEE-factoring)

,itholds

with

1.2.3.

,

.

,and

Weassumethattheinputtotherounderisencodedasafactoring,butnotnecessarily

asanIEEE-factoring.Thenormalizationshiftcanthenbecomputedinhardwarebyaleadingzerocountertocomputethelogarithmof,aleft/rightshiftertocompute,andanaddertoadjusttheexponent.

ForIEEE-factorings

,wedefinethesignificantround

asthesignificantroundedtofractionalbinarydigits.Themultiplicationwiththesignisnecessarysincetheroundingdecisiondependsonthesign.Inhardware,thesignificantroundiscomputedbytheexaminationofthelow-orderbitsofthesignificant.Thistechniqueiscalledstickybitcomputation[18].

Lemma7.ForanyIEEE-factoring1.2.

,itholds

,if,if

isdenormal,isnormal.andanyreal

with

,itholds

Thelemmaisprovenbyunfoldingthedefinitions,andapplyingthefollowinglemma:Lemma8.Foranyintegers

.

InPVS,thislemmaisprovenautomaticallybythepowerfulproof-strategygrind.beanIEEE-factoring,andlet.IfthesignificantroundLet

,theresulthastobepost-normalized;thesignificantissettoyieldsasignificant

1,andtheexponentisincremented.Thisisaccomplishedbythefunction:

ifif

Thevalueofthefactoringsisobviouslypreservedbythefunction.Thefunctionisimplementedbyanincrementerfortheexponentandanmultiplexerforthesignifi-cant.

Assumethatthesub-circuitsinfigure3indeedcomputethecorrespondingfunc-tions.Thenthecorrectnessofthewholerounderfollowsfromthenexttheorem:Theorem5.Foranyfactoring

(notnecessarilyanIEEE-factoring),itholds

Thistheoremisprovenbydefinitionunfolding,theuseofthelemmasabove,andsomerulesonexponentiation.

Theorem5decomposestheverificationproblemintosmallersub-problemssuchthatthesub-circuitsfromfigure3canbeverifiedseparately.Thesesub-circuitsarefurtherdecomposedin[18].

4ConvertingMathematicalMachinestoVerilogHDL

TheimplementationaboveisspecifiedasmathematicalmachineinthePVSlanguage.Allproofsrelyonthisspecification.Thisspecificationisconvertedintoasynthesiz-ablesubsetofVerilogHDL[22].Thisisdoneautomaticallybyaprogram.Asimilarapproachismadein[23].

Theprogramislimitedtoconvertmathematicalmachines,i.e.,ittakesaconfigu-rationset,aninitialconfiguration,andatransitionfunctionasinputs.Thistoolisnotlimitedtoin-orderdesigns.

5FutureWork

Weareinprogressofextendingthedesignwithamechanismforspeculativeexecutionandpreciseinterrupts.Furthermore,out-of-orderexecutioncapabilitiesareaddedbymeansofaTomasuloscheduler.

Themathematicsofthefloatingpointarithmetichavebeenverifiedcompletely.Ourfutureworkistoverifythecorrespondingcircuits.

Acknowledgment

TheauthorswouldliketothankMichaelBosch,MichaelKlein,andJochenPreissforvaluablediscussions.

References

1.J.L.HennessyandD.A.Patterson.ComputerArchitecture:AQuantitativeApproach.Mor-ganKaufmannPublishers,INC.,SanMateo,CA,2ndedition,1996.

2.D.Cyrluk,S.Rajan,N.Shankar,andM.K.Srivas.Effectivetheoremprovingforhardwareverification.In2ndInternationalConferenceonTheoremProversinCircuitDesign,1994.3.RaviHosabettu,GaneshGopalakrishnan,andMandayamSrivas.AproofofcorrectnessofaprocessorimplementingTomasulo’salgorithmwithoutareorderbuffer.InCorrectHard-wareDesignandVerificationMethods:IFIPWG10.5InternatinalConferenceonCorrectHardwareDesignandVerificationMethods(CHARME),pages8–22.Springer,1999.

4.JunSawadaandWarrenA.Hunt.Resultsoftheverificationofacomplexpipelinedmachinemodel.InCorrectHardwareDesignandVerificationMethods:IFIPWG10.5InternatinalConferenceonCorrectHardwareDesignandVerificationMethods(CHARME),pages313–316.Springer,1999.

5.MattKaufmannandJ.S.Moore.ACL2:Anindustrialstrengthversionofnqthm.InProc.oftheEleventhAnnualConferenceonComputerAssurance,pages23–34.IEEEComputerSocietyPress,1996.

6.ThomasA.Henzinger,ShazQadeer,andSriramK.Rajamani.Youassume,weguarantee:Methodologyandcasestudies.InProc.10thInternationalConferenceonComputer-aidedVerification(CAV),1998.

7.K.L.McMillan.VerificationofanimplementationofTomasulo’salgorithmbycompositionmodelchecking.InProc.10thInternationalConferenceonComputerAidedVerification,pages110–121,1998.

8.W.DammandA.Pnueli.Verifyingout-of-orderexecutions.InH.F.LiandD.K.Probst,editors,AdvancesinHardwareDesignandVerification:IFIPWG10.5InternatinalCon-ferenceonCorrectHardwareDesignandVerificationMethods(CHARME),pages23–47.Chapmann&Hall,1997.

9.M.L.McMillan.Verificationofinfinitestatesystemsbycompositionalmodelchecking.InCorrectHardwareDesignandVerificationMethods:IFIPWG10.5InternatinalConfer-enceonCorrectHardwareDesignandVerificationMethods(CHARME),pages219–233.Springer,1999.

10.Y.-A.ChenandR.E.Bryant.Verificationoffloating-pointadders.LectureNotesinCom-puterScience,1427,1998.

11.Y.-A.ChenandR.E.Bryant.PHDD:Anefficientgraphrepresentationforfloatingpoint

circuitverification.InIEEE/ACMInternationalConferenceonComputerAidedDesign;Di-gestofTechnicalPapers(ICCAD’97),pages2–7,Washington-Brussels-Tokyo,November1997.IEEEComputerSocietyPress.

12.Y.-A.Chen,E.Clarke,P.-H.Ho,andY.Hoskote.Verificationofallcircuitsinafloating-point

unitusingword-levelmodelchecking.LectureNotesinComputerScience,1166,1996.13.M.D.AagaardandC.-J.H.Seger.Theformalverificationofapipelineddouble-precision

IEEEfloating-pointmultiplier.InInternationalConferenceonComputerAidedDesign,pages7–10,LosAlamitos,Ca.,USA,November1995.IEEEComputerSocietyPress.

14.L.Claesen,D.Verkest,andH.DeMan.Aproofofthenon-restoringdivisionalgorithmand

itsimplementationonanALU.InFormalMethodsinSystemDesign,vol.5,pages5–31,1994.

15.J.O’Leary,M.Leeser,J.Hickey,andM.Aagaard.Non-restoringintegersquareroot:Acase

studyindesignbyprincipledoptimization.LectureNotesinComputerScience,901,1995.16.DavidM.Russinoff.AmechanicallycheckedproofofIEEEcomplianceofthefloatingpoint

multiplication,divisionandsquarerootalgorithmsoftheAMD-K7processor.LMSJournalofComputationandMathematics,1:148–200,1998.17.DanielKr¨oning,WolfgangPaul,andSilviaM.M¨uller.Provingthecorrectnessofpipelined

micro-architectures.InKlausWaldschmidtandChristophGrimm,editors,Proc.ofITG/GI/GMM-Workshop”MethodenundBeschreibungssprachenzurModellierungundVer-ifikationvonSchaltungenundSystemen”,pages–98.VDEVerlag,2000.18.SilviaM.M¨ullerandWolfgangPaul.ComputerArchitecture:ComplexityandCorrectness.

Springer,2000.

19.InstituteofElectricalandElectronicsEngineers.ANSI/IEEEstandard754–1985,IEEEStan-dardforBinaryFloating-PointArithmetic,1985.forareadableaccountseethearticlebyW.J.Codyetal.intheIEEEMICROJournal,Aug.1984,84–100.

20.PaulS.Miner.DefiningtheIEEE-854floating-pointstandardinPVS.Technicalreport,

NASA,LangleyResearchCenter,1995.

21.WolfgangPaul.RecherarchitekturIISS98,1998.LectureNotes.

22.DonaldE.ThomasandPhilipMoorby.TheVerilogHardwareDescriptionLanguage.

Kluwer,Boston;Dordrecht;London,1991.

23.JamesC.HoeandArvind.Hardwaresynthesisfromtermrewritingsystems.InProc.of

VLSI’99,Lisbon,Portugal,1999.

因篇幅问题不能全部显示,请点此查看更多更全内容

Copyright © 2019- 7swz.com 版权所有 赣ICP备2024042798号-8

违法及侵权请联系:TEL:199 18 7713 E-MAIL:2724546146@qq.com

本站由北京市万商天勤律师事务所王兴未律师提供法律服务