Chapter 7. Know when and how to code for scalability

Summary

Beware of explosive data growth: Without optimizing prematurely, keep an eye on asymptotic complexity. Algorithms that work on user data should take a predictable, and preferably no worse than linear, time with the amount of data processed. When optimization is provably necessary and important, and especially if it's because data volumes are growing, focus on improving big-Oh complexity rather than on micro-optimizations like saving that one extra addition.



Discussion

This Item illustrates one significant balance point between Items 8 and 9, "don't optimize prematurely" and "don't pessimize prematurely." That makes this a tough Item to write, lest it be misconstrued as "premature optimization." It is not that.

Here's the background and motivation: Memory and disk capacity continue to grow exponentially; for example, from 1988 to 2004 disk capacity grew by about 112% per year (nearly 1,900-fold growth per decade), whereas even Moore's Law is just 59% per year (100-fold per decade). One clear consequence is that whatever your code does today it may be asked to do tomorrow against more data, much more data. A bad (worse than linear) asymptotic behavior of an algorithm will sooner or later bring the most powerful system to its knees: Just throw enough data at it.

Defending against that likely future means we want to avoid "designing in" what will become performance pits in the face of larger files, larger databases, more pixels, more windows, more processes, more bits sent over the wire. One of the big success factors in future-proofing of the C++ standard library has been its performance complexity guarantees for the STL container operations and algorithms.
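As a small illustration of what those guarantees buy in practice (a sketch of my own, not from the book; std::unordered_set, standardized in C++11, stands in here for the hash table lookups mentioned later in this Item), the same membership test carries very different guaranteed costs depending on the container chosen:

```cpp
#include <algorithm>
#include <set>
#include <unordered_set>
#include <vector>

// The same question ("is x present?") with three different guaranteed costs.
bool contains_linear(const std::vector<int>& v, int x) {
    return std::find(v.begin(), v.end(), x) != v.end();  // O(N) scan
}

bool contains_logarithmic(const std::set<int>& s, int x) {
    return s.find(x) != s.end();                          // O(log N), guaranteed
}

bool contains_constant(const std::unordered_set<int>& h, int x) {
    return h.find(x) != h.end();                          // O(1) on average
}
```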

Here's the balance: It would clearly be wrong to optimize prematurely by using a less clear algorithm in anticipation of large data volumes that may never materialize. But it would equally clearly be wrong to pessimize prematurely by turning a blind eye to algorithmic complexity, a.k.a. "big-Oh" complexity, namely the cost of the computation as a function of the number of elements of data being worked on.

There are two parts to this advice. First, even before knowing whether data volumes will be large enough to be an issue for a particular computation, by default avoid using algorithms that work on user data (which could grow) but that don't scale well with data, unless there is a clear benefit in clarity and readability to using a less scalable algorithm (see Item 6). All too often we get surprised: We write ten pieces of code thinking they'll never have to operate on huge data sets, and then we'll turn out to be perfectly right nine of the ten times. The tenth time, we'll fall into a performance pit; we know it has happened to us, and we know it has happened or will happen to you. Sure, we go fix it and ship the fix to the customer, but it would be better to avoid such embarrassment and rework. So, all things being equal (including clarity and readability), do the following up front:

Use flexible, dynamically allocated data instead of fixed-size arrays: Arrays "larger than the largest I'll ever need" are a terrible correctness and security fallacy. (See Item 77, and the line-reading std::vector sketch after this list.) Arrays are acceptable when sizes really are fixed at compile time.

Know your algorithm's actual complexity: Beware subtle traps like linear-seeming algorithms that actually call other linear operations, making the whole algorithm quadratic. (See Item 81 for an example, and the accidentally quadratic insert sketch after this list.)

Prefer to use linear algorithms or faster wherever possible: Constant-time complexity, such as push_back and hash table lookup, is perfect (see Items 76 and 80). O(log N) logarithmic complexity, such as set/map operations and lower_bound and upper_bound with random-access iterators, is good (see Items 76, 85, and 86). O(N) linear complexity, such as vector::insert and for_each, is acceptable (see Items 76, 81, and 84).

Try to avoid worse-than-linear algorithms where reasonable: For example, by default spend some effort on finding a replacement if you're facing an O(N log N) or O(N²) algorithm, so that your code won't fall into a disproportionately deep performance pit in the event that data volumes grow significantly. For example, this is a major reason why Item 81 advises to prefer range member functions (which are generally linear) over repeated calls of their single-element counterparts (which easily become quadratic as one linear operation invokes another linear operation; see Example 1 of Item 81 and the accidentally quadratic insert sketch after this list).

Never use an exponential algorithm unless your back is against the wall and you really have no other option: Search hard for an alternative before settling for an exponential algorithm, where even a modest increase in data volume means falling off a performance cliff. (See the naive-Fibonacci sketch after this list.)
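First, the line-reading std::vector sketch: a minimal illustration (my own, not from the book) of replacing a "big enough" fixed-size array with a dynamically allocated container that grows with the data.

```cpp
#include <iostream>
#include <string>
#include <vector>

int main() {
    // Fragile alternative (don't do this): a fixed-size buffer such as
    //   std::string lines[1024];
    // silently turns into a correctness and security bug the day the
    // input grows past the guessed limit.

    // Flexible: a dynamically allocated container grows with the data.
    std::vector<std::string> lines;
    for (std::string line; std::getline(std::cin, line); ) {
        lines.push_back(line);  // amortized constant time per element
    }
    std::cout << lines.size() << " lines read\n";
}
```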
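Next, the accidentally quadratic insert sketch (my own illustration in the spirit of Item 81's Example 1, not a copy of it): each single-element insert at the front of a std::vector shifts everything already stored, so a linear-looking loop becomes quadratic, while one range insert does the same job in linear time.

```cpp
#include <vector>

// Quadratic: each single-element insert at the front shifts every element
// already in dest, so N such inserts cost on the order of N*N moves.
std::vector<int> prepend_one_at_a_time(const std::vector<int>& src,
                                       std::vector<int> dest) {
    for (auto it = src.rbegin(); it != src.rend(); ++it) {
        dest.insert(dest.begin(), *it);  // linear shift inside a linear loop
    }
    return dest;
}

// Linear: the range member function makes room once and copies the new
// elements in a single pass, producing the same result as above.
std::vector<int> prepend_as_range(const std::vector<int>& src,
                                  std::vector<int> dest) {
    dest.insert(dest.begin(), src.begin(), src.end());
    return dest;
}
```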
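Finally, the naive-Fibonacci sketch (again illustrative, not taken from the book) shows how an exponential algorithm falls off the cliff this Item warns about, and how a linear rewrite avoids it.

```cpp
#include <cstdint>

// Exponential: the naive recursion recomputes the same subproblems over
// and over, so the number of calls grows roughly as 1.6^n.
std::uint64_t fib_naive(unsigned n) {
    return n < 2 ? n : fib_naive(n - 1) + fib_naive(n - 2);
}

// Linear: iterating (or memoizing) brings the cost down to O(n) additions,
// so a modest growth in n no longer means a performance cliff.
std::uint64_t fib_linear(unsigned n) {
    std::uint64_t a = 0, b = 1;
    for (unsigned i = 0; i < n; ++i) {
        std::uint64_t next = a + b;
        a = b;
        b = next;
    }
    return a;
}
```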

Second, after measurements show that optimization is necessary and important, and especially if it's because data volumes are growing, focus on improving big-Oh complexity rather than on micro-optimizations like saving that one extra addition.

In sum: Prefer to use linear (or better) algorithms wherever possible. Avoid polynomial algorithms where reasonable. Avoid exponential algorithms with all your might.



References

[Bentley00] §6, §8, Appendix 4 • [Cormen01] • [Kernighan99] §7 • [Knuth97a] • [Knuth97b] • [Knuth98] • [McConnell93] §5.1-4, §10.6 • [Murray93] §9.11 • [Sedgewick98] • [Stroustrup00] §17.1.2



8. Don't optimize prematurely




Summary

Spur not a willing horse (Latin proverb): Premature optimization is as addictive as it is unproductive. The first rule of optimization is: Don't do it. The second rule of optimization (for experts only) is: Don't do it yet. Measure twice, optimize once.


