We analysed more than 40 000 000 questions and answers on stackoverflow.com to bring you a ranking of the most-mentioned books (5720 in total)

How we did it:

  • We got a database dump of all user-contributed content on the Stack Exchange network (can be downloaded here)
  • Extracted the questions and answers posted on stackoverflow
  • Found all amazon.com links and counted them
  • Created a tag-based search for your convenience
  • Brought it to you
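
The link-counting step above can be sketched roughly as follows. This is only an illustration, not the site's actual code: it assumes the post bodies have already been pulled out of the dump as plain strings, and the regular expression, the function name `count_book_mentions`, and the choice to key each book by the 10-character product ID in the URL are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical pattern (an assumption, not the site's real one): matches
# amazon.com product URLs and captures the 10-character product ID that
# follows /dp/ or /gp/product/.
AMAZON_LINK = re.compile(
    r"https?://(?:www\.)?amazon\.com/"
    r"(?:[^\s\"'<>]*?/)?"            # optional title slug before the ID part
    r"(?:dp|gp/product)/([A-Z0-9]{10})",
    re.IGNORECASE,
)


def count_book_mentions(post_bodies):
    """Count amazon.com product links across an iterable of post bodies."""
    counts = Counter()
    for body in post_bodies:
        # findall returns the captured product IDs; upper-case them so the
        # same book is not counted under two spellings.
        counts.update(pid.upper() for pid in AMAZON_LINK.findall(body))
    return counts
```

Calling `count_book_mentions(bodies).most_common(10)` would then give the ten most-linked product IDs. Links that carry no product ID in one of these two URL shapes are ignored by this sketch.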

For feedback, questions, notes, or just a chat, feel free to follow us on social networks

Top etl books mentioned on stackoverflow.com

Working Effectively with Legacy Code

Michael C. Feathers

The average book on Agile software development describes a fairyland of greenfield projects, with wall-to-wall tests that run after every few edits, and clean & simple source code.


The average software project, in our industry, was written under some aspect of code-and-fix, and without automated unit tests. And we can't just throw this code away; it represents a significant effort debugging and maintaining. It contains many latent requirements decisions. Just as Agile processes are incremental, Agile adoption must be incremental too. No more throwing away code just because it looked at us funny.


Mike begins his book with a very diplomatic definition of "Legacy". I'll skip ahead to the undiplomatic version: Legacy code is code without unit tests.


Before cleaning that code up, and before adding new features and removing bugs, such code must be de-legacified. It needs unit tests.


To add unit tests, you must change the code. To change the code, you need unit tests to show how safe your change was.


The core of the book is a cookbook of recipes to conduct various careful attacks. Each presents a particular problem, and a relatively safe way to migrate the code towards tests.


Code undergoing this migration will begin to experience the benefits of unit tests, and these benefits will incrementally make new tests easier to write. These efforts will make aspects of a legacy codebase easy to change.


It's an unfortunate commentary on the state of our programming industry how much we need this book.

More on Amazon.com

Groovy in Action

Dierk König

A guide to the Groovy programming language covers such topics as shell scripting, dynamic programming, Grails, GDK, and XML.

More on Amazon.com

Agile Data Warehouse Design

Lawrence Corr, Jim Stagnitto

Agile Data Warehouse Design is a step-by-step guide for capturing data warehousing/business intelligence (DW/BI) requirements and turning them into high performance dimensional models in the most direct way: by modelstorming (data modeling + brainstorming) with BI stakeholders. This book describes BEAM, an agile approach to dimensional modeling, for improving communication between data warehouse designers, BI stakeholders and the whole DW/BI development team. BEAM provides tools and techniques that will encourage DW/BI designers and developers to move away from their keyboards and entity relationship based tools and model interactively with their colleagues. The result is everyone thinks dimensionally from the outset! Developers understand how to efficiently implement dimensional modeling solutions. Business stakeholders feel ownership of the data warehouse they have created, and can already imagine how they will use it to answer their business questions.

Within this book, you will learn:

  • Agile dimensional modeling using Business Event Analysis & Modeling (BEAM)
  • Modelstorming: data modeling that is quicker, more inclusive, more productive, and frankly more fun!
  • Telling dimensional data stories using the 7Ws (who, what, when, where, how many, why and how)
  • Modeling by example not abstraction; using data story themes, not crow's feet, to describe detail
  • Storyboarding the data warehouse to discover conformed dimensions and plan iterative development
  • Visual modeling: sketching timelines, charts and grids to model complex process measurement - simply
  • Agile design documentation: enhancing star schemas with BEAM dimensional shorthand notation
  • Solving difficult DW/BI performance and usability problems with proven dimensional design patterns

Lawrence Corr is a data warehouse designer and educator. As Principal of DecisionOne Consulting, he helps clients to review and simplify their data warehouse designs, and advises vendors on visual data modeling techniques. He regularly teaches agile dimensional modeling courses worldwide and has taught dimensional DW/BI skills to thousands of students. Jim Stagnitto is a data warehouse and master data management architect specializing in the healthcare, financial services, and information service industries. He is the founder of the data warehousing and data mining consulting firm Llumino.

More on Amazon.com

The Data Warehouse ETL Toolkit

Ralph Kimball, Joe Caserta

  • Cowritten by Ralph Kimball, the world's leading data warehousing authority, whose previous books have sold more than 150,000 copies
  • Delivers real-world solutions for the most time- and labor-intensive portion of data warehousing: data staging, or the extract, transform, load (ETL) process
  • Delineates best practices for extracting data from scattered sources, removing redundant and inaccurate data, transforming the remaining data into correctly formatted data structures, and then loading the end product into the data warehouse
  • Offers proven time-saving ETL techniques, comprehensive guidance on building dimensional structures, and crucial advice on ensuring data quality

More on Amazon.com
