Discovering Knowledge in Data: An Introduction to Data Mining
by Daniel T. Larose
Sales Rank: 313303
List Price: $87.95
Price at Amazon: $70.36

Hardcover: 240 pages
Publisher: Wiley-Interscience; 1st edition (November 18, 2004)
Language: English
ISBN-10: 0471666572
ISBN-13: 978-0471666578
Product Dimensions: 9.3 x 6.3 x 0.8 inches
Shipping Weight: 1.2 pounds
Product Review
"an excellent introductory book of data mining. I recommend it for every one who wants to learn data mining." (Journal of Statistical Software, May 2006)
"selected material is described in a simple, clear, and…precise way…case studies…examples, and screen shots has definitely added to the learning value of the book." (Journal of Biopharmaceutical Statistics, January/February 2006)
"does a good job introducing data mining to novices…it skillfully previews some of the basic statistical issues needed to understand data mining techniques." (Journal of the American Statistical Association, December 2005)
"If you need a book to help colleagues understand your data mining procedures and results, this is the one you want to give them." (Technometrics, November 2005)
"…an excellent book…it should be useful for anyone interested in analysing epidemiological data." (Statistics in Medical Research, October 2005)
"an excellent 'white-box' overview of established approaches for data analysis, in which readers are shown how, why, and when the methods work." (CHOICE, April 2005)
"Larose has the making of a good series of books on data mining…I, for one, look forward to the next two books in the series." (Computing Reviews.com, February 15, 2005)
Product Description
Learn Data Mining by Doing Data Mining

Data mining can be revolutionary, but only when it's done right. The powerful black box data mining software now available can produce disastrously misleading results unless applied by a skilled and knowledgeable analyst. Discovering Knowledge in Data: An Introduction to Data Mining provides both the practical experience and the theoretical insight needed to reveal valuable information hidden in large data sets. Employing a "white box" methodology and with real-world case studies, this step-by-step guide walks readers through the various algorithms and statistical structures that underlie the software and presents examples of their operation on actual large data sets.

Principal topics include:
* Data preprocessing and classification
* Exploratory analysis
* Decision trees
* Neural and Kohonen networks
* Hierarchical and k-means clustering
* Association rules
* Model evaluation techniques

Complete with scores of screenshots and diagrams to encourage graphical learning, Discovering Knowledge in Data: An Introduction to Data Mining gives students in Business, Computer Science, and Statistics as well as professionals in the field the power to turn any data warehouse into actionable knowledge. An Instructor's Manual presenting detailed solutions to all the problems in the book is available online.
Customer Reviews & Comments
There is a lot to like about this book, but it has some unfortunate flaws. Note that it is part of a Data Mining trilogy. The other two books are: Data Mining Methods and Models and Data Mining the Web: Uncovering Patterns in Web Content, Structure, and Usage. My initial reaction was more negative, as I feel strongly about the issues that this book addresses poorly. However, I find myself turning to this book again and again. I would endorse it highly, but with a caution or two.

The very best features of the book are the exceptionally clear explanations of complicated algorithms. In particular, Chapters 6 and 7 and their explanations of Decision Trees and Neural Nets are just perfect for both new and veteran analysts who want to understand what is happening "under the hood". Those chapters are stand-outs, but all of the 80%+ part of the book that describes algorithms in detail (clear, careful, and readable detail) is uniformly excellent. For some readers, it may be the first time that the techniques really make sense to them.

Now the flaws. The three-book format is, frankly, annoying. The second and third books are much weaker, but the set was clearly designed as a trilogy, so it is hard to recommend the first to a client without at least implicitly recommending the second. Spending my reading time well is more important to me than my reading budget, but the set of three costs more than $200. Unless you plan on an entire shelf of related books, like me, I can't recommend the entire set.

The other flaw is less obvious, and is the one that concerns me the most. Although this book cites Dorian Pyle's excellent book ... it seems to miss the whole point. Data Mining data prep is quite different from data prep for statistics. Although the two areas share a lot in common, and while mastery of statistics is a good thing for data miners, this is one of the differences between the two disciplines.

Data cleaning and data reduction are critical, but this book suggests that they are accomplished by the analyst manually examining all possible bivariates. Recommendations of factor analysis and log transformations abound, but never with cautions about when they are unnecessary or even a bad idea - something Pyle's book explores. Also, transformations like binning come off as something the analyst does during data exploration, getting it perfect before modeling. Sounds like statistics data prep to me - not data mining data prep. If anyone has ever completed data prep without preliminary modeling, or has modeled without having to revisit data prep, I have never heard of it. If novice data miners were to take the advice too literally, they could get themselves into trouble. This would be especially true of a reader who is well versed in statistics - there is a predictable set of mistakes awaiting the classically trained on their first data mining project!

My advice? There is a lot to benefit from here. All of the "white box" walk-through examples are great. Consider buying this book, the Pyle book Data Preparation for Data Mining (The Morgan Kaufmann Series in Data Management Systems), and Berry and Linoff's Data Mining Techniques: For Marketing, Sales, and Customer Relationship Management, while skipping the other two in this trilogy. Use this book for the algorithm explanations, but be cautious otherwise. The screen shots and discussion of Clementine may be helpful to you, but note that Clementine 8.5 was used.