Dominant Systems - Michigan Network Solutions Provider

Data Warehouse Method, The

by Tom Debevoise
Sales Rank: 1263848
$29.36
At Amazon

  • Hardcover: 420 pages
  • Publisher: Prentice Hall December 29, 1998
  • Language: English
  • ISBN-10: 0130813060
  • ISBN-13: 978-0130813060
  • Product Dimensions: 9.6 x 7.3 x 1.6 inches
  • Shipping Weight: 2.5 pounds

    From the Author
    My book describes how the convergence of new web-based architectures, advanced object-oriented methodologies, and powerful computing architectures can create business results for your organization. However, even with these enabling technologies, the construction of the data warehouse remains a challenging undertaking. Success requires both a capable team and a group of users willing to change their daily activities. (For Oracle and Business Objects users, this text is a must-have; see the next-to-last paragraph of my comments.)

    Success for your organization means improving the quality of what your team does. To improve your project's productivity and success, I describe how the advanced visualization and modeling capabilities of object-oriented analysis can be used to design the components of a data warehouse. In addition, I use the Unified Modeling Language (UML) to detail the steps of the data warehouse method (DWM) for both data modeling and data acquisition. In my chapter on design, I show how UML can address many dimensional modeling issues that were impossible to solve with the entity relationship diagram (ERD). The DWM presents a complete solution that utilizes the Oracle 8.0 RDBMS as the data source. The DDL and sample data for the problem are included on the CD.

    The DWM improves the success of the project by incrementally breaking the line of business into cycles that implement business models. This technique incorporates scalable design techniques, including data partitions, delivers short-term business results, and ensures that the project cycle built today will be reusable by those next waiting to roll out of the factory. By building focused, business-model-based data marts at three-to-six-month intervals, the DWM reduces the time required to deliver business results for your organization.

    The data warehouse project can be very risky. According to the META group, after one year more than 50% of data warehouse projects have failed to achieve their objectives. Another study of large corporations attempting to construct large-scale data warehouses reports more than 80% of all data warehouse projects fail to meet organizational objectives, with a significant portion in complete failure. The process of acquiring data from operational systems, transforming it, and loading it into the data warehouse can be a fundamental cause of a project's failure. The prevalent use of pre-defined star schemas, or 'by the book' solutions, may delude the project team into thinking that the organization's operational systems will (easily) support the data model. I have often found that some project teams don't even attempt to load operational data until late in the project. Until this loading takes place, the project team cannot truly evaluate how well the business rules model of the operational system matches the design.

    In the DWM, data acquisition is a critical component of the process. Very little has been written on this topic. Therefore, at each step in the process, I show how early data prototypes and extracts from operational systems are critical to the success of the organization. The team that applies the object-oriented analysis in this book will improve the efficiency and effectiveness of the integration between the data warehouse and the operational systems. This method eliminates serious project risk by moving operational data from the source systems very early in the project cycle.

    Eventually, every data warehouse manager must seek out ways of improving the performance of the mature environment. In the later chapters of the text, I develop the concept of aggregate management. Through this approach, pre-summarized subsets of fact tables are precisely configured to dramatically improve the performance of the managed query environment. The CD included with this text contains executable source for an 'Aggregate Wizard'. This program merges the semantic and CASE repositories to provide an important service to the users: a high-performing, highly available data warehouse. It works specifically with the Oracle and Business Objects environment and utilizes data structures that can be incorporated into the universes with the @aggregate_aware function. I have included the source of the processes and the data schema so that readers may develop their own aggregate solutions.
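
    As a rough illustration of the aggregate management idea described above, the following Python sketch routes a query to a pre-summarized table when that table holds every dimension the query needs, and falls back to the detailed fact table otherwise. It is not the 'Aggregate Wizard' from the CD and does not use Oracle or Business Objects; the schema, the table names (sales_fact, sales_by_month_product), and the helper functions are all hypothetical, and SQLite stands in for the RDBMS.

```python
import sqlite3

def build_warehouse():
    """Create a tiny in-memory warehouse (hypothetical schema and data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE sales_fact (
            sale_day TEXT, sale_month TEXT, store TEXT, product TEXT, amount REAL
        );
        INSERT INTO sales_fact VALUES
            ('1998-11-30', '1998-11', 'Detroit', 'widget', 10.0),
            ('1998-12-01', '1998-12', 'Detroit', 'widget', 20.0),
            ('1998-12-01', '1998-12', 'Lansing', 'widget', 5.0),
            ('1998-12-02', '1998-12', 'Lansing', 'gadget', 7.5);
        -- Pre-summarized subset of the fact table: month by product.
        CREATE TABLE sales_by_month_product AS
            SELECT sale_month, product, SUM(amount) AS amount
            FROM sales_fact
            GROUP BY sale_month, product;
    """)
    return conn

# Candidate tables, most summarized first, with the dimensions each one holds.
TABLES = [
    ("sales_by_month_product", {"sale_month", "product"}),
    ("sales_fact", {"sale_day", "sale_month", "store", "product"}),
]

def choose_table(dimensions):
    """Return the first (smallest) table that contains every requested dimension."""
    needed = set(dimensions)
    for name, columns in TABLES:
        if needed <= columns:
            return name
    raise ValueError(f"no table covers dimensions {sorted(needed)}")

def total_by(conn, dimensions):
    """Sum the amount grouped by the given dimensions, using an aggregate if possible."""
    table = choose_table(dimensions)
    cols = ", ".join(dimensions)
    sql = f"SELECT {cols}, SUM(amount) FROM {table} GROUP BY {cols} ORDER BY {cols}"
    return table, conn.execute(sql).fetchall()
```

    Here a request grouped by sale_month is answered from the aggregate table, while a request grouped by store has to read the detailed fact table, because the aggregate has already summed the store dimension away.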

    One of the key reasons that I wrote this text is that I have observed an evolution in the use of the CASE, semantic, and administration repositories. To me, this suggests new types of methodologies will be emerging from higher abstractions in the construction of an IT infrastructure. There are very few in this profession who would build systems the same way they did 10 years ago. The current IT infrastructure has been built with neither "architectural" nor business concepts. Additionally, its operational systems often lack the current principles of management and industrial engineering. Most systems have been built with the outward manifestations of outmoded practices. Businesses trying to operate in this environment are finding that consolidations and mergers are reaching the limits of productivity gains. It will soon be time for information technology to be a fresh source of productivity gains. In a closed loop, new methods, in combination with the data warehouse method, will have the ability to deploy new systems that marry existing business models with the fine tunings of a data warehouse analysis. To survive today's changing, chaotic environment, elements of the new operational systems will need to enable a 'zero cost' deployment of new business rules.

    From the Inside Flap
    FOREWORD

    Truisms abound in information technology. The project is perpetually over budget, behind schedule, mired in requirements, and a victim of politics. In response, I have developed my own truism: six of the right people can do more than sixty. I have worked on many IT projects, large and small, and have experienced both truisms. I have found that quality is the single point of failure for any project, where quality is the gap between what a team could achieve and what it did achieve.

    The theme of this text is quality processes for the data warehouse. I view this opportunity as a chance to state not what is but what could be. Methodology should control the process of creation. It should be planned with project management and created with discipline.

    The data warehouse project must contribute to the performance of an organization. This performance should be measurable. The time to integrate for its own sake has passed. It is time for each organization to examine its IT development processes and find which contribute to the health of the organization and which do not.

    For the manager or executive, the methodology that I describe is constructed to meet the strategic objectives of your organization and create a measurable result. Most IT objectives are accomplished in cycles. I suggest that you should use the project cycle to achieve business results for your organization.

    Not every project cycle will be strictly for new developments. Your infrastructure must be maintained. Upward-moving events and technology require re-hosting and re-scripting operational systems. The Intel-based workstation and its operating system have a very short productive life. Because most organizations have not made this distinction, a challenge for senior management is to separate the infrastructure projects from the business results projects.

    The recent business process reengineering fad, while somewhat defunct, was useful in pointing out the age of the processes that are ingrained in today's operational systems. Many legacy systems were implemented by moving paper-based processes onto databases and screens. Somewhere in the jumble of prompts and fields resides the business knowledge of the enterprise. With or without the enabling technology of data warehousing, managers make decisions that keep the business afloat. Or not.

    Many technical books are written to fill a basic human need: to make the impenetrable understandable. With good prose, just about any technical topic can be illuminated. From subatomic physics to the World Wide Web, there are books that beautifully explain their topics; however, these are not likely to make the reader a physicist or even a database designer.

    To build the data warehouse requires a broad range of technical disciplines. Adding maturity and capability to the data warehouse team requires stretching their capabilities and challenging each member to grow. Despite the best efforts of the self-help guides, the construction of the data warehouse remains a challenging undertaking. Success requires both a capable team and a group of users willing to change their daily activities.

    At the heart of the data warehouse management environment is the discipline of quality. Taken in isolation, quality is the gap between capability and performance. Quality is either high, with a minimal gap, or low, with larger gaps. A quality data warehouse serves the strategic intent of the organization, is created with the best available data, and is achieved at an optimal rate. Both the data available and the rates of implementation are highly dependent on your organization. If your organization has older, less integrated systems and less technical acumen, you can still achieve a quality data warehouse by promoting consistent methods in its creation.

    Collectively, the methods that I discuss in this book enable the implementation and maintenance of a quality environment. It is intuitive that the strategic directions taken in the early phase of a project will sway the technical architecture and ultimately the quality and the performance of the system. Beyond a discussion of the activities and personnel that create the data warehouse, there are technical design approaches that should be taken in order to create a high-performance data warehouse. Since my audience in this text is the project implementers, I will need to be very explicit in my description of these technical implementations.

    In choosing to describe the specific nature of integrated environments, I can focus on how the environment can be integrated and managed to provide a true solution. My discussions include UNIX servers, relational database management systems (RDBMS), and several managed query environments. I hope that by shifting the focus from a generic treatment to specific solutions, a model of the characteristics and capabilities of the quality data warehouse environment will emerge.

    Reuse is an object-oriented concept that makes the efforts of one project available to another; however, it is the use of the product, not its features, that promotes this. Often, I find today's corporate environment a dizzying array of software products with similar capabilities, many of which are object-oriented (OO). On more than one occasion, I have been astonished to discover multiple computer-aided software engineering (CASE) tools, multiple user interfaces, and even multiple online analytical processing (OLAP) tools and managed query environments (MQE) on the same IT shop floor.

    For the past decade, enterprise products, including CASE and RDBMS products, have been sold as a method of unifying operational systems across lines of business. The vendors' ubiquitous argument has been that the legacy of the mainframe is a series of non-communicating, outdated systems. The parallel component of the marketing assault is the position that their tools offer the best "open" or "environment independent" solution. While marketing personnel attack the enterprise from the top, they also have strategies that appeal to the shop-floor programmer. Software is given away for free. The software vendor will develop subtle appeals to the programmers, from who has the best implementation of Microsoft's Component Object Model (COM) to who supports the strongest inheritance. These are designed to promote various intellectual viewpoints and develop agents of opposition within the competitor's camp.

    The result is that client-server architectures have promulgated more stovepipes than the mainframes ever created. These organizations must maintain a confusing array of skills and capabilities. They must manage huge deployment issues. By maintaining multiple tools with similar objectives, be they client-server GUI tools, MQEs, reporting tools, or especially CASE tools, these organizations miss the opportunity to promote mature development cycles in their applications deployment.

    For instance, adopting database independence can weaken the capabilities of an organization. Databases are no longer merely repositories of data, accessible by standardized Structured Query Language (SQL) and data manipulation language (DML). Databases provide server-side components that have been optimized to perform in a particular environment. These tools include job queues, alerters, and pipes. By now, database designers should be very familiar with these tools. The result of database independence is freedom from applications that scale. My text will show how these components are indispensable for the generation of a quality environment.

    With the advent of very large databases (VLDB) and parallel technologies, these non-generic features are required to create world-class, viable data warehouse solutions. By world-class, I am referring to an environment that supports the broad sweeping, strategic decisions of a large enterprise.

    In this text I have taken the position that the advanced visualization and modeling capabilities of object-oriented analysis are required to articulate how the components of the quality data warehouse should work together. I use the Unified Modeling Language (UML) to detail the steps of the DWM. Not only are deliverable artifacts produced with UML, but the interactions of the project team are also described as discrete services that the development environment should provide to complete a methodology step. All system designers would benefit from adopting the analytical techniques of OO concepts, including distributed objects, components, and function points. OO techniques are well suited to designing the separation of responsibilities in computer architectures between the operational systems and the data warehouse. Current OO techniques of analysis are applied to the requirements of the data warehouse. For instance, from a business analysis perspective, the economic fact of a customer purchasing an item is not related to the transactions that created the purchase. The transactions are only an atomic element of the economic fact. Within the object-oriented data warehouse methods that are presented here, techniques are developed that separate and assign responsibilities for the evaluation of an organization's data among the multiple tiers of the organization's environment.

    To provide strategic views of the data in our current environment, there is a broad and diverse set of transactional systems that must be interfaced with the data warehouse. For example, in a 7x24 transactional support system, extracts of the data often cannot be directly queried from the system. Often, the design of data acquisition must accommodate a lack of computing resources.

    The concept of "wrappering" legacy systems to provide methods to send transactions and receive business data is a potential integration tool; however, the business-oriented strategist is only interested in an aggregated subset of the data. Strategic data is mostly digested and processed. Even the finest granularity of analytical data requires summarization. For instance, the daily sale of a product by store requires summarization along the store and product dimensions. This digest is best housed by the data warehouse technique. Creating systems that would support on-demand queries of production data would be inefficient in terms of both input/output and processing. The data warehouse method has evolved to suggest that new transactional systems include methods that post useful, pre-summarized results to multidimensional data structures. Object-oriented analysis can design these components. Agents, brokers, and managers are visualized and designed. The analysis optimizes the data warehouse environment into a multi-tiered environment.
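
    The summarization step mentioned above can be sketched in a few lines of Python. This is only an illustration under assumed names: the transaction fields (day, store, product, quantity, unit_price) and the two helper functions are hypothetical and are not taken from the book or its CD. Individual transactions are first digested into daily sales by store and product, and that digest is then rolled up along whichever dimensions a strategic query keeps.

```python
from collections import defaultdict

# Order of the dimensions in each key of the daily summary (assumed layout).
DIMENSIONS = ("day", "store", "product")

def summarize_daily_sales(transactions):
    """Digest raw transactions into sales totals keyed by (day, store, product)."""
    totals = defaultdict(float)
    for t in transactions:
        key = tuple(t[d] for d in DIMENSIONS)
        totals[key] += t["quantity"] * t["unit_price"]
    return dict(totals)

def roll_up(daily, keep):
    """Summarize the daily digest further, retaining only the dimensions in `keep`."""
    positions = [DIMENSIONS.index(d) for d in keep]
    totals = defaultdict(float)
    for key, amount in daily.items():
        totals[tuple(key[p] for p in positions)] += amount
    return dict(totals)

# Hypothetical operational transactions.
transactions = [
    {"day": "1998-12-01", "store": "A", "product": "widget", "quantity": 2, "unit_price": 2.5},
    {"day": "1998-12-01", "store": "A", "product": "widget", "quantity": 1, "unit_price": 2.5},
    {"day": "1998-12-01", "store": "B", "product": "widget", "quantity": 4, "unit_price": 2.5},
    {"day": "1998-12-02", "store": "A", "product": "gadget", "quantity": 1, "unit_price": 4.0},
]

daily = summarize_daily_sales(transactions)
by_day_product = roll_up(daily, ("day", "product"))
```

    The analyst querying by_day_product never touches the individual transactions, which is the point of posting pre-summarized results rather than querying production data on demand.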

    For both the object-oriented and object-relational advocates, the year 2000 issue has breathed new life into legacy database systems, including IDMS, CODASYL, and others. With a more "request broker" oriented approach to data feeds into the data warehouse, gateways could be used to automate the process.

    In the final deployment of the data warehouse project, database or operating system independence is a fallacy. As this text will detail, at some point the IT team must choose an infrastructure and become dependent upon its choices. Software companies have invested hundreds of millions of dollars in the research and development of commercial off-the-shelf (COTS) tools for the user and development community. The "vendor-independence" movement has sometimes caused managers and developers to resist or avoid using such high-quality products as a query environment. With the broad number of databases and systems that COTS products interface with comes increasing flexibility in integrating a large-scale environment. The data warehouse manager can maximize the quality of a data warehouse project through the management of these design choices.

    A stable environment, designed from state-of-the-art components, is mandatory to accomplish the goals of the organization building the data warehouse. I feel that the most important elements of a particular development environment are not vendor independence, but its features and characteristics. In VLDBs, how scalable is the technology? Is it efficient? Does it support a methodology? Has it been regression tested? What are the deployment issues? Is it open and extensible? Does it support multi-processing environments? Most importantly, does the environment support repository management and reuse? Finally, the environment must support the business objectives of the organization, and it must be capable of maturing along with the users and developers.

    I will attempt to present the construction of the quality data warehouse environment in a full life-cycle manner. In this text, I hope to detail the most important aspects of methodology, administration, and performance management. The manager or planner will be able to read the technical portions of this text and understand the breadth of issues in this environment.

    This text is divided into 11 chapters:

    Chapter 1, The Data Warehouse Method. In this chapter, I present the Data Warehouse Method (DWM) and the repeating processes that create the quality data warehouse. I also present the concept of the Multidimensional Business Model. My two objectives in this chapter are to present the themes of the book and to construct my taxonomy for data warehouse development objects and processes. Chapter 1 also presents the concepts of the Integrated Data Warehouse Support Environment (IDSE). The chapter covers an overview of the DWM team, the characteristics of the flow of data among the workstations, and the use and integration of CASE tools and semantic repositories. These themes are threaded into the more general topics of data warehouse methodology and the development of technical architectures. Finally, I describe how data warehouse administration and performance management work together.

    Chapter 2, Managed Query Environment Tools. In this chapter I describe how managers and decision makers use the MQE to decode the business perception developed by the multidimensional business model. The aim of this chapter is to describe the characteristics of the quality data warehouse from the user's perspective. I describe how the environment develops a clear vision and understanding of the business area. The chapter describes the characteristics of the user's interaction with the MQE. These interactions prepare graphical presentations that effectively encode business events against dimensions within the multidimensional business model. I describe standard MQE capabilities including slice and dice and drill methods. The chapter concludes with a short description of the characteristics and application of data mining tools.

    Chapter 3, Methodology. Here, I present a detailed vision of how the disciplines of quality, team management, process engineering, and function points are intertwined and directed through methodology. These concepts are the heart of this text. In the sections that describe strategy, analysis, design, deployment, and discovery, I will develop the steps, components, and capabilities necessary to create a successful data warehouse project.

    Using CASE tools and the semantic repository, I describe how to integrate development tools into elements of the work flow and how to present methodology output using the Unified Modeling Language (UML).

    The chapter presents a simplified function point methodology for estimating the size of the selected project. I describe how designers can move from data modeling to dimension and event modeling. Finally, the text will describe how to develop the IDSE project plan and improve existing projects and teams.

    Chapter 4, Strategy. This chapter describes the strategy phase of the Data Warehouse Method. Strategy initiates, defines, and controls the direction of a data warehouse. The chapter presents a method for achieving business objectives through strategic planning. The text details the specific differences between OLAP and OLTP development requirements. This understanding is pivotal to a successful implementation.

    Chapter 5, Analysis. In this chapter I discuss the analysis phase of the Data Warehouse Method. In the analysis phase, the multidimensional model is more completely detailed and verified. The capabilities of the model are sharply defined through a process of multidimensional analysis. Through business rules and data harmonization the event spaces are rigorously detailed. During analysis, the business semantics gathered in the strategy phase are critically evaluated. Business rules define the data feeds from operational systems that are included in the data acquisition packages. Short term rapid prototypes are prepared for the MQE.

    Chapter 6, Design. In this chapter I describe the design phase of the Data Warehouse Method. The data and process models are generated from the logical model prototypes. The design phase finalizes the general kinds of questions posed of the data warehouse. The logical model is then converted into physical data structures. With the physical data structures in place, limited volume testing is performed to avoid costly mistakes in succeeding phases. The team works to design and prototype the data acquisition modules, which provide the necessary connectivity or collection feed from data sources.

    Chapter 7, Construction. Here I show the final steps of the Data Warehouse Method. In the construction phase, the operational data warehouse is constructed. Data acquisition modules and other programs are developed and tested. The architecture (including the database) is tuned to handle larger data volumes than were tested in the prior phase. For the large data warehouse to run most efficiently, all of the components should be tightly integrated with a management module.

    Chapter 8, The Quality DW Technical Architecture. In chapter 8, I present a methodology for developing a data warehouse architecture. I describe modern architectural IT components including parallel technologies, client server architectures, object oriented programming paradigms, and how to present a logical architecture using the Unified Modeling Language (UML).

    Integration steps are also enumerated for the IDSE components, including CASE, semantic repositories, systems administration tools, and MQEs. Additionally described are the specific attributes of the physical architecture, including parallel architectures and server configurations. The text includes information on designing specific CPU requirements, memory and I/O subsystem requirements, and redundant array of inexpensive disks (RAID) storage. Finally, the chapter describes middleware and its interaction with the network.

    Chapter 9, Administration of the IDSE. Chapter 9 details the technical aspects of server administration in the IDSE. The objective is to describe the mission of the database administrator. The chapter describes specific administration and tuning requirements for databases, including control, tuning, parallelizing operations, and managing I/O and memory. Also described are partitions in the database and how they fit into a data warehouse environment. The text uses this new capability as a way of achieving greater parallelism in the data warehouse. Chapter 9 also describes process management in the IDSE. Finally, performance management is described in terms of parallelizing operations and managing I/O and memory.

    Chapter 10, Managing Dimension and Event Data. In chapter 10, a specific example of integrating the CASE, semantic, and MQE environments is presented. Using an Oracle RDBMS environment with a Business Objects MQE, the text describes a technique for managing the multidimensional model's degrees of freedom with aggregates. Because it describes dimensions and their distributions, the chapter is of general interest. Also described are the solution architecture, availability, and the aggregate management process. Chapter 11 presents specific code for correlating the CASE repository with the semantic repository.

    Chapter 11, Conclusion. At the conclusion of the book, I present a vision of how the data warehouse environment fits into the future of all information technology environments. I describe the concept of the business knowledge repository as an aggregate of the operation of an enterprise. The text describes a specific instance of a business knowledge repository developed for an integrated transactional/data warehouse environment.

    Suggested Reading Patterns. To master the concepts of this text, one should proceed through the entire text. Designers and analysts should complete the text. However, executives and non-technical managers can read a subset of the text and achieve a grasp of the concepts. Non-technical managers should read Chapters 1 through 8 and the conclusion. Executives should read Chapters 1, 2, 3, and 4 and the conclusion. For the executive, Chapter 4 describes how a project cycle of the DWM can be used to create an impact on the bottom line. Quality personnel should read Chapters 1, 2, and 3, the introductions of Chapters 4 and 7, the conclusion, and the appendix. Today, there is a great deal of confusion about components of architecture, including RAID and parallel servers, how they should work together, and when they should be used in the data warehouse. I have presented clear explanations of how these should be used so that all may understand.

    Through a contemplation of the evolution of the CASE, semantic, and administration repositories, I hope that the reader can visualize the directions of new methodologies emerging from these increasing abstractions to the construction of an IT infrastructure. The data warehouse method has created the final corner of what will become the business knowledge method. In a closed loop, this method, in combination with the data warehouse method, will have the ability to deploy new systems that marry existing business models with the fine tunings of a data warehouse analysis. These elements of the new operational systems will enable a zero cost deployment of new business rules.

    Customer Reviews & Comments
    Sorry folks -- this book just didn't work for me. I'm pretty technically savvy (BSEE, MCSD), but I just couldn't get much out of this tome. The language is stunningly obscure to me (read the authors' notes above, then imagine if they got "scholarly"), and the meat just wasn't there for me and my little three-person warehousing project. Adding to the difficulty -- there must be at least one typo per paragraph, which I find extremely distracting. This is the only book I've ever sent back.


    Copyright © 2008, Dominant Systems Corporation