| Field | Value |
|---|---|
| Publisher | Microsoft |
| Format | 414.5 KB PDF |
| Date added | 03 Oct 2004 |
| Topics | Data Sharing and Integration, Programming Languages, Data Mining - Analysis |
| Downloads | 23 |
Recent work in data integration has shown the importance of statistical information about the coverage and overlap of sources for efficient query processing. Despite this recognition, there are no effective approaches for learning the needed statistics. The key challenge in learning such statistics is keeping their number low enough that the storage and learning costs remain manageable; naive approaches can become infeasible very quickly. This paper presents a set of connected techniques that estimate the coverage and overlap statistics while keeping the number of needed statistics tightly under control. The approach uses a hierarchical classification of the queries, together with threshold-based variants of familiar data mining techniques, to dynamically decide the level of resolution at which to learn the statistics.
Related white papers
Software Engineering Today - Best Practices & Patterns
This is the final webcast in the 15-part series "Modern Software Development in .NET Using Visual Basic". Developers shouldn't miss this opportunity to examine the following topics with renowned...
Market-Leading Data-Modeling Tools: Research Report from the Burton Group
The Burton Group provides an in-depth research report on Market-Leading Data-Modeling Tools. According to their research, basic data modeling tools have become commoditized - basic features are yesterday's...
The Converging Paths of SQL Server and SharePoint - Don't Wait Until It's Too Late!
SharePoint and SQL Server have much in common, and understanding their similarities will help you streamline your day-to-day tasks and work more efficiently. Do you know what those...
Supporting Employees Anytime, Anywhere
New business demands require a new approach to end-user support. This is leading organizations to a remote service delivery model that leverages the Web and SaaS technology.
The Pursuit of a Standardized Solution for Secure Enterprise RBAC
Each RBAC implementation varies in its capabilities and method of management. In a multi-platform environment, these differences introduce higher administration hours and costs because the various RBAC models are not...
Combining the Power of Rhapsody Model-Driven Development, UML and Hitex Tools to Streamline the Development of 8, 16, and 32 Bit Applications
Studies have shown that software is now the main bottleneck for most embedded systems projects. According to Embedded Market Forecasters, 56% of all embedded designs are behind schedule, and software...
Massive But Agile: Best Practices for Scaling the Next-Generation Enterprise Data Warehouse - Forrester Report
Information and knowledge management (I&KM) professionals continue to expand the scale, scope, and deployment roles for their enterprise data warehouse (EDW) investments. Information managers are adopting EDW best practices that...



