| Field | Value |
|---|---|
| Publisher | Microsoft |
| Format | 120.7 KB PDF |
| Date added | 26 May 2006 |
| Topics | Knowledge and Data Management, Parallel Processing, Data Mining - Analysis |
| Downloads | 101 |

This paper presents a new web mining scheme for parallel data acquisition. Each web page is represented as a tree based on the Document Object Model (DOM). A DOM tree alignment model is then proposed to identify translationally equivalent texts and hyperlinks between two parallel DOM trees. By following the identified parallel hyperlinks, parallel web documents are mined recursively. Benchmarks show that, compared with previous mining schemes, the new scheme improves mining coverage, reduces mining bandwidth, and enhances the quality of the mined parallel sentences.
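
The idea of aligning hyperlinks across two parallel pages and then following them recursively can be sketched roughly as below. This is only an illustration, not the paper's alignment model: it assumes that two links are parallel when they sit at the same DOM path in both pages, which is a much cruder criterion than a learned alignment. The names `LinkPathParser`, `extract_links`, `align_links`, `mine`, and the `fetch` callable are all hypothetical.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not be pushed on the stack.
VOID_TAGS = {"br", "img", "hr", "meta", "link", "input"}


class LinkPathParser(HTMLParser):
    """Collects a (DOM path, href) pair for every <a href=...> element."""

    def __init__(self):
        super().__init__()
        self.stack = []   # tags currently open, root first
        self.links = []   # (path, href) in document order

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        self.stack.append(tag)
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(("/".join(self.stack), href))

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, tolerating unclosed elements.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass


def extract_links(html):
    parser = LinkPathParser()
    parser.feed(html)
    return parser.links


def align_links(html_a, html_b):
    """Pair links that share a DOM path, in document order (assumed criterion)."""
    links_b = extract_links(html_b)
    used = set()
    pairs = []
    for path_a, href_a in extract_links(html_a):
        for i, (path_b, href_b) in enumerate(links_b):
            if i not in used and path_a == path_b:
                used.add(i)
                pairs.append((href_a, href_b))
                break
    return pairs


def mine(url_a, url_b, fetch, seen=None, depth=2):
    """Recursively follow aligned link pairs; fetch(url) must return HTML text."""
    seen = set() if seen is None else seen
    if depth < 0 or (url_a, url_b) in seen:
        return seen
    seen.add((url_a, url_b))
    for next_a, next_b in align_links(fetch(url_a), fetch(url_b)):
        mine(next_a, next_b, fetch, seen, depth - 1)
    return seen
```

Calling `mine("en/index", "zh/index", fetch)` with a `fetch` function that returns page HTML as a string would yield the set of page pairs reached by following aligned links. Only pages reachable through paired links are ever fetched, which is the intuition behind the bandwidth reduction the abstract mentions.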
Related white papers
The Journey Along an Information-Led Transformation
A shift is underway from simple automation to business optimization, and information is at the center of it. Information, when aligned with your business strategy, holds the key to driving profitable...
The new information agenda: Do you have one?
The lack of trusted information (information that is accurate, timely and relevant) is on the minds of CEOs and senior executives around the world. A paradigm shift from siloed...
Best Practices for Translating Customer Satisfaction into Revenue
Today's support organisations are focused on two top-level metrics: financial results and customer satisfaction. For most, it's easy to track financial performance, but customer satisfaction is akin to speaking a...
Support Strategies: Customer Experience Management
Customer experience is the most powerful tool available today for distinguishing your company from competitors: each contact with the customer offers an opportunity for strengthening your relationships by delivering...
3 Strategies for Reducing IT Support Costs
As companies brace for more bumps in the economic downturn, many organisations are indiscriminately cutting costs. To ensure a seamless transition into the post-recession market, however, slashing and burning is...
Forrester Strategies for Assessing IT Business Satisfaction
If you aren't assessing customer satisfaction you are overlooking a potential goldmine. This valuable data is crucial to creating a successful IT strategy. But where do you start? This new...
Realising the benefits of going green
Cross over to greener communications with improved data accuracy. Many organisations have processes in place to improve the quality of their contact data to address business drivers, such as cost reduction...



