ZDNet UK



2D Conditional Random Fields for Web Information Extraction

Publisher: Microsoft
Format: 221.8KB PDF
Date added: 30 May 2005
Topics: Knowledge and Data Management, Web Content Management, Data Acquisition - ETL
Downloads: 261

The Web contains an abundance of useful semi-structured information about real-world objects, and an empirical study shows that Web information about objects of the same type exhibits strong sequence characteristics across different Web sites. Conditional Random Fields (CRFs) are the state-of-the-art approach for exploiting these sequence characteristics to improve labeling. However, because the information on a Web page is laid out in two dimensions, linear-chain CRFs have limitations for Web information extraction. To better incorporate two-dimensional neighborhood interactions, this paper presents a two-dimensional CRF model that automatically extracts object information from the Web. The paper empirically compares the proposed model with existing linear-chain CRF models on product information extraction, and the results show the effectiveness of the proposed model.
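
The contrast between a linear chain and a two-dimensional neighborhood can be illustrated with a small sketch. This is not the paper's model: it builds only the graph structure (which elements are treated as neighbors) and contains no potentials, training, or inference. The function names and the assumption that page elements sit on a regular rows-by-columns grid are hypothetical and chosen for illustration only.

```python
from typing import List, Tuple

Node = Tuple[int, int]   # (row, column) position of a page element
Edge = Tuple[Node, Node]


def linear_chain_edges(rows: int, cols: int) -> List[Edge]:
    """Neighbor pairs when the grid is flattened row by row into one sequence."""
    order = [(r, c) for r in range(rows) for c in range(cols)]
    # Each element is linked only to the next element in reading order.
    return list(zip(order, order[1:]))


def grid_edges(rows: int, cols: int) -> List[Edge]:
    """Neighbor pairs between horizontally and vertically adjacent elements."""
    edges: List[Edge] = []
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols:            # neighbor to the right
                edges.append(((r, c), (r, c + 1)))
            if r + 1 < rows:            # neighbor below
                edges.append(((r, c), (r + 1, c)))
    return edges


if __name__ == "__main__":
    chain = linear_chain_edges(2, 3)
    grid = grid_edges(2, 3)
    # The element directly below (0, 0) is a neighbor only in the grid.
    print(((0, 0), (1, 0)) in chain, ((0, 0), (1, 0)) in grid)
```

In the flattened sequence, an element and the one directly beneath it are not neighbors, so a model defined on that chain cannot directly express their interaction; the grid structure keeps those vertical links.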


