ZDNet UK



Improving Web Spam Classification Using Rank-Time Features

Publisher: Association for Computing Machinery
Format: 136.6KB PDF
Date added: 08 May 2007
Topics: Spam - E-mail Fraud - Phishing
Downloads: 23

This paper studies the classification of web spam. Web spam refers to pages that use techniques to mislead search engines into assigning them a higher rank, thus increasing their site traffic. The contributions are twofold. First, the paper finds that the method of dataset construction is crucial for accurate spam classification, and it notes that this problem occurs generally in learning problems and can be hard to detect. In particular, the paper finds that ensuring no overlapping domains between test and training sets is necessary to accurately test a web spam classifier: measured precision can differ by as much as 40% when non-domain-separated data is used. Second, the paper shows that rank-time features can improve the performance of a web spam classifier.
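
The domain-separation requirement described in the abstract can be illustrated with a minimal sketch. This is not the paper's code: the function names, the example URLs, the labels, and the choice to treat a URL's host name as its "domain" are all assumptions made for illustration. The idea is to group labelled pages by domain first and then assign whole domains, rather than individual pages, to the training or test set.

```python
import random
from collections import defaultdict
from urllib.parse import urlsplit


def domain_of(url):
    """Return the lowercased host name of a URL.

    Assumption: the host name stands in for the 'domain'; the paper may
    define domains differently (for example, by registered domain).
    """
    return (urlsplit(url).hostname or "").lower()


def domain_separated_split(examples, test_fraction=0.25, seed=0):
    """Split (url, label) pairs so that no domain appears in both sets."""
    by_domain = defaultdict(list)
    for url, label in examples:
        by_domain[domain_of(url)].append((url, label))

    # Shuffle whole domains, not individual pages, then carve off the test set.
    domains = sorted(by_domain)
    random.Random(seed).shuffle(domains)
    n_test = max(1, round(len(domains) * test_fraction))
    test_domains = set(domains[:n_test])

    train = [ex for d in domains if d not in test_domains for ex in by_domain[d]]
    test = [ex for d in domains if d in test_domains for ex in by_domain[d]]
    return train, test


# Hypothetical labelled pages: 1 = spam, 0 = not spam.
pages = [
    ("http://a.example/1", 1), ("http://a.example/2", 1),
    ("http://b.example/1", 0), ("http://b.example/2", 0),
    ("http://c.example/1", 1), ("http://d.example/1", 0),
]
train, test = domain_separated_split(pages)
train_domains = {domain_of(u) for u, _ in train}
test_domains = {domain_of(u) for u, _ in test}
```

A plain random split over pages would likely place pages from the same domain on both sides, which is the situation the abstract warns can distort measured precision.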
