Document Solutions - Getting the most out of PDF

PDF News and Tips
March, 2005

Google lets your PDFs look their worst!

Doubtless, most readers of this newsletter use Google regularly. Many of us don't let an hour go by without at least one Google search.

Almost every time I do this, lots of PDF files come up in the search results. And why not? Google indexes PDF files, and they represent a large portion of the content that's actually used online, day in and day out.

However, most of the PDF listings in Google's results look like HELL! The blue text displayed by Google (see the image to the right) in search results comes directly from the PDF Document Information field "Title". This vital information is either absent, bogus or malformed in a HUGE proportion of online PDF files.

On ten recent, more or less random searches, Google's results included an average of 4.3 PDF files on the first results page of each search. Of those PDF files, an average of 60% were displayed with TOTALLY meaningless Titles.

The result of this apparent inability or unwillingness of content managers to quality-control the metadata contained in the PDF files they post online is simple: user frustration.

Without Title metadata, search engines tend to rely on the first text (or gibberish) they encounter to "stand in" as the Title for the purposes of Search Results. Worse, many authoring applications leave nonsense information in the Document Information fields that just looks flat-out unprofessional online.

Key Take Away: Check each PDF's "Description" (in Document Properties) before posting!
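For sites with many PDFs, part of this check could be automated. The sketch below is one possible approach, not a definitive test: it takes a Title string that has already been read from a PDF (by hand from Document Properties, or with whatever PDF library you prefer) and flags the kinds of problems described above. The function name title_problem, the placeholder list, the authoring-tool prefixes and the file-extension pattern are all assumptions made for illustration; adjust them to the junk you actually find on your own site.

```python
import re

# Hypothetical list of placeholder titles that tools sometimes leave behind.
PLACEHOLDER_TITLES = {"untitled", "untitled document", "document", "title", "no title"}

# Illustrative prefixes that some authoring applications put before the file name.
TOOL_PREFIXES = ("microsoft word - ", "microsoft powerpoint - ", "microsoft excel - ")

# A Title that ends in a file extension is probably a leftover file name.
FILENAME_PATTERN = re.compile(
    r"\.(docx?|pptx?|xlsx?|indd|qxd|pdf|ps|eps|rtf|txt)$", re.IGNORECASE
)


def title_problem(title):
    """Return a short description of what looks wrong with a Title, or None."""
    cleaned = (title or "").strip()
    if not cleaned:
        return "missing"
    lowered = cleaned.lower()
    if lowered in PLACEHOLDER_TITLES:
        return "placeholder"
    if lowered.startswith(TOOL_PREFIXES):
        return "authoring-tool prefix"
    if FILENAME_PATTERN.search(cleaned):
        return "looks like a file name"
    if not any(ch.isalpha() for ch in cleaned):
        return "no readable words"
    return None


if __name__ == "__main__":
    # Sample Titles, invented for this example.
    for sample in ["", "Microsoft Word - report_v3.doc", "2004 Annual Report"]:
        print(repr(sample), "->", title_problem(sample) or "ok")
```

A Title that passes these checks can still be wrong or unhelpful, so a quick look by a human before posting remains the real safeguard.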

See what the PDFs on IBM's web site look like to Google. Then try your own site!

Users prefer PDF over HTML for scholarly articles

Annual Reviews, of Palo Alto, California, has allowed Document Solutions to publish results of an internal study, showing the distribution of article downloads in 2004 by file-type.

This data does not include downloads of "legacy" PDF files: materials published before 1996 and available to Annual Reviews subscribers only in PDF form. If the legacy content is included, total PDF downloads increase by over 30%. The data DOES exclude downloads from Google and other search engines.

We clearly see a strong user preference for PDF over HTML, especially in the biomedical sciences. However, the preference for PDF appears to cross all disciplines, strongly implying that users prefer to go direct to the PDF rather than "browsing" the HTML version before accessing the PDF.

These findings are all the more surprising when one considers that it is HTML files that usually contain interactive features such as hyperlinks and other features designed to add utility to the document. More page-requests might be expected from HTML as a result. 2004 downloads of Annual Reviews titles published in 2003 (not shown in the chart) do show a modest gain for HTML vs. PDF usage.

It remains unclear whether users consider the features and seamlessness of HTML browsing to outweigh the print-readiness, reliability and familiarity of PDF. Indeed, with Reader 6.0 and now with the latest 7.0, online PDF usage has become increasingly seamless with the web-browsing experience. With the addition of enhanced capabilities for PDF files including embedded hyperlinks, 3D objects and movies, there's little reason to believe the trend will not continue.

Visit Annual Reviews at annualreviews.org.

The Many Uses of CD and DVD-ROMs

Even though discs have enjoyed wide usage in direct mail, product and software delivery, many publishers consider them passé, a technology that had its day before the flowering of the internet.

Until recently, however, few publishers have found the Web a major source of profits. Meanwhile, publishers who added CD or DVD products to their offerings, or leveraged their marketplace and physical distribution potential on behalf of their advertisers, have found new revenues AND attractive opportunities to leverage their online initiatives.

For many publications, a "backstart disc" - the most recent five or ten years of publication on a single CD ROM - is an easy upsell or promotion for new subscribers, a potent premium for securing long term renewals, and a valuable distribution medium for advertisers, often all at the same time.

For magazines with "evergreen" content, historical collections may be a superb premium ancillary product. One Document Solutions client sells $500,000 worth of CD-ROMs year after year, with largely the same content.

Document Solutions specializes in all-PDF discs, believing that the stability, reliability, cross-platform usability, flexibility and cost-effectiveness of PDF make it an ideal choice for a wide variety of electronic publishing and content distribution applications.

Learn more about DSI's CD and DVD-ROM development services.

Copyright © 2008, Document Solutions, Inc.