
PDFs and SEO

Posted June 8th, 2009 by Charles Beatley

I just watched the webinar HubSpot recorded on SEO and I had a question. Somebody had asked whether it was better to have long pages with a lot of content or to use additional pages to increase search engine rank. My question is about PDFs: if a website has several pages with links to PDFs on its servers, would those PDFs count as additional pages for search engines? I guess the bigger question is whether search engines can search through the content of a PDF or just the PDF file name. I'm guessing search engines can't read them, just as they can't read images.

I would create an HTML alternative


Jon Bishop 2 years 49 weeks 4 days 16 hours ago

Just to expand a bit on what Mike had to say.

When you are composing the document that will become your PDF, make sure you use the appropriate formatting in whatever editor you are using. For example, use heading styles, bold and italics in Microsoft Word.

Also, as mentioned, you can control the metadata; however, it doesn't seem like search engines know what to do with it... yet.

I usually advise people to create an HTML version of their content because it is both easier to consume by readers and easier to index by search engines.

Yes, PDFs Are Searchable, But Not Optimal at All


Mike Volpe 2 years 49 weeks 4 days 17 hours ago

PDFs are searchable and indexable by most search engines, depending on how they are produced. The text needs to be in the file as actual text (a font), not as a scanned image. If you take a Word doc and save it as a PDF, you are most likely producing the right kind of PDF.

To check and see if your PDFs are indexed by Google, search in Google for "site:yourwebsite.com filetype:pdf" (example, I would search for "site:www.hubspot.com filetype:pdf").  The PDFs that Google has found will show up there.
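If you have several sites to check, the query above can be built in a few lines of Python. This is only a rough sketch: the function names are made up for illustration, and the search URL form (https://www.google.com/search?q=...) is an assumption about how Google's public search page takes its query, not an official API.

```python
from urllib.parse import urlencode


def pdf_index_query(domain: str) -> str:
    """Build the 'site:... filetype:pdf' query for one domain."""
    return f"site:{domain} filetype:pdf"


def google_search_url(query: str) -> str:
    """Turn a query into a search URL you can paste into a browser.

    Assumption: the public https://www.google.com/search?q=... form.
    """
    return "https://www.google.com/search?" + urlencode({"q": query})


# Example using the domain from the post above.
query = pdf_index_query("www.hubspot.com")
url = google_search_url(query)
print(query)
print(url)
```

Open the printed URL in a browser and the PDFs Google has indexed for that domain should be listed, the same as typing the query by hand.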

HOWEVER... PDF is not an optimal format for SEO. HTML web pages are MUCH better: you can control the metadata, and they are much more likely to get links. So, I would strongly advise you to have all the content in your PDFs also on web pages. Maybe take a whitepaper and break it up into 5 articles and publish those on your blog, etc.

It is OK to also have the content in PDF.  There is no duplicate content penalty for PDFs having the same content as HTML web pages.  And sometimes people prefer to download a PDF for later or for printing, so it can be a good format.  But to attract new people to your company through search / SEO, using PDF alone is like running a marathon with weights tied around your ankles.  It's possible, but unnecessarily difficult.

 

 

Optimizing PDF docs


Bernie Borges 2 years 49 weeks 3 days 23 hours ago

Mike's advice is spot on. Once upon a time PDF docs were not indexable by search engines. But about a year ago Adobe came out with new metadata features in their professional version. These allow the author to make a PDF document indexable. That's how you find PDF docs in Google search results.

That said, I agree with Mike that it's best to create a web page version of your PDF document. That may not always be possible, so at least make sure you're using the metadata features in Adobe to make your PDF docs search engine friendly.

And don't forget to link from the PDF back to your web pages, using relevant keywords as the anchor text!

http://twitter.com/berniebay

But - is there a way to control the preview/serp text?


Jack Leblond 2 years 49 weeks 4 days 17 hours ago

Mike - good tip with the "filetype:pdf". I found several very old PDFs indexed that will need to be removed.

In scanning the listings, though, I wondered if there is a way to control what Google uses as the document description, or does it just grab the first 160 or so characters of text?
