The Joining Dots Blog

Studying what happens when people, information and technology collide.
Exploring the possibilities. This blog is for sharing news, links and observations.

09 May 2007

SharePoint 2007 and Adobe PDF

[Entire Post reviewed and updated: 17/03/09 to include infrastructure update, 64-bit and farm installation notes]

If you are deploying Microsoft Office SharePoint Server (MOSS) 2007, there is one piece of bespoke configuration you will likely still want, even for a vanilla deployment (i.e. no bespoke development, using only out-of-the-box features) - the ability to index and search for Adobe PDF files.

There is a great post over on SharePoint blogs, written by S.S.Ahmed, detailing how to add PDF support to your MOSS box. However, Adobe have made some changes to their use of iFilters and there is now a shorter and easier way that doesn't require registry edits or resetting your web server.

The following process works on my demo build for Microsoft Office SharePoint Server 2007 and has been repeated for numerous clients with a variety of SharePoint deployments. Default names and file locations have been used here - e.g. SharedServices1. I named the PDF icon 'pdficon.gif'. If you have used different names and locations, substitute as necessary.

  1. Download and install Adobe Acrobat Reader 7 or later on the server to be used for indexing. (Note: From version 7 onwards, the reader includes the iFilter by default; previously you had to install the iFilter separately.) If you have a 64-bit environment, you will need to download the 64-bit version of the filter. See the notes at the end of these instructions.
  2. Download the PDF icon (select 'small 17 x 17') from http://www.adobe.com/misc/linking.html
    1. Give the icon a name (I use pdficon.gif)
    2. Save the icon in c:\Program Files\Common Files\Microsoft Shared\web server extensions\12\TEMPLATE\IMAGES
  3. Edit the Docicon.xml file to include the PDF icon
    1. Navigate to c:\Program Files\Common Files\Microsoft Shared\web server extensions\12\TEMPLATE\XML
    2. Open the DOCICON.XML file in Notepad (or an XML editor). You should see that the file has two main tags - ByProgID and ByExtension
    3. Within the ByExtension tag, add an entry for the PDF icon: <Mapping Key="pdf" Value="pdficon.gif" />
    4. Save and close the file
  4. Stop and restart Internet Information Services (IIS): open a command line (Start - Run - and enter 'cmd') and type 'iisreset'. Note: This will temporarily take SharePoint offline.
  5. Repeat steps 2 to 4 on every SharePoint web front-end server (Note: You only need to install the iFilter on the indexing server, but the icon file and the DOCICON.XML entry need to be added to all web front-ends for the icon to appear)
  6. Add the PDF file type to your search index (note that this has to be completed for each index, i.e. each Shared Service)
    1. Open your Search Settings: SharePoint 3.0 Central Administration - SharedServices1 - Search Settings (Note: If you have installed the Infrastructure Update, you will see the option in the side bar under both Search Settings and the new Search Administration dashboard)
    2. Select File Types
    3. Click Add File Type
    4. Enter pdf in the text box (labelled File extension) and click OK
    5. Check that the pdf file type is listed and has the PDF icon showing next to it. If the icon isn't showing, something has gone wrong. Review all of the above steps. The most common problems are spelling mistakes in the DOCICON.XML file and not repeating the process on all SharePoint web front-ends.
  7. Perform a full crawl of your content sources - PDFs will not show in search results until a full crawl has completed and indexed the PDFs using the iFilter you just installed.
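
If you have several web front-ends to update, the DOCICON.XML edit in step 3 can be scripted. The following is a minimal Python sketch rather than part of the process I actually use: the function name add_pdf_mapping is my own invention, and it assumes ByExtension sits directly under the root element of the file. Back up DOCICON.XML first, because ElementTree discards comments and the XML declaration when it rewrites a document.

```python
import xml.etree.ElementTree as ET


def add_pdf_mapping(docicon_xml: str, icon_name: str = "pdficon.gif") -> str:
    """Return the DOCICON.XML text with a pdf Mapping present under ByExtension.

    Hypothetical helper: assumes ByExtension is a direct child of the root.
    """
    root = ET.fromstring(docicon_xml)
    by_ext = root.find("ByExtension")
    if by_ext is None:
        raise ValueError("No ByExtension element found in DOCICON.XML")

    # Update an existing pdf entry instead of adding a duplicate.
    for mapping in by_ext.findall("Mapping"):
        if mapping.get("Key", "").lower() == "pdf":
            mapping.set("Value", icon_name)
            break
    else:
        # No pdf entry yet, so append one at the end of ByExtension.
        ET.SubElement(by_ext, "Mapping", {"Key": "pdf", "Value": icon_name})

    return ET.tostring(root, encoding="unicode")
```

To use it, read the contents of DOCICON.XML into a string, pass it to add_pdf_mapping, and write the result back to the same file. You still need to run iisreset afterwards.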

That's all there is to it. Check that everything is working by doing a search for a file you know is a PDF document. The document should be listed in the results and should have the PDF icon displayed next to it.
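
If the icon does not appear, the two usual culprits (a misspelt DOCICON.XML entry or a missing icon file) can be checked on each web front-end with a short script. This is a sketch under assumptions, not a supported tool: check_pdf_icon is a hypothetical name, and template_dir is assumed to point at the ...\web server extensions\12\TEMPLATE folder containing the XML and IMAGES subfolders used above.

```python
import os
import xml.etree.ElementTree as ET


def check_pdf_icon(template_dir: str, icon_name: str = "pdficon.gif") -> list:
    """Return a list of problems found; an empty list means nothing obvious is wrong."""
    docicon = os.path.join(template_dir, "XML", "DOCICON.XML")
    try:
        root = ET.parse(docicon).getroot()
    except (OSError, ET.ParseError) as exc:
        # A missing file or malformed XML makes further checks pointless.
        return [f"Cannot read {docicon}: {exc}"]

    problems = []
    by_ext = root.find("ByExtension")
    mappings = [] if by_ext is None else by_ext.findall("Mapping")
    values = [m.get("Value") for m in mappings if m.get("Key", "").lower() == "pdf"]
    if not values:
        problems.append("No pdf Mapping found under ByExtension")
    elif values[0] != icon_name:
        problems.append(f"pdf Mapping points at {values[0]!r}, expected {icon_name!r}")

    # The icon file itself must exist in the IMAGES folder.
    if not os.path.isfile(os.path.join(template_dir, "IMAGES", icon_name)):
        problems.append(f"{icon_name} is missing from the IMAGES folder")
    return problems
```

Run it against the TEMPLATE folder on every web front-end. It only checks the files on disk, so an empty result does not prove that IIS has been reset or that a full crawl has completed.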

If you have installed SharePoint on 64-bit servers, you will need to use the 64-bit filter. It can be downloaded from http://labs.adobe.com/wiki/index.php/PDF_iFilter_8_-_64-bit_Support. Please do not follow the instructions provided by Adobe with the filter. They include modifying the registry (not necessary - you should add file types using SharePoint's administration pages, not by modifying the registry). Also, the icon does not appear to get installed automatically and still needs to be added using the instructions outlined above.

Again, thanks to S.S.Ahmed for writing the original 'how to' that this post is based on.

Technorati tags: SharePoint, SharePoint 2007, MOSS 2007

21 Comments:

Blogger Mark said...

An alternative is the Foxit PDF IFilters - unlike Adobe, they have a 64bit version too.

http://markharrison.co.uk/blog/2007/05/foxit-pdf-ifilter-x64-and-32-bit.htm

12 May, 2007  
Anonymous Anonymous said...

I don't know if this is a typo or an eccentricity with my MOSS server, but I couldn't get the pdf icon to work when I put the .gif file in the folder you listed. I had to put it in c:\Program Files\Common Files\Microsoft Shared\web server extensions\12\TEMPLATE\IMAGES (the difference is the TEMPLATE part).

16 July, 2007  
Blogger Joining Dots said...

You were right first time - it is a typo. Weirdly, I'd managed to get the full string in for the XML folder.

Many thanks for letting me know - the post has been updated.

17 July, 2007  
Blogger SuperSantos said...

I tried to make what you describe here, but the filtering is not working... the only difference with your description is that I used MOSS 2007 with SP1. Do you know if there are bugfixes to use this feature?
Best regards

18 February, 2008  
Anonymous Anonymous said...

If you see internet explorer icon [e] instead of the pdf icon...

It is because we should perform a Full Update on the Search content indexes.

-Open a Command Prompt on the Indexing Server.

-net stop osearch
-net start osearch

-Go to Central Administration, then to the Shared Services Administration Web of the current SSP, go to Search Settings and start a full crawl of all locations containing PDF files.

That should refresh it and the pdficon.gif file will be visible as it should.

[Georgia.]

17 March, 2008  
Anonymous http://mattbeaver2002.spaces.live.com/blog/cns!801AB4157DF28FDA!14442.entry said...

This blog will show you how to use PDF iFilters for x32 & x64 servers and also how to get your document libraries to display a PDF icon.

http://mattbeaver2002.spaces.live.com/blog/cns!801AB4157DF28FDA!14442.entry

21 April, 2008  
Anonymous Anonymous said...

What is the step 5:
Perform a full crawl of your content sources

I have downloaded the icon of the pdf file, added the new file type and added a new line in DOCICON.XML. But the last step is 5 "Perform a full crawl of your content sources", I don't know how to do it. can someone tell me that?

Thanks

12 January, 2009  
Blogger Joining Dots said...

Hi Anon

You need to go into Shared Services | Search Settings. The first item on the list is your content sources. You can go in there and perform a full crawl. If you have a lot of content, it may take some time and impact your network, in which case consider performing the full crawl in off-peak hours.

Hope that helps, let me know if you need more info.

12 January, 2009  
Anonymous Anonymous said...

Thank you Joining Dots,
I’ve downloaded and installed Adobe 9.
Added the file type “pdf” in the Docicon.XML file.
Added the file type under File extension in Search Settings.
Added a new content source and performed a full crawl.

But the search for .pdf files is not working for me. I uploaded 2 documents to Shared Documents.
One is a Word document, John Smith Resume.doc; the other is an OCR Adobe file, John Smith Resume.pdf. They contain the same text. But when I search for the text “University of MD”, I only get the result: “Result 1-1 of 1. your search took 0.37 seconds.” Only the Word document “John Smith Resume.doc” is listed there.

Did I miss something? Can SharePoint 2007 really search the text in a PDF file?

Thanks.

13 January, 2009  
Blogger Joining Dots said...

Now that's interesting. The third instance of PDF indexing not working when installing Adobe 9. I suspect Adobe have changed something and broken the method. Will give it a whirl on my demo server to confirm.

Two easiest fixes - install a copy of Adobe 8 instead. If you can't get a copy of 8, it's back to the old method of extracting and installing the filters. Alternatively, use different PDF filters. Foxit are highly rated and rumoured to perform much better than Adobe too.

13 January, 2009  
Anonymous Anonymous said...

Thank you Joining Dots.
I got Adobe 9 working; Adobe 9 has something missing. Please read this blog:
http://blogs.msdn.com/ifilter/archive/2007/03/29/indexing-pdf-documents-with-adobe-reader-v-8-and-moss-2007.aspx

The post on Friday, December 19, 2008 1:05 PM by Mike Ruhl.

Thanks.

14 January, 2009  
Blogger Bleeding Crimson said...

Thanks for the help. Just wanted to note that I had a little trouble and found my solution. I added the file type first in central admin, and then added the image and edited the xml. This caused the image to not be associated with file type. I went in and deleted the file type and added it back and it worked fine.

Just might want to change your steps up to make sure people don't get confused. Thanks for everything.

18 February, 2009  
Blogger Joining Dots said...

Ah, now that's interesting. I had some odd behaviour with my last build and now wonder if the last updates to SharePoint have made the install order important. In the past, icon first was never a problem...

Am in the process of setting up another prototype. Will follow your install order and, assuming it works, will write an updated version of the post.

Many thanks for your feedback.

18 February, 2009  
Blogger Bleeding Crimson said...

One other question for you. I have my DB (SQL 2005) installed on another box separate from my sharepoint box. Do I need to also install a filter on the sql server? Thanks much.

18 February, 2009  
Blogger Joining Dots said...

You need to install the ifilters on your indexing/search servers

Chances are, the separate SQL DB box is for your content. Shouldn't need to install ifilters there. The indexing process pulls all content across to the indexing server for indexing - that's where the ifilters need to be.

If you've got an architecture diagram showing server layout, send it through on email (details on the contact page - http://www.joiningdots.net/about/contact.htm) and I'll have a look at it.

18 February, 2009  
Blogger Joining Dots said...

Hi Bleeding Crimson

You're not wrong - I should have been clearer with the order. When following it to the letter, I had the same problems you experienced :-)

And an added bonus: this time around I had to do an iisreset before the PDF icon would be recognised in Search Settings.

Will update the blog post shortly. Thanks again for the feedback.

22 February, 2009  
Anonymous Anonymous said...

I've followed these instructions carefully. After I run the full crawl, my PDF documents show up when I do a search, but if I do a search on a word in the PDF it fails. Any thoughts on this? Thanks.

13 April, 2009  
Blogger Jayvardhan Patil (Jay) said...

Currently I'm doing some R&D on the iFilters available, including iFilters for PDF files.

With 2-3 iFilter options on the market, there is a need to evaluate them.

Read the performance evaluation here

http://codeforfuture.com/2009/05/22/finding-best-sharepoint-pdf-ifilter-64-bit/

and please let me know your views if any.

24 May, 2009  
Blogger MindTheGap said...

This also suggests some links to other iFilters you might want, as well as some good instructions and links.
http://zebracube.wordpress.com/2009/06/21/pdf-ifilter-sharepoint/

21 June, 2009  
Blogger sboals said...

Great article. We create searchable PDFs, and get this question all the time.

Steve
PSIGEN SharePoint Imaging

09 November, 2009  
Blogger Joining Dots said...

Thanks for the nice feedback Steve, always appreciated.

10 November, 2009  

