The Liferay File Manager Portlet


The File Manager portlet is part of our document management strategy for Liferay and integrates seamlessly with all content concepts of [j] karef GmbH. Its set of tools lets editors and authors annotate their assets in Liferay with metadata that is retained even after a user has downloaded the document. With suitable software, this metadata can then be used without external interfaces.

Annotating PDFs usually causes particular problems. PDFs have become an integral part of the digital world and are very popular. They reproduce "print" products as facsimiles and let publishers distribute documents whose layout cannot be broken by technical incompatibilities on client computers. A PDF "... feels like an original." This is one of the main reasons for the format's popularity. For search engines, text extraction, or data mining, however, the format's technical characteristics create particular problems.

Machines cannot read PDFs!


A PDF is not really machine-readable. Several search engine providers can extract text from PDFs and include it in their search indexes, but searching for specific chapters or structural elements is almost impossible. This is where the File Manager portlet comes in. File Manager stores information such as title, author, release date, and other metadata both in the Liferay database and in the PDF itself. The metadata travels with the PDF and can be read back by a PDF reader. [j] search automatically recognizes and indexes the formats and their embedded metadata, and ensures that the metadata is displayed in search result lists as well as in various Liferay portlets.
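
To illustrate what "metadata stored in the PDF" can mean in practice, here is a minimal sketch in plain Java. It is not the File Manager's implementation. It assumes the simplest possible case: an uncompressed document information dictionary whose values are plain literal strings such as /Title (Annual Report). Real PDFs may use hex strings, escaped parentheses, compressed object streams, or XMP metadata, none of which this sketch handles. The class name PdfInfoSketch and the method readInfo are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
final class PdfInfoSketch {

    // Matches entries such as "/Title (Annual Report)".
    // Assumption: values are literal strings without nested or escaped parentheses.
    private static final Pattern ENTRY = Pattern.compile(
            "/(Title|Author|Subject|Keywords|CreationDate)\\s*\\(([^()\\\\]*)\\)");

    static Map<String, String> readInfo(byte[] pdfBytes) {
        // ISO-8859-1 maps each byte to exactly one char, so binary sections
        // of the file cannot cause decoding errors.
        String raw = new String(pdfBytes, StandardCharsets.ISO_8859_1);
        Map<String, String> info = new LinkedHashMap<>();
        Matcher m = ENTRY.matcher(raw);
        while (m.find()) {
            // A later occurrence overwrites an earlier one.
            info.put(m.group(1), m.group(2));
        }
        return info;
    }

    public static void main(String[] args) {
        // A tiny hand-written fragment, not a complete PDF.
        byte[] sample = ("%PDF-1.4\n1 0 obj\n"
                + "<< /Title (Annual Report) /Author (Jane Doe) "
                + "/CreationDate (D:20240101120000) >>\nendobj\n")
                .getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(readInfo(sample));
    }
}
```

In a production setting a PDF library would normally do this parsing. The point here is only that the metadata is ordinary data inside the file, so it travels with the document after download.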

PDFs managed by the File Manager are also given a digital signature. This ensures that documents cannot be altered without the change being detected.
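
The principle behind such a signature can be sketched with the standard Java security API. This is an illustration under assumptions, not the File Manager's actual signing procedure: it signs raw bytes with a freshly generated RSA key pair, whereas a real PDF signature is embedded in the file in a dedicated structure and normally uses a certificate. The class name SignatureSketch and its methods are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;

// Hypothetical helper, for illustration only.
final class SignatureSketch {

    // Creates a signature over the document bytes with the private key.
    static byte[] sign(byte[] document, PrivateKey key) throws GeneralSecurityException {
        Signature signer = Signature.getInstance("SHA256withRSA");
        signer.initSign(key);
        signer.update(document);
        return signer.sign();
    }

    // Returns true only if the signature matches these exact bytes.
    static boolean verify(byte[] document, byte[] signature, PublicKey key)
            throws GeneralSecurityException {
        Signature verifier = Signature.getInstance("SHA256withRSA");
        verifier.initVerify(key);
        verifier.update(document);
        return verifier.verify(signature);
    }

    public static void main(String[] args) throws GeneralSecurityException {
        // Assumption: a throwaway key pair stands in for a real signing certificate.
        KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
        generator.initialize(2048);
        KeyPair pair = generator.generateKeyPair();

        byte[] original = "stand-in for PDF bytes".getBytes(StandardCharsets.UTF_8);
        byte[] signature = sign(original, pair.getPrivate());

        // Flip one bit to simulate a modified document.
        byte[] tampered = original.clone();
        tampered[0] ^= 1;

        System.out.println("original verifies: " + verify(original, signature, pair.getPublic()));
        System.out.println("tampered verifies: " + verify(tampered, signature, pair.getPublic()));
    }
}
```

A signature does not prevent a file from being changed. It lets any recipient with the public key detect that a change has happened.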