AppSuite:DocumentViewer: Difference between revisions

From Open-Xchange

Revision as of 10:59, 15 November 2018

OX Document Viewer

Product Description

The OX Document Viewer delivers plugin-free document viewing capabilities for Microsoft Office (.docx, .doc, .rtf, .pptx, .ppt, .xlsx, .xls) and OpenDocument (.odt, .ods, .odp, .odg) file types as well as for the Portable Document Format (.pdf). It extends OX App Suite with content thumbnails and preview capabilities.

Requirements

OX Document Viewer requires a 64-bit system. 32-bit systems are not supported.

See the Open-Xchange software requirements page for details.

The OX Document Viewer deployment consists of two functional modules that need to be installed separately: the readerengine component and the Document Viewer component.

ReaderEngine

See Readerengine installation instructions

Document Viewer

See Document converter API installation instructions

See Document converter installation instructions

For a more detailed guide explaining different installation variants, visit the Installation Guide.

Installation

Red Hat Enterprise Linux 6 or CentOS 6

Add the following repositories to your Open-Xchange yum configuration:

[open-xchange-documentconverter-api]
name=Open-Xchange-documentconverter-api
baseurl=https://software.open-xchange.com/products/appsuite/stable/documentconverter-api/RHEL6/
gpgkey=https://software.open-xchange.com/oxbuildkey.pub
enabled=1
gpgcheck=1
metadata_expire=0m

[open-xchange-office-web]
name=Open-Xchange-office-web
baseurl=https://software.open-xchange.com/products/appsuite/stable/office-web/RHEL6/
gpgkey=https://software.open-xchange.com/oxbuildkey.pub
enabled=1
gpgcheck=1
metadata_expire=0m

[open-xchange-open-xchange-pdftool]
name=Open-Xchange-open-xchange-pdftool
baseurl=https://software.open-xchange.com/products/appsuite/stable/open-xchange-pdftool/RHEL6/
gpgkey=https://software.open-xchange.com/oxbuildkey.pub
enabled=1
gpgcheck=1
metadata_expire=0m

[open-xchange-documentconverter]
name=Open-Xchange-documentconverter
baseurl=https://[CUSTOMERID:PASSWORD]@software.open-xchange.com/products/appsuite/stable/documentconverter/RHEL6/
gpgkey=https://software.open-xchange.com/oxbuildkey.pub
enabled=1
gpgcheck=1
metadata_expire=0m

[open-xchange-readerengine]
name=Open-Xchange-readerengine
baseurl=https://[CUSTOMERID:PASSWORD]@software.open-xchange.com/products/appsuite/stable/readerengine/RHEL6/
gpgkey=https://software.open-xchange.com/oxbuildkey.pub
enabled=1
gpgcheck=1
metadata_expire=0m

$ yum install readerengine open-xchange-documentconverter

Red Hat Enterprise Linux 7 or CentOS 7

Add the following repositories to your Open-Xchange yum configuration:

[open-xchange-documentconverter-api]
name=Open-Xchange-documentconverter-api
baseurl=https://software.open-xchange.com/products/appsuite/stable/documentconverter-api/RHEL7/
gpgkey=https://software.open-xchange.com/oxbuildkey.pub
enabled=1
gpgcheck=1
metadata_expire=0m

[open-xchange-office-web]
name=Open-Xchange-office-web
baseurl=https://software.open-xchange.com/products/appsuite/stable/office-web/RHEL7/
gpgkey=https://software.open-xchange.com/oxbuildkey.pub
enabled=1
gpgcheck=1
metadata_expire=0m

[open-xchange-documentconverter]
name=Open-Xchange-documentconverter
baseurl=https://[CUSTOMERID:PASSWORD]@software.open-xchange.com/products/appsuite/stable/documentconverter/RHEL7/
gpgkey=https://software.open-xchange.com/oxbuildkey.pub
enabled=1
gpgcheck=1
metadata_expire=0m

[open-xchange-readerengine]
name=Open-Xchange-readerengine
baseurl=https://[CUSTOMERID:PASSWORD]@software.open-xchange.com/products/appsuite/stable/readerengine/RHEL7/
gpgkey=https://software.open-xchange.com/oxbuildkey.pub
enabled=1
gpgcheck=1
metadata_expire=0m

$ yum install readerengine open-xchange-documentconverter

Debian GNU/Linux 8.0 (Jessie)

Add the following repositories to your Open-Xchange apt configuration:

deb https://software.open-xchange.com/products/appsuite/stable/documentconverter-api/DebianJessie /
deb https://software.open-xchange.com/products/appsuite/stable/office-web/DebianJessie /
deb https://[CUSTOMERID:PASSWORD]@software.open-xchange.com/products/appsuite/stable/documentconverter/DebianJessie /
deb https://[CUSTOMERID:PASSWORD]@software.open-xchange.com/products/appsuite/stable/readerengine/DebianJessie /
$ apt-get update
$ apt-get install readerengine open-xchange-documentconverter

Debian GNU/Linux 9.0 (Stretch)

Add the following repositories to your Open-Xchange apt configuration:

deb https://software.open-xchange.com/products/appsuite/stable/documentconverter-api/DebianStretch /
deb https://software.open-xchange.com/products/appsuite/stable/office-web/DebianStretch /
deb https://[CUSTOMERID:PASSWORD]@software.open-xchange.com/products/appsuite/stable/documentconverter/DebianStretch /
deb https://[CUSTOMERID:PASSWORD]@software.open-xchange.com/products/appsuite/stable/readerengine/DebianStretch /
$ apt-get update
$ apt-get install readerengine open-xchange-documentconverter

SUSE Linux Enterprise Server 12

$ zypper ar https://software.open-xchange.com/products/appsuite/stable/documentconverter-api/SLE_12 documentconverter-api
$ zypper ar https://software.open-xchange.com/products/appsuite/stable/office-web/SLE_12 office-web
$ zypper ar https://[CUSTOMERID:PASSWORD]@software.open-xchange.com/products/appsuite/stable/documentconverter/SLE_12 documentconverter
$ zypper ar https://[CUSTOMERID:PASSWORD]@software.open-xchange.com/products/appsuite/stable/readerengine/SLE_12 readerengine
$ zypper ref
$ zypper install readerengine open-xchange-documentconverter 

Configuration

To enable document viewing for OX Drive, the associated permission has to be set.

The default setting for all users is changed in the file permissions.properties in the directory /opt/open-xchange/etc.

After installation the functionality is disabled:

# Default permissions for all users
permissions=

The following line enables the functionality:

# Default permissions for all users
permissions=document_preview

If there are already some permissions in this file, add document_preview separated with a comma.
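
The edit described above can also be scripted. The following is a minimal sketch, not an Open-Xchange tool: the helper name add_document_preview is hypothetical, and it assumes the file contains a single line starting with "permissions=" and no trailing whitespace on that line. It sets the value when the list is empty, appends after a comma otherwise, and leaves the file unchanged if document_preview is already listed.

```shell
# add_document_preview FILE
# Hypothetical helper: add document_preview to the "permissions=" line of FILE.
add_document_preview() {
    file=$1
    # Nothing to do if the permission is already listed.
    if grep -Eq '^permissions=(.*,)?document_preview(,.*)?$' "$file"; then
        return 0
    fi
    # First rule: non-empty list, append after a comma.
    # Second rule: empty list, set the value directly.
    sed -e 's/^permissions=\(..*\)$/permissions=\1,document_preview/' \
        -e 's/^permissions=$/permissions=document_preview/' \
        "$file" > "$file.tmp" && mv "$file.tmp" "$file"
}

# Usage (path as given on this page):
#   add_document_preview /opt/open-xchange/etc/permissions.properties
```

Because the helper writes a temporary file and moves it into place, ownership and mode of the properties file should be checked afterwards.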

Further settings for the underlying readerengine are located in the file documentconverter.properties in the directory /opt/open-xchange/etc, as described below.

Note: Document conversion service and OX App Suite backend may be run on different nodes. See the configuration item com.openexchange.documentconverter.RemoteBaseUrl for details.

A summary of all configuration items, together with each default value, is given below. Although the defaults have been carefully chosen for a real life deployment, the admin should take a closer look at each of them and adjust them accordingly, if necessary.

com.openexchange.documentconverter.installDir=/opt/readerengine

This item contains the directory of the libreaderengine installation. The libreaderengine installation directory in general contains the ./program directory, which itself contains the engine executables.
VERY IMPORTANT: If not set correctly, the complete web service will be nonfunctional.
Default value: "/opt/readerengine"

com.openexchange.documentconverter.cacheDir=/var/spool/open-xchange/documentconverter/readerengine.cache

This item contains the directory that will make up the cache for persistent job data. The directory itself does not need to exist at startup, but the parent directory needs to exist and needs to have write permissions for the user running the servlet, in order for the servlet to create this cache directory at runtime.
VERY IMPORTANT: If not set correctly, the complete web service will be nonfunctional.
Default value: "/var/spool/open-xchange/documentconverter/readerengine.cache"

com.openexchange.documentconverter.scratchDir=/var/spool/open-xchange/documentconverter/readerengine.scratch

This item contains the directory that will make up the runtime environment for the readerengine. The directory itself does not need to exist at startup, but the parent directory needs to exist and needs to have write permissions for the user running the servlet, in order for the servlet to create this scratch directory at runtime.
VERY IMPORTANT: If not set correctly, the complete web service will be nonfunctional.
Default value: "/var/spool/open-xchange/documentconverter/readerengine.scratch"
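
The parent-directory requirement for cacheDir and scratchDir can be checked before starting the service. This is a minimal sketch under assumptions: the helper name parent_ready is hypothetical, and the check applies to whichever user runs the script, so it should be executed as the user that runs the servlet.

```shell
# parent_ready DIR
# Succeeds if the parent of DIR exists and is writable by the current user.
# DIR itself does not need to exist.
parent_ready() {
    parent=$(dirname "$1")
    [ -d "$parent" ] && [ -w "$parent" ]
}

# Usage with the default values from this page:
#   parent_ready /var/spool/open-xchange/documentconverter/readerengine.cache \
#       || echo "cacheDir parent missing or not writable"
#   parent_ready /var/spool/open-xchange/documentconverter/readerengine.scratch \
#       || echo "scratchDir parent missing or not writable"
```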

com.openexchange.documentconverter.errorDir=

This item specifies a directory for files that could not be loaded due to an error condition or due to a timeout.
Note: The used disk space will grow with retained files. Files have to be removed manually.
Default value: n/a

com.openexchange.documentconverter.blacklistFile=/opt/open-xchange/etc/readerengine.blacklist

The list of external document content URLs that are not allowed to be loaded by the readerengine after loading a document. The file itself contains a list of (newline separated) regular expressions. Each external URL is first checked against the list of blacklist URL regular expressions. If the external URL matches one blacklist entry, the external URL is then checked against the list of whitelist URL regular expressions. The behavior in summary is as follows:
  • If the URL is not blacklisted and not whitelisted, it is resolved at runtime.
  • If the URL is blacklisted but not whitelisted, it is not resolved at runtime.
  • If the URL is not blacklisted but whitelisted, it is resolved at runtime.
  • If the URL is blacklisted and whitelisted, it is resolved at runtime.
In boolean notation: valid = (!blacklisted) || whitelisted
Please note that the regular expressions need to fully qualify the patterns that the URL should be checked against. Upper/lower case needs to be handled by the regular expression as well. The file itself needs to be UTF-8 encoded to be read appropriately.
Default value: "/opt/open-xchange/etc/readerengine.blacklist"

com.openexchange.documentconverter.whitelistFile=/opt/open-xchange/etc/readerengine.whitelist

The list of external document content URLs that are allowed to be loaded by the readerengine after an external URL matched a blacklist pattern. The file itself contains a list of (newline separated) regular expressions. Each external URL is only checked against the list of whitelist URL regular expressions if it previously matched a pattern in the blacklist file. The behavior in summary is as follows:
  • If the URL is not blacklisted and not whitelisted, it is resolved at runtime.
  • If the URL is blacklisted but not whitelisted, it is not resolved at runtime.
  • If the URL is not blacklisted but whitelisted, it is resolved at runtime.
  • If the URL is blacklisted and whitelisted, it is resolved at runtime.
In boolean notation: valid = (!blacklisted) || whitelisted
Please note that the regular expressions need to fully qualify the patterns that the URL should be checked against. Upper/lower case needs to be handled by the regular expression as well. The file itself needs to be UTF-8 encoded to be read appropriately.
Default value: "/opt/open-xchange/etc/readerengine.whitelist"
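
The rule valid = (!blacklisted) || whitelisted can be tried out against candidate pattern files before deploying them. The sketch below is an approximation, not the converter's implementation: the function name url_is_resolved is hypothetical, and grep -E (POSIX extended regular expressions) stands in for the converter's regular expression engine, which may differ in syntax details. The -x option forces each pattern to match the whole URL, mirroring the requirement that patterns fully qualify the URL.

```shell
# url_is_resolved URL BLACKLIST_FILE WHITELIST_FILE
# Returns 0 if the URL would be resolved, 1 otherwise.
# Assumption: grep -E approximates the converter's regex dialect.
url_is_resolved() {
    url=$1
    blacklist=$2
    whitelist=$3
    # Not blacklisted: resolved, the whitelist is not consulted.
    if ! printf '%s\n' "$url" | grep -Eqx -f "$blacklist"; then
        return 0
    fi
    # Blacklisted: resolved only if a whitelist pattern also matches.
    printf '%s\n' "$url" | grep -Eqx -f "$whitelist"
}

# Usage (file paths are the defaults from this page, the URL is a placeholder):
#   url_is_resolved 'http://www.example.org/logo.png' \
#       /opt/open-xchange/etc/readerengine.blacklist \
#       /opt/open-xchange/etc/readerengine.whitelist && echo resolved
```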

com.openexchange.documentconverter.urlLinkLimit=200

The external URL link limit specifies the maximum amount of valid external internet URLs (filtered by blacklist and whitelist before), that are tried to get resolved by the engine when loading a document. When this limit is reached, no more external internet URLs are resolved for the current document.
Important: Please note that one externally linked object within the document does not automatically correspond to one external URL call. In general, at least two URL calls are necessary to display one externally linked object. Such additional calls are in most cases based on a format detection that happens prior to resolving the object data itself.
Set to -1 for no upper limit or to 0 to disable the resolving of internet URLs completely.
Default value: 200

com.openexchange.documentconverter.urlLinkProxy =

The external URL link proxy entry specifies a proxy server that is used by the readerengine to resolve external links contained within a document. Such links are e.g. external http:// graphic links that are resolved during the filtering process of a readerengine instance.
Set this entry to the address of the proxy server: host:port
Recognized protocols are http://, https:// and ftp://
Leave empty if no proxy server should be used by the readerengine.
Default value: n/a

com.openexchange.documentconverter.RemoteBaseUrl =

Use a remote document conversion web service to do the actual conversion.
Set this entry to the base URL of the remote host: http://host[:port]/documentconverterPath
Leave empty if conversion should happen on the local machine.
Default value: n/a

From 7.8.2 on: The com.openexchange.documentconverter.RemoteBaseUrl entry is no longer valid in the documentconverter.properties file. The corresponding documentconverter server needs to be set on the OX backend node where the documentconverter-client package has been installed. The name of the new entry is com.openexchange.documentconverter.client.remoteDocumentConverterUrl. The entry itself is located within the documentconverter-client.properties configuration file.

com.openexchange.documentconverter.RemoteCacheUrls =

Use one or more remote converter caches to speed up the conversion. The first entry, if set, is treated as the remote master cache, receiving cache updates from the local cache. Additional entries are treated as remote slave caches for read purposes only.
Set the (whitespace separated) entries to the base URL(s) of the appropriate remote host(s): http://host[:port]/documentconverterCachePath
Leave empty if only the local filesystem cache should be used
Default value: n/a

com.openexchange.documentconverter.RemoteSharePointUrl =

Use a remote SharePoint service to do MSO to PDF conversions.
Set this entry to the URL of the SharePoint host: http://host[:port]/_vti_bin/oxconvert.svc/mex?wsdl
If left empty, the corresponding conversion job always returns false.
Default value: n/a

com.openexchange.documentconverter.RemoteSharePointUsername =

The login user name to be used for calls to the SharePoint service
Default value: n/a

com.openexchange.documentconverter.RemoteSharePointPassword =

The password to be used for calls to the SharePoint service
Default value: n/a

com.openexchange.documentconverter.jobProcessorCount=3

This item determines the number of engines working in parallel for job execution. The value needs to be greater than or equal to 1. Best performance is typically achieved at about n-1, where n is the number of available CPU cores of the machine the service is running on.
Default value: 3
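
As a starting point for tuning, the n-1 rule can be computed on the conversion node. This is a minimal sketch: the function name suggested_job_processors is hypothetical, and the result is only the rule of thumb stated above, clamped to the documented minimum of 1.

```shell
# suggested_job_processors CORES
# Prints CORES - 1, but never less than 1.
suggested_job_processors() {
    cores=$1
    if [ "$cores" -gt 1 ]; then
        echo $((cores - 1))
    else
        echo 1
    fi
}

# Usage on the node running the service (nproc is part of GNU coreutils):
#   suggested_job_processors "$(nproc)"
```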

com.openexchange.documentconverter.jobRestartCount=50

This item determines the maximum number of executed jobs after which a single engine is automatically restarted in order to avoid memory fragmentation and possible memory leaks within one libreaderengine instance.
Default value: 50

com.openexchange.documentconverter.jobExecutionTimeoutMilliseconds=60000

This item determines the timeout in milliseconds, after which the execution of a single job is terminated.
Default value: 60000

com.openexchange.documentconverter.maxVMemMB=2048

This item determines the maximum size in megabytes (MB) of virtual memory that each started readerengine process is allowed to consume. If a job tries to consume more VMem than set via this config item, the processing of the current job for the appropriate readerengine process will be aborted and the underlying process is restarted to avoid memory corruption.
Set this value to -1 for no upper limit.
Default value: 2048
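
Since the limit applies per readerengine process, a rough upper bound for the virtual memory of all engines together is jobProcessorCount multiplied by maxVMemMB. The sketch below works this out; it is an inference from the two descriptions on this page, the function name worst_case_vmem_mb is hypothetical, and virtual memory is not the same as resident memory, so the figure is a ceiling rather than an expected footprint.

```shell
# worst_case_vmem_mb PROCESSORS MAX_VMEM_MB
# Prints PROCESSORS * MAX_VMEM_MB, or "unlimited" when the limit is -1.
worst_case_vmem_mb() {
    processors=$1
    max_vmem_mb=$2
    if [ "$max_vmem_mb" -eq -1 ]; then
        echo unlimited
    else
        echo $((processors * max_vmem_mb))
    fi
}

worst_case_vmem_mb 3 2048   # defaults from this page: prints 6144
```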

com.openexchange.documentconverter.maxCacheSizeMB=-1

This item determines the maximum size in megabytes (MB) of all persistently cached converter job entries at runtime. A larger value may drastically reduce the time for conversion jobs, e.g. in case of a repeated creation of document previews.
Set this value to -1 for no upper limit.
Default value: -1

com.openexchange.documentconverter.maxCacheEntries=-1

This item determines the maximum number of converter jobs cached at runtime. The value affects the amount of runtime job information to be cached as well as the number of file entries within the cache directory.
Set this value to -1 for no upper limit.
Default value: -1

com.openexchange.documentconverter.cacheEntryTimeoutSeconds=2592000

This item determines the timeout in seconds, after which a cached job result is automatically removed from the cache.
Set this value to 0 to disable the timeout based removal of cached job results.
Default value: 2592000
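
The default of 2592000 seconds corresponds to 30 days. When choosing a different retention period, the value can be derived as in this small sketch (the function name days_to_seconds is hypothetical):

```shell
# days_to_seconds DAYS
# Converts a retention period in days to the seconds value used by the property.
days_to_seconds() {
    echo $(( $1 * 24 * 60 * 60 ))
}

days_to_seconds 30   # prints 2592000, the default
days_to_seconds 7    # prints 604800
```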

com.openexchange.documentconverter.enableCacheLookup=false

Setting this flag to true enables the caller of the RemoteInternalPreviewService#getCachedPreviewFor implementation (OfficePreviewService) to retrieve only the cached result of a previous conversion call, without scheduling a new job if no cache entry exists. Such a job might otherwise run for a long period of time, up to the configured job timeout.
Set to false to disable the cache lookup within the RemoteInternalPreviewService#getCachedPreviewFor implementation.
Default value: false

com.openexchange.documentconverter.errorCacheTimeoutSeconds=600

This value determines how long an error, associated with a job hash value, is held within the error cache. As long as the timeout has not been reached, additional RemoteInternalPreviewService#getPreviewFor calls with the same job hash return instantly with the cached error code instead of processing the job again.
Set to 0 to disable the error cache handling.
Default value: 0

com.openexchange.documentconverter.errorCacheMaxCycleCount=5

This value determines the number of cycles for which a job, associated with a job hash value, is added to the error cache. One cycle starts after adding a job to the error cache and ends after the error cache timeout (errorCacheTimeoutSeconds) has been reached. After reaching the given maximum cycle count, the job is no longer removed from the error cache and is held there for the rest of the runtime of the current backend instance. Since the error cache is not persistent, the cycle counter for each job hash is reset after a restart of the backend instance.
Set to 0 to disable the error cache handling.
Default value: 5

com.openexchange.documentconverter.servletLocalFileUrls=false

This item determines whether the documentconverter servlet is allowed to handle file URLs of the form file://... A file URL locates files that are locally accessible on the machine the documentconverter backend is running on.
Default value: false

com.openexchange.capability.sharepointconversion=false

Capability to enable the usage of a SharePoint conversion server. The capability is only checked if a valid SharePoint remote converter has been configured appropriately.
Default value: false

Handling of temporary files

The DocumentConverter server needs to store files at runtime for different purposes at different volume locations:

  • Persistent files (Cache): The files that should last longer than the runtime of one converter instance are stored in the configurable com.openexchange.documentconverter.cacheDir directory. As the name of the property implies, such files are result cache entries used by multiple converter instances. This directory is monitored at runtime and all files are managed by the converter. Constraints for this directory are set via the converter properties com.openexchange.documentconverter.minFreeVolumeSizeMB, com.openexchange.documentconverter.maxCacheSizeMB, com.openexchange.documentconverter.maxCacheEntries and com.openexchange.documentconverter.cacheEntryTimeoutSeconds.
  • Medium lasting files: These files are only valid for the runtime of one converter instance (e.g. ReaderEngine related runtime config files for each ReaderEngine instance). They are stored within the configurable com.openexchange.documentconverter.scratchDir directory. This directory is not constantly monitored at runtime, but all files contained in the ${com.openexchange.documentconverter.scratchDir}/oxdc.tmp subdirectory are managed by the converter during the startup and shutdown phase of one converter server instance. The whole ${com.openexchange.documentconverter.scratchDir}/oxdc.tmp directory gets cleaned up during converter server shutdown as well as converter server startup. The initial cleanup during startup is necessary because the last converter instance might have aborted for unknown reasons, e.g. a power outage or a VM abort.
  • Short lasting files: These files are stored within the Java VM specific I/O temporary directory, whose location is configurable via the Java VM system property java.io.tmpdir. In most cases, this directory is used by the converter to temporarily store request attachments. The files stored within this directory have a lifetime equal to the duration of the request itself. When the request has finished, the corresponding files are cleaned up. For the converter, this means that e.g. source files to be converted and attached to the request are extracted from the request and stored on disk in order to prevent excessive memory consumption by source file buffers. When the conversion request is finished, the stored temporary file gets deleted.

From 7.10.2 on: The directory specified by the java.io.tmpdir Java system property is no longer used by the converter. Instead, even short living temporary files are stored at the ${com.openexchange.documentconverter.scratchDir}/oxdc.tmp location, so that a server shutdown/start cleans them up automatically. This change affects all files created by the converter implementation itself. Temporary files from other baseline bundles might still be stored within the configured java.io.tmpdir.


Caches

Conversion results are cached as images for thumbnail/preview/viewer. There are caches in the backend filestore and at each documentconverter involved.

Caching Strategy

Starting with the 7.4.1 release the .../documentconverter/readerengine.cache directory is persistent and kept alive between OX backend shutdowns/startups.

In addition, a backend configured as a remote document conversion service is able to act as a remote cache server. Similar to setting the RemoteBaseUrl within the documentconverter properties of the 'client' backends, it is now possible to set RemoteCacheUrls. The lookup of converter results happens as follows:

  • the lookup of a cache entry is based upon the unique checksum of the source file as well as some conversion properties (e.g. target format, target size etc.)
  • if a converter backend is not able to find a cache entry within its runtime structures, it searches the local (now persistent) readerengine.cache directory for a cache entry that can be recreated from the filesystem
  • if there is still no cache entry available and one or more RemoteCacheUrls are set, each configured remote cache host is queried for a valid cache entry
  • if a remote cache entry is found, this entry is added to the local cache of the requesting converter
  • if no remote cache entry can be found, the conversion is done locally, a cache entry is created locally (persistent) and also transferred to the first entry of the RemoteCacheURLs, acting as a master cache server in this case
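
The lookup order listed above can be illustrated with a small simulation. This is only a sketch of the ordering, not the converter's code: plain directories stand in for the local cache and the remote cache hosts, the key stands in for the checksum-based cache key, the in-memory runtime structures are left out, and the function name lookup_or_convert is hypothetical.

```shell
# lookup_or_convert KEY LOCAL_DIR [REMOTE_DIR...]
# Prints which tier served the entry: local, remote, or converted.
lookup_or_convert() {
    key=$1
    local_dir=$2
    shift 2
    # 1. Local persistent cache.
    if [ -f "$local_dir/$key" ]; then
        echo local
        return 0
    fi
    # 2. Remote caches in configured order; a hit is added to the local cache.
    for remote_dir in "$@"; do
        if [ -f "$remote_dir/$key" ]; then
            cp "$remote_dir/$key" "$local_dir/$key"
            echo remote
            return 0
        fi
    done
    # 3. Miss everywhere: convert locally, store the result locally and
    #    transfer it to the first remote entry, which acts as master cache.
    echo "converted result" > "$local_dir/$key"
    if [ "$#" -gt 0 ]; then
        cp "$local_dir/$key" "$1/$key"
    fi
    echo converted
}
```

Calling the function twice with the same key and directories returns converted on the first call and local on the second. Slave caches are only ever read, which is why the result is pushed to the first remote directory alone.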


Cleaning Caches

If you want to clean the caches, follow these two steps.

Use the command line tool for each affected context:

/opt/open-xchange/sbin/clearpreviewcache -c <CONTEXT ID>

Furthermore, remove the content of the documentconverter cache directory manually at each conversion node. Before removing the content of the documentconverter cache directory, shut down the OX backend node running the documentconverter, so that the runtime information for the local cache is properly recreated on the next start. The local documentconverter cache location is defined by the following configuration property:

/opt/open-xchange/etc/documentconverter.properties

# The directory, containing the cache for persistent, huge job data at runtime
# Default value: "/var/spool/open-xchange/documentconverter/readerengine.cache"
com.openexchange.documentconverter.cacheDir=/var/spool/open-xchange/documentconverter/readerengine.cache

After the cleanup of both caches new conversions will be triggered for thumbnails, previews and viewer images.
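
Both cleanup steps can be wrapped in small helper functions. The following is a sketch under assumptions: the function names are hypothetical, find with -mindepth and -delete is assumed to be available (GNU and BSD find provide both), and stopping and starting the backend is left to the administrator because the service name depends on the installation. Only the clearpreviewcache call and the property name are taken from this page.

```shell
# read_cache_dir FILE
# Prints the configured cacheDir value from a properties file.
read_cache_dir() {
    sed -n 's/^com\.openexchange\.documentconverter\.cacheDir=//p' "$1"
}

# empty_cache_dir DIR
# Deletes everything inside DIR but keeps DIR itself.
empty_cache_dir() {
    dir=$1
    # Refuse empty, root, or missing targets.
    [ -n "$dir" ] && [ "$dir" != "/" ] && [ -d "$dir" ] || return 1
    find "$dir" -mindepth 1 -delete
}

# clear_preview_caches CONTEXT_ID...
# Runs the documented command line tool once per context ID.
clear_preview_caches() {
    for context_id in "$@"; do
        /opt/open-xchange/sbin/clearpreviewcache -c "$context_id" || return 1
    done
}

# Typical order (context IDs are placeholders; stop the backend node that
# runs the documentconverter before emptying its cache directory):
#   clear_preview_caches 1 2 3
#   empty_cache_dir "$(read_cache_dir /opt/open-xchange/etc/documentconverter.properties)"
```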