Caution: This documentation is for eZ Publish legacy, from version 3.x to 6.x.
For 5.x documentation covering Platform see eZ Documentation Center, for difference between legacy and Platform see 5.x Architecture overview.

Setting it up for an eZDFSFileHandler

The following instructions explain how to configure eZ Publish to store images, binary files and content-related caches in a cluster when using the eZDFSFileHandler. Before going any further, please read the "Note: known issue" section near the end of this page, which describes a problem that can occur when using eZ Publish in a clustered environment.

1. Clear the caches (optional)

It is recommended (but not required) to clear all eZ Publish caches before enabling the clustering functionality. This can be done by running the following command from the root of your eZ Publish installation (if you are using multiple servers, run this command from each server node in order to clear the local caches for each one):

$php bin/php/ezcache.php --clear-all --purge

"$php" should be replaced by the path to your php executable.

After running the script, make sure that all cache files have been cleared by inspecting the contents of the various cache sub-directories within the "var" directory (typically the "var/cache/" and "var/<name_of_siteaccess>/cache/" directories). If there are any cache files left, remove them manually.
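The inspection described above can be scripted. The following is a minimal sketch, assuming a POSIX shell and the directory layout named in this step; the function name list_leftover_cache_files is hypothetical and not part of eZ Publish. It only lists files and removes nothing.

```shell
# List files still present in the eZ Publish cache directories.
# $1: path to the eZ Publish root (defaults to the current directory).
list_leftover_cache_files() {
    ezroot="${1:-.}"
    # Covers var/cache/ and var/<name_of_siteaccess>/cache/
    for dir in "$ezroot"/var/cache "$ezroot"/var/*/cache; do
        if [ -d "$dir" ]; then
            find "$dir" -type f
        fi
    done
}
```

Called as "list_leftover_cache_files ." from the root of the installation, it prints nothing when all cache directories are empty.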

2. Modify the "file.ini" settings

Add the following lines to an override for the "file.ini" configuration file ("settings/override/file.ini.append.php" or "settings/siteaccess/ezwebin_site/file.ini.append.php" where "ezwebin_site" is the name of your siteaccess):

[ClusteringSettings]
FileHandler=eZDFSFileHandler

This selects the file handler, which here is "eZDFSFileHandler".

When using eZDFSFileHandler, configure the settings in the [eZDFSClusteringSettings] block in the same override file. You must define the path to the NFS mount point (a local folder; "var/nfsmount" is just an example) and set the database back-end to "eZDFSFileHandlerMySQLiBackend" (MySQL is the only database supported for eZDFS, and the use of MySQLi is recommended), as shown here:

[eZDFSClusteringSettings]
MountPointPath=var/nfsmount
DBBackend=eZDFSFileHandlerMySQLiBackend
DBHost=dbhost
DBPort=3306
DBSocket=
DBName=cluster
DBUser=root
DBPassword=
DBConnectRetries=3
DBExecuteRetries=20

Replace the values of "DBHost", "DBName", "DBUser" and "DBPassword" with the actual host name, database name, user name and password. In most cases these values will be the same as the "Server", "Database", "User" and "Password" settings specified under the [DatabaseSettings] block of your "site.ini.append.php" configuration file.
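To confirm which values an override file actually contains, a small reader for ini-style files can help. This is a minimal sketch, assuming plain "Key=Value" lines under "[Section]" headers as in the blocks above. ini_get is a hypothetical helper, not an eZ Publish tool, and it does not reproduce the way eZ Publish merges several override files.

```shell
# Print the value of KEY in [SECTION] of an ini-style file.
# Usage: ini_get FILE SECTION KEY
ini_get() {
    awk -v section="[$2]" -v key="$3" '
        # A header line switches the "inside wanted section" flag on or off.
        /^\[/ { in_section = ($0 == section); next }
        # Inside the section, print everything after the first "=" of KEY.
        in_section && index($0, key "=") == 1 {
            print substr($0, length(key) + 2)
            exit
        }
    ' "$1"
}
```

For example, "ini_get settings/override/file.ini.append.php ClusteringSettings FileHandler" should print "eZDFSFileHandler" once the override from this step is in place.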

Note: The folder indicated in MountPointPath should not contain anything but files handled by the eZ Publish cluster, since files within this folder are maintained by the cluster maintenance scripts and may be removed by them. If you need to store files here (e.g. custom cache files), be sure to use eZClusterFileHandler in your own PHP code.

3. Create a new script for serving images

When clustering your installation, all images (except design images) will be served by PHP. Your web server (e.g. Apache) will be instructed to use a specific PHP script called "index_cluster.php" for handling images, which makes the serving of images faster because the system does not have to read the configuration from the database. "index_cluster.php" does not exist by default, so it must be created manually; it must define a collection of configuration settings and then include "index_image.php".

Create "index_cluster.php" inside the eZ Publish root directory and make sure that it contains the following lines. (The default contents of this file can also be found in the comments and example at the top of your installation's "index_image.php" file):

<?php
define( 'STORAGE_BACKEND',          'dfsmysqli'    );
define( 'STORAGE_HOST',             'localhost'   );
define( 'STORAGE_PORT',             3306          );
define( 'STORAGE_SOCKET',           false         );
define( 'STORAGE_USER',             'user'        );
define( 'STORAGE_PASS',             'pass'        );
define( 'STORAGE_DB',               'name'        );
define( 'MOUNT_POINT_PATH',         'var/nfsmount');
include_once(  'index_image.php' );
?>

Note: Make sure you specify the same database settings as indicated under the "[eZDFSClusteringSettings]" block in your "file.ini.append.php" configuration file.

The location 'var/nfsmount' is just an example; you can set another location if you wish, for example /mnt/nfs.

With this script the inclusion sequence works as follows. First, the rewrite rules (to be added in step 7) redirect requests for images to the newly created "index_cluster.php". For example, the image "http://site.com/var/storage/images/foo.jpg" will be transformed to "http://site.com/index_cluster.php/var/storage/images/foo.jpg". "index_cluster.php" defines the settings shown above and then includes the "index_image.php" file (which is part of the eZ Publish distribution). "index_image.php" in turn includes a file named "index_image_<STORAGE_BACKEND>.php". This means that if, for example, STORAGE_BACKEND is set to 'mysqli' in "index_cluster.php", the included file will be "index_image_mysqli.php"; with 'dfsmysqli', as in the example above, it will be "index_image_dfsmysqli.php". This "index_image_<STORAGE_BACKEND>.php" is the final script that reads the files and streams them to the browser.
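The naming rule in this paragraph can be checked against an actual "index_cluster.php". The sketch below assumes the define() call for STORAGE_BACKEND is written on one line with single-quoted strings, as in the example above; both function names are hypothetical helpers, not part of eZ Publish.

```shell
# Extract the STORAGE_BACKEND value from an index_cluster.php file ($1).
storage_backend() {
    sed -n "s/^[[:space:]]*define([[:space:]]*'STORAGE_BACKEND'[[:space:]]*,[[:space:]]*'\([^']*\)'.*/\1/p" "$1"
}

# Print the name of the script that index_image.php includes for that backend.
backend_script() {
    printf 'index_image_%s.php\n' "$(storage_backend "$1")"
}
```

Checking that the file printed by "backend_script index_cluster.php" exists in the eZ Publish root directory is a quick way to catch a mistyped backend name.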

4. Create new database tables

The database table structure required to hold clustered file information needs to be created. This must be done manually, either on the same MySQL server as the one used for the content database or on a different one. Keep in mind that for large-scale websites, a dedicated MySQL server can improve performance. The table definitions can be found inside the comment blocks at the beginning of the "mysqli.php" file located in the following sub-directory:

(...)/kernel/private/classes/clusterfilehandlers/dfsbackends/mysqli.php

5. Import files to the cluster

You need to copy the files stored in the "var" directory to the cluster. To do this, go to the root directory of eZ Publish and run the following script (replace "ezwebin_site" by the actual name of your siteaccess):

$php bin/php/clusterize.php -s ezwebin_site

Note that "$php" should be replaced by the path to your php executable.

The metadata will be stored in the database, whereas the files themselves are copied to the configured NFS mount point using the same structure as that of the "var" directory.

Keep in mind that this process might take some time, depending on the number of files that need to be imported.

Note: Moving files to NFS must always be done with clusterize.php, as doing it manually will produce an incomplete/invalid content repository, and will make it impossible to clusterize files.

6. Compile the templates (optional)

Since all caches are now empty, you should re-compile the templates. This step can be skipped, in which case the templates will be compiled on demand when the site is browsed. Go to the root directory of eZ Publish and run this command (if you are using multiple servers, run this command from each server node in order to compile the templates for each one):

$php bin/php/eztc.php -s ezwebin_site

Note that "$php" should be replaced by the path to your php executable.

Replace "ezwebin_site" by the actual name of your siteaccess. Repeat this step for all siteaccesses that are in use.

7. Update the Apache configuration

Apache needs to know which PHP script to use when serving images, in this case "index_cluster.php". The script looks up the requested file in the cluster and serves it. With the rewrite rules below, every request that matches one of them is rewritten to "index_cluster.php", which then delivers the file through HTTP from the NFS mount. These rules are the same for eZDFS and eZDB. Add the following rewrite rules to the ".htaccess" file before the other/existing rules:

RewriteRule ^/var/([^/]+/)?storage/images-versioned/.* /index_cluster.php [L]
RewriteRule ^/var/([^/]+/)?storage/images/.* /index_cluster.php [L]
RewriteRule ^/var/([^/]+/)?cache/public/(stylesheets|javascript) /index_cluster.php [L]
RewriteRule ^/index_cluster.php - [L]

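If no ".htaccess" file is used, add the same rules above the existing rewrite rules for eZ Publish in your Apache configuration file, because these rules need to be found before the standard eZ Publish rewrite rules. Note that inside a ".htaccess" file mod_rewrite matches patterns against the path without its leading slash, so the leading "/" of each pattern may need to be removed there.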
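Before restarting Apache, it can help to check which request paths the first three patterns above would match. The following sketch approximates them with grep -E; Apache itself uses PCRE, but these patterns only use syntax common to both. The function name matches_cluster_rule is hypothetical, and the paths are tested with a leading slash, as Apache sees them in the main server or virtual host configuration.

```shell
# Succeed (exit status 0) if the request path in $1 matches one of the
# cluster rewrite patterns, fail (non-zero status) otherwise.
matches_cluster_rule() {
    printf '%s\n' "$1" | grep -Eq \
        -e '^/var/([^/]+/)?storage/images-versioned/.*' \
        -e '^/var/([^/]+/)?storage/images/.*' \
        -e '^/var/([^/]+/)?cache/public/(stylesheets|javascript)'
}
```

A content image path such as "/var/ezwebin_site/storage/images/foo.jpg" matches, whereas a path outside "var" (for example a design image under "/design/") does not, which agrees with step 3: design images are not served through PHP.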

8. Restart Apache and test the site

Restart the Apache web server. After it has been restarted, the system should be up and running in cluster mode. Verify that the site works correctly, content images are displayed and content binary files are accessible (open the site pages in a web browser, log in to the administration interface, try clicking around and so on).

If, for example, a page of your website does not work correctly because its images are not displayed, your rewrite rules or your "index_cluster.php" file might be configured incorrectly. To locate the error, load the image directly in the browser (for example by choosing "open image in a new tab"). If "Module not found" is displayed instead of the image, your rewrite rules are not correctly configured. If a PHP error is shown, your "index_cluster.php" is most likely configured incorrectly.

To test and troubleshoot your website, it can be useful to have more debug information regarding the cluster. This is optional. To enable it, create an override of the "debug.ini" file and enable "kernel-clustering" in the [GeneralCondition] block like this:

[GeneralCondition]
(...)
kernel-clustering=enabled
(...)

9. Remove the imported files from the file system

If the site works correctly, you can remove the original content images and binary files from the file system (since they have been successfully imported to the cluster). To do this, inspect the contents of the various storage sub-directories within the "var" directory (typically the "var/storage/" and "var/<name_of_siteaccess>/storage/" directories) and remove any remaining content images and binary files manually. Afterwards, clear the caches by running the following command from the root of your eZ Publish installation:

$php bin/php/ezcache.php  --clear-all

Note that "$php" should be replaced by the path to your php executable.

If you configure multiple servers, execute the command from each server node in order to clear the local caches for each one.

Note

The "clusterize.php" script mentioned in step "5. Import files to the cluster" can also be used with the "-r" option. This will automatically remove the imported files after they have been clusterized, which makes step "9. Remove the imported files from the file system" unnecessary. Keep in mind that the "-r" option is somewhat advanced, so use it with caution.

Note: known issue

When using a database-based file handler (eZ DB or eZ DFS), the following bug will occur if all of the conditions listed here are true:

  •  You use MySQL
  •  You use different databases for the content and cluster tables
  •  You use the same host, port, user name and password for both databases
  •  The port is explicitly specified in both site.ini and file.ini.

The bug is that eZ Publish will look for content tables in the cluster database, which means that all page requests will fail. Although a solution has been proposed, it has not yet been approved at the time of this writing. So for the moment the quickest workaround is to use different user names for the two databases.

For more information regarding this issue, please visit http://issues.ez.no/13927

Limitation on some file systems when storing large number of content files

eZ Publish stores all disk-related content (e.g. images, PDFs) in "var/storage", in a structure that follows the content tree, creating one folder for each object. In most file systems used under Linux (especially ext2 and ext3) there is a hard limit of 32000 directories per folder, so it is not possible to store more than 31999 objects under one folder.

To get around this limitation without changing the file system, you can split your content tree so that you don't have more than 32k content files (for example images) in the same folder.

Examples of file systems that support more file/folder entries per folder:
- ReiserFS: roughly 1.2 million per directory
- ZFS: 2^48 (281474976710656)
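To see how close an existing installation is to this limit, the number of direct subdirectories of a storage folder can be counted. This is a minimal sketch, assuming a find that supports -mindepth and -maxdepth (GNU and BSD find do; the options are not part of POSIX). count_subdirs is a hypothetical helper name.

```shell
# Print the number of direct subdirectories of the directory given as $1.
count_subdirs() {
    find "$1" -mindepth 1 -maxdepth 1 -type d | wc -l | tr -d ' '
}
```

Running it on the folders below "var/storage" that grow with the number of content objects shows which parts of the content tree would need to be split first.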

Performance issues on cache generation

The MaxCopyRetries setting was introduced to solve cache generation issues under low-performance conditions.
It controls a feature that retries generating a cache file when generation fails. The number of retries is the value defined in the MaxCopyRetries setting.
The default value is 5. Increase this value if you find cache files like expiryXYZtmp.php with a size of 0 bytes.

For more details please refer to the MaxCopyRetries setting documentation.
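The symptom mentioned above, zero-byte temporary cache files, can be looked for with find. This is a minimal sketch, assuming the files follow the "expiryXYZtmp.php" naming given above, where XYZ stands for an arbitrary string; find_empty_expiry_tmp is a hypothetical helper name.

```shell
# List zero-byte files named expiry*tmp.php below the directory given as $1
# (defaults to "var" relative to the current directory).
find_empty_expiry_tmp() {
    find "${1:-var}" -type f -name 'expiry*tmp.php' -size 0
}
```

If the function prints any paths, raising MaxCopyRetries as described above is the first thing to try.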

Character encoding and filenames

Make sure to configure the SystemLocale setting with the correct locale, in order to avoid issues when uploading files whose names contain special characters or characters in a different encoding.
Here is a configuration example:

[RegionalSettings]
SystemLocale=fr_FR.UTF-8

Please refer to Jira issue EZP-20966 for more details on this subject.

Ester Heylen (14/09/2010 12:35 pm)

Ricardo Correia (25/11/2013 11:44 am)

Geir Arne Waaler, Jérôme Vieilledent, Ricardo Correia

