Path

ezpublish / documentation / ez publish / technical manual / 4.7 / features / clustering / setting it up for an ezdbfi...


Caution: This documentation is for eZ Publish legacy, from version 3.x to 6.x.
For 5.x documentation covering Platform see eZ Documentation Center, for difference between legacy and Platform see 5.x Architecture overview.

Setting it up for an eZDBFileHandler

The following instructions reveal how you can configure eZ Publish to store images, binary files and content-related caches in the database using the eZ DB File Handler.
 Before going any further, please read the note: known issue added at the end of this page regarding a known issue when using eZ Publish in a clustered environment.

1. Clear the caches (optional)

It is recommended (but not required) to clear all eZ Publish caches before enabling the clustering functionality. This can be done by running the following command from the root of your eZ Publish installation (if you are using multiple servers, run this command from each server node in order to clear the local caches for each one):

$php bin/php/ezcache.php --clear-all --purge

"$php" should be replaced by the path to your php executable.

After running the script, make sure that all cache files have been cleared by inspecting the contents of the various cache subdirectories within the "var" directory (typically the "var/cache/" and "var/<name_of_siteaccess>/cache/" directories). If there are any cache files left, remove them manually.

2. Modify the "file.ini" settings

Add the following lines to an override for the "file.ini" configuration file ("settings/override/file.ini.append.php" or "settings/siteaccess/ezwebin_site/file.ini.append.php" where "ezwebin_site" is the name of your siteaccess):

[ClusteringSettings]
FileHandler=eZDBFileHandler
DBBackend=eZDBFileHandlerMysqliBackend
DBHost=localhost
DBPort=3306
DBName=name
 
DBUser=user
DBPassword=pass

First define the proper file handler, which in this case is "FileHandler=eZDBFileHandler". Specifying "eZDBFileHandler" in the "FileHandler" configuration setting will instruct eZ Publish to use the database specified here for storing images, binary files and content-related caches.

Next replace "localhost", "name" (for example "DBName=Cluster"), "user" and "pass" by actual host name, database name, user name and password. In most cases these values will be the same as "Server", "Database", "User", "Password" settings specified under the [DatabaseSettings] block of your "site.ini.append.php" configuration file.

The "DBBackend" setting specifies which back-end should be used by the eZ DB File Handler. For the database backend setting (DBBackend=) the previously used value "mysql" is deprecated. The PHP class name is to be used, with the following choices:

  • "eZDBFileHandlerMysqliBackend" for MySQL
  • "eZDBFileHandlerOracleBackend" for Oracle (requires the Oracle database extension).

3. Create a new script for serving images

On a clustered installation, files from the var folder that are read through HTTP will be served by PHP. Your web-server (e.g Apache) will be instructed to use a specific PHP script, "index_cluster.php", for serving those files. From version 4.7, this file is common to all clustered installations.

In order to maximize performances, this script doesn't use the fully blown INI configuration system. Dedicated settings must be provided using either config.php, or config.cluster.php, at the root of your eZ Publish setup. Both files will lead to the same behaviour, but config.cluster.php is only included when serving requests through index_cluster.php, not index.php. The list of possible settings can be found in the config.php-RECOMMENDED file shipped with the release. The easiest way is to copy the relevant settings from this file to your own config.php. A typical config.php for a DFS MySQLi cluster would look like this:

<?php
define( 'CLUSTER_STORAGE_BACKEND', 'dbmysqli' );
define( 'CLUSTER_STORAGE_HOST', 'localhost' );
define( 'CLUSTER_STORAGE_PORT', 3306 );
define( 'CLUSTER_STORAGE_USER', 'dbuser' );
define( 'CLUSTER_STORAGE_PASS', 'dbpassword' );
define( 'CLUSTER_STORAGE_DB', 'ezpcluster' );
define( 'CLUSTER_STORAGE_CHARSET', 'utf8' );

Note: Make sure you specify the same database settings as indicated under the "[ClusteringSettings]" block in your "file.ini.append.php" configuration file.

4. Create new database tables

The database table structure required to hold clustered file information needs to be created. This must be done manually, either on the same database server or on the one used for the relational database or on a different one. Keep in mind that for large scale websites, a dedicated database server will improve performances. The definitions of this table for MySQL can be found in the kernel/sql/mysql/cluster_db_schema.sql file. Instructions for Oracle are available in the Oracle database extension documentation.

5. Import files to the database

The files stored in the "var" directory need to be copied to the database. To do this, go to the root directory of eZ Publish and run the following script (replace "ezwebin_site" by the actual name of your siteaccess):

$php bin/php/clusterize.php -s ezwebin_site

Note that "$php" should be replaced by the path to your php executable.

The script will import your files, images and image aliases (image variations) that are stored under the "var" directory to the cluster database. eZ DB stores both metadata and data in tables, the metadata in ezdbfile, the binary data will however be split into chunks and stored in ezdbfile_data.

Keep in mind that this process might take some time, depending on the amount of files that need to be imported.

6. Compile the templates (optional)

Since all caches now are empty, you should re-compile the templates. Note that this step can be skipped and thus the templates will be compiled on-demand when the site is browsed. Go to the root directory of eZ Publish and run this command (if you are using multiple servers, run this command from each server node in order to compile the templates for each one):

$php bin/php/eztc.php -s ezwebin_site

Note that "$php" should be replaced by the path to your php executable.

Replace "ezwebin_site" by the actual name of your siteaccess. Repeat this step for all siteaccesses that are in use.

7. Update the Apache configuration

Apache needs to know which PHP script to use when serving images, in this case "index_cluster.php". The script simply fetches the images from the database and serves them. By adding the RewriteRules mentioned below every request for a content image or binary file will be rewritten to "index_cluster.php", which will then deliver the files directly through HTTP from the NFS server. These rules are the same for eZ DB and eZ DFS. So add the following rewrite rules to the ".htaccess" file before the other/existing rules:

RewriteRule ^/var/([^/]+/)?storage/images-versioned/.* /index_cluster.php [L]
RewriteRule ^/var/([^/]+/)?storage/images/.* /index_cluster.php [L]
RewriteRule ^/var/([^/]+/)?cache/public/(stylesheets|javascript) /index_cluster.php  [L]

If no ".htaccess" file is used, add the same rules above the existing rewrite rules for eZ Publish in your Apache configuration file because these rules need to be found before the standard eZ Publish rewrite rules.

8. Restart Apache and test the site

Restart the Apache web server. After it has been restarted, the system should be up and running in cluster mode. Verify that the site works correctly, content images are displayed and content binary files are accessible (open the site pages in a web browser, log in to the administration interface, try clicking around and so on).

If for example a page of your website does not work correctly because its images are not displayed, your rewrite rules or your "index_cluster.php" file might be configured incorrectly. To locate the error, load the image directly in the browser (by, for example, choosing "open image in a new tab"). If instead of the image "Module not found" is displayed, then your rewrite rules are not correctly configured. If a PHP error is shown, your "index_cluster.php" is most likely configured wrong.

To test and troubleshoot your website, it can be useful to have more debug information regarding the cluster. This is an optional configuration but to enable it, create an override of the debug.ini file and enable "kernel-clustering" in the [GeneralCondition] block like this:

[GeneralCondition]
(...)
kernel-clustering=enabled
(...)

9. Remove the imported files from the file system

If the site works correctly, you can remove the original content images and binary files from the file system (since they have been successfully imported to the database). To do this, you need to inspect the contents of the various storage sub-directories within the "var" directory (typically the "var/storage/" and "var/<name_of_siteaccess>/storage/" directories). If there are any content images and binary files left, remove them manually or by using the following command from the root of your eZ Publish installation:

$php bin/php/ezcache.php --clear-all

Note that "$php" should be replaced by the path to your php executable.

If you configured multiple servers, execute the command from each server node in order to clear the local caches for each one.

Note

The "clusterize.php" file mentioned in step "5. Import files to database" can also be used with a "-r" option. This will automatically remove the imported files after they have been clusterized. Using it will make this step "9. Remove the imported files from the file system" obsolete. But keep in mind that using the "-r" option is some what advanced so use with caution.

Note: known issue

When using a database based file handler (eZ DB or eZ DFS) the following bug will occur if all of the conditions listed here are true:

  •  You use MySQL
  •  You use different databases for the content and cluster tables
  •  You use the same host, port, user name and password for both databases
  •  The port is explicitly specified in both site.ini and file.ini.

The bug is that eZ Publish will look for content tables in the cluster database, which means that all page requests will fail. Although a solution has been proposed, it has not yet been approved at the time of this writing. So for the moment the quickest workaround is to use different user names for the two databases.

For more information regarding this issue, please visit http://issues.ez.no/13927

Limitation on some file systems when storing large number of content files

eZ Publish stores all disc related content (eg Images, PDF's etc) in var/storage like the structure from content tree, creating one folder for each object. In most file systems used under Linux (especially ext2 + ext3) there exists a hard LIMIT TO 32000 directories per folder. So it is not possible to store more as 31999 objects under one folder.
To get around this limitation without changing the file system, you can split your content tree so that you don't have more than 32k content files (example: images) in the same folder.
Examples of file systems that supports more file/folder entries per folder.

  • ReiserFS: roughly 1.2 million per directory
  • ZFS: 2^48 (a really big number: 281474976710656)!

Ester Heylen (14/09/2010 12:35 pm)

Ricardo Correia (25/02/2013 11:44 am)

Geir Arne Waaler, Andrea Melo, Ricardo Correia


Comments

There are no comments.