Sans Pareil Technologies, Inc.

Setting up a Clustered Workspace


This page describes the steps required to set up a clustered workspace in Magnolia. The usual Magnolia workflow is to produce all content on the author instance and then activate it to all registered subscribers. This model suits most CMS requirements. However, most web sites also need some form of public user interaction whose results must be saved. Examples are polls, user comments, and request and contact forms, all of which usually involve content that is created on the public instances and needs to be reflected across all instances (including author). Some of these use cases also require approval from content editors (usually performed on the author instance) before the content becomes visible on the public instances.

For all such cases, the simplest technique for handling content creation on a public instance is to set up a clustered repository and additional workspaces within that repository. The default Magnolia repository is set up as a local dedicated repository, so we need a new, clustered repository to serve as the shared data store.
JNDI Data Source 
The first requirement for a clustered JackRabbit repository is the availability of a central data store, PostgreSQL in our example. JackRabbit may be configured to use its own connection pool; however, for deployment flexibility we prefer to use a JNDI data source that is configured in each container, Tomcat in our case.

Configuring the data source in the container makes it easy to use a common repository configuration file across development, QA and production environments (although each instance still needs its own cluster id). Configuring the JNDI data source involves the following steps:
  • Add a Resource configuration in $CATALINA_BASE/conf/server.xml under GlobalNamingResources.
    <GlobalNamingResources>
      <Resource name="jdbc/clustered" auth="Container"
        type="javax.sql.DataSource"
        maxActive="20" maxIdle="2" maxWait="5000"
        username="magnolia" password="magnolia"
        driverClassName="org.postgresql.Driver"
        url="jdbc:postgresql:spt"
        description="JackRabbit clustered data store" />
    </GlobalNamingResources>
    
  • Add a ResourceLink configuration in our application's META-INF/context.xml. Note that this configuration is used only when the application is deployed for the first time. For applications that have already been deployed, we need to remove or overwrite the $CATALINA_BASE/conf/Catalina/<servername>/<appname>.xml file.
    <?xml version='1.0' encoding='utf-8'?>
    <Context>
      <ResourceLink name='jdbc/clustered' global='jdbc/clustered' type='javax.sql.DataSource' />
    </Context>
    
  • Copy the JDBC library to $CATALINA_BASE/lib.
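To illustrate how the container-level data source keeps the repository configuration portable, the sketch below shows what the same Resource might look like on a server whose database runs on a separate host. The host name, port, and password are placeholders assumed for this example and are not part of the original setup. Only the url and the credentials change; the JNDI name stays jdbc/clustered, so the JackRabbit configuration needs no edits.

```xml
<!-- Hypothetical variant of the Resource for an environment with a remote database.
     db.example.com, 5432 and the password are placeholder values. -->
<GlobalNamingResources>
  <Resource name="jdbc/clustered" auth="Container"
    type="javax.sql.DataSource"
    maxActive="20" maxIdle="2" maxWait="5000"
    username="magnolia" password="change-me"
    driverClassName="org.postgresql.Driver"
    url="jdbc:postgresql://db.example.com:5432/spt"
    description="JackRabbit clustered data store" />
</GlobalNamingResources>
```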
JackRabbit Configuration 
The next step is to adapt a standard JackRabbit repository configuration file for use as a clustered repository. A clustered repository requires a Cluster configuration in addition to the general repository configuration. The following are the main configuration items that need to be modified for the new clustered repository.
  • Configure the PersistenceManager for workspaces and versioning.
      <Workspace name="default">
        <FileSystem class="org.apache.jackrabbit.core.fs.local.LocalFileSystem">
          <param name="path" value="${wsp.home}/default" />
        </FileSystem>
        <PersistenceManager class="org.apache.jackrabbit.core.persistence.bundle.PostgreSQLPersistenceManager">
          <param name="driver" value="javax.naming.InitialContext" />
          <param name="url" value="java:comp/env/jdbc/clustered" />
          <param name="schema" value="postgresql" /><!-- warning, this is not the schema name, it's the db type -->
          <param name="schemaObjectPrefix" value="${wsp.name}_" />
          <param name="externalBLOBs" value="false" />
        </PersistenceManager>
      </Workspace>
      <Versioning rootPath="${rep.home}/version">
        <FileSystem class="org.apache.jackrabbit.core.fs.local.LocalFileSystem">
          <param name="path" value="${rep.home}/workspaces/version" />
        </FileSystem>
        <PersistenceManager class="org.apache.jackrabbit.core.persistence.bundle.PostgreSQLPersistenceManager">
          <param name="driver" value="javax.naming.InitialContext" />
          <param name="url" value="java:comp/env/jdbc/clustered" />
          <param name="schema" value="postgresql" /><!-- warning, this is not the schema name, it's the db type -->
          <param name="schemaObjectPrefix" value="version_" />
          <param name="externalBLOBs" value="false" />
        </PersistenceManager>
      </Versioning>
    
  • Configure the Cluster for each node in the cluster. Note that you will need to modify the cluster id for each node - author, public1, public2 etc.
      <Cluster id="author" syncDelay="2000">
        <Journal class="org.apache.jackrabbit.core.journal.DatabaseJournal">
          <param name="driver" value="javax.naming.InitialContext"/>
          <param name="url" value="java:comp/env/jdbc/clustered" />
          <param name="schemaObjectPrefix" value="journal_"/>
          <param name="databaseType" value="postgresql"/>
        </Journal>
      </Cluster>
    
Copy one of the standard repository configuration files that come with Magnolia, apply the above changes, and name it jackrabbit-bundle-postgresql-search.xml. This file needs to be copied into $CATALINA_BASE/webapps/<appname>/WEB-INF/config/repo-conf. Remember to change the cluster id after copying the file into each Magnolia application directory.
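As an illustration of the per-node change, the Cluster element on the first public instance might look like the sketch below. It assumes the node is named public1, following the naming used above; everything except the id is identical to the author configuration, since all nodes must share the same journal.

```xml
<!-- Cluster configuration for the first public node.
     Only the id differs from the author instance. -->
<Cluster id="public1" syncDelay="2000">
  <Journal class="org.apache.jackrabbit.core.journal.DatabaseJournal">
    <param name="driver" value="javax.naming.InitialContext"/>
    <param name="url" value="java:comp/env/jdbc/clustered" />
    <param name="schemaObjectPrefix" value="journal_"/>
    <param name="databaseType" value="postgresql"/>
  </Journal>
</Cluster>
```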
Magnolia Configuration 
The final step is to configure the new repository (spt in our case) and a new workspace (clustered in our case) that we will use for all shared content.
  • Modify magnolia.properties and add a property for our new clustered repository. Add the following to the end of $CATALINA_BASE/webapps/<appname>/WEB-INF/config/default/magnolia.properties:
    # SPT clustered repository
    magnolia.repositories.spt.config=WEB-INF/config/repo-conf/jackrabbit-bundle-postgresql-search.xml
    
  • Modify repositories.xml and add configuration for our new clustered repository as well as our new workspace.
        <RepositoryMapping>
            <Map name="website" repositoryName="magnolia" workspaceName="website" />
            <Map name="config" repositoryName="magnolia" workspaceName="config" />
            <Map name="users" repositoryName="magnolia" workspaceName="users" />
            <Map name="userroles" repositoryName="magnolia" workspaceName="userroles" />
            <Map name="usergroups" repositoryName="magnolia" workspaceName="usergroups" />
            <Map name="mgnlSystem" repositoryName="magnolia" workspaceName="mgnlSystem" /> <!-- System internal data -->
            <Map name="mgnlVersion" repositoryName="magnolia" workspaceName="mgnlVersion" /> <!-- magnolia version workspace -->
            <Map name="clustered" repositoryName="spt" workspaceName="clustered" /> <!-- SPT clustered workspace -->
        </RepositoryMapping>
    
        <!-- spt clustered repository -->
        <Repository name="spt" provider="info.magnolia.jackrabbit.ProviderImpl" loadOnStartup="true">
            <param name="configFile" value="${magnolia.repositories.spt.config}" />
            <param name="repositoryHome" value="${magnolia.repositories.home}/spt" />
            <param name="contextFactoryClass" value="org.apache.jackrabbit.core.jndi.provider.DummyInitialContextFactory" />
            <param name="providerURL" value="localhost" />
            <param name="bindName" value="${magnolia.webapp}.spt" />
            <workspace name="clustered" />
        </Repository>
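
Further shared workspaces can presumably be added to the same repository in the same way. The sketch below assumes a second workspace named comments, which is a hypothetical name chosen for this example. It needs one more Map entry and one more workspace element. Because the persistence manager uses ${wsp.name}_ as its schemaObjectPrefix, the tables for the new workspace should be created with a comments_ prefix alongside the existing ones.

```xml
<!-- Inside <RepositoryMapping>: map the hypothetical "comments" workspace to the spt repository -->
<Map name="comments" repositoryName="spt" workspaceName="comments" />

<!-- The spt repository declaration, now listing both workspaces -->
<Repository name="spt" provider="info.magnolia.jackrabbit.ProviderImpl" loadOnStartup="true">
    <param name="configFile" value="${magnolia.repositories.spt.config}" />
    <param name="repositoryHome" value="${magnolia.repositories.home}/spt" />
    <param name="contextFactoryClass" value="org.apache.jackrabbit.core.jndi.provider.DummyInitialContextFactory" />
    <param name="providerURL" value="localhost" />
    <param name="bindName" value="${magnolia.webapp}.spt" />
    <workspace name="clustered" />
    <workspace name="comments" /> <!-- hypothetical additional workspace -->
</Repository>
```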