Upgrade path from 2.x to 3.2 : some tuning and recommendations (Part 1)

I would like to share with you some information about the upgrade from 2.1.1 to 3.2 (Enterprise version) we are currently executing, and especially the time you should plan for migration if you have a big repository as we do…

1/ Description of the context:

So we have executed the upgrade path from 2.1.1 to 2.1.7 and from 2.1.7 to 3.2 (please note it is not
mandatory to go through 3.1.1, as validated by the support).

Upgrade path from 2.1.1 to 2.1.7 is quite quick (as there is no schema upgrade here).

However, the upgrade 2.1.7 to 3.2 takes approx. 40 hours in our case…!
So the first thing you have to take care is to plan for a downtime period (or at least read-only) of your source v2.x production system…The only way to assess the duration is to run a “real” migration with a copy of your production data.

For your information, our hadware settings and data volume is described at the end of this post.
Basically, we have about 500 Gb of documents in our repository, but this is not the most important indicator actually.

What you especially have to measure is the number of records you have in ALF_NODE table (and also maybe in ALF_TRANSACTION table). In our case:
SELECT COUNT(*) from ALF_NODE; => 962984

Indeed, during the upgrade, documents are not updated at all, but there are a lots of treatments on the database to upgrade the schema and update the meta-data. We have clearly seen that most of the time/CPU is spent at database level (Oracle for us).

Especially, we can see that most of the time is spent on these 2 treatments:

a/dbscripts/upgrade/2.2/org.alfresco.repo.domain.hibernate.dialect.AlfrescoOracle9Dialect

b/ Applying patch ‘patch.updateDmPermissions’ (Update ACLs on all DM node objects to the new 3.0 permission model).

——————
2/ Some tuning and recommendations:

2.1/ The first advice (mandatory !) is to do a “cold backup” of your source production database. That means when you will do the database export to get a copy of your data, you have to stop Alfresco application. Otherwhise, you will probably experience several unique constraint violation errors…

——

2.2/ If you experience some SQL errors during data migration, it might be “normal”:

For instance, in our case we had some errors like this:
17:09:48,034 ERROR [org.alfresco.repo.domain.schema.SchemaBootstrap] Statement execution failed:
SQL: INSERT INTO t_alf_node
(
id, version, store_id, uuid, transaction_id, node_deleted, type_qname_id, acl_id,
audit_creator, audit_created, audit_modifier, audit_modified
)
SELECT
n.id, 1, s.id, n.uuid, nstat.transaction_id, 0, q.qname_id, n.acl_id,
null, null, null, null
FROM
alf_node n
JOIN t_qnames q ON (q.qname = n.type_qname)
JOIN alf_node_status nstat ON (nstat.node_id = n.id)
JOIN t_alf_store s ON (s.protocol = nstat.protocol AND s.identifier = nstat.identifier)

This might not necessarily a big problem. Some of these errors are well know by the support and due to some bugs in the v2.x Alfresco code. If you have such problem, then contact the support directly, and they will probably be able to provide an SQL script to cleanup the source v2.x data. Of course, you will have to restart the upgrade path from scratch, and re-import the initial v2.x database copy (cleanup script should be applied on the data copy, not directly in production :-).

——

2.3/ Database tuning:

The best way to speed up these treatments (if you want to avoid a too big downtime during migration) is to do some database tuning.

We have asked the support to know what are the best practices for Oracle, and here is the feedback:

” From previous occurrences we have analyzed the Oracle performances using certain statements.

We would recommend that you recalculate the schema stats on the database which may well speed things up a little:

dbms_stats.gather_schema_stats(ownname => user, options => ‘GATHER AUTO’, estimate_percent => dbms_stats.auto_sample_size); “.

We have tested that for Oracle 10g (run a dbms_stats before running the upgrade), but it has no significant effect…

If you are using MySQL, then I think there are a lots of tuning recommendation that the support could share with you, depending
on your configuration…sorry I have no more details for you, but support will probably be able to help.

——

2.4/ JVM tuning
Another tuning we applied was to increase -Xmx to 4G (we have a 64 bits server). This clearly helps for migration, at least it prevent us from OOM during the migration.

——

2.5/ If you experience OutOfMemory during data migration, then in some case you can simply – increase heap size – and relaunch the upgrade. This is possible only if all the Schema Update database script have been succesfully applied, and that the script fail during the “patches” execution.

So basically, if you see such lines in your log files…

10:37:06,193 INFO [org.alfresco.repo.domain.schema.SchemaBootstrap] Executing database script /home/alfresco/alfresco211/tomcat/temp/Alfresco/AlfrescoSchemaUpdate-org.hibernate.dialect.Oracle9Dialect-52799.sql (Copied from classpath:alfresco/dbscripts/create/1.4/org.hibernate.dialect.Oracle9Dialect/post-create-indexes-02.sql).
10:37:06,226 INFO [org.alfresco.repo.domain.schema.SchemaBootstrap] All executed statements written to file /home/alfresco/alfresco211/tomcat/temp/Alfresco/AlfrescoSchemaUpdate-All_Statements-52800.sql.
(…)
10:37:47,789 INFO [org.alfresco.repo.admin.patch.PatchExecuter] Checking for patches to apply …
(…)
18:13:41,080 INFO [org.alfresco.repo.admin.patch.PatchExecuter] Applying patch ‘patch.updateDmPermissions’ (Update ACLs on all DM node objects to the new 3.0 permission model).
18:14:01,264 WARN [org.alfresco.repo.cache.TransactionalCache.org.alfresco.storeAndNodeIdTransactionalCache] Transactional update cache ‘org.alfresco.storeAndNodeIdTransactionalCache’ is full (10000).
18:52:15,387 ERROR [org.alfresco.repo.admin.patch.PatchExecuter] (…)

…that means you can restart the server, and the PatchExecuter will probably be able to restart the upgrade from the latest point.

——

2.6/ See next post : Upgrade path from 2.x to 3.2 : some tuning and recommendations (Part 2)

=============================
3/ Our hardware settings and volume of data:

FYI, our stack is:
– Linux server with 4CPU and 8GB of RAM (64 bits)
– Red Hat Enterprise Linux Server release 5.x
– Oracle 10g (installed on the same server as Alfresco application)
– Tomcat 6.0.18
– JDK 6 u16 x64
– JVM settings:
export JAVA_OPTS=”-Xms4G -Xmx4G -XX:NewRatio=2 -XX:PermSize=160m -XX:MaxPermSize=160m -server”
export JAVA_OPTS=” ${JAVA_OPTS} -Dalfresco.home=${ALF_HOME} -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=80 -XX:+HeapDumpOnOutOfMemoryError”

Our volume of data:

Our contentstore contains ~ 1 400 000 items (files and directories) and ~ 464Gb of documents.

SELECT MAX(ID) from ALF_NODE; => 103353578
SELECT COUNT(*) from ALF_NODE; => 962984

SELECT MAX(ID) from ALF_TRANSACTION; => 103353617
SELECT COUNT(*) from ALF_TRANSACTION; => 2540267

Advertisements

2 Responses to Upgrade path from 2.x to 3.2 : some tuning and recommendations (Part 1)

  1. Enguerrand SPINDLER says:

    Another database tuning, if you are using Oracle:

    We have the option to put tablespace in NOLOGGING mode.

    It means redo entries will not record in archive files. Hence reducing the database load…

    In our case, improvement was not small, but you can try that…

  2. […] Upgrade path from 2.x to 3.2 : some tuning and recommendations (Part 2) Previous post : Upgrade path from 2.x to 3.2 : some tuning and recommendations (Part 1) […]

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: