October 25, 2009
Last week we had a conf-call with Mike Farman (Alfresco Product Manager) and we got some news about the “quota per space” feature roadmap:
Reminder: quota per space is the ability to set a maximum size limit on a space (this already exists per user).
– Quota per space is still not implemented in v3.2,
– This is clearly identified by the Alfresco team as a must-have feature (requested by many customers).
– Alfresco is working on that topic, but they need time to design it properly to avoid impact on performance due to space size calculation.
– This feature is expected for v3.3 E (Q3 2010).
Note: for our company this feature would be really useful, because we currently have no way to restrict the storage used by the business units which use our centralized DM platform. What we have implemented is a job which recursively browses all the spaces and calculates the total size of each one. However, this is a very intensive and long-running process (of course, the bigger the spaces are, the longer the process takes). This job is under development/test, and we will soon deliver it to production. We currently have 600 GB of data, so we will soon have feedback about whether or not such a process can be run on a large volume of data…
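To give an idea of what such a job computes, here is a minimal sketch in Python. It is not our actual job and it does not use the Alfresco API: the Node class, the space_sizes and over_quota functions and the quota values are all hypothetical, and the repository is replaced by a small in-memory tree. Each node is visited once, and the size of a space is the sum of the sizes of everything below it.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """Hypothetical stand-in for a repository node (a space or a document)."""
    name: str
    is_space: bool = False
    size_bytes: int = 0                      # content size; ignored for spaces
    children: List["Node"] = field(default_factory=list)


def space_sizes(root: Node) -> Dict[str, int]:
    """Return the cumulative size of every space, keyed by its path."""
    totals: Dict[str, int] = {}

    def walk(node: Node, path: str) -> int:
        if not node.is_space:
            return node.size_bytes
        # Post-order walk: the children are summed first, so each node is read once.
        total = sum(walk(child, f"{path}/{child.name}") for child in node.children)
        totals[path] = total
        return total

    walk(root, root.name)
    return totals


def over_quota(totals: Dict[str, int], quotas: Dict[str, int]) -> List[str]:
    """Return the paths of the spaces whose total size exceeds their quota."""
    return sorted(path for path, limit in quotas.items() if totals.get(path, 0) > limit)


# Made-up example tree (sizes in bytes).
example = Node("Company Home", True, children=[
    Node("BU-A", True, children=[
        Node("a.ppt", size_bytes=300),
        Node("Archive", True, children=[Node("b.pdf", size_bytes=200)]),
    ]),
    Node("BU-B", True, children=[Node("c.doc", size_bytes=50)]),
])
example_totals = space_sizes(example)
example_alerts = over_quota(example_totals, {"Company Home/BU-A": 400, "Company Home/BU-B": 400})
```

On a real repository the cost comes from reading every node, which is why the process takes longer as the spaces grow; the sketch only shows the aggregation logic.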
October 25, 2009
Last week we had a conf-call with Mike Farman (Alfresco Product Manager) and we got some information about the web client roadmap:
– First we mentioned our need for a more user-friendly web UI (better than the JSF client)…
– We said that the new Share UI is a good improvement; however, the problem is that documents uploaded through Share are managed in a “Site” structure in Alfresco DM, which is not the expected design (we would like to manage a standard DM structure). Also, Share is not aligned with DM in terms of capabilities…
– Mike told us that the current client (JSF based) will continue to be supported in the long term (no “retirement plan” for this client; minor enhancements and bug fixes will still be provided).
– Mike also said that:
– In the “middle term” (starting v3.3 to v4.0):
– Alfresco’s plan is to implement more and more core DM client capabilities in Share.
– The target is to fully align Share capabilities with DM client capabilities.
– Features that are currently under “migration” are advanced workflow, actions, etc.
– We mentioned that our biggest priority is to align security management (roles, user groups) between DM and Share.
– Long-term roadmap (v4.0 or higher?; early 2011; very rough estimate; no commitment can be made on this delivery date for the moment):
– Alfresco has a plan to deliver a “Share like UI” on top of DM repository.
– This new UI will be a “Repository view” (currently Share provides a “Site view”).
– So the DM structure will remain unchanged (like “Company Home/MyRootSpace/MyChildSpace”),
– The internal name for this project is “repository doc lib”.
– This seems to be exactly what we are looking for… but once again: v4.0 or higher; early 2011?; no commitment on this delivery date…
– Custom web UI implementation:
– If a customer wants to implement a custom web UI, the recommendation is clearly to use SURF and webscripts.
– SURF and webscripts are the target for any web UI (used for Share), and will continue to be supported.
October 23, 2009
We had a call with Mike Farman (Alfresco Product Manager) this morning, and here is the summary of our discussion about “basic” changes that can be made to Alfresco DM in order to “quickly” optimize performance:
- Mike Farman confirmed that the OpenOffice processing (plain-text transformation for indexing) and the Lucene indexing process are the most CPU-intensive.
- So these are the 2 major things we should work on (in our case) if we need to improve the scalability of the current platform (as well as JVM tuning, see below).
- For OpenOffice:
– As expected OO has big impact on CPU/RAM (whatever the version used, impact is similar),
– OO is used for:
– File transformation (e.g. .ppt to .pdf),
– Plain-text transformation (e.g. .ppt to plain text), which allows Lucene indexing,
– Some specific meta-data extraction: but OO is not mandatory here, and it can be “replaced” by another process,
– Technically, it is possible to not use OO at all,
– Reminder: OO is currently disabled in our environment (with some limitations, however: .ppt file content is no longer indexed by Lucene, and PDF transformation does not work).
– The OO process can be run on a separate remote server/JVM.
– This is documented here : http://wiki.alfresco.com/wiki/Running_OpenOffice_From_Remote_Machine
– We will check if this is something feasible in our current architecture,
– If OO is run on a separate server, the benefits for us will be:
– Lucene indexing of .ppt documents (and others?) can be re-enabled,
– File transformation to .pdf can be re-enabled.
- For Lucene indexing:
– Lucene has 2 major impacts on Alfresco CPU/RAM:
– The Lucene indexing process (indexing of new documents) is resource-consuming. This is the heaviest process.
– Lucene search request (end-user searching for a document).
– The Lucene search engine cannot be completely disabled (it is used for document search but also for some internal technical search operations).
– The Lucene search engine cannot be run on a separate server (neither in version 2.x nor 3.x).
– Lucene indexing of document content can be adapted in order to:
– Either do the job asynchronously (scheduled during the night using cron). Meta-data will still be searchable in real-time.
– Or completely “disabled” (i.e. search form removed from the web UI): then we need to implement another third-party search solution. Meta-data will still be searchable in real-time.
– A custom solution (AMP) will be provided by Mike to do the job asynchronously or to disable completely Lucene indexing of doc content.
– Technically: the solution consists in applying an aspect to each document (flag indexing = no/yes, default value = no). To allow indexing, the flag should be set to yes programmatically (solution to be studied?).
- JVM tuning also seems to be very important: we should check whether the following recommendations can/should be applied (http://wiki.alfresco.com/wiki/Repository_Hardware).
It is recommended to do some JVM profiling to see whether some tuning parameters can improve performance (also check GC cycles).
- Database tuning should be considered also (no recommendation provided here).
- Other performance considerations: the more rules you have, the higher the CPU consumption will be…
– In our case, we only use a few rules (mainly for meta-data extraction),
– Moving OO to a separate server can reduce the CPU impact when the file-to-.pdf transformation is used (this rule is still available on our server; it might or might not be used by end-users…).
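To illustrate the asynchronous indexing idea discussed in the Lucene part above (a per-document flag, default value = no, set to yes programmatically), here is a small Python sketch. It is only a model of the principle, not the AMP mentioned by Mike and not Alfresco or Lucene code: the Document and Repository classes, the index_content field and the nightly_content_indexing method are hypothetical names, and the "index" is a plain dictionary. Meta-data is indexed as soon as a document is added, while content is only indexed when the scheduled job runs.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class Document:
    """Hypothetical document; index_content mirrors the flag (default = no)."""
    name: str
    metadata: Dict[str, str]
    content: str
    index_content: bool = False


class Repository:
    """Toy repository with separate meta-data and content indexes."""

    def __init__(self) -> None:
        self.documents: List[Document] = []
        self.metadata_index: Dict[str, Set[str]] = {}
        self.content_index: Dict[str, Set[str]] = {}

    @staticmethod
    def _index(index: Dict[str, Set[str]], text: str, name: str) -> None:
        # Naive tokenisation: lower-case words separated by whitespace.
        for term in text.lower().split():
            index.setdefault(term, set()).add(name)

    def add(self, doc: Document) -> None:
        """Store the document and index its meta-data immediately."""
        self.documents.append(doc)
        for value in doc.metadata.values():
            self._index(self.metadata_index, value, doc.name)

    def nightly_content_indexing(self) -> int:
        """Scheduled job: set the flag to yes and index the pending content."""
        processed = 0
        for doc in self.documents:
            if not doc.index_content:
                doc.index_content = True
                self._index(self.content_index, doc.content, doc.name)
                processed += 1
        return processed

    def search(self, term: str) -> Set[str]:
        term = term.lower()
        return self.metadata_index.get(term, set()) | self.content_index.get(term, set())


repo = Repository()
repo.add(Document("plan.ppt", {"title": "Budget plan"}, "quarterly revenue forecast"))
found_by_metadata = repo.search("budget")          # meta-data is searchable at once
found_before_job = repo.search("revenue")          # content is not indexed yet
processed_count = repo.nightly_content_indexing()  # e.g. triggered by cron at night
found_after_job = repo.search("revenue")
```

The point of the design is that the expensive part (content extraction and indexing) moves out of the upload path and into a time window where CPU is available, while meta-data search keeps working in real-time.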
October 8, 2009
Today the Xenit company showed us a demo of their new “Fred” interface.
What is Fred?
Basically, it’s a desktop application (thick client) which provides “Windows Explorer”-like features, and other features like Outlook email client integration.
To give you an idea, the UI looks like an instant messaging window which displays a Windows folder tree structure.
Technically, the client side is based on .Net, and the communication with Alfresco servers uses the Alfresco REST API (as well as a Fred “connector” which must be installed on the server side).
Clearly the target is to provide an alternative to the Windows Explorer client, which is of course well known by users but which uses the CIFS protocol (too verbose to be used over a WAN).
With the first release of Fred’s interface, one can:
– drag and drop files or emails (from Windows to Alfresco),
– update/read meta-data,
– configure/bookmark a preferred root space folder,
– easily get the link to an Alfresco doc and send it by email.
More about Fred
FYI, Fred is still in beta and will soon be deployed at a first customer, according to the Xenit representative.
Basically, my opinion is that Fred is a very user-friendly interface (compared to the current JSF web client) and that the REST API approach might ease remote working (vs CIFS).
However, I’m not sure that a thick client is the best approach for distributed companies (deploying and installing binaries on local computers is clearly very difficult when you have thousands of PCs to manage).
So the first company able to deliver a “Windows Explorer”-like client based on pure web technologies will probably win the battle…
October 8, 2009
Answer is: yes
According to the Alfresco representative, document node IDs are not modified during a software upgrade. Node IDs are stored in the database, so they keep the same value (as an upgrade requires a DB copy).
Also, on a cluster, both cluster members will work with the same node IDs.
On a cluster, the only elements which might not have the same ID are the Lucene index UUIDs (but this is not a problem because each cluster member has its own index copy).