In 2008 I became aware of CMIS (Content Management Interoperability Services) and started following its progress (from afar). I knew that several big players in the ECM world were pushing CMIS, and that it would allow interoperability between different repositories. (I blogged about CMIS in an earlier post.)
However, I never really understood how CMIS worked on a technical level. I know that there is a plethora of content about how CMIS works, but, not having a BIG brain, I had a tendency to jump in at the deep end and almost drown in the information that was available.
This has changed (a little bit). I’ve just watched a couple of short YouTube videos: one in which Dr David Choy (Chair of the OASIS CMIS TC and a member of EMC’s CTO staff) discusses the technology behind CMIS, and another in which Jignesh Shah (Product Manager for EMC’s DFC technology) talks about the application of CMIS. (Links to these are at the bottom of the page.)
… I made notes! Here they are:
Mark’s small Brain Notes on CMIS
CMIS Technology Primer
What is it?
A proposal from EMC, IBM & Microsoft, developed in collaboration with OpenText, Oracle, Alfresco and SAP, to provide an interoperability standard that allows a generic application to access different content repositories without product-specific interface code.
Why is it needed?
Many companies are running disparate repositories from different vendors. This can be, for example, because each repository offers specific functionality that a specific business unit within the company makes use of, or that the company has acquired the content store/repository through the purchase of, or merger with, another company. Having a standard that allows an application to access each repository (regardless of which product the application belongs to) allows the Enterprise to make full use of the information that it has.
How does it do that?
CMIS offers a Service Oriented Infrastructure (CMIS APIs) that an application can use to access the content from each repository. This means that an application does not have to worry about how to communicate with the repository – just the CMIS APIs.
Design points for CMIS
The design goals were:
* Platform independent, programming language independent, protocol independent.
* Easy to layer on top of an existing repository. It cannot be complex: the data model should be able to sit on top of existing repositories so that the data model and behaviour match those of the repository.
What isn’t it?
CMIS is not a full-function interface to expose the proprietary functionality of proprietary content management systems. It does, however, provide the core functionality that is common to the CMS products on the market today.
What does CMIS consist of?
1. An abstract data model that describes the content stored in the different repositories. It is a generic model that can be easily mapped to different proprietary applications.
2. A set of abstract services that allow an application to access the content stored in a repository.
3. Two web binding protocols. These allow CMIS’s generic services to be made available via the web to an application, regardless of the specific protocol used by the CMS.
The web binding protocols currently used are:
* SOAP
* REST using APP (the Atom Publishing Protocol)
Additional protocols may be supported in the future.
= Predefined object types =
There are 4 object types defined by CMIS. Each has an immutable object_id.
1. Document – represents an elementary asset that is stored in the repository (document, image file, video, etc). Document objects can be versioned, and are searchable.
2. Folder – a container object that can contain other objects (including folders, so a folder hierarchy is also possible). Document objects can be stored in multiple folders (multi-filing). Document objects can also be “unfiled” (orphaned), that is, they do not reside in any folder.
3. Relationship – represents a binary relationship between two other objects. The relationship can have its own properties.
4. Policy – an administrative policy that may be applied to other objects (e.g. a content retention policy).
The four object types surround the CMIS meta model:
* DOCUMENT: Content, Metadata, Version History
* FOLDER: Container, Hierarchy/Filing, Metadata
* RELATIONSHIP: Source, Target
* POLICY: Target
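To help my small brain, here is a minimal Python sketch of the four object types described above. It is only an illustration of the notes, not anything taken from the CMIS specification: the class names, the `check_in`, `add` and `apply_to` methods, and the way ids are generated are all my own assumptions.

```python
import itertools

_id_counter = itertools.count(1)  # hypothetical id generator, not part of CMIS


class CmisObject:
    """Common base: every object has an id that cannot be changed later."""

    def __init__(self, name, **properties):
        self._object_id = f"obj-{next(_id_counter)}"
        self.name = name
        self.properties = dict(properties)  # the object's metadata

    @property
    def object_id(self):
        # Read-only property: there is no setter, so the id is immutable.
        return self._object_id


class Document(CmisObject):
    """An elementary asset with content and a version history."""

    def __init__(self, name, content=b"", **properties):
        super().__init__(name, **properties)
        self.versions = [content]  # oldest first

    def check_in(self, content):
        self.versions.append(content)

    @property
    def content(self):
        return self.versions[-1]  # latest version


class Folder(CmisObject):
    """A container; holding other folders gives a hierarchy."""

    def __init__(self, name, **properties):
        super().__init__(name, **properties)
        self.children = []

    def add(self, obj):
        # The same document may be added to several folders (multi-filing).
        self.children.append(obj)


class Relationship(CmisObject):
    """A binary relationship between a source and a target object."""

    def __init__(self, name, source, target, **properties):
        super().__init__(name, **properties)
        self.source_id = source.object_id
        self.target_id = target.object_id


class Policy(CmisObject):
    """An administrative policy that can be applied to other objects."""

    def __init__(self, name, **properties):
        super().__init__(name, **properties)
        self.applied_to = []  # ids of the target objects

    def apply_to(self, obj):
        self.applied_to.append(obj.object_id)


# Usage: one document filed in two folders, plus an unfiled document.
report = Document("report.txt", b"draft")
finance, archive = Folder("Finance"), Folder("Archive")
finance.add(report)
archive.add(report)
orphan = Document("notes.txt")  # "unfiled": it sits in no folder
```

A real repository would of course persist these objects and enforce its own rules; the point here is only how the four types relate to each other.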
= Basic Services =
i) Access
CMIS provides basic services to access these 4 object types: CRUD (Create, Retrieve, Update, Delete). These apply to all 4 types of objects. A repository can create subtypes from the 4 basic types, and the basic services also apply to those subtypes.
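The CRUD idea can be sketched with a toy in-memory repository. Everything here is hypothetical (the `InMemoryRepository` class, its method names and the id format are my own inventions, not the CMIS service definitions); it just shows the same four operations working on base types and on a subtype.

```python
import itertools

BASE_TYPES = {"document", "folder", "relationship", "policy"}


class InMemoryRepository:
    """Toy repository: CRUD over the four base types and their subtypes."""

    def __init__(self):
        self._objects = {}    # object id -> stored record
        self._subtypes = {}   # subtype name -> base type name
        self._ids = itertools.count(1)

    def define_subtype(self, name, base):
        if base not in BASE_TYPES:
            raise ValueError(f"unknown base type: {base}")
        self._subtypes[name] = base

    def _base_of(self, type_name):
        if type_name in BASE_TYPES:
            return type_name
        return self._subtypes[type_name]  # KeyError for an unknown type

    def create(self, type_name, **properties):
        base = self._base_of(type_name)
        object_id = f"{base}-{next(self._ids)}"
        self._objects[object_id] = {
            "type": type_name,
            "base": base,
            "properties": dict(properties),
        }
        return object_id

    def retrieve(self, object_id):
        record = self._objects[object_id]
        # Return a copy so callers cannot bypass update().
        return {**record, "properties": dict(record["properties"])}

    def update(self, object_id, **changes):
        self._objects[object_id]["properties"].update(changes)

    def delete(self, object_id):
        del self._objects[object_id]


# Usage: the same calls work for a base type and for a subtype.
repo = InMemoryRepository()
repo.define_subtype("invoice", base="document")
folder_id = repo.create("folder", name="Finance")
invoice_id = repo.create("invoice", name="inv-001.pdf")
repo.update(invoice_id, status="paid")
```

The calling application never needs to know how the objects are stored, which is the whole point of the notes above.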
CMIS also provides services to apply a policy to an object (effectively placing the object under the control of an administrative policy).
ii) Query capability
CMIS allows for the querying of objects.
The CMIS Data Model consists of object type definitions. The object types define the schema. (The schema is effectively a set of properties for each object.)
On top of the model, a Relational View is defined.
A property in the data model appears as a column in a table. On top of that, a subset of SQL-92 is used to search against the relational view. This was extended to support:
* full-text searching;
* searches on multi-value properties (each column in a relational view is single-valued, but in content management a property can sometimes have multiple values);
* limiting a search to a particular folder or folder tree.
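To make the query extensions concrete, here is a small Python sketch that builds query strings. The keywords it emits (`CONTAINS`, `ANY`, `IN_FOLDER`, `IN_TREE`) and the `cmis:` names are the ones I know from the CMIS 1.0 specification as later ratified by OASIS, so the drafts discussed in the videos may have differed. The `build_query` helper, the `invoice:tags` property and the folder id are hypothetical, and the sketch deliberately refuses values that would need escaping rather than guess at the escaping rules.

```python
def literal(value):
    """Wrap a plain string in single quotes; escaping is out of scope here."""
    if "'" in value or "\\" in value:
        raise ValueError("this sketch does not handle escaping")
    return f"'{value}'"


def build_query(type_name="cmis:document", columns=("*",), full_text=None,
                any_equals=None, in_folder=None, in_tree=None):
    """Build a query string against the relational view (hypothetical helper).

    full_text  -- text for a full-text search
    any_equals -- (property, value) pair for a multi-value property
    in_folder  -- folder id: direct children of that folder only
    in_tree    -- folder id: that folder and everything beneath it
    """
    conditions = []
    if full_text is not None:
        conditions.append(f"CONTAINS({literal(full_text)})")
    if any_equals is not None:
        prop, value = any_equals
        conditions.append(f"{literal(value)} = ANY {prop}")
    if in_folder is not None:
        conditions.append(f"IN_FOLDER({literal(in_folder)})")
    if in_tree is not None:
        conditions.append(f"IN_TREE({literal(in_tree)})")

    query = f"SELECT {', '.join(columns)} FROM {type_name}"
    if conditions:
        query += " WHERE " + " AND ".join(conditions)
    return query


print(build_query(columns=("cmis:name",), full_text="invoice",
                  in_tree="folder-123"))
# SELECT cmis:name FROM cmis:document WHERE CONTAINS('invoice') AND IN_TREE('folder-123')

print(build_query(any_equals=("invoice:tags", "urgent")))
# SELECT * FROM cmis:document WHERE 'urgent' = ANY invoice:tags
```

The resulting string would then be handed to a repository's query service over one of the bindings; how that call looks depends on the client library, so it is left out here.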
Thus, two basic functions:
1. CRUD
2. Query
This allows a generic application to access repositories without writing repository-specific code.
Dr David Choy’s video discussing the Technology behind CMIS
Jignesh Shah video discussing application scenarios of CMIS
Alfresco’s – Getting Started with CMIS