Monday, February 21, 2011

OWB and Microsoft Sharepoint - Part II

--

In Part-I, I had covered how data\documents can be inserted\uploaded into Sharpoint. In this post, I will demonstrate how to extract data. Please read the previous post, if you have not read it.

The extraction can be done in two ways.
1) Use the code template mapping to invoke the webservice and store the XML in to a table. Then use a conventional mapping to extract the data using SQL.
2) Use the code template mapping to invoke the webservice to hold the data into C$ table and then parse it using a table function or leveraging XDB in the  Default Oracle Code Template.

This blog post will demonstrate using the first approach.
I will be extracting the data that was inserted into sharepoint list "Employee" from the previous post.

Extracting data from sharepoint using GetListItems -

Selecting soap action "GetListItems" under "lists.asmx" webservice provides details which can be used to extract listitems from sharepoint.


CAML Format to extract the listitems -

Target is to extract all the listitems. The default row limit of SharePoint is 100 rows which can be overrided as shown below.

Note: If you want to extract data incrementally, all the list have inbuild column "Modified" which can be used as the filter in the CAML query.

Details of how CAML for GetListItems can be found here on msdn. CAML can be constructed to select the columns of your choice,order them , filter them , etc.




Data required to invoke the webservice to get listitems.



Logical View of the Mapping to extract sharepoint listitems -




Execution View of the Mapping to get listitems -



After executing the mapping -



All the listitems (rows) returned in the XML are in the CLOB field.

Parsing the XML -

Conventional OWB Map using SQL Override feature to parse the XML.



Undbounded View Query -

The Unbounded view using the table EMPLOYEE_LISTITEMS.  The query parses the XML to get all the listitems.


Executing the Map -


Integrating with SharePoint  - change data can be easily tracked but there is no webservice which returns deleted data.  Either a minus operation after complete extraction or creating a webservice on sharepoint server can easily fix this issue.

Webservices are slow by nature but again with parallel option and cdc techniques data can be easily pushed into sharepoint.

--

Sunday, February 20, 2011

OWB and Microsoft Sharepoint - Part I

--

In this blog post, We will look at one of the ways to integrate with beast of a content management tool "Microsoft SharePoint Portal".

Sharepoint is used in wide variety of ways like for collaboration, content management , reporting tool , etc within an organization. There is always a need to integrate with it. Some organziations need the data from sharepoint into the warehouse or move data from database into sharepoint.

How difficult is integration with Sharepoint ? Can data be pushed in realtime into sharepoint or extracted incrementally from sharepoint ? Can documents be handled ?

Well I will try to answer these questions in this post with a solution and a demo.

For demo will be concentrate on Sharepoint Listitems and attaching\uploading documents (Yes Documents) to listitems using OWB 11gR2. (ODI blogs will follow soon :-) )

Sharepoint is built on .NET framework and there is absolutely no support for JAVA. No jar files. No JDBC drivers. Thats one of the reason, so many ETL\ELT tools in the market do not have any adapter to integrate with sharepoint. As far as I know only Microsoft's ETL tool SSIS comes with an adapter.

Here are few more options to extract and push data into sharepoint -
1) Microsoft BDC (Business Data Catalog) or BCS (Business connectivity services) sharepoint utilities to sync data between databases and sharepoint lists.
2) Sharepoint Object Model programming
3) Third party products
4) MOSS Webservices ( CAML + SOAP )

Top 3 options belong to the .NET family and they need to be coded\configured\executed at sharepoint server.

If you are only limited to using MOSS webservices (which is the case in almost all cases in all the projects - nobody wants to pay for additional licenses) here is a way to go. I will try to explain most of the things in this post in detail to explain how sharepoint works.

MOSS webservices exposes every action(add\delete\update) in sharepoint.
It is a soap based webservice with the infamous CAML (Collabartive Application Markup Language ). CAML contains specific groups of tags to both define and display data. In short, CAML is like a sql query with a XML syntax. You can find more details about CAML over here.

There are couple of challenges using MOSS webservices -
1) IIS Authentication (NTLM V1/V2) used by sharepoint to authenticate users
2) Document handling - requires base64 binary format

If you have tried Oracle PL/SQL to invoke MOSS webservice , it fails with "401 Authentication" error. This is because of the IIS authentication. Oracle PL/SQL just supports basic authentication.Many unanswered queries can be found on the forum.

To integrate with sharepoint, I will use a custom Code Template which would use CAML.
Code template takes care of all the challenges and abstract's the complexity from the user. One code template to invoke\consume sharepoint service.This makes mapping design very simple.




The code template uses SOAP protocol and designed using Java , Jython and Java Bean Shell.
Sharepoint Portal contains many sites\sub-sites ( like a database schema). Each site\subsite have list (imagine them like tables). There can be multiple views(like a database view) on top of a list but one view is created by default. List have listitems (basically rows).MOSS Webservices works for each site\subsite.





"Employee" List will be the demo example.

Note: Custom Metadata Interface (CMI) (Java based OWB api) can be used to reverse engineer sharepoint lists to represent it within OWB as tables and code templates can be created to generated CAML on the fly. This is out of context of this blog post but i will try to demonstrate CMI in a future blog post with some other simple example.

To invoke MOSS webservices, I will use the technique as demonstrated in one of my previous blogpost about invoking/consuming RESTful webservice using OWB11gR2.

MOSS webservice allows to perform different kinds of action on entire site\subsite like add\update\delete list\listitems , uploading documents, adding users , groups etc. Every SharePoint site has different webservices for it and every webservice has multiple actions (soapactions) to carry out different tasks.

Listitems webservice:

On (any) sharepoint site homepage:
http://sharepoint.yourorganization.net/sites/yoursite/orsubsite/default.aspx
replace "default.aspx" with "_vti_bin/lists.asmx"
this will give you the webservice endpoint
http://sharepoint.yourorganization.net/sites/yoursite/orsubsite/_vti_bin/lists.asmx
to handle list and listitems.
Soap Action and Webservice Endpoint will be used as inputs by the code template.




Inserting Listitmes :

Selecting operation "UpdateListItems" under "lists.asmx" webservice provides details of the soap message to insert/update/delete listitems.



CAML Format generation for Inserting Listitems -

The details of how to create the CAML query for "UpdateListItems" can be found here.
For the example above ,  soap message for insert will look like this.
All the column names of Employee list are substituted in the soap message.

Note: The "%s" will be replaced by values from the rows by the code template.




I will be using the default Employees table which is under the sample HR schema , to insert all rows from it into the sharepoint portal. I had customized the table to include three more columns .
LISTNAME(1st column) - SharePoint Listname under which all the data will appear as listitems
SUCCESS_FLAG(Second last column) - Flag to indicate whether the record is successfully updated\inserted\deleted in sharepoint
LISTITEM_ID (last column) - SharePoint generates ID(unique number) for each row which can be used to update\delete data from sharepoint




Logical view of the code template mapping to insert listitmes -

Note: Sharepoint assigns every listitem with a unique number called ID (like a surrogate in sharepoint). This is very handy to update delete records from sharepoint.


The MOSS_XML field will store the return XML after invoking sharepoint webservice from which ID will be extracted and stored in LISTITEM_ID column.





Execution view of the mapping to insert listitems-

The important inputs are CAML_FORMAT , WEBSERVICE_ENDPOINT and SOAP_ACTION.




Once the mapping is executed , the results can be seen. The Code template does the required IIS Authentication and updates the SUCCESS_FLAG for inserted row in SharePoint and captures the ID from the sharepoint server.





The Sharepoint View -






Attaching(Uploading) document to SharePoint Listitems:

Selection operation "AddAttachment" under "lists.asmx" webservice provides details of soap message to upload documents to listitems.


CAML Format to upload files as attachment -

The CAML format is simple soap message.




The list of files to be uploaded.





Logical view of the map to upload the files -



Execution view of the mapping to upload attachments -

The column FILE_PATH which holds the path along with filename is specified as input. Code template reads the file and converts it into base64 binary to upload it to sharepoint.


After executing the mapping.


The sharepoint view -


On clicking the listitiems, the attachments can be viewed.




Using the Code template approach any sharepoint webservice can be invoked. All kinds of things like handling users,groups,lists,calenders,etc can be managed easily.

Customizing the template to include Journalized data and now you can push data in near realtime into sharepoint.

In the Part-II , I will cover extraction of data from sharepoint into oracle and discuss the pros and cons.

--