Thursday, April 25, 2013

Data Services within the SAP BI staging process


By Bhargav Gandhi, Tata Consultancy Services
Data Services XI 3.x is the official name for the EIM (Enterprise Information Management) platform. BusinessObjects™ Data Services is a single platform for data delivery and data quality that can move, integrate, and improve any type of data, anywhere, at any frequency. Data Services is both a platform and a product. It is the platform upon which all new EIM functionality will be based.

SAP BI is the business intelligence platform from SAP.

Data Services tools can be used in the SAP BI staging process, especially for non-SAP data, where the enhanced integration capabilities of Data Services might be required anyway.

The procedure for using Data Services features in the SAP BI staging process is as follows:
  1. Load the non-SAP data into the Data Services engine.
  2. Perform Data Quality operations on the data in Data Services.
  3. Load the data into SAP BI.
Connecting SAP BI with Data Services
To allow the exchange of data and metadata between SAP BI and Data Services, we first have to establish a connection between the two systems. On the SAP BI side this is achieved by creating an External Source System, and in Data Services we have to create a new Datastore.
 
1.       Log on to the SAP BI system.
2.       Start the Data Warehousing Workbench using transaction code RSA1 and select Modeling from the navigation area. Pick the entry Source Systems.
3.       Position the cursor on the External System folder, and choose Create from the context menu as shown above.
 
4.       Enter Logical System Name and Source System Name as shown above and hit Continue. 
 
5.       Data Services starts an RFC server program and indicates to SAP BI that it is ready to receive RFC calls. To identify itself as the RFC server representing this SAP BI Source System, it exchanges a keyword with SAP BI; in the screenshot above it is "DEMO_SRC". This keyword is the Registered Server Program under which the Data Services RFC server registers itself with SAP. Therefore, provide the same Program ID that you will use when calling the RFC server on the Data Services side. All other settings for the Source System can keep their default values. To complete the definition of the Source System, save it.
6.       Now start the RFC server on the Data Services side so that we can test the connection and load data from Data Services into the SAP BI system.
a)       Open the MS-DOS command prompt (Start > Run, type CMD, and hit OK).
b)       Go to the bin directory of BusinessObjects Data Services, e.g. C:\Program Files\Business Objects\BusinessObjects Data Services\bin
c)       Call the program "rfcsvr -aPROGRAM_ID -gSAP_ROUTER_STRING -xSAP_GATEWAY". The PROGRAM_ID is the keyword we used above, "DEMO_SRC" in our example. The router string and the SAP gateway are values the BW administrators should know. As a guideline, you can open the SAP Logon Pad and try the settings shown there.
d)       The server here is 172.17.10.162 and the System Number is 00. The router string could hence be "/H/172.17.10.162/S/3300". The /H/ stands for the hostname and the /S/ for the port number. The port number can be derived from the system number: system 00 means port 3300. The same applies to the SAP gateway. In a standard standalone installation its value is sapgw00.
e)       Type rfcsvr -aDEMO_SRC -g/H/172.17.10.162/S/3300 -xsapgw00 and hit Enter. It should not return anything.
 
7.       To check the connection, click on Connection Test as shown below.
 
8.       The result would be as shown below in case of a success. 
 
9.       The newly entered 3rd party system would appear in the list of External Systems as shown below. 
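
The rfcsvr call from step 6 follows a fixed pattern, so it can be assembled from the Program ID, the host, and the system number. The following is a minimal Python sketch, assuming the standalone setup described above (port 3300 plus the system number, gateway name sapgw plus the system number). The helper names gateway_port and rfcsvr_command are hypothetical and only build the command string; they do not start the RFC server.

```python
def gateway_port(system_number: str) -> int:
    """Port for the router string: 3300 plus the two-digit system number."""
    if len(system_number) != 2 or not (system_number.isascii() and system_number.isdigit()):
        raise ValueError("system number must be two digits, e.g. '00'")
    return 3300 + int(system_number)


def rfcsvr_command(program_id: str, host: str, system_number: str) -> str:
    """Assemble the rfcsvr call for a standard standalone installation."""
    router = f"/H/{host}/S/{gateway_port(system_number)}"
    gateway = f"sapgw{system_number}"
    return f"rfcsvr -a{program_id} -g{router} -x{gateway}"


# Values taken from the example in step 6.
print(rfcsvr_command("DEMO_SRC", "172.17.10.162", "00"))
```

With the example values this reproduces the command typed in step 6 e). For other landscapes, the router string and gateway should still be confirmed with the BW administrators.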



PoC – Loading unstructured data from Flat files into SAP BI using BusinessObjects Data Services 
Let us take a sample tab-delimited flat file as shown below. (Note: the sample data is shown in an Excel file only to make it easier to read. In the actual PoC we will be using a text file.)
 
For the sample flat file used here, click here. 
Define the SAP BI system as Datastore in Data Services  
Before we start designing a job for our PoC, let us create a datastore for the SAP BI system.
  1. Open the BusinessObjects Data Services Designer.
  2. In the Local Object Library, select the Datastore tab.
  3. Right-click and select New to create a new Datastore as shown below.
  4. Type in the datastore name as shown below.
  5. Select SAP BW Target as the Datastore type. We want to load data into SAP BI, so SAP BI is the target.
  6. Enter the location/IP address of the SAP BI system server.
  7. Enter your user name and password for SAP BI.
  8. Under the Advanced section, enter Client and System Number. (You can find the system number and application server details in SAP Logon Pad > Change Item.)
  9. In our example (refer screenshot 4), enter ‘/H/’ as the router string.
  10. Hit OK.
  11. The new datastore BW_PoC_Target_Datastore is now available under Local Object Library as shown below.
Defining new Flat File format for our sample data 
  1. Open the BusinessObjects Data Services Designer.
  2. In the Local Object Library, select the Formats tab.
  3. Right-click on the Flat Files format and select New.
  4. You will see the File Format Editor window.
  5. Select or enter the data as given below:
    1. Type – Delimited
    2. Name – PoC_File_Format
    3. Root directory – Path where the sample flat file is stored (in our case it is C:\Documents and Settings\164600\My Documents\BOBJ Data Services\BOBJ Data Services\Reference Material\PoC)
    4. File name – In our case it is PoC1.txt. You will see a prompt to overwrite the current schema. Hit Yes. You will now see your sample data in the editor window. Note that Designer automatically assigns appropriate data types based on the input data.
    5. Under Delimiters, select Column – Tab
  6. Click Save & Close.
PoC_File_Format is now available under the Flat Files tree in the Local Object Library as shown below.
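
To make the file format definition more concrete, here is a minimal Python sketch of what reading a tab-delimited file and guessing column data types involves. It is not how Designer does it internally. The sample rows, the column names, and the helper names infer_type and read_schema are hypothetical, since the contents of PoC1.txt are not reproduced in this post.

```python
import csv
import io

# Hypothetical sample rows standing in for PoC1.txt (tab-delimited, header row first).
SAMPLE = "CustomerID\tName\tCountry\n1\tAsha\tIN\n2\tBob\tUS\n3\tChen\tCN\n"


def infer_type(values):
    """Return 'int', 'decimal' or 'varchar' for a column of strings."""
    for name, cast in (("int", int), ("decimal", float)):
        try:
            for value in values:
                cast(value)
            return name
        except ValueError:
            continue
    return "varchar"


def read_schema(text):
    """Parse tab-delimited text; return (rows, {column: inferred type})."""
    rows = list(csv.DictReader(io.StringIO(text), delimiter="\t"))
    columns = list(rows[0].keys()) if rows else []
    schema = {c: infer_type([r[c] for r in rows]) for c in columns}
    return rows, schema


rows, schema = read_schema(SAMPLE)
print(schema)
```

The type names used here are simplified labels for illustration and are not the data types Designer assigns.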


Designing job for the PoC in Data Services Designer 
  1. Click on the Project menu and select New > Project.
  2. Let us call our new project PoC.
  3. Create a new job by selecting New Batch Job from the context menu (right-click menu). Call it PoC_Job1.
  4. Create a new dataflow and name it PoC_DataFlow1.
  5. Open the PoC_DataFlow1 workspace.
  6. Drag PoC_File_Format to the workspace as a Source.
  7. Let us have some validation done on the source data. Let us assume we want only those records where the country is ‘IN’ (India) or ‘US’.
  8. Drag the Validation transform from the Transform tab of the Local Object Library onto the Data Flow workspace.
  9. Connect the source with the validation.
  10. Double-click on the Validation box.
  11. You will see the Validation Transform Editor as shown below.
    - Click on the Country column in the Schema In section.
    - Enable Validation.
    - Select the ‘In’ radio button and enter the validation criteria as ‘IN’, ‘US’.
    - This indicates that records with country either ‘IN’ or ‘US’ pass the validation and the rest fail it. Passed and failed records are stored separately.
  12. Drag in the target for failed records from any of the available datastores, e.g. Database, Flat Files, Excel Workbooks, BW Target, etc. In our case, I have created a target datastore for an Oracle XE schema on my local machine, so I have taken a template table as the target for the failed records.
  13. Name it Invalid_Customer.
  14. Connect the Validation box with the target Invalid_Customer for failing records.
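
The logic of the Validation transform configured above can be illustrated with a short Python sketch: records whose Country is ‘IN’ or ‘US’ go to the pass output, and all others go to the fail output. This only illustrates the rule and is not Data Services code. The sample records and the function name split_by_country are hypothetical.

```python
ALLOWED_COUNTRIES = {"IN", "US"}


def split_by_country(records, allowed=ALLOWED_COUNTRIES):
    """Return (passed, failed): records whose Country is in `allowed`, and the rest."""
    passed, failed = [], []
    for record in records:
        target = passed if record.get("Country") in allowed else failed
        target.append(record)
    return passed, failed


# Hypothetical rows; the sample file from this post is not reproduced here.
records = [
    {"CustomerID": "1", "Name": "Asha", "Country": "IN"},
    {"CustomerID": "2", "Name": "Bob", "Country": "US"},
    {"CustomerID": "3", "Name": "Chen", "Country": "CN"},
    {"CustomerID": "4", "Name": "Dev", "Country": "IN"},
]

valid, invalid_customer = split_by_country(records)
print(len(valid), "passed;", len(invalid_customer), "failed")
```

In the PoC the failed list corresponds to the Invalid_Customer template table, and the passed list is what will be sent on to SAP BI.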

Now we need an SAP BI target for the records that pass the validation. We first have to create a DataSource/InfoSource on the SAP BI side.
To create the InfoSource, we need InfoObjects corresponding to our sample flat file data.
1.       Log on to the SAP BI system.
2.       Start the Data Warehousing Workbench using transaction code RSA1 and select Modeling from the navigation area. Pick the entry InfoObjects.
3.       Create the InfoObjects as per the columns in the flat file, as shown in the following 3 screenshots.
 
 
 
4.       Save and Activate the InfoObjects as you create them.
5.       The screenshot below shows all the created InfoObjects.
 
Now, let us create a DataSource on the SAP BI side as a target for our sample data. 
The following 3 screenshots show that the system does not allow us to create a DataSource in a source system of type External System.
It also suggests an alternative way to achieve the same result. It seems we cannot create a BI 7.0 DataSource for a source system of type External System.
 
 
Following the method described in the earlier screenshot, we will create a 3.x InfoSource. Follow the steps given below.


Include all the InfoObjects created earlier in this InfoSource.
Save and activate the InfoSource.
Click Yes to activate all the independent transfer programs.
 
You would see logs on the screen as shown below. Go back. 
 
Now, create Transfer Rules for the newly created InfoSource. 
 
 
Click Yes for assigning the DataSource to InfoSource. 
 
Save and Activate the InfoSource. 
 
Go back and you will see a subtree similar to the following.
To import the SAP BI structures, open the Datastore tab strip in the Object Library. Search for the Datastore that you have created for the SAP BI system as target (in our case BW_PoC_Target_Datastore). Position the cursor on the Datastore name and choose Open from the context menu. Depending on the structures you want to use, expand the Master InfoSources or Transaction InfoSources tree.
 
Find your InfoSource, and open its subtree. Position the cursor on the DataSource name, and use the option Import from the context menu. 
 
Afterwards, the DataSource will be available in the Object Library for your SAP BI system. 
 
Open the data flow you created in the previous section. Open the Datastore tab strip in the Object Library. Search for the Datastore that you have created for the SAP BI system as target (in our case BW_PoC_Target_Datastore). Drag the structure into which you want to load the data onto the canvas. Connect the existing Validation transform with the new target.
 
Double-click this new target to open the Target Table Editor as shown below. Go to the Options tab and select Column Comparison = Compare by position.
This is done so that even if the column names do not match (InfoObject names cannot be more than 9 characters), we can still go ahead and carry out the data movement.
Validate your DataFlow. 
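
What "Compare by position" means can be sketched in a few lines of Python: the n-th source column is paired with the n-th target field, regardless of name. This is only an illustration of the idea, not the Data Services implementation. The source column names, the InfoObject names, and the function name map_by_position are hypothetical, because the actual names appear only in the screenshots.

```python
MAX_INFOOBJECT_NAME = 9  # limit mentioned above for InfoObject names


def map_by_position(source_columns, target_columns):
    """Pair the n-th source column with the n-th target field, ignoring names."""
    if len(source_columns) != len(target_columns):
        raise ValueError("source and target must have the same number of columns")
    too_long = [t for t in target_columns if len(t) > MAX_INFOOBJECT_NAME]
    if too_long:
        raise ValueError(f"names longer than {MAX_INFOOBJECT_NAME} characters: {too_long}")
    return list(zip(source_columns, target_columns))


# Hypothetical names for illustration only.
source = ["CustomerID", "Name", "Country"]
target = ["ZCUSTID", "ZCUSTNAME", "ZCOUNTRY"]

for src, tgt in map_by_position(source, target):
    print(f"{src} -> {tgt}")
```

Because the pairing depends only on order, the columns of the flat file and the InfoObjects in the InfoSource have to be in the same sequence.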


Since the loading process has to be initiated by the SAP BI system, we have to create a batch file for the execution of the Job in Data Services. This batch file can subsequently be called by an InfoPackage in the SAP BI system. 
Open the Data Services Management Console by choosing the corresponding menu entry from the Tools menu.

Log in with your user credentials. In the Management Console, navigate to the Administrator and open your repository (or All Repositories) in the Batch folder.

Switch to the Batch Job Configuration tab strip and find the job you want to schedule. Choose the option Export Execution Command.
 
Provide a File Name for the batch file. Leave the other settings on the default values and press the Export button.
Note
Since the file name entry field in the InfoPackage is limited to 44 characters, the file name you enter must not exceed 40 characters (44 characters minus 4 characters for the extension .bat).
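
The arithmetic in the note can be checked up front with a small Python sketch, assuming the 44-character field limit and the .bat extension stated above. The function name batch_file_name is hypothetical and is not part of Data Services or SAP BI.

```python
MAX_INFOPACKAGE_FIELD = 44  # length of the file name field in the InfoPackage
EXTENSION = ".bat"


def batch_file_name(name: str) -> str:
    """Return name + '.bat', or raise if it would not fit the InfoPackage field."""
    limit = MAX_INFOPACKAGE_FIELD - len(EXTENSION)  # 44 - 4 = 40
    if len(name) > limit:
        raise ValueError(f"file name has {len(name)} characters; the limit is {limit}")
    return name + EXTENSION


print(batch_file_name("PoC_Job1"))
```

A name of exactly 40 characters is still accepted, since the result with the extension is 44 characters long.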
 
The screen refreshes, confirming the successful export.

In our case, the following two files are created under the C:\temp folder after the export.
 
Before you switch to the SAP BI system, check that the RFC server is still running.  
Switch to the SAP BI system you want to load the data to. Open the Data Warehousing Workbench and find your InfoSource / DataSource. Create an InfoPackage for the DataSource.  
 
 
Switch to the 3rd Party Selection tabstrip. Press the Refresh Sel. Fields button to display the input fields.  



Enter the File Name that you provided when creating the batch file.

Switch to the Processing tab strip. Select “Only PSA” as the update target. Hit Save.

Switch to the Schedule tab strip. Ensure “Start Data Load Immediately” is selected. Hit Start. This starts executing the job designed in Data Services using the batch file.
 
A message “Data was requested” appears on completion of the job.
Click on Monitor (press F6) to check the status of the load.

The following screen shows the status of the load.
To check whether your data has been transferred as expected, you can check the PSA for the load. In the monitor, choose the PSA Maintenance icon (Ctrl+F6) to jump to the PSA display.
 
 
Click Continue.  
The following screen shows the data loaded into the PSA. In our example we transferred the records that passed all our Data Quality measures, i.e. all records whose Country is either ‘IN’ or ‘US’. In total there are 3 records.

Records not satisfying the validation criteria, i.e. records with countries other than ‘IN’ or ‘US’, are stored in the template table as desired.
The screenshot below shows the failing records stored in the Oracle XE schema.

