Wednesday, January 28, 2009

Lookup Transformation Caching

Lookup Transformation

If you are working with large tables that are millions of records or more you will probably want to enable cache restrictions. 

The default caching is without restriction.  So the whole dataset will load into memory.  In one of our cases we had 39 million rows which failed because we ran out of memory.  I set it back to 500 mb, this took 5 hours, but it completed.  This is too long and we needed another way to trim down the cache.  Or the size of the table that was being used for the lookup.

To do this we need to figure out an algorithym that will make the lookup table smaller.

Answer:
Stop using the lookup transformation.  Capture the max ID of the primary key column into a variable, using Execute SQL Task.  It is assumed to be an Identity column on the source. 

In order to capture the ID we had to do a workaround with two variables.
  1. capture the variable into an Object Type variable
  2. Use a script to convert the object variable into an Int64 variable.
  3. Use the Int64 Variable in a conditional split data flow task
In the conditional Split compare the max ID to the current ID coming through the dataflow.
  1. if the current ID is greater than the max ID found then do INSERT
  2. otherwise do an UPDATE

Friday, January 23, 2009

What Business Intelligence Application to Use?

In researching business intelligence applications we have many choices and of these we have look at three different vendor solutions:
  • Microsoft SQL Server, includes Database, Integration Services, SOAP Services, Reporting Services and Analysis Services
  • SAP Business Objects Enterprise , includes Crystal Reports, Web Intelligence and Xcelsius Enterprise
  • InterSystems – DeepSee , Cache Objects, Connector, Analyzer, Architect, and Designer

Feature Comparisons

MS SQL ServerSAP BusinessObjectsInterSystems DeepSee
Comes with Relational Database ServerGoodNoneFair
Can Use Existing SQL Server DataGoodGoodNone
Cost ExtraNoYesYes
Rate Reporting ToolsGoodGoodFair
Report Management ServerGoodGoodGood
Integration ServicesGoodGoodGood
Analysis ServicesGoodGoodGood
End-to-End Data Warehouse SolutionGoodNoneFair
Dash Board Tools (Widgets)FairGoodFair
Developer KnowledgeGoodFairFair
Business KnowledgeFairGoodBad
InteroperabilityFairGoodBad