Articles

SSIS on SYBASE – Driving Properties From Configurations

In SQL Server 2008 on September 9, 2010 by ashwaniroy

So Now that I have a package that works and loads data from a CSV file to a Untyped Table and then a Typed , I think it is the time when we try to find out what is the best (or only) way to use Configuration driven properties.

SSIS Gives these way to set up config :-

Go to SSIS à Package ConfigurationsàAdd you will see these options

image001 (1)

  • I don’t like registry entry, and neither will your System / Database Admin. So I am leaving it there.
  • Parent Package variable, well don’t want that.
  • SQL Server Option wont work  because only database I have is Sybase but there is a Bug here which I would like to Point out

If you say SQL Server , and select “Specify Configuration Settings Directly” even you Sybase Connection will show up

image002

But you wont go too far from here because after you create a new Config table to hold the config values, you wont be able to add the configuration filter. It is a bit dumb because Sybase connection should not surface as SQL Connection if it is not a SQL Connection. But anyway we are left with Environment variable and XML Configuration.

So I would Say

  1. Use Env variable to store the path to XML connection
  2. Use XML config to store rest of the properties

So Just select the XML Configuration Option

image003

3. Select the properties that you want to drive from config. I have selected the connection string here.

image004

Next àNext àFinish

Don’t forget to enable package Configuration

image005

  1. Now Open the Config file(Best way to do which I know is , add to your SSIS Project and Reformat the document using Ctrl+k , Ctrl+D. it will look like

 

<Configuration ConfiguredType="Property" Path="\Package.Connections[staging].Properties[ConnectionString]" ValueType="String">
              <ConfiguredValue>Data Source=ash-pc:5400;User ID=ash;Password=roy;Initial Catalog=etl;Provider=ASEOLEDB.1;</ConfiguredValue>
       </Configuration>
       <Configuration ConfiguredType="Property" Path="\Package.Connections[staging].Properties[Description]" ValueType="String">
              <ConfiguredValue></ConfiguredValue>
       </Configuration>
       <Configuration ConfiguredType="Property" Path="\Package.Connections[staging].Properties[InitialCatalog]" ValueType="String">
              <ConfiguredValue>etl </ConfiguredValue>
       </Configuration>
       <Configuration ConfiguredType="Property" Path="\Package.Connections[staging].Properties[Name]" ValueType="String">
              <ConfiguredValue>staging</ConfiguredValue>
       </Configuration>
       <Configuration ConfiguredType="Property" Path="\Package.Connections[staging].Properties[Password]" ValueType="String">
              <ConfiguredValue></ConfiguredValue>
       </Configuration>
       <Configuration ConfiguredType="Property" Path="\Package.Connections[staging].Properties[ProtectionLevel]" ValueType="Int32">
              <ConfiguredValue>1</ConfiguredValue>
       </Configuration>
       <Configuration ConfiguredType="Property" Path="\Package.Connections[staging].Properties[RetainSameConnection]" ValueType="Boolean">
              <ConfiguredValue>0</ConfiguredValue>
       </Configuration>
       <Configuration ConfiguredType="Property" Path="\Package.Connections[staging].Properties[ServerName]" ValueType="String">
              <ConfiguredValue> ash-pc:5400</ConfiguredValue>
       </Configuration>
       <Configuration ConfiguredType="Property" Path="\Package.Connections[staging].Properties[UserName]" ValueType="String">
              <ConfiguredValue>ash</ConfiguredValue>
       </Configuration>
       <Configuration ConfiguredType="Property" Path="\Package.Connections[staging].Properties[Password]" ValueType="String">
              <ConfiguredValue>roy</ConfiguredValue>
   </Configuration>

So If you are using SSIS as ETL on a Datawarehouse that is not Microsoft SQL Server then you can use this as a way to use config driven SSIS .

Articles

SSIS on SYBASE – ASE 15 – Loading from CSV file to Untyped table and then to Typed Table

In SQL Server 2008 on September 8, 2010 by ashwaniroy

This is first of my series on SSIS of Sybase ASE 15 .

1.  Create table to hold this data
--Create a typed and Untyped table
--Typed Table
create table foo (
       book_id                         int                              not null  ,
       book_code                       varchar(16)                      not null  ,
       desk_id                         int                              not null  
)
--Untyped Table
Create table bar (
       book_id                         varchar(50)                      not null  ,
       book_code                       varchar(50)                      not null  ,
       desk_id                         varchar(50)                      not null  
)
--Error Table
Create table error (
       book_id                         varchar(50)                      not null  ,
       book_code                       varchar(50)                      not null  ,
       desk_id                         varchar(50)                      not null  ,
       error_code                      varchar(256)                         null 
)
  1. Use these 3 records as your test.csv file

1,AMSTMM,23

2,CBOND,9

CMD,CBOND,DBO

  1. Add a new connection to the connection Manager of the type Native OLEDB\Sybase ILEDB Provider
  2. Configure the Connection Provider’s Property

1

  1. Add a new Connection Manager for Flat File Source for the CSV file test.csv. Name it CSV

That’s all you need in set up

  1. Drag and drop a Data Flow Task and Open it
  2. Add a Flat file source component and configure it to point to the CSV connection
  3. Drag a OLEDB Destination and configure it to use the Sybase connection and use dbo.bar as the destination table

2

  1. Run the task and it has succeeded your data will be in the table

3

4

  1. Ok So far so good. Now I want to load it to Typed Table. Drag another dataflow task for this and Open it.
  2. Drag a OLEDB Source and Point it to dbo.bar and Drag a OLEDB destination and point it to dbo.foo and connect it and Map the columns. Done !! IS it ? Watch this..

When You execute this task this is what will happen

  1. Task will look like this

5

So it failed !! Ya I expected it to because of the Bad typed row in the untyped. But that is not what I want to show you. This is what I want to show you

  1. Log will be saying

    An OLE DB error has occurred. Error code: 0×80020005.

The ProcessInput method on component "typed" (42) failed with error code 0xC0202009. The identified component returned an error from the ProcessInput method. The error is specific to the component, but the error is fatal and will cause the Data Flow task to stop running.

Thread "WorkThread0" has exited with error code 0xC0202009.

  1. The Error Message clearly is not really helpful from the Sybase Adapter. Grrrr!!
  2. 2.And further down the line you will see this
  3. OnInformation,FMDX2622,MARKETS\c001334,Package,{40CB9E75-B873-45C7-8129-D74D85F567FC},{5B329EA9-3798-44A0-B3DC-1EAF61F204B7},07/09/10 11:10:45,07/09/10 11:10:45,1074016267,0x,"component "typed" (42)" wrote 2 rows.
  1. Now that it is saying that it loaded 2 rows lets look at the Database.

6

Huh!! How did the Row -3 with first and last column casted to 0 came here. I did not do any type casting and also log said only 2 rows are inserted.

Well that is either a problem with Sybase Adapter for Oledb for returning and SSIS and the way it treats Sybase data type.

Lets look more deep in the Sybase OLEDB Destination Component.(Right Click à Show Advance Editor)

Book_id is INTEGER column so that’s what the External Column says .. No problem there.

7

Now look at book_id in  Input Columns

8

How is that chaged to DT_STR. This is the reason why the load it doing implicit cast. The Bug is with the Sybase Adapter as well which then commits another mistake of returning the wrong count of inserted rows.

Ok So how do I fix it. Well by explicit typing using Data Conversion task. Here is the solution.

  1. Add a data conversion task in between Source(bar) and destination(foo) . If you want you can add a error redirect to error

9

  1. Data conversion task basically takes the column from the UnType and casts them to explicit types. On Error while casting it redirects it to error table which is untyped as given above. Correct rows are loaded to Typed.
  2. This is what the Data Conversion Task looks like

10

  1. Run this and verify the results. Mine looked my fav color GREEN

11

  1. Logs also return what has actually been done

OnInformation,ash-pc,Foo\c001334,Package,{40CB9E75-B873-45C7-8129-D74D85F567FC},{16316E6B-94DA-4235-B27E-10EF3E472E38},07/09/10 11:31:35,07/09/10 11:31:35,1074016267,0x,"component "typed" (42)" wrote 2 rows.

OnInformation,ash-pc,Foo\c001334,Package,{40CB9E75-B873-45C7-8129-D74D85F567FC},{16316E6B-94DA-4235-B27E-10EF3E472E38},07/09/10 11:31:35,07/09/10 11:31:35,1074016267,0x,"component "Error" (1004)" wrote 1 rows.

  1. Database looks like this

Foo

12

Error

13

Job Done!!

Loading UnTyped To Typed is a very common pattern and perhaps this will help someone to do this for SYBASE ASE-15 and SSIS. (Both 2005 and 200

Articles

SSIS on Sybase ASE 15

In SQL Server 2008 on September 8, 2010 by ashwaniroy

I have decided to post couple of blog entries on SSIS with Sybase. This is because my the datawarehouse that I am working on is on SYBASE ASE – 15 but I have decided to use SSIS (not because I am a MVP or Microsoft fanatic but just because I love the tool (mostly) and we can do pretty much anything that we can with any other ETL tool.

So watch out for the next few entries

Articles

Speaking at SQL Bits 7 – University Of York OCT 2

In SQL Server 2008 on August 12, 2010 by ashwaniroy

Agenda for SQL Bits 7 has been announced. This has a similar format as SQLBits 5 i.e a 2 day session. Friday 1st Oct is the paid session with speakers

  • Buck Woody, SQL Server Technical Specialist at Microsoft
  • Thomas Kejser, Microsoft SQLCAT team
  • Brent Ozar
  • Brad McGehee
  • Chris Testa-O’Neill
  • Simon Sabin
  • Chris Webb
  • Allan Mitchell
  • Martin Bell

Go to www.sqlbits.com for more information.

I will be speaking in the free community day session which is on 2nd October on Analysis Services Best Practices.This includes the things that I have picked up on the job and encompasses areas like attribute relationships , aggregations and some tips and tricks.

I hope you all would enjoy the event and hope to see you guys there.

Articles

SQL Bits demos and slides available for download

In SQL Server 2008 on May 12, 2010 by ashwaniroy

So finally I have got all my demos and slides together . SQL Bits and attending SQL Bits in person is an incredible experience and I want to thank all attendees who attended my session and thanks a lot for your feedback. It is very encouraging.

You can download the slides from here.

Articles

Speaking at SQL Bits 6 – LONDON

In SQL Server 2008 on April 15, 2010 by ashwaniroy

I will be speaking at the SQLBits conference in London on April 16th.  My talk will be a 300-level talk on how to use MDX studio to performance tune MDX queries.

There are many very interesting session on performance and scalability from very famous speakers like Connor Cunningham (Senior Principal Architect From SQL query processor team),Chirs Webb(SQL MVP , author and Internationally renowned speaker,Keith Burns  and many more.

I hope to see some of you there!

You can read more about this and the other talks at http://www.sqlbits.com/information/PublicSessions.aspx

Articles

A Generic Untyped To Typed Table Load which does automatic Type Checking and error redirection

In SQL Server 2008 on April 11, 2010 by ashwaniroy

Loading data from untyped tables to tables is a very frequently occurring scenario in a Data warehouse application.Normally we load all the data we need in a table which is NVARCHAR or VARCHAR column and then check , row-by-row if all the data satisfies the data type check. If the row passes the type check validation then it is loaded to the typed else it is flagged as INVALID and / or is redirected to another table where we have error out records. So far , so good…But there are few things which are quite painful about this processes.

No 1:- You have to write some kind of LOOP operation for this .

No 2:- You need to write code for each untyped to typed table load.

Even if you use SSIS and use error redirect the INSERT operation becomes row-by-row and no longer is set based which again hits the performance.

If we look at the problem scenario we have

1. A Untyped Table

2. A Typed Table

3. A Flag Column to mark Invalid records

4. A table where all errors go

I am going to try to write a Stored Procedure that does that just. Pass any typed table to it and untyped typed and / or error table , Is_Valid column name (the 4 parameters mentioned above) and it will just find out all the rows that satisfy the type check validation and load then to typed table and will mark the other rows as invalid.

So as long as you pass the typed table name and corresponding untyped table name , where all the column of typed table are in the untyped table then this SP should work .

Lets Create 2 tables Untyped and Typed

CREATE  TABLE UNTYPED 
( 
 ID      VARCHAR(20) 
,TITLE   VARCHAR(20) 
,NAME    VARCHAR(20) 
,SALARY  VARCHAR(20) 
,IS_VALID BIT 
) 
  
CREATE  TABLE TYPED 
( 
 ID      INT 
,TITLE   VARCHAR(20) 
,NAME    VARCHAR(20) 
,SALARY  FLOAT 
) 
  
INSERT INTO UNTYPED VALUES('1','Mr','Ash','100.00',NULL) 
INSERT INTO UNTYPED VALUES('2','Mr','John','300.00',NULL) 
INSERT INTO UNTYPED VALUES('3','Mr','Doe','400.00',NULL) 
INSERT INTO UNTYPED VALUES('4','Mr','Bell','500.00',NULL) 
INSERT INTO UNTYPED VALUES('5','Mr','Jack','150.00',NULL) 
INSERT INTO UNTYPED VALUES('6','Miss','Jane','200.00',NULL) 
INSERT INTO UNTYPED VALUES('7','Mrs','Ann','130.00',NULL) 
INSERT INTO UNTYPED VALUES('8','12','Judy','Foo',NULL) 
INSERT INTO UNTYPED VALUES('Bar','13','Jude','BAR',NULL) 

So The Data in the untyped table Now is

clip_image001

 

So This is this is the SP . Pleas read the comments. You may have to extend it a bit and might like to add more error handling etc.

SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE PROCEDURE [dbo].[usp_load_untyped_to_typed_staging] 
                   @stg_unTyped_table_name NVARCHAR(256)
                 , @stg_Typed_table_name NVARCHAR(256)
                 , @is_valid_column_name    NVARCHAR(256)
                 , @i_debugOnly INT = 0
AS
/**
 * <Example>
   EXEC  [dbo].[usp_load_untyped_to_typed_staging]  
                 @stg_typed_table_name     =  'typed'
                ,@stg_unTyped_table_name   =  'untyped'
                ,@is_valid_column_name     =  'IS_VALID'
                ,@i_debugOnly               =   1

 * <Comment>.
 * As of now it does type checking for only INT and FLOAT bit it can be easily extended
 * It does not do NOT NULL Validation as of now  but it can be extended 
 * ----------------
 *
 * Ver    Author            Date        Comments
 * ---    -------------------    -----------    ------------------------------
 *   1    Ashwani Roy     09-04-2010   Load From Untyped Staging to Typed Staging with type checking and error flagging 
 */
SET NOCOUNT ON;

DECLARE    @sql_script NVARCHAR(MAX)

DECLARE
        -- Define user variables.
        @ERR_GENERAL          INT,
        @ERR_OBJ_NULL          INT,
        @ErrCode              INT = 0,
        @ErrMsg               NVARCHAR(4000),
        @StepNbr              SMALLINT = 0,
        @SQL                  VARCHAR(MAX);
SET     @ERR_GENERAL          =99 
SET     @ERR_OBJ_NULL         =0

DECLARE @numeridTypes          TABLE  (datatype NVARCHAR(255))
INSERT INTO @numeridTypes   
SELECT 'INT'
UNION ALL
SELECT 'FLOAT'

BEGIN
 /**
  * STEP 1
  * Set variables.
  */
  SET @StepNbr += 1;
  SET @ErrMsg = N'ERROR - Stored procedure [' + OBJECT_NAME(@@PROCID) + N'] ';
  
  
 /**
  * STEP 2
  * CREATE DYNAMIC SQL FOR INSERT
  * MY AUTOMATIC TYPE CHECKING PATTERN FOR SQL
  */
    SELECT 
         COLUMN_NAME
        ,DATA_TYPE = 
                CASE WHEN DATA_TYPE = 'VARCHAR' THEN DATA_TYPE+'(' +CAST(CHARACTER_MAXIMUM_LENGTH AS VARCHAR(5)) +')' 
                ELSE DATA_TYPE 
        END
        ,IS_NULLABLE 
    INTO #temp_columns
    FROM INFORMATION_SCHEMA.COLUMNS (NOLOCK) WHERE TABLE_NAME  =@stg_typed_table_name 

IF(@@ROWCOUNT=0)
BEGIN
    SET @ErrCode = @ERR_OBJ_NULL 
    SET @ErrMsg = 'TYPED TABLE :'+'['+@stg_typed_table_name+']'+N'DOES NOT EXISTS'
    GOTO FAIL;
END

    --Flag the records that Fail Data type Validation as INVALID--
    SET @sql_script = N''
    SET @sql_script= @sql_script+ 'UPDATE'+'['+ @stg_unTyped_table_name+']'+CHAR(13)
    SET @sql_script= @sql_script+ 'SET'+CHAR(9)+@is_valid_column_name+'=0'+CHAR(13)
    SET @sql_script= @sql_script+ 'WHERE '+STUFF((SELECT 'OR '+ '1 = CASE WHEN ISNULL(LEN('+COLUMN_NAME+'),0) != 0 
    AND ISNUMERIC('+COLUMN_NAME+')= 0 THEN 1 WHEN ISNULL(LEN('+COLUMN_NAME+'),0) = 0    THEN 1    ELSE 0 END  ' 
    FROM #temp_columns 
    WHERE DATA_TYPE IN(SELECT datatype FROM @numeridTypes) FOR XML PATH('')),1, 3, '')

BEGIN TRY
    IF @i_debugOnly = 1    
    BEGIN
        PRINT @sql_script
    END
    ELSE
    BEGIN
        EXEC SP_EXECUTESQL @sql_script
    END
END TRY
BEGIN CATCH
    IF(@ErrCode NOT IN (0,1))
        SET @ErrCode = @ERR_GENERAL;
    SET @ErrMsg = @ErrMsg + N'raised error (' + CAST(ERROR_NUMBER() AS NVARCHAR) + N') in step number (' 
    + CAST(@StepNbr AS NVARCHAR) + N') at line number (' 
    + CAST(ERROR_LINE() AS NVARCHAR) + N') with message: ' + ERROR_MESSAGE() ;
    GOTO FAIL;
END CATCH;

    --INSERT THE GOOD ONES IN typed--
    --If the Untyped is added with a IS_VALID column then the below SQL can be optimized to JUST a simple SELECT WHERE IS_VALID = 1--
    SET @sql_script = N''
    SET @sql_script= @sql_script+ 'INSERT INTO'+'['+ @stg_typed_table_name+']'+CHAR(13)
    SET @sql_script= @sql_script+ 'SELECT'+CHAR(13)
    SET @sql_script= @sql_script+ STUFF((SELECT ', ' + '['+COLUMN_NAME+']' FROM #temp_columns FOR XML PATH('')),1, 2, '')
                     +CHAR(13)
    SET @sql_script= @sql_script+ 'FROM'+CHAR(9)+'['+@stg_unTyped_table_name+']'+CHAR(13)
    SET @sql_script= @sql_script+ 'WHERE '+STUFF((SELECT 'AND '+ '1 = CASE WHEN ISNULL(LEN('+COLUMN_NAME+'),0) != 0 
                     AND ISNUMERIC('+COLUMN_NAME+')= 1 THEN 1 WHEN ISNULL(LEN('+COLUMN_NAME+'),0) = 0    THEN 1    ELSE 0 END  '
                     FROM #temp_columns
                     WHERE DATA_TYPE IN(SELECT datatype FROM @numeridTypes) FOR XML PATH('')),1, 3, '')
                           
    
BEGIN TRY
    IF @i_debugOnly = 1    
    BEGIN
        PRINT @sql_script
    END
    ELSE
    BEGIN
        EXEC SP_EXECUTESQL @sql_script
    END
END TRY
BEGIN CATCH
    IF(@ErrCode NOT IN (0,1))
        SET @ErrCode = @ERR_GENERAL;
    SET @ErrMsg = @ErrMsg + N'raised error (' + CAST(ERROR_NUMBER() AS NVARCHAR) + N') 
    in step number (' + CAST(@StepNbr AS NVARCHAR) + N') at line number (' 
                 + CAST(ERROR_LINE() AS NVARCHAR) + N') with message: ' + ERROR_MESSAGE() ;
    GOTO FAIL;
END CATCH;

GOTO SUCCESS;
  
FAIL:
  RAISERROR(@ErrMsg, 16, 1);
  RETURN @ErrCode;

SUCCESS:  
  -- Commit the transaction.
  RETURN 0;

DROP TABLE #temp_columns
END;

Now when you run this SP in DEBUG MODE as show in the example in the SP code itself

It will print out the SQL that it generated. Setting debug_only to 0 will run this. The SQL generated is like

UPDATE[untyped]
SET    IS_VALID=0
WHERE 1 = CASE WHEN ISNULL(LEN(ID),0) != 0 AND ISNUMERIC(ID)= 0 THEN 1 WHEN ISNULL(LEN(ID),0) = 0    
THEN 1    ELSE 0 END  OR 1 = CASE WHEN ISNULL(LEN(SALARY),0) != 0 AND ISNUMERIC(SALARY)= 0 THEN 1 
WHEN ISNULL(LEN(SALARY),0) = 0    THEN 1    ELSE 0 END  

INSERT INTO[typed]
SELECT
[ID], [TITLE], [NAME], [SALARY]
FROM    [untyped]
WHERE  1 = CASE WHEN ISNULL(LEN(ID),0) != 0 AND ISNUMERIC(ID)= 1 THEN 1 WHEN ISNULL(LEN(ID),0) = 0    THEN 1    
ELSE 0 END 
AND 1 = CASE WHEN ISNULL(LEN(SALARY),0) != 0 AND ISNUMERIC(SALARY)= 1 THEN 1 WHEN ISNULL(LEN(SALARY),0) = 0    THEN 1
ELSE 0 END  

You can see how using dynamic SQL and Information schemas a nice Type check is generated and another

UPDATE to set the validation column = 0 is also created. Best Part is table names and Validation Column names and ISNUMRIC validation types all are configurable.

Now I will run the query with Debug flag = 0 and you would see that the output is as expected. Out of 9 ROWS 7 Will be loaded in Typed Tabled and 2 Will be marked as invalid in the untyped table.

clip_image002 

There you go. Job Done.

Hope you find it useful and will be able to extend to do more validation , checks and Error Redirect type stuff using one SP making you life a little easier.

 

Follow

Get every new post delivered to your Inbox.