CREATE ETL SOLUTIONS FASTER
WITH METADATA DRIVEN DEVELOPMENT

KOEN VERBEECK

SQL SERVER DAYS 2013
WHO AM I?
OUTLINE
Introduction

Hello World

Read Flat
File

Read Flat
Files While
Looping

Metadata
driven
development

Conclusion
INTRODUCTION
• a large percentage of BI projects fail
• Gartner - http://www.gartner.com/newsroom/id/492112

• one of the reasons is underestimating the ETL development effort
• Kimball: 70% of the time spent building a DWH goes into ETL
http://www.informationweek.com/the-38-subsystems-of-etl/55300422
INTRODUCTION

“I choose a lazy person to do a hard
job. Because a lazy person will find
an easy way to do it”
Bill Gates
INTRODUCTION
• a lot of SSIS packages are very similar
• packages importing flat files
• packages writing change data to staging tables
• packages exporting data to Excel (for some reason)
• packages updating dimensions
• …

• … but they take a lot of time to create
INTRODUCTION
• solution?

• code reuse

o SSIS basically only supports copy-paste
o copy-paste has improved in SSIS 2012

• design patterns

o for example: incremental load package
o SQL Server 2012 Integration Services Design Patterns

• enable through templates

o build a template package
o save it to C:\Program Files (x86)\Microsoft Visual Studio 11.0\Common7\IDE\PrivateAssemblies\ProjectItems\DataTransformationProject\DataTransformationItems

• but still requires you to edit each package!
• (and what if you forget to edit a crucial piece?)
INTRODUCTION
• metadata driven development to the rescue!

• (aka code generating code)
• automate generation of common logic in SSIS packages

• first option is the “dynamic SSIS package”
1. reads metadata from tables
2. generates code
o usually outputs T-SQL or bcp commands
o uses T-SQL or C#
o for example: SELECT … INTO statements
3. loops over the generated code
4. executes each statement

• disadvantages
• complex project
• no parallelism
• difficult row based error handling
• difficult to incorporate “business logic”
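
The four steps of the “dynamic SSIS package” can be sketched as follows. This is a minimal illustration in Python rather than SSIS, so it can run on its own. The metadata table `etl_metadata`, its columns and the table names are hypothetical, an in-memory SQLite database stands in for the SQL Server metadata tables, and the generated T-SQL is only collected, not executed.

```python
import sqlite3

def load_metadata(conn):
    """Step 1: read metadata rows (source table, target table) from a table."""
    return conn.execute(
        "SELECT source_table, target_table FROM etl_metadata ORDER BY source_table"
    ).fetchall()

def generate_statements(rows):
    """Step 2: generate one T-SQL SELECT ... INTO statement per metadata row."""
    return [f"SELECT * INTO {target} FROM {source};" for source, target in rows]

def run_all(statements, execute):
    """Steps 3 and 4: loop over the generated code and execute each statement."""
    for statement in statements:  # strictly sequential, so no parallelism
        execute(statement)

# Hypothetical metadata, stored in an in-memory database for the example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE etl_metadata (source_table TEXT, target_table TEXT)")
conn.executemany(
    "INSERT INTO etl_metadata VALUES (?, ?)",
    [("erp.Customers", "staging.Customers"), ("erp.Orders", "staging.Orders")],
)

statements = generate_statements(load_metadata(conn))
executed = []
run_all(statements, executed.append)  # a real package would send these to SQL Server
```

The sequential loop in `run_all` also shows where the listed disadvantages come from: every statement runs one after the other, and an error surfaces per statement, not per row.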
INTRODUCTION
• second option: BIML
• started as a project at MS: http://vulcan.codeplex.com/
• developer left to found company Varigence http://www.varigence.com/
o took the idea (not the code) and developed BIML

• BIML is a markup language and compiler
o translates metadata into business intelligence solutions for SQL Server
o supports SSIS and SSAS
o Varigence made part of BIML available as open source
INTRODUCTION
• BIDS Helper has open source implementation of BIML
• it’s free!
• it’s already in the add-on you love!
• it is available for SSIS 2005, 2008, 2008R2, 2012 (and 2014?)

• BIML offers

• powerful code generation

o only some parts of the project deployment model are not supported

• reuse BI patterns and components

o create your pattern in BIML and generate all your packages with the same structure
o BIML files can reference each other

• .NET based script language

o C# code can be incorporated into BIML to generate objects based on metadata
o Intellisense (sometimes) available

• don’t like BIML?

o generated packages are just SSIS packages, you can edit them using BIDS/SSDT/SSDTBI
o no vendor lock-in
INTRODUCTION
• scenario for our demos
• import different flat files
o exports from ERP systems, other database vendors, 3rd party providers, …

• each type of flat file has a different structure
o no single SSIS package for all flat files

• the name of the flat files can change
o for example the name includes a timestamp

• this would normally require 1 SSIS package per flat file type
• couple of hours/days work?

• let’s solve it with BIML!
HELLO WORLD
• basic BIML script structure

[diagram of the BIML element hierarchy: BIML, Connections, FileFormats, Packages, Tasks, Containers, Precedence constraints, Dataflow, Transformations]
You can also specify
• events
• log handlers
• variables
• parameters
• custom tasks
• script tasks/components
• …
HELLO WORLD
• let’s take a look at a simple BIML script
HELLO WORLD
• BIML root node

• add connections

• add packages
HELLO WORLD
• specify Tasks

• set the task-specific properties
HELLO WORLD
• check for errors & generate package

• result
DEMO
show Hello World BIML
READ FLAT FILE
• specify FlatFileFormat
• columns: name, data type, size, delimiter (and code page)
• what you’d normally specify in the flat file connection manager

• specify connection
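
To make the role of the format definition concrete, here is a small Python sketch (not BIML) that carries the same information: a name, data type, size and delimiter per column, plus a code page for the file. The class names, the column list and the sample line are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Column:
    name: str
    data_type: type   # Python type standing in for the SSIS data type
    size: int         # maximum length in characters
    delimiter: str    # delimiter that ends this column

@dataclass
class FlatFileFormat:
    code_page: str
    columns: list

    def parse(self, raw: bytes) -> dict:
        """Decode one line with the code page and split it column by column."""
        rest = raw.decode(self.code_page)
        record = {}
        for col in self.columns:
            value, _, rest = rest.partition(col.delimiter)
            if len(value) > col.size:
                raise ValueError(f"{col.name}: value longer than {col.size}")
            record[col.name] = col.data_type(value)
        return record

customers = FlatFileFormat(
    code_page="cp1252",
    columns=[
        Column("CustomerID", int, 10, ";"),
        Column("Name", str, 50, "\r\n"),  # last column ends at the row delimiter
    ],
)
row = customers.parse("42;Zoë\r\n".encode("cp1252"))
```

Because the last column uses the row delimiter as its own delimiter, one list of columns is enough to describe both the column and the row layout.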
READ FLAT FILE
• specify data flow with transformations
• if no input/output connectors are specified, transformations are connected in the order specified in the BIML file
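
The default wiring rule can be pictured with a short Python sketch: each transformation takes its input from the one listed just before it, unless an explicit input is given. The function and the transformation names are hypothetical and only illustrate the rule; they are not part of BIML.

```python
def connect(transformations):
    """Return (source, target) pairs for a list of transformations.

    Each item is a dict with a "name" and an optional "input". When "input"
    is missing, the transformation is connected to the previous one in the list.
    """
    paths = []
    previous = None
    for t in transformations:
        source = t.get("input", previous)
        if source is not None:
            paths.append((source, t["name"]))
        previous = t["name"]
    return paths

# No connectors specified: the list order decides the data path.
implicit = connect([
    {"name": "Flat File Source"},
    {"name": "Derived Column"},
    {"name": "OLE DB Destination"},
])

# An explicit input is only needed when the path deviates from the list order.
explicit = connect([
    {"name": "Flat File Source"},
    {"name": "Multicast"},
    {"name": "Destination A"},
    {"name": "Destination B", "input": "Multicast"},
])
```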

• result
DEMO
import flat file with BIML
READ FLAT FILE IN LOOP
• now let’s loop over a bunch of flat files
• specify variables to hold path to current file and source folder

• add an expression on the flat file connection manager
READ FLAT FILE IN LOOP
• add a for each loop
• which has its own tasks child element
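
What the loop does at run time can be sketched in Python, assuming a source folder, a file mask and a variable for the current file, as described above. The function name, the file names and the mask are hypothetical; in the generated package the current file path would be picked up by the expression on the flat file connection manager.

```python
import tempfile
from pathlib import Path

def import_files(source_folder, mask, import_one):
    """Loop over the files in source_folder that match mask.

    current_file plays the role of the package variable that the flat file
    connection manager reads through its expression.
    """
    imported = []
    for current_file in sorted(Path(source_folder).glob(mask)):
        import_one(current_file)
        imported.append(current_file.name)
    return imported

# Demo on a temporary folder with hypothetical, timestamped file names.
with tempfile.TemporaryDirectory() as folder:
    for name in ("customers_20131101.csv", "customers_20131102.csv", "readme.txt"):
        Path(folder, name).write_text("42;Zoe\n")
    lines = []
    imported = import_files(
        folder, "customers_*.csv", lambda path: lines.append(path.read_text())
    )
```

The mask is what copes with file names that carry a timestamp: the loop never needs to know the exact name in advance.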
READ FLAT FILE IN LOOP
• result
DEMO
import flat file using for each loop with BIML
METADATA DRIVEN DEVELOPMENT
• BIML is nice
• … but isn’t the GUI much faster for developing packages?
• time to enhance BIML with some C# goodness!
• called BIMLScript
• use C# to read metadata
• loop over metadata and create multiple objects
• entire website dedicated to it, with tutorials and code snippets: http://bimlscript.com/
• also has an online editor
METADATA DRIVEN DEVELOPMENT
• Add namespaces

• Declare variables
METADATA DRIVEN DEVELOPMENT
• Retrieve metadata (stored in a SQL Server table)

• Loop over metadata and create corresponding objects
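
In BIMLScript this loop is written in C# inside the BIML file. The Python sketch below only mimics the idea, under stated assumptions: the metadata rows are hypothetical and hard-coded instead of read from a SQL Server table, and the output is a bare skeleton with one package element per row, not a complete BIML script.

```python
import xml.etree.ElementTree as ET

# Hypothetical metadata rows; in the talk they come from a SQL Server table.
metadata = [
    {"file_type": "Customers"},
    {"file_type": "Orders"},
]

def build_script(rows):
    """Create one Package element per metadata row under a single Packages node."""
    root = ET.Element("Biml")
    packages = ET.SubElement(root, "Packages")  # created once, outside the loop
    for row in rows:
        ET.SubElement(packages, "Package", Name=f"Import {row['file_type']}")
    return root

root = build_script(metadata)
script = ET.tostring(root, encoding="unicode")
```

Adding a new flat file type then means adding a metadata row, not building another package by hand.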
METADATA DRIVEN DEVELOPMENT
• result
METADATA DRIVEN DEVELOPMENT
• remarks
• make sure the code or the metadata doesn’t contain invalid XML characters
o < > " &

• using C# can mess with the Intellisense
o Visual Studio thinks it’s not valid XML anymore
o color coding can disappear > right click file and choose Open With…
o Intellisense can stop working in Visual Studio > use online editor

• beware of the protection levels
• some elements can only appear once
o do not put those in a loop
o e.g. Connections, Packages
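
The first remark can be handled by escaping metadata values before they are written into the script. A minimal sketch in Python, using the standard library's `xml.sax.saxutils.escape`; the helper name and the sample value are hypothetical, and in BIMLScript itself the equivalent escaping would be done in C#.

```python
from xml.sax.saxutils import escape

def xml_safe(value):
    """Escape the characters < > & and " so the value can sit inside XML."""
    return escape(value, {'"': "&quot;"})

raw_name = 'Sales <EU> & "US"'
safe_name = xml_safe(raw_name)
```

Here `safe_name` is `Sales &lt;EU&gt; &amp; &quot;US&quot;`, which is safe to use in element text and in double-quoted attribute values.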
DEMO
generate multiple packages using BIMLScript
CONCLUSION
• BIML can radically reduce SSIS development time
• for frequently used package patterns
• when combined with BIMLScript

• BIML supports all versions of SSIS
• but some project deployment functionality is missing

• bit of a learning curve
• good understanding of SSIS is necessary
• basic C# skills needed
• return on investment comes with the next projects
RESOURCES
• Official BIML

• Varigence BIML product page

http://www.varigence.com/Products/Biml/Capabilities

• BIMLScript resource hub
http://bimlscript.com/

• BIDS Helper on Codeplex
http://bidshelper.codeplex.com/

• Blogs

• Stairway to BIML by Andy Leonard

http://www.sqlservercentral.com/stairway/100550/

• BIML articles by Joost van Rossum

http://microsoft-ssis.blogspot.be/search/label/BIML

• BIML articles by Marco Schreuder
http://blog.in2bi.eu/tags/biml/

• BIML articles by John Welch

http://agilebi.com/jwelch/tag/biml/

• Introduction to BIML part I by Koen Verbeeck

http://www.mssqltips.com/sqlservertip/3094/introduction-to-business-intelligence-markup-languagebiml-for-ssis/
Q&A

THANKS FOR LISTENING!
koen.verbeeck@element61.be
@Ko_Ver
http://www.linkedin.com/in/kverbeeck
