The role of Data Virtualisation in your EIM strategy
1. How do you want your data served?
The role of Data Virtualisation in
your EIM Strategy
Christopher Bradley, IPL
Intelligent Business
chris.bradley@ipl.com
2. Presenter
Chris Bradley
Head of Business Consulting
chris.bradley@ipl.com
+44 1225 475000
@InfoRacer
My blog: Information Management, Life & Petrol
http://infomanagementlifeandpetrol.blogspot.com
7. Introduction & Agenda
8. Chris Bradley
Summary:
30 years Information Management experience
MOD, Volvo, Thorn EMI, Coopers & Lybrand, IPL
Sample Clients: BP, Enterprise Oil, Statoil, Exxon Mobil, Audit Commission, MoD, Merrill Lynch, Barclays, DoD, Imperial Tobacco, GSK ….
Experience: Data Governance, Master Data Management, Enterprise Information Management
Author & conference speaker
CDMP(Master), CBIP, Prince2, APM
Director DAMA UK & MPO
BeyeNetwork Expert Channel Author “Information Asset Management”
Recent speaking engagements:
DAMA International (DAMA / Wilshire), March 5th-8th 2007, Boston, MA: “Data as a service”; “Panel of Data Modelling experts”
CDi_MDM Summit (IRM UK), April 30 – May 2nd 2007, London: “A Data Architecture for Data Governance”
DAMA UK, June 15th 2007, London: “Data Modelling – Where did it all go wrong?”
Data Governance Conference (Debtech / Wilshire), June 25-28, 2007, San Francisco, CA: “Data Architecture for Governance – case study”
IPL & Embarcadero seminar series (Bristol, London, Manchester, Edinburgh), October 2007: “Data Modelling – Where did it all go wrong?”
DQ/IM & DAMA Europe (IRM London), November 2007: “Data Modelling as a service”
Data Governance Conference (Debtech / Wilshire), Florida, December 2007: “Data Governance 2.0”
DAMA International (DAMA / Wilshire), March 16th – 21st 2008, San Diego, CA: “Modelling for SoA”; “XML and data models”; “Establishing Data Modelling as a Service in BP”
BPM Europe (IRM), September 2008, London: “BPMN for Dummies”
DAMA Europe (IRM / DAMA), November 2008, London: “BPMN for Dummies”; “Data Modelling as a service”
Data Governance Europe Symposia (IRM / Debtech; London), February 2009: “Data Governance Challenges in a Major Multi National”
Webinar series (Embarcadero Technologies & IPL), Oct 2008 – Feb 2009: “The New Formula for Success – Moving Data Modelling beyond the Database”
Data Rage 2009, March 17-19 2009: “Evolve or Die – Modelling is not just for DBMS’s anymore”; “Data Modelling as a service”
Enterprise Data World International (DAMA / Wilshire), April 5th-12th 2009, Tampa, FL: “Exploiting Models for effective SAP implementations”; chairing panel of experts “Keeping modelling relevant”; panel of experts “Issues in information internationalisation”; “Modelling is not just for RDBMS’s”
DAMA UK & BCS Data Management Group, June 11th 2009, London: “Evolve or Die - Data Modelling is not just for DBMS’s”
BPM Europe (IRM), September 2009, London: ½ day workshop “An introduction to Data and the BPMN”
Data Migration Matters, October 1st 2009, London: “Designing for Success”
Data Management & Information Management Europe (DAMA / IRM), November 2-5 2009, London: “Modelling is NOT just for DBMS’s anymore”; “Meet the Metadata Professional Organisation”
Enterprise Data World International (DAMA / Wilshire), March 14th – 19th 2010, San Francisco, CA: “How to communicate with the business using high level models”
IPL & DataFlux Seminar Series (IPL/DataFlux), March 26th 2010, Bath, UK: “The Information Advantage – Exploiting Information Management For The Business”
BeyeNETWORK Webinar (CA/BeyeNETWORK), March 31st 2010: “Communicating with the Business through high level data models”
Enterprise Architecture Europe (IRM), June 16th – 18th 2010, London: ½ day workshop “The Evolution of Enterprise Data Modelling”
ECIM Exploration & Production, September 13th – 15th 2010, Haugesund, Norway: “Information Challenges and Solutions”
Information Management in Pharmaceuticals, September 15th 2010, London: “Clinical Information Management – Are we the cobblers children?”
BPM Europe (IRM), September 27th – 29th 2010, London: “Learning to Love BPMN 2.0”
DAMA Scandinavia, October 26th-27th 2010, Stockholm: “Incorporating ERP Systems into your overall Models & Information Architecture”
Data Management & Information Management Europe (DAMA / IRM), November 2010, London: “How do you get a Business person to read a Data Model?”
Data Governance & MDM Europe (DAMA / IRM), March 2011, London: “Clinical Information Data Governance”
Enterprise Data World International (DAMA / Wilshire), April 2011, Chicago, IL: “How do you want yours served? – the role of Data Virtualisation and Open Source BI”
9. Chris Bradley
Recent publications:
Database Marketing Magazine, February 2009, “Preventing a Data Disaster”
http://content.yudu.com/A12pnb/DMfeb09/resources/30.htm
Data Modelling For The Business – A Handbook for aligning the business with IT using high-level data models; Technics Publishing; ISBN 978-0-9771400-7-7
http://www.amazon.com/Data-Modeling-Business-Handbook-High-Level/dp/0977140075/ref=sr_1_4?ie=UTF8&s=books&qid=1235660979&sr=1-4
BeyeNETWORK “Chris Bradley Expert Channel” Information Asset Management
http://www.b-eye-network.co.uk/channels/1554/
Article “Data Modelling is NOT just for DBMS’s” (July 2009)
http://www.b-eye-network.co.uk/channels/1554/view/10748
and (August 2009)
http://www.b-eye-network.co.uk/view/10986
Article: Information Management Deficiency Syndrome (September 2009)
http://www.b-eye-network.co.uk/channels/1554/view/11216/
Article: Drowning in spreadsheets (September 2009)
http://www.b-eye-network.co.uk/channels/1554/view/11482/
Article “Seven deadly sins of data modelling” (October 2009)
http://www.b-eye-network.co.uk/view/11481
Article “How do you want yours served (data that is)” (December 2009)
http://www.b-eye-network.co.uk/
Article “How Do You Want Your Data Served?” Conspectus Magazine (February 2010)
Article “10 easy steps to evaluate Data Modelling tools” Information Management (March 2010)
Article “Big Data, Same Problems” TechTarget (July 2011)
http://searchdatamanagement.techtarget.co.uk/news/2240039201/Round-table-The-value-of-big-data
10. Agenda
1. An Enterprise Information Management Framework
2. What is Data Virtualisation?
3. 5 ways where EII / Data Virtualisation can add value to Data Warehousing
4. 6 key considerations when deciding upon Data migration and take on (ETL vs EII or both?)
5. Information Management issues in the BI world.
6. IM Certification & Competencies
11. 1. IPL’s Information Architecture Framework
Architecture: Orderly arrangement and structure for the assets
Framework: Purpose; Components of Architecture; People; Process
[Framework diagram, top to bottom]
Goals / Principles
Governance; Planning
Quality Management; Lifecycle Management; Services Infrastructure
Structure: Models / Taxonomy; Catalog / Meta data
Data Types: Structured Transaction Data; Master Data; MI/BI Data; Unstructured Data; Technical Data
12. Information Architecture Framework Components
1. Goals / Principles
2. Governance
3. Planning (Information Asset Strategy and Roadmap)
4. Information Quality Process
5. Life Cycle Management Processes
6. Services Infrastructure (Data Integration, Distribution, etc)
7. Information Models (includes Information relationship models)
8. Information Catalog / Meta Data Services
9. Master Data Management
13. Information Architecture is one of the four components of the overall Enterprise Architecture
Business Architecture: Business strategy, Organization, and Core business processes
Applications Architecture: ERP, etc
Information Architecture: Enterprise Data Model & Catalog, etc.
Technology Architecture: Desktop, network, Data centre strategy
14. Turning data into Business wisdom
Data: 10,000 feet
Information: Your current altitude is 10,000 feet
Knowledge: There is a mountain ahead, peak of 12,000 feet
Wisdom: Climb immediately to 15,000 feet
15. Now – That should clear up a few things around here!
Businesses NEED a common vocabulary for communication
16. 2. What is Data Virtualisation?
A primer .....
18. Genres of Virtualisation
Data Virtualisation: Abstracts data from location and complexity (RDBMS, Packages, Data Warehouses, Excel, Web Services)
Storage Virtualisation: Abstracts logical storage from physical storage (Disk 1, Disk 2, Disk 3, Disk 4)
Application / Server Virtualisation: Abstracts logical apps & servers from physical apps & servers (Application 1, Application 2, Server 1, Server 2)
19. Key Purpose of Virtualisation
Overcome (mask) Complexity
Hardware
Software
Improve Agility
New solutions
Existing solutions
Reduce Costs
Operating
New development
20. Data Virtualisation in a Nutshell
BI, MI and Portals and Enterprise
Custom Apps
Reporting Dashboards Search
Star SQL Web Services
Virtual Virtual Relational
Data Marts Shareable Data Operational Views
Data Model Services Data Stores
Legacy Packages RDBMS Web
Files Mainframes Services
21. What are the Business challenges DV addresses?
Business Challenges: Mergers & Acquisitions; Cost Savings; Sales Growth; Risk Reduction
Business Solutions
Data Integration Challenge: Complexity; Disparity; Location; Performance; Completeness; Security, Quality, Governance
Data Sources
22. What DV Does
Data Virtualisation
23. Typical Data Integration Architectures
Consumers: BI Tools/Apps.; Master Data Mgmt.; Operational Apps.; Inter-enterprise
Integration styles:
Physical Movement and Consolidation (ETL, CDC)
Abstraction / Virtual Consolidation (Data Federation)
Synchronization and Propagation (Messaging)
Shared foundations: Common Design, Admin., Governance; Common Metadata; Common Connectivity
Pace of Business change & requirement for agility demands that organizations support multiple styles of data integration
24. How DV differs
Middleware | Integration style | Purpose | Attribute
ETL | Physical Movement and Consolidation | DB to DB | Scheduled
CDC | Physical Movement and Consolidation | DB to DB | Event Driven
Data Virtualization | Abstraction / Virtual Consolidation (Data Federation) | DB to Application | On Demand
EAI / ESB | Synchronization and Propagation (Messaging) | Application to Application | Event Driven
25. How DV Works – Example Scenario
1) I need to build an application that looks like this…
2) The view or data service needs to look like this…
3) And the data comes from these sources…
26. Traditional Integration with ETL and Data Warehouses
Traditional Approach
1. Design entire DW schema
2. Develop ETL
3. Refresh on batch basis
4. Application gets data from DW
Issues
Slow development cycle
Replicated data
Batch latencies
Physical stores overhead
27. Data Virtualisation design
Design Steps
1. Discover data
2. Model individual view/service
3. Validate view/service
Data model layer
Benefits
Faster time to solution
Easy to learn and use tools
Extensible / reusable objects
Conform data to a standard data model
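The design steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not any vendor's tooling: two hypothetical sources (a `crm` customer table and an `erp` orders table) are simulated with SQLite's ATTACH DATABASE, and the virtual view is modelled as a TEMP view, which stores no rows of its own. All table, column and view names are invented for the example.

```python
import sqlite3

# One connection plays the role of the virtualisation layer.
conn = sqlite3.connect(":memory:")

# Step 1 - discover data: two separate (hypothetical) source databases.
conn.execute("ATTACH DATABASE ':memory:' AS crm")
conn.execute("ATTACH DATABASE ':memory:' AS erp")
conn.execute("CREATE TABLE crm.customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE erp.orders (customer_id INTEGER, amount REAL)")
conn.executemany("INSERT INTO crm.customers VALUES (?, ?)",
                 [(1, "Acme"), (2, "Globex")])
conn.executemany("INSERT INTO erp.orders VALUES (?, ?)",
                 [(1, 100.0), (1, 50.0), (2, 75.0)])

# Step 2 - model an individual view: a TEMP view may span attached
# databases and holds no data itself.
conn.execute("""
    CREATE TEMP VIEW customer_spend AS
    SELECT c.name AS customer, SUM(o.amount) AS total_spend
    FROM crm.customers AS c
    JOIN erp.orders AS o ON o.customer_id = c.id
    GROUP BY c.id, c.name
""")

# Step 3 - validate the view against values we can check by hand.
rows = conn.execute(
    "SELECT customer, total_spend FROM customer_spend ORDER BY customer"
).fetchall()
print(rows)
```

The consuming application only ever sees `customer_spend`; where the customers and orders physically live is hidden behind the view.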
28. Data Virtualisation Production
Production Steps
1. Application invokes request
2. Optimized data access and retrieval (single query)
3. Deliver data to application
Benefits
Less replication
High performance
Up-to-the-minute data
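The production steps above can be illustrated with a small Python sketch. It assumes two hypothetical in-memory sources and a made-up view function `customer_spend`; a real product would also push filters and joins down to the sources through its optimizer, which this sketch does not attempt. What it does show is that the view is resolved at request time, so nothing is replicated and new source data is visible immediately.

```python
# Hypothetical live sources; in practice these would be remote systems.
customers = {1: "Acme", 2: "Globex"}            # e.g. a CRM package
orders = [(1, 100.0), (2, 75.0)]                # e.g. an order-entry RDBMS


def customer_spend(min_total=0.0):
    """Virtual view: resolved against the sources at request time."""
    totals = {}
    for customer_id, amount in orders:          # read current source rows
        totals[customer_id] = totals.get(customer_id, 0.0) + amount
    # Filter and shape inside the layer so the application gets one result.
    return sorted(
        (customers[cid], total)
        for cid, total in totals.items()
        if total >= min_total
    )


# Steps 1-3: the application invokes a request and receives the data.
before = customer_spend()

# A new transaction lands in the source system...
orders.append((1, 50.0))

# ...and the very next request sees it, because no copy was taken.
after = customer_spend(min_total=100.0)
print(before, after)
```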
29. Data Virtualisation Production with Caching
Production Steps
1. Cache essential data
2. Application invokes request
3. Optimized data access and retrieval (leveraging cached data)
4. Deliver data to application
Benefits
Removes network constraints
24x7 availability
Optimal performance
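As a rough illustration of the caching step, the sketch below wraps a source fetch in a simple time-to-live cache. The class name `CachedSource`, the 60 second lifetime and the reference data are all assumptions made for the example; commercial DV products offer their own cache policies, which are not modelled here.

```python
import time


class CachedSource:
    """Wraps a source fetch function with a simple time-to-live cache."""

    def __init__(self, fetch, ttl_seconds, clock=time.monotonic):
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._value = None
        self._expires_at = None      # None means nothing cached yet

    def get(self):
        now = self._clock()
        if self._expires_at is None or now >= self._expires_at:
            self._value = self._fetch()          # go to the real source
            self._expires_at = now + self._ttl
        return self._value                       # otherwise served from cache


calls = []


def fetch_reference_data():
    """Hypothetical slow or remote source; records each real access."""
    calls.append(1)
    return {"GB": "United Kingdom", "NO": "Norway"}


fake_now = [0.0]                     # controllable clock for the demo
source = CachedSource(fetch_reference_data, ttl_seconds=60,
                      clock=lambda: fake_now[0])

source.get()            # first request populates the cache
source.get()            # second request is answered from the cache
fake_now[0] = 61.0      # pretend just over a minute has passed
source.get()            # cache has expired, so the source is read again
print(len(calls))
```

Three requests reach the application but only two reach the source, which is the effect behind the "removes network constraints" benefit. The trade-off is that cached data can be up to one lifetime out of date.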
30. 3. Five example usage patterns
Where Data Virtualisation can add value to Data Warehousing
31. Prototyping Data Warehouse Development
In traditional DW development, the time taken for schema changes, adding new data sources and providing data federation is often considerable.
Use DV to prototype a development environment, rapidly building a virtual DW rather than a physical one.
Reports, dashboards and so on can be built on the virtual DW.
After prototyping, the physical DW can be introduced if the usage merits.
Sources: Packages, Databases, Files, XML
32. Enriching the DW ETL Process
Frequently, new data sources, particularly from ERPs, are required in the DW.
Often the ETL tool lacks data access capabilities for complex sources.
Tight processing windows may require access, aggregation & federation activities to be performed prior to the ETL process.
The powerful data access and federation capabilities of EII can present virtual views to the DW ETL process, which continues as though using a simpler data source.
33. Federating Data Warehouses
Many organisations have more than one DW.
Is the information in each DW completely discrete?
Data Virtualisation provides powerful options to federate multiple DWs by creating an integrated view across them.
This has particular relevance in providing rapid cross-warehouse views following a merger or acquisition.
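A minimal Python sketch of an integrated view across two warehouses follows. The warehouses, their rows and the function `sales_by_country` are hypothetical, and the sketch assumes both warehouses already share the same (year, country, amount) layout, which is rarely true straight after a merger.

```python
from collections import defaultdict

# Two hypothetical warehouses, e.g. one per company after a merger.
warehouse_a = [("2010", "UK", 120.0), ("2010", "Norway", 80.0)]
warehouse_b = [("2010", "UK", 30.0), ("2010", "Sweden", 55.0)]


def sales_by_country(*warehouses):
    """Integrated view: total sales per (year, country) across warehouses."""
    totals = defaultdict(float)
    for warehouse in warehouses:
        for year, country, amount in warehouse:
            totals[(year, country)] += amount
    return dict(totals)


combined = sales_by_country(warehouse_a, warehouse_b)
print(combined[("2010", "UK")])
```

Summing like this is only valid if the information in each DW really is discrete. If the same sale can appear in both warehouses, the view would need a de-duplication rule before aggregating.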
34. DW Extension
Business users require data from outside the data warehouse so they can meet reporting and operational needs.
Historical data from the warehouse and up-to-the-minute data from transaction systems or operational data stores is required.
Summarized data from the warehouse and drill-down detail from transaction systems or operational data stores is required.
Data Virtualisation can extend existing data warehouses quickly and easily to work around the fact that key data users need resides outside the consolidated data warehouse.
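The first pattern above, history from the warehouse plus up-to-the-minute rows from an operational system, might look like this in Python. The dates, amounts and the `warehouse_loaded_to` cut-off are invented for illustration; the cut-off is one simple way, assumed here, to avoid counting rows that exist in both places.

```python
from datetime import date

# Hypothetical data: the warehouse was last loaded on 30 June.
warehouse_loaded_to = date(2011, 6, 30)
warehouse_sales = [(date(2011, 6, 29), 400.0), (date(2011, 6, 30), 350.0)]
operational_sales = [(date(2011, 6, 30), 350.0), (date(2011, 7, 1), 90.0)]


def sales_history_and_today():
    """Virtual view: warehouse history plus newer operational rows."""
    recent = [row for row in operational_sales
              if row[0] > warehouse_loaded_to]    # avoid double counting
    return warehouse_sales + recent


view = sales_history_and_today()
print(len(view), sum(amount for _, amount in view))
```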
35. Complete Master Data View
MDM applications alone cannot fully support all requirements, as data exists outside of the MDM hub.
Complementary data integration solutions are needed to deal with data maintained outside of MDM hubs, often in complex, disparate data silos.
DV can extend the master data and provide a complete 360° view by using master data from the hub as the foreign key to quickly and easily federate master data with additional transactional and historical data to get a complete single view of master data.
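The idea of using the hub's master identifier as the foreign key can be sketched as follows. The hub record, the two silos and the function `customer_360` are hypothetical stand-ins; in practice each silo would be a separate system reached through the virtualisation layer.

```python
# Hypothetical MDM hub: one golden record per customer.
mdm_hub = {"C001": {"name": "Acme Ltd", "segment": "Energy"}}

# Hypothetical silos outside the hub, keyed by the master identifier.
open_orders = [("C001", "PO-17", 2500.0), ("C002", "PO-18", 10.0)]
support_history = [("C001", "2010-11-02", "Invoice query")]


def customer_360(master_id):
    """Federate hub attributes with transactional and historical data."""
    view = dict(mdm_hub[master_id])               # start from master data
    view["master_id"] = master_id
    view["open_orders"] = [o[1:] for o in open_orders if o[0] == master_id]
    view["support_history"] = [s[1:] for s in support_history
                               if s[0] == master_id]
    return view


acme = customer_360("C001")
print(acme)
```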
36. 4. Data migration and take on
6 key considerations:
ETL vs EII / DV or both?
37. Some Migration Considerations
What data have we got?
E-discovery
Data owners vs. users
What other data do we require?
Source model vs target model
Move all the data or leave some in place?
Do we use EII vs ETL (or even both)
38. EII or ETL?
1. Will the data be replicated in both the DW and the Operational System?
• Will data need to be updated in one or both locations?
• If data is physically in two locations beware of regulatory & compliance issues (e.g. SoX, HIPAA, BASEL2, FDA etc)
39. EII or ETL?
2. Data Governance
• Is the data only to be managed in the originating Operational System?
• What is the certainty that DW will be a reporting DW only (vs Operational DW)?
40. EII or ETL?
3. Currency of the data, i.e. does it need to be up to the minute?
• How up to date are the data requirements of the DW?
• Is there a need to see the operational data?
41. EII or ETL?
4. Time to solution, i.e. how quickly is the solution required?
• Immediate requirement?
• Confirmed users & usage? vs. flexible, emerging requirements?
42. EII or ETL?
5. What is the life expectancy of source system(s)?
• Are the source systems likely to be retired?
• Will new systems be commissioned?
• Are new sources required?
43. EII or ETL?
6. Need for historical / summary / aggregate data
• How much historical data is required in the DW solution?
• How much aggregated / summary data is required in the DW solution?
44. 5. BI & Information Management
Maybe spreadsheets aren’t such a good solution after all!
45. Effective IM IS crucial today
Higher volumes of data generated by organisations
• Information is all pervasive – if you don’t have a strategy to manage it, you will certainly drown in it
Proliferation of data-centric systems
• ERP, CRM, ECM…
Greater demand for reliable information
• Accurate business intelligence is vital to gain competitive advantage, support planning/resourcing and monitor key business functions
Tighter regulatory compliance
• Far more responsibility now placed on organisations to ensure they store, manage, audit and protect their data (SoX, BASEL, SOLVENCY2, HIPAA, FDA ...)
Business change is no longer optional – it’s inevitable
• Mergers/acquisitions, market forces, technological advances…
47. Excel, BI and IM !
Several users within a business are adept at manipulating large data extracts in Excel
• Easily derive new fields
• Pivot data
• Aggregate data
• Produce charts and dashboards.
“All good”, you might say?
48. Excel, BI and IM !
A “new” copy of the source data is now in your spreadsheet
You are now (unwittingly) a data steward!
What are the rules & calculations for derivations?
Where does the additional data come from?
Charts / graphs potentially disconnected from data
Distribution leading to data duplication & amendment
What’s the lineage & provenance of the data now?
49. A Happy Path?
Go back to the source
Avoid “Cottage Industry” reporting
Record metadata regarding the extract and don’t change its values
If you must correct data, correct at source
Ensure calculations make sense and are properly annotated and tested
Clearly label distributed versions vs originals. Identify versions
Don’t re-issue your local copy of the source data - redirect any data requests to the source
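One possible way to "record metadata regarding the extract" is sketched below in Python. The field names, the sample extract and the source name `sales_dw.monthly_sales` are all hypothetical; the only technique relied on is a SHA-256 fingerprint from the standard `hashlib` module, which makes later changes to the local copy detectable.

```python
import hashlib
import json
from datetime import datetime, timezone


def describe_extract(data: bytes, source: str, query: str, owner: str) -> dict:
    """Build a provenance record for an unmodified data extract."""
    return {
        "source": source,                           # where to redirect requests
        "query": query,                             # how the extract was made
        "owner": owner,                             # who took the extract
        "extracted_at": datetime.now(timezone.utc).isoformat(),
        "sha256": hashlib.sha256(data).hexdigest(), # fingerprint of the values
    }


extract = b"region,sales\nUK,120\nNorway,80\n"
record = describe_extract(extract, "sales_dw.monthly_sales",
                          "SELECT region, sales FROM monthly_sales", "jsmith")

# Any edit to the local copy no longer matches the recorded fingerprint.
edited = extract.replace(b"120", b"130")
unchanged = hashlib.sha256(extract).hexdigest() == record["sha256"]
tampered = hashlib.sha256(edited).hexdigest() != record["sha256"]
print(json.dumps(record, indent=2))
```

Keeping a record like this alongside a spreadsheet answers the lineage and provenance questions directly, and gives recipients the source to go back to instead of a re-issued copy.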
51. What is CDMP?
CDMP stands for “Certified Data Management Professional”
It is the only non-proprietary, widely recognized data management certification.
The certification program was jointly constructed by DAMA International (DAMA) and the Institute for Certification of Computer Professionals (ICCP).
DAMA owns the CDMP certification, and ICCP administers and delivers exams and provides all record keeping.
52. Why do I need it?
“Certification, in itself, is not a goal, but Professionalism is.” Dr. Paul M. Pair, ICCP Fellow
Why People Certify: Credential; Company Requirement; Professional Growth; Self Evaluation; Financial Reward; Other
Primary Achievement Resulting from Certification: Increase in Salary; Credibility within Organisation; Credibility with Customers; Greater Self Esteem; Solve Problems Quicker
Source: ICCP Research Study (Athabasca University)
54. IPL’s Information Management Framework
[Framework diagram with numbered components]
1 Goals / Principles
2 Governance
3 Planning
4 Quality Management
5 Lifecycle Management
6 Infrastructure Services
7 Models / Taxonomy
8 Catalog / Meta data
9, 10 Data: Structured Transaction Data; Master Data; MI/BI Data; Unstructured Data; Technical Data
55. Maturity Model – Information Governance 2
Level 1 - Initial: No clear data ownership assigned. Data Owners, if any, evolve on their own during project rollouts (i.e. self appointed data owners). No standard tools or documentation available for use across the whole enterprise.
Level 2 - Repeatable: Data Ownership Model does not exist. Owners commissioned in the short-term for specific projects & initiatives. Often department or silo focused leading to ownership by “Data Teams” or “Super Users” that manage “all” data.
Level 3 - Defined: Defined Data Ownership Model exists. Ownership Model is loosely applied to key data entities. Limited collaboration. Not fully 'bought in' to data ownership at an enterprise level.
Level 4 - Managed: Data Ownership Model is implemented for the key data entities. Collaboration between stakeholders in place. Governance process regularly reviews this model and its application, updating and improving as needed. Benefits begin to be realised.
Level 5 - Optimised: Data Ownership Model has been extended such that the majority of data assets are under active stewardship. Effective governance process employed by stakeholders & stewards. Well defined standards adopted.
56. Maturity Model – Quality 4
Level 1 - Initial: Limited awareness within the enterprise of the importance of information quality. Very few, if any, processes in place to measure quality of information. Data is often not trusted by business users.
Level 2 - Repeatable: The quality of few data sources is measured in an ad hoc manner. A number of different tools used to measure quality. The activity is driven by projects or departments. Limited understanding of good versus bad quality. Identified issues are not consistently managed.
Level 3 - Defined: Quality measures have been defined for some key data sources. Specific tools adopted to measure quality with some standards in place. The processes for measuring quality are applied at consistent intervals. Data issues are addressed where critical.
Level 4 - Managed: Data quality is measured for all key data sources on a regular basis. Quality metrics information is published via dashboards etc. Active management of data issues through the data ownership model ensures issues are often resolved. Quality considerations baked into the SDLC.
Level 5 - Optimised: The measurement of data quality is embedded in many business processes across the enterprise. Data quality issues addressed through the data ownership model. Data quality issues fed back to be fixed at source.
57. Maturity Model – Master Data 9
Level 1 - Initial: Limited awareness of MDM. Master Data domains have not been defined across the enterprise. Silo based approach to data models means multiple definitions of potential master data entities, such as customer, exist.
Level 2 - Repeatable: The impact of master data issues gains recognition within the enterprise. Limited scope for managing master data due to lack of Data Ownership Model. Project or department based initiatives attempt to understand the enterprise's master data. No MDM strategy defined.
Level 3 - Defined: Definition of an MDM strategy is in progress. Master data domains have been identified. Several domains are targeted for delivering master data to specific applications or projects. Differing products may be adopted in these silos for MDM. Senior management support for MDM grows.
Level 4 - Managed: A complete MDM strategy has been defined and adopted. MDM joined up with data governance and data quality initiatives. Robust business rules defined for master data domains. Data cleansing and standardisation performed in the MDM hub. Specific products adopted for MDM. Master data models defined.
Level 5 - Optimised: A fully integrated MDM hub exists and has been adopted across the enterprise for all key master data domains. The hub controls access to master data entities. Many applications access the MDM Hub through a service layer. Business users are fully responsible for master data.
58. As-Is
[Radar chart: As-Is maturity, scale 0 to 5]
Axes: IM Principles; Data Governance; IM Planning; Data Quality; IM Lifecycle Management; Integration & Access; Models & Taxonomy; Catalog & Metadata; Master Data Management; Business Intelligence
59. Summary
ben.braine@ipl.com
60. Summary
Data Virtualisation opens up a brave new world
For data migration, ETL isn’t “the only way”
Effective Information Management is crucial
61. Contact details
Chris Bradley
Business Consulting Director
Chris.Bradley@ipl.com
+44 1225 475000
@InfoRacer
My blog: Information Management, Life & Petrol
http://infomanagementlifeandpetrol.blogspot.com
62. Further information:
Articles including:
• Seven deadly sins of data modelling
• The IT Credibility Crunch
• Information Management Deficiency Syndrome
• Modelling is not just for DBMS’s
• Data mining - where’s my hard hat?
• Master data mix-ups
• Drowning in spreadsheets
• Why bother with a semantic layer?
• Business Intelligence in a cold climate
• Data Management is everybody's business
• Information superstition
Download from:
http://bc.ipl.com/