A non-technical look at the platform that will change the way you think about data quality.
Quality data is now as simple as a single call to the Kleber Cloud platform, where information can be:
- Parsed into its discrete elements
- Verified in real time against an extensive catalogue of datasets and third-party data services
- Repaired
- Formatted in multiple structures ready for use in downstream processes
- Enhanced to provide data insight or improve overall data value
- Assigned Match Keys to assist in the identification of duplicate or similar records
All of this from one simple process!
Traditional validation of information alone is not enough as it creates complex and costly exception handling processes when data cannot be verified. What do you do when a piece of information cannot be verified? What process does this trigger and what impact does this have on your business?
DataTools Kleber handles these exceptions before they occur. Inbuilt advanced parsing and repair technologies allow even unverified data to be standardized and formatted in a way that downstream processes can use, removing the need for costly manual intervention.
Available for use during both real-time data capture and batch processing of existing data, DataTools have taken all the “heavy lifting” required to deliver true data quality and made it available in a single process that can be easily implemented into almost any environment.
DataTools Kleber. Powerful data quality in a single, simple-to-implement process.
1. What is Kleber?
A non-technical look into the platform that will change the way you think about data quality.
Copyright DataTools 2014
www.datatools.com.au
Addressing Data Quality
2. What is DataTools Kleber?
Let’s begin.
First things first. If you are a technical person, please contact us and we will send you all the architecture, method calls, functions, services information and other jargon foreign to non-technical people that you can handle. If you are a business owner, operations manager, marketing guru, non-technical PM or otherwise just want to get things fixed, read on!
Kleber is the culmination of almost 20 years of providing data quality software. It binds together all the discrete elements required to deliver true data quality into a single platform. Further still, it harmoniously unifies each of the commonly required individual processes into a single process that can be easily implemented from within a website or application.
To deliver true data quality, the process outlined in this document can be implemented in real time for onboarding of information, and in a batch method to process all the data already residing within your organisation.
The Kleber process: Capture, Create, Parse, Verify, Repair, Format, Match, Enhance.
Read on to learn about each step in the Kleber process and how these come together as a complete data processing solution.
3. “I want quality data. I’ve been told I need a Data Capture solution.”
It’s a great place to start.
“To assist in fast and accurate capture of contact information”
Data Capture solutions make the capture of contact information within your websites and
applications quick and easy. It’s also a great place to get into DataTools Kleber.
Predictive Search technologies (otherwise known as Type Assist) leverage advanced keystroke
reduction capabilities to predict the information being entered and minimise the number of
keystrokes required.
This makes entering information not only faster and more accurate, but also more pleasurable for the user, reducing abandonment and enhancing the user experience.
Because Predictive Search is highly intuitive it requires no user training and can be used by new
users with ease.
Kleber offers predictive capture for physical addresses, first names, surnames, business
names and email addresses.
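As an illustration only (this is not Kleber's actual interface; the suburb list and function name below are invented), predictive search boils down to returning, after every keystroke, the candidates that match what has been typed so far:

```python
# Illustrative sketch of predictive search (Type Assist): filter a list of
# candidates by the prefix typed so far. The suburb list is an invented
# example; a real service would query a large reference dataset.
SUBURBS = ["Parramatta", "Parkes", "Paddington", "Penrith"]

def suggest(typed: str, limit: int = 5) -> list[str]:
    """Return up to `limit` suggestions matching the typed prefix."""
    prefix = typed.strip().lower()
    return [s for s in SUBURBS if s.lower().startswith(prefix)][:limit]

print(suggest("Par"))   # ['Parramatta', 'Parkes']
```

Each keystroke narrows the list, which is how the keystroke-reduction effect described above is achieved.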
4. “I already use data capture. How is Kleber of any benefit to me?”
Handling Exceptions.
“Every data capture process creates exceptions that must be handled. This is
typically the most complex and costly aspect of any data capture project”
Despite advanced logic being used in Predictive Capture technologies
not all data will be captured correctly.
Even the most sophisticated data capture solutions will have some
percentage of error or information that could not be captured.
Data Capture solutions have been sold as the
golden panacea for data quality issues. The
reality is these technologies only handle the
“simple fixes”.
True data quality requires a mechanism for handling the few percent of truly complicated exceptions these technologies create.
It is this data that pollutes databases and causes complex and costly exception handling processes throughout the information workflow.
DataTools Kleber makes handling these exceptions easy by ensuring even poorly captured data is repaired and cleansed before being stored.
5. “OK. So I can fix the exceptions, but that’s still a lot of work, right?”
Wrong! One Call does it all.
“Handling exceptions is as simple as making a single call to Kleber. Kleber will then repair and clean the data, turning it into useful information, and return it to you”
All of the individual and complex data processing methods have been combined in a single function within Kleber. By calling this function the data will be parsed, verified, repaired and formatted. Match keys will also be appended, as will various other types of information, to provide insight into the data and deliver operational efficiencies.
Automated exception handling involves:
• Parsing to split the captured data into discrete fields
• Verification of data to ensure its accuracy
• Repair of data through the results of verification
• Formatting of the data to ensure it is suitable for use
• Matching to compare data and find duplicates
You can call each of these functions individually if you want to, but all the hard work has already been done, so why not leave it to us.
Using Kleber at this point will eliminate the need for complex downstream exception handling.
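As a sketch of the "one call" pattern described above: the function name, field names and response shape below are hypothetical stand-ins, not Kleber's real API (contact DataTools for the actual interface). The point is the shape of result a single call can return.

```python
# Illustrative only: the function, parameters and response fields below are
# hypothetical stand-ins for whatever the real Kleber API defines.

def process_record(raw_address: str) -> dict:
    """Simulate a single 'do everything' call: parse, verify, repair,
    format and append a match key in one round trip."""
    # In a real integration this would be one request to the service.
    # Here we fabricate a response to show the shape of the result.
    response = {
        "parsed":    {"street_number": "12", "street_name": "Main",
                      "street_type": "St", "locality": "Sydney",
                      "state": "NSW", "postcode": "2000"},
        "verified":  True,          # matched against a reference dataset
        "repaired":  False,         # nothing needed fixing
        "formatted": "12 Main St, Sydney NSW 2000",
        "match_key": "12|MAIN|ST|SYDNEY|NSW|2000",
    }
    return response

result = process_record("12 main street sydney 2000")
print(result["formatted"])   # the finished record, from one call
```

Everything downstream (storage, mail, matching) then works from the one clean, structured result.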
6. “I don’t have Data Capture technologies. Do I need to implement these first?”
No. It will provide a better user experience, but you don’t have to use it.
Simply send the data as you store it now.
“Use your standard data entry forms to capture the data and send it as is or process all the data in your database as a batch.”
If you don’t have a data capture solution that’s OK. You can use your
existing fields for users to enter data. Once all the data has been entered
simply send it to Kleber and get the clean, verified and nicely formatted
result.
Data Capture technologies simplify the entry of data and make the
process faster and more pleasurable for the user.
We can provide these too!
Kleber can also be used in a Batch fashion whereby an entire list or
database can be processed.
7. “So how does it work?”
It’s complicated. Well, it is, but we’ve made it simple.
“There is a logical process through which all data can be taken, each step of the process building on the previous one to ultimately deliver true data quality and insight.”
• Capture: Data is captured, whether it be through Predictive technologies or traditional data entry.
• Parse: Parsing breaks the data down into its smallest parts, ready for other downstream processes.
• Verify: Parsed data is compared to “sources of truth” to verify the accuracy of the information.
• Repair: Results from the parsing and verification are used to repair the data, making it clean and useful.
• Format: The cleansed data elements can then be put together in a variety of ways to ensure the data meets specific downstream requirements.
• Match: Match Keys are appended to the data in order to compare records against each other and find duplicates.
• Enhance: Other information is appended to the data in order to deliver true insight, e.g. geospatial data.
8. “Why do I need to parse my data?”
It all starts with parsing.
“Parsing is the foundation of quality data. Without quality parsing all other
data processes will fail.”
Parsing is the process of splitting data apart into its smallest parts in a way
that makes sense. It is typically done behind the scenes and, as such, its
importance is often underestimated.
Proper parsing is often “faked” by many vendors, who simply take variations of a record and reference these against a known source of truth until they get a match. This is inefficient and highly dependent on the quality of the source of truth.
Kleber provides true parsing. After all, DataTools has been delivering
advanced data parsing technologies for almost 20 years!
9. “What does verification involve?”
Referencing data against a “Source of Truth”.
“Compare your data against a catalogue of the most accurate and comprehensive datasets available.”
Verification, or Validation as it is otherwise referred to, is the process of comparing the data you have against an authoritative third party dataset.
Kleber makes verification of data against third party datasets easy as
many of the common data sets are embedded within the Kleber
platform.
Using the results from a validation process it can be determined what
information is correct and what needs to be repaired.
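In essence, verification is a lookup against a trusted reference. A minimal sketch, with invented reference entries standing in for a real dataset such as the PAF:

```python
# Minimal illustration of verification: compare a parsed record against a
# "source of truth". The reference entries are invented examples, not real
# reference data.
REFERENCE = {
    ("MAIN", "ST", "SYDNEY", "2000"),
    ("GEORGE", "ST", "SYDNEY", "2000"),
}

def verify(street: str, stype: str, locality: str, postcode: str) -> bool:
    key = (street.upper(), stype.upper(), locality.upper(), postcode)
    return key in REFERENCE

print(verify("Main", "St", "Sydney", "2000"))    # True: record verified
print(verify("Mian", "St", "Sydney", "2000"))    # False: needs repair
```

The True/False outcome is exactly what drives the repair step: verified elements are kept, failures are candidates for correction.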
10. “But doesn’t verification give the same result as parsing?”
Yes & No: For accurate and clean information the results will be similar.
For problem data no amount of verification will deliver a clean result.
This is where Kleber really shines. Validation is only as good as the parsing technology that underpins it. Remember, parsing is the foundation on which all other data processes are built.
Properly parsed data can still be repaired and cleaned without being validated, so good parsing is vital. Just as Data Capture can create exceptions that must be handled, so too can relying solely on validation technologies. What do you do with data that doesn’t validate?
Beware of vendors who promote validation as a means of cleaning data. It is easy to compare data against a structured source of truth. It is far more complex to clean and make sense of poorly structured data. This is the domain of a good parser, such as that provided within Kleber.
11. “OK. So I can use the verified information to repair my data?”
Yes. But there is more…
“Both verified and un-verified information can be repaired.”
After verifying a record, or part thereof, against a known source of truth, any
discrepancies between the original record and the reference source can be
intelligently examined and the data repaired.
E.g. updating a street type from St to Dr.
This is particularly useful for address information where reference datasets
such as the Australia Post Postal Address File (PAF) can be used to verify and
repair data.
Remember these processes can happen for a single record during data entry or
across an entire database. Imagine repairing all your data in a single process?
12. “But what about records that can’t be verified? How do I repair them?”
That’s where it comes back to smart parsing.
“Remember parsing is the foundation to true data quality.”
Traditionally, to repair (i.e. clean) data records that could not be verified, a separate manual exception handling process would need to be undertaken. This is a time-consuming and costly process, and in most cases the result would be inconclusive.
The smart parsing capabilities of Kleber enable even unverified data to be cleansed and repaired sufficiently to facilitate further downstream processes such as matching and data appending. This significantly reduces the number of exceptions and the time associated with dealing with them.
Quite often, when a business is looking for “Data Quality” what they typically mean is “Data Repair”. It is only due to the prolific and highly successful promotion of data validation technologies that businesses look to validation as a source of data quality.
13. “So now I have clean data?”
Yes. However it may not be in the format required.
“Format of data is just as important as its cleanliness and validity.”
Formatting of data is vital to it being usable by a business. Different systems, processes and communications channels require data to be presented in different formats. Data must be prepared in a way that each system will accept. After all, you can’t push a round peg into a square hole!
Presentation of an address on an invoice is very different to that required for exchange of
address data with 3PL providers or with other areas within the same business.
Because of Kleber’s advanced parsing capabilities data can be put together in almost any
format or structure. Imagine always having a perfectly formatted output for each specific
system or communications channel!
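Once data is held as discrete parsed elements, formatting is just reassembly. A sketch, with invented field names and templates:

```python
# Sketch: the same parsed elements rendered for different consumers.
# Field names and templates are illustrative only.
record = {"number": "12", "street": "Main", "type": "St",
          "locality": "Sydney", "state": "NSW", "postcode": "2000"}

def format_single_line(r: dict) -> str:
    """e.g. for an invoice header."""
    return (f"{r['number']} {r['street']} {r['type']}, "
            f"{r['locality']} {r['state']} {r['postcode']}")

def format_label(r: dict) -> str:
    """e.g. for a shipping label: uppercase, with a line break."""
    return (f"{r['number']} {r['street']} {r['type']}\n"
            f"{r['locality']} {r['state']} {r['postcode']}").upper()

print(format_single_line(record))  # 12 Main St, Sydney NSW 2000
print(format_label(record))
```

The same clean elements feed every template, which is why formatting depends on parsing having been done well.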
14. “It’s been parsed, verified, repaired and formatted – Is it clean data now?”
Yes, it’s clean. But it’s not yet true data quality.
“Data Quality is more than just the cleanliness, validity and format of an individual record. It’s these things across all the data in its entirety.”
Congratulations. You now have a beautifully formatted (and possibly verified) data
record – or you might have several of them, or millions, or more! Now how many of
these are unique?
Kleber includes industry leading data matching technologies that allow you to not
only find duplicates in your own database, but also allows you to compare records
between databases.
Kleber matching technologies can be used on full records to identify unique individuals
or on partial records to identify matches at specific levels. Kleber can append a “match
key” automatically during data processing if required.
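A match key is typically a normalised fingerprint of a record's identifying fields, so records that differ only in formatting produce the same key. The scheme below is invented for illustration; Kleber's actual keys will differ.

```python
# Sketch of a match key: a normalised fingerprint of identifying fields,
# so trivially different records collide on purpose. The normalisation
# scheme here is invented for illustration.
def match_key(first: str, last: str, postcode: str) -> str:
    return "|".join(part.strip().upper() for part in (first, last, postcode))

a = match_key("john", "Smith ", "2000")
b = match_key("JOHN", "smith", "2000")
print(a == b)   # True: same person despite different formatting
```

Grouping a database by such keys is what surfaces duplicates, both within one database and between databases.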
15. “Are we there yet?”
That depends on how much you want to know!
“Appending certain data can provide insights that would otherwise be unknown.
Uplifting data will improve the completeness of the data, making it more valuable.”
There is a potentially limitless amount of data out there that can be appended to
your existing data to deliver insights and value. Geospatial data to plot a point on
a map, affluence and behavioural data for marketing and even psychographic and
cultural profiling data to predict trends.
Missing data, such as a phone number or current address, can be added using any one of the various data service providers Kleber makes accessible.
The ability to append and uplift data with this other useful information is dependent on having clean, accurate and properly structured data. The Kleber process makes all this possible.
16. I think I get it now!
Great, but let’s recap.
• Kleber provides all the necessary (if not vital) components to achieve data quality: Capture, Create, Parse, Verify, Repair, Format, Match, Enhance.
• It provides all of this in ONE simple to implement process (one call).
• It can be used in both real-time (data entry) and batch (existing data).
• Quality parsing is vital to all other data processes.
• Validation is not parsing, but good parsing means better validation.
Data Quality needn't be difficult, especially if you use Kleber.
For further information regarding Kleber and how it can benefit your business please contact one of the friendly staff at DataTools on +61 2 9687 4666.
P.S. We’ve only scratched the surface!