#AIIM14

The Good, the Bad, and the Ugly of Defensible Disposition

Richard Medina
Co-founder and Principal Consultant, Doculabs | doculabs.com
rmedina@doculabs.com | richardmedinadoculabs.com
@richarddoculabs
  
Issues

1. The problem
   • The sky is falling again
2. Break it into two problems
   • Day-forward versus historical content
3. How to address historical content
   • A defensible disposition methodology
4. Analysis and classification technology
   • Should you use it? Does it work?
5. Doing the Assessment
   • Approaches and results
  
  
The Problem is Over-Retention

Organizations have been over-retaining electronic information and failing to dispose of it in a legally defensible manner when business and law will allow.

• Retaining everything forever
• Disposing of everything immediately
• Having employees make classification decisions
• Having technology make classification decisions
• Hybrid with technology and people
  
Why Over-Retention is the Problem

• Organizations keep non-required electronic content forever because:
  1. Classifying content (to determine what to keep and what to purge) is manual and expensive
  2. Content worth preserving is mixed with content that should be purged
  3. Legal, and others, are afraid of wrongfully deleting materials (spoliation)
  4. Additional storage is inexpensive, which makes it easy for corporations to buy more storage and defer addressing the problem
  
  
Recommendations for Day-forward

• Day-forward information lifecycle management (ILM) is much easier to address than historical content
  • Even though addressing it messes with employees’ day-to-day business activities
• Day-forward: Initiate ILM practices on a “day-forward” basis first, so any new content created or saved is assigned a disposition period
  • Disposition horizons should begin to influence where users store content (as users discover that materials saved in the “wrong” system will be purged)
• Guidance: Provide employees with explicit guidance for the acceptable use of available tools for dynamic content and their associated retention periods
  • For example, retain non-records for 3 years, retain official records per the retention schedule
• Historical: For historical content, analyze the feasibility of content analytics and autoclassification
  • Recognize that cleaning up TBs of content can take years. So conduct the analysis in 2014, begin the cleanup effort in earnest by 2015, and eliminate a large portion of dated content by 2016
  
	
  
Guidance Example for Day-forward

System/Repository | Recommended Retention Period

Personal Network Drives (“P” drives)
• Provide each user with personal drive space of a limited size for their storage, for as long as the user is employed

Shared Network Drives (“G” drives)
• Make them read only (which means no network storage for collaboration; content will have to go into an ECM system)
• Exceptions include applications or systems that need to use network storage

ECM System
1. Default for non-records: retained for 3 years
2. Default for non-records that have long-term value: retained for 7 years
3. Official records: retained per the retention schedule

Social Community Sites
• No documents stored in communities (only links to documents in the ECM system)
• Consider retention periods for non-document content (e.g., 3 years)
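
One way to make guidance like this enforceable is to express it as data that a cleanup job can look up. The sketch below is a minimal illustration in Python; the system and content-class keys (`ecm`, `social`, `non_record`, and so on), the `PER_SCHEDULE` marker, and the function name are assumptions made for the example, not part of the presentation.

```python
# Hypothetical encoding of the default retention guidance as a lookup table.
# Keys and names are illustrative assumptions; the periods come from the slide.
PER_SCHEDULE = "per retention schedule"

DEFAULT_RETENTION = {
    ("ecm", "non_record"): 3,            # years
    ("ecm", "non_record_long_term"): 7,  # years
    ("ecm", "official_record"): PER_SCHEDULE,
    ("social", "non_document"): 3,       # years (the slide's "e.g. 3 years")
}


def retention_for(system: str, content_class: str):
    """Return the default retention (years, or the schedule marker) for a pair."""
    try:
        return DEFAULT_RETENTION[(system, content_class)]
    except KeyError:
        # No silent default: an unknown combination should be reviewed by a person.
        raise ValueError(
            f"no default retention rule for {system}/{content_class}"
        ) from None


if __name__ == "__main__":
    print(retention_for("ecm", "non_record"))
    print(retention_for("ecm", "official_record"))
```

Raising on an unknown combination, instead of falling back to a default period, keeps unclassified content from being purged by accident.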
  
  
What’s the Purpose of Your DD Methodology?

• You must satisfy 4 demands:
  1. Regulatory retention requirements
  2. Hold retention requirements
  3. Business retention requirements
  4. Cost impact of anything you do
• What you do has impact:
  1. What you do
  2. Effects of what you do
• You can do 2 things:
  1. Sort
  2. Dispose
• Your mission stated two ways:
  • Your mission is to satisfy your retention demands (1-3) while minimizing bad cost impact to yourself (4)
  • Your mission is to maximize good cost impact (4) while satisfying your retention requirements (1-3)
  
	
  
It’s Based on Reasonableness

• To determine what “satisfy your retention demands” really means for you, use the Principle of Reasonableness and act in Good Faith
• Courts do not ask, expect or necessarily reward organizations for perfection. Courts do expect, however, that whatever information management tactics an organization undertakes are appropriate to how that particular entity is situated (size, financial resources, regulatory and litigation profile, etc.). (Jim McGann and Julie Colgan, “Implement a defensible deletion strategy to manage risk and control costs”, Inside Counsel)
  
Your DD Methodology Has 4 Parts

1. Defensible Disposition Policy
   • It’s your design specification, your business rules for DD, your decision tree
   • Specifies very clearly the objectives that your methodology will fulfill. It states clearly what you mean by your retention requirements and what you mean by reasonable costs when you are trying to fulfill your retention requirements.
2. Technology Approach
   • For Sorting and Disposing
   • You must use technology; it is not optional
  
	
  
Your DD Methodology Has 4 Parts

3. Assessment (Sorting) Plan
   • Do the legwork and look at what’s there
   • What information and systems you’re assessing
   • Your processing rules (decision plan)
   • It will be flexible
4. Disposition Plan
   • Evaluate your assessment results using your DD Policy
   • Dispose (which ranges from keeping forever to deleting right now with many options in between)
   • Refine your DD Policy (1) and continue as needed
  
	
  
  
There’s an Awesome Business Case

• … if the technology works
• 50 TB = ~200 million documents (average of 250KB per document)
• The following table illustrates the time and effort required to classify 200 million documents

Classification Technique | Classification Rate | Pricing | Total Cost to Classify
Manual Classification | 10 seconds per document | $35 / hr. | $20 million
Auto Classification (with 95% machine and 5% human classified, via offshore labor) | Less than 1 second per document | $.005 per document for machine processing and $5 / hr. for those that require manual classification | $2 million
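
The arithmetic behind these figures can be checked with a short calculation. The sketch below is only a rough reconstruction in Python: it assumes decimal units (1 TB = 10^12 bytes), that every document is charged the $.005 machine fee, and that the 5% routed to people take the same 10 seconds each as fully manual classification. None of these assumptions are stated on the slide, and the function names are invented for the example.

```python
# Rough reconstruction of the classification cost comparison.
# Assumptions (not from the slide): decimal units, machine fee charged on
# every document, and 10 seconds of human time per manually reviewed document.

SECONDS_PER_MANUAL_DOC = 10


def docs_from_volume(total_bytes: float, avg_doc_bytes: float) -> float:
    """Estimate the document count from total volume and average size."""
    return total_bytes / avg_doc_bytes


def manual_cost(docs: float, hourly_rate: float) -> float:
    """Labor cost of classifying `docs` documents by hand."""
    hours = docs * SECONDS_PER_MANUAL_DOC / 3600
    return hours * hourly_rate


def auto_cost(docs: float, machine_fee: float = 0.005,
              human_share: float = 0.05, human_rate: float = 5.0) -> float:
    """Machine fee on all documents plus labor for the human-reviewed share."""
    return docs * machine_fee + manual_cost(docs * human_share, human_rate)


if __name__ == "__main__":
    docs = docs_from_volume(50e12, 250e3)  # 50 TB at 250 KB per document
    print(f"documents: {docs:,.0f}")
    print(f"manual:    ${manual_cost(docs, 35.0):,.0f}")
    print(f"auto:      ${auto_cost(docs):,.0f}")
```

Under these assumptions the manual figure comes to roughly $19.4 million, in line with the slide's $20 million. The automated figure comes to roughly $1.1 million, which is lower than the slide's $2 million, so the slide's total may include costs that are not itemized in the pricing column.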
  
Analysis and Classification Technologies

• Many different kinds of technology vendors are addressing analysis, classification, and disposition
  • File Analytics, Content Analytics, Content Classification, ECM, E-discovery, Search, Capture, DLP, Storage Management
  • Products, hosted solutions, service providers
  • IBM/Stored IQ, HP/Autonomy, EMC Kazeon, SAS, Kofax, Equivio, Rational Retention, Recommind, Index Engines, and others
• Most have a sweet spot where they will succeed
  • But it’s highly dependent on just about every factor you can think of
  • E.g., your business purposes, your ECM environment, your “information architecture”, your document types and their complexity and volume, the value and risk of the documents, your success criteria, etc., etc., etc.
  
Sidebar: How Many of them Work

Before:
<server XXX, drive G:>
Forecast summary_121008.doc

After:
Record = no
Age = 2.5 years
Document type = departmental forecast
Keywords = forecast, 2008, draft
Status = delete
Confidence = 9.2 (out of 10)

1. Analyze the content and review the retention schedule
2. Establish classification rules and train the systems with examples
3. Crawlers and recognition engines evaluate the content and generate a classification
4. For content where a high machine confidence factor exists, content is automatically tagged and then staged for migration to the appropriate system or disposition
5. For content with low confidence factors, documents are routed to clerical staff (onshore or offshore) for manual classification
6. The results of the manual identification are fed back into the automated algorithms to “teach” the systems better classification

Client Validation: Throughout the process, results and samples are routed to records management and legal professionals within the firm for validation and confirmation
	
  
  
Assessment Approaches

• There are three categories of attributes that can be used to determine what a file is:
  1. Environmental attributes around the file (e.g., file location, ownership)
  2. File attributes about the file (e.g., file type, age, author)
  3. Content attributes within the file (e.g., keywords, character strings, word proximity, word density)
• Various techniques and technologies, along with business rules, can be used to determine what a file is, and whether it is eligible for disposition
  • E.g., a DOC file created over 5 years ago and not accessed for a year may be purged
  • This type of purging could be done after giving users adequate notice (“move it or lose it” or “hold” for 90 days, then delete)
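
The example rule above (a DOC file created over 5 years ago, not accessed for a year, purged after 90 days' notice) can be written as a small business rule. This Python sketch is illustrative only: the function names are invented, a year is approximated as 365 days, and the timestamps are passed in directly so the rule does not depend on any particular file system or analytics product.

```python
from datetime import datetime, timedelta

FIVE_YEARS = timedelta(days=5 * 365)  # approximation; ignores leap days
ONE_YEAR = timedelta(days=365)
NOTICE_PERIOD = timedelta(days=90)


def eligible_for_purge(extension: str, created: datetime,
                       last_accessed: datetime, now: datetime) -> bool:
    """Apply the example rule: old DOC file that nobody has opened recently."""
    return (
        extension.lower() == ".doc"
        and now - created > FIVE_YEARS
        and now - last_accessed > ONE_YEAR
    )


def purge_date(flagged_on: datetime) -> datetime:
    """Earliest deletion date after the 'move it or lose it' notice period."""
    return flagged_on + NOTICE_PERIOD


if __name__ == "__main__":
    now = datetime(2014, 4, 1)
    flagged = eligible_for_purge(".DOC", datetime(2008, 1, 1),
                                 datetime(2012, 12, 1), now)
    print(flagged, purge_date(now).date())
```

Keeping the rule a pure function of file attributes makes it easy to review with records management and legal before any deletion job runs.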
  
#1: Environmental Attributes

Environmental Attributes (around a file)

# | Attribute | Evaluation Technique | Tool(s) Used | Examples | How Used
1 | Ownership | Access Controls | Content Analytics, Data Loss Prevention, Storage Management | Permissions within LDAP list people and infer department or function | Large collections of files can be assessed en masse based on access controls
2 | Location | File Path | Content Analytics, Data Loss Prevention, Storage Management | G:/accounting/july2004/temp | Stranded and orphaned locations are often easily eliminated
  
#2: File Attributes

File Attributes (about a file)

# | Attribute | Evaluation Technique | Tool(s) Used | Examples | How Used
3 | Duplicate | Hash Algorithm | Content Analytics | Exact duplicates | Exact duplicates can be easily eliminated
3 | Duplicate | Block Read | Content Analytics | Near duplicates | Near duplicates must be assessed in the context of other attributes
4 | File Type | Extension or MIME type | Content Analytics | .TMP, .MP3 | To identify file types that should not exist in a corporate setting
5 | Metadata | Properties | Content Analytics | Age | To determine old materials, materials authored by individuals that have left the organization
5 | Metadata | Properties | Content Analytics | Author | Typically, these attributes must be combined with other attributes via a rule to take action
5 | Metadata | Properties | Content Analytics | Security Profile (Confidential) | Use filename properties to determine type
6 | File Name | Character Strings | Content Analytics | GL-USDIST31_093098.xls | Determine whether a file was system generated vs. human generated
6 | File Name | Character Strings | Content Analytics | FORMUB92_SMITH | Documents that are based on a specific form number can easily be identified
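
Exact-duplicate detection by hash, the first technique in the table, is simple enough to sketch. The Python example below groups files under a directory by the SHA-256 digest of their bytes, using only the standard library. The choice of SHA-256 and the function names are assumptions for illustration; the presentation does not say which hash algorithm the commercial tools use.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file's contents in chunks so large files need little memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_exact_duplicates(root) -> list:
    """Return groups of files under `root` whose contents are byte-identical."""
    by_digest = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            by_digest[file_digest(path)].append(path)
    return [paths for paths in by_digest.values() if len(paths) > 1]


if __name__ == "__main__":
    for group in find_exact_duplicates("."):
        print([str(p) for p in group])
```

This only finds byte-identical copies. Near duplicates (for example, the same document saved with a different footer) need a different technique, which is why the table treats them separately.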
  
#3: Content Attributes

Content Attributes (within a file)

# | Attribute | Evaluation Technique | Tool(s) Used | Examples | How Used
7 | Key Word | Character Strings | Content Analytics; Classification Module | “Enron”, “Guarantee” | To determine if a document is on Hold via a word list per the hold request
8 | Character or Word Patterns | “Classification” <pattern matching> | Classification Module | Word proximity | To determine the category in which a document may fit
  | | | Classification Module | Word frequency |
  | | | Content Analytics; Classification Module | “Privileged” | Identification of PII
  | | | Content Analytics; DLP | SS#, Credit card # | Regular Expression (RegEx) lists; determined entities for hold, security, IP, PHI, PII, DLP
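
The regular-expression approach to finding SS# and credit card numbers can be illustrated with a short Python sketch. The two patterns below are simplified assumptions, not the rules any particular DLP or content analytics product ships with: the SSN pattern only matches the dashed `NNN-NN-NNNN` form, and card candidates (13 to 16 digits, optionally separated by spaces or hyphens) are filtered with the Luhn checksum to cut down on false positives.

```python
import re

# Simplified, illustrative patterns (assumptions, not a vendor's rule set).
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")


def luhn_ok(digits: str) -> bool:
    """Luhn checksum: double every second digit from the right, sum, mod 10."""
    total = 0
    for index, char in enumerate(reversed(digits)):
        value = int(char)
        if index % 2 == 1:
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def find_pii(text: str) -> dict:
    """Return SSN-like strings and Luhn-valid card numbers found in `text`."""
    cards = []
    for match in CARD_RE.finditer(text):
        digits = re.sub(r"\D", "", match.group())
        if luhn_ok(digits):
            cards.append(digits)
    return {"ssn": SSN_RE.findall(text), "card": cards}


if __name__ == "__main__":
    sample = "SSN 123-45-6789, card 4111 1111 1111 1111"
    print(find_pii(sample))
```

A hit from patterns like these is a signal for review or hold, not proof that a file contains PII; in practice it would be combined with other attributes via a rule before any action is taken.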
  
Assessment Results

Preservation | Findings
Unnecessary File Types (executables, non-business pictures, movies, etc.) | 13 to 15%
Duplicates | 15 to 20%
Near Duplicates | 9 to 30%

Risk | Findings
Files with PII | 10 to 16%
Files with Sample Keywords | 3 to 5%

Operational | Findings
Files 10 years or older | 7 to 11%
Files accessed within the last 18 months | 25 to 35%

Findings are not mutually exclusive (e.g., a duplicate file could also be aged)
  
Assessment Summary

Findings | Enterprise Impact
Total that could be disposed | 20% of 2.5 PB
Enterprise Implications | .5 PB removed @ $5,000,000 per PB
Savings | $2,500,000 per year in storage expense

Technique | Status | % of Total | Total
Analytics | Unnecessary | 20% | 500 TB (.5 PB)
Classification | Record | 8% | 200 TB (.2 PB)
Classification | Non-Record, Business Reference | 28% | 700 TB (.7 PB)
Classification | Evaluated, Staged for Disposition (2016) | 44% | 1,100 TB (1.1 PB)
Total | | 100% | 2,500 TB (2.5 PB)
  
Assessment Implications

• Given the results, $2.5 million in storage expense could be saved annually on the disposition of historic content, resulting in $12.5 million over 5 years
• Going forward with newly created content, if similar techniques are applied, the saving grows to $34.8 million over 5 years
  • The current cost projections are based on the historical content growth rate of 30% per year
  • The expected cost projections are based on a content growth rate of 26% per year

@ $5,000,000 per PB | 2012 | 2013 | 2014 | 2015 | 2016* | Total
Current Storage (PB) | 2.5 | 3.25 | 4.23 | 5.49 | 7.14 |
Current Cost (Mill) | $12.5 | $16.3 | $21.1 | $27.5 | $35.7 | $113.0
Expected Storage (PB) | 2 | 2.52 | 3.18 | 4.00 | 3.94 |
Expected Cost (Mill) | $10 | $12.6 | $15.9 | $20.0 | $19.7 | $78.2
Total Savings (Mill) | $2.5 | $3.65 | $5.25 | $7.46 | $16.00 | $34.8

*In 2016, the 1.1 PB or 44% of content from the 2012 historical content assessment can be disposed
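
The projection table can be reproduced with compound growth. The Python sketch below is a reconstruction under assumptions inferred from the figures: storage grows once per year from the 2012 baseline, the expected scenario starts at 2 PB (2.5 PB less the 0.5 PB disposed), and the 1.1 PB staged for 2016 is removed after that year's growth is applied. The function and variable names are invented for the example.

```python
COST_PER_PB = 5.0  # $ millions per PB per year, from the slide
YEARS = [2012, 2013, 2014, 2015, 2016]


def project(start_pb: float, growth: float, disposals=None) -> list:
    """Yearly storage in PB: compound growth, minus any disposal in that year."""
    disposals = disposals or {}
    storage = []
    current = start_pb
    for index, year in enumerate(YEARS):
        if index > 0:
            current *= 1 + growth
        current -= disposals.get(year, 0.0)  # assumed to happen after growth
        storage.append(current)
    return storage


current_pb = project(2.5, 0.30)
expected_pb = project(2.0, 0.26, disposals={2016: 1.1})

current_cost = [pb * COST_PER_PB for pb in current_pb]
expected_cost = [pb * COST_PER_PB for pb in expected_pb]
savings = [c - e for c, e in zip(current_cost, expected_cost)]

if __name__ == "__main__":
    for year, c, e, s in zip(YEARS, current_cost, expected_cost, savings):
        print(f"{year}: current ${c:.1f}M, expected ${e:.1f}M, savings ${s:.2f}M")
    print(f"total savings: ${sum(savings):.2f}M")
```

With these assumptions the yearly storage, cost, and savings figures match the table to the precision shown. The summed savings differ from the slide's $34.8 million by a few hundredths of a million, because the slide subtracts the already rounded totals ($113.0 and $78.2).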
  
Conclusions

1. The business case for disposition is strong
   • Costs, risks, and benefits
2. Information governance must be addressed in phases
   • Starting today, the program will take years to mature
   • Set expectations accordingly
3. You should probably address day-forward ILM before tackling historical content
4. Recognize that manual classification is not an option
5. The technologies are immature and varied, but you can be successful by matching the techniques and technologies to the kinds of files you want to target
6. Your DD methodology has 4 main parts: DD Policy, Technology Approach, Assessment Plan, Disposition Plan
  
Thank You

Richard Medina
Co-founder and Principal Consultant, Doculabs | doculabs.com
rmedina@doculabs.com | richardmedinadoculabs.com
@richarddoculabs

www.aiim.org/infochaos
Do YOU understand the business challenge of the next 10 years? This ebook from AIIM President John Mancini explains.
  

[Webinar Slides] eSignatures: Learn How This Technology Can Revolutionize You...AIIM International
 
[Webinar Slides] Your 2019 Information Management Resolution: Part Two
[Webinar Slides] Your 2019 Information Management Resolution: Part Two[Webinar Slides] Your 2019 Information Management Resolution: Part Two
[Webinar Slides] Your 2019 Information Management Resolution: Part TwoAIIM International
 
[Webinar Slides] Data Explosion in Your Organization? Harness It with a Compr...
[Webinar Slides] Data Explosion in Your Organization? Harness It with a Compr...[Webinar Slides] Data Explosion in Your Organization? Harness It with a Compr...
[Webinar Slides] Data Explosion in Your Organization? Harness It with a Compr...AIIM International
 
[Webinar Slides] It All Starts Here— Effectively Capturing Paper and Digital ...
[Webinar Slides] It All Starts Here— Effectively Capturing Paper and Digital ...[Webinar Slides] It All Starts Here— Effectively Capturing Paper and Digital ...
[Webinar Slides] It All Starts Here— Effectively Capturing Paper and Digital ...AIIM International
 
[Webinar Slides] Improving your Organization’s Collaborative and Case-Centric...
[Webinar Slides] Improving your Organization’s Collaborative and Case-Centric...[Webinar Slides] Improving your Organization’s Collaborative and Case-Centric...
[Webinar Slides] Improving your Organization’s Collaborative and Case-Centric...AIIM International
 
[Webinar Slides] Modern Problems Require Modern Solutions
[Webinar Slides] Modern Problems Require Modern Solutions[Webinar Slides] Modern Problems Require Modern Solutions
[Webinar Slides] Modern Problems Require Modern SolutionsAIIM International
 
[Webinar Slides] Dreading Your Data Migration Project? 3 Ways Robotic Process...
[Webinar Slides] Dreading Your Data Migration Project? 3 Ways Robotic Process...[Webinar Slides] Dreading Your Data Migration Project? 3 Ways Robotic Process...
[Webinar Slides] Dreading Your Data Migration Project? 3 Ways Robotic Process...AIIM International
 
[AIIM18] Beyond Human Capacity: Using analytics to scale your everyday inform...
[AIIM18] Beyond Human Capacity: Using analytics to scale your everyday inform...[AIIM18] Beyond Human Capacity: Using analytics to scale your everyday inform...
[AIIM18] Beyond Human Capacity: Using analytics to scale your everyday inform...AIIM International
 

The Good, The Bad, and The Ugly of Defensible Disposition

  • 1. #AIIM14
    The Good, the Bad, and the Ugly of Defensible Disposition
    Richard Medina
    Co-founder and Principal Consultant, Doculabs | doculabs.com
    rmedina@doculabs.com | richardmedinadoculabs.com
    @richarddoculabs
  • 2. Issues
    1. The problem
       § The sky is falling again
    2. Break it into two problems
       § Day-forward versus historical content
    3. How to address historical content
       § A defensible disposition methodology
    4. Analysis and classification technology
       § Should you use it? Does it work?
    5. Doing the Assessment
       § Approaches and results
  • 3. Issues
    1. The problem
       § The sky is falling again
    2. Break it into two problems
       § Day-forward versus historical content
    3. How to address historical content
       § A defensible disposition methodology
    4. Analysis and classification technology
       § Should you use it? Does it work?
    5. Doing the Assessment
       § Approaches and results
  • 4. The Problem is Over-Retention
    Organizations have been over-retaining electronic information and failing to dispose of it in a legally defensible manner when business and law will allow
    Retaining everything forever
    Disposing of everything immediately
    Having employees make classification decisions
    Having technology make classification decisions
    Hybrid with technology and people
  • 5. Why Over-Retention is the Problem
    § Organizations keep non-required electronic content forever because:
      1. Classifying content (to determine what to keep and what to purge) is manual and expensive
      2. Content worth preserving is mixed with content that should be purged
      3. Legal, and others, are afraid of wrongfully deleting materials (spoliation)
      4. Additional storage is inexpensive, which makes it easy for corporations to buy more storage and defer addressing the problem
  • 6. Issues
    1. The problem
       § The sky is falling again
    2. Break it into two problems
       § Day-forward versus historical content
    3. How to address historical content
       § A defensible disposition methodology
    4. Analysis and classification technology
       § Should you use it? Does it work?
    5. Doing the Assessment
       § Approaches and results
  • 7. Recommendations for Day-forward
    § Day-forward information lifecycle management (ILM) is much easier to address than historical content
      § Even though addressing it messes with employees' day-to-day business activities
    § Day-forward: Initiate ILM practices on a "day-forward" basis first, so any new content created or saved is assigned a disposition period
      § Disposition horizons should begin to influence where users store content (as users discover that materials saved in the "wrong" system will be purged)
    § Guidance: Provide employees with explicit guidance for the acceptable use of available tools for dynamic content and their associated retention periods
      § For example, retain non-records for 3 years, retain official records per the retention schedule
    § Historical: For historical content, analyze the feasibility of content analytics and autoclassification
      § Recognize that cleaning up TBs of content can take years. So conduct the analysis in 2014, begin the cleanup effort in earnest by 2015, and eliminate a large portion of dated content by 2016
  • 8. Guidance Example for Day-forward
    Columns: System/Repository; Recommended Retention Period
    Personal Network Drives ("P" drives)
      • Provide each user with personal drive space of a limited size for their storage, for as long as the user is employed
    Shared Network Drives ("G" drives)
      • Make them read only (which means no network storage for collaboration; content will have to go into an ECM system)
      • Exceptions include applications or systems that need to use network storage
    ECM System
      1. Default for non-records: retained for 3 years
      2. Default for non-records that have long-term value: retained for 7 years
      3. Official records: retained per the retention schedule
    Social Community Sites
      • No documents stored in communities (only links to documents in the ECM system)
      • Consider retention periods for non-document content (e.g. 3 years)
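The ECM defaults in the guidance above can be written down as a small lookup. This is a minimal sketch only: the 3-year and 7-year defaults come from the slide, while the function name, parameter names, and the idea of passing the scheduled period in as a number are assumptions made for illustration.

```python
# Defaults taken from the slide's ECM System row.
NON_RECORD_YEARS = 3
LONG_TERM_NON_RECORD_YEARS = 7


def ecm_retention_years(is_official_record, has_long_term_value=False,
                        schedule_years=None):
    """Return the retention period in years for an item in the ECM system.

    Hypothetical helper: official records defer to the retention schedule,
    which the caller must supply; non-records fall back to the defaults.
    """
    if is_official_record:
        if schedule_years is None:
            raise ValueError("official records need a period from the retention schedule")
        return schedule_years
    if has_long_term_value:
        return LONG_TERM_NON_RECORD_YEARS
    return NON_RECORD_YEARS
```

Keeping the schedule lookup outside the function leaves the retention schedule itself as the single source of truth for official records.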
  • 9. Issues
    1. The problem
       § The sky is falling again
    2. Break it into two problems
       § Day-forward versus historical content
    3. How to address historical content
       § A defensible disposition methodology
    4. Analysis and classification technology
       § Should you use it? Does it work?
    5. Doing the Assessment
       § Approaches and results
  • 10. What's the Purpose of Your DD Methodology?
    § You must satisfy 4 demands:
      1. Regulatory retention requirements
      2. Hold retention requirements
      3. Business retention requirements
      4. Cost impact of anything you do
    § What you do has impact:
      1. What you do
      2. Effects of what you do
    § You can do 2 things:
      1. Sort
      2. Dispose
    § Your mission stated two ways:
      § Your mission is to satisfy your retention demands (1-3) while minimizing bad cost impact to yourself (4)
      § Your mission is to maximize good cost impact (4) while satisfying your retention requirements (1-3)
  • 11. It's Based on Reasonableness
    § To determine what "satisfy your retention demands" really means for you, use the Principle of Reasonableness and act in Good Faith
      § Courts do not ask, expect or necessarily reward organizations for perfection. Courts do expect, however, that whatever information management tactics an organization undertakes are appropriate to how that particular entity is situated (size, financial resources, regulatory and litigation profile, etc.). (Jim McGann and Julie Colgan, "Implement a defensible deletion strategy to manage risk and control costs", Inside Counsel)
  • 12. Your DD Methodology Has 4 Parts
    1. Defensible Disposition Policy
       § It's your design specification, your business rules for DD, your decision tree
       § Specifies very clearly the objectives that your methodology will fulfill. It states clearly what you mean by your retention requirements and what you mean by reasonable costs when you are trying to fulfill your retention requirements.
    2. Technology Approach
       § For Sorting and Disposing
       § You must use technology: it's not an option
  • 13. Your DD Methodology Has 4 Parts
    3. Assessment (Sorting) Plan
       § Do the legwork and look at what's there
       § What information and systems you're assessing
       § Your processing rules (decision plan)
       § It will be flexible
    4. Disposition Plan
       § Evaluate your assessment results using your DD Policy
       § Dispose (which ranges from keeping forever to deleting right now with many options in between)
       § Refine your DD Policy (1) and continue as needed
  • 14. Issues
    1. The problem
       § The sky is falling again
    2. Break it into two problems
       § Day-forward versus historical content
    3. How to address historical content
       § A defensible disposition methodology
    4. Analysis and classification technology
       § Should you use it? Does it work?
    5. Doing the Assessment
       § Approaches and results
  • 15. There's an Awesome Business Case
    § … if the technology works
    § 50 TB = ~200 million documents (average of 250 KB per document)
    § The following table illustrates the time and effort required to classify 200 million documents
    Classification Technique: Manual Classification
      Classification Rate: 10 seconds per document
      Pricing: $35 / hr.
      Total Cost to Classify: $20 million
    Classification Technique: Auto Classification (with 95% machine and 5% human classified, via offshore labor)
      Classification Rate: Less than 1 second per document
      Pricing: $.005 per document for machine processing and $5 / hr. for those that require manual classification
      Total Cost to Classify: $2 million
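The arithmetic behind the table can be reproduced with a short calculation. This is a sketch under stated assumptions: the volume, rates, and 95/5 split come from the slide, but the slide does not say how fast the offshore reviewers work, so the code assumes the same 10 seconds per document as manual classification. All function names are hypothetical.

```python
# Figures from the slide.
TOTAL_BYTES = 50 * 10**12          # 50 TB
AVG_DOC_BYTES = 250 * 10**3        # 250 KB per document
SECONDS_PER_MANUAL_DOC = 10


def document_count(total_bytes=TOTAL_BYTES, avg_doc_bytes=AVG_DOC_BYTES):
    """Estimate the number of documents from total volume and average size."""
    return total_bytes // avg_doc_bytes


def manual_cost(docs, hourly_rate=35.0, seconds_per_doc=SECONDS_PER_MANUAL_DOC):
    """Labor cost of classifying `docs` documents by hand."""
    hours = docs * seconds_per_doc / 3600
    return hours * hourly_rate


def auto_cost(docs, machine_share=0.95, machine_price=0.005, offshore_rate=5.0):
    """Machine fee for the machine-classified share plus offshore labor for the rest.

    Assumption: offshore reviewers also take 10 seconds per document.
    """
    machine_docs = docs * machine_share
    human_docs = docs - machine_docs
    return machine_docs * machine_price + manual_cost(human_docs, offshore_rate)


if __name__ == "__main__":
    docs = document_count()
    print(f"{docs:,} documents")
    print(f"manual: ${manual_cost(docs):,.0f}")
    print(f"auto:   ${auto_cost(docs):,.0f}")
```

Under these assumptions the manual figure comes to roughly $19.4 million, in line with the slide's rounded $20 million. The auto figure comes to roughly $1.1 million, which is below the slide's $2 million; the slide does not itemize its total, so the difference may reflect costs that are not listed in the pricing column.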
  • 16. Analysis and Classification Technologies
    § Many different kinds of technology vendors are addressing analysis, classification, and disposition
      § File Analytics, Content Analytics, Content Classification, ECM, E-discovery, Search, Capture, DLP, Storage Management
      § Products, hosted solutions, service providers
      § IBM/Stored IQ, HP/Autonomy, EMC Kazeon, SAS, Kofax, Equivio, Rational Retention, Recommind, Index Engines, and others
    § Most have a sweet spot where they will succeed
    § But it's highly dependent… on just about every factor you can think of
      § E.g., your business purposes, your ECM environment, your "information architecture", your document types and their complexity and volume, the value and risk of the documents, your success criteria, etc.
  • 17. Sidebar: How Many of them Work
    Before
      <server XXX, drive G:>
      Forecast summary_121008.doc
    After
      Record = no
      Age = 2.5 years
      Document type = departmental forecast
      Keywords = forecast, 2008, draft
      Status = delete
      Confidence = 9.2 (out of 10)
    1. Analyze the content and review the retention schedule
    2. Establish classification rules and train the systems with examples
    3. Crawlers and recognition engines evaluate the content and generate a classification
    4. For content where a high machine confidence factor exists, content is automatically tagged and then staged for migration to the appropriate system or disposition
    5. For content with low confidence factors, documents are routed to clerical staff (onshore or offshore) for manual classification
    6. The results of the manual identification are fed back into the automated algorithms to "teach" the systems better classification
    Client Validation: Throughout the process, results and samples are routed to records management and legal professionals within the firm for validation and confirmation
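Steps 4 and 5 above amount to routing on a confidence score. The following is a minimal sketch of that triage, not any vendor's actual behavior: the 0 to 10 scale and the example values come from the slide, while the threshold of 8.0, the class, and the function names are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Classification:
    path: str
    doc_type: str
    status: str        # e.g. "delete" or "retain"
    confidence: float  # 0-10 scale, as in the slide's example


# Hypothetical cutoff; a real project would tune this during validation.
AUTO_THRESHOLD = 8.0


def route(item, threshold=AUTO_THRESHOLD):
    """High-confidence items are tagged automatically; the rest go to people."""
    return "auto" if item.confidence >= threshold else "manual_review"


def triage(items, threshold=AUTO_THRESHOLD):
    """Split classified items into an automatic queue and a manual-review queue."""
    queues = {"auto": [], "manual_review": []}
    for item in items:
        queues[route(item, threshold)].append(item)
    return queues
```

The slide's example (confidence 9.2, status delete) would land in the automatic queue under this threshold. Step 6, feeding manual decisions back into the classifier, depends on the specific product and is not modeled here.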
  • 18. Issues
    1. The problem
       § The sky is falling again
    2. Break it into two problems
       § Day-forward versus historical content
    3. How to address historical content
       § A defensible disposition methodology
    4. Analysis and classification technology
       § Should you use it? Does it work?
    5. Doing the Assessment
       § Approaches and results
  • 19. Assessment Approaches
    § There are three categories of attributes that can be used to determine what a file is:
      1. Environmental attributes around the file (e.g., file location, ownership)
      2. File attributes about the file (e.g., file type, age, author)
      3. Content attributes within the file (e.g., keywords, character strings, word proximity, word density)
    § Various techniques and technologies, along with business rules, can be used to determine what a file is, and whether it is eligible for disposition
      § E.g., a DOC file created over 5 years ago and not accessed for a year may be purged
      § This type of purging could be done after giving users adequate notice ("move it or lose it" or "hold" for 90 days, then delete)
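The example rule above (a DOC file created over 5 years ago and not accessed for a year) can be expressed as a small predicate over file attributes. This is a sketch, assuming the attributes have already been collected by whatever tool crawls the file system; the function names are hypothetical, and a year is approximated as 365 days.

```python
from datetime import datetime, timedelta

# From the slide's example; a real policy would list more types.
PURGEABLE_SUFFIXES = {".doc"}


def is_purge_candidate(suffix, created, last_accessed, now,
                       min_age_years=5, min_idle_days=365):
    """Apply the rule: right file type, old enough, and idle long enough."""
    if suffix.lower() not in PURGEABLE_SUFFIXES:
        return False
    old_enough = (now - created) > timedelta(days=365 * min_age_years)
    idle_enough = (now - last_accessed) > timedelta(days=min_idle_days)
    return old_enough and idle_enough


def purge_date(flagged_on, notice_days=90):
    """Earliest deletion date after the 'move it or lose it' notice period."""
    return flagged_on + timedelta(days=notice_days)
```

The timestamps are passed in rather than read from the file system because creation and last-access times are recorded differently across platforms and are often unreliable on shared drives.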
  • 20. #1: Environmental Attributes
    Environmental Attributes (around a file)
    Columns: Attribute | Evaluation Technique | Tool(s) Used | Examples | How Used
    1. Ownership | Access Controls | Content Analytics, Data Loss Prevention, Storage Management | Permissions within LDAP list people and infer department or function | Large collections of files can be assessed en masse based on access controls
    2. Location | File Path | Content Analytics, Data Loss Prevention, Storage Management | G:/accounting/july2004/temp | Stranded and orphaned locations are often easily eliminated
  • 21. #2: File Attributes
    File Attributes (about a file)
    Columns: Attribute | Evaluation Technique | Tool(s) Used | Examples | How Used
    3. Duplicate | Hash Algorithm | Content Analytics | Exact duplicates | Exact duplicates can be easily eliminated
       Duplicate | Block Read | Content Analytics | Near duplicates | Near duplicates must be assessed in the context of other attributes
    4. File Type | Extension or MIME type | Content Analytics | .TMP, .MP3 | To identify file types that should not exist in a corporate setting
    5. Metadata | Properties | Content Analytics | Age | To determine old materials, materials authored by individuals that have left the organization
       Metadata | Properties | Content Analytics | Author | Typically, these attributes must be combined with other attributes via a rule to take action
       Metadata | Properties | Content Analytics | Security Profile (Confidential) | Use filename properties to determine type
    6. File Name | Character Strings | Content Analytics | GL-USDIST31_093098.xls | Determine whether a file was system generated vs. human generated
       File Name | Character Strings | Content Analytics | FORMUB92_SMITH | Documents that are based on a specific form number can easily be identified
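Technique 3 above (exact duplicates via a hash algorithm) is easy to illustrate. The sketch below groups files whose contents hash to the same digest. It is an assumption-laden example rather than how any particular content analytics tool works: the function name is hypothetical, SHA-256 is chosen arbitrarily since the slide names no algorithm, and file contents are passed in memory to keep the example self-contained.

```python
import hashlib
from collections import defaultdict


def find_exact_duplicates(files: dict[str, bytes]) -> list[list[str]]:
    """Group file names whose contents produce the same SHA-256 digest."""
    by_digest = defaultdict(list)
    for name, content in files.items():
        by_digest[hashlib.sha256(content).hexdigest()].append(name)
    # Only digests shared by two or more files indicate duplicates.
    return [sorted(names) for names in by_digest.values() if len(names) > 1]


# Invented sample data: two identical files and one distinct file.
sample = {
    "G:/accounting/forecast.doc": b"Q3 forecast draft",
    "G:/temp/forecast_copy.doc": b"Q3 forecast draft",
    "G:/accounting/budget.doc": b"FY budget",
}
duplicate_groups = find_exact_duplicates(sample)
```

Hashing only catches byte-identical copies. Near duplicates need a different technique (the slide lists block reads) and, as noted, must be judged alongside other attributes.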
  • 22. #3: Content Attributes
    Content Attributes (within a file)
    Columns: Attribute | Evaluation Technique | Tool(s) Used | Examples | How Used
    7. Key Word | Character Strings | Content Analytics; Classification Module | “Enron”, “Guarantee” | To determine if a document is on Hold via a word list per the hold request
    8. Character or Word Patterns | “Classification” <pattern matching> | Classification Module | Word proximity | To determine the category in which a document may fit
       Character or Word Patterns | “Classification” <pattern matching> | Classification Module | Word frequency |
       Character or Word Patterns | “Classification” <pattern matching> | Content Analytics; Classification Module | “Privileged” |
       Character or Word Patterns | Identification of PII | Content Analytics; DLP | SS#, Credit card # | Regular Expression (RegEx) lists; determined entities for hold, security, IP, PHI, PII, DLP
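The PII row above relies on regular expression lists for items such as Social Security and credit card numbers. A minimal sketch of that idea follows. The patterns and function names are illustrative assumptions, not the rules used by any DLP product, and the Luhn checksum is added here only to cut down false positives on card-like digit runs.

```python
import re

# Illustrative patterns only; real rule sets are broader and locale-aware.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CARD_RE = re.compile(r"\b(?:\d[ -]?){12,15}\d\b")  # 13 to 16 digits, optional separators


def luhn_ok(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_pii(text: str) -> list[tuple[str, str]]:
    """Return (kind, matched text) pairs for SSN-like and card-like strings."""
    hits = [("ssn", m.group()) for m in SSN_RE.finditer(text)]
    for m in CARD_RE.finditer(text):
        if luhn_ok(re.sub(r"\D", "", m.group())):
            hits.append(("card", m.group()))
    return hits


# 4111 1111 1111 1111 is a widely used test card number, not a real account.
pii_hits = find_pii("SSN 123-45-6789, card 4111 1111 1111 1111")
```

The same pattern-list approach extends to hold keywords such as those in row 7, with a word list taking the place of the regular expressions.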
  • 23. Assessment Results
    Preservation Findings
       § Unnecessary File Types (executables, non-business pictures, movies, etc.): 13 to 15%
       § Duplicates: 15 to 20%
       § Near Duplicates: 9 to 30%
    Risk Findings
       § Files with PII: 10 to 16%
       § Files with Sample Keywords: 3 to 5%
    Operational Findings
       § Files 10 years or older: 7 to 11%
       § Files accessed within the last 18 months: 25 to 35%
    Findings are not mutually exclusive (i.e., a duplicate file could also be aged)
  • 24. Assessment Summary
    Findings | Enterprise Impact
    Total that could be disposed | 20% of 2.5 PB
    Enterprise Implications | .5 PB removed @ $5,000,000 per PB
    Savings | $2,500,000 per year in storage expense

    Technique | Status | % of Total | Total
    Analytics | Unnecessary | 20% | 500 TB (.5 PB)
    Classification | Record | 8% | 200 TB (.2 PB)
    Classification | Non-Record, Business Reference | 28% | 700 TB (.7 PB)
    Classification | Evaluated, Staged for Disposition (2016) | 44% | 1,100 TB (1.1 PB)
    Total | | 100% | 2,500 TB (2.5 PB)
  • 25. Assessment Implications
    § Given the results, $2.5 million in storage expense could be saved annually on the disposition of historic content, resulting in $12.5 million over 5 years
    § Going forward with newly created content, if similar techniques are applied, the saving grows to $34.8 million over 5 years
       § The current cost projections are based on the historical content growth rate of 30% per year
       § The expected cost projections are based on a content growth rate of 26% per year

    @ $5,000,000 per PB | 2012 | 2013 | 2014 | 2015 | 2016* | Total
    Current Storage (PB) | 2.5 | 3.25 | 4.23 | 5.49 | 7.14 |
    Current Cost (Mill) | $12.5 | $16.3 | $21.1 | $27.5 | $35.7 | $113.0
    Expected Storage (PB) | 2 | 2.52 | 3.18 | 4.00 | 3.94 |
    Expected Cost (Mill) | $10 | $12.6 | $15.9 | $20.0 | $19.7 | $78.2
    Total Savings (Mill) | $2.5 | $3.65 | $5.25 | $7.46 | $16.00 | $34.8
    *In 2016, the 1.1 PB or 44% of content from the 2012 historical content assessment can be disposed
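The projection table can be reproduced from the figures stated on the slide: a 2.5 PB starting point growing 30% per year, versus a 2 PB starting point growing 26% per year with 1.1 PB disposed in 2016, both costed at $5 million per PB. The sketch below is a reconstruction under those assumptions. The function and variable names are hypothetical, and the slide's totals are rounded, so the computed savings agree with the $34.8 million figure only to within rounding.

```python
COST_PER_PB_MILLIONS = 5.0  # from the slide: $5,000,000 per PB


def project_storage(start_pb, growth_rate, years, disposal=None):
    """Return year-by-year storage in PB.

    disposal maps a zero-based year index to PB removed in that year.
    """
    disposal = disposal or {}
    storage = []
    current = start_pb
    for i in range(years):
        if i > 0:
            current *= 1 + growth_rate  # compound growth after the first year
        current -= disposal.get(i, 0.0)
        storage.append(current)
    return storage


# 2012 through 2016; index 4 is 2016, when the staged 1.1 PB is disposed.
current_pb = project_storage(2.5, 0.30, 5)
expected_pb = project_storage(2.0, 0.26, 5, disposal={4: 1.1})

current_cost = [pb * COST_PER_PB_MILLIONS for pb in current_pb]
expected_cost = [pb * COST_PER_PB_MILLIONS for pb in expected_pb]
savings = [c - e for c, e in zip(current_cost, expected_cost)]
total_savings = sum(savings)
```

Most of the fifth-year saving comes from the one-time disposal of the staged 1.1 PB, which is why the 2016 column jumps to roughly $16 million.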
  • 26. Conclusions
    1. The business case for disposition is strong
       § Costs, risks, and benefits
    2. Information governance must be addressed in phases
       § Starting today, the program will take years to mature
       § Set expectations accordingly
    3. You should probably address day-forward ILM before tackling historical content
    4. Recognize that manual classification is not an option
    5. The technologies are immature and varied, but you can be successful by matching the techniques and technologies to the kinds of files you want to target
    6. Your DD methodology has 4 main parts: DD Policy, Technology Approach, Assessment Plan, Disposition Plan
  • 27. Thank You
    Richard Medina
    Co-founder and Principal Consultant, Doculabs | doculabs.com
    rmedina@doculabs.com | richardmedinadoculabs.com
    @richarddoculabs
  • 28. www.aiim.org/infochaos   Do  YOU  understand  the  business     challenge  of  the  next  10  years?   This  ebook  from  AIIM  President   John  Mancini  explains.