StatsCraft 2015: The problem (Keynote) - Nir Cohen


Slides of Nir Cohen's talk at StatsCraft 2015


StatsCraft
Monitoring Conference
website and agenda: http://statscraft.org.il
twitter: @statscraft (#statscraft)
facebook: https://www.facebook.com/statscraft.il
email: statscraftcon@gmail.com

Agenda
1. Understand the problem.
2. Understand what monitoring is.
3. Example use-case(s)
4. A different approach
5. Learn methodologies and tools

The Problem
Nir Cohen @ Gigaspaces
@thinkops
http://github.com/nir0s

We monitor because...
We want to satisfy the customer.
(make money?)

Automated Resource Provisioning
Configuration Management
Automated Code Deployment
Continuous Whatever
Monitoring
Still underrated...
PROBLEM!

Blame the tools?

Problem origin
DISCLAIMER

We're monitoring the wrong things.
_rootCauseAnalysis:
the alternative is harder.

We're considering logs a second class citizen.
_rootCauseAnalysis:
the alternative is harder.

Our data is lacking.
_rootCauseAnalysis:
inertia. that's how it was, that's how it is.

We separate monitoring from the application.
_rootCauseAnalysis:
we're not used to this. (Ops problem)
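
One way to picture monitoring living inside the application rather than beside it is a small instrumentation decorator. This is only a sketch: the `METRICS` store, the `timed` decorator, the `checkout` function and the metric name are all hypothetical, and a real setup would ship the measurements to a metrics backend instead of keeping them in memory.

```python
import time
from collections import defaultdict
from functools import wraps

# Hypothetical in-process store, used here only for illustration.
METRICS = defaultdict(list)


def timed(name):
    """Record the wall-clock duration of every call under the metric `name`."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                # Runs on success and on failure, so failed calls are measured too.
                METRICS[name].append(time.perf_counter() - start)
        return wrapper
    return decorator


@timed("checkout.duration_seconds")
def checkout(cart):
    """Hypothetical business function owned by the application developers."""
    return sum(cart)
```

The point of the sketch is that the measurement is written by whoever writes `checkout`, at the place where the business meaning is known.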

We monitor reactively, not proactively.
_rootCauseAnalysis:
reaction requires less initial energy than anticipation.
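
As a rough illustration of anticipation versus reaction, the sketch below estimates how long until a resource fills up instead of waiting for it to be full. It assumes roughly linear growth and uses a plain least-squares slope; the function name, the sample format and the numbers are made up for the example.

```python
def hours_until_full(samples, capacity):
    """Estimate the hours left until `capacity` is reached.

    `samples` is a list of (hour, used) pairs, oldest first.
    Returns None when there is too little data or usage is not growing.
    """
    n = len(samples)
    if n < 2:
        return None
    mean_t = sum(t for t, _ in samples) / n
    mean_u = sum(u for _, u in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        return None
    # Least-squares slope: growth in usage per hour.
    slope = sum((t - mean_t) * (u - mean_u) for t, u in samples) / var_t
    if slope <= 0:
        return None
    # Project forward from the most recent observation.
    _, last_used = samples[-1]
    return max(0.0, (capacity - last_used) / slope)


# Usage grows by 10 units per hour; 30 units remain, so about 3 hours are left.
print(hours_until_full([(0, 50), (1, 60), (2, 70)], capacity=100))
```

An alert on "less than N hours left" fires while there is still time to act, which is the difference the slide is pointing at.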

We put uptime above system and product quality.
_rootCauseAnalysis:
it's much easier.

We deal with hard limits.
_rootCauseAnalysis:
arbitrary numbers are easier to set.
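
One possible alternative to an arbitrary hard limit is to compare a value against the recent behaviour of the same series. The sketch below is an assumption-laden illustration, not something taken from the talk: the function name, the three-sigma default and the sample data are all hypothetical.

```python
from statistics import mean, stdev


def is_anomalous(history, value, sigmas=3.0):
    """Flag `value` if it deviates from the recent baseline by more than `sigmas` standard deviations."""
    if len(history) < 2:
        return False  # Not enough data to form a baseline.
    mu = mean(history)
    sd = stdev(history)
    if sd == 0:
        return value != mu
    return abs(value - mu) > sigmas * sd


# A hard limit such as "alert above 80" would stay silent here,
# although 20 is double what this host normally does.
quiet_host = [10, 11, 9, 10, 10]
print(is_anomalous(quiet_host, 20))
```

The baseline moves with the system, so the same check can serve a quiet host and a busy one without picking a number for each.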

Monitoring is non-functional but resource hungry.
_rootCauseAnalysis:
we just don't accept it.

Good monitoring requires the right people, not just Ops!
_rootCauseAnalysis:
delegation is natural. others have more important things to do.

Alert fatigue is common.
_rootCauseAnalysis:
solving issues is much easier than solving problems, and apparently, we are addicted to non-actionable alerts.

We're auto-scaling prematurely.
_rootCauseAnalysis:
brute force is natural.

We're choosing the wrong tools.
_rootCauseAnalysis:
it's easier to choose the tool than to choose what to monitor.

Good monitoring is hard.
_rootCauseAnalysis:
systems become complex, so they're harder to monitor.

So, after all, why do we not monitor properly?
_rootCauseAnalysis:
1. Simplification
2. Delegation
3. Rationalization

No fear,
Let's see how we can make this all better
is here!

“If a service crashes and no one is around to monitor it, does it raise an alert?”


