The Future of Log Centralization for SIEMs and DFIR: Is the End Nigh
1. The Future of Log Centralization for
SIEMs and DFIR
Is the End Nigh?
Dr. Anton Chuvakin
https://medium.com/anton-on-security
https://cloud.withgoogle.com/cloudsecurity/podcast
Office of the CISO, Google Cloud
August 2023
2. Outline
● Logs … still centralized?
○ What worked well?
○ What was always a challenge?
● What changed?
○ So, should we still centralize?
● What does the possible future look like?
5. Time Machine to 2003!
● Log centralization
● Syslog dominates
● Syslog UDP is still cool (in a late
1980s kinda way)
● SIEM does not exist, yet SIM and
SEM do
● Log management is a generic term,
not a market name…
9. Scenario 1 Multi-cloud at Scale
● Big presence in Google Cloud
● Also, big presence in another
cloud
● AND finally, still a sizable
presence on premises
● Where do the logs go?
10. Scenario 2 Useful Logs, “Useless” Logs
● Megabytes of alerts
● Gigabytes of priority logs
● AND petabytes of information logs
● Now, add observability traces
● Do we centralize … at per GB
price?
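To put rough numbers on that last question, here is a minimal arithmetic sketch. The daily volumes and the per-GB price are hypothetical placeholders chosen only to match the megabytes/gigabytes/petabytes ordering on the slide; they are not figures from the talk or from any vendor price list.

```python
# Hypothetical daily volumes in GB (decimal units), mirroring the slide's tiers.
DAILY_VOLUME_GB = {
    "alerts": 0.05,                    # ~50 MB/day (assumed)
    "priority_logs": 200.0,            # 200 GB/day (assumed)
    "information_logs": 1_000_000.0,   # 1 PB/day (assumed)
}

PRICE_PER_GB_USD = 0.50  # assumed ingestion price, for illustration only


def daily_cost(volumes, price_per_gb):
    """Return the daily ingestion cost per log tier at a flat per-GB price."""
    return {name: gb * price_per_gb for name, gb in volumes.items()}


costs = daily_cost(DAILY_VOLUME_GB, PRICE_PER_GB_USD)
total = sum(costs.values())

for name, cost in costs.items():
    share = cost / total
    print(f"{name:>18}: ${cost:,.2f}/day ({share:.4%} of total)")
```

Whatever the real numbers are, under flat per-GB pricing the bulk tier dominates the bill, which is why it is the first candidate for staying out of the central store.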
11. Scenario 3 Very SaaSy (But not SASE!)
● Lots of SaaS use - CRM, HR,
marketing, etc
● CASB in use
● No data centers
● Do we centralize logs at …
eh…well…eh… WHERE?
13. “Will the future be more secure? It'll be just as
insecure as it possibly can, while still
continuing to function. Just like it is today.”
-- Marcus Ranum (in ~early 2000)
15. So You Want to Decentralize?
● How to assure retention?
○ … and impress our “friends”, the auditors!?
○ … and assure evidence availability for IR
● How to normalize?
● How to correlate?
● How to ML?
16. Decisions, Decisions, etc
“Damn the torpedoes, we are centralizing
anyway”
● Compliance mandates (PCI DSS, etc)
● Need guaranteed data retention
● Have a scope of data to normalize
“Hold your horses, we need to think about it”
● Still need to centralize …
● … but not everything
● Centralized/distributed for low-stakes data
“Decentralized all the way!”
● Heavy cloud, and especially SaaS use
● No center to centralize into
● Focus on best-effort search
● “Magical” normalization (OCSF)
17. Why Bite the Bullet and CENTRALIZE ANYWAY!?!
● Specific mandate that says “centralize logs”
○ Centralize does not mean ONE place.
● Contractual pressure to have logs available in 100% of
cases
○ “If you need it done, you do it yourself!”
● A cost-effective (= cloud-native) tool is available to store
logs … without paying “per GB”...
● Don’t pay for 4 copies of the same data…
21. Recommendations
● Stick to a centralized approach for logs/data that you alert on or
analyze directly
○ Use a cloud-native SaaS SIEM platform for this
● Be ready for the world where you cannot centralize all logs in one
place
○ Start reviewing the tools that support distributed queries over
decentralized stores
○ Beware of their inherent limitations, however
● Long term, assume a hybrid centralized/decentralized model for log
analysis
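The first two recommendations can be read as a simple placement rule. The sketch below is one possible way to write it down, assuming just two yes/no attributes per log source; the class, field names, and example sources are hypothetical and only illustrate the reasoning on the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogSource:
    name: str
    used_for_detection: bool   # do we alert on it or analyze it directly?
    retention_mandated: bool   # e.g. a compliance or contractual requirement


def placement(source: LogSource) -> str:
    """Centralize what we alert on or must retain; leave the rest in place."""
    if source.used_for_detection or source.retention_mandated:
        return "centralize"
    return "leave in place, query on demand"


# Hypothetical inventory, for illustration only.
sources = [
    LogSource("edr_alerts", used_for_detection=True, retention_mandated=False),
    LogSource("cardholder_env_auth", used_for_detection=False, retention_mandated=True),
    LogSource("dhcp", used_for_detection=False, retention_mandated=False),
]

plan = {s.name: placement(s) for s in sources}
for name, decision in plan.items():
    print(f"{name}: {decision}")
```

A real decision would weigh more than two flags (cost, egress, evidence needs for IR), but the shape stays the same: a small centralized core plus a larger set of stores that are queried where they live.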
22. Resources
● “Log Centralization: The End Is Nigh?”
● “Anton Chuvakin Discusses ‘20 Years of SIEM – What’s Next?’” SANS
webinar
● “20 Years of SIEM: Celebrating My Dubious Anniversary” blog
● “On “Output-driven” SIEM” blog (2012)
● “Anton and The Great XDR Debate, Part 1”
● … and of course https://medium.com/anton-on-security
● and https://cloud.withgoogle.com/cloudsecurity/podcast/
Namely, this one: https://gartner.com/document/4017131… that says "Federated security log management (SLM) is emerging as an alternative to centrally collecting logs."
https://medium.com/anton-on-security/log-centralization-the-end-is-nigh-b28efaa98379
Let’s go through a few basic examples. The very example that inspired that line of thinking involved multi-cloud. If you are present in multiple public cloud providers, and present there at scale, it is very likely that you are NOT collecting logs into one place in one cloud. Various complexities, egress costs, storage costs all play into this becoming a questionable decision for most organizations. So you perhaps centralize per cloud, but what if we include SaaS services into this? Then it becomes an even bigger mess, as most large organizations use 100s of those.
https://medium.com/anton-on-security/log-centralization-the-end-is-nigh-b28efaa98379
Another trivial example refers to the log types that are useful for investigations or in bulk, but where each individual record is unlikely to be used for detection. For example, I’ve noticed that many organizations don’t collect and retain DHCP logs (of course, Chronicle customers do!). They fail to do it not because these logs are not useful (they are very useful as context), but because they don’t use them for any direct detections, and thus see them as “too costly to centralize” (especially if their SIEM vendor charges per EPS…).
https://www.query.ai/federated-search/
“Open federated search retrieves information from across vendor solutions and environments. It uses API integrations with third parties to perform a unified search across the data sources that are participating in the federation, and it does this without requiring data transfer or centralization. This approach also provides the flexibility to choose and integrate the best-of-breed security solutions vs having a single-vendor lock-in.”
https://www.query.ai/wp-content/uploads/2023/05/QWP-002_Evaluating-Federated-Search-for-Security.pdf
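The fan-out-and-merge idea behind federated search can be sketched in a few lines. This is a toy illustration, not any vendor's API: the stores are in-memory lists standing in for third-party search endpoints, and every name here (`make_store`, `federated_search`, the store names, the sample records) is hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

Record = Dict[str, str]
SearchFn = Callable[[str], List[Record]]


def make_store(name: str, records: List[Record]) -> SearchFn:
    """Stand-in for a remote search API: match the term against record values."""
    def search(term: str) -> List[Record]:
        return [dict(r, source=name) for r in records
                if any(term in value for value in r.values())]
    return search


# Hypothetical stores; the data stays where it is and is never copied centrally.
STORES: Dict[str, SearchFn] = {
    "cloud_a": make_store("cloud_a", [
        {"ts": "2023-08-01T10:00:00Z", "user": "alice", "event": "login"}]),
    "saas_crm": make_store("saas_crm", [
        {"ts": "2023-08-01T09:30:00Z", "user": "alice", "event": "export"}]),
    "on_prem": make_store("on_prem", [
        {"ts": "2023-08-01T11:00:00Z", "user": "bob", "event": "login"}]),
}


def federated_search(term: str, stores: Dict[str, SearchFn]) -> List[Record]:
    """Query every store in parallel and merge the hits by timestamp."""
    results: List[Record] = []
    with ThreadPoolExecutor(max_workers=len(stores)) as pool:
        futures = {name: pool.submit(fn, term) for name, fn in stores.items()}
        for name, future in futures.items():
            try:
                results.extend(future.result(timeout=5))
            except Exception as exc:
                # Best effort: one failing store must not sink the whole query.
                results.append({"source": name, "error": str(exc)})
    return sorted(results, key=lambda r: r.get("ts", ""))


hits = federated_search("alice", STORES)
for hit in hits:
    print(hit)
```

Even this toy version shows the limitations to watch for: the answer is only as complete as the slowest or least available store, and the merge works only because all three stores happen to share field names and a timestamp format, which is exactly the normalization problem raised earlier.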