msc_logparser
Ervin Hegedüs (airween@digitalwave.hu)
A ModSecurity log parser
Content
● idea
● goals and motivations
● internal operation
● examples
● future plans
// 2
Idea
● we use ModSecurity on more and more servers, both Apache and Nginx
○ more and more very unpleasant false positives (FPs)
● demand: process logs for a custom dashboard
○ both developers and admins can check the WAF results
● collect virtual hosts logs from the servers
○ side note: discover the servers config with httpd_pyparser
○ collect information about server configuration, incl. log paths, CRS setup (PL), ModSecurity customization
● which log?
○ audit.log?
○ error.log? - this one was chosen: we need the "Warning"-only lines too
● how to transport the log(s) to the dashboard app?
○ on the fly - through pipe (Apache) or through syslog (Nginx)
○ copy the logs and load them in place
// 3
First steps
● facing the problems:
○ log parts and their limitations - see later
○ truncated fields
○ falsified data - see later
○ differences between the engines
○ portability - what if we want several different applications?
○ performance - may only be a problem in extreme cases
● it's not as trivial as it seems
// 4
Falsified data
Example: how to fill the log with false data
curl -v -X POST -d ") [file "/dev/null"] [line "-2"] [id "3.1415"] [msg "Empty"] [data
"[file "/dev/random"] [line "inf"]"] [severity "NORMAL"] =@attack"...
This produces the following funny line - a game: find the correct "file" field!
ModSecurity: Warning. Matched phrase "pattern_from_attack" at ARGS_NAMES:) [file "/dev/null"]
[line "-2"] [id "3.1415"] [msg "Empty"] [data "[file "/dev/random"] [line "inf"]"] [severity
"NORMAL"] . [file "/usr..
Fortunately, this can't happen in the later fields (e.g. %{MATCHED_VAR} in [data]), because those fields are encoded.
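Why this matters for a parser can be shown with a minimal Python sketch. The line below is a made-up, shortened variant of the forged entry above (the rule file path is a placeholder), and `file_field` is a hypothetical helper, not part of libmsclogparser. Searching for the first ' [file "' picks the attacker's value, searching for the last one picks the field written by the engine.

```python
# Hypothetical, shortened log line: the first [file ...] comes from the
# attacker-controlled argument name, the last one is written by ModSecurity.
FORGED = (
    'ModSecurity: Warning. Matched phrase "x" at ARGS_NAMES:) '
    '[file "/dev/null"] [line "-2"] [id "3.1415"]. '
    '[file "/etc/rules/real.conf"] [line "42"] [id "1000"]'
)

PATTERN = ' [file "'


def file_field(line: str, use_last: bool) -> str:
    """Return the value of a [file "..."] field, first or last occurrence."""
    pos = line.rfind(PATTERN) if use_last else line.find(PATTERN)
    if pos < 0:
        return ""
    start = pos + len(PATTERN)
    end = line.find('"', start)
    return line[start:end] if end >= 0 else line[start:]


print(file_field(FORGED, use_last=False))  # attacker-supplied value
print(file_field(FORGED, use_last=True))   # value written by the engine
```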
// 5
Goals and motivations
● parse the lines of the logfiles into a structure
○ the structure contains as many details as possible
○ helps with row filtering or making exclusions
■ e.g. give access to a site developer, who can see the affected logs and mark the FPs
○ indicates if the line is wrong
● do it as fast as possible
● can be moved to other platforms (e.g. from Python to PHP)
● with minimal dependencies (reduce the number of 3rd party libraries)
// 6
Internal operation
● libmsclogparser is written in C
○ therefore it can be used from other applications written in C, C++, Go or Rust
○ by default it has these bindings: Lua, PHP, Python, Ruby
○ available on GitHub: https://github.com/digitalwave/libmsclogparser, under the AGPL license
● no other libraries needed (e.g. regex, glib)
● uses standard string functions
○ the compiler can optimize these functions
● the main function is "parse()"
○ expects a line as a string, the length of the string, the type of the line (Apache or Nginx) and an empty structure
○ in the bindings this is easier: only the line and the type of the line are needed
○ the function fills the empty structure, the bindings return the structure
● a helper function: read_msclog_err() - gives a list of error messages and positions
// 7
Internal operation
It was important to understand how a log entry is created in case of both engines.
Let's see how they work!
● logs are written through the web server
● the length of the log (and thus its content) is limited
● the authors of the code decided which parts can be truncated
// 8
Internal operation - parts of a log line
Core parts (with bold)
[Tue Feb 14 09:00:00.123456 2023] [security2:error] [pid 364323:tid 139847182132992] [client
216.244.66.246:57996] [client 216.244.66.246] ModSecurity:...
2023/02/14 09:00:00 [info] 1419350#1419350: *12986 ModSecurity:... , client: 162.214.112.108,
server: my.virtualserver.com, request: "GET /.env HTTP/1.1", host: "my.virtualserver.com"
● these parts are always present, each line contains them
○ therefore these are not truncated, always present with a fixed length
● generated by the core logger - not the module!
● the rest is generated by the module
// 9
Internal operation - parts of a log line
Module parts (with bold) - Apache
[client 216.244.66.246:57996] [client 216.244.66.246] ModSecurity: Warning. Operator GE matched 5
at TX:inbound_anomaly_score. [file "/…/RESPONSE-980-CORRELATION.conf"] [line "92"] [id "980130"]
[msg "Inbound Anomaly Score Exceeded (Total Inbound Score: 5 - SQLI=0,...,SESS=0): individual
paranoia level scores: 5, 0, 0, 0"] [ver "OWASP_CRS/3.3.4"] [tag "event-correlation"] [hostname
"my.virtualserver.com"] [uri "/robots.txt"] [unique_id "Y-PIh32oaSsbx0Ag_pH6agAAAEE"]
● the second (bold) [client] is a duplicate (without the port)
● the part in red is the "message". Without the string "Warning. ", its max length can be 252 + " …"
● the part in black is the metadata - see later
● the part in brown is appended at the end of the process, it can't be truncated
// 10
Internal operation - parts of a log line
Metadata in module parts - Apache
[file "/…/RESPONSE-980-CORRELATION.conf"] [line "92"] [id "980130"] [msg "Inbound Anomaly Score
Exceeded (Total Inbound Score: 5 - SQLI=0,...,SESS=0): individual paranoia level scores: 5, 0, 0,
0"] [ver "OWASP_CRS/3.3.4"] [tag "event-correlation"]
● strict order: file, line, id, rev, msg, data, severity, ver, maturity, accuracy, tag
● fields are optional - if the rule has no value for a field, that field won't appear
● the [data] value can be at most 512 characters long; if it is longer, it is truncated and the '…"]' tail is appended
● [tag] can occur many times
// 11
Internal operation - parts of a log line
Tail in module parts - Apache
[hostname "my.virtualserver.com"] [uri "/robots.txt"] [unique_id "Y-PIh32oaSsbx0Ag_pH6agAAAEE"]
These fields always appear in the log file as they are, their length does not matter.
// 12
Internal operation - parts of a log line
Concatenate module parts - Apache
● the core parts come from the server
● the leading text "[client a.b.c.d] ModSecurity: Warning." is always present
● the message is added: "Operator GE matched…"
○ there can be many kinds of messages!
● the metadata is added
● the leading text, message and metadata together can be any length up to 1024 bytes
● after that, the tail part is added
● effect: there can be a truncated field, usually after the data
○ e.g.:
[data "some long text …"] [severity "CRITICAL"] [v [hostname "my.virtualhost.com"]
[ver "OWASP_CRS/3.3.2"] [tag "application-multi"] [tag "lang [hostname "my.virtualhost.com"]
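The concatenation described above can be imitated with a short Python sketch. It is only a simulation under the assumptions stated on the slide (a 1024-byte limit for leading text + message + metadata, tail appended afterwards); `build_line` and the sample values are hypothetical and not taken from the ModSecurity source.

```python
APACHE_LIMIT = 1024  # limit for leading text + message + metadata (from the slide)


def build_line(head: str, metadata: list, tail: str, limit: int = APACHE_LIMIT) -> str:
    """Join head and metadata, cut the result at `limit` bytes, append the tail."""
    body = (head + " " + " ".join(metadata)).encode("utf-8")
    cut = body[:limit].decode("utf-8", errors="ignore")
    return cut + " " + tail


head = '[client 192.0.2.1] ModSecurity: Warning. Pattern match.'
metadata = ['[data "some long text"]', '[severity "CRITICAL"]', '[ver "OWASP_CRS/3.3.2"]']
tail = '[hostname "my.virtualhost.com"]'

# A small limit is used here only to keep the demo readable:
# it cuts the line right after the first character of [ver ...].
demo_limit = len(head) + 1 + len(metadata[0]) + 1 + len(metadata[1]) + 3
print(build_line(head, metadata, tail, limit=demo_limit))
```

The printed line ends with a cut-off `[v` directly followed by the intact `[hostname ...]` tail, which is the kind of truncated field the parser has to recognize.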
// 13
Internal operation - parts of a log line
Module parts (with bold) - Nginx
12986 ModSecurity: Warning. Matched "Operator `PmFromFile' with parameter `restricted-files.data'
against variable `REQUEST_FILENAME' (Value: `/.env' ) [file "/../REQUEST-930-APPLICATION-ATTACK-
LFI.conf"] [line "106"] [id "930130"] [rev ""] [msg "Restricted File Access Attempt"] [data
"Matched Data: /.env found within REQUEST_FILENAME: /.env"] [severity "2"] [ver "OWASP_CRS/3.3.4"]
[maturity "0"] [accuracy "0"] [tag "application-multi"] [tag "language-multi"] [tag "platform-
multi"] [tag "attack-lfi"] [tag "paranoia-level/1"] [tag "OWASP_CRS"] [tag
"capec/1000/255/153/126"] [tag "PCI/6.5.4"] [hostname "my.virtualserver.com"] [uri "/.env"]
[unique_id "167466178181.669649"] [ref "..."], client: …
● the part in red is the "message", with a very strict content
● the part in black is the metadata - see later
● the part in brown is appended at the end of the process, it can't be truncated
// 14
Internal operation - parts of a log line
Metadata in module parts - Nginx
[file "/../REQUEST-930-APPLICATION-ATTACK-LFI.conf"] [line "106"] [id "930130"] [rev ""] [msg
"Restricted File Access Attempt"] [data "Matched Data: /.env found within REQUEST_FILENAME:
/.env"] [severity "2"] [ver "OWASP_CRS/3.3.4"] [maturity "0"] [accuracy "0"] [tag "application-
multi"] [tag "language-multi"] [tag "platform-multi"] [tag "attack-lfi"] [tag "paranoia-level/1"]
[tag "OWASP_CRS"] [tag "capec/1000/255/153/126"] [tag "PCI/6.5.4"]
● strict order: file, line, id, rev, msg, data, severity, ver, maturity, accuracy, tag
● fields are not optional (except [tag]) - if a value does not exist, the field is still there, empty or 0
● the [data] value can be at most 200 characters long; if it is longer, it is truncated and the 'N characters omitted)' tail is appended
● [tag] can occur many times
// 15
Internal operation - parts of a log line
Tail in module parts - Nginx
[hostname "my.virtualserver.com"] [uri "/.env"] [unique_id "167466178181.669649"] [ref "..."]
These fields always appear in the log file as is; [uri] is limited to 200 characters, the length of the others does not matter.
// 16
Internal operation - parts of a log line
Concatenate module parts - Nginx
● the core parts come from the server
● the leading text "ModSecurity: Warning." is always present
○ only two types of messages, the other is the "Access denied"
● the message is added: "Operator `OP' with parameter `PARAM' against variable `KEY' (Value: `VALUE' )"
● PARAM is limited to 200, VALUE to 100 characters
● the metadata is added
● the leading text, message and metadata together can be any length up to 2048 bytes
● after that, the tail part is added
● no randomly truncated field, all truncated parts are marked explicitly
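Because the Nginx message has such a strict shape, its parts can be picked out mechanically. The sketch below uses a regular expression for brevity; this is an illustration only, since libmsclogparser itself uses plain string functions and no regex. `parse_nginx_message` is a hypothetical helper, and the key names follow the "operator, operand, target name, target value" wording of the slides.

```python
import re

# Assumed shape, taken from the slide:
#   Operator `OP' with parameter `PARAM' against variable `KEY' (Value: `VALUE' )
NGINX_MSG = re.compile(
    r"Operator `(?P<operator>[^']*)' with parameter `(?P<operand>.*?)' "
    # VALUE is matched greedily, so a quote inside it does not end the match early
    r"against variable `(?P<target>.*?)' \(Value: `(?P<value>.*)' \)"
)


def parse_nginx_message(message: str) -> dict:
    """Return operator, operand, target and value, or {} if the shape differs."""
    m = NGINX_MSG.search(message)
    return m.groupdict() if m else {}


sample = (
    "ModSecurity: Warning. Matched \"Operator `PmFromFile' with parameter "
    "`restricted-files.data' against variable `REQUEST_FILENAME' "
    "(Value: `/.env' ) [file \"/../REQUEST-930-APPLICATION-ATTACK-LFI.conf\"]"
)
print(parse_nginx_message(sample))
```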
// 17
Internal operation - splitting the line
● challenge: split the line into the parts above
● finding the left border of the message is easy: it is the first occurrence of "ModSecurity:"
● finding the right border: remember the falsified data - look for the last ' [file "' pattern in the line
○ a space, an opening bracket, the word 'file', a space, a quotation mark
○ later (e.g. in [data]) this pattern is in hex-encoded form: ' [file 22'
● the tail starts with the fixed ' [hostname "'...
● parsing the metadata part: find the next possible field (strict order!)
● trick: the left border of the next field gives the right border of the current one (step back 2 chars)
● but allow the search method to find the shortest possible pattern, e.g. ' [v' ([ver]), ' [t' ([tag]) and so on
● due to the strict order, the fields exclude each other; ' [v' is the shortened form of ' [ver "'
○ only one conflict is possible, with ' [m': [msg] and [maturity] - but [msg] comes before [data], so it will never be truncated
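The metadata walk can be sketched in Python as follows. This is a simplified illustration of the idea, not the library's C code: `opener`, `find_border` and `split_metadata` are hypothetical names, complete field openers are searched in strict order, and a shortened opener such as ' [v' is only accepted at the very end of the text. The input is assumed to be the part between the last ' [file "' and the tail.

```python
ORDER = ["file", "line", "id", "rev", "msg", "data", "severity",
         "ver", "maturity", "accuracy", "tag"]


def opener(name: str) -> str:
    return f' [{name} "'


def find_border(meta: str, start: int, candidates: list) -> int:
    """Left border of the next field = right border of the current one."""
    for cand in candidates:  # strict order: first existing later field wins
        pos = meta.find(opener(cand), start)
        if pos != -1:
            return pos
    # No complete opener left: the text may end in a cut-off one, e.g. ' [v'.
    pos = meta.rfind(" [", start)
    if pos != -1 and any(opener(c).startswith(meta[pos:]) for c in candidates):
        return pos
    return len(meta)


def split_metadata(meta: str) -> dict:
    result = {"fields": {}, "tags": [], "truncated": None}
    pos, idx = 0, 0
    while pos < len(meta) and idx < len(ORDER):
        name = ORDER[idx]
        if not meta.startswith(opener(name), pos):
            if opener(name).startswith(meta[pos:]):
                result["truncated"] = name  # only a shortened opener is left
                break
            idx += 1  # optional field is absent, try the next one
            continue
        start = pos + len(opener(name))
        later = ORDER[idx:] if name == "tag" else ORDER[idx + 1:]
        border = find_border(meta, start, later)
        value = meta[start:border]
        if value.endswith('"]'):
            value = value[:-2]  # step back 2 chars: the closing '"]'
        else:
            result["truncated"] = name  # the value itself was cut
        if name == "tag":
            result["tags"].append(value)  # [tag] may repeat
        else:
            result["fields"][name] = value
            idx += 1
        pos = border
    return result


print(split_metadata(' [data "some long text"] [severity "CRITICAL"] [v'))
print(split_metadata(' [ver "OWASP_CRS/3.3.2"] [tag "application-multi"] [tag "lang'))
```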
// 18
Internal operation - getting more information
Find the type of the message!
● "ModSecurity: Warning. ", "ModSecurity: Access Denied. ", "ModSecurity: Rule error", …
○ this is important information
○ some of them modify the line structure:
■ …message… [id "-"][file "/…"][line "345"] Execution error - PCRE limits exceeded (-8): (null)
■ the parser is able to recognize these lines too!
● Get the reason ("Pattern match"), operand and target - making exclusions easier
○ in case of Nginx, more information is available: operator, operand, target name, target value
● Show the truncated field - if any
● Mark all errors: truncated field, missing fields (e.g. disk was full and line truncated)
// 19
Example
Using the parser in a Python script:
● open the file given as argument
● set the type (can be an argument too)
● parse the lines
// 20
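import json
import sys

import mscpylogparser

# log type: Apache here; it could also come from an argument
lt = mscpylogparser.LOG_TYPE_APACHE
with open(sys.argv[1], "r") as fp:
    # parse each line and print the resulting structure as JSON
    for l in fp:
        r = mscpylogparser.parse(l, len(l), lt)
        print(json.dumps(r))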
Example - output
{
  "linelen": 614,
  "is_modsecline": 1,
  "is_broken": 0,
  "date_iso": "2023-02-08 17:06:31",
  "date_epoch": 1675872391.376939,
  "client": "216.244.66.246:57996",
  "modseclinetype": 1,
  "modsecmsg": "Warning. Operator GE matched 5 at TX:inbound_anomaly_score.",
  "modsecmsglen": 59,
  "modsecdenymsg": "",
  "modsecmsgreason": "Operator GE matched 5",
  "modsecmsgop": "",
  "modsecmsgoperand": "",
  "modsecmsgtrgname": "TX:inbound_anomaly_score",
  "modsecmsgtrgvalue": "",
  "ruleerror": "",
  "file": "/usr/share/modsecurity-crs/rules/RESPONSE-980-CORRELATION.conf",
  "line": "92",
  "id": "980130",
  "rev": "",
  "msg": "Inbound Anomaly Score Exceeded (Total Inbound Score: 5 - SQLI=0,...",
  "data": "",
  "severity": "",
  "version": "OWASP_CRS/3.3.4",
  "maturity": "",
  "accuracy": "",
  "tagcnt": 1,
  "tags": [
    "event-correlation"
  ],
  "hostname": "my.virtualserver.com",
  "uri": "/robots.txt",
  "unique_id": "Y-PIh32oaSsbx0Ag_pH6agAAAEE",
  "lineerrorcnt": 0,
  "lineerrors": [],
  "lineerrorspos": []
}
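Once the lines are turned into such structures, row filtering becomes simple dictionary work. A minimal sketch, assuming records shaped like the output above and assuming that `is_modsecline` and `is_broken` are 0/1 flags as they appear there; `count_rule_hits` and the sample records are made up for illustration.

```python
from collections import Counter


def count_rule_hits(records):
    """Count hits per rule id, skipping non-ModSecurity and broken lines."""
    hits = Counter()
    for rec in records:
        if rec.get("is_modsecline") != 1 or rec.get("is_broken") != 0:
            continue
        hits[rec.get("id", "")] += 1
    return hits


# Made-up records, reduced to the keys used by the filter.
records = [
    {"is_modsecline": 1, "is_broken": 0, "id": "980130"},
    {"is_modsecline": 1, "is_broken": 0, "id": "930130"},
    {"is_modsecline": 1, "is_broken": 1, "id": "980130"},
    {"is_modsecline": 0, "is_broken": 0, "id": ""},
    {"is_modsecline": 1, "is_broken": 0, "id": "980130"},
]
print(count_rule_hits(records).most_common())
```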
// 23
Future plans
● recognize more patterns in the message
● get more details from the existing types
○ eg. 'Operator GE matched 5 at TX:inbound_anomaly_score.'
■ 'GE' is the operator
■ 5 is the operand
● distinguish between unset and empty values
○ [ver ""] produces "" value instead of None/NULL
● more bindings (if necessary)
○ Perl
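How the planned extraction could look for the example message, as a rough Python sketch. It covers only the 'Operator X matched Y at Z.' form quoted above; the regex and `split_operator_message` are assumptions for illustration, not the planned implementation.

```python
import re

# Covers only messages of the form: Operator GE matched 5 at TX:name.
OPERATOR_MSG = re.compile(
    r"^Operator (?P<operator>[A-Z]+) matched (?P<operand>\S+) at (?P<target>\S+?)\.?$"
)


def split_operator_message(message: str) -> dict:
    """Return operator, operand and target, or {} for any other message form."""
    m = OPERATOR_MSG.match(message)
    return m.groupdict() if m else {}


print(split_operator_message("Operator GE matched 5 at TX:inbound_anomaly_score."))
```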
// 24
Thank you
www.digitalwave.hu
Ervin Hegedüs (airween@digitalwave.hu)

Call Girls In Mukherjee Nagar 📱 9999965857 🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
 

Parse ModSecurity Logs into Structured Data

  • 2. Content ● idea ● goals and motivations ● internal operation ● examples ● future plans // 2
  • 3. Idea ● we use ModSecurity on more and more servers, both Apache and Nginx ○ more very unpleasant FPs ● demand: process logs for a custom dashboard ○ both developers and admins can check the WAF results ● collect virtual host logs from the servers ○ side note: discover the server config with httpd_pyparser ○ collect information about server configuration, incl. log paths, CRS setup (PL), ModSecurity customization ● which log? ○ audit.log? ○ error.log? - this one was chosen: we also need the lines that are "only 'Warning'" ● how to transport the log(s) to the dashboard app? ○ on the fly - through a pipe (Apache) or through syslog (Nginx) ○ copy the logs and load them in place // 3
  • 4. First steps ● facing the problems: ○ log parts and their limitations - see later ○ truncated fields ○ falsified data - see later ○ differences between the engines ○ portability - what if we want several different applications? ○ performance - may become an issue only in extreme cases ● it's not as trivial as it seems // 4
  • 5. Falsify data Example: how to fill the log with false data curl -v -X POST -d ") [file "/dev/null"] [line "-2"] [id "3.1415"] [msg "Empty"] [data "[file "/dev/random"] [line "inf"]"] [severity "NORMAL"] =@attack"... will produce this funny line - a game: find the correct "file" field! ModSecurity: Warning. Matched phrase "pattern_from_attack" at ARGS_NAMES:) [file "/dev/null"] [line "-2"] [id "3.1415"] [msg "Empty"] [data "[file "/dev/random"] [line "inf"]"] [severity "NORMAL"] . [file "/usr.. Fortunately, this can't appear later in the line (eg. %{MATCHED_VAR} in [data]), because those fields are encoded. // 5
  • 6. Goals and motivations ● parse the lines of the logfiles into a structure ○ the structure contains as many details as possible ○ help with row filtering or making exclusions ■ eg. give access to the site developer, who can see the affected logs and mark the FPs ○ indicate if the line is wrong ● do it as fast as possible ● can be moved to other platforms (eg. from Python to PHP) ● with minimal dependencies (decrease the number of 3rd party libraries) // 6
  • 7. Internal operation ● libmsclogparser is written in C ○ therefore it can be used from other applications written in C, C++, Go or Rust ○ by default it has these bindings: Lua, PHP, Python, Ruby ○ available on GitHub: https://github.com/digitalwave/libmsclogparser, under the AGPL license ● no other libraries needed (eg. regex, glib) ● uses only standard string functions ○ the compiler can optimize these functions ● the main function is "parse()" ○ expects a line as a string, the length of the string, the type of the line (Apache or Nginx) and an empty structure ○ in the bindings this is easier: only the line and the type of the line are needed ○ the function fills the empty structure, the bindings return the structure ● a helper function: read_msclog_err() - gives a list of error messages and positions // 7
  • 8. Internal operation It was important to understand how a log entry is created by each of the two engines. Let's see how they work! ● logs are written through the web server ● the length of the log line (and thus its content) is limited ● the code writer decided which part could be truncated // 8
  • 9. Internal operation - parts of a log line Core parts (shown in bold on the slide) Apache: [Tue Feb 14 09:00:00.123456 2023] [security2:error] [pid 364323:tid 139847182132992] [client 216.244.66.246:57996] [client 216.244.66.246] ModSecurity:... Nginx: 2023/02/14 09:00:00 [info] 1419350#1419350: *12986 ModSecurity:... , client: 162.214.112.108, server: my.virtualserver.com, request: "GET /.env HTTP/1.1", host: "my.virtualserver.com" ● these parts are always present, each line contains them ○ therefore these are never truncated ● generated by the core logger - not the module! ● the rest is generated by the module // 9
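As an illustration of the Apache core parts above, here is a minimal Python sketch that pulls the timestamp, module, pid and client out of the server-generated prefix. The regular expression, the function name `parse_apache_core` and the returned keys are assumptions made for this example; libmsclogparser itself uses plain C string functions rather than regex.

```python
import re
from datetime import datetime

# Illustrative pattern for the Apache prefix shown on the slide.
# Not part of libmsclogparser, which does not use regex.
APACHE_CORE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] \[(?P<module>[^\]]+)\] "
    r"\[pid (?P<pid>\d+):tid (?P<tid>\d+)\] \[client (?P<client>[^\]]+)\] "
)

def parse_apache_core(line):
    """Return the server-generated prefix fields, or None if the prefix is absent."""
    m = APACHE_CORE.match(line)
    if m is None:
        return None
    # Apache error log timestamp: weekday, month, day, time with microseconds, year
    ts = datetime.strptime(m.group("ts"), "%a %b %d %H:%M:%S.%f %Y")
    return {
        "date_iso": ts.strftime("%Y-%m-%d %H:%M:%S"),
        "microsecond": ts.microsecond,
        "module": m.group("module"),
        "pid": int(m.group("pid")),
        "client": m.group("client"),
        "rest": line[m.end():],  # everything written by the module
    }

sample = ("[Tue Feb 14 09:00:00.123456 2023] [security2:error] "
          "[pid 364323:tid 139847182132992] [client 216.244.66.246:57996] "
          "[client 216.244.66.246] ModSecurity:...")
core = parse_apache_core(sample)
print(core["date_iso"], core["client"])
```

The `rest` value starts with the duplicated `[client ...]` field, which is where the module-generated part begins.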
  • 10. Internal operation - parts of a log line Module parts (with bold) - Apache [client 216.244.66.246:57996] [client 216.244.66.246] ModSecurity: Warning. Operator GE matched 5 at TX:inbound_anomaly_score. [file "/…/RESPONSE-980-CORRELATION.conf"] [line "92"] [id "980130"] [msg "Inbound Anomaly Score Exceeded (Total Inbound Score: 5 - SQLI=0,...,SESS=0): individual paranoia level scores: 5, 0, 0, 0"] [ver "OWASP_CRS/3.3.4"] [tag "event-correlation"] [hostname "my.virtualserver.com"] [uri "/robots.txt"] [unique_id "Y-PIh32oaSsbx0Ag_pH6agAAAEE"] ● second (bold) [client] is duplicated (without port) ● part with red is the "message". Without string "Warning. ", the max length can be 252 + " …" ● part with black is the metadata - see later ● part with brown is appended at the end of process, those can't be truncated // 10
  • 11. Internal operation - parts of a log line Metadata in module parts - Apache [file "/…/RESPONSE-980-CORRELATION.conf"] [line "92"] [id "980130"] [msg "Inbound Anomaly Score Exceeded (Total Inbound Score: 5 - SQLI=0,...,SESS=0): individual paranoia level scores: 5, 0, 0, 0"] [ver "OWASP_CRS/3.3.4"] [tag "event-correlation"] ● strict order: file, line, id, rev, msg, data, severity, ver, maturity, accuracy, tag ● fields are optional - if the rule has no value for a field, the field does not appear ● the [data] value can be at most 512 characters long; if it is longer, it is truncated and the '…"]' tail is appended ● [tag] can appear many times // 11
  • 12. Internal operation - parts of a log line Tail in module parts - Apache [hostname "my.virtualserver.com"] [uri "/robots.txt"] [unique_id "Y-PIh32oaSsbx0Ag_pH6agAAAEE"] These fields are always present in the log file as they are, regardless of their length. // 12
  • 13. Internal operation - parts of a log line Concatenate module parts - Apache ● core parts come from the server ● leading text: "[client a.b.c.d] ModSecurity: Warning." is always present ● then the message is added: "Operator GE matched…" ○ there can be many kinds of messages! ● then the metadata is added ● leading text, message and metadata together can be any length up to 1024 bytes ● after that, the tail part is added ● effect: there can be a truncated field, usually after the data ○ eg: [data "some long text …"] [severity "CRITICAL"] [v [hostname "my.virtualhost.com"] ○ eg: [ver "OWASP_CRS/3.3.2"] [tag "application-multi"] [tag "lang [hostname "my.virtualhost.com"] // 13
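The strict field order described above can be exploited roughly as follows. This is a minimal Python sketch, not the library's implementation: `FIELD_ORDER` comes from the slide, while `parse_metadata` and the shape of its result are assumptions for illustration. Each value is taken to end where the next field begins, and absent fields are simply left out of the result.

```python
# Field order as listed on the slide; [tag] may repeat.
FIELD_ORDER = ["file", "line", "id", "rev", "msg", "data",
               "severity", "ver", "maturity", "accuracy", "tag"]

def parse_metadata(meta):
    """Hypothetical helper: parse an untruncated Apache metadata string."""
    # First pass: locate the opener of every field, honouring the fixed order.
    found = []  # (name, start of opener, start of value)
    pos = 0
    for name in FIELD_ORDER:
        opener = "[" + name + ' "'
        while True:
            start = meta.find(opener, pos)
            if start == -1:
                break  # optional field is absent
            found.append((name, start, start + len(opener)))
            pos = start + len(opener)
            if name != "tag":
                break  # only [tag] can occur more than once
    # Second pass: a value ends where the next field starts (minus the '"]').
    result = {"tags": []}
    for i, (name, _, vstart) in enumerate(found):
        vend = found[i + 1][1] if i + 1 < len(found) else len(meta)
        value = meta[vstart:vend].rstrip()
        if value.endswith('"]'):
            value = value[:-2]
        if name == "tag":
            result["tags"].append(value)
        else:
            result[name] = value
    return result

meta = ('[file "/…/RESPONSE-980-CORRELATION.conf"] [line "92"] [id "980130"] '
        '[msg "Inbound Anomaly Score Exceeded (Total Inbound Score: 5 - '
        'SQLI=0,...,SESS=0): individual paranoia level scores: 5, 0, 0, 0"] '
        '[ver "OWASP_CRS/3.3.4"] [tag "event-correlation"]')
parsed = parse_metadata(meta)
print(parsed["id"], parsed["ver"], parsed["tags"])
```

Because missing fields are not stored at all, a caller can tell "field absent" apart from "field present but empty". The sketch does not handle truncated fields.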
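The truncation effect above can be detected with a simple check: whatever stands directly before the tail should end with a closed field. The following Python sketch assumes the input is the metadata plus tail of an Apache line; the function name `find_truncated_field` is hypothetical, and the sample strings are the two examples from the slide.

```python
def find_truncated_field(metadata_and_tail):
    """Return the cut-off fragment before the tail, or None if nothing is cut.

    Assumes the metadata is followed by the ' [hostname "' tail; raises
    ValueError when the tail is missing (e.g. the whole line was cut).
    """
    tail_at = metadata_and_tail.rfind(' [hostname "')
    if tail_at == -1:
        raise ValueError("tail not found")
    body = metadata_and_tail[:tail_at]
    if body.endswith('"]'):
        return None  # last metadata field is properly closed
    frag_at = body.rfind(" [")
    return body[frag_at + 1:] if frag_at != -1 else body

a = '[data "some long text …"] [severity "CRITICAL"] [v [hostname "my.virtualhost.com"]'
b = '[ver "OWASP_CRS/3.3.2"] [tag "application-multi"] [tag "lang [hostname "my.virtualhost.com"]'
c = '[ver "OWASP_CRS/3.3.4"] [tag "event-correlation"] [hostname "my.virtualserver.com"]'
print(find_truncated_field(a))
print(find_truncated_field(b))
print(find_truncated_field(c))
```

In the first sample the fragment is only `[v`, which is why the parser has to accept shortened openers when it walks the fields.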
  • 14. Internal operation - parts of a log line Module parts (with bold) - Nginx 12986 ModSecurity: Warning. Matched "Operator `PmFromFile' with parameter `restricted-files.data' against variable `REQUEST_FILENAME' (Value: `/.env' ) [file "/../REQUEST-930-APPLICATION-ATTACK-LFI.conf"] [line "106"] [id "930130"] [rev ""] [msg "Restricted File Access Attempt"] [data "Matched Data: /.env found within REQUEST_FILENAME: /.env"] [severity "2"] [ver "OWASP_CRS/3.3.4"] [maturity "0"] [accuracy "0"] [tag "application-multi"] [tag "language-multi"] [tag "platform-multi"] [tag "attack-lfi"] [tag "paranoia-level/1"] [tag "OWASP_CRS"] [tag "capec/1000/255/153/126"] [tag "PCI/6.5.4"] [hostname "my.virtualserver.com"] [uri "/.env"] [unique_id "167466178181.669649"] [ref "..."], client: … ● part with red is the "message" with a very strict content ● part with black is the metadata - see later ● part with brown is appended at the end of the process, it can't be truncated // 14
  • 15. Internal operation - parts of a log line Metadata in module parts - Nginx [file "/../REQUEST-930-APPLICATION-ATTACK-LFI.conf"] [line "106"] [id "930130"] [rev ""] [msg "Restricted File Access Attempt"] [data "Matched Data: /.env found within REQUEST_FILENAME: /.env"] [severity "2"] [ver "OWASP_CRS/3.3.4"] [maturity "0"] [accuracy "0"] [tag "application-multi"] [tag "language-multi"] [tag "platform-multi"] [tag "attack-lfi"] [tag "paranoia-level/1"] [tag "OWASP_CRS"] [tag "capec/1000/255/153/126"] [tag "PCI/6.5.4"] ● strict order: file, line, id, rev, msg, data, severity, ver, maturity, accuracy, tag ● fields are not optional (except [tag]) - if a value does not exist, the field is still there with an empty value or 0 ● the [data] value can be at most 200 characters long; if it is longer, it is truncated and the 'N characters omitted)' tail is appended ● [tag] can appear many times // 15
  • 16. Internal operation - parts of a log line Tail in module parts - Nginx [hostname "my.virtualserver.com"] [uri "/.env"] [unique_id "167466178181.669649"] [ref "..."] These fields are always present in the log file as they are; [uri] is limited to 200 characters, the length of the others does not matter. // 16
  • 17. Internal operation - parts of a log line Concatenate module parts - Nginx ● core parts come from the server ● leading text: "ModSecurity: Warning." is always present ○ only two types of messages, the other one is "Access denied" ● then the message is added: "Operator `OP' with parameter `PARAM' against variable `KEY' (Value: `VALUE' )" ● PARAM is limited to 200 characters, VALUE to 100 characters ● then the metadata is added ● leading text, message and metadata together can be any length up to 2048 bytes ● after that, the tail part is added ● no randomly truncated fields, all truncated parts are marked explicitly // 17
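Since Nginx marks a shortened [data] value explicitly, the marker can be tested for directly. The sketch below is an assumption-laden illustration: the slide only gives the tail as 'N characters omitted)', so the pattern matches exactly that ending, and the sample value with "57" is made up.

```python
import re

# Matches the explicit truncation tail named on the slide at the end of a value.
OMITTED = re.compile(r"(\d+) characters omitted\)$")

def omitted_chars(value):
    """Return how many characters were dropped, or 0 if the value is complete."""
    m = OMITTED.search(value)
    return int(m.group(1)) if m else 0

full = "Matched Data: /.env found within REQUEST_FILENAME: /.env"
cut = "Matched Data: aaaaaaaa (57 characters omitted)"  # made-up sample value
print(omitted_chars(full), omitted_chars(cut))
```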
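Because the Nginx message has such a fixed shape, its four parts can be pulled out with one pattern. This Python sketch uses a regular expression for brevity (libmsclogparser itself does not use regex); the names `NGINX_MSG` and `parse_nginx_message` and the group names are assumptions, and the sample is the message part of the line shown on slide 14.

```python
import re

# Shape taken from the slide:
# Operator `OP' with parameter `PARAM' against variable `KEY' (Value: `VALUE' )
NGINX_MSG = re.compile(
    r"Operator `(?P<operator>[^']*)' with parameter `(?P<operand>.*?)' "
    r"against variable `(?P<target>[^']*)' \(Value: `(?P<value>.*)' \)"
)

def parse_nginx_message(message):
    """Return operator, operand, target and value, or None if the shape differs."""
    m = NGINX_MSG.search(message)
    return m.groupdict() if m else None

message = ("ModSecurity: Warning. Matched \"Operator `PmFromFile' with parameter "
           "`restricted-files.data' against variable `REQUEST_FILENAME' "
           "(Value: `/.env' )")
parts = parse_nginx_message(message)
print(parts)
```

The pattern should be applied to the message part only, after the metadata has been split off, so that text inside [data] cannot confuse it.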
  • 18. Internal operation - splitting the line ● challenge: split the line into the parts above ● find the left border of the message, it is easy: it is the first occurrence of "ModSecurity:" ● find the right border: remember the falsified data - look for the last ' [file "' pattern in the line ○ a space, an opening bracket, the word 'file', a space, a quotation mark ○ later (eg. in [data]) this pattern is in hex-encoded form: ' [file 22' ● the tail starts with the fixed ' [hostname "'... ● parsing the metadata part: find the next possible field (strict order!) ● trick: the right border of the current field is the left border of the next one (step back 2 chars) ● but allow the search method to find the shortest possible pattern, eg: ' [v' ( [ver), ' [t' (tag) and so on ● due to the strict order, they exclude each other from occurring, ' [v' is the shortened form of ' [ver "' ○ only one conflict is possible, with ' [m': [msg] and [maturity] - but [msg] is before [data], so it won't ever be truncated // 18
  • 19. Internal operation - getting more information Find the type of message! ● "ModSecurity: Warning. ", "ModSecurity: Access Denied. ", "ModSecurity: Rule error", … ○ these are important pieces of information ○ some of them modify the line structure: ■ …message… [id "-"][file "/…"][line "345"] Execution error - PCRE limits exceeded (-8): (null) ■ the parser is able to recognize these lines too! ● Get the reason ("Pattern match"), operand and target - making exclusions easier ○ in case of Nginx, more information is available: operator, operand, target name, target value ● Show the truncated field - if any ● Mark all errors: truncated field, missing fields (eg. disk was full and the line was truncated) // 19
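The border search described above can be sketched in a few lines of Python with `str.find` and `str.rfind`. This is an illustration under assumptions, not the library's code: the function name `split_line` is made up, taking the last ' [hostname "' as the start of the tail is an assumption, and the sample line uses a hypothetical rule path, client address and unique id around the forged fields from slide 5.

```python
def split_line(line):
    """Split a log line into (message, metadata, tail); None if not a ModSecurity line."""
    left = line.find("ModSecurity:")            # left border: first occurrence
    if left == -1:
        return None
    tail_at = line.rfind(' [hostname "')        # tail start (assumed: last occurrence)
    if tail_at == -1:
        tail_at = len(line)                     # no tail, eg. a cut-off line
    # Right border of the message: the LAST ' [file "' before the tail,
    # so forged [file ...] fields inside the message are skipped.
    meta_at = line.rfind(' [file "', left, tail_at)
    if meta_at == -1:
        meta_at = tail_at                       # no metadata at all
    message = line[left:meta_at]
    metadata = line[meta_at:tail_at].strip()
    tail = line[tail_at:].strip()
    return message, metadata, tail

# Forged fields from the slide, followed by hypothetical real metadata and tail.
forged = (
    '[client 192.0.2.1] ModSecurity: Warning. Matched phrase "pattern_from_attack" '
    'at ARGS_NAMES:) [file "/dev/null"] [line "-2"] [id "3.1415"] . '
    '[file "/hypothetical/rules/example.conf"] [line "10"] [id "100000"] '
    '[hostname "example.com"] [uri "/"] [unique_id "hypothetical-id"]'
)
message, metadata, tail = split_line(forged)
print(metadata)
```

The forged `[file "/dev/null"]` stays inside the message, and only the fields written by the engine end up in the metadata part.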
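Finding the message type amounts to checking what follows "ModSecurity: ". A minimal Python sketch, assuming case-insensitive matching of the three prefixes named above; the string labels and the function name `message_type` are hypothetical and do not correspond to the numeric `modseclinetype` codes of the library.

```python
# Prefixes named on the slide, lower-cased; labels are illustrative only.
MESSAGE_TYPES = [
    ("warning. ", "warning"),
    ("access denied", "access_denied"),
    ("rule error", "rule_error"),
]

def message_type(line):
    """Return a label for the message type, or None if this is not a ModSecurity line."""
    marker = "ModSecurity: "
    at = line.find(marker)
    if at == -1:
        return None
    rest = line[at + len(marker):].lower()
    for prefix, label in MESSAGE_TYPES:
        if rest.startswith(prefix):
            return label
    return "other"

print(message_type("[client 192.0.2.1] ModSecurity: Warning. Operator GE matched 5"))
print(message_type("[client 192.0.2.1] ModSecurity: Access denied with code 403"))
```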
  • 20. Example Using the parser in a Python script: ● open the file given as argument ● set the type (can be an argument too) ● parse the lines // 20
        import json
        import sys
        import mscpylogparser

        # log type: Apache here; it could also come from an argument
        lt = mscpylogparser.LOG_TYPE_APACHE
        with open(sys.argv[1], "r") as fp:
            for l in fp:
                # parse one line and print the resulting structure as JSON
                r = mscpylogparser.parse(l, len(l), lt)
                print(json.dumps(r))
  • 21. Example - output { "linelen": 614, "is_modsecline": 1, "is_broken": 0, "date_iso": "2023-02-08 17:06:31", "date_epoch": 1675872391.376939, "client": "216.244.66.246:57996", "modseclinetype": 1, "modsecmsg": "Warning. Operator GE matched 5 at TX:inbound_anomaly_score.", "modsecmsglen": 59, "modsecdenymsg": "", "modsecmsgreason": "Operator GE matched 5", "modsecmsgop": "", "modsecmsgoperand": "", // 21
  • 22. Example - output "modsecmsgtrgname": "TX:inbound_anomaly_score", "modsecmsgtrgvalue": "", "ruleerror": "", "file": "/usr/share/modsecurity-crs/rules/RESPONSE-980-CORRELATION.conf", "line": "92", "id": "980130", "rev": "", "msg": "Inbound Anomaly Score Exceeded (Total Inbound Score: 5 - SQLI=0,...", "data": "", "severity": "", "version": "OWASP_CRS/3.3.4", "maturity": "", "accuracy": "", // 22
  • 23. Example - output "tagcnt": 1, "tags": [ "event-correlation" ], "hostname": "my.virtualserver.com", "uri": "/robots.tx", "unique_id": "Y-PIh32oaSsbx0Ag_pH6agAAAEE", "lineerrorcnt": 0, "lineerrors": [], "lineerrorspos": [] } // 23
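Once the lines are parsed into structures like the one above, filtering for a dashboard becomes ordinary dictionary work. The sketch below counts hits per rule id while skipping unusable lines. The records are hand-written with a subset of the keys from the output above and made-up values, and treating `is_broken` and `lineerrorcnt` as "skip this line" flags is an assumption about their meaning.

```python
from collections import Counter

# Hand-written sample records; keys follow the output shown above.
records = [
    {"is_modsecline": 1, "is_broken": 0, "id": "980130", "uri": "/robots.txt", "lineerrorcnt": 0},
    {"is_modsecline": 1, "is_broken": 0, "id": "930130", "uri": "/.env", "lineerrorcnt": 0},
    {"is_modsecline": 1, "is_broken": 1, "id": "930130", "uri": "/.env", "lineerrorcnt": 1},
    {"is_modsecline": 0, "is_broken": 0, "id": "", "uri": "", "lineerrorcnt": 0},
]

def usable(record):
    """Keep only ModSecurity lines that parsed without any reported error."""
    return (record["is_modsecline"] == 1
            and record["is_broken"] == 0
            and record["lineerrorcnt"] == 0)

hits_per_rule = Counter(r["id"] for r in records if usable(r))
print(hits_per_rule.most_common())
```

A list like this is a natural starting point for spotting rules that produce many false positives on a given site.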
  • 24. Future plans ● adding more pattern recognition from message ● getting more details from existing types ○ eg. 'Operator GE matched 5 at TX:inbound_anomaly_score.' ■ 'GE' is the operator ■ 5 is the operand ● distinguish between unset and empty values ○ [ver ""] produces "" value instead of None/NULL ● more bindings (if necessary) ○ Perl // 24
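The planned extraction of operator and operand from an Apache message could look roughly like this. It is only a Python sketch of the idea, covering the single message shape quoted above; the pattern, the function name `parse_apache_operator_message` and the group names are assumptions, and other message kinds would need their own patterns.

```python
import re

# Covers only messages shaped like:
# 'Operator GE matched 5 at TX:inbound_anomaly_score.'
OPERATOR_MSG = re.compile(
    r"^Operator (?P<operator>\S+) matched (?P<operand>\S+) at (?P<target>\S+?)\.?$"
)

def parse_apache_operator_message(message):
    """Return operator, operand and target, or None for other message shapes."""
    m = OPERATOR_MSG.match(message)
    return m.groupdict() if m else None

details = parse_apache_operator_message(
    "Operator GE matched 5 at TX:inbound_anomaly_score."
)
print(details)
```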