1
Log Analysis with GAWK Back to Basics
2
Who Am I? Brad Isbell 20 years in IT
Range Operations Lead / SimSpace Inc. Instructor / Sun Microsystems, DCITA, Stevenson University Contractor / DISA, DOL, NELO, UMUC, FEMA CyberPatriot Mentor MIS, OSCP, CISSP, CEH, Sec+, Linux+
3
Who are we?
4
Log Analysis with GAWK Why? I have Splunk/ELK
What happens when you don’t? Simple, Efficient, Common
5
What is GAWK? GNU AWK AWK: Aho, Weinberger, Kernighan
Data Driven Text Processing and Reporting Language Pattern Search + Action
6
GAWK Terminology Records and Fields Records: One Line
Fields: Records Contain Fields of Data
7
Well Formatted Data Logs are (usually) well formatted
How are records defined? Can you break the records into fields? What is/are the field separator(s)? How generically can you describe the data?
8
Well Formatted Data (Squid Proxy Logs)
|63731| |GET| /msdownload/update/v3/static/trustedr/en/disallowedcertstl.cab?8303c 1e235fc3944|HTTP/1.1|200|4622|237|4385|-|Microsoft-CryptoAPI/6.1
9
Well Formatted Data (Squid Proxy Logs)
Field groups highlighted in the record above, left to right: Client IP & Source Port, Timestamp, HTTP Verb & URL, Status Code & Bytes Transferred, Referrer & User Agent
14
Well Formatted Data? (tcpdump)
00:15: IP > : UDP, length 21
00:15: IP > : Flags [P.], seq 1:64, ack 63, win 11, options [nop,nop,TS val ecr ], length 63
00:15: IP > : Flags [.], ack 64, win , options [nop,nop,TS val ecr ], length 0
00:15: IP > : UDP, length 1350
00:50: ARP, Request who-has tell , length 28
19
Well Formatted Data? (tcpdump)
00:15: IP > : UDP, length 21
00:15: IP > : Flags [P.], seq 1:64, ack 63, win 11, options [nop,nop,TS val ecr ], length 63
00:15: IP > : Flags [.], ack 64, win , options [nop,nop,TS val ecr ], length 0
00:15: IP > : UDP, length 1350
00:50: ARP, Request who-has tell , length 28
Which fields are the source IP / destination IP?
21
Well Formatted Data? (tcpdump)
00:15: IP > : UDP,
Field Separator = [ .]
Fields 1 & 2: (Timestamp): 00:15:
Field 3: (Network Protocol): IP
Fields 4, 5, 6, 7: (Source IP):
Field 8: (Source Port): 443
22
Well Formatted Data? (tcpdump)
00:15: IP > : UDP,
Field Separator = [ .]
Fields 10, 11, 12, 13: (Destination IP):
Field 14: (Destination Port): :
Field 15: (Transport Protocol): UDP,
23
Well Formatted Data? (tcpdump)
00:15: IP > : UDP,
Field Separator = [ .:,]
Fields 1, 2, 3, 4: (Timestamp):
Field 5: (Network Protocol): IP
Fields 6, 7, 8, 9: (Source IP):
Field 10: (Source Port): 443
24
Well Formatted Data (tcpdump)
00:15: IP > : UDP,
Field Separator = [ .:,]
Fields 12, 13, 14, 15: (Destination IP):
Field 16: (Destination Port):
Field 18: (Transport Protocol): UDP
26
Well Formatted Data (tcpdump)
00:15: IP > : UDP,
Field Separator = [ .:,]+
Fields 12, 13, 14, 15: (Destination IP):
Field 16: (Destination Port):
Field 17: (Transport Protocol): UDP
27
Well Formatted Data? (tcpdump)
00:15: IP > : UDP, length 21
00:15: IP > : Flags [P.], seq 1:64, ack 63, win 11, options [nop,nop,TS val ecr ], length 63
00:50: ARP, Request who-has tell , length 28
With Field Separator = [ .:,]+
IF Field 5 == "IP" AND Field 17 == "UDP": UDP Packet
IF Field 5 == "IP" AND Field 17 == "Flags": TCP Packet
IF Field 5 == "ARP": ARP Packet
Notice: packet length is always the last field
28
Challenges in Describing Data
Web Server Access Logs
[16/May/2019:03:16: ] "GET /favicon.ico HTTP/1.1" "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:66.0) Gecko/ Firefox/66.0"
[16/May/2019:03:16: ] "GET /favicon.ico HTTP/1.1" "-" " "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/ (KHTML, like Gecko) Chrome/ Safari/537.36"
29
Challenges in Describing Data
Custom Logs
02/27/16 12:25:53 PM, ,root,chester
30
Challenges in Describing Data
Custom Logs
02/27/16 12:25:53 PM, ,root,chester
02/27/16 12:27:32 PM, ,root,mko0,lp-.;[=
31
Challenges in Describing Data
Custom Logs
02/27/16 12:25:53 PM, ,root,Y2hlc3Rlcg==
02/27/16 12:27:32 PM, ,root,bWtvMCxscC0uO1s9
32
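GAWK Syntax
$ gawk -F <field separator> 'SEARCH { ACTION }' FILE
Default field separator: runs of whitespace (roughly [ \t]+, with leading and trailing whitespace ignored)
SEARCH: Which Records (regex pattern; gawk uses POSIX extended regular expressions, not PCRE)
ACTION: What To Do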
33
Fields and Records
Fields are stored in variables: $1, $2, $3 …
$0 = The entire record
NF = Total Number of Fields
$NF = The Last Field
NR = Record Number
34
Demonstration
gawk '{print $0}' contacts.txt
gawk '{print NF}' contacts.txt
gawk '{print $NF}' contacts.txt
gawk '{print NF}' pcap.txt
gawk '{print $NF}' pcap.txt
35
Search Syntax
/REGEX/ : Search for pattern anywhere in the record
$1 == "STRING" : First field exactly equals the string
$1 != "STRING" : First field does not exactly equal the string
$1 ~ /REGEX/ : First field matches a regex
BEGIN : Perform Action Before Reading Records
END : Perform Action After Reading Records
36
Demonstration
gawk '/Amelia/ {print $0}' contacts.txt
gawk '/A/ {print $0}' contacts.txt
gawk '$4 == "A" {print $0}' contacts.txt
gawk '$1 ~ /^A/ {print $3}' contacts.txt
gawk '/gmail\.com/ {print $1}' contacts.txt
gawk -F'[ :,.]+' '$17 == "UDP" {print $NF}' pcap.txt
37
Variables
Undeclared (created on first use; an unset variable acts as an empty string or 0)
Names: Letters, Digits, Underscore
Cannot Begin with Digit
38
Demonstration
gawk -F'[ ,.:]+' '{print $17}' pcap.txt
gawk -F'[ ,.:]+' '$17 == "UDP" {UDP = UDP + $NF} END {print UDP}' pcap.txt
gawk -F'[ ,.:]+' '$17 == "UDP" {UDP += $NF} $17 == "Flags" {TCP += $NF} END {print "UDP: " UDP "\nTCP: " TCP}' pcap.txt
gawk -f pcap-1.gawk pcap.txt
39
Arrays
proto["UDP"] = 1371
proto["TCP"] = 63
for (p in proto) print p ": " proto[p]
40
Demonstration
pcap-2.gawk
BEGIN { FS = "[ ,.:]+" }
$17 == "UDP" { proto["UDP"] += $NF }
$17 == "Flags" { proto["TCP"] += $NF }
END { for (p in proto) print p": " proto[p] }
41
printf
printf "%s\n", $1
printf "%-15s%s\n", $1, $3
%s : string
%15s : string right justified in 15 characters
%-15s : string left justified in 15 characters
%d : decimal integer
%f : float
%10.2f : float right justified in 10 characters, 2 places after the decimal point
42
Additional Functions
Convert Epoch Timestamp to Human Readable
strftime("%c", TIMESTAMP)
Regex Substitution
gensub(SEARCH, REPLACE, INSTANCE, INPUT)
43
Demonstration
head access.log | gawk -F'|' '{print $3}'
head access.log | gawk -F'|' '{print strftime("%c",$3)}'
head access.log | gawk -F'|' '{print gensub("https?://([^/]+)/.*","\\1",1,$5)}'
QUESTIONS:
Which client is generating the most traffic?
gawk -F'|' '{ LENGTH[$1] += $9 } END { for (IP in LENGTH) printf "%-15s : %d\n", IP, LENGTH[IP] }' access.log
Which website is that IP going to? -> squid-1.gawk