Log Analysis with GAWK Back to Basics.

1 Log Analysis with GAWK Back to Basics

2 Who Am I? Brad Isbell 20 years in IT
Range Operations Lead / SimSpace Inc. Instructor / Sun Microsystems, DCITA, Stevenson University Contractor / DISA, DOL, NELO, UMUC, FEMA CyberPatriot Mentor MIS, OSCP, CISSP, CEH, Sec+, Linux+

3 Who are we?

4 Log Analysis with GAWK Why? I have Splunk/ELK
What happens when you don’t? Simple, Efficient, Common

5 What is GAWK? GNU AWK AWK: Aho, Weinberger, Kernighan
Data Driven Text Processing and Reporting Language Pattern Search + Action

6 GAWK Terminology Records and Fields Records: One Line
Fields: Records Contain Fields of Data

7 Well Formatted Data Logs are (usually) well formatted
How are records defined? Can you break the records into fields? What is/are the field separator(s)? How generically can you describe the data?

8-13 Well Formatted Data (Squid Proxy Logs)
|63731| |GET| /msdownload/update/v3/static/trustedr/en/disallowedcertstl.cab?8303c 1e235fc3944|HTTP/1.1|200|4622|237|4385|-|Microsoft-CryptoAPI/6.1
Fields highlighted in turn: Client IP & Source Port; Timestamp; HTTP Verb & URL; Status Code & Bytes Transferred; Referrer & User Agent

14-20 Well Formatted Data? (tcpdump)
00:15: IP > : UDP, length 21
00:15: IP > : Flags [P.], seq 1:64, ack 63, win 11, options [nop,nop,TS val ecr ], length 63
00:15: IP > : Flags [.], ack 64, win , options [nop,nop,TS val ecr ], length 0
00:15: IP > : UDP, length 1350
00:50: ARP, Request who-has tell , length 28
Which fields are the source IP / destination IP?

21 Well Formatted Data? (tcpdump)
00:15: IP > : UDP,
Field Separator = [ .]
Fields 1 & 2 (Timestamp): 00:15:
Field 3 (Network Protocol): IP
Fields 4, 5, 6, 7 (Source IP):
Field 8 (Source Port): 443

22 Well Formatted Data? (tcpdump)
00:15: IP > : UDP,
Field Separator = [ .]
Fields 10, 11, 12, 13 (Destination IP):
Field 14 (Destination Port): :
Field 15 (Transport Protocol): UDP,

23 Well Formatted Data? (tcpdump)
00:15: IP > : UDP,
Field Separator = [ .:,]
Fields 1, 2, 3, 4 (Timestamp):
Field 5 (Network Protocol): IP
Fields 6, 7, 8, 9 (Source IP):
Field 10 (Source Port): 443

24-25 Well Formatted Data (tcpdump)
00:15: IP > : UDP,
Field Separator = [ .:,]
Fields 12, 13, 14, 15 (Destination IP):
Field 16 (Destination Port):
Field 18 (Transport Protocol): UDP

26 Well Formatted Data (tcpdump)
00:15: IP > : UDP,
Field Separator = [ .:,]+
Fields 12, 13, 14, 15 (Destination IP):
Field 16 (Destination Port):
Field 17 (Transport Protocol): UDP

27 Well Formatted Data? (tcpdump)
00:15: IP > : UDP, length 21
00:15: IP > : Flags [P.], seq 1:64, ack 63, win 11, options [nop,nop,TS val ecr ], length 63
00:50: ARP, Request who-has tell , length 28
With Field Separator = [ .:,]+ :
IF Field 5 == "IP" AND Field 17 == "UDP": UDP Packet
IF Field 5 == "IP" AND Field 17 == "Flags": TCP Packet
IF Field 5 == "ARP": ARP Packet
Notice: packet length is always the last field

28 Challenges in Describing Data
Web Server Access Logs
[16/May/2019:03:16: ] "GET /favicon.ico HTTP/1.1" "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:66.0) Gecko/ Firefox/66.0"
[16/May/2019:03:16: ] "GET /favicon.ico HTTP/1.1" "-" " "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/ (KHTML, like Gecko) Chrome/ Safari/537.36"

29 Challenges in Describing Data
Custom Logs
02/27/16 12:25:53 PM, ,root,chester

30 Challenges in Describing Data
Custom Logs
02/27/16 12:25:53 PM, ,root,chester
02/27/16 12:27:32 PM, ,root,mko0,lp-.;[=

31 Challenges in Describing Data
Custom Logs
02/27/16 12:25:53 PM, ,root,Y2hlc3Rlcg==
02/27/16 12:27:32 PM, ,root,bWtvMCxscC0uO1s9

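32 GAWK Syntax
$ gawk -F <field separator> 'SEARCH { ACTION }' FILE
Default field separator: runs of whitespace (roughly [ \t\n]+, with leading and trailing whitespace ignored)
SEARCH: Which Records (Regex Pattern; gawk uses POSIX extended regular expressions with GNU extensions, not PCRE)
ACTION: What To Do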

33 Fields and Records
Fields are stored in variables: $1, $2, $3 …
$0 = The entire record
NF = Total Number of Fields
$NF = The Last Field
NR = Record Number

34 Demonstration
gawk '{print $0}' contacts.txt
gawk '{print NF}' contacts.txt
gawk '{print $NF}' contacts.txt
gawk '{print NF}' pcap.txt
gawk '{print $NF}' pcap.txt

35 Search Syntax
/REGEX/ : Search for pattern in record
$1 == "PATTERN" : First field is an exact string match
$1 != "PATTERN" : First field does not exactly match the string
$1 ~ /REGEX/ : First field matches a regex
BEGIN : Perform Action Before Reading Records
END : Perform Action After Reading Records

36 Demonstration
gawk '/Amelia/ {print $0}' contacts.txt
gawk '/A/ {print $0}' contacts.txt
gawk '$4 == "A" {print $0}' contacts.txt
gawk '$1 ~ /^A/ {print $3}' contacts.txt
gawk '/gmail\.com/ {print $1}' contacts.txt
gawk -F'[ :,.]+' '$17 == "UDP" {print $NF}' pcap.txt

37 Variables
Undeclared: created on first use, starting out as the empty string (0 in numeric context)
Names: Letters, Digits, Underscore
Cannot Begin with Digit

38 Demonstration
gawk -F'[ ,.:]+' '{print $17}' pcap.txt
gawk -F'[ ,.:]+' '$17 == "UDP" {UDP = UDP + $NF} END {print UDP}' pcap.txt
gawk -F'[ ,.:]+' '$17 == "UDP" {UDP += $NF} $17 == "Flags" {TCP += $NF} END {print "UDP: " UDP "\nTCP: " TCP}' pcap.txt
gawk -f pcap-1.gawk pcap.txt

39 Arrays
proto["UDP"] = 1371
proto["TCP"] = 63
for (p in proto) print p ": " proto[p]

40 Demonstration
pcap-2.gawk
BEGIN { FS = "[ ,.:]+" }
$17 == "UDP" { proto["UDP"] += $NF }
$17 == "Flags" { proto["TCP"] += $NF }
END { for (p in proto) print p": " proto[p] }

41 printf
printf "%s\n", $1
printf "%-15s%s\n", $1, $3
%s : string
%15s : string right-justified in a 15-character field
%-15s : string left-justified in a 15-character field
%d : decimal integer
%f : float
%10.2f : float right-justified in a 10-character field, 2 places after the decimal point

42 Additional Functions
Convert Epoch Timestamp to Human Readable: strftime("%c", TIMESTAMP)
Regex Substitution: gensub(SEARCH, REPLACE, INSTANCE, INPUT)

43 Demonstration
head access.log | gawk -F'|' '{print $3}'
head access.log | gawk -F'|' '{print strftime("%c",$3)}'
head access.log | gawk -F'|' '{print gensub("https?://([^/]+)/.*","\\1",1,$5)}'
QUESTIONS:
Which client is generating the most traffic?
gawk -F'|' '{ LENGTH[$1] += $9 } END { for (IP in LENGTH) printf "%-15s : %d\n", IP, LENGTH[IP] }' access.log
Which website is that IP going to? -> squid-1.gawk

