DISA Cyclops Program
CENTAUR Background
- DISA has collected unsampled flow data from every Internet access point for over 15 years through a program called CENTAUR
- DISA advertises routes for roughly 8% of the Internet’s IPv4 address space, with CenturyLink as their ISP
- CENTAUR is SiLK-based and has thousands of users globally
2017 Goals
- Collect unsampled network flow data from the DoD’s ISP infrastructure via “Data as a Service” (DaaS)
- Focus on visibility into all inbound traffic to harvest threat intelligence
  - Scanning Activity
  - Backscatter
- Acquire transport for flow data through the ISP
- Reduce cost
- Improve scalability and adaptability of metadata collection
- Test the “Security as a Service” model
Solution: “Cyclops”
- 200 Mb/s dedicated backhaul network from each IAP
  - Each site can handle maximum theoretical load at the IAP
  - Rapidly expandable by at least 8x
- Flow data and additional metadata
  - SiLK (CENTAUR) and Argus (NGS) compatible data
  - HTTP Headers
  - DNS Data
  - Client & Server Banners
  - SSL & TLS Certificate collection
  - Indicators & Warnings
- PCAP if required (it is not at this time)
“Core Values”
- Tool, data format, and vendor agnostic
- Requirements are for data and capabilities
- Data additions, additional processes, and changes take less than 60 days worst case
- Provides the data feed and streaming analysis of data, not the forensic data store
Cyclops – Concept
- Leveraged DHS IPSS & Einstein experience
Concept:
- Provide managed security service within ISP domain
- Transfer acquisition burden from DoD to ISP
- Leverage commercial data centers – only use what’s needed
- Utilize commercial ISP backbone to deliver data
- Establish foundation to expand “security and data as a service” model
- Maintain existing Internet availability SLAs
- Provide indications & warnings to support threat intelligence
Cyclops Capabilities
- Dedicated DoD commercial hosting space for each Internet point
- Open Sensor Platform
Infrastructure:
- Exceed performance requirements for multiple 10Gbps connections
- Designed with fault tolerance
- Rapid expansion without re-engineering
- Vendor agnostic and requirements based
- Copy traffic to RedJack sensors
Capabilities:
- Dynamically load balance both active and passive capabilities
- Filter/drop traffic and still generate flows
- Tag/Label packets w/ context to sensors
“As-a-Service”
Challenge: Scalability & Performance
  Solution: Transition the scalability and performance risk via a Single Provider SLA
  Outcome: Technologies become scalable with bandwidth increases
Challenge: Funding
  Solution: Pay by the Glass Managed Service; utilize saved resources to advance state-of-the-art
  Outcome: Predictable O&M based cost with overall cost reductions; more resources directly address advanced threats
Challenge: Keeping Pace with “State of the Art”
  Solution: Requirements Driven Commercially Agile Solution
  Outcome: DISA can focus on how to use the data, not how to get it
“General” Security Traffic Flow
[Diagram. Labels: ISP (“?”, “Dropping traffic?”), FIREWALL, Logs for SIEM, “Next Gen”, Flow Data, Web Filter, Enterprise]
Consequences
- Flow data is “post-filtering” for a number of practical reasons
- Records of blocks and filtering typically land “somewhere else”
- We don’t have ISP insights at all
- Leveraging Netflow from routers means sampling
- What we really want is all our Internet communications to understand the threat
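Why sampling matters for scan visibility can be sketched with a little arithmetic. The function below is an illustration only, assuming independent random 1-in-N packet sampling (real router sampling schemes vary); the name detection_probability and its parameters are hypothetical and not part of any tool named in these slides.

```python
def detection_probability(packets: int, sampling_rate: int) -> float:
    """Chance that at least one packet of a flow is sampled.

    Assumes each packet is sampled independently with probability
    1/sampling_rate (an idealized model of 1-in-N sampling).
    """
    return 1.0 - (1.0 - 1.0 / sampling_rate) ** packets


if __name__ == "__main__":
    # A single-packet scan probe versus a longer flow, at 1-in-1000 sampling.
    for pkts in (1, 10, 5000):
        print(pkts, detection_probability(pkts, 1000))
```

Under this model a one-packet probe is recorded only about once per N attempts, which is why sampled Netflow tends to under-represent scanning and backscatter.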
“General” Security Traffic Management
[Diagram. Labels: ISP, ISP Insights, FIREWALL, Logs for SIEM, “Next Gen”, Flow Data, Web Filter, Enterprise]
Consequences
- We see flow data from the Internet, before we filter it, with any ISP “insights”
- We see all the data we send to the Internet after we filter it
- We have what we need on the inbound feed to understand the entire threat
- And we have 100X more data now…
What the flow data looks like
[Figure. Labels: Routed, Scans, “Legitimate” Traffic, “Distributed Scanning”]
- Scans: Vast, unanswered requests from individual hosts
- Distributed Scanning: Vast, unanswered requests from roughly as many hosts as there are requests
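The two definitions above can be turned into a rough classifier. This is a minimal sketch, not how Cyclops does it: the Flow record, the "answered" flag, the per-port grouping for distributed scans, and both thresholds (min_unanswered, host_ratio) are assumptions made for illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class Flow(NamedTuple):
    src: str
    dst: str
    dport: int
    answered: bool  # True if a reply flow was observed (assumed field)


def scanning_sources(flows, min_unanswered=100):
    """Scans: individual hosts with many unanswered requests."""
    counts = defaultdict(int)
    for f in flows:
        if not f.answered:
            counts[f.src] += 1
    return {src for src, n in counts.items() if n >= min_unanswered}


def distributed_scan_ports(flows, min_unanswered=100, host_ratio=0.9):
    """Distributed scanning: many unanswered requests to a port, coming
    from roughly as many distinct hosts as there are requests."""
    requests = defaultdict(int)
    sources = defaultdict(set)
    for f in flows:
        if not f.answered:
            requests[f.dport] += 1
            sources[f.dport].add(f.src)
    return {
        port
        for port, n in requests.items()
        if n >= min_unanswered and len(sources[port]) / n >= host_ratio
    }
```

Grouping by destination port is only one possible key for the distributed case; grouping by time window or by probe payload would be equally plausible.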
What are they looking for? *
Technical Challenge: Scaling and Adapting
- Scans must be summarized!
  - You could see more data from a single scan than you see in a month!
  - The effect of including all scan flows may be an overall reduction
- DNS context is absolutely essential
  - “amazonaws.com” could be anyone!
- Support bandwidth expansion with minimal impact
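One way to picture scan summarization is to collapse every flow from a scanning source into a single record per source and port. The sketch below assumes a hypothetical input tuple of (src, dst, dport, packets, timestamp) and a hypothetical ScanSummary record; the slides do not specify the actual summary fields.

```python
from dataclasses import dataclass, field


@dataclass
class ScanSummary:
    src: str
    dport: int
    flows: int = 0
    packets: int = 0
    targets: set = field(default_factory=set)
    first_seen: float = float("inf")
    last_seen: float = float("-inf")


def summarize(scan_flows):
    """Collapse (src, dst, dport, packets, timestamp) tuples into one
    summary record per (src, dport) pair."""
    summaries = {}
    for src, dst, dport, packets, ts in scan_flows:
        s = summaries.setdefault((src, dport), ScanSummary(src, dport))
        s.flows += 1
        s.packets += packets
        s.targets.add(dst)
        s.first_seen = min(s.first_seen, ts)
        s.last_seen = max(s.last_seen, ts)
    return list(summaries.values())
```

With this shape, a scan that touches many destinations produces one stored record instead of one record per probe, while still keeping the count of flows and distinct targets.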
Technical Challenge: Compatibility
- We support:
  - SiLK
  - Argus*
  - Data serialization formats
- This is a lot of data to be transforming…
  - Collect it in whatever format you’ll be storing it
Technical Challenge: Malicious Activity
- Scan records are SiLK flows with the destination addresses masked to a /16
- Added bidirectional tagging to SiLK
- We have not yet determined the optimal way to deal with “distributed scans” and one of the reasons we are here is for ideas
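Masking destination addresses to a /16 can be illustrated with Python's standard ipaddress module. This is a sketch of the masking idea only, not SiLK's API or the Cyclops record layout; the helper names mask_to_slash16 and scan_records, and the (src, dst) input pairs, are hypothetical.

```python
import ipaddress
from collections import Counter


def mask_to_slash16(addr: str) -> str:
    """Return the /16 network address that contains an IPv4 address."""
    net = ipaddress.ip_network(f"{addr}/16", strict=False)
    return str(net.network_address)


def scan_records(flows):
    """Count flows per (source, destination /16) pair.

    `flows` is an iterable of (src, dst) address strings.
    """
    return Counter((src, mask_to_slash16(dst)) for src, dst in flows)


if __name__ == "__main__":
    probes = [
        ("198.51.100.7", "10.20.30.40"),
        ("198.51.100.7", "10.20.99.1"),
        ("198.51.100.7", "10.21.0.5"),
    ]
    for key, count in scan_records(probes).items():
        print(key, count)
```

Keying on the masked destination means a sweep across one /16 collapses to a single record per source, at the cost of losing which individual hosts were probed.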