NOC Tools
  Donal O’Cearbhaill
  HEAnet Ltd.
Ireland’s National Education and Research Network
  Provides Internet services to Irish Universities
  Broadband for Schools
Free ‘always on’ broadband connectivity to Schools
  3 Year Agreement
    – Dept of Education/Dept of Communication/TIF
  3,925+ Schools
  7 Access Providers
  HEAnet backbone network
  Onward connectivity to Internet & Educational Networks
  HEAnet Managed Services: Network; Security; Broadband for Schools
Challenges
  4,000 schools
  Highly contended links
  A lot of satellite connections
  SLA/Contract enforcement
Installation Rate
Monitoring/ISP Infrastructure
  28 Debian/Ubuntu servers
  4 Fibrenetix disk arrays
    – Disk based backup
    – rsync & application level dumps
    – Syslog nodes
  PostgreSQL database
  Aggregation Routers
    – 7301
    – PPPoE
    – GRE
  Border/Services Routers
    – 6500, 3750
Tools
  SmokePing
  Nagios
  Rancid
  Cacti
  Netflow
SmokePing
  Latency measurement tool
  Runs probes in parallel
  >3,800 hosts
  RRD backend
    – Reporting
  Historical view
  Acceptance testing
  Tuning
    – FPing timeouts decreased
    – Total number of probes reduced
    – Satellite frequency reduced
Nagios
  4,131 services on 3,905 hosts
  Top 5 by number of hosts on nagios.org
  Populated by SmokePing and memcache
    – Nagios runs checks serially
    – >1 hour vs. 15 mins
  Nagios populates
    – Sidebar alarms
    – Schools Up Graph
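The slide says Nagios is populated from SmokePing and memcache rather than running its own checks serially, but it does not say how the results are handed over. One common way is Nagios's passive check interface, so the following is a minimal sketch under that assumption. Python stands in for the Perl scripts mentioned elsewhere in the deck; the host name, service name, and loss thresholds are hypothetical.

```python
import time

# Standard Nagios plugin return codes.
OK, WARNING, CRITICAL = 0, 1, 2


def classify_loss(loss_pct, warn=20.0, crit=100.0):
    """Map a packet-loss percentage to a Nagios state.

    The thresholds are hypothetical, not taken from the slides.
    """
    if loss_pct >= crit:
        return CRITICAL
    if loss_pct >= warn:
        return WARNING
    return OK


def passive_result(host, service, loss_pct, now=None):
    """Format one PROCESS_SERVICE_CHECK_RESULT external command line."""
    ts = int(time.time() if now is None else now)
    state = classify_loss(loss_pct)
    output = f"packet loss {loss_pct:.0f}%"
    return f"[{ts}] PROCESS_SERVICE_CHECK_RESULT;{host};{service};{state};{output}"
```

Each returned line would be written to the Nagios external command file, so Nagios only has to ingest results instead of probing every school itself.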
Rancid
  Really Awesome New Cisco confIg Differ
  3,296 Router configs
  Maintains history of changes
    – Mails changes
Cacti
  3,900 hosts
  Data gathering
    – SNMP
    – External Perl scripts
  Graph templating
  Database driven
  Cricket: 27 mins
    – Perl
  Cacti: <5 mins
    – Cactid
    – Custom multithreaded C application
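The drop from 27 minutes to under 5 is attributed to a multithreaded poller. The idea can be illustrated with a small thread pool: because each poll spends most of its time waiting on the network, running many at once shortens the total cycle. This is only a sketch of the technique, not Cactid's implementation; `poll_host` is a hypothetical placeholder that performs no SNMP at all.

```python
from concurrent.futures import ThreadPoolExecutor


def poll_host(host):
    """Hypothetical stand-in for one SNMP poll; returns (host, value).

    A real poller would query the device here. This placeholder does no
    network I/O and simply returns the length of the host name.
    """
    return host, len(host)


def poll_all(hosts, poll=poll_host, workers=32):
    """Poll every host concurrently and return {host: value}."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(poll, hosts))
```

The `workers` value is an arbitrary illustration; in practice it would be tuned against link contention and device response times.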
Cacti Weathermap
Interconnects
Netflow
  NfSen is a graphical web based front end for the nfdump netflow tools
  Query abuse reports
  Usage reporting
Reporting
  Daily Reports
  DNS log reporting
  Report infected PCs
    – Top MX lookups
    – Misconfigurations
    – Active Directory
  Netflow
    – IPs
    – Schools usage

  Gigabytes downloaded by schools on 22/03/07: 332
  Gigabytes uploaded by schools on 22/03/07: 48
  Total MegaBytes downloaded for Digiweb Satellite:
  Total MegaBytes uploaded for Digiweb Satellite: 1202
  Total MegaBytes downloaded for Digiweb Wireless:
  Total MegaBytes uploaded for Digiweb Wireless:
  Total MegaBytes downloaded for ESATBT ADSL:
  Total MegaBytes uploaded for ESATBT ADSL: 6632
  Total MegaBytes downloaded for HSData Wireless: 3047
  Total MegaBytes uploaded for HSData Wireless: 575
  …..
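A per-provider usage report like the one above boils down to summing flow byte counts by the access provider of each school. The sketch below shows that aggregation under stated assumptions: the record layout `(school, direction, bytes)`, the school-to-provider mapping, and the use of 1 MB = 1,000,000 bytes are all hypothetical, and it does not read real nfdump output.

```python
from collections import defaultdict


def usage_by_provider(flows, school_provider):
    """Sum bytes per provider and direction, reported in whole megabytes.

    flows: iterable of (school_id, "down" or "up", byte_count) tuples.
    school_provider: dict mapping school_id to access provider name.
    Schools missing from the mapping are counted under "unknown".
    """
    totals = defaultdict(lambda: {"down": 0, "up": 0})
    for school, direction, nbytes in flows:
        provider = school_provider.get(school, "unknown")
        totals[provider][direction] += nbytes
    return {
        provider: {d: b // 1_000_000 for d, b in dirs.items()}
        for provider, dirs in totals.items()
    }
```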
Logging
  Syslog server per PoP
    – Servers
    – Routers
  Logcheck
    – Logfile scanner
  IP to school identifier
    – Mapping IP to school
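Mapping an IP address seen in a log line back to a school is a longest-prefix lookup against the allocated ranges. A minimal sketch follows, assuming the allocations are available as prefix-to-school pairs. In the deck this data lives in PostgreSQL; the prefixes and school names here are hypothetical documentation ranges.

```python
import ipaddress


def build_index(prefix_to_school):
    """Turn {"192.0.2.0/28": "school-A", ...} into (network, school) pairs."""
    return [(ipaddress.ip_network(p), s) for p, s in prefix_to_school.items()]


def school_for_ip(index, ip):
    """Return the school owning the most specific matching prefix, or None."""
    addr = ipaddress.ip_address(ip)
    best = None
    for net, school in index:
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, school)
    return best[1] if best else None
```

A linear scan is adequate for a few thousand prefixes; a database with a native network type can answer the same containment question directly.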
Server Monitoring
  SSH keys
    – Sharing keys/fingerprints
    – High overhead
  SNMP
    – Less configurable
  Memcache
    – Local Perl script
    – Easy to roll out
    – Load
    – Disk Space
    – Monitor Processes
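The memcache approach has each server run a small local script that gathers its own load and disk figures and publishes them under a well-known key. The sketch below illustrates that pattern in Python rather than Perl. The key prefix and field names are hypothetical, and a plain dict stands in for the memcache client so the example stays self-contained.

```python
import os
import shutil
import socket
import time


def collect_status(path="/"):
    """Gather 1-minute load average and disk usage for this host (Unix only)."""
    load1, _, _ = os.getloadavg()
    usage = shutil.disk_usage(path)
    return {
        "host": socket.gethostname(),
        "time": int(time.time()),
        "load1": load1,
        "disk_used_pct": round(100 * usage.used / usage.total, 1),
    }


def publish(cache, status, prefix="server-status:"):
    """Store the status under a per-host key; `cache` is any dict-like object.

    A real deployment would use a memcache client here; the dict-style
    assignment is an assumption made for illustration.
    """
    key = prefix + status["host"]
    cache[key] = status
    return key
```

Run from cron on every server, a script like this needs no inbound access and no shared SSH keys, which matches the "easy to roll out" point above.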
Memcache
  Distributed memory caching system
  Low overhead
  Speeds up dynamic database-driven websites by caching data and objects in memory
  Developed for LiveJournal
    – Slashdot
    – Wikipedia
    – SourceForge
  Schools
    – Nagios
    – Maps
    – Server status
Subversion
  Modern replacement for CVS
  Provisioning System
    – Configs
  ViewCVS
  Checkins get mailed
  Schools-noc
    – Scripts stored on every server
    – Automatically updated
    – cron.d
Sidebar
  Nagios polled every minute
  Populated into memcache
  Sidebar alarms
  Pubcookie single sign-on
Provisioning System
  Services provisioned
    – CPE router config
    – Nagios
    – RADIUS
    – Cacti
    – Cisco ACS (TACACS+)
    – SmokePing
    – Fortigate (Content filtering)
    – Maps
    – DNS
    – Webhosting
Provisioning System
  Text::Template templating system
  Data stored in authoritative database
  PostgreSQL’s INET type is brilliant!
  Perl scripts generate configlets
  Added to Subversion
  Perl/Shell provisioning agents handle service restarts etc.
  Ability to stop all provisioning
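The configlet step is a template filled in from one row of the authoritative database. The deck uses Perl's Text::Template; the sketch below shows the same idea with Python's `string.Template` as a stand-in. The template text and the field names `hostname` and `loopback` are hypothetical, not the actual CPE configuration.

```python
from string import Template

# Hypothetical CPE router configlet; the real templates are not shown in the slides.
CPE_TEMPLATE = Template(
    "hostname $hostname\n"
    "interface Loopback0\n"
    " ip address $loopback 255.255.255.255\n"
)


def render_configlet(row):
    """Render one configlet from a database row given as a dict.

    substitute() raises KeyError if a field is missing, so an incomplete
    row fails loudly instead of producing a partial config.
    """
    return CPE_TEMPLATE.substitute(row)
```

The rendered text would then be committed to Subversion, where the provisioning agents pick it up.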
Provisioning System Structure
Google Maps
Random things we’ve encountered
  Predictable traffic levels
  SmokePing, Nagios and Cricket/Cacti take a lot of tuning to monitor our network
  Difficult to achieve high bandwidth and a high level of reliability in a transparent content filter