Vangel Ajanovski
Institute of Faculty of Natural Sciences and Mathematics
Saints Cyril and Methodius University
Skopje, Macedonia
Organization of resources
Current situation
  Organizational structure
    Head of computing centre
    1 system administrator
  Responsibilities
    Network in 2 buildings (1 km apart)
    120 computers in labs for students, many laptops
    50 workstations for staff, many laptops
    20 servers
    All services
  The situation is not going to change
What we use
  Self-built agent for monitoring labs
    Computer activity monitoring
    Screen viewing
  MRTG for monitoring network activity
    Links between buildings, labs and servers
  ICmyNet.Flow (with help from our colleagues)
  Munin
What we need
  An integrated tool to monitor everything
    People, computing and network
    All at the same time
  Usual problems we have
    Frequent power outages
    A link is down or congested
    Clients do not receive a DHCP response
    People always report this as "we don't have internet" or "we can't send"
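The usual problems above all reach the administrator as the same vague complaint. A minimal sketch of how an integrated tool might map one user report to the first failing layer follows. The check names, their ordering and the stubbed probes are illustrative assumptions, not part of any tool we use.

```python
from typing import Callable, List, Tuple

# Hypothetical ordered checks, lowest layer first. In a real deployment each
# callable would probe something (UPS status, link state, a DHCP lease test).
Check = Tuple[str, Callable[[], bool]]

def first_failure(checks: List[Check]) -> str:
    """Return the name of the first failing check, or 'ok' if all pass."""
    for name, probe in checks:
        if not probe():
            return name
    return "ok"

def explain(symptom: str, checks: List[Check]) -> str:
    """Turn a vague user report into a one-line status message."""
    cause = first_failure(checks)
    if cause == "ok":
        return f"'{symptom}': all monitored layers are up"
    return f"'{symptom}': first failing layer is {cause}"

# Example with stubbed probes: power is fine, the inter-building link is down,
# so DHCP also fails, but only the lowest failing layer is reported.
example_checks: List[Check] = [
    ("power", lambda: True),
    ("inter-building link", lambda: False),
    ("DHCP response", lambda: False),
]
print(explain("we don't have internet", example_checks))
```

Ordering the checks from the lowest layer up means a DHCP failure caused by a dead link is reported as a link problem, not as a separate incident.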
Ideal ? tool
  Both administrative and general user interface
    End users should be able to see what is going on and whether something is broken
    The administrator should never be surprised to find that something is not working
  Automated ticketing of incidents
  Instant SMS reporting of critical incidents
  Recognition of repeating incidents and automatic offer of the accepted solution
  Notification about fixes to critical incidents
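Recognition of repeating incidents could work roughly as sketched below: each incident is reduced to a signature, and a solution the administrator accepted earlier is offered when the same signature appears again. The signature format (device plus failure kind) and all names here are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

@dataclass
class IncidentLog:
    # Maps an incident signature to the solution that was accepted last time.
    solutions: Dict[str, str] = field(default_factory=dict)
    # How many times each signature has been seen.
    counts: Dict[str, int] = field(default_factory=dict)

    @staticmethod
    def signature(source: str, kind: str) -> str:
        """Hypothetical signature: which device and what kind of failure."""
        return f"{source.strip().lower()}|{kind.strip().lower()}"

    def report(self, source: str, kind: str) -> Optional[str]:
        """Record an incident; return a known solution if one was accepted before."""
        sig = self.signature(source, kind)
        self.counts[sig] = self.counts.get(sig, 0) + 1
        return self.solutions.get(sig)

    def accept_solution(self, source: str, kind: str, solution: str) -> None:
        """Store the fix the administrator confirmed for this incident."""
        self.solutions[self.signature(source, kind)] = solution

log = IncidentLog()
first = log.report("Lab-3 switch", "link down")       # no known fix yet
log.accept_solution("Lab-3 switch", "link down", "power-cycle the switch")
second = log.report("lab-3 switch ", "Link Down")     # normalizes to the same signature
```

The per-signature counter is what would let the tool flag an incident as repeating, even before any solution has been accepted for it.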
Ideal ? tool...
  Again, imagine the single system/network admin
    Rushing around and between buildings helping users
    When he is away, or when he is back in the office, what is critical?
  Intelligent incident reporting
    Overview-level operational dashboard(s)
    Reports on which incidents were not solved
    Reporting and monitoring of consequences
      Not reasons!
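A report of unsolved incidents, as the admin might want to see it on returning to the office, can be sketched as follows. Incidents are described by their consequence for users, and the severity scale (1 = critical, 3 = minor) is an assumption made for this example.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Incident:
    consequence: str   # what users experience, not the root cause
    severity: int      # assumed scale: 1 = critical, 3 = minor
    age_minutes: int
    solved: bool = False

def unsolved_report(incidents: List[Incident]) -> List[str]:
    """List unsolved incidents, most critical first, oldest first within a level."""
    pending = [i for i in incidents if not i.solved]
    pending.sort(key=lambda i: (i.severity, -i.age_minutes))
    return [f"[{i.severity}] {i.consequence} ({i.age_minutes} min)" for i in pending]

# Made-up sample data for illustration only.
sample = [
    Incident("Lab 2 has no network", 2, 40),
    Incident("Staff cannot send e-mail", 1, 15),
    Incident("Printer queue slow", 3, 120, solved=True),
    Incident("Building B has no internet", 1, 90),
]
report = unsolved_report(sample)
for line in report:
    print(line)
```

Sorting by severity first and age second answers the question "what is critical?" at a glance, while solved incidents drop out of the list entirely.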