TROUBLESHOOTING IN LIVE WCDMA NETWORKS
Master’s Thesis, Mikko Nieminen
Espoo, February 14th, 2006
Supervisor: Professor Heikki Hämmäinen


Background to the Study
The number of live WCDMA networks is growing quickly. The first commercial Third Generation Partnership Project (3GPP) compliant network, J-phone, was opened in December. By October 2005, there were 80 live commercial WCDMA networks with nearly 40 million subscribers. By that time, around 140 WCDMA licenses had been awarded, and the current license holders had more than 500 million subscribers in their Second Generation (2G) networks. Especially in Europe and Asia, WCDMA network deployment has, after successful field trials and service launches, entered a new critical stage: the phase of network optimisation and troubleshooting.

Research Problem
As the number of WCDMA subscribers increases quickly, operators and equipment vendors face big challenges in maintaining and troubleshooting their networks.
– How can one efficiently narrow down the root causes of problems when a live WCDMA network carries a huge number of subscribers and a large volume of traffic?
– What are the principles for examining fault scenarios and breaking the problem investigation down into logical, manageable pieces?
– Which tools and methods are used in practice in WCDMA network troubleshooting today?
In order to tackle these questions and challenges, this Thesis presents a Framework for KPI-triggered troubleshooting in live WCDMA networks. The applicability of the Framework is demonstrated by applying it to a selection of real troubleshooting cases that have occurred in commercial WCDMA networks.

Scope of the Study
This study concentrates on KPI-triggered problems in live WCDMA networks. In general, faults can be classified into three categories:
– Critical: emergency problems that require immediate action,
– Major: referred to in this study as KPI-triggered problems,
– Minor: problems that do not affect the services of the network.
The viewpoint is that of the equipment vendor. The main objective is to create guidelines that help troubleshooting experts and technical support personnel of WCDMA network manufacturers to troubleshoot and narrow problems down following a defined logic. This Thesis mainly concentrates on WCDMA network troubleshooting from a Radio Access Network perspective. The reasoning behind this approach is that the UTRAN covers most of the WCDMA specific functionality and intelligence, and therefore also brings the majority of the troubleshooting challenges.

Research Methods
This Thesis is mainly based on the study of various technical specifications and on interviews with WCDMA network troubleshooting experts. The main literature sources are the 3GPP release 99 specifications, since the majority of live WCDMA networks were based on 3GPP release 99 during the writing of this Thesis. 3GPP release 4 networks are currently gaining a foothold among live WCDMA networks. However, there are only minor differences in the Radio Access functionality of these two 3GPP specification releases.

Structure of the Thesis
– Introduction to WCDMA Networks
– UTRAN Protocols
– Call Trace Analysis
– Key Performance Indicators
– Framework for KPI-Triggered Troubleshooting
– Cases from Live WCDMA Networks

WCDMA network architecture
[Figure: WCDMA network architecture. UE (USIM and ME); UTRAN (cells, Node Bs and RNCs); core network (MSC/VLR, GMSC, SGSN, GGSN, HLR, AuC, EIR); external networks (PSTN and Internet).]

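UTRAN architecture
[Figure: UTRAN architecture. The User Equipment (UE) connects to a Node B over Uu, the Node B to an RNC over Iub, and RNCs to each other over Iur. The RNC connects to the Core Network (CN) over Iu-CS (3G MSC) and Iu-PS (SGSN).]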

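UMTS Bearer Services
[Figure: bearer services between UE, RAN and CN across the Uu and Iu interfaces, divided into Non-Access Stratum and Access Stratum. Labels: RRC SAP; signalling connection (RRC connection over Uu, Iu connection over Iu); Radio Access Bearer (radio bearer service over Uu, Iu bearer service over Iu).]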

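Summary of Protocols (CS user plane)
Protocol stacks per network element, top layer first:
– UE (Uu): CS application and coding, RLC, MAC, WCDMA L1
– Node B (Uu side): WCDMA L1
– Node B (Iub side): FP, AAL2, ATM, PDH/SDH
– RNC (Iub side): RLC, MAC, FP, AAL2, ATM, PDH/SDH
– RNC (Iu side): Iu-UP protocol, AAL2, ATM, PDH/SDH
– MSC (Iu): CS application and coding, Iu-UP protocol, AAL2, ATM, PDH/SDH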

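Summary of Protocols (UE control plane)
Protocol stacks per network element, top layer first:
– UE (Uu): NAS, RRC, RLC, MAC, WCDMA L1
– Node B (Uu side): WCDMA L1
– Node B (Iub side): FP, AAL2, ATM, PDH/SDH
– RNC (Iub side): RRC, RLC, MAC, FP, AAL2, ATM, PDH/SDH
– RNC (Iu side): RANAP, SCCP, MTP3b, SSCF-NNI, SSCOP, AAL5, ATM, PDH/SDH
– CN (Iu): NAS, RANAP, SCCP, MTP3b, SSCF-NNI, SSCOP, AAL5, ATM, PDH/SDH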

Overview of WCDMA Call Setup
[Figure: call setup phases for MO and MT calls. Paging, RRC Connection Establishment, Radio Access Bearer Establishment, User Plane Data Flow.]

RRC connection establishment (DCH)
Signalling between UE, Node B and RNC (protocol in parentheses):
1. RRC CONNECTION REQUEST (RRC)
2. Admission Control
3. RADIO LINK SETUP REQUEST (C-NBAP)
4. Start RX
5. RADIO LINK SETUP RESPONSE (C-NBAP)
6. ESTABLISH REQUEST (ALCAP)
7. ESTABLISH CONFIRM (ALCAP)
8. UPLINK & DOWNLINK SYNC (FP)
9. Start TX
10. RRC CONNECTION SETUP (RRC)
11. L1 SYNCH
12. RL RESTORE INDICATION (D-NBAP)
13. RRC CONNECTION SETUP COMPLETE (RRC)
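When a call trace is analysed, the message order of this procedure can serve as a reference. The following is a minimal Python sketch, not taken from the Thesis: it lists only the signalling messages of the procedure (the internal actions Admission Control, Start RX, Start TX and L1 SYNCH are left out) and reports the first expected message that a recorded trace lacks. The function name and the flat list-of-strings trace format are assumptions made for illustration; real protocol analyser exports look different.

```python
# Signalling messages of RRC connection establishment on a DCH, in the
# order given on the slide. Internal actions (steps 2, 4, 9, 11) are omitted.
EXPECTED_SEQUENCE = [
    "RRC CONNECTION REQUEST",
    "RADIO LINK SETUP REQUEST",
    "RADIO LINK SETUP RESPONSE",
    "ESTABLISH REQUEST",
    "ESTABLISH CONFIRM",
    "UPLINK & DOWNLINK SYNC",
    "RRC CONNECTION SETUP",
    "RL RESTORE INDICATION",
    "RRC CONNECTION SETUP COMPLETE",
]


def first_missing_message(trace):
    """Return the first expected message not found, in order, in the trace.

    trace is a list of message names (hypothetical format). Unrelated
    messages between the expected ones are ignored. Returns None when the
    whole procedure is present in the expected order.
    """
    position = 0
    for expected in EXPECTED_SEQUENCE:
        try:
            # Search only after the previously matched message.
            position = trace.index(expected, position) + 1
        except ValueError:
            return expected
    return None


# Hypothetical trace in which the procedure stops after RRC CONNECTION SETUP.
partial_trace = EXPECTED_SEQUENCE[:7]
print(first_missing_message(partial_trace))
```

A result other than None points to the step at which the procedure stalled, which is a natural starting point for looking at the corresponding interface trace.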

Protocol Analysers
Company          Product                      Home Country
Nethawk [47]     3G Analyser                  Finland
Agilent [48]     Signaling Analyzer           United States
Tektronix [49]   K15                          United States
Radcom [50]      Performer Analyser           Israel
Acterna [51]     Telecom Protocol Analyzer    United States

RRC Connection Events and KPIs
Events between UE, RNC and CN:
– Event 1: RRC CONNECTION REQUEST; RRC_CONN_STP_ATT incremented
– Event 2: RRC CONNECTION SETUP; RRC_CONN_STP_COMP incremented
– Event 3: RRC CONNECTION SETUP COMPLETE; RRC_CONN_ACC_COMP incremented
– Event 4: IU RELEASE COMMAND; RRC_CONN_ACT_COMP incremented
Phases: Setup phase (Event 1 to Event 2), Access phase (Event 2 to Event 3), Active phase (Event 3 to Event 4).
RRC Setup Complete Rate = (Sum of RRC_CONN_STP_COMP / Sum of RRC_CONN_STP_ATT) x 100 %
RRC Establishment Complete Rate = (Sum of RRC_CONN_ACC_COMP / Sum of RRC_CONN_STP_ATT) x 100 %
RRC Retainability Rate = (Sum of RRC_CONN_ACT_COMP / Sum of RRC_CONN_ACC_COMP) x 100 %
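The three RRC KPIs are simple ratios of counter sums. The Python sketch below, which is not part of the Thesis, computes them from per-period counter samples. The function names, the dictionary layout and the counter values are hypothetical. The RRC Retainability Rate is taken as active completions over access completions, an assumption made by analogy with the RAB Retainability Rate formula.

```python
def percentage(numerator, denominator):
    """Return 100 * numerator / denominator, or None if the denominator is 0."""
    if denominator == 0:
        return None
    return 100.0 * numerator / denominator


def rrc_kpis(samples):
    """Compute the three RRC KPIs from per-period counter samples.

    samples maps a counter name to a list of values, one per measurement
    period; each KPI uses the sum over all periods, as in the formulas.
    """
    stp_att = sum(samples["RRC_CONN_STP_ATT"])
    stp_comp = sum(samples["RRC_CONN_STP_COMP"])
    acc_comp = sum(samples["RRC_CONN_ACC_COMP"])
    act_comp = sum(samples["RRC_CONN_ACT_COMP"])
    return {
        "setup_complete_rate": percentage(stp_comp, stp_att),
        "establishment_complete_rate": percentage(acc_comp, stp_att),
        # Assumption: active completions over access completions,
        # by analogy with the RAB Retainability Rate.
        "retainability_rate": percentage(act_comp, acc_comp),
    }


# Hypothetical counter values for two measurement periods.
example = {
    "RRC_CONN_STP_ATT": [600, 400],
    "RRC_CONN_STP_COMP": [595, 395],
    "RRC_CONN_ACC_COMP": [590, 390],
    "RRC_CONN_ACT_COMP": [585, 385],
}
print(rrc_kpis(example))
```

Summing the counters before dividing, rather than averaging per-period percentages, weights busy periods more heavily than quiet ones.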

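RRC connection Phases
[Figure: RRC connection phases. Phases: Setup, Access, Active. Success path: Attempts, Setup Complete, Access Complete, Active Complete. Failure exits: Setup Failures and Blocking (Setup phase), Access Failures (Access phase), Active Failures, i.e. RRC Drop (Active phase). Further label: Active Release.]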

Other WCDMA network KPIs
RAB Setup Complete Rate = (Sum of RAB_STP_COMP / Sum of RAB_STP_ATT) x 100 %
RAB Establishment Complete Rate = (Sum of RAB_ACC_COMP / Sum of RAB_STP_ATT) x 100 %
RAB Retainability Rate = (Sum of RAB_ACT_COMP / Sum of RAB_ACC_COMP) x 100 %
CSSR = (Sum of RAB_ACC_COMP / Sum of RRC_CONN_STP_ATT) x 100 %
CCSR = (Sum of RAB_ACT_COMP / Sum of RRC_CONN_STP_ATT) x 100 %
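As a worked example of these formulas, the Python sketch below evaluates all five KPIs from already summed counters. It is an illustration only: the function name, the result keys and the counter totals are hypothetical and do not come from the Thesis or from a real network.

```python
def call_kpis(totals):
    """Compute the RAB KPIs, CSSR and CCSR (in per cent) from summed counters.

    totals maps a counter name to its sum over the observation period.
    A KPI is None when its denominator is zero.
    """
    def pct(numerator_key, denominator_key):
        denominator = totals[denominator_key]
        if denominator == 0:
            return None
        return 100.0 * totals[numerator_key] / denominator

    return {
        "rab_setup_complete_rate": pct("RAB_STP_COMP", "RAB_STP_ATT"),
        "rab_establishment_complete_rate": pct("RAB_ACC_COMP", "RAB_STP_ATT"),
        "rab_retainability_rate": pct("RAB_ACT_COMP", "RAB_ACC_COMP"),
        "cssr": pct("RAB_ACC_COMP", "RRC_CONN_STP_ATT"),
        "ccsr": pct("RAB_ACT_COMP", "RRC_CONN_STP_ATT"),
    }


# Hypothetical counter totals.
totals = {
    "RRC_CONN_STP_ATT": 1000,
    "RAB_STP_ATT": 900,
    "RAB_STP_COMP": 891,
    "RAB_ACC_COMP": 882,
    "RAB_ACT_COMP": 864,
}
for name, value in call_kpis(totals).items():
    print(f"{name}: {value:.2f} %")
```

Because CSSR and CCSR share the denominator RRC_CONN_STP_ATT, CCSR can never exceed CSSR as long as RAB_ACT_COMP does not exceed RAB_ACC_COMP.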

Fault Classification

A-CRITICAL
Description: Total or major outages that are not avoidable with a workaround solution. Critical (emergency duty contacted) problems severely affect service, capacity/traffic, billing, and maintenance capabilities and require immediate corrective action, regardless of time of day or day of the week as viewed by the operator.
Examples:
– System restart, all links down
– Simultaneous restarts of active computer units
– More than 50 per cent of traffic handling capacity out of use
– Subscriber related network element functionality is not working

B-MAJOR
Description: The problem leads to degradation of network performance or the fault affects traffic randomly. Major problems cause conditions that seriously affect system performance, operation, maintenance, and administration and require immediate attention as viewed by the operator. The urgency is less than in critical situations because of a lesser immediate or impending effect on system performance, customers, and the customer's operation and revenue.
Examples:
– Capacity/quality related functionality is not working as intended
– Problems seriously affecting end user service, but avoidable with a workaround solution
– Configuration changes (network, HW, and SW) are not working as intended
– Subscriber related functions are not working completely
– Performance measurement, alarm management or activation of a new feature fails
– Single restart of computer units

C-MINOR
Description: Minor fault not affecting operation or service quality. Other problems that the operator does not view as critical or major are considered minor. Minor problems do not significantly impair the functioning of the system or affect the service to customers. These problems are tolerable during system use.
Examples:
– Failures not seriously affecting traffic
– Errors in operating command syntax
– Cosmetic errors in operational commands or statistics output
– Minor errors in documentation

Framework for KPI-Triggered Troubleshooting
The Framework is designed for investigating and solving B-MAJOR level, i.e. “KPI-triggered”, faults.
Before applying the Framework:
– The general alarm status of the network has been checked, and no clear network alarms pointing to the root cause of the fault can be detected.
– Traces from the external interfaces of the RNC have been taken with a protocol analyser in order to record the fault scenario. An RNC internal trace has also been taken when the fault took place.
– The basic fault scenario has been analysed and clarified.

[Figure: flowchart of the Framework for KPI-triggered troubleshooting, with steps labelled A to R and Yes/No branches.]
Decision points:
– Is the problem new in the operator network?
– New SW, HW, parameters, UE model or feature introduced?
– Has average network load increased significantly and/or does the problem occur at a specific time of day?
– Is the fault operator specific?
– Perform simulation of the fault in test bed. Does the fault still occur?
– Perform simulation of the fault with reference conditions. Does the fault still occur?
Actions:
– Analyse and investigate the differences between the working and faulty conditions.
– Use RNC Performance Tester to generate load in test bed and perform analysis.
– Analyse the traces. Investigate fault scope.
– In case of MVI environment, check IOT results and contact foreign vendor.
– Investigate own vendor’s default parameters and compare implementation against 3GPP specifications. Compare own default parameters with the default parameters of other vendors.
– Execute air interface protocol analysis and drive tests.
– Analyse network element and interface specific alarms, parameters, capacity, logs and traces.
– Take specific actions depending on problem scope (refer to detailed Framework notes).
Fault scopes: Transmission specific, Node B specific, Service specific, RNC specific, CN specific, Country specific, UE specific.

Case: Increased AMR call drop rate
A decrease in the RAB Retainability Rate KPI for the AMR telephony service was experienced during the last three months in an operator network. The decrease was around 2% on each RNC compared to the time when the network was performing well.
Actions that had already been taken with no positive effect:
– Soft reset for all Node Bs and for all RNCs
– Hard reset and re-commissioning of Node Bs
– Alarms checked and no major alarms found
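A KPI-triggered fault of this kind is noticed by comparing a KPI against a baseline period. The Python sketch below, which is not from the Thesis, flags every RNC whose RAB Retainability Rate has fallen by at least a given margin. It reads the "around 2%" of the case as percentage points, which is an assumption, and the function name, RNC names and KPI values are hypothetical.

```python
def degraded_rncs(baseline, current, threshold_pp=2.0):
    """Return RNCs whose KPI fell by at least threshold_pp percentage points.

    baseline and current map an RNC name to a KPI value in per cent.
    RNCs missing from current are skipped. The result maps each flagged
    RNC to the size of its drop, rounded to two decimals.
    """
    flagged = {}
    for rnc, base_value in baseline.items():
        if rnc not in current:
            continue
        drop = base_value - current[rnc]
        if drop >= threshold_pp:
            flagged[rnc] = round(drop, 2)
    return flagged


# Hypothetical RAB Retainability Rate values (per cent) per RNC.
baseline = {"RNC1": 99.0, "RNC2": 99.2}
current = {"RNC1": 97.0, "RNC2": 98.9}
print(degraded_rncs(baseline, current))
```

A drop that appears on every RNC at once, as in this case, suggests a network-wide cause such as a common parameter setting or software release rather than a single faulty element.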

Case: Increased AMR call drop rate
[Figure: path taken through the Framework in this case, covering steps A, C, G and E, marked I. to IV. Boxes shown: "Is the problem new in the operator network?", "New SW, HW, parameters, UE model or feature introduced?", "Analyse and investigate the differences between the working and faulty conditions.", "Perform simulation of the fault in reference conditions. Does the fault still occur?"]

Solution
– The short term solution was to change the parameter for planned maximum downlink transmission power of all the Node Bs in the operator network to the default value of 34 dBm. With this change, the problem disappeared in the operator network.
– The long term solution was to implement a fix for the bug in the next software release of the Node B.

Results
As a result of thorough research conducted for this Thesis, a Framework for KPI-triggered troubleshooting for live WCDMA networks was developed. The Framework is mainly targeted at WCDMA network equipment vendors, to help them in solving major service affecting faults occurring in the live WCDMA networks of today. Troubleshooting cases from live WCDMA networks were solved using the Framework, in order to verify the results and test the applicability and practicality of the Framework.

Assessment of the results
The applicability and relevance of the troubleshooting Framework was tested against three different fault cases from live WCDMA networks. The results were fairly promising, since all the cases were successfully solved by utilising the Framework. The Framework was found to be quite practical and suitable for solving KPI-triggered problems in live WCDMA networks. However, it must be taken into account that the Framework was tested with a limited number of cases because of time and resource limitations. More extensive testing and verification with a larger number of cases could reveal optimisations and improvements to the Framework. Still, the basic logic of the Framework was shown to be sound in the cases tested. The results presented in this study can easily be tested in the future against a larger number of cases in order to verify them with greater statistical reliability.

Exploitation of the results
The results of this study will be used as source material for UTRAN troubleshooting competence development and for the creation of advanced learning solutions, targeted at troubleshooting experts and customer support engineers of one of the leading WCDMA network equipment vendors. The results of the Thesis will also be used as input in the creation of customer documentation for UTRAN troubleshooting. There is also an intention to further test the relevance and reliability of the results by applying them in the 24/7 RAN technical support operator service of the equipment vendor in question.

Future Research
The significance of Performance Indicator based troubleshooting is increasing continuously in live WCDMA networks. Once the PI and KPI specifications become more mature, a more extensive study of the most relevant Performance Indicators used in WCDMA network troubleshooting is essential. There is also a need to develop a Framework and logic for solving emergency problems in WCDMA networks. As the complexity of telecommunication networks grows, effective and efficient troubleshooting procedures are essential in order to manage the diversity of network technologies and the increasing quality requirements of the operators.