DEPT. OF PHARMACEUTICS & PHARM. TECH

Presentation transcript:

PCT 202 STATISTICS
DR. C. P. AZUBUIKE
DEPT. OF PHARMACEUTICS & PHARM. TECH, FACULTY OF PHARMACY, UNIVERSITY OF LAGOS

OBJECTIVES
- Understand the fundamental concepts of statistics.
- Analyze data from scientific experiments in a statistically acceptable manner.

TOPIC DESCRIPTION
- Compilation and presentation of data
- Analysis of data
- Distribution of data
- Measurement of variation: SD, SE, limits of error
- Comparison of data: tests for significance, coefficient of variation, etc.
- Introduction to statistical quality control

TEXTBOOKS
- Pharmaceutical Statistics by David Jones
- Understanding Statistics by O. A. Adedayo
- Biostatistics: A Practical Approach to Research and Data Handling by Anthony E. Ogbeibu

WHAT IS STATISTICS?
Statistics is the science and art of collecting, organizing, analyzing and interpreting numerical data affected by a multiplicity of causes, offering an objective evaluation of the reliability of the conclusions based on the data. It is a theory of decision making in the face of uncertainty.

TYPES OF STATISTICS
Statistics may be divided into two sub-categories:
- Descriptive statistics
- Inferential statistics
Descriptive (or deductive) statistics provides general information about the fundamental statistical properties of data, e.g. mean, median, mode, variance and standard deviation.

TYPES OF STATISTICS
Inferential (or analytical, or inductive) statistics enables one to draw inferences or conclusions from information derived from experimental procedures, e.g. that the antidiabetic effect of formulation A is greater than that of formulation B.

APPLICATIONS OF STATISTICS
- Collection of data
- Numerical description of data
- Formulation of hypotheses concerning the nature of the data
- Understanding the relevance of data
- Design of experiments to test the hypotheses, that is, to consolidate or reject them

DEFINITION OF COMMON TERMS USED IN STATISTICS
- Population: the total number of cases in our focus of interest.
- Census: the complete enumeration of a target population, together with the collection of some important information on every element of the population.
- Sample: a part of the population, i.e. any subgroup drawn from the target population.

DEFINITION OF COMMON TERMS USED IN STATISTICS
- Parameter: a numerical quantity summarizing a population (e.g. mean, SD, variance).
- Statistic: a numerical quantity that summarizes a characteristic of a sample, e.g. the sample mean.
- Variable: an occurrence which can assume any value from a prescribed set of values. A variable can be discrete or continuous.

VARIATION IN SCIENTIFIC DATA
A variable is a property with respect to which individuals in a sample differ in some ascertainable way. Examples of variables include:
- The height of men in a particular region
- The weights of tablets derived from the same batch
- The concentration of cholesterol in the plasma of female subjects

TYPES OF VARIABLES
Measurement variables may be described in a numerically ordered fashion. They may be continuous or discrete.
- Continuous variables can assume an infinite number of numerical values between the lowest and highest points on a scale.
- Discrete variables have a fixed number of values and always have integer values (whole numbers).

TYPES OF VARIABLES
- Ranked variables: ranking scales are examples of continuous variables. Although they do not represent physical measurements, such scales represent numerically ordered systems.
- Attributes (nominal variables) cannot be measured because of their qualitative nature. Unlike ranked variables, they are not associated with numerical values.
Note: Whenever attributes are combined with frequencies, they are referred to as enumeration data.

PRESENTATION OF DATA
Data can be presented in:
- Tabular forms
- Diagrammatic forms
To organize the data well, we need to classify it before presenting it in either tabular or diagrammatic form. Classification deals with grouping data that share some identified common characteristics.

TABULATION OF DATA
- Tabulation deals with the presentation of classified data in tabular form.
- A table is an array of data in rows and columns.
- It condenses a large mass of data.
- It enables easy comparison among classes of data.
- It takes up less space than data presented in narrative form.

CONTENTS OF A GOOD TABLE
- Title: written at the top of the table; describes the contents of the table.
- Caption: the column heading.
- Stubs: the row headings.
- Footnote: brief explanatory information about the table which is not self-evident.
- Units of measurement: should be clearly specified.

TYPES OF TABLES
- Simple table
- Complex table
- Further complex table

SAMPLE OF A TABLE

DIAGRAMMATICAL/GRAPHICAL DATA PRESENTATION
- Pictograms
- Pie chart
- Bar charts
- Histogram
- Cumulative frequency curve (ogive)

PICTOGRAM
- A pictogram uses a pictorial symbol to represent the data of interest.
- The number of symbols drawn is usually proportional to the given data.
- A key is usually given to state the value of each pictorial symbol.

PICTOGRAM

PIE CHART
- A pie chart consists of a circle that has been divided into sectors which are proportional to the data.
- Most of the data presented in pie charts are categorical.
- Pie charts generally give no information on absolute magnitude unless figures are assigned to each sector.

PIE CHART

BAR CHARTS
- A simple bar chart consists of separated rectangular bars drawn so that the height of each bar is equivalent to the frequency.
- Bars may be drawn vertically or horizontally, but vertical bar charts are the most popular.
- Compared with a pie chart, comparisons are easier because heights are easier to compare than sectors.
- Types of bar chart include simple bar charts, multiple bar charts and component bar charts.

BAR CHARTS

HISTOGRAM
- A histogram is similar to a simple bar chart except that the bars are not separated.
- The area of each rectangular bar is proportional to its frequency.
- The line joining the midpoints of the tops of adjacent bars is known as the frequency polygon.

HISTOGRAM

CUMULATIVE FREQUENCY CURVE (OGIVE)
- When the cumulative totals of successive frequencies of a distribution are plotted against the corresponding class boundaries, the result is an ogive.
- The last cumulative frequency is the total of the frequencies in the distribution.
- An ogive can only be plotted for a frequency distribution.
- For an ungrouped frequency table, the values on the x-axis are the individual values of X.
- For a grouped frequency table, the class boundaries must be computed first.

CUMULATIVE FREQUENCY CURVE (OGIVE)

MEASUREMENT OF CENTRAL TENDENCY
- A single value that is the central point of a distribution is known as a measure of central tendency (CT) or location.
- Measures of CT are typical and representative of a data set.
- The values in the distribution cluster around the measure of location.
- Examples of measures of CT are the arithmetic mean, median, mode, harmonic mean and geometric mean.

ARITHMETIC MEAN
- The mean is the most popular method for describing the central nature of data.
- It refers to the centre of a distribution of data.
- Its use is most appropriate whenever the data are symmetrically distributed around the mean.
- Mathematically, the arithmetic mean ($\mu$ or $\bar{X}$) is described as follows:

ARITHMETIC MEAN
$\bar{X} = \frac{\sum_{j=1}^{N} X_j}{N}$
where
- $\sum$ is the sigma notation used for summing numbers ("the sum of"),
- $X_j$ refers to all data from $j = 1$ to $j = N$,
- $N$ is the number of data contributing to the calculation.

ARITHMETIC MEAN
Question: The reduction in blood pressure (BP, mmHg) in 6 patients 4 hours after administration of a standard dose of a novel antihypertensive agent is shown in the table below. Calculate the mean reduction in BP in the 6 patients.

Patient number    Reduction in blood pressure (mmHg)
1                 20
2                 25
3                 21
4                 34
5                 41
6                 37

ARITHMETIC MEAN
Solution: Substituting the figures from the table into the equation for the arithmetic mean, we obtain:
$\bar{X} = \frac{\sum_{j=1}^{N} X_j}{N} = \frac{20+25+21+34+41+37}{6} = \frac{178}{6} = 29.67$ mmHg
In current usage the term "arithmetic mean" is abbreviated to "mean".
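The arithmetic above can be checked with a few lines of Python. This is a minimal sketch using the standard-library `statistics` module; the variable names are illustrative and not part of the lecture.

```python
from statistics import mean

# Reduction in blood pressure (mmHg) for the six patients in the question.
bp_reduction = [20, 25, 21, 34, 41, 37]

# mean() sums the values and divides by their count (N = 6).
mean_reduction = mean(bp_reduction)
print(f"Mean reduction in BP: {mean_reduction:.2f} mmHg")
```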

WEIGHTED MEAN
- The weighted mean is a special case of the mean in which the data points in the distribution do not contribute equally to the overall calculation.
- It is commonly employed whenever the data are divided into groups.
- Each group carries a different weighting.

WEIGHTED MEAN
Question: The effect of a defined dose of a commercially available analgesic in suppressing pain following a painful stimulus was evaluated in 20 volunteers using a visual analogue scale. The results are presented in the table below.

Number of volunteers    Pain assessment by volunteers
2                       3 (extreme pain)
12                      2 (moderate pain)
6                       1 (slight pain)

WEIGHTED MEAN
Solution: In this example the three sub-groups describe different clinical effects and contain different numbers of volunteers, so they do not carry equal weighting. The weighted mean is calculated as:
$\bar{X}_w = \frac{\sum_{j=1}^{k} w_j X_j}{N}$
where $w_j$ is the weighting (frequency) of each of the $k$ groups and $N = \sum_{j=1}^{k} w_j$ is the total number of observations.
$\bar{X}_w = \frac{(2 \times 3) + (12 \times 2) + (6 \times 1)}{20} = \frac{36}{20} = 1.8$
Note: The calculated mean has no dimensions, as a direct result of the analogue scale used to assess pain.
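The same weighted mean can be sketched in Python. The list of `(weight, score)` pairs is simply one convenient way to hold the table; the names are illustrative.

```python
# (number of volunteers, pain score) for each group in the table.
groups = [(2, 3), (12, 2), (6, 1)]

# The weights are frequencies, so their sum is the total number of volunteers.
total_volunteers = sum(weight for weight, _ in groups)

# Sum of weight * score, divided by the total number of observations.
weighted_mean = sum(weight * score for weight, score in groups) / total_volunteers
print(f"Weighted mean pain score: {weighted_mean:.1f}")
```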

MEDIAN
- The median is an alternative way of describing the central nature of data, and it is relatively unaffected by the spread of the data. It is the central value.
- For data that are normally distributed, the numerical values of the mean and median should be identical.
- Both the mean and the median may be used to describe the central properties of moderately skewed data.
- For data that are positively skewed (i.e. distributed towards the y-axis), the mean is greater than the median.

MEDIAN Question: The weights of 11 tablets removed from a batch for quality control purposes are presented in the table below. Calculate the mean and median values of the tablet weights.

MEDIAN
Individual weights of 11 tablets removed from a production batch

Tablet number    Tablet weight (mg)
1                251
2                255
3                250
4                245
5                265
6                260
7                231
8                225
9                250
10               275
11               300

MEDIAN
Step 1: Calculation of the mean
$\bar{X} = \frac{251+255+250+245+265+260+231+225+250+275+300}{11} = \frac{2807}{11} = 255.2$ mg
Step 2: Calculation of the median
First, arrange the data in order of magnitude: 225, 231, 245, 250, 250, 251, 255, 260, 265, 275 and 300.
The median is defined as the central value, i.e. the value in position 6, which is 251 mg.
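A short Python sketch of the two steps follows. The weight used for tablet 9 (250 mg) is taken from the sum in Step 1; the variable names are illustrative.

```python
from statistics import mean, median

# Tablet weights (mg); tablet 9 is 250 mg, as used in the Step 1 sum.
tablet_weights = [251, 255, 250, 245, 265, 260, 231, 225, 250, 275, 300]

mean_weight = mean(tablet_weights)

# median() sorts the data internally and returns the central value;
# with 11 observations that is the value in position 6.
median_weight = median(tablet_weights)
central_by_hand = sorted(tablet_weights)[len(tablet_weights) // 2]

print(f"Mean:   {mean_weight:.1f} mg")
print(f"Median: {median_weight} mg")
```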

MODE
The mode is defined as the most commonly occurring measurement in a set of data.
Question: The concentrations of a therapeutic agent (mg/ml) in 10 vials of a commercially available parenteral product have been determined using a chromatographic method. Calculate the mode of the observed concentrations.

MODE
Concentration of a therapeutic agent in 10 vials of a commercially available product

Vial number    Concentration of therapeutic agent (mg/ml)
1              200
2              205
3              205
4              201
5              199
6              195
7              202
8              205
9              205
10             207

MODE
The most frequent value in the above set of data is 205 mg/ml (four occurrences). This value (205 mg/ml) is the mode.
Data may contain more than one mode. If two modes are present in a data set, the data are said to be bimodal.
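The mode can be found in Python with `statistics.multimode`, which also handles bimodal data. In this sketch the values for vials 3, 8 and 9 are assumed to be 205 mg/ml, because the solution states that 205 mg/ml occurs four times; the bimodal list is a made-up illustration.

```python
from collections import Counter
from statistics import multimode

# Concentrations (mg/ml) for vials 1-10.
# Assumption: vials 3, 8 and 9 are 205 mg/ml (the solution reports four occurrences).
concentrations = [200, 205, 205, 201, 199, 195, 202, 205, 205, 207]

modes = multimode(concentrations)          # list of all most-frequent values
occurrences = Counter(concentrations)[modes[0]]
print(f"Mode(s): {modes}, occurrences: {occurrences}")

# Hypothetical bimodal data set: two values share the highest frequency.
bimodal_example = [1, 1, 2, 2, 3]
print(f"Modes of bimodal example: {multimode(bimodal_example)}")
```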

MEASUREMENT OF THE VARIATION
- This deals with the spread (variability, or dispersion) of the data in a distribution.
- Dispersion describes how the values in the distribution are scattered.
- As an illustration, suppose the monthly salaries of 5 workers from each of two companies are as shown in the table below. The mean salary is the same (N5,000) but the distributions are not.

Company A: N4,000, N4,500, N5,000, N5,500, N6,000
Company B: N1,500, N1,750, N2,250, N15,000

MEASUREMENT OF THE VARIATION
- The concept of variability is very useful in inferential statistics.
- The higher the variability of a distribution, the less accurate the estimate obtained for the measures of location.
- In statistics, the variability of a distribution is better described by a summary measure than by calling it "low" or "high".
- The measures include the range, mean absolute deviation, variance, quartile deviation and standard deviation.

RANGE
The range is simply defined as the difference between the highest value and the lowest value:
$\text{Range} = \text{Highest value} - \text{Lowest value}$
$\text{Coefficient of range} = \frac{\text{Highest value} - \text{Lowest value}}{\text{Highest value} + \text{Lowest value}}$
- The use of the range to describe data variation accurately is limited.
- It does not truly describe the variation of the entire data set.

RANGE
The ranges of data sets A and B in the table below are as follows:
- Data set A: range = 30 − 10 = 20
- Data set B: range = 30 − 28 = 2
Data Set A Data Set B 10 28 20 29 30 Mean=30 Mean = 30
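The range and the coefficient of range can be sketched as two small Python functions. Only the lowest and highest values quoted for data sets A and B are used here; the function names are illustrative.

```python
def value_range(lowest, highest):
    """Range = highest value - lowest value."""
    return highest - lowest


def coefficient_of_range(lowest, highest):
    """Coefficient of range = (highest - lowest) / (highest + lowest)."""
    return (highest - lowest) / (highest + lowest)


# (lowest, highest) values quoted for each data set.
extremes = {"A": (10, 30), "B": (28, 30)}

for name, (low, high) in extremes.items():
    print(
        f"Data set {name}: range = {value_range(low, high)}, "
        f"coefficient of range = {coefficient_of_range(low, high):.3f}"
    )
```

Because both functions look only at the two extreme values, they say nothing about how the remaining observations are spread, which is the limitation noted for the range.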

MEAN DEVIATION
The mean deviation is a measure of data variation that is calculated as the average absolute deviation from the mean. In mathematical terms:
$MD = \frac{\sum_{j=1}^{N} |X_j - \bar{X}|}{N}$
where $|X_j - \bar{X}|$ is the absolute value of the deviation of each value in the data set from the mean of the data set, and $N$ is the number of observations in the data set.

MEAN DEVIATION
Question: The weights (mg) of six tablets of folic acid taken from a batch that was prepared by the wet granulation method were 100.6, 98.3, 98.9, 95.1, 104.5 and 105.5. Calculate the mean deviation.

Tablet number    Weight of tablet (mg)
1                100.6
2                98.3
3                98.9
4                95.1
5                104.5
6                105.5

MEAN DEVIATION
Solution:
Step 1: Calculate the mean.
$\bar{X} = \frac{\sum_{j=1}^{N} X_j}{N} = \frac{100.6+98.3+98.9+95.1+104.5+105.5}{6} = \frac{602.9}{6} = 100.5$ mg
Step 2: Calculate the mean deviation.
$MD = \frac{\sum_{j=1}^{N} |X_j - \bar{X}|}{N} = \frac{|100.6-100.5| + |98.3-100.5| + |98.9-100.5| + |95.1-100.5| + |104.5-100.5| + |105.5-100.5|}{6} = \frac{18.3}{6} \approx 3.1$ mg
Note: The algebraic sign of each deviation is ignored.
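The two steps translate directly into Python. This sketch works with the unrounded mean, so the result is 3.05 mg before rounding to one decimal place; the variable names are illustrative.

```python
from statistics import mean

# Weights (mg) of the six folic acid tablets.
weights = [100.6, 98.3, 98.9, 95.1, 104.5, 105.5]

# Step 1: the mean.
x_bar = mean(weights)

# Step 2: average of the absolute deviations from the mean
# (abs() discards the algebraic sign of each deviation).
mean_deviation = sum(abs(x - x_bar) for x in weights) / len(weights)

print(f"Mean:           {x_bar:.2f} mg")
print(f"Mean deviation: {mean_deviation:.2f} mg")
```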

VARIANCE
- The variance is the sum of the squared deviations from the mean divided by the number of observations ($N$).
- The sum of the squared deviations from the mean is called the sum of squares (SS):
$SS = \sum_{j=1}^{N} (X_j - \bar{X})^2$
- The variance is thus the mean sum of squares:
$\sigma^2 = \frac{\sum_{j=1}^{N} (X_j - \mu)^2}{N}$
where $\sigma^2$ (sigma squared) is the variance of a population and $\mu$ is the population mean.

VARIANCE
$s^2 = \frac{\sum_{j=1}^{N} (X_j - \bar{X})^2}{N - 1}$
where $s^2$ is the variance of a sample, $\bar{X}$ is the sample mean and $N - 1$ is called the degrees of freedom.
The primary reason for the difference between the two equations is the relative inaccuracy of estimating the population variance from the sample variance.

STANDARD DEVIATION
- The standard deviation (SD) is a commonly used measure of the dispersion of data.
- The SD is defined as the positive square root of the variance. It may be written mathematically as follows:
SD of a population: $\sigma = \sqrt{\frac{\sum_{j=1}^{N} (X_j - \mu)^2}{N}}$
SD of a sample: $s = \sqrt{\frac{\sum_{j=1}^{N} (X_j - \bar{X})^2}{N - 1}}$
- Most calculators and computers can calculate SDs rapidly.
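As a hedged illustration of the population and sample formulas, the sketch below applies Python's `statistics` functions to the six folic acid tablet weights from the mean deviation example. Treating those six values first as a whole population and then as a sample is an assumption made only to show the difference between dividing by N and by N − 1.

```python
from math import sqrt
from statistics import mean, pstdev, pvariance, stdev, variance

# Weights (mg) of the six folic acid tablets from the mean deviation example.
weights = [100.6, 98.3, 98.9, 95.1, 104.5, 105.5]
n = len(weights)
x_bar = mean(weights)

# Sum of squares (SS): sum of squared deviations from the mean.
sum_of_squares = sum((x - x_bar) ** 2 for x in weights)

population_variance = pvariance(weights)   # SS / N
sample_variance = variance(weights)        # SS / (N - 1)
population_sd = pstdev(weights)            # sqrt(SS / N)
sample_sd = stdev(weights)                 # sqrt(SS / (N - 1))

print(f"SS = {sum_of_squares:.3f}")
print(f"Population: variance = {population_variance:.3f}, SD = {population_sd:.3f}")
print(f"Sample:     variance = {sample_variance:.3f}, SD = {sample_sd:.3f}")
```

The sample values are always the larger of the two, because the same sum of squares is divided by the smaller quantity N − 1.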

STANDARD DEVIATION (ERROR) OF THE MEAN
- The standard deviation of the mean is sometimes referred to as the standard error of the mean (SEM).
- The SD describes the variability (dispersion) of a set of data around a central value, and an estimate of the variability of the data in a population may be derived from it.
- The SEM is a measure of the variability of a set of mean values calculated from individual groups of measurements (samples) that have been derived from a population.
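The slide gives no formula for the SEM. The sketch below assumes the standard estimate from a single sample, SEM = s / √N, and applies it to the six folic acid tablet weights used earlier; both the formula choice and the variable names are additions for illustration.

```python
from math import sqrt
from statistics import stdev

# Weights (mg) of the six folic acid tablets, treated as one sample.
weights = [100.6, 98.3, 98.9, 95.1, 104.5, 105.5]

sample_sd = stdev(weights)              # sample SD, divisor N - 1
sem = sample_sd / sqrt(len(weights))    # assumed formula: SEM = s / sqrt(N)

print(f"SD  = {sample_sd:.2f} mg")
print(f"SEM = {sem:.2f} mg")
```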

COEFFICIENT OF VARIATION
- The coefficient of variation (CV) is a statistical term that expresses the variability of a set of data.
- It is defined as the ratio of the standard deviation ($s$) to the mean of the data set ($\bar{X}$):
$CV(\%) = \frac{s}{\bar{X}} \times 100$
- It allows the variation of data sets of differing magnitude to be compared directly.
- For example, if the mean ± SD of two sets are (A) 2500 ± 125 and (B) 50 ± 35, at first glance one may be deceived into believing that the variation of data B is less than that of A. The CVs (5% for A, 70% for B) show the opposite.
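The comparison of sets A and B can be reproduced with a one-line function. This is a minimal sketch; `cv_percent` is an illustrative name.

```python
def cv_percent(sd, mean):
    """Coefficient of variation as a percentage: (SD / mean) * 100."""
    return 100 * sd / mean


cv_a = cv_percent(sd=125, mean=2500)   # set A: 2500 +/- 125
cv_b = cv_percent(sd=35, mean=50)      # set B: 50 +/- 35

print(f"CV of A: {cv_a:.1f}%")
print(f"CV of B: {cv_b:.1f}%")
```

Although B has the smaller SD in absolute terms, its CV is far larger, so B is the more variable data set relative to its mean.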

ACCURACY
- Accuracy is defined as the closeness of a measured value to the true value (the value that would be expected in the absence of error).
- In pharmaceutical analysis, it is commonplace to describe the accuracy of an analytical method as the closeness of the observed (analysed) and expected values.
- Absolute error and relative error may be used to describe the difference between observed and expected values.

ABSOLUTE ERROR
The absolute error can be calculated using the formula:
$\text{error}_{abs} = O - E$
where $\text{error}_{abs}$ is the absolute error, $O$ is the observed value (or, alternatively, the observed mean of a set of values) and $E$ is the expected (true) value.
Question: A solution of quinine sulphate has been analysed using three analytical methods, and the results are shown in the table below. Calculate $\text{error}_{abs}$.

ABSOLUTE ERROR
Concentration of quinine sulphate in a solution, as determined using three analytical methods

Analytical method            Observed value (mg/ml)    Expected value (mg/ml)    Absolute error (mg/ml)
HPLC with UV detection       2.51                      2.50                      +0.01
UV spectroscopy              3.53                      2.50                      +1.03
Fluorescence spectroscopy    2.19                      2.50                      -0.31

ABSOLUTE ERROR
Solution: The most accurate method is the one with the smallest absolute error (in magnitude), and the least accurate method is the one with the largest absolute error.
HPLC with UV detection ($\text{error}_{abs}$ = +0.01 mg/ml) is the most accurate, while UV spectroscopy ($\text{error}_{abs}$ = +1.03 mg/ml) is the least accurate.

RELATIVE ERROR
- Relative error was developed to overcome the limitation of absolute error, which takes no account of the magnitude of the expected value.
- It can be calculated using the formula:
$\text{error}_{rel} = \frac{|\text{error}_{abs}|}{E} = \frac{|O - E|}{E}$
- In the calculation, the sign of the difference (positive or negative) is ignored. The result is usually expressed as a percentage.
- Greater numerical values of relative error indicate decreased accuracy.
- The advantage of relative error can be seen in the next table.

RELATIVE ERROR

Analytical method            Observed value (mg/ml)    Expected value (mg/ml)    Absolute error (mg/ml)    Relative error (%)
HPLC with UV detection       2.51                      2.50                      +0.01                     0.40
UV spectroscopy              3.53                      2.50                      +1.03                     41.20
Fluorescence spectroscopy    2.19                      2.50                      -0.31                     12.40
                             0.19                      0.50                      -0.31                     62.00
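The absolute and relative errors in the table can be recomputed with the sketch below. The expected value of 2.50 mg/ml for the UV and fluorescence rows is inferred from the reported absolute errors, and the fourth row has no method name in the transcript, so it is labelled generically; function and variable names are illustrative.

```python
def absolute_error(observed, expected):
    """Signed difference between observed and expected values (O - E)."""
    return observed - expected


def relative_error_percent(observed, expected):
    """Unsigned error as a percentage of the expected value: |O - E| / E * 100."""
    return 100 * abs(observed - expected) / expected


# (label, observed mg/ml, expected mg/ml)
results = [
    ("HPLC with UV detection", 2.51, 2.50),
    ("UV spectroscopy", 3.53, 2.50),
    ("Fluorescence spectroscopy", 2.19, 2.50),
    ("Fourth row (method not named)", 0.19, 0.50),
]

for label, observed, expected in results:
    abs_err = absolute_error(observed, expected)
    rel_err = relative_error_percent(observed, expected)
    print(f"{label:32s} abs = {abs_err:+.2f} mg/ml   rel = {rel_err:.2f}%")
```

The last two rows have the same absolute error (−0.31 mg/ml) but very different relative errors, because the expected values differ by a factor of five.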

PRECISION
- Precision is a statistical term that describes the dispersion of a set of measurements.
- Unlike accuracy, it gives no indication of the closeness of an observation to a particular expected quantity.
- High precision is associated with low dispersion of values around a central value, i.e. a low SD.

PRECISION
Question: In a quality control laboratory the fill volumes of three samples of an antacid suspension have been measured and recorded as shown in the table below. Comment on the accuracy and precision of the volumes of the three samples.

PRECISION
Fill volumes of selected samples of an antacid formulation

                        Fill volume of sample A (ml)    Fill volume of sample B (ml)    Fill volume of sample C (ml)
                        47                              29                              28
                        48                              39                              26
                        49                              49                              27
                        50                              59                              30
                        51                              69                              30
Mean                    49.0                            49.0                            28.2
s                       1.6                             15.8                            1.80
Relative error of mean  2.0%                            2.0%                            43.6%

PRECISION
From the table:
- The relative errors associated with samples A and B are identical, and these samples are therefore considered to be equally accurate measurements of the true (expected) fill volume.
- Conversely, the accuracy of the mean fill volume of sample C is poor (43.6% relative error), and it is considered a poor representation of the true fill volume.

PRECISION
Considering the SDs associated with the data, we can evaluate the precision of the measurements.
- Sample A has a low SD (1.6) and a low CV (3.3%), hence low dispersion of the data around the mean: precise.
- Sample C has a low SD (1.8) and a low CV (6.4%), hence low dispersion of the data around the mean: precise.
- Sample B has a high SD (15.8) and a high CV (32.2%), hence high dispersion of the data around the mean: imprecise.
In summary, A exhibits high accuracy and high precision, B high accuracy and low precision, and C low accuracy and high precision.
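The accuracy and precision figures for the three samples can be recomputed as follows. This sketch rests on two assumptions: the expected fill volume is taken to be 50 ml (the value implied by the reported relative errors of 2.0% and 43.6%), and the two fill volumes missing from the transcript (the third value of B and the fifth value of C) are filled in so that the reported means and SDs are reproduced. Computed CVs differ slightly from the quoted ones because the quoted values use SDs rounded to one decimal place.

```python
from statistics import mean, stdev

EXPECTED_FILL_ML = 50.0  # assumption: implied by the reported relative errors

samples = {
    "A": [47, 48, 49, 50, 51],
    "B": [29, 39, 49, 59, 69],  # third value (49) reconstructed from mean and SD
    "C": [28, 26, 27, 30, 30],  # fifth value (30) reconstructed from mean and SD
}


def summarize(volumes, expected=EXPECTED_FILL_ML):
    """Return accuracy (relative error) and precision (SD, CV) measures."""
    x_bar = mean(volumes)
    s = stdev(volumes)  # sample SD, divisor N - 1
    return {
        "mean": x_bar,
        "sd": s,
        "cv_percent": 100 * s / x_bar,
        "rel_error_percent": 100 * abs(x_bar - expected) / expected,
    }


summary = {name: summarize(volumes) for name, volumes in samples.items()}

for name, row in summary.items():
    print(
        f"Sample {name}: mean = {row['mean']:.1f} ml, s = {row['sd']:.1f}, "
        f"CV = {row['cv_percent']:.1f}%, relative error = {row['rel_error_percent']:.1f}%"
    )
```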