Human Growth: From data to functions


1 Human Growth: From data to functions

2 Challenges to measuring growth
We need repeated and regular access to subjects for up to 20 years. Height changes over the day, and must be measured at a fixed time. Height is measured in supine position in infancy, followed by standing height. The change involves an adjustment of about 1 cm. Measurement error is about 0.5 cm in later years, but is rather larger in infancy.

3 Challenges to functional modeling
We want smooth curves that fit the data as well as is reasonable. We will want to look at velocity and acceleration, so we want to differentiate twice and still be smooth. In principle the curves should be monotone; i. e., have a positive derivative.

4 The monotonicity problem
The tibia of a newborn measured daily shows us that over the short term growth takes place in spurts. This baby’s tibia grows as fast as 2 mm/day! How can we fit a smooth monotone function?

5 Weighted sums of basis functions
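We need a flexible method for constructing curves to fit the data. We begin with a set of basic functional building blocks φk(t), called basis functions. Our fitting function x(t) is a weighted sum of these: $x(t) = \sum_{k=1}^{K} c_k \phi_k(t)$, where the $c_k$ are the weights and $K$ is the number of basis functions.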

6 What are the main choices for basis functions?
Fourier series: a constant term, followed by a sine/cosine pair at a fixed base frequency, and then a series of sine/cosine pairs at integer multiples of the base frequency. Fourier series are best for periodic data.
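As a minimal sketch of the weighted-sum idea with a Fourier basis, the base-Matlab lines below build five basis functions (a constant plus two sine/cosine pairs) and combine them. The period T and the weights c are hypothetical values chosen only for illustration; they do not come from the growth data.

```matlab
% Sketch: five Fourier basis functions on [0, T] and a weighted sum of them.
T     = 1;                          % period (hypothetical)
omega = 2*pi/T;                     % base frequency
t     = linspace(0, T, 101)';       % evaluation points (column vector)

% Columns: constant, then sine/cosine pairs at omega and 2*omega.
Phi = [ones(size(t)), sin(omega*t), cos(omega*t), ...
       sin(2*omega*t), cos(2*omega*t)];

c = [1; 0.5; -0.3; 0.2; 0.1];       % hypothetical weights, one per basis function
x = Phi * c;                        % x(t) = sum over k of c(k) * phi_k(t)

plot(t, x);
xlabel('t'); ylabel('x(t)');
```

Because every column of Phi repeats with period T, any weighted sum of them is also periodic, which is why this basis suits periodic data.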

7 Five Fourier basis functions

8 B-splines
These basis functions are piecewise polynomials defined by a set of discrete values called knots. The order of the polynomials (degree + 1) controls their smoothness. Each basis function is nonzero only over a number of contiguous inter-knot intervals equal to the order. Polynomials are a special type of B-spline, and are thus included within the system.

9 When should I use B-splines?
B-splines are the basis of choice for most non-periodic data. They give complete control over flexibility, allowing more flexibility where needed and less where not needed. Computing with B-splines is extremely efficient.

10 Five order 2 B-spline basis functions: A basis for polygonal lines

11 Eight order 4 B-spline basis functions: A basis for twice-differentiable functions

12 B-splines for growth data
We use order 6 B-splines because we want to differentiate the result at least twice. Order 4 splines look smooth, but their second derivatives are rough. We place a knot at each of the 31 ages, which gives 29 interior knots plus the two end points. The total number of basis functions = order + number of interior knots, or 6 + 29 = 35 in this case.
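The basis count can be checked with a short sketch that evaluates an order 6 B-spline basis in base Matlab using the Cox-de Boor recursion. This is not the toolbox's own implementation, and the 31 equally spaced ages are a hypothetical stand-in for the actual measurement ages.

```matlab
% Sketch: order-6 B-spline basis by the Cox-de Boor recursion (base Matlab only).
norder = 6;                               % order = degree + 1
breaks = linspace(1, 18, 31);             % hypothetical ages: 29 interior + 2 end points

% Repeat each end point so that it appears norder times in the knot vector.
knots  = [repmat(breaks(1), 1, norder-1), breaks, ...
          repmat(breaks(end), 1, norder-1)];
nbasis = numel(knots) - norder;           % 41 - 6 = 35 = norder + 29 interior knots

t = linspace(1, 18, 201)';                % evaluation points

% Order 1: indicator function of each half-open knot interval.
B = zeros(numel(t), numel(knots) - 1);
for j = 1:numel(knots) - 1
    B(:, j) = (t >= knots(j)) & (t < knots(j+1));
end
B(t == knots(end), nbasis) = 1;           % assign the right end point to the last nonempty interval

% Raise the order one step at a time.
for k = 2:norder
    Bnew = zeros(numel(t), numel(knots) - k);
    for j = 1:numel(knots) - k
        d1 = knots(j+k-1) - knots(j);
        d2 = knots(j+k)   - knots(j+1);
        if d1 > 0
            Bnew(:, j) = Bnew(:, j) + (t - knots(j)) / d1 .* B(:, j);
        end
        if d2 > 0
            Bnew(:, j) = Bnew(:, j) + (knots(j+k) - t) / d2 .* B(:, j+1);
        end
    end
    B = Bnew;
end

disp(size(B, 2));                         % number of basis functions
disp(max(abs(sum(B, 2) - 1)));            % rows should sum to 1 up to rounding
```

The final matrix has one column per basis function, and the row sums illustrate the partition-of-unity property of B-splines.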

13 Isn’t using 35 basis functions to fit 31 observations a problem?
Yes. We will fit each observation exactly. This will ignore the fact that the measurement error is typically about 0.5 cm. But we’ll fix this up later, when we look at roughness penalties.

14 Okay, let’s see what happens
These two Matlab commands define the basis and fit the data:
hgtbasis = create_bspline_basis([1,18], 35, 6, age);
hgtfd = data2fd(hgtfmat, age, hgtbasis);

15 Why we need to smooth
Noise in the data has a huge impact on derivative estimates.

16 Please let me smooth the data!
This command sets up 12 B-spline basis functions defined by equally spaced knots. This gives us about the right amount of fitting power given the error level.
hgtbasis = create_bspline_basis([1,18], 12, 6);

17 These velocities are much better.
They go negative on the right, though.

18 Let’s see some accelerations
These acceleration curves are too unstable at the ends. We need something better.

19 A measure of roughness
What do we mean by “smooth”? A function that is smooth has limited curvature. Curvature depends on the second derivative. A straight line is completely smooth.

20 Total curvature
We can measure the roughness of a function x(t) by integrating its squared second derivative: $\int [D^2 x(t)]^2 \, dt$. The second derivative notation is D2x(t).
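A minimal numerical sketch of this roughness measure, assuming the test function x(t) = sin(t) on [0, 2π] (a hypothetical example, not growth data). Its second derivative is -sin(t), so the exact value of the integral is π.

```matlab
% Sketch: total curvature of x(t) = sin(t) on [0, 2*pi] by the trapezoidal rule.
t     = linspace(0, 2*pi, 2001);
D2x   = -sin(t);                  % exact second derivative of sin(t)
rough = trapz(t, D2x.^2);         % approximates the integral of [D^2 x(t)]^2
fprintf('roughness = %.6f, exact value = %.6f\n', rough, pi);

% A straight line has a zero second derivative, hence zero roughness.
line_rough = trapz(t, zeros(size(t)).^2);
```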

21 Total curvature of acceleration
Since we want acceleration to be smooth, we measure roughness at the level of acceleration, that is, by the squared second derivative of D2x(t): $\int [D^4 x(t)]^2 \, dt$.

22 The penalized least squares criterion
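We strike a compromise between fitting the data and keeping the fit smooth. With observations $y_j$ at ages $t_j$, we minimize the residual sum of squares plus λ times the roughness: $\sum_j [y_j - x(t_j)]^2 + \lambda \int [D^4 x(t)]^2 \, dt$.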

23 How does this control roughness?
Smoothing parameter λ controls roughness. When λ = 0, only fitting the data matters. But as λ increases, we place more and more emphasis on penalizing roughness. As λ → ∞, only roughness matters, and functions having zero roughness are used.
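The effect of λ can be illustrated with a discrete analogue of the penalized criterion. This sketch swaps the integrated fourth-derivative penalty for a sum of squared fourth differences of the fitted values, which is a simplification and not what the toolbox does. The ages and "heights" are synthetic, generated only for the demonstration.

```matlab
% Sketch: discrete analogue of penalized least squares (fourth-difference penalty).
rng(0);                                 % reproducible noise
n = 31;
t = linspace(1, 18, n)';                % hypothetical equally spaced ages
y = 80 + 5*t + 0.5*randn(n, 1);         % synthetic data with 0.5 cm noise (not real heights)

D = diff(eye(n), 4);                    % fourth-difference operator, (n-4) x n

for lambda = [0, 1, 1e6]
    % Minimizes sum((y - x).^2) + lambda * sum((D*x).^2) over x.
    xhat = (eye(n) + lambda * (D' * D)) \ y;
    fprintf('lambda = %g: SSE = %.3f, roughness = %.3g\n', ...
        lambda, sum((y - xhat).^2), sum((D * xhat).^2));
end
```

With λ = 0 the fit reproduces the data exactly, and as λ grows the residual sum of squares rises while the roughness term shrinks toward zero.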

24 We can either smooth at the data fitting step, or smooth a rough function.
These Matlab commands smooth the fit to the data obtained using a knot at each age. The roughness of the fourth derivative is controlled.
lambda = 0.01;
hgtfd = smooth_fd(hgtfd, lambda, 4);

25 Accelerations using a roughness penalty
These accelerations are much less variable at the extremes.

26 The corresponding velocities

27 How did you choose λ?
This inevitably involves judgment. We smooth just enough to obtain tolerable roughness in the estimated curves (accelerations in this case), but not so much as to lose interesting variation. There are data-driven methods for choosing λ, but they offer only a reasonable place to begin exploring.

28 What about monotonicity?
The growth curves should be monotonic. The velocities should be non-negative. It’s hard to prevent linear combinations of anything from breaking the rules. We need an indirect approach to constructing a monotonic model.

29 A differential equation for monotonicity
Any strictly monotonic function x(t) must satisfy a simple linear differential equation: $D^2 x(t) = w(t)\, Dx(t)$. The reason is simple: because of strict monotonicity, the first derivative Dx(t) will never be 0, and function w(t) is therefore simply D2x(t)/Dx(t).
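A short worked derivation may help here. It assumes x(t) is increasing and twice differentiable, and it introduces $t_0$ as the lower limit of the observation interval (a symbol not used on the slide). Dividing the equation by $Dx(t)$ turns it into a statement about the logarithm of the velocity:

$$D^2 x(t) = w(t)\, Dx(t) \;\Longrightarrow\; D\big[\log Dx(t)\big] = w(t) \;\Longrightarrow\; Dx(t) = \beta_1 \exp\!\left( \int_{t_0}^{t} w(v)\, dv \right), \qquad \beta_1 = Dx(t_0) > 0.$$

The exponential is always positive, so the velocity can never change sign whatever w(t) is. Integrating once more recovers x(t) up to an additive constant.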

30 The solution of the differential equation
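Consequently, any strictly monotonic function x(t) must be expressible in the form $x(t) = \beta_0 + \beta_1 \int_{t_0}^{t} \exp\!\left[ \int_{t_0}^{u} w(v)\, dv \right] du$, where $t_0$ is the lower limit of the interval. This suggests that we transform the monotone smoothing problem into one of estimating function w(t), and constants β0 and β1.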
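As a rough numerical sketch of this construction, the lines below build x(t) from an arbitrary w(t) by integrating twice with the trapezoidal rule. The particular w(t), β0 and β1 are hypothetical; the point is only that the result increases regardless of the sign of w(t), provided β1 > 0.

```matlab
% Sketch: build a strictly increasing function from an unconstrained w(t).
t     = linspace(1, 18, 500)';
w     = 0.3*sin(t) - 0.1;          % hypothetical w(t); it takes both signs
beta0 = 80;                        % hypothetical starting value
beta1 = 10;                        % hypothetical positive scale factor

W  = cumtrapz(t, w);               % inner integral of w from t(1) to t
Dx = beta1 * exp(W);               % velocity: positive because exp(.) > 0
x  = beta0 + cumtrapz(t, Dx);      % outer integral gives x(t)

assert(all(Dx > 0));               % velocity never goes negative
assert(all(diff(x) > 0));          % x increases along the grid
plot(t, x);
```

Estimating w(t) without constraints and then transforming in this way is what makes the approach indirect: monotonicity is built into the form of x(t) rather than imposed on the coefficients.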

31 What we have learned
B-spline bases are a good choice for fitting non-periodic functions; Fourier series are right for periodic situations. We can control smoothness by either using a restricted number of basis functions, or by imposing a roughness penalty. Roughness penalty methods generally work better. Differential equations can play a useful role when fitting constrained functions to data.

32 More information
Ramsay & Silverman (1997), Chs. 3, 4, 13. The long-term growth data are from the Berkeley growth study. The infant growth data were collected by Michael Hermanussen.

