Modeling Inter-Login Time on the Internet By Rahul Telang
Road Map
- Motivation
- Previous Literature
- Model Development
- Data
- Results
- Conclusion and Future Direction
Motivation
Understanding consumer behavior on the Internet is of interest to:
- Social scientists
- E-tailers and electronic commerce managers
- Advertisers
Previous Literature
- Homenet Field Trial (Kraut et al. 1996)
- Information and communication (Kraut et al. 1999)
- Internet and Web use in USA (Hoffman, Kalsbeek, Novak 1996)
All of these models try to predict web use (time spent) with a linear regression model.
[Figure: frequency distribution of login intervals. Axes: Login Interval (12 hour), Freq.]
Previous Literature
- The Poisson model is very common in the economics literature (Griliches 1984; Cameron and Trivedi 1986).
- The exponential distribution is also used extensively in the marketing literature to model inter-purchase or inter-shopping time (Ehrenberg 1959, 1972; Morrison, Schmittlein 1981, 1988; Zufryden 1977, 1978; Gupta 1991).
Model
Start with a simple exponential model for inter-login times. The likelihood for N consumers is written in terms of a single login rate λ, since the population is assumed homogeneous, and T_i, the sum of the n_i inter-login times for user i.
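As a minimal sketch of the homogeneous model, assume every inter-login time is exponential with a common rate λ. The standard log-likelihood is then log L = Σ_i (n_i ln λ − λ T_i), where T_i is the sum of user i's n_i inter-login times, and it is maximized at λ = Σ_i n_i / Σ_i T_i. The function names and toy data below are illustrative assumptions, not taken from the Homenet study.

```python
import math

def exp_log_likelihood(lam, times_by_user):
    """Log-likelihood of one common exponential rate `lam`.

    times_by_user[i] is the list of inter-login times for user i
    (a hypothetical toy layout, not the Homenet data format).
    """
    ll = 0.0
    for times in times_by_user:
        n_i = len(times)   # number of inter-login times for user i
        T_i = sum(times)   # sum of those times
        ll += n_i * math.log(lam) - lam * T_i
    return ll

def exp_mle(times_by_user):
    """Closed-form maximum-likelihood estimate: total count / total time."""
    total_n = sum(len(times) for times in times_by_user)
    total_T = sum(sum(times) for times in times_by_user)
    return total_n / total_T

# Toy data: two users with 3 and 2 inter-login times (made-up values).
toy = [[1.0, 2.0, 3.0], [0.5, 1.5]]
lam_hat = exp_mle(toy)
```

The closed form makes this model cheap to fit, but it forces every user to share one login rate.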
Model
In reality we expect users to differ. We can capture this variation by letting λ vary across users according to a gamma distribution with parameters r and α. The likelihood for any one user is then obtained by integrating over this distribution, and the log-likelihood for N users is the sum of the individual log-likelihoods.
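The exponential/gamma mixture can be sketched as follows, assuming r is the gamma shape parameter and α the rate parameter (this parameterization is an assumption). Integrating λ out gives the standard per-user likelihood Γ(r + n_i) / Γ(r) · α^r / (α + T_i)^(r + n_i), where n_i and T_i are user i's number of inter-login times and their sum. The function name and input layout are illustrative.

```python
import math

def exp_gamma_log_likelihood(r, alpha, summaries):
    """Log-likelihood of the exponential/gamma mixture.

    summaries is a list of (n_i, T_i) pairs, one per user: the number of
    inter-login times and their sum (hypothetical input layout).
    Assumes shape r and rate alpha for the gamma mixing distribution.
    """
    ll = 0.0
    for n_i, T_i in summaries:
        ll += (math.lgamma(r + n_i) - math.lgamma(r)
               + r * math.log(alpha)
               - (r + n_i) * math.log(alpha + T_i))
    return ll

# One user with a single inter-login time of 1.0 (made-up value).
ll_one = exp_gamma_log_likelihood(2.0, 3.0, [(1, 1.0)])
```

Working with `math.lgamma` rather than the gamma function itself avoids overflow when a user has many observations.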
Model
So far the model is static, with no explanatory variables. We can include time-varying covariates (Gupta 1991) by using a non-homogeneous Poisson process.
Variables of interest:
- H1, time spent on the Web (consumption theory?): more time spent on the web at time t might delay the next login.
- H2, experience: as time progresses, inter-login time increases.
- Demographics: not time-varying, but different across users.
Model
The login rate is a function of the explanatory variables X, with λ0 the base login rate. [Interpretation of β: as X increases, λ increases for positive β, and hence inter-login time decreases.] From this rate we obtain the log-likelihood for N consumers.
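One common way to write such a rate, assumed here for illustration, is the log-linear form λ = λ0 · exp(β·x), with covariates held constant within each inter-login interval. Under that assumption the log-likelihood is Σ [ln λ0 + β·x − λ0 · exp(β·x) · t] over all observed intervals. The function names and toy observations are hypothetical.

```python
import math

def login_rate(lam0, beta, x):
    """Login rate under the assumed log-linear form lam0 * exp(beta . x)."""
    return lam0 * math.exp(sum(b * v for b, v in zip(beta, x)))

def covariate_log_likelihood(lam0, beta, observations):
    """Log-likelihood for exponential inter-login times with covariates.

    observations is a list of (t, x) pairs pooled over users: t is an
    inter-login time and x the covariate values in effect during that
    interval (hypothetical layout; covariates assumed constant within it).
    """
    ll = 0.0
    for t, x in observations:
        rate = login_rate(lam0, beta, x)
        ll += math.log(rate) - rate * t
    return ll

# Toy observations with a single covariate (made-up values).
obs = [(1.0, [1.0]), (2.0, [0.5])]
ll_no_effect = covariate_log_likelihood(0.5, [0.0], obs)
```

With β = 0 the expression collapses to the plain exponential log-likelihood, and a positive β raises the rate as x grows, which matches the interpretation of β given on the slide.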
Model
Heterogeneity can again be incorporated by letting λ follow a gamma distribution across users, which gives the corresponding likelihood with covariates.
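A sketch of the covariate model with gamma heterogeneity follows. It assumes the base rate λ0 is the quantity that varies across users (gamma with shape r and rate α) and that the rate is log-linear in the covariates. Both are assumptions made for illustration. Integrating λ0 out gives, for user i, Γ(r + n_i) / Γ(r) · α^r · Π_j exp(β·x_ij) / (α + Σ_j exp(β·x_ij) t_ij)^(r + n_i).

```python
import math

def user_log_likelihood(r, alpha, beta, observations):
    """Log-likelihood contribution of one user.

    observations is a list of (t, x) pairs for that user: inter-login time
    and covariate values (hypothetical layout). Assumes a gamma(r, alpha)
    base rate and a log-linear covariate effect.
    """
    n_i = len(observations)
    sum_xb = 0.0       # sum over intervals of beta . x
    scaled_time = 0.0  # sum over intervals of exp(beta . x) * t
    for t, x in observations:
        xb = sum(b * v for b, v in zip(beta, x))
        sum_xb += xb
        scaled_time += math.exp(xb) * t
    return (math.lgamma(r + n_i) - math.lgamma(r)
            + r * math.log(alpha) + sum_xb
            - (r + n_i) * math.log(alpha + scaled_time))

def total_log_likelihood(r, alpha, beta, users):
    """Sum of per-user contributions; users is a list of observation lists."""
    return sum(user_log_likelihood(r, alpha, beta, obs) for obs in users)

# One user, one interval of length 1.0, one covariate equal to 1.0 (made up).
ll_cov = user_log_likelihood(2.0, 3.0, [math.log(2.0)], [(1.0, [1.0])])
```

Setting β to zero recovers the exponential/gamma likelihood without covariates, which is a quick check on an implementation.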
Data
- The data come from the Homenet study being conducted at CMU, which captures users' navigation patterns over a long period of time.
- Because the data come from the server, they are very accurate and unbiased.
- Total sample of 143 users for 1 year, from May 15th, 1998 to May 15th of the following year.
- The first 40 weeks of data are used for calibration; the remaining 12 weeks serve as a hold-out sample to test the predictive power of the model.
Data
- Session: to measure inter-login (IL) time correctly, we follow the rule that a new session does not begin until there has been a delay of 30 minutes.
- We delete users who do not have at least 10 observations.
- Demographic information is collected from a questionnaire. (We include gender, generation, education, race, marital status, and household income in our model.)
- The total sample consists of 121 users over 40 weeks, leading to distinct data points.
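The session rule and the minimum-observation filter can be sketched as below. The sketch assumes that a gap of at least 30 minutes between consecutive logged requests starts a new session, and that inter-login time is measured between successive session starts. The slides do not spell out either detail, so both are assumptions, and all names and timestamps are hypothetical.

```python
from datetime import datetime, timedelta

SESSION_GAP = timedelta(minutes=30)  # assumed threshold: gap >= 30 minutes
MIN_OBSERVATIONS = 10                # users with fewer observations are dropped

def session_starts(timestamps, gap=SESSION_GAP):
    """Return the start time of each session for one user."""
    starts = []
    previous = None
    for ts in sorted(timestamps):
        # A new session begins at the first request, or after a long gap.
        if previous is None or ts - previous >= gap:
            starts.append(ts)
        previous = ts
    return starts

def inter_login_hours(starts):
    """Hours between successive session starts (assumed definition of IL time)."""
    return [(b - a).total_seconds() / 3600.0 for a, b in zip(starts, starts[1:])]

def keep_user(intervals, minimum=MIN_OBSERVATIONS):
    """True if the user has at least `minimum` inter-login observations."""
    return len(intervals) >= minimum

# Toy request log for one user (made-up timestamps).
base = datetime(1998, 5, 15, 9, 0)
hits = [base,
        base + timedelta(minutes=10),   # same session (10-minute gap)
        base + timedelta(minutes=45),   # new session (35-minute gap)
        base + timedelta(hours=3)]      # new session (135-minute gap)
starts = session_starts(hits)
intervals = inter_login_hours(starts)
```

Measuring from session start to session start keeps the intervals independent of session length. Measuring from the end of one session to the start of the next is an equally plausible reading of the rule.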
Results
Parameter        Exponential       Exp/Gamma
λ                0.3321 (120.4)
r                                  2.12 (17.6)
α                                  7.17 (20.54)
LL (n=14503)
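As a rough reading of these estimates, the snippet below compares the single exponential rate with the mean rate implied by the gamma mixing distribution. It assumes r is the gamma shape and α the gamma rate parameter, so that the mean login rate is r / α and, for r > 1, the mean inter-login time is α / (r − 1). The time unit is whatever unit the estimation used, which these slides do not state.

```python
# Point estimates copied from the results table above.
lam_exponential = 0.3321
r_hat = 2.12
alpha_hat = 7.17

# Homogeneous model: mean inter-login time is 1 / lambda.
mean_time_exponential = 1.0 / lam_exponential

# Exponential/gamma model, assuming shape r and rate alpha:
mean_rate_mixture = r_hat / alpha_hat            # E[lambda]
mean_time_mixture = alpha_hat / (r_hat - 1.0)    # E[1 / lambda], valid for r > 1

print(f"exponential: rate={lam_exponential:.4f}, mean time={mean_time_exponential:.2f}")
print(f"exp/gamma:   mean rate={mean_rate_mixture:.4f}, mean time={mean_time_mixture:.2f}")
```

Under these assumptions the two models imply similar average login rates, but the mixture implies a longer average inter-login time, because users with low rates contribute very long intervals.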
Results
Parameter        Exponential       Exp/Gamma
λ                0.149 (10.86)
r                                  1.41 (15.27)
α                                  6.45 (12.2)
Time spent       (-4.35)           -0.07 (-3.5)
Sessions         0.05 (82.3)       -0.05 (-28.7)
Gender           0.11 (9.92)       0.23 (2.17)
Generation       0.15 (17.13)      0.20 (2.5)
Education        -0.06 (-34.9)     -0.1 (-5.7)
Race             0.14 (16.5)       0.10 (1.27)
Marital Status   0.09 (24.7)       0.14 (4.26)
Income           0.013 (8.65)      0.11 (7.7)
LL (n=14451)
Conclusion and Future Work
- Both hypotheses are supported.
- Demographics are highly significant.
- Comparison with linear regression shows a much better fit.
- We still need to test the results with the hold-out sample.