Horoscope Classification Using Support Vector Machines Part of the Home Health Horoscopes Project Joseph ‘Jofish’ Kaye INFO December 2004
Full HHH Project For this project, I’m looking at creating a sorted horoscope database for use in horoscope generation.
Background Home Health Horoscopes: –collaboration with Phoebe Sengers (Cornell) and Bill Gaver (Royal College of Art, London)
The problem with Ubiquitous & Context Aware Computing Current technology rhetoric around ubiquitous computing posits a legion of smart sensors deriving your every activity in its full complexity as a function of their combined outputs. The reality is that current sensor and artificial intelligence systems cannot come close to capturing the complexity of human action, even in a limited domain, such as the home.
HHH: A probe for discussion By using horoscopes as the final output of a ubiquitous computing system in a home, we are able to allow for human complexities and recognize the ambiguity inherent in the contexts derived from our sensor inputs.
Previous work: Psychology Forer 1949: The Fallacy of Personal Validation: A Classroom Demonstration of Gullibility. “Security is one of your major goals in life.” “Some of your aspirations tend to be pretty unrealistic.” Snyder 1974: Why Horoscopes Are True: The Effects Of Specificity On Acceptance Of Astrological Interpretations Similar statements to above rated 3.24/5 if told the statements were "generally true of people", 3.76/5 if they were based on the subject's year and month of birth and 4.38/5 if based on year, month and day of birth.
Previous work: Critical Theory Adorno 1953: The Stars Down to Earth –Horoscopes reinforce existing social structures –Pseudo-individualization: “Display that keen mind of yours”, “Follow up on that intuition of yours”. –Vice presidential level –“Bi-phasic approach”: “The problem with how to dispense with the contradictory requirements of life is solved by the simple device of distributing these requirements over different periods, mostly of the same day.”
Raw Materials 45,000 unlabelled horoscopes screen-scraped from the web, split into 180,000 sentences. –Average sentence (“document”) length 11.9 words. 900 labeled horoscopes in six categories scraped from a fourth website, split into 1500 sentences. –Average sentence (“document”) length 9.3 words. Six categories: career, economy, health, luck, love, and relations.
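As a rough illustration of treating each sentence as a "document", the sketch below splits a horoscope into sentences and turns each one into a sparse bag-of-words vector of the kind a linear classifier could consume. The splitting rule, the tokenizer, and all function names are assumptions made for this example; the slides do not say how the original corpus was actually processed.

```python
import re
from collections import Counter

def split_sentences(horoscope):
    """Naive split after ., ! or ? followed by whitespace (an assumed rule)."""
    parts = re.split(r"(?<=[.!?])\s+", horoscope.strip())
    return [p for p in parts if p]

def tokenize(sentence):
    """Lowercase the sentence and keep runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", sentence.lower())

def build_vocab(sentences):
    """Assign each distinct word an integer feature index, in first-seen order."""
    vocab = {}
    for s in sentences:
        for w in tokenize(s):
            vocab.setdefault(w, len(vocab))
    return vocab

def to_sparse_vector(sentence, vocab):
    """Map a sentence to {feature index: word count}, ignoring unknown words."""
    counts = Counter(tokenize(sentence))
    return {vocab[w]: c for w, c in counts.items() if w in vocab}

# Three sentences taken from the sample slides, joined into one "horoscope".
horoscope = "Luck is with you. Don't push your luck! Work on yourself."
sentences = split_sentences(horoscope)
vocab = build_vocab(sentences)
vectors = [to_sparse_vector(s, vocab) for s in sentences]
mean_len = sum(len(tokenize(s)) for s in sentences) / len(sentences)

print(len(sentences), "sentences,", len(vocab), "distinct words")
print("average sentence length:", round(mean_len, 2), "words")
```

With sentences averaging around ten words, each vector has only a handful of non-zero entries, which is why a sparse dictionary is used here rather than a dense list.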
Progressive ratios of test:train to determine accuracy
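One reading of this slide is that the labeled sentences were split at a series of train:test ratios, with accuracy measured on the held-out part each time. The sketch below shows that procedure under stated assumptions: the toy data, the chosen fractions, and the small Pegasos-style linear SVM trainer are all stand-ins for this example, not the tool or settings used in the project.

```python
import random

def dot(w, x):
    """Dot product of a dense weight list and a sparse {index: value} vector."""
    return sum(w[j] * v for j, v in x.items())

def train_linear_svm(X, y, dim, lam=0.01, epochs=20, seed=0):
    """Pegasos-style stochastic subgradient descent on the hinge loss (no bias term)."""
    rng = random.Random(seed)
    w = [0.0] * dim
    t = 0
    for _ in range(epochs):
        order = list(range(len(X)))
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)
            margin = y[i] * dot(w, X[i])
            w = [(1.0 - eta * lam) * wj for wj in w]  # regularization shrink
            if margin < 1.0:                          # hinge loss is active
                for j, v in X[i].items():
                    w[j] += eta * y[i] * v
    return w

def accuracy(w, X, y):
    """Fraction of examples whose predicted sign matches the label."""
    hits = sum(1 for x, label in zip(X, y)
               if (1 if dot(w, x) > 0 else -1) == label)
    return hits / len(y)

def progressive_accuracy(X, y, dim, train_fractions, seed=0):
    """Train on a growing share of the shuffled data and test on the remainder."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    results = []
    for frac in train_fractions:
        n_train = max(1, min(len(idx) - 1, round(frac * len(idx))))
        train, test = idx[:n_train], idx[n_train:]
        w = train_linear_svm([X[i] for i in train], [y[i] for i in train], dim)
        acc = accuracy(w, [X[i] for i in test], [y[i] for i in test])
        results.append((frac, acc))
    return results

# Toy data (hypothetical): feature 0 marks the positive class, feature 1 the
# negative class, and feature 2 appears in every example.
X = [{0: 1.0, 2: 1.0}] * 20 + [{1: 1.0, 2: 1.0}] * 20
y = [1] * 20 + [-1] * 20
fractions = [0.1, 0.25, 0.5, 0.75, 0.9]
results = progressive_accuracy(X, y, dim=3, train_fractions=fractions)
for frac, acc in results:
    print(f"train fraction {frac:.2f}: test accuracy {acc:.3f}")
```

Plotting accuracy against the training fraction gives a learning curve, which indicates whether 1500 labeled sentences are enough or whether more labels would still help.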
Use models to label corpus Modify b parameter to maximize accuracy Check by using 100% test data to check 100% training data Apply to entire unlabelled corpus to make list of sentences that are ~certain of being about luck
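The threshold step above can be sketched as a simple sweep: given a decision score for each labeled sentence, try candidate values of b and keep the one with the highest accuracy, then apply it to unlabeled sentences. This assumes the classifier decides by the sign of (score - b); the scores below are invented for illustration and the function names are hypothetical.

```python
def accuracy_at(scores, labels, b):
    """Accuracy when a sentence is labeled +1 if score - b > 0, else -1."""
    hits = sum(1 for s, y in zip(scores, labels)
               if (1 if s - b > 0 else -1) == y)
    return hits / len(labels)

def best_threshold(scores, labels):
    """Try cuts between consecutive distinct scores, plus one below and one above."""
    vals = sorted(set(scores))
    cuts = [vals[0] - 1.0]
    cuts += [(lo + hi) / 2 for lo, hi in zip(vals, vals[1:])]
    cuts += [vals[-1] + 1.0]
    return max(cuts, key=lambda b: accuracy_at(scores, labels, b))

def confident_positives(sentences, scores, b):
    """Keep only the sentences whose score clears the tuned threshold."""
    return [s for s, sc in zip(sentences, scores) if sc - b > 0]

# Invented scores for six labeled sentences (+1 = luck, -1 = not luck).
labeled_scores = [1.2, 0.9, 0.5, 0.3, -0.2, -0.8]
labels = [1, 1, 1, -1, -1, -1]
b = best_threshold(labeled_scores, labels)
best_acc = accuracy_at(labeled_scores, labels, b)

# Invented scores for three unlabeled sentences from the sample slides.
unlabeled = ["Luck is with you", "Work on yourself", "Don't push your luck"]
unlabeled_scores = [1.1, -0.6, 0.7]
luck_sentences = confident_positives(unlabeled, unlabeled_scores, b)

print("best b:", round(b, 2), "accuracy:", best_acc)
print(luck_sentences)
```

Raising b above the accuracy-maximizing value would trade recall for precision, which may be the better choice when the goal is a list of sentences that are almost certainly about luck.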
Sample: Luck class original sentences Apply luck model: –562 support vectors –Maximum accuracy at b=0.44 for 96.9% sentences rated negatively in luck class 6394 sentences that are definitely about luck
Distinctly luck-class sentences: Top ten sentences most likely to be about luck:
–Good luck
–Luck is with you
–Lady luck is with you
–You'll seek fortune through investments and a little good luck
–Don't push your luck
–An apparent stroke of fortune is really the culmination of good luck and hard labor
–A potentially luck filled day
–Take advantage of your good luck
–Go ahead...press your luck button
–Luck should swing your way today
–A sense of impending good fortune motivates you today, so press your luck to the limit.
Definitively not about luck Top ten sentences most likely to be not about luck:
–Love and romance will be on your mind
–Then find someone else and do it up right
–Look into making changes that will help you look and feel better about yourself
–Adopting a positive attitude will help you relax and enjoy better health
–Your mind is saying work work work but your heart is saying fun excitement and romance - watch which one you listen to
–Your focus should be on your work and what you can do to make it better
–Work for a cause other than yourself
–It will all work out
–Work on yourself
–Support will work better than criticism
–New and unusual work methods will make it difficult for you to spend quality time at home.
Proposal Goal I: Determine if it is feasible to represent horoscopes or portions thereof in a support vector machine. Goal II: Determine if using latent semantic indexing helps in achieving Goal I. Goal III: Determine if it is possible to generate horoscopes from fragments (paragraphs or sentences, depending on the inputs) of horoscopes in an SVM; else determine if it is possible to pick the appropriate horoscope from a set.
Project Goal I: Determine if it is feasible to represent horoscopes or portions thereof in a support vector machine. Done. Goal II: Determine if using latent semantic indexing helps in achieving Goal I. Not necessary! Goal III: Determine if it is possible to generate horoscopes from fragments (paragraphs or sentences, depending on the inputs) of horoscopes in an SVM; else determine if it is possible to pick the appropriate horoscope from a set. 2nd part done; 1st part still possible.
Future work Combining horoscope sentences to make realistic horoscopes Orthogonal, attitudinal categories: –Good luck / bad luck –Increasing / decreasing –etc. Hooking up to sensors and putting into people’s houses…