The Endless Gallery: Visualizations of Author Data Howard D. White Xia Lin Jan Buzydlowski College of Information Science and Technology Drexel University.


1 The Endless Gallery: Visualizations of Author Data Howard D. White Xia Lin Jan Buzydlowski College of Information Science and Technology Drexel University Philadelphia, PA 19104

2 Authors’ names have a double sense
• They can designate:
  - Persons (living or dead)
  - Oeuvres
• Persons or oeuvres can be mapped if they can be related by some metric.
• Data gathered on persons can be linked to oeuvres, and vice versa.

3 What are author maps good for?
• Provide intellectual overviews of specialties.
• Assist in retrieval of documents.
• Suggest networks for sociometric analysis.

4 Inputs to author maps
• Multiple-author input
  - Use author names from a book or other publication
    Example: Deb Stagg used Diana Crane’s list of New York artists
    Example: White & McCain used 120 top-cited information scientists
    Could use names in one of Randall Collins’s diagrams
  - Use a judgment sample (own knowledge, advisors’, etc.)
    Example: Hinda Greenberg’s sample of 88 literary theorists
  - Use author names from an organization’s membership roll
    Example: Howard White used the Nazer-Wellman list for “Globenet”
• Single-author input
  - AuthorLink

5 Collins, Randall. 1998. The Sociology of Philosophies: A Global Theory of Intellectual Change. Cambridge, MA: Belknap Press of Harvard University Press.

6 Two Author Co-citation Analyses in the Humanities from Drexel
• Stagg, Deborah B. 1997. Art world maps: A quantitative sociology of contemporary American art. PhD dissertation. Drexel University.
• Greenberg, Hinda F. 1999. Spanning boundaries: An interdisciplinary citation study based on literary-studies author co-citation clusters. PhD dissertation. Drexel University.

7 Hinda Greenberg’s 88 literary theorists as PFNET

8

9 Co-Citation Analysis
Co-citation is the mention of any two earlier documents in the bibliographic references of a third, later document. The count of mentions may grow over time as new writings appear. Thus, co-citation counts can reflect citers’ changing perceptions of documents as more or less strongly related. Documents shown to be related by their co-citation counts can be mapped as proximate in intellectual space.
[Diagram: Doc 3 co-cites Docs 1 and 2.]

10 Co-Citation Analysis
• Document co-citation counts the times two papers are cited together.
• Author co-citation counts the times two authors, e.g., Lin and Chen, are cited together.
• Journal co-citation counts the times two journals are cited together.
Example papers:
  Lin, Xia. 1997. Map Displays for Information Retrieval. Journal of the American Society for Information Science 48: 40-54.
  Chen, Chaomei. 1998. Bridging the Gap: The Use of Pathfinder Networks in Visual Navigation. Journal of Visual Languages and Computing 9: 267-286.
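
The author-level count described on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the presenters' software: the function name `author_cocitation_counts` and the sample reference lists are hypothetical, and it assumes each citing document contributes at most 1 to a pair's count no matter how many works by each author it cites.

```python
from collections import Counter
from itertools import combinations


def author_cocitation_counts(citing_docs):
    """Count, for each unordered author pair, how many citing documents cite both."""
    counts = Counter()
    for cited_authors in citing_docs:
        # De-duplicate so an author counts once per citing document,
        # and sort so each pair has one canonical (alphabetical) key.
        for pair in combinations(sorted(set(cited_authors)), 2):
            counts[pair] += 1
    return counts


# Hypothetical reference lists: each inner list holds the cited authors
# found in one later, citing document.
citing_docs = [
    ["Lin X", "Chen C", "Salton G"],
    ["Lin X", "Chen C"],
    ["Chen C", "Salton G"],
]

counts = author_cocitation_counts(citing_docs)
print(counts[("Chen C", "Lin X")])  # 2
```

The same function would give document or journal co-citation counts if the inner lists held cited documents or cited journals instead of cited authors.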

11 Co-Citation Analysis
• Data on co-citation are readily obtainable from databases of the Institute for Scientific Information (ISI) in Philadelphia, PA:
  - Scisearch (Science Citation Index)
  - Social Scisearch (Social Sciences Citation Index)
  - Arts & Humanities Search (Arts & Humanities Citation Index)
• These databases are searchable online through, e.g., the Dialog Corporation.

12 Author Co-Citation Analysis (ACA)
• Detects patterns in the frequency with which any works by any two authors are jointly cited in later works.
• Could be called analysis of co-cited oeuvres.
• Only recurrent co-citation is significant: the more times authors are cited together, the more strongly related they are in the eyes of citers.

13 Author Co-Citation Analysis
• If Ben Shneiderman and Shakespeare are cited together in one article, it probably means little.
• If Ben Shneiderman and Stuart Card are cited together in more than 200 articles, it means a lot: their names have come to symbolize something like “interactive interfaces for digital libraries.”
• In a cited-author (CA) search on Dialog, SELECT CA=SHNEIDERMAN B AND CA=CARD SK would retrieve the 200+ citing articles.

14 AuthorLink
• Produces co-cited author maps in real time (a few seconds) on a Web site.
• User merely has to enter the name of a single author of interest as a “seed.”
  E.g., “Dickinson-E” for Emily Dickinson
• System responds with the top authors co-cited with that seed: 24 other names ranked by frequency of co-occurrence.
• System then pairs every name with every other in a 25x25 square symmetric matrix.
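
The seed-to-matrix steps on this slide can be sketched as follows. This is an illustration under stated assumptions rather than AuthorLink's actual code: the function `seed_matrix`, the tie-breaking rule (alphabetical), and the sample data are hypothetical, and the sketch assumes the pairwise counts in the matrix are taken over all citing documents, which the slide does not specify.

```python
from collections import Counter
from itertools import combinations


def seed_matrix(citing_docs, seed, top_n=24):
    """Return (names, matrix): the seed plus its top_n co-cited authors,
    and a square symmetric matrix of their pairwise co-citation counts."""
    doc_sets = [set(doc) for doc in citing_docs]

    # Step 1: rank the authors co-cited with the seed.
    with_seed = Counter(
        author for doc in doc_sets if seed in doc for author in doc if author != seed
    )
    # Assumption: ties are broken alphabetically.
    ranked = sorted(with_seed.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]
    names = [seed] + [author for author, _ in ranked]
    index = {name: i for i, name in enumerate(names)}

    # Step 2: pair every name with every other.
    n = len(names)
    matrix = [[0] * n for _ in range(n)]
    for doc in doc_sets:
        present = sorted(index[a] for a in doc if a in index)
        for i, j in combinations(present, 2):
            matrix[i][j] += 1
            matrix[j][i] += 1  # co-citation is symmetric
    return names, matrix


# Hypothetical data; only the seed format "Dickinson-E" comes from the slide.
docs = [
    ["Dickinson-E", "Whitman-W", "Johnson-TH"],
    ["Dickinson-E", "Whitman-W"],
    ["Whitman-W", "Johnson-TH"],
    ["Dickinson-E", "Franklin-RW"],
]
names, matrix = seed_matrix(docs, "Dickinson-E", top_n=2)
print(names)   # ['Dickinson-E', 'Whitman-W', 'Franklin-RW']
print(matrix)  # [[0, 2, 1], [2, 0, 0], [1, 0, 0]]
```

With the default `top_n=24` and enough data, the result is the 25x25 matrix the slide describes.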

15 Quick Visualizations of a Database
• User can choose to display the matrix either as a Kohonen feature map (SOM, self-organizing map) or as a Pathfinder network map (PFNET).
• User can use either map as:
  - An aid to retrieving articles that cite authors in various combinations. (Combinations are made interactively.)
  - Reproducible artwork in a new study, such as a review of a literature or a commentary on the author used as “seed.”

16

17

18 Advantages of Maps
• Ranked list of top 25 co-cited authors often contains names not previously known to the user.
• Both Kohonen maps and PFNETs show interconnections of the 25 authors not apparent in the one-dimensional ranking of a simple list.
• Maps automatically pair authors with their biographers, editors, commentators, and critics.

19

20 AuthorLink’s Underlying Database and Software
• ISI gave our college 10 years’ worth of data from the Arts & Humanities Citation Index (AHCI 1988-1997) as a research grant. The data set has 1.26 million bibliographic records on articles and other items from humanities journals.
• For retrievals from AHCI, we bought BRS Search, an industrial-strength engine, from Dataware, Inc.
• Buzydlowski and Lin have written several special programs in Java and C to implement our system on top of the BRS Search software.

21 Interpretation of Maps
• Kohonen maps show high co-citation counts of authors by placing them closer in space.
• PFNETs show highest co-citation counts of authors directly, as links between nodes bearing authors’ names.

22 Norman Mailer

23 Limitations on the Maps
• AuthorLink maps are pictures of 10 years of scholarship as reflected in AHCI.
• They simplify and highlight certain relationships in humanistic studies.
• They do not capture all relationships in the data, nor do they present “superior” truths.

24

25

26 Interface Design Considerations
• Link interface to valuable digital libraries (ISI citation databases and the journal literatures they lead to).
• Focus on intellectual content: meaningful words, meaningfully presented.
• Stress quick and flexible presentations over long-term displays.

27 PFNET of Plato rendered with Cortona VRML software

28 PFNET of Plato made with Pajek

29 3 Main Shapes Found in AuthorLink PFNET Displays
• Dendrite, e.g., for Virginia Woolf
• Cycle, e.g., for Herbert A. Simon
• Star, e.g., for Noam Chomsky

30

31

32

33 PFNETs
• Are algorithmically connected graphs, based on finding the “minimum-cost” path between any two nodes.
• In ACA, this is generally the highest single co-citation count between author pairs (all pairs are examined).
• Result in a useful simplification of the graph.
• Use a spring embedder algorithm to produce the layout.
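
The link-pruning idea behind a PFNET can be sketched as below. The slide gives only an outline, so this sketch makes several assumptions that should be read as such: it implements the commonly described PFNET(r = ∞, q = n − 1) variant, in which a path costs as much as its most expensive link; it converts co-citation counts to costs with 1/count (higher count, lower cost); and the function name `pathfinder_links` and the sample counts are hypothetical. Layout (the spring embedder) is not covered.

```python
def pathfinder_links(names, counts):
    """Keep a direct link only if no indirect path between the two authors is cheaper.

    counts is a square symmetric matrix of co-citation counts.
    """
    n = len(names)
    inf = float("inf")

    # Assumption: cost = 1 / count, so a high co-citation count is a cheap link.
    # A zero count means no direct link at all.
    cost = [
        [1.0 / counts[i][j] if i != j and counts[i][j] > 0 else inf for j in range(n)]
        for i in range(n)
    ]

    # Floyd-Warshall variant: the cost of a path is its largest single link
    # (r = infinity), and best[i][j] is the cheapest such path of any length.
    best = [row[:] for row in cost]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                via_k = max(best[i][k], best[k][j])
                if via_k < best[i][j]:
                    best[i][j] = via_k

    # A direct link survives only if it is itself a minimum-cost path.
    return [
        (names[i], names[j], counts[i][j])
        for i in range(n)
        for j in range(i + 1, n)
        if cost[i][j] < inf and cost[i][j] <= best[i][j]
    ]


# Hypothetical counts: A-B = 10, B-C = 8, A-C = 2.
names = ["A", "B", "C"]
counts = [
    [0, 10, 2],
    [10, 0, 8],
    [2, 8, 0],
]
links = pathfinder_links(names, counts)
print(links)  # [('A', 'B', 10), ('B', 'C', 8)]
```

In the example the weak A-C link is dropped because the path A-B-C runs entirely over stronger links, which is the kind of simplification the slide refers to.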

34 PFNETs
• Make sense as pictures of relations in databases!
• Independent observers have found them highly intelligible:
  - Xia Lin on Chinese philosophers
  - Kate McCain on historians of science & technology
  - Howard White on various literary figures and artists
• Buzydlowski’s research will test the interpretability of PFNETs and Kohonen maps as interfaces for domain experts and naïve users.

35 Vincent van Gogh

36 Einstein-A and Mozart

37 Einstein-A and Bohr

38 Architecture of AuthorLink
[Diagram of a three-tier architecture: front tier, middle tier, back tier. Component labels: BRS Search Engine; Web Server; Java Servlets; Web-based Map Interface; Java Applet; Mapping Procedures; Application Server; Oracle Database; MySQL Database.]

39 Two Forms of Citation Data
• Intercitation: Occurs when any member of a fixed group cites any other member of that group.
  - Asymmetric. May or may not be reciprocal.
  - Here, citing among Globenet members as dyads.
• Co-citation: Occurs when any two authors are cited together in the reference list of any work.
  - Important when recurrent above some threshold.
  - Symmetric.
  - Here, joint citation of any Globenet pair in a work by any author whose journal publications are covered by the Institute for Scientific Information.

40

41

42 Intercitation: citing or being cited in a group with definite membership
Row names cite column names.

                Bo Br Ca Co Cy Fr He Ke Of Pe Po Ro Sc Su Tr Wi
                -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
 1 Bouchard      0  0  0  0  0  0  0  0  0  0  2  0  0  0  0  0
 2 BrooksGunn    2  0  0  3  0  0  0  6  1  0  0  0  0  1  0  0
 3 Case          0  0  0  0  0  0  0  2  0  0  0  0  0  0  0  0
 4 Coe           0  0  0  0  0  0  0  0  0  0  0  0  0  6  0  0
 5 Cynader       0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
 6 Frost         0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
 7 Hertzman      1  0  3  2  1  0  0  0  2  0  9  0  0  1  2  1
 8 Keating       0  0  2  1  1  0  1  0  1  0  1  1  1  1  1  1
 9 Offord        1  0  0  0  0  0  0  0  0  0  0  0  0  0  7  0
10 Pence         0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
11 Power         0  0  1  0  1  0  3  1  1  0  0  0  0  0  1  0
12 Rohlen        0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
13 Scardamalia   0  0  4  0  0  0  0  0  0  0  0  0  0  0  0  0
14 Suomi         0  0  0 12  0  0  0  0  2  0  1  0  0  0  1  0
15 Tremblay      1  5  0  0  0  0  0  0  8  0  3  0  0  0  0  0
16 Willms        0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
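
Because rows cite columns, row totals of this matrix give each member's outcitation and column totals give incitation. The sketch below copies the matrix values from this slide; the variable names are illustrative, and reading "high outcitation" and "high incitation" as simple row and column sums is an assumption, since the slides do not say how those rankings were computed.

```python
names = [
    "Bouchard", "BrooksGunn", "Case", "Coe", "Cynader", "Frost", "Hertzman", "Keating",
    "Offord", "Pence", "Power", "Rohlen", "Scardamalia", "Suomi", "Tremblay", "Willms",
]

# cites[i][j] = number of times names[i] cites names[j] (values from slide 42).
cites = [
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0],
    [2, 0, 0, 3, 0, 0, 0, 6, 1, 0, 0, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 6, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 0, 3, 2, 1, 0, 0, 0, 2, 0, 9, 0, 0, 1, 2, 1],
    [0, 0, 2, 1, 1, 0, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1],
    [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 7, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 0, 1, 0, 3, 1, 1, 0, 0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 12, 0, 0, 0, 0, 2, 0, 1, 0, 0, 0, 1, 0],
    [1, 5, 0, 0, 0, 0, 0, 0, 8, 0, 3, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
]

# Outcitation: how often each member cites other members (row sums).
out_totals = {name: sum(row) for name, row in zip(names, cites)}
# Incitation: how often each member is cited by other members (column sums).
in_totals = {name: sum(row[j] for row in cites) for j, name in enumerate(names)}

top_out = max(out_totals, key=out_totals.get)
top_in = max(in_totals, key=in_totals.get)
print(top_out, out_totals[top_out])  # Hertzman 22
print(top_in, in_totals[top_in])     # Coe 18

# Intercitation is asymmetric: Suomi cites Coe 12 times, Coe cites Suomi 6 times.
suomi, coe = names.index("Suomi"), names.index("Coe")
print(cites[suomi][coe], cites[coe][suomi])  # 12 6
```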

43

44 Cumulative Intercitation, to 2000: High outcitation

45 Cumulative Intercitation, to 2000: High incitation

46 Co-citation counts of Globenet authors
Line thicknesses are proportional to the counts. Colors reflect the seven disciplines of Globenet members.
[Network diagram. Node labels: Mann, Applebaum, Green, Martins, Jones, Wood, Stone, Hart, Cook, Hopkins, Scott, Demore, Oldfield, Brown, Grey, Smith. Link counts as extracted, not matched to pairs: 90, 65, 60, 31, 17, 1, 3, 8, 48, 1, 3, 3, 3, 6, 6, 11.]

47 Single Author Maps
Can map an author’s:
• Co-authors (those with whom she writes)
• Citees (those she cites)
• Co-citing authors (those who cite her with others)
• Co-cited authors (those others with whom she is cited)

48 CAMEOs: Characterizations Automatically Made and Edited Online
• Form a set of bibliographic records of writings by an author or citing an author
• Select a field from the records
  E.g., descriptors, co-cited authors
• Rank the terms in it by frequency of occurrence across the set
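
The three CAMEO steps on this slide (form a record set, select a field, rank its terms) amount to a frequency count. The following is a minimal sketch, not the presenters' implementation: the function name `cameo`, the record layout (a dict with a list of terms per field), and the sample records are hypothetical, and it assumes a term is counted once per record with ties broken alphabetically.

```python
from collections import Counter


def cameo(records, field, top_n=50):
    """Rank the terms in one field by how many records in the set contain them.

    Returns (rank, count, term) tuples.
    """
    counts = Counter()
    for record in records:
        # Count each term once per record, even if it is repeated there.
        counts.update(set(record.get(field, [])))
    # Highest count first; alphabetical order within a tie (an assumption).
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]
    return [(rank, count, term) for rank, (term, count) in enumerate(ranked, start=1)]


# Hypothetical record set for one author.
records = [
    {"descriptors": ["IMPACT", "LISTS", "SERIALS"]},
    {"descriptors": ["IMPACT", "LISTS"]},
    {"descriptors": ["IMPACT", "FACULTY"]},
]

profile = cameo(records, "descriptors", top_n=3)
for rank, count, term in profile:
    print(rank, count, term)
# 1 3 IMPACT
# 2 2 LISTS
# 3 1 FACULTY
```

Pointing `field` at journals, cited authors, or co-cited authors instead of descriptors would give the other CAMEO types.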

49 Some Types of CAMEOs
• Descriptors or identifiers applied across an author’s works
• Journals to which an author contributed
• Citation identities: an author’s citees
• Citation images: the authors with whom an author is co-cited

50 Subject CAMEO for Tom Nisonger: First 50 Identifiers in 3 ISI Files

Rank  Count  Identifier
   1      4  IMPACT
   2      4  LISTS
   3      3  FACULTY
   4      3  SERIALS
   5      2  ARTICLES
   6      2  CITATIONS
   7      2  INDICATORS
   8      2  JOURNALS
   9      2  PATTERNS
  10      2  PERIODICAL LITERATURE
  11      2  RANKING
  12      2  RELEVANCE
  13      1  ACADEMIC LIBRARIANS
  14      1  ACADEMIC-LIBRARY
  15      1  ACCESS
  16      1  ACQUISITIONS
  17      1  AUTHORSHIP
  18      1  BIBLIOMETRIC ANALYSIS
  19      1  CITATION ANALYSIS
  20      1  COLLECTION DEVELOPMENT
  21      1  COLLEGE
  22      1  COMMERCIAL DOCUMENT SUPPLIERS
  23      1  CONSISTENCY
  24      1  COST-EFFECTIVENESS
  25      1  DEFINITION
  26      1  DELIVERY
  27      1  DENVER
  28      1  DESIGN
  29      1  DOCUMENTS
  30      1  GENETICS
  31      1  INDEX
  32      1  INDEXING CONSISTENCY
  33      1  INFORMATION-SCIENCE
  34      1  INTERLIBRARY LOAN
  35      1  JOINT COMMITTEE REPORT
  36      1  JOURNAL-CITATION-REPORTS
  37      1  LIBRARIANSHIP
  38      1  LIBRARIES
  39      1  MODEL
  40      1  ONLINE
  41      1  OVERLAP
  42      1  PERCEIVED PRESTIGE
  43      1  PHYSICS
  44      1  PHYSICS JOURNALS
  45      1  PLUS
  46      1  PRACTITIONERS
  47      1  PROFESSIONAL JOURNALS
  48      1  PROFILE
  49      1  PSYCHOLOGY
  50      1  PUBLICATION

51 Co-citation Image CAMEO for Javed Mostafa: First 50 Names in 3 ISI Files

Rank  Count  Author
   1     26  MOSTAFA J
   2     13  SALTON G
   3     11  BELKIN NJ
   4     11  MAES P
   5     10  BESSER H
   6      9  MARKEY K
   7      8  RORVIG ME
   8      7  ROBERTSON SE
   9      6  FLICKNER M
  10      6  FOLTZ PW
  11      6  JENNINGS A
  12      6  JORGENSEN C
  13      6  KONSTAN JA
  14      6  LAM W
  15      6  LARSON RR
  16      6  LEWIS DD
  17      6  MUKHOPADHYAY S
  18      6  OARD DW
  19      6  OCONNOR BC
  20      6  RESNICK P
  21      5  BARNETT PJ
  22      5  BEARD DV
  23      5  CHANG SK
  24      5  ESTER M
  25      5  FIDEL R
  26      5  GECSEI J
  27      5  GUPTA A
  28      5  HASTINGS SK
  29      5  LANG K
  30      5  LYNCH CA
  31      5  PAZZANI M
  32      5  SELOFF GA
  33      5  TURNER J
  34      4  ARMS WY
  35      4  BACH JR
  36      4  BALABANOVIC M
  37      4  BATES MJ
  38      4  BRAJNIK G
  39      4  CAWKELL AE
  40      4  CHANG SF
  41      4  ENSER PGB
  42      4  HARMAN D
  43      4  HOLT B
  44      4  HULL DA
  45      4  JACOB EK
  46      4  KORFHAGE R
  47      4  LAYNE SS
  48      4  LOSEE RM
  49      4  MOUKAS A
  50      4  NARENDRA KS

52 Journal CAMEO for Rob Kling: Where He’s Published at Least Twice

Rank  Count  Journal
   1      9  INFORMATION SOCIETY
   2      6  JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATIO
   3      4  BULLETIN OF THE AMERICAN SOCIETY FOR INFORMATI
   4      4  COMMUNICATIONS OF THE ACM
   5      4  SCIENCE TECHNOLOGY & HUMAN VALUES
   6      3  ASTROPHYSICAL JOURNAL (Another Kling R?)
   7      3  CRYPTOGAMIE ALGOLOGIE
   8      3  INFORMATION AGE
   9      3  INFORMATION PRIVACY
  10      3  SOCIETY
  11      2  CONTEMPORARY SOCIOLOGY-A JOURNAL OF REVIEWS
  12      2  JOURNAL OF QUANTITATIVE SPECTROSCOPY & RADIATI (Ditto)
  13      2  PHYSICA SCRIPTA (Ditto)
  14      2  SYMBOLIC INTERACTION

53

54

55

56

