Download presentation
Presentation is loading. Please wait.
Published byPernille Dalgaard Modified over 6 years ago
1
Introduction to Xinformatics: Course Scope, Assessments
Peter Fox Xinformatics ITWS, ERTH, CSCI 4400/6400 Module 1, January 16, 2018
2
Contents Introductions Course Outline Application areas
Logistics and resources Assessment and assignments Learning objectives, outcomes Introduction to Xinformatics Discussion Next modules and classes
3
E-Introductions Post a message to LMS Discussion Boards
Name, major, year This course: Interest, goals, outcomes? Have you completed any *suggested* prerequisites: Knowledge such as that gained in a Data Base class (e.g., CSCI-4380) Knowledge such as that gained in a Data Structures class (e.g., CSCI-1200) Knowledge such as that gained in a Data Science class (e.g. ITEC/CSCI/ERTH 4350/6350)
4
Course Outline Introduction to Informatics
Capturing the problem: Use case development and requirement analysis Information theory, models, tools Foundations; semiotics, library, cognitive and social science State-of-the-Art, informatics applications Information life-cycle Information architectures (Internet, Web, Grid, Cloud) Information Visualization, Information Discovery, Information Integration Information Audit and Workflow Information quality, and bias Class exercises, guest lectures and current events along the way
5
Application Areas Geoinformatics Astroinformatics Cheminformatics
Bioinformatics Helioinformatics Healthinformatics Ecoinformatics Nursing informatics and the list goes on, and on
6
Logistics Class: ITWS, ERTH, CSCI 4400/6400, Hours: 3:00pm-5:50pm Tuesdays Location: Lally 102 Instructors: Peter Fox, Office Hours: by appointment TA: Eliyah Afzal By appointment All content, lectures, readings, assignments, grades on LMS (and on (except grades).
7
Assessment and Assignments
Reading assignments Are given for almost every module Most are background and informational Some are key to completing assignments Some are relevant to the current week’s class Others are relevant to following week’s class (i.e. pre-reading) Undergraduates - will not be tested on but we will often discuss these in class and participation in these is taken into account Graduates – are tested as part of assignments, i.e. an extra question You will progress from individual work to group work
8
~ Assessment and Assignments
# Type Topic Weight Due Date 1 Individual Written Use Case and User Requirements Development 10% Jan 30, 2018 2 Information Uncertainty and Uncertainty Reduction Feb 13, 2018 3 Analysis of Cognitive, Collection, and Social/Cultural Aspects of Information Systems in Signs Mar 6, 2018 Individual Presentation 5% 4 Information Models and Information Architectures Mar 27, 2018 5 Group Written Development of Use Case, Information Architecture, Model and Prototype Implementation for Informatics Area of Your Choice 20% Apr 23, 2018 Group Presentation 6 In-Class Individual Participation Discuss Readings, Lecture Notes, Guest Lectures and Related Informatics’ Current Events in Class 15% All semester 7 On-line Individual Participation Answer and discuss on-line questions associated with the weekly Topics
9
Objectives To instruct future information architects how to sustainably generate information models, designs and architectures To instruct future technologists how to understand and support essential data and information needs of a wide variety of producers and consumers For both to know tools, and requirements to properly handle data and information Will learn and be evaluated on the underpinnings of informatics, including theoretical methods, technologies and best practices.
10
Learning Objectives Through class lectures, practical sessions, written and oral presentation assignments and projects, students should: Develop and demonstrate skill in Participation and Organization of multi-skilled teams in the application of Informatics Demonstrate the development of Conceptual and Information Models and explain them to non-experts (6400 level only) Demonstrate the application of information theory and design principles to information systems Evaluate, discuss and be able to apply (6400 level only) Informatics Best Practices and Standards Demonstrate knowledge (advanced for 6400 level) and application of Informatics Standards Develop and demonstrate (advanced for 6400 level) skill in Informatics Tool Use and Evaluation
11
Academic Integrity Student-teacher relationships are built on trust. For example, students must trust that teachers have made appropriate decisions about the structure and content of the courses they teach, and teachers must trust that the assignments that students turn in are their own. Acts, which violate this trust, undermine the educational process. The Rensselaer Handbook of Student Rights and Responsibilities defines various forms of Academic Dishonesty and you should make yourself familiar with these. In this class, all assignments that are turned in for a grade must represent the student’s own work. In cases where help was received, or teamwork was allowed, a notation on the assignment should indicate your collaboration. If found in violation of the academic dishonesty policy, students may be subject to two types of penalties. The instructor administers an academic (grade) penalty, and the student may also enter the Institute judicial process and be subject to such additional sanctions as: warning, probation, suspension, expulsion, and alternative actions as defined in the current Handbook of Student Rights and Responsibilities. . Submission of any assignment that is in violation of this policy will result in loss of grade for that assignment. If you have any question concerning this policy before submitting an assignment, please ask for clarification.
12
Questions? Post on LMS Lectures will be recorded (Adobe Connect) – and posted on LMS Or send to and
13
Introduction to Informatics
E.g. Bioinformatics Over the past few decades, major advances in the field of molecular biology, coupled with advances in genomic technologies, have led to an explosive growth in the biological information generated by the scientific community. This deluge of genomic information has, in turn, led to an absolute requirement for computerized databases to store, organize, and index the data and for specialized tools to view and analyze the data.
14
Tell us more… Bioinformatics is the field of science in which biology, computer science, and information technology merge to form a single discipline. The ultimate goal of the field is to enable the discovery of new biological insights as well as to create a global perspective from which unifying principles in biology can be discerned. At the beginning of the "genomic revolution", a bioinformatics concern was the creation and maintenance of a database to store biological information, such as nucleotide and amino acid sequences. Development of this type of database involved not only design issues but the development of complex interfaces whereby researchers could both access existing data as well as submit new or revised data.
15
And… Ultimately, however, all of this information must be combined to form a comprehensive picture of normal cellular activities so that researchers may study how these activities are altered in different disease states. Therefore, the field of bioinformatics has evolved such that the most pressing task now involves the analysis and interpretation of various types of data, including nucleotide and amino acid sequences, protein domains, and protein structures.
16
And… The actual process of analyzing and interpreting data is referred to as computational biology. Important sub-disciplines within bioinformatics and computational biology include: the development and implementation of tools that enable efficient access to, and use and management of, various types of information the development of new algorithms (mathematical formulas) and statistics with which to assess relationships among members of large data sets, such as methods to locate a gene within a sequence, predict protein structure and/or function, and cluster protein sequences into families of related sequences
17
One result – myexperiment.org
18
Data as Infostructure
19
Definitions Data - are encodings that represent the qualitative or quantitative attributes of a variable or set of variables. Data (plural of "datum", which is seldom used) - are typically the results of measurements, computations, or observations and can be the basis of graphs, images of a set of variables. Data - are often viewed as the lowest level of abstraction from which information and knowledge are derived*** *** we will learn that this is not as simple as it seems.
20
Definitions ctd. Information Knowledge
Representations (of facts? data?) in a form that lends itself to human use The word information derives from the Latin informare (in+formare) meaning to give form, shape, or character to. It is therefore to be the formative principle of, or to imbue with some specific character or quality. Knowledge Check out Wikipedia…. meaning
21
Definitions ctd. Metadata – data about data
Metainformation – information about information Documentation – integrated collection of information and metadata intended to support all aspects of data (find, access, use…) Provenance - origin or source from which something comes, intention for use, who/what generated for, manner of manufacture, history of subsequent owners, sense of place and time of manufacture, production or discovery, documented in detail sufficient to allow reproducibility
22
Could it really be this linear?
23
Data-Information-Knowledge Ecosystem
Producers Consumers Experience Data Information Knowledge Creation Gathering Presentation Organization Integration Conversation Context
24
TWC Curriculum tw.rpi.edu/web/Courses
Data Analytics Producers Consumers Data Science Xinformatics Semantic eScience Experience Data Information Knowledge Creation Gathering Presentation Organization Integration Conversation Context Web Science
25
The Information Era: Interoperability
Modern information and communications technologies are creating an “interoperable” information era in which ready access to data and information can be truly universal. Open access to data and services enables us to meet the new challenges of understanding complex systems: managing and accessing large data sets higher space/time resolution capabilities rapid response requirements data assimilation into models crossing disciplinary boundaries. Interoperability technologies have a 20 year history of development and are now mature. Combined with our growing ability to transmit large amounts of information efficiently (Internet, GRID), this provides us with an unprecedented ability to address new and old scientific problems in ways that were hitherto impossible. The challenge now in the geosciences is to capitalise on this new capability. e-Science initiatives are growing up in centers around the world in response to this opportunity.
26
Shifting the Burden from the User to the Provider
In the “heroic” era of science, the provider of data plays a relatively passive role. The onus is on the user to identify each data source, accumulate the required data, and prepare the data for assimilation and analysis. A similar process applies for acquiring software for analysis, modeling, and visualisation. In many cases, the user and the provider are the same person. In an interoperable e-Science world, the provider has to put much more work into describing the structure and content of data and information, and someone has to provide and support Web Services. The user is relieved of these burdens and benefits accordingly. The overall reduction in work load is enormous, but the provider does not see that. Fox CI and X-informatics - CSIG 2008, Aug 11
27
Other forms of information
IoT! Yottabytes
28
Information explosion
Devices are everywhere, but … by 2020
29
And, gulp, ‘un’structured*
30
The key is: As volume, complexity and heterogeneity increase…
Suddenly information may look more like a continuum All known methods, algorithms will not scale (except for very simple operations) And because it is information, humans are part of the loop Thus – we need to understand and apply the theoretical foundations Problem: most theory to date are developed in an analog world, not a digital one!!
31
Mind the gap As capabilities and needs grow on both sides: science/ medicine/ engineering – and technology: There is/ was still a gap between science and the underlying infrastructure and technology that is available Cyberinfrastructure is the new research environment(s) that support advanced data acquisition, data storage, data management, data integration, data mining, data visualization and other computing and information processing services over the Internet.
32
Mind the gap As capabilities and needs grow on both sides: science/ medicine/ engineering – and technology: There is/ was still a gap between science and the underlying infrastructure and technology that is available Informatics - information science includes the science of (data and) information, the practice of information processing, and the engineering of information systems. Informatics studies the structure, behavior, and interactions of natural and artificial systems that store, process and communicate (data and) information. It also develops its own conceptual and theoretical foundations. Since computers, individuals and organizations all process information, informatics has computational, cognitive and social aspects, including study of the social impact of information technologies. Wikipedia. Cyberinfrastructure is the new research environment(s) that support advanced data acquisition, data storage, data management, data integration, data mining, data visualization and other computing and information processing services over the Internet.
33
But really it’s not just one field
Informatics IT Cyber Infrastructure (CI) Cyber Informatics Core Informatics Science Informatics Science, Benefit to others Functional requirements SBA=societal benefit areas CI = Discipline neutral, e.g. web server, database, wiki Cyberinformatics = mapping to discipline neutral aspects Core informatics = Reasoning engine, semantics, computer science Science (X) informatics = Use cases, science domain terms, concepts in an ontology or controlled vocabulary
34
A moment of history In the late 1950’s (actually around or 1962 depending on what you read) the modern informatics term was coined Existed for a while but then split into library science and computer science and developed their own fields, became disconnected Now coming back to be relevant to science+… Informatics IS NOT just having a someone work with an “IT/ICT” person (NOT, NOT, NOT)
35
Cyberinformatics The first match between the domain and the underlying domain-neutral e-infrastructure/ cyberinfrastructure When the underlying infrastructure (when it becomes real infrastructure and not just software) changes this is one part that needs to change Less brittle since upper layers remain intact
36
Core informatics The realm of computer science (for the most part, also librarians) Strongly influenced by science (and engineering and medical applications) above and below this layer If we can leverage this, we do not need to do the specialist work, however … We must work with these scientists, sustainably
37
Science Informatics Where science meets the underlying technical capabilities and methods Must be expressible in science terms; increasingly use cases The people in this area are multi-lingual and both interdisciplinary and multi-disciplinary, few are trained or literate here ****** Team, or really a community of practice (CoP)
38
Use Case … is a collection of possible sequences of interactions between the system under discussion and its actors, relating to a particular goal. The collection should define all system behavior relevant to the actors to assure them goals will be carried out properly. Any system behavior that is irrelevant to the actors should not be included in the use cases. is a prose description of a system's behavior when interacting with the outside world. is a technique for capturing functional requirements of business systems and, potentially, of an IT system to support the business system. can also capture non-functional requirements
39
THE PHYSICS OF INFORMATION
BORROMEAN RINGS Three interlinked circles that represent inseparable parts of the whole. Remove any one ring and the other two fall apart. Because of this property, Borromean Rings have been used as a symbol of unity in many fields. © 2005 EvREsearch LTD Courtesy: P Berkman (2005). Remember STRUCTURE!!!!!!!! Information has three indivisible ingredients – content, context and structure. The ability to automatically utilize the inherent structure of information is the threshold in information management from hardcopy to digital media. EvREsearch©
40
Not a perfect story Many authors criticize the use of the term entropy, and physics of information Information conservation, diffusion, viscosity, advection, dissipation… sort of all make some sense Units are a big part of it (question: what are the possible units?) and what are the non-dimensional numbers? However the idea is very relevant to modeling, design and architecture We’ll revisit the components of the physics of information
41
Information theory Semiotics, also called semiotic studies or semiology, is the study of sign processes (semiosis), or signification and communication, signs and symbols, into three branches: Syntactics: Relation of signs to each other in formal structures Semantics: Relation between signs and the things to which they refer; their denotata Pragmatics: Relation of signs to their impacts on those who use them Wikipedia, see also Peirce:
42
Library science Curates the artifacts of knowledge but increasingly: (yes) information Organizes and manages them for consumers Cataloging and classification Preservation ‘maintaining or restoring access to artifacts, documents and records through the study, diagnosis, treatment and prevention of decay and damage’ (wikipedia) In our digital age Curation and preservation
43
HISTORY OF INFORMATION THRESHOLDS
INFORMATION VOLUME INFORMATION ERAS DIGITAL PAPER INFORMATION TRANSPORT INFORMATION INTEGRATION PAPYRUS CLAY Courtesy. Paul Berkman (2005) STONE 6000 5000 4000 3000 2000 1000 FUTURE TIME (years before present) © 2005 EvREsearch LTD
44
? 5 generations of work and mediation
6th Generation ? All these generations of mediation are in effect for information systems From: C. Borgman, 2008, NSF Cyberlearning Report
45
Social Science Branch of humanities
Especially as it relates to networks of people Exploits sociology of groups, teams Cultural norms as well as discipline norms Modes of what and how rewards are given Between those who produce and those who consume data and information How you collect, understand, model and design models and architectures is as much social as technical skill
46
Cognitive Science Interdisciplinary study of the mind and intelligence
operates at the intersection of psychology, philosophy, computer science, linguistics, anthropology, and neuroscience. Of relevance for data and information science are three significant theoretical underpinnings mental representation, the nature of expertise, and intuition Very relevant to models, modeling, metamodel choice
47
Computer Science Data Structures Knowledge/ Information Representation
Algorithms!! Encoding/ coding More…
48
Preview of Information Models
Conceptual models, aka domain models, are typically used to explore domain concepts High-level conceptual models are often created as part of initial requirements envisioning efforts as they are used to explore the high-level static business or science or medicine structures and concepts. Followed by logical and physical models
49
Object models An object model is a logic organization of the real world objects (entities), constraints on them, and the relationships among objects. E.g. A database (DB) language is a concrete syntax for an object (data) model. A DB system implements that model.
50
Architectures Building on content, context, and users, some illustrate information architecture as an iceberg. Just like an iceberg, the majority of information architecture work is out of sight, "below the water." The work includes the creation of plans, controlled-vocabularies, and blueprints all before any user interfaces are created.
51
Above the water and below
Design, design, design … of the interfaces… i.e. architecture, of the social, cognitive, etc. elements of information ‘systems’ Almost all are designed to support two basic modes of investigation/ navigation: induction and deduction… but enough of that for now
52
Information life-cycle (trivial)
53
Visualization
54
Information Audit Analysis and evaluation of a organization's information system (whether manual or computerized) to detect and rectify blockages, duplication, and leakage of information. The objectives of this audit are to improve accuracy, relevance, security, and timeliness of the recorded information. Related to workflow management
55
Information Quality, and Bias
Quality is perceived differently by information providers and information recipients There are many different qualitative and quantitative aspects of quality Methodologies for dealing with qualities are just emerging Little resources are allocated to quality Each organization handles quality differently Bias detection is often an expert skill, and bias correction even more-so
56
X=discipline (not eXtreme, sorry)
But really Core elements that apply across many disciplines
57
Discussion About informatics? Methodology and underpinnings
Definitions? You will hear a LOT of new terms and they are NOT BUZZWORDS – you need to understand them and show me how much Applications? Software requirement Components? Theory (we’ll start on this soon)
58
Skills needed Modeling, theory, architecture experience?
Nah, we’ll cover that Literacy with computers and applications that can handle information Yep Ability to access internet and retrieve/ acquire data Oh yea Presentation of assignments Ditto
59
What is expected Attend class (ON TIME), complete assignments (esp. reading, be prepared to give comments when asked in subsequent classes) Participate (e.g. reading), discuss, ask questions Work constructively in group and class sessions Work both individually, and then, in a group USE YOUR RPI ADDRESS And follow the assignment naming scheme
60
Also on the web Reading assignments – are intended to prepare you for following lectures and may be considered materials for written assignments or project Assignments will be posted on LMS Individual Group TA is your first contact for assignment questions but please me/us, seek out office hour appointments, and be engaged
61
What is next Next week – Capturing the problem: use cases and requirements as-a basis for Assignment 1 Reading for this week - Xinformatics Applications – Clinical Informatics: Urban Informatics: Geo Informatics: Astro Informatics: Bio Informatics: We will discuss these in class next week…
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.