2012 Discovery Informatics Symposium

Discovery Informatics Symposium: The Role of AI Research in Innovating Scientific Processes

November 2-4, 2012

AAAI Fall Symposium Series

Arlington, VA

Symposium Description

Addressing the ambitious research agendas put forward by many scientific disciplines requires meeting a multitude of challenges in intelligent systems, information sciences, and human-computer interaction. There are many aspects of the scientific discovery process that our community could help automate, facilitate, or make more efficient through artificial intelligence techniques. For example, although considerable efforts have been directed toward data modeling and integration, these activities continue to demand large investments of scientists’ time and effort. The scientific literature continues to grow and is becoming more and more unmanageable for researchers operating in the most active disciplines. Better interfaces for collaboration, visualization, and understanding would significantly improve scientific practice. Scientific data, publications, and tools could be published in open formats with appropriate semantic descriptions and metadata annotations to improve sharing and dissemination. Opportunities for broader participation in well-defined scientific tasks enable human contributors to provide large amounts of data, annotations, or complex processing results that could not otherwise be obtained. These are just some examples of areas where artificial intelligence techniques could make a difference. Improvements and innovations across the spectrum of scientific processes and activities will have a profound impact on the rate of scientific discoveries.

This symposium provides a forum for researchers interested in understanding the role of AI techniques in improving or innovating scientific processes.

Download a flier advertising the symposium.

Program Highlights

The symposium will include invited talks, paper presentations, panel discussions, and plenary sessions. Six invited speakers will provide their personal perspectives on successes and challenges for Discovery Informatics. There will be seven full papers presented, interleaved with the invited talks. Two of the sessions will be panel discussions on current topics of interest, with panelists from a variety of perspectives including academia, funding agencies, and industry. The symposium will open and close with plenary sessions that will serve for exchange of general observations and synthesis of views for all attendees. AAAI will hold an evening reception as well as a joint session where major highlights of the other parallel symposia will be presented.

Download the symposium schedule.

Invited Speakers

  • Timothy W. Clark, Harvard University (bio)
  • William Cohen, Carnegie Mellon University (bio)
  • Lawrence Hunter, University of Colorado (bio)
  • Chris Lintott, University of Oxford (bio)
  • Hod Lipson, Cornell University (bio)
  • Jude Shavlik, University of Wisconsin Madison (bio)

Invited Panels

  • Challenges in Big Data: Discoveries at the Fringe of Science. Panelists: Lise Getoor, University of Maryland; Haym Hirsh, Rutgers University; Vasant Honavar, NSF; Steven Salzberg, Johns Hopkins University.
  • Discovery Informatics: Innovating Science Practice one Scientist at a Time. Panelists: Melissa Cragin, AAAS Science and Technology Fellow; Christopher Erdmann, The John G. Wolbach Library, Harvard-Smithsonian Center for Astrophysics; Yolanda Gil, University of Southern California; Barbara Ransom, NSF.

Accepted Papers

  • Mohammad Taha Bahadori and Yan Liu
    On Causality Inference in Time Series
  • Nicholas Del Rio and Paulo Pinheiro Da Silva
    Capturing and Using Knowledge about the Use of Visualization Toolkits
  • Susan Epstein, Xingjian Li, Peter Valdez, Sofia Grayevsky, Eric Osisek, Xi Yun and Lei Xie
    Discovering Protein Clusters
  • Pat Langley and Glen Hunt
    A Web-Based Environment for Explanatory Biological Modeling
  • Arman Masoumi and Mikhail Soutchanski
    Organic Synthesis Planning Using the Situation Calculus
  • Leonardo Salayandia, Ann Gates and Paulo Pinheiro
    An Evaluation Approach for Interactions between Abstract Workflows and Provenance Traces
  • Marcelo Tallis, Drashti Dave and Gully Burns
    Preliminary meta-analyses of experimental design with examples from HIV vaccine protection studies

The symposium is part of the AAAI Fall Symposium Series. Please visit the site for the 2012 AAAI Fall Symposium to learn more about AAAI and the overall event.


Program

Schedule

Download the symposium schedule.

Invited Talks


Timothy W. Clark: "Pervasive Semantic Annotation of Biomedical Literature using Domeo"
Abstract: Despite the fact that we now have access to almost all peer reviewed publications on the Web, these publications appear to us in a linear form which is a replica of print journals. At the same time there are increasingly attractive opportunities to surface data and concepts directly on the Web, using semantic organization. This talk will discuss how - for biomedical researchers - the Web of Documents and the Web of Data / Concepts can be bridged and integrated, using the Domeo Web Annotation Toolkit and the Annotation Ontology (AO). Domeo and AO can be used to annotate any HTML document whether or not it is under update control of the user. The AO annotation can be selectively shared and exchanged and is orthogonal to any specific biomedical domain ontology. We believe this approach will be extremely useful in drug discovery to break down information silos, increase information awareness and sharing, and integrate terminologies and data with documents and text, both public and private. We will discuss applications we are currently developing in collaboration with a major pharma.

Biography: Dr. Clark is Director of Bioinformatics, at the MassGeneral Institute for Neurodegenerative Disease & Instructor in Neurology of the Harvard Medical School. He trained as a computer scientist at Johns Hopkins, and began his work in life science informatics as one of the initial developers of the National Center for Biotechnology Information (NCBI) Genbank and a collaborator on the initial NCBI prototype of PubMed. He subsequently served as Vice-President of Informatics at Millennium Pharmaceuticals, where his team built one of the first integrated bio- and chemi-informatics software platforms in the pharmaceutical industry. He is a founding Editorial Board member of the journal Briefings in Bioinformatics, an Advisory Committee member of the World Wide Web Consortium (http://w3.org), and an Advisory Board member for the Neuroscience Information Framework (http://nif.nih.gov). Dr. Clark's research program focuses on multi-modal semantic integration of biomedical web communities, scientific discourse and experimental results. He is the Principal Investigator of the Semantic Web Applications in Neuromedicine (SWAN) (http://swan.mindinformatics.org) and Science Collaboration Framework (http://www.sciencecollaboration.org) projects. His informatics group built the reusable software platform for Stembook (http://www.stembook.org), an online review of stem cell biology published by the Harvard Stem Cell Institute, and created the Parkinson's Disease (PD) Online Research website (http://pdonlineresearch.org) in collaboration with the Michael J. Fox Foundation for Parkinson's Research.


William Cohen, Carnegie Mellon University: "Reasoning with Data Extracted from the Scientific Literature"
Abstract: The growing size of the scientific literature has led to a number of attempts to automatically extract entities and relationships from scientific papers, and then to populate databases with this extracted information. In my group we have been exploring techniques for using this sort of extracted information for specific tasks, including "bootstrapping" to improve the coverage of an extraction system, retrieval tasks involving the scientific literature, and modeling protein-protein interaction data. This is joint work with Ramnath Balasubramanyan, Dana Movshovitz-Attias, and Ni Lao.

Biography: William Cohen received his bachelor's degree in Computer Science from Duke University in 1984, and a PhD in Computer Science from Rutgers University in 1990. From 1990 to 2000 Dr. Cohen worked at AT&T Bell Labs and later AT&T Labs-Research, and from April 2000 to May 2002 Dr. Cohen worked at Whizbang Labs, a company specializing in extracting information from the web. Dr. Cohen is President of the International Machine Learning Society, an Action Editor for the Journal of Machine Learning Research, and an Action Editor for the journal ACM Transactions on Knowledge Discovery from Data. He is also an editor, with Ron Brachman, of the AI and Machine Learning series of books published by Morgan Claypool. In the past he has also served as an action editor for the journal Machine Learning, the journal Artificial Intelligence, and the Journal of Artificial Intelligence Research. He was General Chair for the 2008 International Machine Learning Conference, held July 6-9 at the University of Helsinki, in Finland; Program Co-Chair of the 2006 International Machine Learning Conference; and Co-Chair of the 1994 International Machine Learning Conference. Dr. Cohen was also the co-Chair for the 3rd Int'l AAAI Conference on Weblogs and Social Media, which was held May 17-20, 2009 in San Jose, and was the co-Program Chair for the 4th Int'l AAAI Conference on Weblogs and Social Media, which will be held May 23-26 at George Washington University in Washington, D.C. He is a AAAI Fellow, and in 2008, he won the SIGMOD "Test of Time" Award for the most influential SIGMOD paper of 1998. Dr. Cohen's research interests include information integration and machine learning, particularly information extraction, text categorization and learning from large datasets. He holds seven patents related to learning, discovery, information retrieval, and data integration, and is the author of more than 180 publications.


Lawrence Hunter, University of Colorado - Denver: "The First Artificial Mind Will Think About Molecular Biomedicine"
Abstract: Biomedicine, particularly as informed by genome-scale instrumentation, provides a unique domain for artificial intelligence and discovery informatics research. There are at least three phenomena that contribute to its status as a good domain for AI. First, there are several important characteristics of the domain, including (a) the knowledge-based (rather than law-like) nature of scientific explanation in biomedicine, (b) the modest role that common sense knowledge plays in biological reasoning, and (c) the possibility of embodiment of programs in the context of powerful automated experimental instrumentation. Second, there are a variety of highly significant resources available to researchers developing AI systems, including (a) extensive, continuously expanding, publicly available, and incompletely analyzed datasets of high value, (b) ontological resources carefully constructed and maintained by diverse communities of experts, and (c) many specific use cases that offer clearly defined and highly significant problems that AI techniques have high potential to address. Finally, there is an extensive community of biomedical researchers and practitioners highly motivated to exploit and interact with computational systems that increase the quality, speed or ease of their scientific insights. The power of this combination of eager user community, valuable existing resources and appropriate domain characteristics is already clear from existing work in biomedical discovery informatics, but, as this talk will try to argue, the future is even brighter.

Biography: Dr. Lawrence Hunter is the Director of the Computational Bioscience Program and of the Center for Computational Pharmacology at the University of Colorado School of Medicine, and a Professor in the departments of Pharmacology and Computer Science (Boulder). He received his Ph.D. in computer science from Yale University in 1989, and then spent more than 10 years at the National Institutes of Health, ending as the Chief of the Molecular Statistics and Bioinformatics Section at the National Cancer Institute. He inaugurated two of the most important academic bioinformatics conferences, ISMB and PSB, and was the founding President of the International Society for Computational Biology. Dr. Hunter's research interests span a wide range of areas, from cognitive science to rational drug design. His primary focus recently has been the integration of natural language processing, knowledge representation and machine learning techniques and their application to interpreting data generated by high throughput molecular biology.


Chris Lintott, University of Oxford: "Efficient crowdsourcing: How to do science with 600,000 participants"
Abstract: Citizen science in the form of crowdsourcing has proved to be an effective response to the growing size of scientific datasets. This talk will present strategies and results from the Zooniverse, a collection of projects which have enabled more than 500,000 people to help scientists classify galaxies, discover planets, sort through whale songs and even transcribe ancient papyri. As datasets continue to grow, these projects must adapt, and the talk will concentrate on methods which move beyond the current naive random task assignment model. A dynamic Bayesian classification of volunteers, applied to a supernova hunting project, was able to achieve much greater efficiency in classification, while dividing classifiers into communities based on their ability and behaviour. Future development of such systems will need to incorporate such analysis into their methodology, allowing user behaviour to guide task allocation, training and perhaps even collaboration.

Biography: Chris Lintott is a researcher in the department of physics at the University of Oxford where he is also a junior research fellow at New College. As PI of Galaxy Zoo and chair of the Citizen Science Alliance, he leads a large, transatlantic and multidisciplinary team of developers, scientists and educators with the aim of building the widest possible range of projects that enable meaningful public participation in science. His own research focuses on the formation of the present day population of galaxies, and he is a strong advocate of public understanding of science. In this latter role he serves on the board of trustees of Royal Museums Greenwich, and is co-presenter of the long-running BBC series 'The Sky at Night'.


Hod Lipson, Cornell University: "The Robotic Scientist: Distilling Natural Laws from Experimental Data, from particle physics to computational biology"
Abstract: Can machines discover scientific laws automatically? For centuries, scientists have attempted to identify and document analytical laws that underlie physical phenomena in nature. Despite the prevalence of computing power, the process of finding natural laws and their corresponding equations has resisted automation. This talk will outline a series of recent research projects, starting with self-reflecting robotic systems, and ending with machines that can formulate hypotheses, design experiments, and interpret the results, to discover new scientific laws. While the computer can discover new laws, will we still understand them? Our ability to have insight into science may not keep pace with the rate and complexity of automatically-generated discoveries. Are we entering a post-singularity scientific age, where computers not only discover new science, but now also need to find ways to explain it in a way that humans can understand? We will see examples from psychology to cosmology, from classical physics to modern physics, from big science to small science.

Biography: Hod Lipson is an Associate Professor of Mechanical & Aerospace Engineering and Computing & Information Science at Cornell University in Ithaca, NY. He directs the Creative Machines Lab, which focuses on novel ways for automatic design, fabrication and adaptation of virtual and physical machines. He has led work in areas such as evolutionary robotics, multi-material functional rapid prototyping, machine self-replication and programmable self-assembly. Lipson received his Ph.D. from the Technion - Israel Institute of Technology in 1998, and continued to a postdoc at Brandeis University and MIT. His research focuses primarily on biologically-inspired approaches, as they bring new ideas to engineering and new engineering insights into biology.


Jude Shavlik, University of Wisconsin - Madison: "Human-in-the-Loop Machine Learning"
Abstract: Machine learning has made tremendous progress over the past several decades. It has become one of today’s most important technologies for discovery and its future impact is likely to grow rapidly for the foreseeable future. However, to use the powerful capabilities offered by machine learning, domain experts typically need to find a collaborator who is a highly trained computer scientist possessing substantial experience with machine learning. This greatly limits the impact of this powerful technology. We are addressing the important challenge of reducing the barrier to entry for using machine learning by allowing domain experts to more directly communicate their expertise to machine learning algorithms. Providing such domain expertise in an effective manner promises to democratize machine learning, more quickly spreading this valuable technology to tasks where it can have a substantial impact. We are focusing on allowing domain experts to do more than providing (a) the features used to describe examples and (b) the desired outputs for training examples. We are creating learning algorithms that accept naturally expressed 'advice' whenever a domain expert has some knowledge that he or she wishes to provide. The human-provided advice need not be 100% correct since our learning algorithms are robust in the presence of imperfect advice.

Biography: Jude Shavlik is a Professor of Computer Sciences and of Biostatistics and Medical Informatics at the University of Wisconsin - Madison, and is a Fellow of the Association for the Advancement of Artificial Intelligence (AAAI). He has been at Wisconsin since 1988, following the receipt of his PhD from the University of Illinois for his work on Explanation-Based Learning. His current research interests include machine learning and computational biology, with an emphasis on using rich sources of training information, such as human-provided advice. He served for three years as editor-in-chief of the AI Magazine and serves on the editorial board of about a dozen journals. He chaired the 1998 International Conference on Machine Learning, co-chaired the First International Conference on Intelligent Systems for Molecular Biology in 1993, co-chaired the First International Conference on Knowledge Capture in 2001, was conference chair of the 2003 IEEE Conference on Data Mining, and co-chaired the 2007 International Conference on Inductive Logic Programming. He was a founding member of both the board of the International Machine Learning Society and the board of the International Society for Computational Biology. He co-edited, with Tom Dietterich, "Readings in Machine Learning." His research has been supported by DARPA, NSF, NIH (NLM and NCI), ONR, DOE, AT&T, IBM, and NYNEX.


Important Note to Authors

Authors of accepted submissions must submit the final version of their papers by September 7, 2012. Papers should be no longer than 8 pages and follow the AAAI style files. AAAI will email the submission site to the authors directly, along with a request to submit the permission to distribute form.

Any author with special audio-visual needs for their presentation (such as poster boards, power strips, flipcharts or laptop speakers/sound) should send the information in the audio/visual form below to the organizers immediately upon acceptance of their submissions.

Register now! Attendance is limited, so we recommend registering as soon as possible and alerting interested colleagues to do so.

AUTHOR AUDIO/VISUAL FORM

All rooms in which the Symposia are held will have as standard equipment an LCD projector and screen. Individuals with special audio-visual needs (such as poster boards, power strips, flipcharts or laptop speakers/sound) for their presentations are requested to return the form by September 7, 2012 to AAAI at fss12@aaai.org.


PRESENTER NAME:
SYMPOSIUM:
PAPER TITLE:
DATE AND TIME OF YOUR PRESENTATION:
TELEPHONE:
EMAIL:
SPECIAL A/V REQUEST (please only list what you would like AAAI to provide):

Please note that A/V requests are subject to budget restrictions. Authors are required to provide their own laptop computers, as well as all software needed to operate programs. Additional connections to the Internet in meeting rooms are subject to availability and budget considerations, and must be requested at least two months prior to the event.

Registration

Attendance is limited, so we recommend that you register as soon as possible and alert any interested colleagues to do so.

Registration is already open, and can be done on-line at the AAAI Fall Symposia registration site. To register by mail, use the registration form.

We seek submissions that: (1) report on success stories that illustrate the potential of future research in this field; (2) discuss lessons learned in the process of addressing challenging aspects of the scientific process; (3) analyze the impact of a particular technique in an area of science and reflect on its potential for broader applicability in other sciences; and (4) propose future concepts grounded in lessons learned and an understanding of the challenges in the scientific discovery process.

Topics of interest include but are not limited to:

  • Ontologies and knowledge bases that model particular areas of scientific knowledge
  • Semantic representations of metadata for all aspects of scientific processes
  • Techniques for organizing scientific literature
  • Workflow systems to manage complex data analysis processes
  • Knowledge discovery techniques that are embedded in the context of scientific investigations
  • Integrative approaches of machine learning and scientific model induction
  • Automated systems for experiment design, data analysis, and hypothesis generation and refinement
  • User-centered design of intelligent systems that partner with scientists to perform complex tasks
  • Integrated approaches to visualizing data, models, and the connections between them to foster new insights
  • Cognitive-centered design of scientist aids
  • Social computing systems that let novice participants contribute to scientific tasks

Submissions should be up to 6 pages, using the AAAI style files. Submissions should be uploaded to the submission site no later than June 5, 2012, before midnight in the time zone of your choice.

AAAI will hold the compilation copyright on the set of papers for your symposium, and will make them freely accessible in the AAAI Digital Library. Authors of accepted papers will be required to sign the AAAI Distribution License. Authors are allowed to post their papers at their own sites, and retain copyright to their papers.

Important Dates


Submission deadline: June 22, 2012
Notification to authors: July 31, 2012
Camera-ready due: September 7, 2012
Registration deadline: September 14, 2012
Symposium: November 2-4, 2012

CO-CHAIRS

  • Will Bridewell, Stanford University
  • Yolanda Gil, University of Southern California
  • Haym Hirsh, Rutgers University
  • Kerstin Kleese van Dam, Pacific Northwest National Laboratory
  • Karsten Steinhaeuser, University of Minnesota

PROGRAM COMMITTEE

  • Cecilia Aragon, University of Washington
  • Phil Bourne, University of California San Diego
  • Elizabeth Bradley, University of Colorado
  • Paolo Ciccarese, Harvard University
  • Susan Davidson, University of Pennsylvania
  • Helena Deus, Digital Enterprise Research Institute
  • Tom Dietterich, Oregon State University
  • Yolanda Gil, University of Southern California
  • Clark Glymour, Carnegie Mellon University
  • Carla Gomes, Cornell University
  • Alexander Gray, Georgia Institute of Technology
  • Larry Hunter, University of Colorado
  • David Jensen, University of Massachusetts Amherst
  • Vipin Kumar, University of Minnesota
  • Hod Lipson, Cornell University
  • Huan Liu, Arizona State University
  • Yan Liu, University of Southern California
  • Miriah Meyer, University of Utah
  • Mark Musen, Stanford University
  • Andrey Rzhetsky, University of Chicago
  • Steve Sawyer, Syracuse University
  • Alex Schliep, Rutgers University
  • Christian Schunn, University of Pittsburgh
  • Nigam Shah, Stanford University
  • Alex Szalay, The Johns Hopkins University
  • Loren Terveen, University of Minnesota
  • Raul E. Valdes-Perez, Vivisimo Inc.
  • Evelyne Viegas, Microsoft Research

Travel

Symposium Location

The symposium will be held at the Westin Arlington Gateway in Arlington, Virginia. The hotel is next to the Ballston Metro, and within walking distance of NSF, ONR, AFOSR, DARPA, and other government agencies.

The symposium is part of the AAAI Fall Symposium Series. Please visit the site for the 2012 AAAI Fall Symposia for location, travel arrangements, and other information about the event.

Hotel

A block of rooms has been reserved for attendees at the symposium hotel, the Westin Arlington Gateway. To reserve a room online, please go to the on-line reservations site. Space is limited, so we recommend that you make your reservation now. Reservations made after Wednesday, October 10 will be accepted based on availability at the hotel's prevailing rate.

Local Arrangements

The symposium is organized by AAAI as part of its AAAI Fall Symposium Series. Please visit the site for the 2012 AAAI Fall Symposia for location, travel arrangements, and other information about the event.