Graham Cummins
Washington State University
Design of a Semantic Type System to Facilitate Data Sharing and Analysis Tool Reuse
Wednesday 19th of October 2011 at 12:00pm
560 Evans
Data sharing between labs, and indeed between disciplines, reduces
duplication of effort, facilitates new discoveries, and leads to the
development of more flexible, reliable, and reproducible analysis
techniques. Initially, a data sharing solution is required to support
entry, storage, and transfer of data. To be useful, however, it must
also help potential collaborators locate appropriate data
sets, apply their analysis tools, and interpret the results
meaningfully. This requires that the stored data sets incorporate
information describing both their structure and their meaning,
preferably in a form that is useful both to humans and machines.
Currently, most approaches to this problem focus on tagging data sets
with additional meta-data. These tags are human readable labels that
describe the meaning, and typically also the origin and intended use,
of the data. Meta-data tags, however, are of limited use to machines.
They can be searched over, but usually do not specify enough
information to allow the application of analysis tools. I present an
alternative approach to data markup, which falls between the domains
of meta-data tagging and computational data types (in the sense used in
the design of computer languages). I call this approach a semantic
type system. I present a design for such a system, which is based
heavily on the pattern-matching behavior of functional computer
languages, and an initial implementation of this design. I will show
examples of how this system provides an application interface for
analysis tools, how it can be used by humans to understand data, and
also how it can be used to make data sets easier to search. Finally, I
propose a possible mechanism for converting recorded data into
appropriately semantically typed structures.