Distributed Metadata Management

Description

As the number and variety of files stored and accessed by users increases dramatically, existing file system structures have begun to fail as a mechanism for managing the information contained in those files. Many applications, such as email clients, multimedia management applications, and desktop search engines, have been forced to develop their own richer metadata infrastructures. While effective, these solutions are generally non-standard, non-portable, and potentially non-scalable. These issues suggest that search, indexing, and information retrieval are becoming increasingly important areas for file and storage systems. In conjunction with faculty and students specializing in information retrieval at the UC Santa Cruz Department for Information Systems and Technology Management, we are developing system architectures that address these issues and scale to billions of files.

Status
Our current areas of focus are scalable indexing architectures for storage systems, improved file system interfaces for search, and incorporating concepts from information retrieval, such as faceted search, into file systems. The design of our scalable indexing architecture leverages the unique characteristics of storage systems, such as data distribution and hierarchical namespaces, to facilitate efficient and robust search and management queries. This work is done in collaboration with Network Appliance.

In addition, we are designing a file system query language, QUASAR, that gives users powerful semantic access to stored data. QUASAR allows semantic file system views and directories to be created, which provide more meaningful data representations. Inter-file relationships, such as provenance, can also be expressed and searched through links. Finally, we are exploring how faceted search and new ranking algorithms can improve data access. We are in the process of implementing these concepts in the Ceph distributed file system.

QUASAR provides a system-level interface, but end users need an easy-to-use browser to help them navigate the file system. We are currently investigating how to apply faceted search, a technique that has seen success with digital libraries, to the file system domain. Faceted search uses rich key-value metadata to allow users to interactively navigate the search space. We are also investigating how these interfaces can be automatically personalized so that each user of a shared file system can easily find the files they are most interested in.
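
As a rough illustration of how faceted navigation over key-value file metadata can work, the sketch below filters a set of files by the facet values a user has selected and then counts the values still available for further refinement. It is a minimal in-memory example, not the QUASAR or Ceph implementation; the sample records, the attribute names (`owner`, `type`, `project`), and the functions `refine` and `facet_counts` are all hypothetical.

```python
from collections import Counter

# Hypothetical metadata records: each file carries key-value attributes.
FILES = [
    {"path": "/home/alice/paper.tex", "owner": "alice", "type": "tex", "project": "ceph"},
    {"path": "/home/alice/fig1.png", "owner": "alice", "type": "png", "project": "ceph"},
    {"path": "/home/bob/notes.txt", "owner": "bob", "type": "txt", "project": "quasar"},
    {"path": "/home/bob/draft.tex", "owner": "bob", "type": "tex", "project": "quasar"},
]


def refine(files, selections):
    """Keep only files whose metadata matches every selected facet value."""
    return [
        f for f in files
        if all(f.get(key) == value for key, value in selections.items())
    ]


def facet_counts(files, facets):
    """For each facet key, count how many of the given files carry each value."""
    return {
        key: Counter(f[key] for f in files if key in f)
        for key in facets
    }


# The user first narrows the search space to TeX files...
selected = {"type": "tex"}
matches = refine(FILES, selected)

# ...and the interface then shows which owners and projects remain to choose from.
counts = facet_counts(matches, ["owner", "project"])
```

Each further selection is simply added to `selected` and the two steps are repeated, which is what makes the navigation interactive. A real system would answer these queries from an index rather than scanning every record.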
Last modified 27 May 2008

© 2008 SSRC & UCSC