The DB group meets Wednesday afternoons at 2:30pm. The list below gives the times and locations of upcoming meetings. Each meeting lasts for an hour and features either a local speaker or, on Seminar days, an invited outside speaker. Everyone is welcome to attend.
| DB Meeting: | Wednesday January 11, 2:30pm, DC 1331 |
| Speaker: | Umar Farooq Minhas |
| Title: | Scalable and Highly Available Database Systems in the Cloud |
| Abstract: | Cloud computing allows users to tap into a massive pool of shared computing resources such as servers, storage, and network. These resources are provided as a service to the users, allowing them to "plug into the cloud" similar to a utility grid. The promise of the cloud is to free users from the tedious and often complex task of managing and provisioning computing resources to run applications. At the same time, the cloud brings several additional benefits, including a pay-as-you-go cost model, easier deployment of applications, elastic scalability, high availability, and a more robust and secure infrastructure. One important class of applications that users are increasingly deploying in the cloud is database management systems. Database management systems differ from other types of applications in that they manage large amounts of state that is frequently updated, and that must be kept consistent at all scales and in the face of failure. This makes it difficult to provide scalability and high availability for database systems in the cloud. In this talk, I will show how we can exploit cloud technologies and relational database systems to provide a highly available and scalable database service in the cloud. In the first part of my talk, I will present RemusDB, a reliable, cost-effective high availability solution that is implemented as a service provided by the virtualization platform. RemusDB can make any database system highly available with little or no code modification by exploiting the capabilities of virtualization. In the second part of the talk, I will present two systems that aim to provide elastic scalability for database systems in the cloud using two very different approaches. The three systems I will present bring us closer to the goal of building a scalable and reliable transactional database service in the cloud. |
| DB Meeting: | Wednesday January 18, 2:30pm, DC 1331 |
| Speaker: | Jiewen Wu |
| Title: | Answering Object Queries in DL Knowledge Bases |
| Abstract: | We consider a generalization of instance retrieval over description-logics knowledge bases that provides users with assertions in which descriptions of qualifying objects are given in addition to their identifiers. Notably, this involves a transfer of basic database paradigms involving caching and query rewriting in the context of an assertion retrieval algebra. We present a query optimization framework for this algebra, with a focus on finding plans that avoid any need for general knowledge base reasoning at query execution time when sufficient cached results of earlier requests exist. |
| DB Seminar: | Wednesday January 25, 2:30pm, DC 1302 |
| Speaker: | Ryan Johnson, University of Toronto |
| Title: | Communication and co-design for scalable database engines |
| DB Meeting: | Wednesday February 1, 2:30pm, DC 1331 |
| Speaker: | Iman Elghandour |
| Title: | ReStore: Reusing Results of MapReduce Jobs |
| Abstract: | Analyzing large-scale data has emerged as an important activity for many organizations in the past few years. This large-scale data analysis is facilitated by the MapReduce programming and execution model and its implementations, most notably Hadoop. Users of MapReduce often have analysis tasks that are too complex to express as individual MapReduce jobs. Instead, they use high-level query languages such as Pig, Hive, or JAQL to express their complex tasks. The compilers of these languages translate queries into workflows of MapReduce jobs. Each job in these workflows reads its input from the distributed file system used by the MapReduce system and produces output that is stored in this distributed file system and read as input by the next job in the workflow. In my talk, I will present ReStore, a system that manages the storage and reuse of such intermediate results. |
| DB Meeting: | Wednesday February 8, 2:30pm, DC 1331 |
| Speaker: | Greg Drzadzewski |
| Title: | Online Analytical Processing of Documents |
| Abstract: | With the availability of many large and ever-growing document collections, it is becoming more cumbersome for users to explore them. While a search engine is useful for satisfying a user's ad hoc information needs, allowing a user to retrieve relevant documents through a keyword query, it is inadequate for the analysis of bulky text information that is necessary in many online applications. This type of exploration need can be addressed by providing support for online applications such as summarizing the contents of a text cell, and comparing the contents across multiple text cells. In my talk I will examine online analytical processing of documents and discuss the following two papers that deal with this area of research: |
| Computer Science Seminar: | Wednesday February 15, 10:30am, DC 1304 |
| Speaker: | Julia Stoyanovich, University of Pennsylvania |
| Title: | Information Discovery in Large Complex Datasets |
| DB Meeting: | Wednesday February 15, 2:30pm, DC 1331 (POSTPONED to February 29) |
| Speaker: | Alex Hudek |
| Title: | On Enumerating Query Plans Using Interpolants |
| Abstract: | For relational (SQL) queries, a standard service provided by current relational systems is to search the space of alternative query plans (ways of executing the query) to find one likely to have the best performance. A given query often has many semantically equivalent plans that vary in performance by many orders of magnitude, making the problem of finding a best plan difficult. Recent trends in view-based query rewriting, information integration, and ontology-based data access have made the relationship between the query and its plan space much more complex. Enumerating the possible plans has become even more challenging as the relationship between the user (logical) view of the data and the material capabilities for accessing relevant stored information has become less transparent. In this paper, we show how to use interpolation techniques to enumerate possible plans for a given user query. We also show how to obtain common varieties of plan patterns in this setting, such as those that derive from an enumeration of possible join orders for conjunctive (sub) queries. |
| Seminar: | Thursday February 16, 4:00pm, DC 1331 |
| Speaker: | Arnon Sturm, Ben-Gurion University of the Negev |
| Title: | A Methodology for Developing Secure Database Code |
| Abstract: | Security in general, and database protection from unauthorized access in particular, are crucial to organizations. Several methods and techniques have been devised to address this concern. However, none of these provides a comprehensive solution. In this talk we explore work done within a research project that aims to develop a methodology that guides developers, in particular database designers, to deal with database security requirements related to authorization in the early stages of development, and enforces that they do so. The proposed methodology makes it possible to define and enforce organizational security policies, and to validate that security requirements defined by the designers of an application are in accord with the organizational transformation of the design results into actual implementation, i.e., into the specification of the database code, including the authorization specification. We also present an empirical evaluation of part of the proposed approach. |
| DB Meeting: | Wednesday February 29, 2:30pm, DC 1331 |
| Speaker: | Alex Hudek |
| Title: | On Enumerating Query Plans Using Interpolants |
| Abstract: | For relational (SQL) queries, a standard service provided by current relational systems is to search the space of alternative query plans (ways of executing the query) to find one likely to have the best performance. A given query often has many semantically equivalent plans that vary in performance by many orders of magnitude, making the problem of finding a best plan difficult. Recent trends in view-based query rewriting, information integration, and ontology-based data access have made the relationship between the query and its plan space much more complex. Enumerating the possible plans has become even more challenging as the relationship between the user (logical) view of the data and the material capabilities for accessing relevant stored information has become less transparent. In this paper, we show how to use interpolation techniques to enumerate possible plans for a given user query. We also show how to obtain common varieties of plan patterns in this setting, such as those that derive from an enumeration of possible join orders for conjunctive (sub) queries. |
| Computer Science Seminar: | Monday March 5, 10:30am, DC 1304 |
| Speaker: | Leman Akoglu, Carnegie Mellon University |
| Title: | Massive Graph Analytics: Patterns, Anomalies, and Tools |
| DB Meeting: | Wednesday March 7, 2:30pm, DC 1331 |
| Speaker: | Ahmed Ataullah |
| Title: | TBD |
| Abstract: |
| DB Meeting: | Wednesday March 14, 2:30pm, DC 1331 |
| Speaker: | Gunes Aluc |
| Title: | |
| Abstract: |
| DB Meeting: | Wednesday March 28, 2:30pm, DC 1331 |
| Speaker: | Ani Nica |
| Title: | TBA |
| Abstract: |
| DB Seminar: | Monday April 9, 2:30pm, DC 1302 |
| Speaker: | Martin Kersten, CWI |
| Title: | TBD |
| DB Meeting: | Wednesday April 18, 2:30pm, DC 1331 |
| Speaker: | Ahmed Soror |
| Title: | TBA |
| Abstract: |
| DB Meeting: | Wednesday April 25, 2:30pm, DC 1331 |
| Speaker: | Ning Zhang |
| Title: | TBA |
| Abstract: |

Database Research Group
David R. Cheriton School of Computer Science
University of Waterloo
Waterloo, Ontario, Canada N2L 3G1
Tel: 519-888-4567
Fax: 519-885-1208
Feedback: db-webmaster@cs.uwaterloo.ca