My current focus is on integrating HPC with DBMS concepts for data-intensive computations.
Specifically, I work on making scientific array data available through a declarative language and advanced query processing capabilities.
My work with the OSU DB Research Group relates to interactive DBMS engines.
We explore the effect interactive databases have on DB engine design.
For this research area, I developed a database engine (storage component and execution engine) that is able to satisfy our (high) performance needs for interactive queries.
I developed an efficient hand detection and tracking algorithm/mechanism. It is implemented within an iPhone app.
The purpose of this work was to explore efficient ways to detect objects on devices with limited processing abilities.
In addition, I worked on improving sampling performance over a Bayesian Network with all discrete distributions.
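As a point of reference, the usual baseline for sampling from a Bayesian network with all-discrete distributions is forward (ancestral) sampling: visit the nodes in topological order and draw each one from its conditional table given its already-sampled parents. The sketch below illustrates only that baseline, not the improved method from this work. The toy network, its probabilities, and all function names are hypothetical.

```python
import random

# Hypothetical toy network: rain -> wet_grass (values are illustrative only).
# Each node maps to (parents, CPT). The CPT maps a tuple of parent values
# to a distribution of the form {value: probability}.
NETWORK = {
    "rain": ((), {(): {True: 0.2, False: 0.8}}),
    "wet_grass": (("rain",), {
        (True,): {True: 0.9, False: 0.1},
        (False,): {True: 0.1, False: 0.9},
    }),
}
ORDER = ["rain", "wet_grass"]  # topological order: parents before children


def sample_discrete(dist, rng):
    """Draw one value from a {value: probability} distribution."""
    r = rng.random()
    cumulative = 0.0
    for value, p in dist.items():
        cumulative += p
        if r < cumulative:
            return value
    return value  # guard against floating-point rounding in the last bucket


def forward_sample(network, order, rng):
    """Sample every node once, conditioning on its already-sampled parents."""
    assignment = {}
    for node in order:
        parents, cpt = network[node]
        key = tuple(assignment[p] for p in parents)
        assignment[node] = sample_discrete(cpt[key], rng)
    return assignment


def estimate(network, order, node, value, n, seed=0):
    """Monte Carlo estimate of P(node == value) from n forward samples."""
    rng = random.Random(seed)
    hits = sum(
        forward_sample(network, order, rng)[node] == value for _ in range(n)
    )
    return hits / n
```

For this toy network the exact marginal is P(wet_grass) = 0.2 * 0.9 + 0.8 * 0.1 = 0.26, so `estimate(NETWORK, ORDER, "wet_grass", True, 20000)` should land close to that value.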
Dissertation
The dissertation will be available to read soon, after it is formally published by The Ohio State University.
In the dissertation you will find all the work I have completed as part of the Ph.D. program. A few of my published works were done "for fun" and were not counted towards the degree.
Publications / Research Work
2018 - In Submission...
Sckeow: An Optimizer for Multi-Level Join Execution of Skewed and Distributed Array Data (Roee Ebenstein, Gagan Agrawal)
In this work we target the optimization of join queries over data that becomes skewed during query execution.
We introduce a new execution and optimization approach for distributed evaluation of such queries. In most cases, our approach generates an ideal execution plan directly, decreasing the query evaluation effort.
FDQ: Advance Analytics over Real Scientific Array Datasets (Roee Ebenstein, Gagan Agrawal)
In this work we introduce analytical functions over scientific array data.
While introducing the new functionality, we thoroughly discuss methods for efficient (distributed) execution of the new type of queries.
BitJoin: Executing Range Based Joins in Distributed Environments (Roee Ebenstein, Gagan Agrawal)
In this work we developed an efficient mechanism to execute joins with range based join criteria.
We use a modified structure of bitmap indexes with WAH compression.
DistriPlan - An Optimized Join Execution Framework for Geo-Distributed Scientific Data (Roee Ebenstein, Gagan Agrawal, SSDBM 2017)
In this work we have developed a cost based scheme for optimization of data movement for query execution in distributed environments.
The system includes data movement costs and networking penalties, along with mechanisms to prune data movement options, which yields a realistic and practical set of options for distribution analysis and cost evaluation.
FluxQuery: An Execution Framework for Highly Interactive Query Workloads (Roee Ebenstein, Niranjan Kamat, Arnab Nandi, SIGMOD 2016)
In this paper we discuss modifications of a DBMS engine to support highly interactive workloads.
We develop a new cyclic scan engine for joins, and present two novel join algorithms that are intended to provide results in interactive time frames.
DSDQuery DSI - Querying Scientific Data Repositories with Structured Operators (Roee Ebenstein, Gagan Agrawal, IEEE BigData 2015)
In this paper we discuss querying a distributed scientific array using declarative languages.
The engine that we expanded to support these features is SDQuery DSI.
SepLog - Separating the log component from the DBMS engine. This is internal work done as part of Spyros Blanas's research seminar.
In this work I show that separating the log from the DBMS engine results in better performance, and argue that it would not sacrifice the DBMS's consistency or durability.
This work will not be published in a conference or journal, but only within OSU.