The emerging parallel computing architectures, including grid and cloud computing on shared-nothing clusters of computing nodes, Chip Multiprocessors (CMPs) and General Purpose computing on Graphics Processing Units (GPGPUs), have significant impacts on geospatial computing. While the leading commercial and open source GIS and spatial databases have been slow in adopting these new parallel computing technologies (for many practical reasons), as argued in our HPDGIS 2010 position paper, we believe that parallel GIS is the future of geospatial computing due to the inherent parallelisms in geospatial data and geospatial data processing. We are particularly interested in GPGPUs for spatial indexing and query processing for two primary reasons:


First, while grid and cloud computing increasingly rely on identical commodity hardware to build clusters to solve large-scale problems, as argued in a PPoPP 2011 paper by Hong et al., GPU architectures closely resemble supercomputers in that both implement the primary Parallel Random Access Machine (PRAM) characteristic of utilizing a very large number of threads with uniform memory latency (as in the Cray XMT). Besides being cost-effective and energy-efficient for solving small to medium sized problems directly on GPU-equipped personal workstations, algorithms that can fully utilize GPGPU hardware capability on a single node will naturally boost the performance of grid/cloud computing resources on cluster computers when solving larger scale problems.
Second, among the classic categories of geospatial operations, i.e., local, focal, zonal and global, local and focal operations are relatively easy to parallelize on both CMPs and GPGPUs. Many such techniques can be borrowed from computer vision and image processing research, which is arguably a bigger research area that has attracted many talented researchers. On the other hand, zonal and global geospatial operations are more specific to geospatial computing and their parallelization has been less well researched. While some experience can be leveraged from research in computer graphics, such as ray tracing and iso-surface construction using hierarchical tree indices, those techniques are very specific to graphics applications and may not be suitable for spatial queries that may involve arbitrary spatial (and relational) criteria. We believe spatial indexing and spatial queries are among the most difficult global/zonal geospatial operations to parallelize on GPGPUs, and thus they pose significant technical challenges and offer significant research value.
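To make the contrast concrete, below is a minimal CUDA sketch (illustrative code, not taken from our implementations) of a local raster operation: each output cell depends only on the corresponding input cell, so one thread per cell suffices and no inter-thread communication is needed, which is exactly why this class of operations parallelizes so easily.

```
// Minimal sketch of a local raster operation on the GPU (illustrative only).
// Each thread reclassifies exactly one cell; the kernel is embarrassingly
// parallel because no inter-thread communication is needed.
__global__ void reclassify(const float* in, int* out, int n, float threshold)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        // Local operation: output cell i depends only on input cell i.
        out[i] = (in[i] > threshold) ? 1 : 0;
    }
}

// A typical launch over an n-cell raster:
//   int threads = 256;
//   reclassify<<<(n + threads - 1) / threads, threads>>>(d_in, d_out, n, 0.5f);
```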


As a research lab, our current work has two focus areas:

  1. Understanding the inherent parallelism in spatial indexing and query processing and investigating efficient and effective ways to map such parallelism to GPGPU hardware architectures. We are currently adopting a parallel primitives based approach for this purpose (see the sketch after this list). While the same idea was explored by Erik G. Hoel and Hanan Samet, we believe the research topic is worth revisiting. First, commodity GPUs are now widely available, which makes the new research potentially applicable to every desktop/server computer that is equipped with a GPU. According to CACM, NVIDIA alone shipped almost 220 million CUDA-capable GPUs from 2006 to 2010. This is in contrast to the previous research, which relied on the supercomputers of the day that were available only to highly selected groups/individuals. Second, while previous research focused mostly on polygon/polyline data, the location data, point cloud data and raster data collected by modern sensing and navigation technologies are becoming more and more prevalent. Furthermore, instead of composing generic parallel primitives on 1D vector data, we are more interested in developing geospatial-specific parallel primitives that work better with geospatial data and computing.
  1. Designing and implementing popular geospatial operations on GPGPUs. Different from the first focus area, here we are interested in efficient designs and implementations that fully utilize GPGPU parallel computing power and push the limits of the parallel hardware for geospatial computing. This research focus not only can be directly applied to solve real-world problems (and stimulate similar designs, implementations and applications), but also serves as a benchmark for the primitives based designs and implementations, helping us understand the tradeoffs between coding complexity and code efficiency.
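As a concrete (and deliberately simplified) example of the primitives based style mentioned in the first item, the Thrust sketch below groups points into uniform grid cells by composing two generic primitives, sort_by_key and reduce_by_key. The function and variable names are our own illustrations, and we assume the per-point cell ids have already been computed:

```
#include <thrust/device_vector.h>
#include <thrust/sort.h>
#include <thrust/reduce.h>

// Group points by grid-cell id: sort point ids by their cell keys, then
// count points per cell with a keyed reduction. The resulting (cell, count)
// pairs are the kind of building block a grid/quadtree index is assembled from.
void group_points_by_cell(thrust::device_vector<unsigned int>& cell,   // one cell id per point
                          thrust::device_vector<int>& point_id,        // permuted in place
                          thrust::device_vector<unsigned int>& out_cell,
                          thrust::device_vector<int>& out_count)
{
    thrust::sort_by_key(cell.begin(), cell.end(), point_id.begin());
    thrust::device_vector<int> ones(cell.size(), 1);
    out_cell.resize(cell.size());
    out_count.resize(cell.size());
    auto ends = thrust::reduce_by_key(cell.begin(), cell.end(), ones.begin(),
                                      out_cell.begin(), out_count.begin());
    out_cell.resize(ends.first - out_cell.begin());
    out_count.resize(ends.second - out_count.begin());
}
```

A geospatial-specific primitive could, for example, fuse the cell-id computation, the sort and the reduction into a single "group-by-cell" operation so that the intermediate arrays never need to be materialized.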

 
From 2010 to 2012, with very limited internal support, we made good progress in both focus areas (see our HPC-G related publications). Starting with indexing raster data (GIS10, GIS11), we expanded our scope to polygon rasterization (HPDGIS11), nearest neighbor based interpolation (Com.Geo12) and spatial databases (see the CudaGIS page for a summary of the results up to that time). Along the parallel primitives based geospatial indexing and query processing direction, we recently re-implemented the BMMQ-Tree construction algorithm using parallel primitives. We refer interested readers to the PrimQuad project page for more details, where they can download the source code and test data to repeat our experiments.
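The actual primitives based BMMQ-Tree construction is documented on the PrimQuad page; purely as a hedged illustration of how one construction step maps to a primitive (the names and types below are our simplification, not the actual code), the per-quadrant minima for one tree level can be computed with a single keyed reduction over cells sorted by their parent-quadrant Morton key:

```
#include <thrust/device_vector.h>
#include <thrust/reduce.h>
#include <thrust/functional.h>

// Given raster cells sorted by the Morton key of their parent quadrant,
// one reduce_by_key pass yields the per-quadrant minima for the next tree
// level; a second pass with thrust::maximum yields the maxima.
void quadrant_minima(const thrust::device_vector<unsigned int>& parent_key,
                     const thrust::device_vector<short>& cell_value,
                     thrust::device_vector<unsigned int>& out_key,
                     thrust::device_vector<short>& out_min)
{
    out_key.resize(parent_key.size());
    out_min.resize(parent_key.size());
    auto ends = thrust::reduce_by_key(
        parent_key.begin(), parent_key.end(), cell_value.begin(),
        out_key.begin(), out_min.begin(),
        thrust::equal_to<unsigned int>(), thrust::minimum<short>());
    out_key.resize(ends.first - out_key.begin());
    out_min.resize(ends.second - out_min.begin());
}
```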

During the summer of 2012, we made significant progress towards efficient spatial joins on GPGPUs. Motivated by the practical needs of analyzing hundreds of millions of taxi trip records, we have been focusing on joining these point locations with urban infrastructure data in NYC, including street segments, census blocks and tax blocks. By integrating several performance boosting techniques, including collective query (through quasi-indexing point locations on GPUs), simple array based data structures and exploiting GPU floating point computing power, we are able to achieve 3-4 orders of magnitude (1,000X-10,000X) speedup over a reference serial CPU implementation using leading open source GIS software packages (which is already more efficient than spatial databases). We refer readers to the respective project pages for further details: [Point to Network][Point in Polygon][Point to Polygon][Trajectory to Trajectory].
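To give a flavor of the simple array based, GPU-friendly style (this is a generic textbook-style sketch, not one of our tuned kernels): polygon rings are stored as flat coordinate arrays and each thread tests one point using the standard ray-crossing rule. In a real spatial join, a grid or quadtree based filter step would first pair each point with its candidate polygons.

```
// Classic ray-crossing point-in-ring test over flat vertex arrays.
__device__ bool point_in_ring(float px, float py,
                              const float* vx, const float* vy, int n)
{
    bool inside = false;
    for (int i = 0, j = n - 1; i < n; j = i++) {
        // Count crossings of a horizontal ray emanating from (px, py).
        if (((vy[i] > py) != (vy[j] > py)) &&
            (px < (vx[j] - vx[i]) * (py - vy[i]) / (vy[j] - vy[i]) + vx[i]))
            inside = !inside;
    }
    return inside;
}

// One thread per point; all points are tested against a single ring here,
// whereas a full join kernel would loop over each point's candidate rings.
__global__ void pip_test(const float* px, const float* py, int num_points,
                         const float* vx, const float* vy, int ring_size,
                         int* result)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < num_points)
        result[i] = point_in_ring(px[i], py[i], vx, vy, ring_size);
}
```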

We are glad to have received a $450K/4yr award from the highly competitive Information and Intelligent Systems (IIS) Medium Collaborative Research program at the National Science Foundation, starting from 08/01/2013. The project is in collaboration with the University of Oklahoma, which was funded separately. The relatively long term support allows us to begin investigating architectural designs and integration with mainstream BigData systems. We have studied several open source BigData systems, including Apache Hadoop (with Hive and Pig), SciDB, Cloudera Impala and Spark/Shark. We have decided to extend Cloudera Impala to support GPUs (and multi-core CPUs) for efficient INTEGRATED spatial and relational queries. The decision is based on the fact that, as of now, Cloudera Impala is the only C/C++ based open source BigData system, and a C/C++ codebase is necessary to use SIMD computing power on GPUs (through CUDA) and VPUs on CPUs (through SSE/AVX/AVX-512).

During the Spring of 2014, we were able to extend the Impala frontend with a "SpatialJoin" keyword and use OpenMP/TBB to parallelize spatial joins on multi-core CPUs by extending Impala's "BlockJoin". We note that, while Impala is able to access HDFS and load, scan and transform (through UDFs) data using multiple threads, it unfortunately does not currently support joins using multiple threads. While the accomplishment is significant, our goal is to support distributed spatial and non-spatial queries on distributed computing nodes with GPUs (i.e., GPU clusters). Based on what we have learned so far, this may require completely rewriting the Impala backend except for the modules that interface with HDFS. During the system integration process, we also plan to add the [R-Tree] indexing module and the [Selectivity Estimation] module to support query optimization.
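To illustrate the multi-core parallelization pattern (the data structures and numbers below are toy stand-ins, not Impala's internals): probe-side row batches are independent of one another, so a block-style spatial join can be parallelized with a single OpenMP parallel-for, using dynamic scheduling to absorb skew across batches. The host-side code compiles as part of a CUDA program with the host compiler's OpenMP flag enabled.

```
#include <omp.h>
#include <cstdio>
#include <vector>

struct Point { float x, y; };
struct Box   { float xmin, ymin, xmax, ymax; };

int main()
{
    // Toy stand-ins for probe-side row batches and a build-side box table.
    std::vector<std::vector<Point>> batches(8, std::vector<Point>(1000, Point{0.5f, 0.5f}));
    std::vector<Box> boxes(100, Box{0.f, 0.f, 1.f, 1.f});
    std::vector<long> matches(batches.size(), 0);

    // Each batch is joined by one thread; batches are independent, and
    // dynamic scheduling balances uneven per-batch matching work.
    #pragma omp parallel for schedule(dynamic)
    for (int b = 0; b < (int)batches.size(); ++b)
        for (const Point& p : batches[b])
            for (const Box& q : boxes)                  // bounding-box filter step
                if (p.x >= q.xmin && p.x <= q.xmax &&
                    p.y >= q.ymin && p.y <= q.ymax)
                    ++matches[b];

    for (std::size_t b = 0; b < batches.size(); ++b)
        std::printf("batch %zu: %ld matches\n", b, matches[b]);
    return 0;
}
```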


Stay tuned for our latest progress! Please drop me a line at jzhang at cs dot ccny dot cuny dot edu if you have comments and/or experiences to share.