In all areas of science, technology, and digital industry, vast amounts of data are produced that need to be reliably and efficiently stored, processed, accessed, and shared. Traditional distributed computing systems are inadequate for the complex data handling problems of this new data-rich world, which calls for a new paradigm: data-aware distributed computing. Advances in data-aware distributed computing will capitalize on NSF’s investments in TeraGrid, DataNets, and other large-scale cyberinfrastructure and computational science efforts, and will directly impact scientific discovery and economic development in the nation.