The KnowWhereGraph improves data-driven decision making and data analytics, particularly analytics that involve geographic data. It is a knowledge graph tool that supports other data-analysis tools with a geospatial component.
GeoEnrichment is the process of augmenting data with a wide range of auxiliary information (such as demographic data) tailored to a geospatial study area. GeoEnrichment tools significantly reduce the costs involved in acquiring, entering, and cleaning geo-data. Unfortunately, currently available geoenrichment services provide access to only pre-defined categories of information, do not effectively handle interconnected data, offer limited support for data integration, and are generally expensive.
The KnowWhereGraph makes data-driven decision making and data analytics substantially more effective, accessible, and affordable. The KnowWhereGraph merges novel Artificial Intelligence-based geoenrichment technologies with a knowledge graph that brings together open, cross-domain, densely integrated data spanning the human-environment interface.
The KnowWhereGraph is enabled by an open, freely usable knowledge graph. Knowledge graphs combine scalable, Web-standard technologies, specifications, and data cultures for representing densely interconnected statements derived from structured or unstructured data across domains, in both human- and machine-readable ways. The technology tools are designed to be useful to and usable by researchers, analysts, decision-makers, and the interested public in any domain or cross-domain activity requiring geospatial intelligence.
The KnowWhereGraph includes strong partnerships with non-academic and academic stakeholders: four for-profit organizations, two government agencies, and one non-profit, as well as five academic partners. The non-academic partners are ESRI (geographic information systems); Oliver Wyman (commodity markets and supply chains); Princeton Climate Analytics (weather and climate information services); In10T (digital agriculture, farm partnerships); the US Geological Survey (USGS); the Natural Resources Conservation Service within the U.S. Department of Agriculture (USDA); and DirectRelief (humanitarian aid). The academic partners are the University of California Santa Barbara (UCSB), Kansas State University (K-State), Michigan State University (MSU), Arizona State University (ASU), and the University of Southern California (USC). Additional partnerships are expected to develop during this Phase II effort.
The KnowWhereGraph is a valuable element of the National Science Foundation's (NSF) Convergence Accelerator Phase II cohort, providing geospatial tools to the other projects within the cohort. In addition, the project focuses on several strategic application areas that are likely to benefit US society: COVID-19-related supply chain disruptions, along with the US food, agriculture, and energy sectors and their attendant supply chains more generally; environmental policy issues arising from interactions among agricultural sustainability, soil conservation practice, and farm labor; and delivery of emergency humanitarian aid, within the US and internationally. Whenever knowing where is key, the KnowWhereGraph will be helpful.
Formally, a knowledge graph consists of a massive set of statements, constructed from interconnected node- and edge-labeled resources, allowing multiple, heterogeneous edges between the same nodes. A collection of definitional statements specifying the meaning of the knowledge graph's vocabulary is called its schema (KG schema) or ontology. The ontology is critical for rigorous logical interpretation and machine-actionability. Co-PI Pascal Hitzler explains, "Knowledge graphs are industry's go-to methods for complex data integration and re-use scenarios. We use rigorous and open standards together with sophisticated quality control based on years of experience and research to produce the highly versatile KnowWhereGraph. Spatial information plays a key role, and we are significantly pushing the state of the art with our technology solutions."
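The formal definition above can be made concrete with a minimal Python sketch. It is an illustration only, not the KnowWhereGraph implementation: the node names, edge labels, and the tiny schema (which merely records the node types each edge label may connect) are all hypothetical, and a real ontology expresses far more than this type check.

```python
from collections import defaultdict

# Hypothetical schema: each edge label maps to the
# (subject type, object type) pair it is allowed to connect.
SCHEMA = {
    "locatedIn": ("Place", "Place"),
    "borders": ("Place", "Place"),
    "sharesWatershedWith": ("Place", "Place"),
    "affected": ("Hazard", "Place"),
}

# Hypothetical node labels (types) for a handful of made-up resources.
NODE_TYPES = {
    "HurricaneX": "Hazard",
    "CountyA": "Place",
    "CountyB": "Place",
    "StateS": "Place",
}


class KnowledgeGraph:
    """A set of (subject, predicate, object) statements checked against a schema."""

    def __init__(self, schema, node_types):
        self.schema = schema
        self.node_types = node_types
        self.statements = set()
        # (subject, object) -> set of edge labels, so the same pair of
        # nodes can carry several different kinds of edges.
        self.edges = defaultdict(set)

    def add(self, subject, predicate, obj):
        if predicate not in self.schema:
            raise ValueError(f"unknown edge label: {predicate!r}")
        expected = self.schema[predicate]
        actual = (self.node_types.get(subject), self.node_types.get(obj))
        if actual != expected:
            raise ValueError(f"{predicate!r} expects {expected}, got {actual}")
        self.statements.add((subject, predicate, obj))
        self.edges[(subject, obj)].add(predicate)

    def labels_between(self, subject, obj):
        """Return every edge label that links subject to obj, sorted."""
        return sorted(self.edges.get((subject, obj), set()))


kg = KnowledgeGraph(SCHEMA, NODE_TYPES)
kg.add("CountyA", "locatedIn", "StateS")
kg.add("CountyA", "borders", "CountyB")
kg.add("CountyA", "sharesWatershedWith", "CountyB")
kg.add("HurricaneX", "affected", "CountyA")

print(kg.labels_between("CountyA", "CountyB"))
```

The two statements linking CountyA to CountyB show what "multiple, heterogeneous edges" means in practice, and the schema check is a small stand-in for the machine-actionability that an ontology provides: a statement such as a county "affecting" a hurricane is rejected because it does not fit the vocabulary's definitions.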
Several innovations in knowledge graph technology will drive the project: (I) creating an open, web-accessible knowledge graph, with attendant methods and tools, to enable contributions to the graph from a range of sources; (II) developing strategies for semantically lifting imagery data, such as remotely sensed imagery and drone imagery, into this graph, thereby integrating vast amounts of data; (III) developing novel spatially-explicit AI-based methods, models, and services to enable geoenrichment on top of this graph; and (IV) developing both programmatic (application programming interface, API) and human-accessible interfaces for the KnowWhereGraph. By merging the flexibility, expressive power, and community-driven features of open graph technologies with multi-format geospatial data and advanced geospatial intelligence, the KnowWhereGraph is designed to become a rich, integrative information resource that can transform and converge discovery, analysis, and synthesis within and across a multitude of fields and sectors.
Years ago, said lead PI Krzysztof Janowicz, it was enough for decision makers in some sectors to be concerned only with their local context, within a regional scope of enterprise. "Now, whether you are an individual farmer, a retail company, or a humanitarian relief organization, you act in a global context," he said. "You have to track commodity prices, tariffs, listen to the pulse of society, be sensitive to the culture of your markets, monitor weather forecasts, and even be able to react quickly to freak events such as pandemics or terrorists."
What KnowWhereGraph can do, according to Janowicz, is to bring together a wealth of highly diverse sources of relevant information to form an open, spatially-explicit knowledge graph - a model that integrates not just different kinds of data, but, importantly, also their relationships, in a way that can be accessed by those for whom the information matters most.
Interlinking and querying all these data sources requires a "universal" language. "To some degree, such language already exists," Janowicz said. "It's called the Resource Description Framework (RDF), and it enables us to describe the world around us in human and machine understandable terms." The resources it describes can include things like maps, images, tabulated data, and text, all of which are built into a global and decentralized graph.
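RDF expresses each statement as a triple of subject, predicate, and object, with resources identified by IRIs. The following Python sketch shows, under simplifying assumptions, how statements published independently by two sources join into one graph when they refer to the same IRI. Everything under the `http://example.org/` namespace is hypothetical, the serializer covers only a simplified subset of the N-Triples syntax, and a production system would use an RDF library and a triple store instead.

```python
EX = "http://example.org/"  # hypothetical namespace, not a KnowWhereGraph IRI


def iri(name):
    """Wrap a local name as an N-Triples IRI term."""
    return f"<{EX}{name}>"


def literal(text):
    """Wrap a string as a plain literal (simplified escaping: backslash and quote only)."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


# Statements published by one hypothetical source (administrative data).
source_a = {
    (iri("CountyA"), iri("locatedIn"), iri("StateS")),
    (iri("CountyA"), iri("name"), literal("County A")),
}

# Statements published independently by another hypothetical source (hazard data).
source_b = {
    (iri("FloodEvent1"), iri("affected"), iri("CountyA")),
    (iri("CountyA"), iri("locatedIn"), iri("StateS")),  # same statement as in source_a
}

# Merging is a set union: identical statements collapse, and statements
# that mention the same IRI become connected in the combined graph.
graph = source_a | source_b


def match(graph, s=None, p=None, o=None):
    """Return triples matching a pattern; None acts as a wildcard."""
    return sorted(
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


def to_ntriples(graph):
    """Serialize the graph one statement per line, in the style of N-Triples."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in sorted(graph))


print(to_ntriples(graph))
print(match(graph, o=iri("CountyA")))
```

Because both sources use the same IRI for CountyA, the flood statement from the second source attaches to the county described by the first without any coordination between them, which is the property that makes a decentralized graph possible.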
"RDF triples, which are statements in a subject, predicate, object form, enable us to publish knowledge about the world around us and irrespective of the fact who made these statements, what they are about, or when and where they were made," he said. "Everybody gets to contribute and connect statements to already existing ones."
Of course, the value of such a data graph relies heavily on the richness of the data, how current it is, and how the connections between disparate bits of information can become solutions to current problems or predictors of future scenarios. For that, there's artificial intelligence.
"We're going to develop AI methods to help decision-makers communicate with our KnowWhereGraph," Janowicz said. "Essentially, the graph will deliver contextual background information about an analyst's study area using a process called geo-enrichment. We are particularly interested in graph summarization techniques to find task-relevant triples from a pool of billions of other statements."
"Building a domain knowledge graph is a critical step towards developing artificial general intelligence for future machines to reason like human beings," said Wenwen Li, co-principal investigator of the project, who specializes in smart cyberinfrastructure and geospatial big data analytics. "The IT giants, such as Google and Facebook, have developed enterprise-level knowledge graphs to better understand the world's information to improve web search and product recommendation."
Building a scientific knowledge graph that models research data is very challenging: the data comes from different sources, is encoded in different formats, is large in size, and often lacks metadata. Additionally, much of the existing data is hidden in the deep web, making its discovery and reuse even more difficult.
Co-PI Mark Schildhauer is excited about KnowWhereGraph creating a framework to support deep interrogation into specific thematic areas, as well as enabling bridging across multiple disciplines. "The KnowWhereGraph has Use Cases we call 'Verticals', that involve detailed inquiry into highly-focused topics, such as clarifying the linkages among soil health, agricultural productivity, and farming methods. However, we are also developing 'Horizontal' Use Cases, which are emerging through secondary or tertiary connections to nodes in our 'Verticals'. For example, hurricanes and floods disrupt communities through immediate impacts, but can have lasting effects on agricultural productivity, as well as community health and resilience. Our 'Horizontals' will reveal these connections, and be further enabled through some of the general 'design patterns' we're developing for interoperating with other Knowledge Graphs."
"The great part of the project," Co-PI Dean Rehberger explains, "is that not only are we working on a project that has the potential for real social impact, but we also get to work with a great team of superb scholars and researchers from both the academic and the private sector as well as NGOs. This is also a fast-paced, new grant program for NSF that emphasizes 'accelerating' the use of research in the public sphere. Very exciting."