TripleGeo is a spatial ETL (Extract-Transform-Load) tool for transforming geospatial data into RDF triples with minimal overhead. TripleGeo aims to bridge the gap between typical geographic representations, drawn from a variety of proprietary files, DBMSs, and georeferenced systems, and the demands of processing Big Geospatial Linked Data. Users can control the ETL process through parameterized configuration files with their application-specific settings. Regarding geometries in particular, TripleGeo can cope with all common spatial data types, optionally reprojecting them to another Coordinate Reference System, and it yields RDF representations compliant with the OGC GeoSPARQL standard. Moreover, TripleGeo supports INSPIRE-aligned data and metadata, allowing their transformation to RDF according to the technical specifications of the INSPIRE Directive in the European Union.
This release of TripleGeo includes the following new features:
- Improved geospatial handling not only of primitive geometry types (like points, linestrings, or polygons), but also of more complex geometries (MultiPolygons, Geometry Collections) from 8 geospatial file formats (shapefiles, GeoJSON, CSV, GPX, etc.) and 8 geospatially-enabled DBMSs (e.g., Oracle, PostGIS).
- User-defined mappings from thematic attributes in input records to RDF properties in resulting RDF triples according to a given ontology.
- Auto-generation of custom identifiers for each transformed feature, according to best practices. These identifiers will be used in the assignment of URIs to triples with namespaces prescribed by a given ontology.
- Regarding performance, TripleGeo can efficiently transform millions of spatial entities in a few minutes even without any sophisticated data partitioning schemes. This confirms the robustness and versatility of the software and testifies to its potential for handling even larger POI datasets in forthcoming releases.
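To illustrate the user-defined mappings and identifier generation listed above, here is a minimal sketch in Python. It is not TripleGeo's Java implementation, nor its configuration format. The `example.org` namespace, the property URIs, the `MAPPING` dictionary, the choice of a name-based UUID as identifier, and all helper names are assumptions made for illustration; only the GeoSPARQL terms (`hasGeometry`, `asWKT`, `wktLiteral`) come from the OGC standard.

```python
import uuid

GEO = "http://www.opengis.net/ont/geosparql#"   # OGC GeoSPARQL namespace
BASE = "http://example.org/poi/"                 # hypothetical resource namespace

# Hypothetical mapping: input attribute name -> RDF property URI
MAPPING = {
    "name": "http://example.org/ontology#name",
    "phone": "http://example.org/ontology#phone",
}


def escape(value):
    """Escape a string for use inside an N-Triples literal."""
    return (value.replace("\\", "\\\\")
                 .replace('"', '\\"')
                 .replace("\n", "\\n")
                 .replace("\r", "\\r"))


def feature_uri(source, feature_id):
    """Derive a stable URI from the data source and the original feature id.

    A name-based (version 5) UUID is an assumption here; it yields the
    same identifier every time the same feature is transformed.
    """
    return BASE + str(uuid.uuid5(uuid.NAMESPACE_URL, f"{source}/{feature_id}"))


def to_ntriples(source, record):
    """Turn one point record into a list of N-Triples lines."""
    subject = feature_uri(source, record["id"])
    geometry = subject + "/geom"
    triples = []
    # Thematic attributes: emit one triple per mapped, non-empty attribute.
    for attribute, prop in MAPPING.items():
        value = record.get(attribute)
        if value is not None:
            triples.append(f'<{subject}> <{prop}> "{escape(str(value))}" .')
    # Geometry: a WKT literal without a CRS prefix is read as CRS84
    # (longitude first, then latitude) under GeoSPARQL.
    wkt = f'POINT({record["lon"]} {record["lat"]})'
    triples.append(f"<{subject}> <{GEO}hasGeometry> <{geometry}> .")
    triples.append(f'<{geometry}> <{GEO}asWKT> "{wkt}"^^<{GEO}wktLiteral> .')
    return triples
```

Calling `to_ntriples("osm", {"id": 42, "name": "Cafe", "lon": 23.72, "lat": 37.98})` returns three lines: one thematic triple and two geometry triples. Keeping the mapping in a plain dictionary mirrors the idea that the target ontology is supplied by the user rather than hard-coded.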
Although TripleGeo remains a general-purpose ETL software for transforming geodata to the RDF model, in the context of the SLIPO project we have included special support for transformation of Points of Interest (POI) into RDF. So, the current version also includes:
- Support for POI-specific mappings that convert extra thematic attributes into RDF according to a given ontology. Note that transformation is not bound to our proposed OWL ontology for POI data; it can also work with any user-specified data model suitable for POIs, e.g., using entities like Places as specified by Schema.org.
- Support for user-specified classification schemes, which are used to generate RDF triples that fully describe the (possibly hierarchical) categories, subcategories, etc. of POIs (e.g., food, restaurant, pizza).
- Reverse transformation from RDF to de facto geographical file formats (like ESRI shapefiles or CSV) enables POI vendors with existing products, systems, and services to exploit the added-value results of semantic integration (interlinked, fused, and enriched RDF data) like any other geodata in conventional formats.
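The hierarchical classification mentioned above (food, restaurant, pizza) could be expressed in RDF along the following lines. This Python sketch assumes SKOS (`skos:Concept`, `skos:prefLabel`, `skos:broader`) as the vocabulary and a hypothetical `example.org` category namespace; the vocabulary and URI layout that TripleGeo actually emits depend on the ontology in use and may differ.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
SCHEME = "http://example.org/category/"   # hypothetical category namespace


def category_triples(path):
    """Emit N-Triples lines for one category path, e.g. food > restaurant > pizza.

    Assumes labels are plain, URL-safe strings (no quotes or spaces),
    so no escaping is performed in this sketch.
    """
    triples = []
    parent = None
    for depth, label in enumerate(path):
        # The URI of each level embeds all of its ancestors.
        uri = SCHEME + "/".join(path[: depth + 1])
        triples.append(f"<{uri}> <{RDF_TYPE}> <{SKOS}Concept> .")
        triples.append(f'<{uri}> <{SKOS}prefLabel> "{label}" .')
        if parent is not None:
            # Link each subcategory to the category directly above it.
            triples.append(f"<{uri}> <{SKOS}broader> <{parent}> .")
        parent = uri
    return triples
```

For the path `["food", "restaurant", "pizza"]` this yields two descriptive triples per level plus one `broader` link for each of the two lower levels.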
Software: TripleGeo is open source software and its Java source code is publicly available under the GNU General Public License.
Documentation: Full documentation for all Java classes of the source code is available in this JavaDoc.
Examples: You may take a look at several example configurations for transformation from various geospatial formats and try using the software yourself!
SPARQL Endpoint: We have created a SPARQL endpoint with 1,000,000 POI features extracted from OpenStreetMap using TripleGeo. This endpoint enables queries with spatial and/or thematic criteria against the RDF graph that contains all transformed triples.
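As an illustration of a query with a spatial criterion, the Python sketch below builds a GeoSPARQL query string that selects features inside a bounding box. It assumes that the endpoint supports the GeoSPARQL filter function `geof:sfWithin` and that geometries are exposed through `geo:hasGeometry` and `geo:asWKT`; the function names `bbox_wkt` and `bbox_query` are hypothetical, and no endpoint address is assumed.

```python
def bbox_wkt(min_lon, min_lat, max_lon, max_lat):
    """Return a closed rectangular WKT polygon (longitude latitude order)."""
    ring = [
        (min_lon, min_lat),
        (max_lon, min_lat),
        (max_lon, max_lat),
        (min_lon, max_lat),
        (min_lon, min_lat),  # repeat the first vertex to close the ring
    ]
    return "POLYGON((" + ", ".join(f"{x} {y}" for x, y in ring) + "))"


def bbox_query(min_lon, min_lat, max_lon, max_lat, limit=100):
    """Build a GeoSPARQL query for features whose geometry lies within the box."""
    wkt = bbox_wkt(min_lon, min_lat, max_lon, max_lat)
    return "\n".join([
        "PREFIX geo: <http://www.opengis.net/ont/geosparql#>",
        "PREFIX geof: <http://www.opengis.net/def/function/geosparql/>",
        "SELECT ?poi ?wkt WHERE {",
        "  ?poi geo:hasGeometry ?g .",
        "  ?g geo:asWKT ?wkt .",
        f'  FILTER(geof:sfWithin(?wkt, "{wkt}"^^geo:wktLiteral))',
        "}",
        f"LIMIT {int(limit)}",
    ])
```

The resulting string can be submitted to a SPARQL endpoint with any HTTP or SPARQL client; a thematic criterion would be added as a further triple pattern inside the `WHERE` clause.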
We plan further improvements and extensions to make TripleGeo more user-friendly and capable of handling even larger POI datasets in cluster infrastructures. Stay tuned for the next releases of the software!