GIS / Geospatial

GIS data
Coming soon

Prepare

  1. Upload GIS / Geospatial files
Guide

Geographic Information System databases store spatial and mapping data with coordinates, projections, and geographic metadata.

Planned Support

  • ESRI Shapefiles (.shp, .shx, .dbf, .prj)
  • MapInfo TAB format (.tab, .dat, .id, .map)
  • Coordinate extraction and projection handling
  • Attribute data normalization
  • PostgreSQL PostGIS compatibility

What You Get Out

Once the parser ships, DataMeans will extract your data into multiple modern formats:

| Output | Description |
| --- | --- |
| csv/{TableName}.csv | One CSV file per table with all row data |
| xlsx/{TableName}.xlsx | Excel workbook per table |
| xls/{TableName}.xls | Legacy Excel format per table |
| json/{TableName}.json | JSON array of records per table |
| json/{TableName}.jsonl | Newline-delimited JSON (streaming-friendly) |
| postgres.sql | PostgreSQL CREATE TABLE + INSERT statements |
| schema/schema-graph.json | Relationship graph for visualization |
| schema/er-model.json | ER model for diagram tools |
| report.json | Structured extraction report |
| report.md | Human-readable extraction summary |
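
The difference between the two JSON outputs can be illustrated with a minimal sketch. The sample rows and function names below are hypothetical and are not DataMeans output; the sketch only shows why newline-delimited JSON can be read one record at a time while a JSON array has to be parsed as a whole.

```python
import io
import json

# Hypothetical rows standing in for one extracted table.
records = [
    {"id": 1, "name": "Depot", "lon": -117.18, "lat": 34.06},
    {"id": 2, "name": "Yard", "lon": -117.20, "lat": 34.05},
]


def to_json_array(rows):
    """Serialize all rows as one JSON array (parsed as a single document)."""
    return json.dumps(rows, indent=2)


def to_jsonl(rows):
    """Serialize one compact JSON object per line (newline-delimited JSON)."""
    return "".join(json.dumps(row, separators=(",", ":")) + "\n" for row in rows)


def read_jsonl(stream):
    """Yield records line by line without loading the whole file."""
    for line in stream:
        if line.strip():
            yield json.loads(line)


array_text = to_json_array(records)
jsonl_text = to_jsonl(records)
streamed = list(read_jsonl(io.StringIO(jsonl_text)))
```

Both encodings carry the same records; the line-oriented form is simply easier to process incrementally or to split across workers.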

File Requirements

For Shapefiles:

  • .shp (shape geometry)
  • .shx (shape index)
  • .dbf (attribute data)
  • .prj (projection info) - optional

For MapInfo:

  • .tab (table definition)
  • .dat (attribute data)
  • .id (index linking records to map objects)
  • .map (geographic objects)
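
Before uploading, it can help to confirm that every companion file is present. The following is a minimal sketch of such a check, assuming the files share a base name and differ only by extension; the function name and return shape are hypothetical, not part of DataMeans.

```python
from pathlib import Path

# Companion extensions taken from the lists above; .prj is optional.
REQUIRED = {
    ".shp": [".shp", ".shx", ".dbf"],
    ".tab": [".tab", ".dat", ".id", ".map"],
}
OPTIONAL = {".shp": [".prj"], ".tab": []}


def check_dataset(main_file, available_names):
    """Return (missing_required, missing_optional) extensions for a dataset.

    main_file: name of the .shp or .tab file.
    available_names: bare file names that will be uploaded together.
    """
    main = Path(main_file)
    kind = main.suffix.lower()
    if kind not in REQUIRED:
        raise ValueError(f"unsupported main file: {main_file}")
    # Compare case-insensitively because extensions are often upper-case.
    present = {name.lower() for name in available_names}
    stem = main.stem.lower()
    missing_required = [e for e in REQUIRED[kind] if stem + e not in present]
    missing_optional = [e for e in OPTIONAL[kind] if stem + e not in present]
    return missing_required, missing_optional


missing, optional = check_dataset("roads.shp", ["roads.shp", "ROADS.DBF"])
# missing is [".shx"]; optional is [".prj"]
```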

Current Status

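Parser development is in the planning phase. Shapefile support is well-defined because Esri publishes the format specification openly; the MapInfo TAB format is proprietary and less thoroughly documented, so support for it carries more uncertainty.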

Technical Notes

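Converting spatial data between coordinate reference systems requires specialized coordinate transformation libraries. The initial focus will therefore be on extracting attribute data together with coordinate columns, so the result can be imported into PostGIS.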
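
As an illustration of what "attribute data with coordinate columns" could look like on the PostGIS side, here is a minimal sketch that builds SQL text for point rows. The table name, the lon/lat column names, the TEXT column types, and SRID 4326 are all assumptions made for the example; ST_MakePoint and ST_SetSRID are standard PostGIS functions. This is not the SQL that DataMeans will emit.

```python
def postgis_statements(table, attribute_columns, srid=4326):
    """Build CREATE TABLE and parameterized INSERT SQL for point rows.

    Column types are simplified to TEXT; a real import would map dBase types.
    """
    for name in [table, *attribute_columns]:
        if not name.isidentifier():
            raise ValueError(f"unsafe identifier: {name!r}")
    srid = int(srid)
    column_defs = [f"{c} TEXT" for c in attribute_columns]
    column_defs.append(f"geom geometry(Point, {srid})")
    create = f"CREATE TABLE {table} ({', '.join(column_defs)});"

    # %s placeholders follow the DB-API "format" style used by psycopg.
    columns = [*attribute_columns, "geom"]
    values = ["%s"] * len(attribute_columns)
    values.append(f"ST_SetSRID(ST_MakePoint(%s, %s), {srid})")
    insert = (
        f"INSERT INTO {table} ({', '.join(columns)}) "
        f"VALUES ({', '.join(values)});"
    )
    return create, insert


def row_params(row, attribute_columns, x_col="lon", y_col="lat"):
    """Order one row's values to match the INSERT placeholders (x before y)."""
    values = [row[c] for c in attribute_columns]
    return values + [float(row[x_col]), float(row[y_col])]


create_sql, insert_sql = postgis_statements("stations", ["name"])
params = row_params({"name": "A", "lon": "-117.18", "lat": "34.06"}, ["name"])
```

ST_MakePoint takes x (longitude) before y (latitude), which is why row_params places the x column first.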


Last updated: January 2026

Technical reference

Overview

Geographic Information Systems (GIS) use specialized file formats to store spatial data: geometric shapes, coordinates, and attribute information for mapping and location-based analysis. These formats support vector geometries (points, lines, polygons), coordinate reference systems, and spatial relationships, enabling applications in urban planning, environmental monitoring, and geographic research. Unlike general-purpose tabular formats, GIS formats carry projection metadata and often a spatial index, so coordinates can be interpreted and queried correctly.

History and Background

  • 1960s: Early GIS development at universities and government labs.
  • 1969: Esri (Environmental Systems Research Institute) founded by Jack and Laura Dangermond in Redlands, California.
  • 1970s: Canada Geographic Information System (CGIS) operational.
  • 1980s: Commercial GIS software emerges (ArcInfo, MapInfo).
  • 1990s: ESRI Shapefile becomes de facto standard for vector data.
  • 1994: OpenGIS Consortium (renamed Open Geospatial Consortium in 2004) founded to develop open geospatial standards.
  • 2000s: Web-based mapping spreads with Google Maps and OpenStreetMap.
  • 2004: Google acquires Keyhole, Inc., the originator of the KML format.
  • 2008: GeoJSON specification for JSON-based geospatial data.
  • 2014: GeoPackage adopted as an OGC standard.
  • 2016: GeoJSON standardized by the IETF as RFC 7946.

File Format Specifications

GIS formats range from proprietary binary layouts to open standards and cover both vector and raster data.

File Extensions:

  • .shp - ESRI Shapefile geometry
  • .dbf - Attribute data (dBase format)
  • .shx - Positional index of feature geometry
  • .prj - Projection information
  • .geojson - JSON-based geospatial
  • .kml - Keyhole Markup Language
  • .kmz - Zipped KML archive
  • .gml - Geography Markup Language (XML)
  • .gpkg - GeoPackage SQLite database
  • .fgb - FlatGeobuf binary encoding
  • .tab - MapInfo table format

File Structure:

  • Geometry: Coordinate-based shapes and locations
  • Attributes: Tabular data linked to geometries
  • Metadata: Projection, bounds, and spatial reference
  • Index: Spatial indexing for query performance
  • Topology: Relationships between spatial features
  • Shapefile Header: .shp opens with a fixed 100-byte header (file code, version, shape type, bounding box)
  • Shapefile Records: 8-byte record headers; lengths counted in 16-bit words; coordinates stored as 8-byte IEEE doubles
  • Index Records: each .shx entry is 8 bytes (record offset and content length)
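
The header and index layouts listed above can be read with Python's struct module. The sketch below follows the published Esri shapefile description (file code 9994 and file length stored big-endian; version, shape type, and bounding box stored little-endian); the function names are hypothetical, and the header bytes are synthetic so the example runs without a real file.

```python
import struct

HEADER_SIZE = 100  # fixed-length .shp header


def parse_shp_header(data):
    """Decode the 100-byte .shp header into a small dictionary."""
    if len(data) < HEADER_SIZE:
        raise ValueError("shapefile header is 100 bytes long")
    (file_code,) = struct.unpack(">i", data[0:4])            # big-endian
    (length_words,) = struct.unpack(">i", data[24:28])       # 16-bit words
    version, shape_type = struct.unpack("<ii", data[28:36])  # little-endian
    bbox = struct.unpack("<8d", data[36:100])                # 8-byte doubles
    if file_code != 9994:
        raise ValueError(f"unexpected file code: {file_code}")
    return {
        "version": version,
        "shape_type": shape_type,
        "file_length_bytes": length_words * 2,
        "bbox_xy": bbox[:4],  # Xmin, Ymin, Xmax, Ymax
    }


def parse_shx_entry(entry):
    """Decode one 8-byte .shx record into (offset, content length) in bytes."""
    offset_words, length_words = struct.unpack(">ii", entry)
    return offset_words * 2, length_words * 2


# Synthetic header: shape type 1 (Point), empty file of 50 words (100 bytes).
header = (
    struct.pack(">i", 9994)
    + b"\x00" * 20
    + struct.pack(">i", 50)
    + struct.pack("<ii", 1000, 1)
    + struct.pack("<8d", -10.0, -5.0, 10.0, 5.0, 0.0, 0.0, 0.0, 0.0)
)
info = parse_shp_header(header)
first_record = parse_shx_entry(struct.pack(">ii", 50, 10))
```

Multiplying by two converts the 16-bit word counts into byte offsets, which is the unit needed when seeking within the file.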

Key Components:

  • Features: Individual spatial objects
  • Layers: Collections of related features
  • Coordinate Systems: Geographic or projected
  • Spatial Reference: Datum and projection parameters
  • Bounding Box: Extent of spatial data

Data Types and Structures

| Type | Description | Storage |
| --- | --- | --- |
| POINT | Single coordinate location | X,Y coordinates |
| LINESTRING | Connected line segments | Array of coordinates |
| POLYGON | Closed area boundary | Outer/inner rings |
| MULTIPOINT | Multiple point locations | Array of points |
| MULTILINESTRING | Multiple line features | Array of linestrings |
| MULTIPOLYGON | Multiple polygon areas | Array of polygons |
| GEOMETRYCOLLECTION | Mixed geometry types | Collection of geometries |
| CIRCULARSTRING | Curve with circular arcs between points | Arc-defining coordinates |
| POLYHEDRALSURFACE | Surface of connected polygon patches | Polygon patches |
| TIN | Triangulated irregular network surface | Connected triangles |

Spatial Model:

  • Vector data represents discrete features
  • Attributes provide descriptive information
  • Spatial relationships (contains, intersects, etc.)
  • Coordinate precision and accuracy
  • Topology rules for data integrity
  • WKT (text) and WKB (binary) geometry encodings defined by OGC Simple Features
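
To make the two encodings concrete, here is a minimal sketch that writes a 2D point as WKT and as WKB and reads the WKB back. It covers only the POINT type (WKB geometry type code 1) and uses hypothetical function names; other geometry types follow the same pattern with nested coordinate arrays.

```python
import struct

WKB_POINT = 1  # geometry type code for a 2D point


def point_to_wkt(x, y):
    """Encode a 2D point as WKT text."""
    return f"POINT ({x!r} {y!r})"


def point_to_wkb(x, y):
    """Encode a 2D point as little-endian WKB (21 bytes).

    Layout: 1 byte order flag (1 = little-endian), 4-byte geometry type,
    then X and Y as 8-byte IEEE doubles.
    """
    return struct.pack("<BIdd", 1, WKB_POINT, x, y)


def wkb_to_point(blob):
    """Decode a 2D point from WKB in either byte order."""
    order = "<" if blob[0] == 1 else ">"
    geom_type, x, y = struct.unpack(order + "Idd", blob[1:21])
    if geom_type != WKB_POINT:
        raise ValueError(f"not a 2D point: type {geom_type}")
    return x, y


wkt = point_to_wkt(30.0, 10.0)
wkb = point_to_wkb(30.0, 10.0)
decoded = wkb_to_point(wkb)
```

WKT is convenient for logs and SQL literals, while WKB avoids text parsing and preserves the exact double values.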

Version Differences

| Format | Year | Key Changes | Compatibility |
| --- | --- | --- | --- |
| Shapefile | early 1990s | Multi-file binary vector format (spec published 1998) | 2 GB limit per .shp/.dbf file |
| GeoJSON | 2008 | JSON text encoding of features | RFC 7946 (2016) requires WGS 84 coordinates |
| KML 2.2 | 2008 | Google Earth XML format adopted by OGC | Supports altitude and TimeSpan/TimeStamp |
| KML 2.3 | 2015 | Minor revision (OGC 12-007r2); XML namespace unchanged from 2.2 | KML 2.2 documents remain valid |
| GML 3.2 | 2007 | OGC XML encoding, published as ISO 19136:2007 | Validated against XML Schema |
| GeoPackage | 2014 | SQLite 3 database container (.gpkg) | Vector features and raster tiles in one file |
| FlatGeobuf | 2018 | FlatBuffers binary encoding | Optional packed Hilbert R-tree index; streamable |
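
As a small illustration of the GeoJSON entry, the sketch below builds an RFC 7946 Feature for a point. The function names and sample values are hypothetical; the range checks assume WGS 84 degrees and exist mainly to catch coordinates passed in latitude, longitude order.

```python
import json


def point_feature(lon, lat, properties):
    """Build a GeoJSON Feature for a WGS 84 point (longitude first)."""
    if not -180.0 <= lon <= 180.0:
        raise ValueError(f"longitude out of range: {lon}")
    if not -90.0 <= lat <= 90.0:
        raise ValueError(f"latitude out of range: {lat}")
    return {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
        "properties": properties,
    }


def feature_collection(features):
    """Wrap features in a FeatureCollection object."""
    return {"type": "FeatureCollection", "features": list(features)}


depot = point_feature(-117.18, 34.06, {"name": "Depot"})
geojson_text = json.dumps(feature_collection([depot]))
```

The range check cannot catch every swapped pair (both values may fall within -90..90), so it is a safeguard rather than a validation of the data.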

Compatibility Notes:

  • Shapefiles lack topology and are file-based
  • GeoJSON is text-based and web-friendly
  • KML supports 3D placement (altitude) and time animation
  • Shapefile .dbf attribute fields are limited to 10-character names
  • Shapefile .dbf tables allow at most 255 fields; text fields hold up to 254 characters
  • GeoJSON (RFC 7946) orders coordinates longitude, latitude and removed the 2008 draft's crs member
  • KML altitude is measured in meters above the WGS 84 EGM96 geoid; .kmz files are zipped KML archives
  • FlatGeobuf deliberately omits random-write support; its spatial index is optional so files can be written as a stream
  • GeoPackage stores all content in a single SQLite 3 database file
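
The .dbf limits above matter when attribute names come from, or are written back to, a shapefile. Here is a minimal sketch of one way to fit names into the 10-character limit while keeping them unique; the numeric-suffix scheme is an assumption chosen for illustration, not a rule from the dBase format or the behavior of any particular tool.

```python
MAX_NAME_LENGTH = 10  # .dbf field name limit
MAX_FIELDS = 255      # .dbf field count limit


def dbf_field_names(names):
    """Truncate names to 10 characters and resolve collisions with _1, _2, ...

    Uniqueness is checked case-insensitively.
    """
    if len(names) > MAX_FIELDS:
        raise ValueError(f"a .dbf table allows at most {MAX_FIELDS} fields")
    used = set()
    result = []
    for name in names:
        candidate = name[:MAX_NAME_LENGTH]
        counter = 1
        while candidate.upper() in used:
            # Shorten the base so the suffix still fits in 10 characters.
            suffix = f"_{counter}"
            candidate = name[: MAX_NAME_LENGTH - len(suffix)] + suffix
            counter += 1
        used.add(candidate.upper())
        result.append(candidate)
    return result


fields = dbf_field_names(["population_2020", "population_2021", "name"])
# fields is ["population", "populati_1", "name"]
```

Keeping a mapping from original to shortened names alongside the data makes the truncation reversible.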


To learn how to use this format with DataMeans, see the User Guide.