Microsoft Excel

XLS / XLSX workbooks

Microsoft Excel files (.xlsx, .xls) are widely used for storing tabular data with rich formatting.

What You Can Upload

  • .xlsx files (Office Open XML, Excel 2007+)
  • .xls files (BIFF8, Excel 97-2003)
  • ZIP archives with multiple Excel files
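
The three upload types can be told apart by their leading bytes. The sketch below is illustrative only and does not describe how DataMeans itself detects formats; `sniff_upload` is a hypothetical helper, and the sample data is synthetic. It assumes an .xls file starts with the Compound File Binary signature and an .xlsx file is a ZIP archive containing `xl/workbook.xml`.

```python
import io
import zipfile

CFB_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # Compound File Binary header signature
ZIP_MAGIC = b"PK\x03\x04"                      # ZIP local file header signature


def sniff_upload(data: bytes) -> str:
    """Classify raw bytes as 'xls', 'xlsx', 'zip', or 'unknown' (hypothetical helper)."""
    if data.startswith(CFB_MAGIC):
        # Other legacy Office files share this container; a real check would
        # also look for the "Workbook" or "Book" stream inside it.
        return "xls"
    if data.startswith(ZIP_MAGIC):
        try:
            with zipfile.ZipFile(io.BytesIO(data)) as zf:
                names = zf.namelist()
        except zipfile.BadZipFile:
            return "unknown"
        # An OOXML workbook is a ZIP whose parts include xl/workbook.xml.
        return "xlsx" if "xl/workbook.xml" in names else "zip"
    return "unknown"


# Usage with synthetic in-memory data (no real files involved).
_buf = io.BytesIO()
with zipfile.ZipFile(_buf, "w") as _zf:
    _zf.writestr("xl/workbook.xml", "<workbook/>")
fake_xlsx = _buf.getvalue()
print(sniff_upload(fake_xlsx))
```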

What You Get Out

DataMeans extracts your data into several output formats:

| Output | Description |
|---|---|
| csv/{TableName}.csv | One CSV file per table with all row data |
| xlsx/{TableName}.xlsx | Excel workbook per table |
| xls/{TableName}.xls | Legacy Excel format per table |
| json/{TableName}.json | JSON array of records per table |
| json/{TableName}.jsonl | Newline-delimited JSON (streaming-friendly) |
| postgres.sql | PostgreSQL CREATE TABLE + INSERT statements |
| schema/schema-graph.json | Relationship graph for visualization |
| schema/er-model.json | ER model for diagram tools |
| report.json | Structured extraction report |
| report.md | Human-readable extraction summary |
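
The difference between the two JSON outputs is easiest to see in code. This is a minimal sketch that assumes each line of a .jsonl file holds one record object, equivalent to one element of the array in the matching .json file; the field names `id` and `name` and the sample rows are invented for illustration.

```python
import json
from typing import Iterable, Iterator


def iter_jsonl(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one record per non-empty line, without loading the whole file."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


# Invented sample standing in for the contents of a json/{TableName}.jsonl file.
sample_jsonl = '{"id": 1, "name": "Ada"}\n{"id": 2, "name": "Grace"}\n'

records = list(iter_jsonl(sample_jsonl.splitlines()))
print(records)

# The same records serialized as a single JSON array, as in a .json output.
print(json.dumps(records))
```

With a real file, passing the open file object to `iter_jsonl` processes one row at a time, which is why the line-delimited form suits streaming.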

How to Export / Obtain Files

No special export is needed; upload Excel files directly:

  1. Locate your Excel file
  2. Upload as-is or in a ZIP archive
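
When several workbooks need to go up together, they can be bundled with any ZIP tool. As one possible approach, the sketch below uses Python's standard `zipfile` module; `bundle_workbooks` and the file names are hypothetical, and the payloads are placeholders rather than real workbooks.

```python
import io
import zipfile
from typing import Dict


def bundle_workbooks(files: Dict[str, bytes]) -> bytes:
    """Pack {file name: file bytes} into one in-memory ZIP archive."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for name, payload in files.items():
            zf.writestr(name, payload)
    return buf.getvalue()


# Placeholder bytes stand in for real workbook contents.
archive = bundle_workbooks({"sales.xlsx": b"...", "legacy/orders.xls": b"..."})

with zipfile.ZipFile(io.BytesIO(archive)) as zf:
    print(zf.namelist())
```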

Supported Features

  • Multiple worksheets (all sheets processed)
  • Cell types: text, numbers, dates, booleans
  • Formula values (calculated results, not formulas)
  • Large file handling with streaming

Known Limitations

  • Formulas exported as their calculated values
  • Charts and pivot tables not extracted
  • Macros and VBA code ignored
  • Merged cells may cause alignment issues

Best Practices

  • Use first row for column headers
  • Avoid merged cells in data tables
  • Keep one logical table per sheet
  • Use consistent data types within columns
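
A quick pre-upload check for the last point might look like the following. It is a sketch under stated assumptions: the sheet has already been read into a list of rows with the header in the first row, blank cells are represented as `None`, and `mixed_type_columns` is a hypothetical helper, not part of DataMeans.

```python
from collections import defaultdict
from typing import Dict, List, Set


def mixed_type_columns(rows: List[list]) -> Dict[str, Set[str]]:
    """Return {header: type names} for columns holding more than one value type."""
    header, *body = rows
    seen = defaultdict(set)
    for row in body:
        for name, value in zip(header, row):
            if value is not None:  # blank cells do not count as a type
                seen[name].add(type(value).__name__)
    return {name: kinds for name, kinds in seen.items() if len(kinds) > 1}


# Invented sample: "amount" mixes numbers with a text placeholder.
rows = [
    ["id", "amount"],
    [1, 9.5],
    [2, "n/a"],
    [3, None],
]
print(mixed_type_columns(rows))
```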

Last updated: January 2026

Technical reference

Overview

Microsoft Excel stores spreadsheet documents in two families of file formats: the legacy proprietary binary format (XLS) and the standardized, XML-based Office Open XML format (XLSX). Both store tabular data, formulas, charts, and formatting in workbooks that contain one or more worksheets. Excel files support multiple data types, calculations, and rich formatting, making them suitable for data analysis and reporting.

History and Background

  • 1982: Multiplan spreadsheet released for CP/M, later ported to MS-DOS.
  • 1985: Microsoft Excel 1.0 released for the Macintosh, replacing Multiplan.
  • 1987: Excel 2.0 for Windows, first Windows version.
  • 1990: Excel 3.0 added toolbars, drawing tools, and 3D charts.
  • 1992: Excel 4.0 introduced the AutoFill feature.
  • 1993: Excel 5.0 introduced VBA, multi-sheet workbooks, and PivotTables.
  • 1995: Excel 7.0 (Office 95), rewritten internally as a 32-bit application.
  • 1997: Excel 8.0 (Office 97) introduced BIFF8 with Unicode strings and 65,536 rows.
  • 1999: Excel 9.0 (Office 2000) added HTML document creation and publishing.
  • 2001: Excel 10.0 (Excel 2002, part of Office XP) introduced smart tags.
  • 2003: Excel 11.0 (Office 2003) added XML import/export.
  • 2006: Office Open XML formats standardized by Ecma International as ECMA-376 in December.
  • 2007: Excel 12.0 (Excel 2007) introduced the XLSX format and raised the grid limits to 1,048,576 rows and 16,384 columns.
  • 2008: OOXML approved as an ISO/IEC standard in April; ISO/IEC 29500:2008 published in November.
  • 2010: Excel 14.0 added sparklines and slicers.
  • 2013: Excel 15.0 introduced Flash Fill and Power Pivot data modeling.
  • 2016: Excel 16.0 integrated Power Query (Get & Transform) and added forecasting functions.
  • 2019: Excel 2019 added TEXTJOIN, IFS, and SWITCH functions plus map and funnel charts.
  • 2021: Excel 2021 added XLOOKUP, XMATCH, LET, and dynamic array functions.
  • 2024: Excel 2024 added charts that reference dynamic arrays, the IMAGE function, and 14 new text and array functions.

File Format Specifications

Binary Format (XLS/BIFF):

  • Proprietary binary structure with BIFF (Binary Interchange File Format) records
  • Stored in a Compound File Binary (OLE2) container; little-endian byte order
  • Workbook stream named "Workbook" in BIFF8, "Book" in BIFF5/BIFF7; BIFF2-BIFF4 stored a single sheet as a raw record stream with no OLE2 container
  • Each record begins with a 2-byte type identifier and a 2-byte size field; record data is capped at 8,224 bytes, with longer content split across Continue records
  • File extension: .xls
  • Maximum rows: 65,536; columns: 256 (Excel 97-2003)
  • File size limit: 2 GB (Compound File Binary 512-byte-sector limit)
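
The record layout described above (a 2-byte type, a 2-byte size, then the payload, all little-endian) can be walked with a few lines of code. This is a minimal sketch that assumes the Workbook stream has already been extracted from the OLE2 container; it does not parse the container and does not reassemble Continue records. `iter_biff_records` is a hypothetical name and the sample stream is synthetic.

```python
import struct
from typing import Iterator, Tuple

BOF = 0x0809  # beginning-of-file record type in BIFF5-BIFF8
EOF = 0x000A  # end-of-file record type


def iter_biff_records(stream: bytes) -> Iterator[Tuple[int, bytes]]:
    """Yield (record_type, payload) pairs from a raw BIFF record stream."""
    offset = 0
    while offset + 4 <= len(stream):
        # Header: 2-byte type identifier, then 2-byte payload size, little-endian.
        rec_type, size = struct.unpack_from("<HH", stream, offset)
        offset += 4
        payload = stream[offset:offset + size]
        if len(payload) != size:
            raise ValueError(f"record 0x{rec_type:04X} is truncated")
        yield rec_type, payload
        offset += size


# Synthetic two-record stream: a BOF with a 16-byte payload, then an empty EOF.
bof_payload = struct.pack("<HH", 0x0600, 0x0005) + bytes(12)
sample = (
    struct.pack("<HH", BOF, len(bof_payload)) + bof_payload
    + struct.pack("<HH", EOF, 0)
)

for rec_type, payload in iter_biff_records(sample):
    print(f"0x{rec_type:04X} {len(payload)} bytes")
```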

XML Format (XLSX/OOXML):

  • ZIP archive containing XML files and resources
  • Based on Office Open XML standard (ECMA-376, ISO/IEC 29500)
  • Package layout follows the Open Packaging Conventions (OPC); every package must contain a [Content_Types].xml part mapping part names to content types
  • The .xlsb variant keeps the same OPC/ZIP package but stores worksheet parts as binary records defined in the MS-XLSB specification
  • File extension: .xlsx (workbook), .xlsm (macro-enabled), .xlsb (binary)
  • Maximum rows: 1,048,576; columns: 16,384
  • File size limit: none fixed by the format; bounded in practice by available memory and system resources
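
To illustrate the package layout, the sketch below builds a tiny synthetic package in memory and reads its [Content_Types].xml part with the standard library. The package is a stand-in with only two parts, not a workbook Excel would open, and `content_types` is a hypothetical helper.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

CT_NS = "{http://schemas.openxmlformats.org/package/2006/content-types}"

CONTENT_TYPES_XML = """<?xml version="1.0" encoding="UTF-8"?>
<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">
  <Default Extension="xml" ContentType="application/xml"/>
  <Override PartName="/xl/workbook.xml"
    ContentType="application/vnd.openxmlformats-officedocument.spreadsheetml.sheet.main+xml"/>
</Types>"""


def content_types(package: bytes) -> dict:
    """Map default extensions and overridden part names to content types."""
    with zipfile.ZipFile(io.BytesIO(package)) as zf:
        root = ET.fromstring(zf.read("[Content_Types].xml"))
    mapping = {}
    for el in root.iter(CT_NS + "Default"):
        mapping["*." + el.get("Extension")] = el.get("ContentType")
    for el in root.iter(CT_NS + "Override"):
        mapping[el.get("PartName")] = el.get("ContentType")
    return mapping


# Build the synthetic package in memory.
_buf = io.BytesIO()
with zipfile.ZipFile(_buf, "w") as _zf:
    _zf.writestr("[Content_Types].xml", CONTENT_TYPES_XML)
    _zf.writestr("xl/workbook.xml", "<workbook/>")
package = _buf.getvalue()

for part, ctype in content_types(package).items():
    print(part, "->", ctype)
```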

Key Components:

  • Workbook: Container for all sheets and metadata
  • Worksheets: Individual spreadsheet tabs with data
  • Shared Strings: Centralized text storage for efficiency
  • Styles: Formatting definitions for cells
  • Relationships: Links between components
  • Drawings: Charts, shapes, images
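
The link between worksheets and the shared strings table can be sketched as follows. The XML fragments are hand-written, simplified stand-ins for `xl/sharedStrings.xml` and a worksheet part, and the helper names are hypothetical. A cell with `t="s"` stores an index into the shared strings table; other cells carry their value directly.

```python
import xml.etree.ElementTree as ET

MAIN = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
NS = {"m": MAIN}
T_TAG = "{%s}t" % MAIN

SHARED_STRINGS = f"""<sst xmlns="{MAIN}">
  <si><t>Name</t></si>
  <si><t>Ada</t></si>
</sst>"""

SHEET = f"""<worksheet xmlns="{MAIN}"><sheetData>
  <row r="1"><c r="A1" t="s"><v>0</v></c>
             <c r="B1" t="inlineStr"><is><t>Score</t></is></c></row>
  <row r="2"><c r="A2" t="s"><v>1</v></c><c r="B2"><v>42</v></c></row>
</sheetData></worksheet>"""


def load_shared_strings(xml_text: str) -> list:
    """Return the string table; rich-text runs are joined into one string."""
    root = ET.fromstring(xml_text)
    # Simplification: every <t> under an <si> is joined, which would also
    # pick up phonetic-run text if a file contained any.
    return ["".join(t.text or "" for t in si.iter(T_TAG))
            for si in root.findall("m:si", NS)]


def read_cells(sheet_xml: str, strings: list) -> dict:
    """Return {cell reference: value}, resolving shared and inline strings."""
    cells = {}
    for c in ET.fromstring(sheet_xml).iterfind(".//m:c", NS):
        kind = c.get("t", "n")  # "n" (number) is the default cell type
        v = c.find("m:v", NS)
        if kind == "s":
            cells[c.get("r")] = strings[int(v.text)]
        elif kind == "inlineStr":
            is_el = c.find("m:is", NS)
            cells[c.get("r")] = "".join(t.text or "" for t in is_el.iter(T_TAG))
        elif v is not None:
            cells[c.get("r")] = float(v.text) if kind == "n" else v.text
    return cells


strings = load_shared_strings(SHARED_STRINGS)
print(read_cells(SHEET, strings))
```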

Data Types and Structures

| Type | Storage | Description |
|---|---|---|
| Number | IEEE 754 double | 64-bit floating point, up to 15 significant digits |
| RK number | 4 bytes (BIFF) | Compressed 30-bit value: signed integer or truncated IEEE 754 double, optionally divided by 100 |
| Text | Shared strings table | Unicode text strings, length up to 32,767 characters |
| Inline string | In-cell XML (inlineStr) | Text stored directly in the cell element instead of the shared strings table |
| Boolean | 1 byte | TRUE/FALSE values |
| Error | 1 byte | #N/A, #VALUE!, #REF!, etc. |
| Date/Time | Serial number | Days counted from January 1, 1900 (serial 1), with the time of day as the fractional part; 1900 wrongly treated as a leap year (Lotus 1-2-3 compatibility) |
| Formula | Parsed token array (BIFF) or text (XLSX) | Calculated expressions with cell references |
| Formula string | Cached text (str) | String result of a formula, cached in the cell element alongside the formula |
| Array | Formula result | Multi-cell array formulas |
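
The RK encoding is compact enough to decode in a few lines. The sketch below assumes the usual bit layout: bit 0 marks a value that must be divided by 100, bit 1 selects a signed 30-bit integer rather than the top 30 bits of an IEEE 754 double, and the remaining 30 bits carry the value. `decode_rk` is a hypothetical name.

```python
import struct


def decode_rk(rk: int) -> float:
    """Decode a 32-bit RK value (given as an unsigned int) to a float."""
    div100 = rk & 0x01
    is_int = rk & 0x02
    if is_int:
        # Reinterpret as signed 32-bit, then shift out the two flag bits.
        signed = rk - (1 << 32) if rk & 0x80000000 else rk
        value = float(signed >> 2)
    else:
        # The 30 payload bits are the top bits of a double; the rest are zero.
        raw = (rk & 0xFFFFFFFC) << 32
        value = struct.unpack("<d", struct.pack("<Q", raw))[0]
    return value / 100 if div100 else value


print(decode_rk((12345 << 2) | 0x03))  # integer 12345 with the divide-by-100 flag
print(decode_rk(0x3FF00000))           # top 32 bits of the double 1.0
```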

Workbook Structure:

  • Worksheets organized as tabs
  • Cells addressed by column letter + row number (A1, B2, etc.)
  • Ranges defined by start:end notation (A1:B10)
  • Named ranges and tables for organization
  • Charts linked to worksheet data
  • Pivot tables for data summarization
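
The A1 addressing scheme treats column letters as a base-26 number without a zero digit. A minimal sketch of the conversion follows; the function names are hypothetical, indices are 1-based, and sheet-qualified references such as `Sheet1!A1` are not handled.

```python
import re
from typing import Tuple

_CELL = re.compile(r"^\$?([A-Z]{1,3})\$?([1-9][0-9]*)$")


def col_to_index(letters: str) -> int:
    """Convert column letters to a 1-based index: A -> 1, Z -> 26, AA -> 27."""
    index = 0
    for ch in letters:
        index = index * 26 + (ord(ch) - ord("A") + 1)
    return index


def parse_cell(ref: str) -> Tuple[int, int]:
    """Return (row, column), both 1-based, for a reference such as 'B2'."""
    match = _CELL.match(ref.upper())
    if match is None:
        raise ValueError(f"not an A1-style cell reference: {ref!r}")
    letters, row = match.groups()
    return int(row), col_to_index(letters)


def parse_range(ref: str) -> Tuple[Tuple[int, int], Tuple[int, int]]:
    """Return ((row, col), (row, col)) for a start:end range such as 'A1:B10'."""
    start, _, end = ref.partition(":")
    return parse_cell(start), parse_cell(end or start)


print(parse_cell("B2"))
print(parse_range("A1:B10"))
print(col_to_index("XFD"))  # last column of the XLSX grid
```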

Version Differences

| Version | Year | Key Features | File Format |
|---|---|---|---|
| Excel 5.0/95 | 1993-1995 | VBA macros, multi-sheet workbooks | .xls (BIFF5/BIFF7) |
| Excel 97-2003 | 1997-2003 | Unicode strings, 65,536-row grid | .xls (BIFF8) |
| Excel 2007 | 2007 | Ribbon, 1,048,576-row grid | .xlsx (OOXML) |
| Excel 2010 | 2010 | Sparklines, slicers | No format change |
| Excel 2013 | 2013 | Flash Fill, Data Model | Adds Strict Open XML save option |
| Excel 2016 | 2016 | Get & Transform, forecasting functions | No format change |
| Excel 2019 | 2019 | TEXTJOIN, IFS, map and funnel charts | No format change |
| Excel 2021 | 2021 | XLOOKUP, LET, dynamic arrays | No format change |
| Excel 2024 | 2024 | Dynamic array charts, IMAGE function | No format change |

Compatibility Notes:

  • .xlsx is the default file format in Excel 2007 and later
  • .xls files can be opened in newer versions
  • Opening an .xls file places the workbook in Compatibility Mode (shown in the title bar), keeping the 65,536-row, 256-column grid until the file is converted
  • Some features lost when saving to older formats
  • The Compatibility Checker runs automatically when saving a workbook to the Excel 97-2003 format and reports unsupported features
  • Excel 2010 can open but not save Strict Open XML workbooks; Excel 2013 and later can read and write them
  • VBA macros require .xlsm, .xlsb, or legacy .xls formats
  • Workbooks may use the 1900 or 1904 date system; older Excel for Mac versions defaulted to 1904
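
Because a serial number means a different day in each date system, a reader has to know which one the workbook uses before converting. The sketch below is one way to do the conversion; `serial_to_datetime` is a hypothetical helper, and it assumes non-negative serials. In the 1900 system it accounts for the nonexistent February 29, 1900 (serial 60) inherited from Lotus 1-2-3.

```python
from datetime import datetime, timedelta


def serial_to_datetime(serial: float, date1904: bool = False) -> datetime:
    """Convert an Excel serial date to a datetime in either date system."""
    if date1904:
        # 1904 system: serial 0 is January 1, 1904.
        return datetime(1904, 1, 1) + timedelta(days=serial)
    if int(serial) == 60:
        raise ValueError("serial 60 is February 29, 1900, which never existed")
    if serial < 60:
        # Before the phantom leap day: serial 1 is January 1, 1900.
        return datetime(1899, 12, 31) + timedelta(days=serial)
    # After the phantom leap day every serial is one day too high.
    return datetime(1899, 12, 30) + timedelta(days=serial)


print(serial_to_datetime(45292.5))                # 1900 system
print(serial_to_datetime(43830, date1904=True))   # same calendar day, 1904 system
```

The two systems differ by 1,462 days, so the same stored number read with the wrong setting lands about four years off.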

To learn how to use this format with DataMeans, see the User Guide.