Prepare
- Upload XLS or XLSX files
Microsoft Excel files (.xlsx, .xls) are widely used for storing tabular data with rich formatting.
What You Can Upload
- .xlsx files (Office Open XML, Excel 2007+)
- .xls files (BIFF8, Excel 97-2003)
- ZIP archives with multiple Excel files
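Before uploading, the container type of a file can be checked from its first bytes. The sketch below is illustrative only and says nothing about how DataMeans itself detects formats; `sniff_container` is a hypothetical helper name. It relies on two well-known signatures: the Compound File Binary (OLE2) header used by .xls and the ZIP local-file header used by .xlsx and by ordinary ZIP archives.

```python
# Well-known file signatures (not DataMeans-specific).
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # Compound File Binary header
ZIP_MAGIC = b"PK\x03\x04"                        # ZIP local file header


def sniff_container(header: bytes) -> str:
    """Classify a file by its leading bytes (hypothetical helper)."""
    if header.startswith(OLE2_MAGIC):
        return "ole2"     # likely a legacy .xls workbook
    if header.startswith(ZIP_MAGIC):
        return "zip"      # .xlsx/.xlsm/.xlsb, or a ZIP archive of Excel files
    return "unknown"


# Usage: pass the first few bytes of a file, e.g. open(path, "rb").read(8).
print(sniff_container(OLE2_MAGIC + b"\x00" * 8))
print(sniff_container(b"PK\x03\x04" + b"\x14\x00"))
```

A "zip" result does not distinguish an .xlsx workbook from a ZIP of several workbooks; that requires looking at the archive's entries.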
What You Get Out
DataMeans extracts your data into multiple modern formats:
| Output | Description |
|---|---|
| csv/{TableName}.csv | One CSV file per table with all row data |
| xlsx/{TableName}.xlsx | Excel workbook per table |
| xls/{TableName}.xls | Legacy Excel format per table |
| json/{TableName}.json | JSON array of records per table |
| json/{TableName}.jsonl | Newline-delimited JSON (streaming-friendly) |
| postgres.sql | PostgreSQL CREATE TABLE + INSERT statements |
| schema/schema-graph.json | Relationship graph for visualization |
| schema/er-model.json | ER model for diagram tools |
| report.json | Structured extraction report |
| report.md | Human-readable extraction summary |
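The newline-delimited JSON output can be consumed one record at a time, which is what makes it streaming-friendly. A minimal sketch follows, assuming each line holds one JSON object; the field names `id` and `name` are made up for illustration and are not part of any documented DataMeans schema.

```python
import io
import json


def iter_jsonl(stream):
    """Yield one parsed record per non-empty line of a text stream."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


# Stand-in for open("json/{TableName}.jsonl", encoding="utf-8");
# the records below are invented sample data.
sample = io.StringIO('{"id": 1, "name": "Ada"}\n{"id": 2, "name": "Lin"}\n')
records = list(iter_jsonl(sample))
print(records)
```

Because the generator reads line by line, memory use stays flat even for large tables.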
How to Export / Obtain Files
No special export needed - upload Excel files directly:
- Locate your Excel file
- Upload as-is or in a ZIP archive
Supported Features
- Multiple worksheets (all sheets processed)
- Cell types: text, numbers, dates, booleans
- Formula values (calculated results, not formulas)
- Large file handling with streaming
Known Limitations
- Formulas exported as their calculated values
- Charts and pivot tables not extracted
- Macros and VBA code ignored
- Merged cells may cause alignment issues
Best Practices
- Use first row for column headers
- Avoid merged cells in data tables
- Keep one logical table per sheet
- Use consistent data types within columns
Last updated: January 2026
Overview
Microsoft Excel uses proprietary binary and XML-based file formats for spreadsheet documents. The formats store tabular data, formulas, charts, and formatting in structured workbooks containing multiple worksheets. Excel files support complex data types, calculations, and rich formatting, making them suitable for data analysis and reporting.
History and Background
- 1982: Multiplan spreadsheet released for CP/M, later ported to MS-DOS.
- 1985: Microsoft Excel 1.0 released for the Macintosh, replacing Multiplan.
- 1987: Excel 2.0 for Windows, first Windows version.
- 1990: Excel 3.0 added toolbars, drawing tools, and 3D charts.
- 1992: Excel 4.0 introduced the AutoFill feature.
- 1993: Excel 5.0 introduced VBA, multi-sheet workbooks, and PivotTables.
- 1995: Excel 7.0 (Office 95), rewritten internally as a 32-bit application.
- 1997: Excel 8.0 (Office 97) introduced BIFF8 with Unicode strings and 65,536 rows.
- 1999: Excel 9.0 (Office 2000) added HTML document creation and publishing.
- 2002: Excel 10.0 (Office XP) introduced smart tags.
- 2003: Excel 11.0 (Office 2003) added XML import/export.
- 2006: Office Open XML formats standardized by Ecma International as ECMA-376 in December.
- 2007: Excel 12.0 introduced XLSX format, increased limits.
- 2008: OOXML approved as an ISO/IEC standard in April; ISO/IEC 29500:2008 published in November.
- 2010: Excel 14.0 added sparklines and slicers.
- 2013: Excel 15.0 introduced Flash Fill and Power Pivot data modeling.
- 2016: Excel 16.0 integrated Power Query (Get & Transform) and added forecasting functions.
- 2019: Excel 2019 added TEXTJOIN, IFS, and SWITCH functions plus map and funnel charts.
- 2021: Excel 2021 added XLOOKUP, XMATCH, LET, and dynamic array functions.
- 2024: Excel 2024 added charts that reference dynamic arrays, the IMAGE function, and 14 new text and array functions.
File Format Specifications
Binary Format (XLS/BIFF):
- Proprietary binary structure with BIFF (Binary Interchange File Format) records
- Stored in a Compound File Binary (OLE2) container; little-endian byte order
- Workbook stream named "Workbook" in BIFF8, "Book" in BIFF5/BIFF7; BIFF2-BIFF4 stored a single sheet as a raw record stream with no OLE2 container
- Each record begins with a 2-byte type identifier and a 2-byte size field; record data is capped at 8,224 bytes, with longer content split across Continue records
- File extension: .xls
- Maximum rows: 65,536; columns: 256 (Excel 97-2003)
- File size limit: 2 GB (Compound File Binary 512-byte-sector limit)
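The record framing described above (a 2-byte type identifier, a 2-byte size, then the payload, all little-endian) can be walked with a few lines of Python. This is a sketch under the assumption that the Workbook stream has already been extracted from the OLE2 container; it does not read a raw .xls file, and `iter_biff_records` is a hypothetical name. The demo stream is synthetic and its BOF payload is shortened for brevity.

```python
import struct


def iter_biff_records(stream: bytes):
    """Yield (record_type, payload) pairs from a raw BIFF record stream."""
    offset = 0
    while offset + 4 <= len(stream):
        # Header: 2-byte type identifier, 2-byte payload size, little-endian.
        rec_type, size = struct.unpack_from("<HH", stream, offset)
        offset += 4
        payload = stream[offset:offset + size]
        if len(payload) < size:
            raise ValueError("truncated record at offset %d" % (offset - 4))
        yield rec_type, payload
        offset += size


# Synthetic stream: a BOF record (0x0809) with a shortened 4-byte payload,
# followed by an EOF record (0x000A) with no payload.
demo = (
    struct.pack("<HH", 0x0809, 4) + b"\x00\x06\x05\x00"
    + struct.pack("<HH", 0x000A, 0)
)
records = list(iter_biff_records(demo))
print([(hex(t), len(p)) for t, p in records])
```

A fuller reader would also stitch Continue records back onto the record they extend; that step is omitted here.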
XML Format (XLSX/OOXML):
- ZIP archive containing XML files and resources
- Based on Office Open XML standard (ECMA-376, ISO/IEC 29500)
- Package layout follows the Open Packaging Conventions (OPC); every package must contain a [Content_Types].xml part mapping part names to content types
- The .xlsb variant keeps the same OPC/ZIP package but stores worksheet parts as binary records defined in the MS-XLSB specification
- File extension: .xlsx (workbook), .xlsm (macro-enabled), .xlsb (binary)
- Maximum rows: 1,048,576; columns: 16,384
- File size limit: Limited by available memory and system resources
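Since an .xlsx file is an ordinary ZIP archive, the OPC content-type map can be inspected with the standard library alone. The sketch below builds a tiny in-memory package so it runs without a real workbook; `build_demo_package` and `content_types` are hypothetical helper names, and the demo package contains only the [Content_Types].xml part, so it is not a valid workbook.

```python
import io
import xml.etree.ElementTree as ET
import zipfile

CT_NS = "{http://schemas.openxmlformats.org/package/2006/content-types}"
WORKBOOK_CT = (
    "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet.main+xml"
)


def build_demo_package() -> bytes:
    """Create a minimal ZIP holding only [Content_Types].xml (demo data)."""
    xml = (
        '<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">'
        '<Override PartName="/xl/workbook.xml" ContentType="%s"/>'
        "</Types>" % WORKBOOK_CT
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("[Content_Types].xml", xml)
    return buf.getvalue()


def content_types(package: bytes) -> dict:
    """Map part names to content types using the Override entries."""
    with zipfile.ZipFile(io.BytesIO(package)) as zf:
        root = ET.fromstring(zf.read("[Content_Types].xml"))
    return {
        el.get("PartName"): el.get("ContentType")
        for el in root.iter(CT_NS + "Override")
    }


print(content_types(build_demo_package()))
```

For a real file, pass `open(path, "rb").read()` to `content_types`. Parts covered only by `Default` extension mappings are not listed by this sketch.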
Key Components:
- Workbook: Container for all sheets and metadata
- Worksheets: Individual spreadsheet tabs with data
- Shared Strings: Centralized text storage for efficiency
- Styles: Formatting definitions for cells
- Relationships: Links between components
- Drawings: Charts, shapes, images
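To show how the worksheet and shared-strings components fit together: a cell with `t="s"` stores an index into the shared strings table rather than the text itself. The following sketch uses two small hand-written XML fragments in place of real package parts; `load_shared_strings` and `cell_values` are hypothetical helpers, and inline strings, dates, and styles are deliberately ignored.

```python
import xml.etree.ElementTree as ET

MAIN = "{http://schemas.openxmlformats.org/spreadsheetml/2006/main}"

# Hand-written stand-ins for xl/sharedStrings.xml and a worksheet part.
SHARED_XML = (
    '<sst xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">'
    "<si><t>Name</t></si><si><t>Ada</t></si></sst>"
)
SHEET_XML = (
    '<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">'
    '<sheetData><row r="1"><c r="A1" t="s"><v>0</v></c><c r="B1"><v>42</v></c></row>'
    '<row r="2"><c r="A2" t="s"><v>1</v></c></row></sheetData></worksheet>'
)


def load_shared_strings(xml_text):
    """Return the shared strings in index order, joining rich-text runs."""
    root = ET.fromstring(xml_text)
    return [
        "".join(t.text or "" for t in si.iter(MAIN + "t"))
        for si in root.findall(MAIN + "si")
    ]


def cell_values(sheet_xml, strings):
    """Map cell references to raw values, resolving shared-string indexes."""
    values = {}
    for cell in ET.fromstring(sheet_xml).iter(MAIN + "c"):
        v = cell.find(MAIN + "v")
        if v is None:
            continue  # empty cell or a type this sketch does not handle
        if cell.get("t") == "s":
            values[cell.get("r")] = strings[int(v.text)]
        else:
            values[cell.get("r")] = v.text  # left as text, e.g. "42"
    return values


strings = load_shared_strings(SHARED_XML)
values = cell_values(SHEET_XML, strings)
print(values)
```

Numeric cells are returned as their stored text; converting them to numbers or dates depends on the cell style, which lives in the Styles component.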
Data Types and Structures
| Type | Storage | Description |
|---|---|---|
| Number | IEEE 754 double | 64-bit floating point, up to 15 significant digits |
| RK number | 4 bytes (BIFF) | Compressed 30-bit value: signed integer or truncated IEEE 754 double, optionally divided by 100 |
| Text | Shared strings table | Unicode text strings, length up to 32,767 characters |
| Inline string | In-cell XML (inlineStr) | Text stored directly in the cell element instead of the shared strings table |
| Boolean | 1 byte | TRUE/FALSE values |
| Error | 1 byte | #N/A, #VALUE!, #REF!, etc. |
| Date/Time | Serial number | Days counted from January 1, 1900 as day 1, with the time of day as the fractional part; 1900 wrongly treated as a leap year (Lotus 1-2-3 compatibility) |
| Formula | Parsed token array (BIFF) or text (XLSX) | Calculated expressions with cell references |
| Formula string | Cached text (str) | String result of a formula, cached in the cell element alongside the formula |
| Array | Formula result | Multi-cell array formulas |
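The RK encoding in the table is compact enough to decode by hand. The sketch below follows the commonly documented layout, which is an assumption beyond what the table states: bit 0 flags the divide-by-100 step, bit 1 selects integer versus floating point, and the remaining 30 bits hold either a signed integer or the most significant bits of an IEEE 754 double. `decode_rk` is a hypothetical name.

```python
import struct


def decode_rk(rk: int) -> float:
    """Decode a 32-bit BIFF RK value into a Python float."""
    divide_by_100 = rk & 0x01
    is_integer = rk & 0x02
    if is_integer:
        # Reinterpret as signed 32-bit, then drop the two flag bits.
        signed = rk - (1 << 32) if rk & 0x80000000 else rk
        value = float(signed >> 2)
    else:
        # The 30 stored bits are the top bits of a double; the low 34 are zero.
        value = struct.unpack("<d", struct.pack("<II", 0, rk & 0xFFFFFFFC))[0]
    return value / 100 if divide_by_100 else value


print(decode_rk(0x3FF00000))  # float encoding
print(decode_rk(0x004B5646))  # integer encoding
print(decode_rk(0x004B5647))  # integer encoding, divided by 100
```

The divide-by-100 flag lets values with two decimal places, such as currency amounts, fit in 4 bytes instead of the 8 a full double needs.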
Workbook Structure:
- Worksheets organized as tabs
- Cells addressed by column letter + row number (A1, B2, etc.)
- Ranges defined by start:end notation (A1:B10)
- Named ranges and tables for organization
- Charts linked to worksheet data
- Pivot tables for data summarization
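The A1 addressing scheme is a bijective base-26 numbering of columns combined with a 1-based row number. A minimal sketch of the conversion is shown here; the function names are hypothetical, and absolute references such as $A$1 or sheet-qualified references are not handled.

```python
import re


def column_index(letters: str) -> int:
    """Convert column letters to a 1-based index (A -> 1, Z -> 26, AA -> 27)."""
    index = 0
    for ch in letters.upper():
        index = index * 26 + (ord(ch) - ord("A") + 1)
    return index


def parse_a1(ref: str):
    """Split a reference like 'B2' into (row, column), both 1-based."""
    match = re.fullmatch(r"([A-Za-z]+)(\d+)", ref)
    if match is None:
        raise ValueError("not a simple A1 reference: %r" % ref)
    return int(match.group(2)), column_index(match.group(1))


def parse_range(ref: str):
    """Split a range like 'A1:B10' into its start and end cells."""
    start, end = ref.split(":")
    return parse_a1(start), parse_a1(end)


print(parse_a1("B2"))
print(parse_range("A1:B10"))
print(column_index("IV"), column_index("XFD"))
```

The last line ties back to the grid limits: IV is the final column of the 256-column .xls grid and XFD is the final column of the 16,384-column .xlsx grid.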
Version Differences
| Version | Year | Key Features | File Format |
|---|---|---|---|
| Excel 5.0/95 | 1993-1995 | VBA macros, multi-sheet workbooks | .xls (BIFF5/BIFF7) |
| Excel 97-2003 | 1997-2003 | Unicode strings, 65,536-row grid | .xls (BIFF8) |
| Excel 2007 | 2007 | Ribbon, 1,048,576-row grid | .xlsx (OOXML) |
| Excel 2010 | 2010 | Sparklines, slicers | No format change |
| Excel 2013 | 2013 | Flash Fill, Data Model | Adds Strict Open XML save option |
| Excel 2016 | 2016 | Get & Transform, forecasting functions | No format change |
| Excel 2019 | 2019 | TEXTJOIN, IFS, map and funnel charts | No format change |
| Excel 2021 | 2021 | XLOOKUP, LET, dynamic arrays | No format change |
| Excel 2024 | 2024 | Dynamic array charts, IMAGE function | No format change |
Compatibility Notes:
- .xlsx is the default file format in Excel 2007 and later
- .xls files can be opened in newer versions
- Opening an .xls file places the workbook in Compatibility Mode (shown in the title bar), keeping the 65,536-row, 256-column grid until the file is converted
- Some features lost when saving to older formats
- The Compatibility Checker runs automatically when saving a workbook to the Excel 97-2003 format and reports unsupported features
- Excel 2010 can open but not save Strict Open XML workbooks; Excel 2013 and later can read and write them
- VBA macros require .xlsm, .xlsb, or legacy .xls formats
- Workbooks may use the 1900 or 1904 date system; older Excel for Mac versions defaulted to 1904
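The two date systems matter when converting stored serial numbers to calendar dates. The sketch below is one way to do it in Python and is not a description of DataMeans internals. It assumes serial 1 is January 1, 1900 in the 1900 system and serial 0 is January 1, 1904 in the 1904 system, and it compensates for the phantom February 29, 1900 (serial 60) that Excel keeps for Lotus 1-2-3 compatibility. `serial_to_datetime` is a hypothetical name.

```python
from datetime import datetime, timedelta


def serial_to_datetime(serial: float, date1904: bool = False) -> datetime:
    """Convert an Excel date serial to a datetime in the chosen date system."""
    if date1904:
        # 1904 system: serial 0 is 1904-01-01 and there is no leap-year quirk.
        return datetime(1904, 1, 1) + timedelta(days=serial)
    if int(serial) == 60:
        raise ValueError("serial 60 is the nonexistent date 1900-02-29")
    if serial < 60:
        # Before the phantom leap day: serial 1 is 1900-01-01.
        return datetime(1899, 12, 31) + timedelta(days=serial)
    # After the phantom leap day every serial is shifted by one.
    return datetime(1899, 12, 30) + timedelta(days=serial)


print(serial_to_datetime(1))
print(serial_to_datetime(61))
print(serial_to_datetime(45292.5))
print(serial_to_datetime(43830, date1904=True))
```

The same calendar date differs by 1,462 between the two systems, so reading a 1904-system workbook with 1900-system arithmetic shifts every date by about four years.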
Technical References
- Microsoft Excel Developer Documentation
- MS-XLS: Excel Binary File Format (.xls) Structure
- ECMA-376: Office Open XML File Formats
- Excel Specifications and Limits
- Wikipedia: Microsoft Excel
To learn how to use this format with DataMeans, see the User Guide.