Two datetime systems in Excel

该文章根据 CC-BY-4.0 协议发表,转载请遵循该协议。
本文地址:https://fenying.net/en/post/2022/10/25/stupid-excel-datetime-schema/

stupid-excel-datetime-schema

Long time ago, I create a command-line tool to convert Excel files to JSON files and vice versa. A few days ago, one of my colleagues reported a strange bug to me: some date values in sheets was wrong after converting to JSON.

At first, I didn’t pay much attention to it. A few days later, he continued to follow up this issue, and spotted the clue: copying values from the .xlsx file with problem, into another new Excel file, the date value was different; but if copying into another sheet in the same file, the date value was still the same… 😐

Today, I was hit by this issue again, and reproduced the problem with the BAD file we had.

After googling, I realized that Excel has two date systems:

  • 1900 Date System: The default date system in Excel on Lotus 1-2-3/MS-DOS/Windows systems, which treats 1900-01-01 as the first day.

  • 1904 Date System: Excel uses 1904-01-01 as the first day to be compatible with Macintosh, because the stupid Macintosh system does not support dates before 1904-01-01.

    Another reason is that Excel incorrectly treats 1900 as a leap year, so February 29, 1900 is a valid date. This error has been preserved to this day because Microsoft is afraid that fixing this error will cause problems with existing Excel files.

For more information, you can refer to the official document from Microsoft.

As the .xlsx file is a zip file, I unzipped it and found the schema of the date system in the file:

1<workbookPr date1904="1" defaultThemeVersion="166925"/>

When I changed the date1904 attribute to 0, and repack the .xlsx file, the date values were correct again.

And if you’re using Excel 2016 or later, you can turn off the “1904 date system” in the menu “File-Options-Advanced”.

comments powered by Disqus

Translations: