How to determine the Excel file format?




Without going into details: there are .xlsx files that are named .xls and, conversely, .xls files that are named .xlsx.
There are also files whose extension matches their actual format.

In short, I need to go through the files and rename them correctly, but I have not found a way to determine the file format without relying on the extension in the file name.

Answer 1, Authority 100%

By the header. New Excel files (XLSX) begin with the ZIP signature PK, i.e. the bytes 50 4B. Not only XLSX files have this signature, but also DOCX files and ordinary ZIP archives, so to be sure you need to read the list of packed files.

.xls files come in two types. The "new" XLS is the one you usually encounter.
It has the signature D0 CF 11 E0 A1 B1 1A E1. This is in fact the generic Office document signature, so doc and xls files share it; for precise identification you need to find the "main" stream inside the container and check the BIFF signature there, or go by the list of streams in the container. The old files (up to version 7; version 6 was not in use) have the signature 09 08.

For example, you can do it like this:

byte[] xlsdoc = f(); // document loaded as bytes
if ((xlsdoc[0] == 0x50) && (xlsdoc[1] == 0x4B)) {
    // new Excel (XLSX)
} else if ((xlsdoc[0] == 0xD0) && (xlsdoc[1] == 0xCF) /*...*/) {
    // Office 97
} else if ((xlsdoc[0] == 0x09) && (xlsdoc[1] == 0x08)) {
    // old Excel format
} else {
    // in all other cases I treat the file as garbage
}

Because the file can be large, it is better to read only the first 16 bytes, analyze them, and then set stream.Position back to zero.
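A minimal sketch of that approach, assuming a seekable stream. The names ExcelKind, ExcelSniffer, Detect and Classify are made up for illustration and do not come from any library. It reads at most 16 bytes from the start of the stream, compares them with the signatures described above, and then restores the position the stream had before the call.

```csharp
using System;
using System.IO;

// Hypothetical names, for illustration only.
public enum ExcelKind { Unknown, ZipContainer, OleContainer, OldBiff }

public static class ExcelSniffer
{
    // OLE2 compound file signature (shared by .xls, .doc and other Office files).
    private static readonly byte[] OleSignature =
        { 0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1 };

    public static ExcelKind Detect(Stream stream)
    {
        if (!stream.CanSeek)
            throw new ArgumentException("A seekable stream is required.", nameof(stream));

        long saved = stream.Position;
        try
        {
            stream.Position = 0;
            var header = new byte[16];
            int read = 0;
            // Stream.Read may return fewer bytes than requested, so loop.
            while (read < header.Length)
            {
                int n = stream.Read(header, read, header.Length - read);
                if (n == 0) break; // end of stream
                read += n;
            }
            return Classify(header, read);
        }
        finally
        {
            stream.Position = saved; // leave the stream where we found it
        }
    }

    public static ExcelKind Classify(byte[] header, int length)
    {
        if (length >= 2 && header[0] == 0x50 && header[1] == 0x4B)
            return ExcelKind.ZipContainer; // "PK": XLSX, but also DOCX or any ZIP
        if (length >= OleSignature.Length && StartsWith(header, OleSignature))
            return ExcelKind.OleContainer; // XLS, but also DOC and other OLE files
        if (length >= 2 && header[0] == 0x09 && header[1] == 0x08)
            return ExcelKind.OldBiff;      // old Excel file without a container
        return ExcelKind.Unknown;
    }

    private static bool StartsWith(byte[] data, byte[] prefix)
    {
        for (int i = 0; i < prefix.Length; i++)
            if (data[i] != prefix[i]) return false;
        return true;
    }
}
```

The result only narrows the format down to a container type. A ZipContainer or OleContainer result still needs the content check described in the text before the file can safely be called XLSX or XLS.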

I have also come across a few oddities with Excel:

  • XML, starting with <?xml: a file produced by Excel itself (if you save the file as XML in Excel)
  • HTML, starting with <html: people on the Internet often advise generating such a file with JavaScript, PHP and so on; it is a table built from <td> and <tr> tags with specific attributes.
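These two text-based cases can be recognized from the same header bytes. The sketch below is only an illustration: TextExcelSniffer and DetectMarkup are hypothetical names, and it assumes the file is ASCII-compatible (for example UTF-8, with or without a byte order mark) and that the markup starts right at the beginning of the file.

```csharp
using System;
using System.Text;

// Hypothetical helper, for illustration only.
public static class TextExcelSniffer
{
    // Returns "xml", "html", or null when neither prefix matches.
    public static string DetectMarkup(byte[] header)
    {
        int offset = 0;
        // Skip a UTF-8 byte order mark if present.
        if (header.Length >= 3 &&
            header[0] == 0xEF && header[1] == 0xBB && header[2] == 0xBF)
        {
            offset = 3;
        }

        // ASCII decoding is enough for a prefix test; other bytes become '?'.
        string text = Encoding.ASCII
            .GetString(header, offset, header.Length - offset)
            .TrimStart();

        if (text.StartsWith("<?xml", StringComparison.OrdinalIgnoreCase))
            return "xml";
        if (text.StartsWith("<html", StringComparison.OrdinalIgnoreCase))
            return "html";
        return null;
    }
}
```

A match only says that the file is XML or HTML, not that Excel produced it. Generated HTML tables may also begin with a doctype or directly with a table tag, which this sketch does not cover.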

References to signatures

  1. http://uk.wikipedia.org/wiki/%D0%A1%D0%B8%D0%B3%D0%BD%D0%B0%D1%82%D1%83%D1%80%D0%B0_%D1%84%D0%B0%D0%B9%D0%BB%D1%83_(%D0%BF%D0%B5%D1%80%D0%B5%D0%BB%D1%96%D0%BA)
  2. http://www.filesignatures.net/index.php?page=search&search=XLS&mode=EXT
  3. http://www.filesignatures.net/index.php?page=search&search=XLSX&mode=EXT
  4. http://en.wikipedia.org/wiki/Microsoft_excel

Answer 2, Authority 67%

Alternatively, on Windows you can use the Structured Storage API to determine whether a file is a valid XLS file. According to the specification, an XLS file is a Structured Storage file that contains a stream named Workbook.

MS-XLS: Excel Binary File Format Structure, section 2.1.2:

A file of the type specified by this document consists of storages and streams as specified in [MS-CFB] …

A workbook MUST contain the workbook stream …

Based on this rule, you can use the following code to check for XLS:

using System;
using System.Collections;
using System.Runtime.InteropServices;

namespace ConsoleApplication1
{
    class Program
    {
        [DllImport("ole32.dll")]
        static extern int StgOpenStorageEx(
            [MarshalAs(UnmanagedType.LPWStr)] string pwcsName,
            uint grfMode,
            uint stgfmt,
            uint grfAttrs,
            IntPtr pStgOptions,
            IntPtr reserved2,
            [In] ref Guid riid,
            out IStorage ppObjectOpen);

        const uint STGM_DIRECT = 0;
        const uint STGM_READ = 0;
        const uint STGM_SHARE_EXCLUSIVE = 0x10;
        const uint STGFMT_STORAGE = 0;
        const uint PID_FIRST_USABLE = 2;
        const uint STGC_DEFAULT = 0;

        // Methods must stay in vtable order; a(), b(), c(), d() are placeholders
        // for IStorage methods that are never called here.
        [Guid("0000000B-0000-0000-C000-000000000046")]
        [InterfaceType(ComInterfaceType.InterfaceIsIUnknown)]
        public interface IStorage
        {
            void a(); // CreateStream
            [PreserveSig] // return the HRESULT instead of throwing
            int OpenStream([MarshalAs(UnmanagedType.LPWStr)] string pwcsName,
                IntPtr reserved1,
                uint grfMode,
                uint reserved2,
                [MarshalAs(UnmanagedType.Interface)] out object ppstm);
            void CreateStorage(string pwcsName, uint grfMode, uint reserved1, uint reserved2, out IStorage ppstg);
            void OpenStorage(string pwcsName, IStorage pstgPriority, uint grfMode, IntPtr snbExclude, uint reserved, out IStorage ppstg);
            void CopyTo(uint ciidExclude, Guid[] rgiidExclude, IntPtr snbExclude, IStorage pstgDest);
            void MoveElementTo(string pwcsName, IStorage pstgDest, string pwcsNewName, uint grfFlags);
            void Commit(uint grfCommitFlags);
            void Revert();
            void b(); // EnumElements
            void DestroyElement(string pwcsName);
            void RenameElement(string pwcsOldName, string pwcsNewName);
            void c(); // SetElementTimes
            void SetClass(ref Guid clsid);
            void SetStateBits(uint grfStateBits, uint grfMask);
            void d(); // Stat
        }

        public static bool IsXls(string path)
        {
            IStorage pStorage = null;
            object o = null;
            int hr;
            Guid guidStorage = typeof(IStorage).GUID;
            try
            {
                // Open the file
                hr = StgOpenStorageEx(path,
                    STGM_DIRECT | STGM_READ | STGM_SHARE_EXCLUSIVE, STGFMT_STORAGE,
                    0, IntPtr.Zero, IntPtr.Zero, ref guidStorage, out pStorage);
                if (hr != 0) return false; // not a Structured Storage file
                // Open the stream
                hr = pStorage.OpenStream("Workbook", IntPtr.Zero, STGM_DIRECT | STGM_READ | STGM_SHARE_EXCLUSIVE, 0, out o);
                return hr == 0;
            }
            finally
            {
                // Release resources
                if (pStorage != null) Marshal.ReleaseComObject(pStorage);
                if (o != null) Marshal.ReleaseComObject(o);
            }
        }
    }
}

Since an XLSX file is a ZIP archive with a specific structure, you can apply the same logic to check it, using any library that works with ZIP archives (.NET 4.5+ has the built-in System.IO.Compression).
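A minimal sketch of such a check with System.IO.Compression. XlsxChecker and LooksLikeXlsx are hypothetical names, and the entry names tested are an assumption based on what Excel conventionally writes ([Content_Types].xml at the root and the workbook part at xl/workbook.xml); a stricter check would follow the package relationships instead.

```csharp
using System;
using System.IO;
using System.IO.Compression;

// Hypothetical helper, for illustration only.
public static class XlsxChecker
{
    public static bool LooksLikeXlsx(Stream stream)
    {
        try
        {
            // leaveOpen: true so the caller keeps ownership of the stream.
            using (var zip = new ZipArchive(stream, ZipArchiveMode.Read, leaveOpen: true))
            {
                // Assumed conventional entry names; GetEntry returns null if absent.
                return zip.GetEntry("[Content_Types].xml") != null
                    && zip.GetEntry("xl/workbook.xml") != null;
            }
        }
        catch (InvalidDataException)
        {
            return false; // not a readable ZIP archive
        }
    }
}
```

It could be called as LooksLikeXlsx(File.OpenRead(path)) after the PK signature has matched, so that DOCX files and plain ZIP archives are rejected.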
