SharpDevelop Community

A dirty fix for extracting files with special characters

Last post 10-07-2008 5:47 PM by DomZ. 2 replies.
  • 09-24-2008 10:52 PM

    A dirty fix for extracting files with special characters

    There seem to be several workarounds for handling zip files that contain files with special characters in their names, and I'd like to share mine.

    My problem is simple: I have to extract zip files, most likely created with WinZip, that may contain files whose names include special characters - in my case, Danish letters like æøå/ÆØÅ.

    SharpZip kept messing them up, no matter what value I set ZipConstants.DefaultCodePage to.

     I made the following changes in ZipFile.cs:

    The line [string name = ZipConstants.ConvertToStringExt(bitFlags, buffer, nameLen);]

     was changed to:

                    int externallyDefinedCodepage = ZipConstants.DefaultCodePage;
                    string name;
                    try
                    {
                        if (versionMadeBy == 2836) // 2836 == 0x0B14
                        {
                            // Use the system's ANSI code page (1252 on Western European Windows), as mentioned at http://community.sharpdevelop.net/forums/thread/9246.aspx
                            ZipConstants.DefaultCodePage = System.Text.Encoding.Default.CodePage;
                        }
                        name = ZipConstants.ConvertToStringExt(bitFlags, buffer, nameLen);
                    }
                    finally
                    {
                        // Restore the caller's setting even if the conversion throws.
                        ZipConstants.DefaultCodePage = externallyDefinedCodepage;
                    }

    Now I get my files extracted with the correct Danish characters.

    I have not tested whether this workaround has any negative or unexpected side effects. I suspect it does, since ZipConstants.ConvertToStringExt() is also called in two other places in that file, which I have not changed.

     However, it makes SharpZip useful to me, and perhaps it can help others.

    YMMV.

  • 10-06-2008 8:08 PM In reply to

    Re: A dirty fix for extracting files with special characters

    Yes, encoding is a perennial problem with Zip archives due to the code page handling.

    A built-in strategy to allow dynamic code page/encoding handling would help for old archives.
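
    A rough sketch of what such a strategy might look like, not anything that exists in SharpZipLib. The class and method names are made up for illustration. It assumes general purpose bit 11 (0x0800) marks a UTF-8 name, tries a strict UTF-8 decode when the flag is absent, and falls back to a legacy encoding supplied by the caller. The heuristic can misfire when legacy bytes happen to form valid UTF-8.

```csharp
using System.Text;

// Hypothetical helper, not part of SharpZipLib.
static class ZipNameDecoder
{
    // General purpose bit 11: the entry name is stored as UTF-8.
    const int Utf8Flag = 0x0800;

    // Strict decoder: throws on invalid bytes instead of substituting U+FFFD.
    static readonly Encoding StrictUtf8 = new UTF8Encoding(false, true);

    public static string DecodeName(int bitFlags, byte[] buffer, int count, Encoding legacy)
    {
        if ((bitFlags & Utf8Flag) != 0)
        {
            // The archive says the name is UTF-8, so trust it.
            return Encoding.UTF8.GetString(buffer, 0, count);
        }
        try
        {
            // No flag: plain ASCII names and unflagged UTF-8 names decode here.
            return StrictUtf8.GetString(buffer, 0, count);
        }
        catch (DecoderFallbackException)
        {
            // Not valid UTF-8, so assume the caller's legacy code page.
            return legacy.GetString(buffer, 0, count);
        }
    }
}
```

    The legacy encoding is passed in as a parameter, so nothing global has to be changed and restored around the call.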


  • 10-07-2008 5:47 PM In reply to

    • DomZ

    Re: A dirty fix for extracting files with special characters

    Hi John,

    I've noticed that most recent versions of zip archivers try to handle Unicode. They all follow the same strategy: they use Unicode only when needed (when the file name or directory name contains characters that require it).

    Is it possible to support this in SharpZipLib and in FastZip ?

    Currently, IsUnicodeText works well, but it causes all entries to be encoded in Unicode. Most of the time Unicode isn't needed, and using it breaks compatibility with most archivers.
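
    To illustrate what I mean by "only when needed", here is a minimal sketch. The helper name is made up and is not part of SharpZipLib. It treats any character outside 7-bit ASCII as needing Unicode, which is a conservative simplification; a finer check would test whether the name survives a round trip through the default code page.

```csharp
// Hypothetical helper, not part of SharpZipLib.
static class ZipNamePolicy
{
    // Returns true when the name contains at least one non-ASCII character,
    // meaning a plain legacy encoding may not store it safely.
    public static bool NeedsUnicode(string name)
    {
        foreach (char c in name)
        {
            if (c > 127)
            {
                return true;
            }
        }
        return false;
    }
}
```

    The result could then be assigned per entry, for example entry.IsUnicodeText = ZipNamePolicy.NeedsUnicode(entry.Name), assuming the property is settable in the version in use.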

    Thanks
