SharpDevelop Community


Problem with decompressing files with French Accents in the file name.

Last post 08-11-2009 10:31 AM by Randa Wadie. 17 replies.
  • 05-19-2009 11:15 AM

    Problem with decompressing files with French Accents in the file name.

    Hi,

    I have a problem decompressing a zip file (created by FastZip). The zip contains a file name with French accents. While unzipping, the accented character is converted to the invalid character '?' (ASCII 63) and an exception is thrown. I found that the encoding information is already lost during compression (using FastZip), because the name is encoded using the OEM code page. After some research, I changed it to UTF-8 encoding, and after that compression and decompression work correctly (tested only on an English PC so far). Below are the 2 methods where I changed the #ziplib code.

    //ZipConstants.cs

    public static string ConvertToString(byte[] data, int count)
            {
                if ( data == null ) {
                    return string.Empty;   
                }
              
                //return Encoding.GetEncoding(DefaultCodePage).GetString(data, 0, count);
                return Encoding.UTF8.GetString(data, 0, count); //newly changed line
            }

    //ZipConstants.cs

    public static byte[] ConvertToArray(string str)
            {
                if ( str == null ) {
                    return new byte[0];
                }       


                //return Encoding.GetEncoding(DefaultCodePage).GetBytes(str);
                return Encoding.UTF8.GetBytes(str); //newly changed line
            }
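
    A minimal sketch of why the '?' appears, assuming a lossy single-byte encoding stands in for the OEM code page. Encoding.ASCII is used here only because it is available on every .NET runtime; the class and method names are hypothetical and not part of #ziplib. It simply repeats the GetBytes/GetString round trip that the two methods above perform on a file name:

```csharp
using System;
using System.Text;

// Hypothetical helper, not part of #ziplib: it only illustrates the
// encode/decode round trip that a file name goes through.
public static class NameRoundTrip
{
    // Encode the name to bytes, then decode those bytes with the same encoding.
    // Characters the encoding cannot represent are replaced during GetBytes.
    public static string RoundTrip(string name, Encoding encoding)
    {
        if (name == null) throw new ArgumentNullException("name");
        byte[] raw = encoding.GetBytes(name);
        return encoding.GetString(raw, 0, raw.Length);
    }
}
```

    With this sketch, NameRoundTrip.RoundTrip("été.txt", Encoding.ASCII) comes back as "?t?.txt", while the same call with Encoding.UTF8 returns the name unchanged.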

     

    In my case, compression may happen on any English, German (with or without umlauts), French (with or without accents), or Russian language PC. The compressed file should then decompress correctly on any English, German, French, or Russian language PC.

    Please let me know if UTF-8 encoding is the right one to solve this problem.

     

    Thanks in advance

    Best Regards,

    Ram

  • 06-12-2009 8:24 AM In reply to

    Re: Problem with decompressing files with French Accents in the file name.

    Hi Ram,

    Do you know if UTF-8 is the encoding scheme that WinZip uses for filenames? If it is, then that would seem to be the one we should use.

    You said that you did some research, and a (very brief) google by me did not come up with anything.

     

    Thanks

    David

  • 06-12-2009 4:06 PM In reply to

    Re: Problem with decompressing files with French Accents in the file name.

    Hi David,

    I'm not sure whether WinZip uses UTF-8 or not. The research was to find out which encoding suits my requirement.

    But I've found out the following link:

    http://blogs.msdn.com/michkap/archive/2005/05/10/416181.aspx

     

    So can you please tell me whether modifying the #ZipLib code to use UTF-8 instead of DefaultEncoding/OEMCode page is a right choice?

     

    Thanks

    Best Regards,

    Ram.

     

     

  • 06-15-2009 12:59 AM In reply to

    Re: Problem with decompressing files with French Accents in the file name.

    Hi Ram,

    Excellent reference you have found and posted there. Michael Kaplan seems to know what he's talking about when he says that WinZip relies on the code page. He says that, in essence, each character must be in the "default system code page".

    Based on that, I'd say the current code is (unfortunately) correct, and that we will have to wait for the zip format to catch up with modern encoding formats before we can change the code. In my opinion UTF-8 is an excellent format and I wish that zip used it but of course, Phil Katz was doing this long before UTF came along :-)

    (After further reading) This comment by Chuck Campbell, WinZip Technical Support is very helpful and revealing ... "WinZip is not a Unicode-aware application at this time. For this reason, WinZip can only display and process characters that exist in your current codepage. ... We are looking into how to handle this better in a future version of WinZip."

    Best regards

    David.

  • 06-15-2009 8:22 AM In reply to

    Re: Problem with decompressing files with French Accents in the file name.

    Hi David,

    Thanks for the reply. In my case, compression & decompression can be done only by our application. In other words, our application cannot decompress a file created using WinZip. It should only decompress those files which are compressed by itself.

    If compression is done on the German application version (with umlauts) and decompression is then attempted on an English version, the default system code page differs, and the umlaut characters are not available in the English system code page. This is where decompression fails.

    So to overcome this, I would like to change the code page of ICSharpCode.SharpZipLib to UTF-8 for the time being, until WinZip and the ICSharpCode group support this encoding.

    Any suggestions on this would be really helpful. Awaiting your reply.

     

    Best Regards,

    Ram.

     

  • 06-16-2009 2:18 AM In reply to

    Re: Problem with decompressing files with French Accents in the file name.

    Hi Ram,

    At the end of the PKWare AppNote, in Appendix D, they talk about this very problem. Apparently the 0x0008 Extra Data Field might be used for UTF. They say ...

    Quote : Examples of the intended usage for this field is to store whether "modified-UTF-8" (JAVA) is used, or UTF-8-MAC.  Similarly, other commonly used character encoding (code page) designations can be indicated through this field.  Formalized values for use of the 0x0008 record remain undefined at this time.  The definition for the layout of the 0x0008 field will be published when available.  Use of the 0x0008 Extra Field provides for storing data within a ZIP file in an encoding other than IBM Code Page 437 or UTF-8.

    I wonder what is stopping them from publishing a definition. Obviously not seen as a priority.

    Anyway, if you are only using it within your own application you can certainly do whatever you like with the code. Yes, go ahead and change the encoding within the main filename field to UTF, as you outlined earlier. Good idea. Don't bother with the 0x0008 data, I was just quoting that because I had missed it earlier.

    Good luck
    Dave

  • 06-16-2009 7:45 AM In reply to

    Re: Problem with decompressing files with French Accents in the file name.

    Thanks David. It really helps.

     

    Best Regards,

    Ram.

     

  • 07-27-2009 1:50 PM In reply to

    Re: Problem with decompressing files with French Accents in the file name.

     

     Hello,

    I read your posts, and I think I am facing a problem similar to yours, but I still can't fix it and I need some assistance.

    I am using SharpZipLib to create a zip file. However, when a folder inside the zip file has an Arabic name, its name becomes "????", which causes errors when I try to extract the zip file.

    I am using the FastZip object to create my zip file, and I would find it difficult to change the way I am creating it. So I tried changing the SharpZipLib code (changing DefaultCodePage to Encoding.UTF8 or Encoding.GetEncoding(any of the numbers mentioned in your post)), but none worked properly; actually the "???" just changed to special characters and that enabled the zip file to be extracted. However, I still need the folder inside to have a correct name.

    Can you please help me with that?
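
    For illustration, "special characters" of this kind are what typically results when a name is written with one encoding and read back with another. Below is a minimal sketch, assuming UTF-8 on the writing side and ISO-8859-1 on the reading side as stand-ins (the encodings actually involved here are not known, and the class and method names are hypothetical):

```csharp
using System;
using System.Text;

// Hypothetical helper: encodes a name with one encoding and decodes the
// resulting bytes with another, which is what happens when the program
// writing a zip and the program reading it disagree about the encoding.
public static class EncodingMismatch
{
    public static string WriteThenRead(string name, Encoding writer, Encoding reader)
    {
        if (name == null) throw new ArgumentNullException("name");
        byte[] raw = writer.GetBytes(name);
        return reader.GetString(raw, 0, raw.Length);
    }
}
```

    For example, EncodingMismatch.WriteThenRead("Müller", Encoding.UTF8, Encoding.GetEncoding("iso-8859-1")) yields "MÃ¼ller", whereas using the same encoding on both sides returns the original name, Arabic names included.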

  • 07-28-2009 12:51 AM In reply to

    Re: Problem with decompressing files with French Accents in the file name.

    Hi

    Randa Wadie:
    actually the "???" just changed to special characters and that enabled the zip file to be extracted

    This is good. The filename is okay and you only need the folder to be fixed, correct?

    Instead of using FastZip, use ZipOutputStream to create the zipfile. When you create the ZipEntry, the constructor takes a parameter which is the name of file as it will appear in the zip, including the path. You can address the folder issue here.

    Hope that helps
    David

  • 07-28-2009 10:31 AM In reply to

    Re: Problem with decompressing files with French Accents in the file name.

     Hi David,
     
         Thanks a million for your quick response, but I have a couple of questions if you don't mind :)
     
    1- Inside the zip file I am creating, I will need to keep the same folder structure of the original folder that I will be zipping.... So,  will ZipOutputStream help me in doing so?
     
    2- I also found out that unzipping the zip file which includes an Arabic folder name using "ExtractZip" causes an exception as well. Do you have any suggestions for that? Or do you know if changing the zipping process to use "ZipOutputStream" will make "ExtractZip" work?
     
                Thanks again 
     
    Randa.

  • 07-29-2009 12:17 AM In reply to

    Re: Problem with decompressing files with French Accents in the file name.

    Hi Randa

    1. The ZipEntry constructor takes the path and filename as it will appear. You have full control. For example,

    ZipEntry zipEntry = new ZipEntry(@"Folder1\Folder2\Folder3\Filename");

    will appear as that. Provided you keep the folder structure and do whatever you need to the special characters, there should not be a problem.

    2. Try extracting using the ZipFile class. Again, you control what you do with the output.

    Regards

    David

  • 07-29-2009 11:40 PM In reply to

    Re: Problem with decompressing files with French Accents in the file name.

     David,

    Here is what I have done to create the zip file from a folder in a SharePoint document library (on the fly, without saving a physical copy of this document library folder and its content on disk).

    What happens is that when I download the created zip file and try to extract it, it tells me that the following path can't be found, and it refers to the path of the folders and files I tried to add to the zip file.

    Please have a look at the code below and tell me what could be wrong or what needs to be done.

    Also, if you have sample code for creating a zip file using ZipOutputStream and then downloading it, that would be perfect.

    public void MainMethod()

    {

                string tempZippedFilePath;
                string zippedFileName = "zipped";

                DownloadAssignment(out tempZippedFilePath);


                Response.Clear();
                Response.AppendHeader(
                   "content-disposition",
                   "attachment; filename=" + zippedFileName + ".zip");
                Response.WriteFile(tempZippedFilePath);
                Response.Flush();
                Response.Close();

    }

     

     

    public void DownloadAssignment(out string tempZippedFilePath)

    {

                        StringBuilder zippedFilePath = new StringBuilder();
                        zippedFilePath.AppendFormat("{0}{1}{2}{3}", finalTempFolderPath, @"\",
                                                    assignmentFolderName.ToString(), ".zip");

                        tempZippedFilePath = zippedFilePath.ToString();

                        CreateZip(tempZippedFilePath, assignmentFolders[0].Folder);

    }

           public void CreateZip(string tempZippedFilePath, SPFolder assignmentFolder)
            {
                ZipOutputStream zipOutputStream = new ZipOutputStream(File.Create(tempZippedFilePath));
                zipOutputStream.SetLevel(9);

                ZipEntry assFolderEntry = new ZipEntry(assignmentFolder.Name);
                zipOutputStream.PutNextEntry(assFolderEntry);

                foreach (SPFolder subFolder in assignmentFolder.SubFolders)
                {
                    ZipEntry assSubFolderEntry = new ZipEntry(assignmentFolder.Name + @"\" + subFolder.Name);
                    zipOutputStream.PutNextEntry(assSubFolderEntry);

                    foreach (SPFile file in subFolder.Files)
                    {
                        if ((bool)file.Item["IsLatest"] == true)
                        {
                            ZipEntry fileEntry = new ZipEntry(
                                                                assignmentFolder.Name +
                                                                @"\" +
                                                                subFolder.Name +
                                                                @"\" +
                                                                file.Name);

                            zipOutputStream.PutNextEntry(fileEntry);

                            using (Stream fileStream = file.OpenBinaryStream())
                            {
                                byte[] buffer = new byte[4096];
                                int sourceBytes;

                                do
                                {
                                    sourceBytes = fileStream.Read(buffer, 0, buffer.Length);
                                    zipOutputStream.Write(buffer, 0, sourceBytes);
                                } while (sourceBytes > 0);
                            }
                        }
                    }
                }

                zipOutputStream.Finish();
                zipOutputStream.Close();

            }

            /// <summary>
            /// Creates a temp folder to hold the assignment files while keeping their correct
            /// folder structure as in the document library
            /// </summary>
            /// <param name="systemTempPath">Path of the current system's temporary folder</param>
            /// <param name="finalTempFolderPath">Final path to the temp folder created</param>
            private void CreateTempFolder(string systemTempPath, out string finalTempFolderPath)
            {
                StringBuilder tempFolderPath = new StringBuilder();
                tempFolderPath.AppendFormat("{0}{1}", systemTempPath, "SLK_Temp");

                if (Directory.Exists(tempFolderPath.ToString()))
                {
                    int folderNameSuffix = 1;

                    finalTempFolderPath = tempFolderPath.ToString() + folderNameSuffix.ToString();

                    while (Directory.Exists(finalTempFolderPath))
                    {
                        folderNameSuffix++;
                        finalTempFolderPath = tempFolderPath.ToString() + folderNameSuffix.ToString();
                    }

                    Directory.CreateDirectory(finalTempFolderPath);
                }
                else
                {
                    finalTempFolderPath = tempFolderPath.ToString();
                    Directory.CreateDirectory(finalTempFolderPath);
                }
            }

     

    Also, I need to know whether it is correct to add an entry for a folder, or whether I can just add entries for files.

    What if I need to add a file under a hierarchy of folders that have not yet been created in the zip file? Does your last example, passing the needed path of the file to the ZipEntry constructor, solve this without adding a ZipEntry for the folders?

     

    David, I know I am asking many questions, but I have been stuck on this for 2 days and I really need to finish it ASAP.

     

    I really appreciate your efforts.

     

    Thanks a million :)

     

  • 07-30-2009 2:13 AM In reply to

    Re: Problem with decompressing files with French Accents in the file name.

    Randa Wadie:
    when I download the created zip file and try to extract it, it tells me that the following path can't be found, and it refers to the path of the folders and files I tried to add to the zip file

    So the zip file is created okay, and the problem is after you download it in your browser? What program are you using to extract the zip? From your description the problem is in the extraction not the creation. Maybe I misunderstood.

     

     

  • 07-30-2009 7:14 AM In reply to

    Re: Problem with decompressing files with French Accents in the file name.

     I am not sure if the zip file is created okay. I think it is not, since it refuses to be extracted. That is why I sent you the code I am using, in case I have done something wrong.

    The program I am using in extraction is WinRAR....

    What do you think?

    Also if you have any sample code, I will be grateful

  • 07-30-2009 7:37 AM In reply to

    Re: Problem with decompressing files with French Accents in the file name.

    Try extracting with WinZip. Does it look okay in WinZip, and does it extract okay? If not, what's the error message?

    You need to be more specific in describing the problem, and you need to post concise code - the smallest possible working section of code.

    Try these -

    -- do not do a ZipEntry for the folders, just the files.

    -- make sure the filenames in the ZipEntry constructor start with a folder not a drive e.g. "C:\folder\..." is apparently bad, use "folder\..." instead.
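
    The second point can be sketched as a small helper that turns a Windows-style path into a relative entry name before it is passed to the ZipEntry constructor. This is only a sketch under assumptions: the class and method names are hypothetical, and it also converts backslashes to forward slashes, since the ZIP format specification asks for forward slashes in entry names.

```csharp
using System;

// Hypothetical helper, not part of SharpZipLib: builds a relative,
// forward-slash entry name from a Windows-style path.
public static class EntryNames
{
    public static string ToEntryName(string path)
    {
        if (path == null) throw new ArgumentNullException("path");

        // Use forward slashes as folder separators.
        string name = path.Replace('\\', '/');

        // Drop a drive prefix such as "C:" so the name starts with a folder.
        if (name.Length >= 2 && char.IsLetter(name[0]) && name[1] == ':')
        {
            name = name.Substring(2);
        }

        // Remove any leading slashes left over from a rooted path.
        return name.TrimStart('/');
    }
}
```

    For example, EntryNames.ToEntryName(@"C:\folder\sub\file.txt") returns "folder/sub/file.txt", and a name that is already relative, such as @"folder\file.txt", becomes "folder/file.txt".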

    Regards

    David
