Created on 2008-04-26 04:18 by cdavid, last changed 2022-04-11 14:56 by admin. This issue is now closed.
| Files | | |
|---|---|---|
| File name | Uploaded | Description |
| make_id_fix_and_test.patch | markm, 2011-03-26 12:24 | Patch to fix msilib.make_id() and test it |
| Messages (8) | |||
|---|---|---|---|
| msg65834 | Author: Cournapeau David (cdavid) | Date: 2008-04-26 04:18 | |
Hi, I wanted to build an MSI using the bdist_msi distutils command for one of my packages, but at some point it fails in the function make_id, at line 177 in msilib/__init__.py, for a file named aixc++.py. The regex indeed refuses any character which is not alphanumeric: is MSI itself really that strict, or could this check be relaxed? |
|||
| msg65842 | Author: Martin v. Löwis (loewis) * (Python committer) | Date: 2008-04-26 12:19 | |
Indeed, the primary keys in many tables must be Identifiers; see http://msdn2.microsoft.com/en-us/library/aa369212(VS.85).aspx
make_id tries to synthesize an identifier from a file name, and fails for your file names. |
|||
| msg65845 | Author: Cournapeau David (cdavid) | Date: 2008-04-26 15:56 | |
Ok, thanks for the information. It may be good to have a slightly more informative error, though, such as one saying which characters are allowed when checking against a regex? |
|||
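A more informative error along the lines suggested in msg65845 could look roughly like the sketch below. It assumes the Identifier rule from the MSDN page linked above (ASCII letters, digits, underscores and periods, starting with a letter or an underscore). The name `check_identifier` and the pattern are hypothetical and are not taken from msilib.

```python
import re
import string

# Assumption: this pattern follows the MSDN Identifier rule; it is not
# copied from msilib/__init__.py.
IDENTIFIER_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_.]*\Z")
ALLOWED_CHARS = frozenset(string.ascii_letters + string.digits + "._")


def check_identifier(name):
    """Hypothetical helper: return name if valid, else raise a descriptive error."""
    if IDENTIFIER_RE.match(name):
        return name
    # Collect the distinct characters that the rule does not permit.
    offending = sorted(set(name) - ALLOWED_CHARS)
    message = (
        f"invalid MSI identifier {name!r}: only A-Z, a-z, 0-9, '_' and '.' "
        "are allowed, and the first character must be a letter or '_'"
    )
    if offending:
        message += f" (offending characters: {''.join(offending)!r})"
    raise ValueError(message)
```

With this sketch, `check_identifier("aixc++.py")` raises a `ValueError` that names `+` as the offending character instead of failing on a bare regex check.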
| msg65846 | Author: Martin v. Löwis (loewis) * (Python committer) | Date: 2008-04-26 16:02 | |
Actually, the algorithm should be fixed to generate a valid identifier for any input. Would you like to work on a fix? |
|||
| msg65847 | Author: Cournapeau David (cdavid) | Date: 2008-04-26 16:13 | |
It's not that I don't want to work on it, but I don't know anything about MSI, except that some Windows users of my packages request it :) So I would need some indication of what exactly to fix. Do I understand right that bdist_msi builds a database of the files, and that the identifiers could be named differently than the filenames themselves, as long as they are unique? |
|||
| msg65848 | Author: Martin v. Löwis (loewis) * (Python committer) | Date: 2008-04-26 16:23 | |
> Do I understand right that bdist_msi builds a database of the files, and
> that the identifiers could be named differently than the filenames
> themselves, as long as they are unique ?

Correct. As a design objective, I try to use identifiers close to the file names, to simplify debugging of the MSI file (Microsoft itself typically uses UUIDs instead).

In short, just make make_id generate valid identifiers. An algorithm on top of that will make them unique in case of conflicts.

Regards, Martin |
|||
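The two steps described in msg65848 (first produce a valid identifier, then make it unique on conflict) might be sketched as follows. This is an illustration only: `sanitize`, `make_unique` and the numeric-suffix scheme are assumptions, not the code that msilib ended up with.

```python
import string

# Characters permitted in an MSI Identifier: ASCII letters, digits, '_' and '.'.
ALLOWED_CHARS = frozenset(string.ascii_letters + string.digits + "._")


def sanitize(name):
    """Hypothetical step 1: map any file name to a valid identifier."""
    cleaned = "".join(c if c in ALLOWED_CHARS else "_" for c in name)
    # An identifier must start with a letter or an underscore.
    if not cleaned or cleaned[0] in string.digits + ".":
        cleaned = "_" + cleaned
    return cleaned


def make_unique(name, taken):
    """Hypothetical step 2: append a numeric suffix until the id is unused."""
    base = sanitize(name)
    candidate = base
    counter = 1
    while candidate in taken:
        candidate = f"{base}.{counter}"
        counter += 1
    taken.add(candidate)
    return candidate


taken = set()
print(make_unique("aixc++.py", taken))  # aixc__.py
print(make_unique("aixc--.py", taken))  # aixc__.py.1
```

Keeping the sanitized name as the prefix matches the stated design objective of identifiers that stay close to the file names.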
| msg132232 | Author: Mark Mc Mahon (markm) * | Date: 2011-03-26 12:24 | |
How about the following patch and tests...

Per: http://msdn.microsoft.com/en-us/library/aa369212(v=vs.85).aspx

"""The Identifier data type is a text string. Identifiers may contain the ASCII characters A-Z (a-z), digits, underscores (_), or periods (.). However, every identifier must begin with either a letter or an underscore."""

So the spec would say that colons are NOT allowed. Editing some entries in the File table of an MSI (using Orca from the MSI SDK) and running the validation confirms that. All the following were flagged as errors: 'KDiff3EXE;"ASDF@#$', 'chmFile-', 'pdfFile(', 'hgbook]', 'TortoisePlinkEXE]', 'Hg.Cämd'

I also did some speed testing (just in case the non-regex approach might be slow):

Python 3.2 (r32:88445, Feb 20 2011, 21:29:02) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from timeit import timeit
>>> setup = 'import string\nidentifier_chars = string.ascii_letters + string.digits + "._"\ntmp_str = []'
>>> timeit("re.sub(r'[^a-zA-Z_\.]', '_', 'somefilename.txt')", setup = "import re")
4.434621757767205
>>> setup = 'import string\nidentifier_chars = string.ascii_letters + string.digits + "._"\ntmp_str = []'
>>> timeit('"".join([c if c in identifier_chars else "_" for c in "somefilename.txt"])', setup)
3.3757537425069906
>>> |
|||
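One detail in the session above: the character class `[^a-zA-Z_\.]` does not list digits, so for names containing digits the regex and the join expression would give different results (the timed name, `somefilename.txt`, has none). The sketch below is an assumed variant with digits added to the regex, so that both approaches agree; the function names and the `compare` helper are hypothetical, and no timings are claimed.

```python
import re
import string
from timeit import timeit

identifier_chars = string.ascii_letters + string.digits + "._"
# Assumption: digits are added here so the regex accepts the same
# characters as identifier_chars.
NON_IDENTIFIER = re.compile(r"[^a-zA-Z0-9_.]")


def replace_with_regex(name):
    """Replace every disallowed character with '_' using a regex."""
    return NON_IDENTIFIER.sub("_", name)


def replace_with_join(name):
    """Replace every disallowed character with '_' one character at a time."""
    return "".join([c if c in identifier_chars else "_" for c in name])


def compare(number=100_000):
    """Time both variants on the file name used in the original session."""
    for func in (replace_with_regex, replace_with_join):
        seconds = timeit(lambda: func("somefilename.txt"), number=number)
        print(f"{func.__name__}: {seconds:.3f}s for {number} calls")


if __name__ == "__main__":
    # Both variants should agree on every sample before timing them.
    for sample in ("somefilename.txt", "aixc++.py", "file2.txt", "chmFile-"):
        assert replace_with_regex(sample) == replace_with_join(sample)
    compare()
```

Neither variant handles the leading-character rule (a letter or an underscore first); that would need a separate check.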
| msg132543 | Author: Mark Mc Mahon (markm) * | Date: 2011-03-29 22:14 | |
This issue has been fixed by changes made in issue7639 and issue11696. |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:56:33 | admin | set | github: 46946 |
| 2011-03-30 05:33:38 | loewis | set | status: open -> closed resolution: fixed |
| 2011-03-29 22:14:07 | markm | set | messages: + msg132543 |
| 2011-03-26 12:24:18 | markm | set | files: + make_id_fix_and_test.patch nosy: + markm messages: + msg132232 keywords: + patch |
| 2010-01-13 01:58:56 | brian.curtin | set | priority: normal stage: needs patch versions: + Python 2.7, - Python 2.5 |
| 2008-04-26 16:23:53 | loewis | set | messages: + msg65848 |
| 2008-04-26 16:13:55 | cdavid | set | messages: + msg65847 |
| 2008-04-26 16:02:31 | loewis | set | messages: + msg65846 |
| 2008-04-26 15:56:06 | cdavid | set | messages: + msg65845 |
| 2008-04-26 12:19:34 | loewis | set | nosy: + loewis messages: + msg65842 |
| 2008-04-26 04:18:42 | cdavid | create | |