Differences
This shows you the differences between two versions of the page.
| files:encoding [2016/09/30 00:50] – admin | files:encoding [2018/10/04 17:14] (current) – external edit 127.0.0.1 | ||
|---|---|---|---|
| Line 3: | Line 3: | ||
| ===== Preconditions ===== | ===== Preconditions ===== | ||
| - | - The file can have one encoding (same as code page). Encoding can be as Unicode ( UTF16 LE, BE (1200, 1201), UTF8 (65001) ) as not Unicode (for example 1252 (Western European) etc). | + | - The file can have one encoding (same as code page). Encoding can be Unicode ( UTF16 LE, BE (1200, 1201), UTF8 (65001) ) and not Unicode (for example 1252 (Western European) etc). |
| - There are several places, where encoding conversion can be applied to document: Open, Save As, New, Search and Replace | - There are several places, where encoding conversion can be applied to document: Open, Save As, New, Search and Replace | ||
| - | - The encoding can be selected/ | + | - The encoding can be selected/ |
| - | - If encoding for document once changed by the user, this preference has priority over all the rest of settings. Preferences are machine specific but can be reset, if HippoEDIT temp files would be deleted or format of them would change in new version. | + | - If encoding for document once changed by the user, this preference has priority over all the rest of settings. Preferences are machine specific but can be reset, if HippoEDIT temp files would be deleted or format of them would change in the new version. |
| So, how all this works together (or designed to work ) : | So, how all this works together (or designed to work ) : | ||
| Line 32: | Line 32: | ||
| * Encoding selected in File Save dialog | * Encoding selected in File Save dialog | ||
| * Current document encoding | * Current document encoding | ||
| - | * During save, HippoEDIT checks the consistency of current document encoding and encoding found with encoding strings (XML, HTML etc). If encoding does not match, user would be asked to select which encoding to use | + | * During save, HippoEDIT checks the consistency of current document encoding and encoding found with encoding strings (XML, HTML etc). If encoding does not match, |
| - | * Because HippoEDIT internally works with Unicode representation of text (UTF16 LE), on save, can happen that current text could not be saved without loss of information with currently selected encoding. In this case, HippoEDIT should pop-up a warning, informing the user about possible data loss and suggest to save the document as Unicode or using some another encoding. This behavior controlled by flag Check encoding accuracy in Tools-> | + | * Because HippoEDIT internally works with Unicode representation of text (UTF16 LE), on saving, can happen that current text could not be saved without loss of information with currently selected encoding. In this case, HippoEDIT should pop-up a warning, informing the user about possible data loss and suggest to save the document as Unicode or using some another encoding. This behavior controlled by flag Check encoding accuracy in Tools-> |
| ===== Search and Replace ===== | ===== Search and Replace ===== | ||
| - | Search and Replace encoding uses same logic as for Open/Save file, just interactive selection of encoding, with Open/Save dialog, not available. | + | Search and Replace encoding uses same logic as for Open/Save file, but interactive selection of encoding, with Open/Save dialog, not available. |
| ===== If there are problems ===== | ===== If there are problems ===== | ||
| Line 42: | Line 42: | ||
| So, if you see that documents are open with wrong encoding, you have several choices of how to solve this: | So, if you see that documents are open with wrong encoding, you have several choices of how to solve this: | ||
| * Explicitly select correct encoding in File Open dialog | * Explicitly select correct encoding in File Open dialog | ||
| - | * Set, for syntax you are using, forced encoding: | + | * Set, for syntax you are using, forced encoding |
| - | <code xml> | + | |
| - | < | + | |
| - | </ | + | |
| - | in SPECIFICATION section of schema spec file. | + | |
| * Disable extended auto-detection (IE algorithms). It can return the wrong result if data for analysis is not sufficient. | * Disable extended auto-detection (IE algorithms). It can return the wrong result if data for analysis is not sufficient. | ||
| It can be done with xml flags in settings.xml, | It can be done with xml flags in settings.xml, | ||
| Line 53: | Line 49: | ||
| </ | </ | ||
| - | Also from now on, extended encoding detection is enabled by default only for syntaxes inherited from deftext (as Plain Text, XML, and HTML). | + | Also from now on, extended encoding detection is enabled by default only for syntaxes inherited from //deftext// (as Plain Text, XML, and HTML). |
| You can control encoding even in more granular way by disabling some encoding detection methods, which in most cases do not provide false positives. As: | You can control encoding even in more granular way by disabling some encoding detection methods, which in most cases do not provide false positives. As: | ||
| + | * **extended** - heuristically based detection of encoding | ||
| + | * **min_confidence** - minimal confidence level for extended detection, default is 90, may be higher than 100 | ||
| * **bom** (default //true//) - use BOM signs for encoding detection | * **bom** (default //true//) - use BOM signs for encoding detection | ||
| * **unicode** (default //true//) - use UTF16 (LE/BE) statistic detection logic in addition to BOM detection (if BOM is not defined) | * **unicode** (default //true//) - use UTF16 (LE/BE) statistic detection logic in addition to BOM detection (if BOM is not defined) | ||
| Line 61: | Line 59: | ||
| * **utf8** (default //true//) - use extended algorithms for UTF8 detection in addition to BOM detection (if BOM is not defined) | * **utf8** (default //true//) - use extended algorithms for UTF8 detection in addition to BOM detection (if BOM is not defined) | ||
| <code xml> | <code xml> | ||
| - | < | + | < |
| </ | </ | ||
| ===== How to set default encoding for specific syntax ===== | ===== How to set default encoding for specific syntax ===== | ||
| Line 93: | Line 91: | ||
| Doing of changes to [[terms: | Doing of changes to [[terms: | ||
| - | Safest | + | The safest |