EditorConfig: Support charset #21

Manually merged
BaumiCoder merged 56 commits from feature/3-charset into main 2025-09-12 17:36:17 +02:00

Resolves #3

User perspective

Include the `charset` property in the `check` and `fix` commands. The charset determined from the file content may be incorrect. If so, the user can disable the handling of the `charset` property (see CLI help).

Developer perspective

This pull request introduces the general architecture for the handlers of the different properties. Each handler has to provide a `check` and a `fix` method to process the given file (provided as a `Path`). Each handler decides in its `build` function whether handling is necessary for a given set of EditorConfig properties (returning `None` if it is not).
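
The handler architecture described above might look roughly like the following sketch. It is an illustration under assumptions, not the code from this pull request: the trait name `PropertyHandler`, the struct `Utf8Handler`, the `HashMap` used for the properties, and the Latin 1 fallback in `fix` are all hypothetical. Only the `check`/`fix`/`build` shape, the `Path` parameter, and returning `None` when no handling is needed come from the description.

```rust
use std::collections::HashMap;
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical trait: every property handler offers `check` and `fix`.
pub trait PropertyHandler {
    /// Returns Ok(true) if the file already satisfies the property.
    fn check(&self, file: &Path) -> io::Result<bool>;
    /// Rewrites the file so that it satisfies the property.
    fn fix(&self, file: &Path) -> io::Result<()>;
}

/// Hypothetical handler that only knows about `charset = utf-8`.
pub struct Utf8Handler;

impl Utf8Handler {
    /// Returns `None` if the given properties do not require this handler.
    pub fn build(properties: &HashMap<String, String>) -> Option<Self> {
        match properties.get("charset").map(String::as_str) {
            Some("utf-8") => Some(Utf8Handler),
            _ => None,
        }
    }
}

impl PropertyHandler for Utf8Handler {
    fn check(&self, file: &Path) -> io::Result<bool> {
        let bytes = fs::read(file)?;
        Ok(std::str::from_utf8(&bytes).is_ok())
    }

    fn fix(&self, file: &Path) -> io::Result<()> {
        let bytes = fs::read(file)?;
        if std::str::from_utf8(&bytes).is_ok() {
            return Ok(()); // already valid UTF-8, nothing to do
        }
        // Assumption for this sketch only: treat the input as Latin 1,
        // where every byte maps to the code point with the same value.
        let text: String = bytes.iter().map(|&b| b as char).collect();
        fs::write(file, text)
    }
}
```

With this shape, a caller can collect only the handlers whose `build` returned `Some` and run `check` or `fix` on each file without knowing which properties are involved.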

Change file to a mut reference because the read method needs it.
Log errors with filename and create a general error for the result.
Now it at least handles UTF-8 files correctly.
Only CheckErrors are somewhat "normal" and should not stop processing,
so that the errors of all files can be seen.
If ecformat is used as a lib, the list of errors may be helpful.
When using ecformat as a lib, it is also of interest in which
files the errors occurred.
Access is necessary to handle these types when ecformat is used as a lib.
(Also solves the warning about unused methods at the getters.)
Handles all kinds of pairs directly and is the same crate as
the one used to determine the actual charset.
Use EditorConfig Charset enum for better distinction.
Needs some special characters to be different from UTF-8.
It is very similar to macintosh; even characters that are encoded
differently in Latin 1 and macintosh are not enough to detect the correct
charset from the file. Therefore, we have to exclude the macintosh charset.
The charset-specific function should only be necessary for handling
the charset property. All other property handlers should rely on a
charset without the need to determine it from the file contents.
REUSE lint does not detect the license headers in UTF-16 files.
Some are failing at the moment; maybe some special characters are necessary
in the test files to determine the correct charset after the change.
This fixes most of the failing tests (only Latin 1 still fails).
Maybe switch to Path later to avoid the problem of when to rewind
(or not, for performance reasons).
Some special characters need to be in the files so that Latin 1
can be determined after converting to it, because otherwise UTF-8 is
determined after converting to Latin 1.
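
The reason is that pure ASCII text has identical bytes in Latin 1 and UTF-8, so a detector cannot tell them apart. The following is a minimal sketch of that effect using only the standard library; the helpers `encode_latin1` and `looks_like_utf8` are hypothetical names for illustration and are not the detection code used in ecformat.

```rust
/// Encodes text as Latin 1. Returns None if a character lies outside
/// U+0000..=U+00FF and therefore has no Latin 1 representation.
fn encode_latin1(text: &str) -> Option<Vec<u8>> {
    text.chars()
        .map(|c| {
            let code = c as u32;
            if code <= 0xFF { Some(code as u8) } else { None }
        })
        .collect()
}

/// Very rough stand-in for a detector: are the bytes valid UTF-8?
fn looks_like_utf8(bytes: &[u8]) -> bool {
    std::str::from_utf8(bytes).is_ok()
}

/// Shows why test files need non-ASCII characters.
fn demo() -> (bool, bool) {
    // ASCII only: the Latin 1 bytes are also valid UTF-8.
    let ascii = encode_latin1("plain text").unwrap();
    // With umlauts and sharp s: the Latin 1 bytes are not valid UTF-8.
    let special = encode_latin1("Grüße").unwrap();
    (looks_like_utf8(&ascii), looks_like_utf8(&special))
}
```

Here `demo()` returns `(true, false)`: only the file with special characters can be distinguished from UTF-8 after conversion.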
This allows using some more helper functions like fs::write,
removes the need to keep track of when a rewind() call on the file is
necessary, and allows using the file name in warning messages.
Separate module for test helper functions to allow using them
in the integration tests as well.
Verbosity is not needed when used as a lib (or for integration tests).
More flexible when used as a library, since it does not take ownership.
The duplicated code for the parameter values with #[values(...)]
seems to be unavoidable, as rstest_reuse does not work here
(maybe because of the enum type or
because of exporting the template from tests_utils).
For more ergonomic use of the CheckErrorList
Compiler warning in the charset integration test
due to only one enum variant. Solved with a dummy to make sure
that this workaround will be removed later, when it is no longer
necessary (removing a disabled warning could be forgotten later on).
Singular "test" sounds nicer than plural "tests"
Do not mention private method in doc comments.
The previous function path was still working (e.g., for the mouse-over
in VSCodium with rust-analyzer), but was not correct anymore.
Now only the function name is used, which is enough due to the
"use" of the function in this module.
1. std
2. external crates
3. mod
4. internal uses
Put all helper functions into an (inline) module, to be able to make
public only the functions that should be used from the CharsetHandler.
The other helper functions are only for private use by functions
inside the utils module.
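
As a rough sketch of this pattern (the function names `normalize_label` and `trim_label` are hypothetical and do not come from the ecformat sources), an inline module can expose a single function while keeping its helpers private:

```rust
mod utils {
    /// Public entry point: the only function visible outside `utils`.
    pub fn normalize_label(label: &str) -> String {
        trim_label(label).to_ascii_lowercase()
    }

    /// Private helper: callable only from inside the `utils` module.
    fn trim_label(label: &str) -> &str {
        label.trim()
    }
}
```

Code outside the module can call `utils::normalize_label`, whereas a call to `utils::trim_label` is rejected by the compiler because the function is private.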
BaumiCoder manually merged commit 67b1062d4f into main 2025-09-12 17:36:17 +02:00