Fix asEmailMessage() raises ValueError when message has multiple TO recipients #477
Draft
glorat wants to merge 16 commits into TeamMsgExtractor:next-release from
…lease Add pull request tests
…lease Version 0.51.0
…lease Fix readme
…lease Version 0.51.1
…lease Version 0.52.0
…lease Version 0.53.1
…lease Version 0.53.2
…lease Version 0.54.0
…lease Update changelog
…lease Version 0.54.1
…lease Version 0.55.0
Multiple To recipients produce duplicate TO keys in msg.header. The header-copy loop was assigning each individually, but EmailMessage enforces a single TO field. Duplicate keys are now merged into one comma-separated value before assignment. Adds example multi-to.msg and tests covering both parsing and EML conversion.
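A minimal sketch of the failure and the merge described above, using only the standard library. The list of `(name, value)` pairs stands in for the parsed `msg.header`, and `merge_duplicate_headers` is a hypothetical helper, not the function added in this PR.

```python
from email.message import EmailMessage


def second_to_assignment_fails():
    """Show that EmailMessage rejects a second To header."""
    eml = EmailMessage()
    eml["To"] = "alice@example.com"
    try:
        eml["To"] = "bob@example.com"
    except ValueError:
        return True
    return False


def merge_duplicate_headers(pairs):
    """Merge repeated header names into one comma-separated value.

    Hypothetical helper: `pairs` is a list of (name, value) tuples.
    """
    merged = {}
    for name, value in pairs:
        if name in merged:
            merged[name] = merged[name] + ", " + value
        else:
            merged[name] = value
    return merged


# Stand-in for the headers of a message with two To recipients.
pairs = [
    ("To", "alice@example.com"),
    ("To", "bob@example.com"),
    ("Subject", "Hello"),
]

eml = EmailMessage()
for name, value in merge_duplicate_headers(pairs).items():
    eml[name] = value  # each name is now assigned exactly once
```

After the merge, `eml.get_all("To")` holds a single header that lists both recipients.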
The header dedup dict was keyed case-sensitively, so 'TO' and 'To' were treated as separate keys and both assigned to the EmailMessage, triggering the single-field constraint. Dedup now keys by lowercased name while preserving the original casing of the first occurrence. Adds multi-to-to.msg and tests covering both parsing and EML conversion.
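The case-insensitive variant can be sketched as follows. The helper name and the `(name, value)` input shape are assumptions for illustration; the PR's own code may be structured differently.

```python
def merge_headers_case_insensitive(pairs):
    """Merge headers whose names differ only by case.

    Hypothetical helper: keys the lookup by lowercased name but keeps the
    casing of the first occurrence in the output.
    """
    merged = {}  # lowercased name -> [original name, combined value]
    for name, value in pairs:
        key = name.lower()
        if key in merged:
            merged[key][1] += ", " + value
        else:
            merged[key] = [name, value]
    return [(name, value) for name, value in merged.values()]


result = merge_headers_case_insensitive([
    ("TO", "alice@example.com"),
    ("To", "bob@example.com"),
    ("Subject", "Hello"),
])
```

Here `result` contains one `TO` entry with both addresses, followed by the untouched `Subject` entry.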
… headers Improper header unfolding left tab characters mid-value after stripping newlines. RFC 2047 encoded words using invalid charsets (e.g. malformed GB2312) were passed raw to EmailMessage, whose folding code then failed to re-encode the decoded replacement characters via as_bytes(). Fix uses proper RFC 5322 unfolding and re-encodes problematic encoded words as UTF-8 before assigning to EmailMessage.
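A rough sketch of the two steps named above, under stated assumptions. RFC 5322 unfolding removes a line break that is immediately followed by whitespace; this sketch additionally collapses the leftover whitespace run to one space, which is an assumption about the desired result. The helpers `unfold` and `reencode_bad_words` are hypothetical names, and decoding with `errors="replace"` is one possible way to make a malformed word safe before re-encoding it as UTF-8.

```python
import re
from email.errors import HeaderParseError
from email.header import Header, decode_header

# Matches one RFC 2047 encoded word: =?charset?encoding?payload?=
ENCODED_WORD = re.compile(r"=\?[^?\s]+\?[bBqQ]\?[^?\s]*\?=")


def unfold(value):
    """Join folded header lines, replacing each fold with a single space."""
    return re.sub(r"\r?\n[ \t]+", " ", value)


def _reencode(match):
    word = match.group(0)
    try:
        raw, charset = decode_header(word)[0]
    except HeaderParseError:
        return word  # payload is not decodable at all; leave it alone
    try:
        raw.decode(charset)
        return word  # decodes cleanly with the declared charset
    except LookupError:
        text = raw.decode("latin-1")  # unknown charset name
    except UnicodeDecodeError:
        text = raw.decode(charset, errors="replace")
    # maxlinelen=0 asks Header.encode not to wrap the result.
    return Header(text, "utf-8").encode(maxlinelen=0)


def reencode_bad_words(value):
    """Rewrite encoded words that fail to decode as UTF-8 encoded words."""
    return ENCODED_WORD.sub(_reencode, value)


# b"\x81\x40" base64-encodes to "gUA=" and is not valid GB2312.
broken = "=?gb2312?B?gUA=?= <sender@example.com>"
fixed = reencode_bad_words(broken)
unfolded = unfold("alice@example.com,\r\n\tbob@example.com")
```

Words that already decode with their declared charset pass through unchanged, so only the problematic ones are rewritten.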
Real-world emails often label headers as GB2312 but use byte sequences only valid in GBK (a strict superset). The previous latin-1 fallback produced garbled display names. Now tries GBK/CP936 before latin-1 when a GB2312-declared encoded word fails to decode.
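The fallback order can be illustrated with a small helper. `decode_with_fallback` is a hypothetical name, and the bytes `0x81 0x40` are used as a test case because they are valid in GBK but lie outside the GB2312 byte range.

```python
def decode_with_fallback(raw, charset):
    """Decode header bytes, trying GBK before latin-1 for GB2312 input.

    Hypothetical helper illustrating the fallback order described above.
    """
    candidates = [charset]
    if charset.lower() == "gb2312":
        candidates.append("gbk")  # Python also accepts "cp936" for this codec
    candidates.append("latin-1")  # last resort: maps every byte to a character
    for candidate in candidates:
        try:
            return raw.decode(candidate)
        except (LookupError, UnicodeDecodeError):
            continue
    return raw.decode("latin-1", errors="replace")


gbk_only = decode_with_fallback(b"\x81\x40", "gb2312")
plain_gb2312 = decode_with_fallback("你好".encode("gb2312"), "gb2312")
```

With this order, `gbk_only` decodes through the GBK codec instead of falling through to latin-1, while valid GB2312 input is still decoded by the declared charset.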
Multiple To recipients produce duplicate TO keys in msg.header. The header-copy loop was assigning each individually, but EmailMessage enforces a single TO field. Duplicate keys are now merged into one comma-separated value before assignment.
Adds example multi-to.msg and tests covering both parsing and EML conversion.
CHECKLIST
next-release branch (or v0.29 if applicable)?