vmime

Author	SHA1	Message	Date
Jan Engelhardt	aa60d00496	Fix a test failure in testNewFromString (#311) Fixes an oversight in d296c2d1.	2024-06-11 20:47:53 +02:00
Jan Engelhardt	d296c2d1d5	vmime: prevent loss of a space during text::createFromString (#306) ``` mailbox(text("Test München West", charsets::UTF_8), "a@b.de").generate(); ``` produces ``` =?us-ascii?Q?Test_?= =?utf-8?Q?M=C3=BCnchen?= =?us-ascii?Q?West?= <test@example.com> ``` The first space between ``Test`` and ``München`` is encoded as an underscore along with the first word: ``Test_``. The second space between ``München`` and ``West`` is encoded with neither of the two words and thus lost. Decoding the text results in ``Test MünchenWest`` instead of ``Test München West``. This is caused by how ``vmime::text::createFromString()`` handles transitions between 7-bit and 8-bit words: If an 8-bit word follows a 7-bit word, a space is appended to the previous word. The opposite case of a 7-bit word following an 8-bit word misses this behaviour. When one fixes this problem, a follow-up issue appears: ``text::createFromString("a b\xFFc d")`` tokenizes the input into ``m_words={word("a "), word("b\xFFc ", utf8), word("d")}``. This "right-side alignment" nature of the whitespace is a problem for word::generate(): As per RFC 2047, spaces between adjacent encoded words are just separators but not meant to be displayed. A space between an encoded word and a regular ASCII text is not just a separator but also meant to be displayed. When word::generate() outputs the b-word, it would have to strip one space, but only when there is a transition from encoded-word to unencoded word. word::generate() does not know whether d will be encoded or unencoded. The idea now is that we could change the tokenization of ``text::createFromString`` such that whitespace is at the start of words rather than at the end. With that, word::generate() need not know anything about the next word, but rather only the previous one. Thus, in this patch, 1. The tokenization of ``text::createFromString`` is changed to left-align spaces and the function is fixed to account for the missing space on transition. 2. ``word::generate`` learns how to steal a space character. 3. Testcases are adjusted to account for the shifted position of the space. Fixes: #283, #284 Co-authored-by: Vincent Richard <vincent@vincent-richard.net>	2024-05-21 15:55:06 +02:00
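The left-aligned tokenization described in d296c2d1d5 can be illustrated with a small standalone sketch. This is not vmime's code: `Token` and `tokenizeLeftAligned` are hypothetical names, only the space character is treated as whitespace, and a chunk counts as 8-bit as soon as it contains one byte >= 0x80. Each chunk carries its leading spaces, so a generator would only need to look at the previous token to decide whether a space is a separator or visible text.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical token type: the text (including any leading spaces)
// and whether it contains at least one non-ASCII byte.
struct Token {
    std::string text;
    bool eightBit;
};

// Sketch of a tokenizer that attaches whitespace to the START of the
// following word. Adjacent chunks of the same kind are merged.
std::vector<Token> tokenizeLeftAligned(const std::string& in) {
    std::vector<Token> out;
    std::size_t i = 0;
    while (i < in.size()) {
        const std::size_t start = i;
        // Leading spaces belong to the chunk that follows them.
        while (i < in.size() && in[i] == ' ') ++i;
        bool eightBit = false;
        while (i < in.size() && in[i] != ' ') {
            if (static_cast<unsigned char>(in[i]) >= 0x80) eightBit = true;
            ++i;
        }
        const std::string chunk = in.substr(start, i - start);
        if (!out.empty() && out.back().eightBit == eightBit) {
            out.back().text += chunk;          // same kind: extend previous token
        } else {
            out.push_back({chunk, eightBit});  // transition: start a new token
        }
    }
    return out;
}
```

When writing the commit's example input as a C++ literal, note that `"b\xFFc"` would be parsed as one oversized hex escape, so the literal has to be split, for example `"a b\xFF" "c d"`.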
Jan Engelhardt	c105165c6e	tests: switch a byte sequence in textTest (#305) Switch out the byte sequence by one that is similarly random, but one which happens to decode as valid UTF-8, such that the expected and actual strings are shown with reasonable characters on a terminal.	2024-05-21 15:48:26 +02:00
Jan Engelhardt	b447adbe37	Fixes/comments for guessBestEncoding (#304) * tests: add case for getRecommendedEncoding * vmime: avoid integer multiply wraparound in wordEncoder::guessBestEncoding If the input string is 42949673 characters long or larger, there will be integer overflow on 32-bit platforms when multiplying by 100. Switch that one computation to floating point. * vmime: update comment in wordEncoder::guessBestEncoding	2024-05-21 15:47:05 +02:00
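The threshold in b447adbe37 follows from 42949673 * 100 = 4294967300, which exceeds 2^32 - 1 = 4294967295. The sketch below contrasts a wrapping 32-bit percentage computation with a floating-point one. Both function names are hypothetical and the exact expression inside `wordEncoder::guessBestEncoding` is an assumption here; only the arithmetic is the point.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical: percentage computed entirely in 32-bit unsigned arithmetic.
// The multiplication wraps modulo 2^32 once asciiCount reaches 42949673.
std::uint32_t asciiPercentWrapping(std::uint32_t asciiCount, std::uint32_t total) {
    if (total == 0) return 100;
    const std::uint32_t scaled = static_cast<std::uint32_t>(asciiCount * 100u);
    return scaled / total;
}

// Hypothetical: the same ratio computed in floating point, which has
// enough range for any 32-bit count multiplied by 100.
double asciiPercentFloat(std::uint32_t asciiCount, std::uint32_t total) {
    if (total == 0) return 100.0;
    return 100.0 * asciiCount / total;
}
```

For a pure-ASCII input of 42949673 bytes, the wrapping variant yields 0 (the product wraps to 4), while the floating-point variant yields 100.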
Jan Engelhardt	97d15b8cd7	vmime: avoid changing SEVEN_BIT when encoding::decideImpl sees U+007F (#303) Do not switch to QP/B64 when encountering U+007F. U+007F is part of ASCII just as much as U+0001 is. Co-authored-by: Vincent Richard <vincent@vincent-richard.net>	2024-05-21 15:45:29 +02:00
vincent-richard	561746081f	Fixed possible recursion crash when parsing mailbox groups.	2022-01-25 10:28:20 +01:00
ibanic	5d78d879bb	Prevent accessing empty buffer	2021-05-15 22:32:24 +02:00
Jan Engelhardt	f4c611b736	Avoid force-encoding display names that fit within qcontent When the display name contains an At sign, or anything of the sort, libvmime would forcibly encode this to =?...?=, even if the line is fine ASCII which only needs quoting. rspamd takes excessive quoting as a sign of spam and penalizes such mails by raising the score (rule/match: TO_EXCESS_QP et al.)	2020-12-11 23:10:39 +01:00
vincent-richard	5c00f7867a	#238 Fixed whitespace between encoded words	2020-06-16 19:47:33 +02:00
vincent-richard	9a10a839ec	Added test.	2020-06-02 18:13:34 +02:00
Jan Engelhardt	b06e9e6f86	Skip delimiter lines that are not exactly equal to the boundary There is crap software out there that generates mails violating the prefix ban clause from RFC 2046 §5.1 ¶2. Switch vmime from a prefix match to an equality match, similar to what Alpine and Thunderbird do too.	2019-10-05 11:37:09 +02:00
Jan Engelhardt	df32418df5	Disregard whitespace between leading boundary hyphens and marker The way I read the RFC is that whitespace is not allowed before the boundary marker, only afterwards, so the checks for leading WS are removed, and the missing check for trailing WS is added. See RFC 2046 §5.1.1: """The boundary delimiter line is then defined as a line consisting entirely of two hyphen characters ("-", decimal value 45) followed by the boundary parameter value from the Content-Type header field, optional linear whitespace, and a terminating CRLF."""	2019-10-05 11:31:51 +02:00
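The two boundary commits above (b06e9e6f86 and df32418df5) can be summarized by a small check that follows the RFC 2046 §5.1.1 wording quoted in df32418df5: two hyphens, the boundary value, optional trailing linear whitespace, nothing else. This is a sketch only; `isDelimiterLine` is a hypothetical helper, not a vmime function, it expects the line with its CRLF already removed, and it does not handle the closing `--boundary--` form.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Sketch: true if `line` (without CRLF) is exactly "--" + boundary,
// optionally followed by spaces or tabs. No whitespace is accepted
// before the marker, and a longer marker that merely starts with the
// boundary value is rejected.
bool isDelimiterLine(const std::string& line, const std::string& boundary) {
    const std::string marker = "--" + boundary;
    if (line.compare(0, marker.size(), marker) != 0) {
        return false;  // does not start with "--boundary"
    }
    for (std::size_t i = marker.size(); i < line.size(); ++i) {
        if (line[i] != ' ' && line[i] != '\t') {
            return false;  // extra characters: prefix match only
        }
    }
    return true;
}
```

With boundary `abc`, the lines `--abc` and `--abc` followed by spaces or tabs are accepted, while `--abcdef`, ` --abc` and `-- abc` are not.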
Jan Engelhardt	d1190b496f	Improve address parser for malformed mailbox specifications Spammers use "Name <addr> <addr>" to trick some parsers. My expectations as to what the outcome should be are presented in the updated mailboxTest.cpp. The DFA in mailbox::parseImpl is hereby redone so as to pick the rightmost address-looking portion as the address, rather than something in between. While doing so, it will also no longer mangle the name part (it does this by keeping an "as_if_name" variable around until the end).	2019-01-25 08:11:07 +01:00
Jan Engelhardt	cc18aa39c1	tests: add more malformation tests to mailboxTest	2019-01-24 13:17:52 +01:00
Vincent Richard	df135b5a8b	Removed 'stringProxy' since COW std::string is no longer valid in C++11.	2018-09-15 07:41:26 +02:00
Vincent Richard	b55bdc9c0b	Code style and clarity.	2018-09-05 23:54:48 +02:00
Vincent Richard	f173b0a535	Avoid copy by passing shared_ptr<> with const reference.	2018-08-18 16:08:25 +02:00
Vincent Richard	abba40e97d	Added unit test related to PR #192.	2018-03-12 20:33:27 +01:00
Vincent Richard	c53e914ea5	Always ignore newlines between words.	2017-01-02 21:40:38 +01:00
Vincent Richard	5424aa2381	Fixed #149: don't lose charset when fixing invalid broken words.	2016-11-05 13:31:54 +01:00
Vincent Richard	4fd8976515	Issue #126: more warnings fixed.	2016-03-13 20:15:22 +01:00
Vincent Richard	c446afddd4	Estimate generated size of parameterized field.	2015-06-07 21:32:44 +02:00
Vincent Richard	e88b8eeac2	Fixed parsing of UTF8 email addresses (RFC-2047 local part + IDNA domain name).	2015-05-03 19:17:00 +02:00
Vincent Richard	19321f9026	Fixed unit test so that it does not depend on the current locale charset.	2015-02-19 21:24:41 +01:00
Vincent Richard	c5c66f9fdc	Issue #103: fix badly encoded words.	2015-02-16 18:43:03 +01:00
Vincent Richard	e7739c0efe	Fixed issue #98: support for wrongly padded B64 words.	2015-01-14 19:35:28 +01:00
Vincent Richard	03a0e36e91	Added support for language specification in RFC-2047 encoded words and RFC-2231 parameter values.	2014-06-30 22:48:42 +02:00
Vincent Richard	0863f50c26	Allow choosing between encoding modes for parameter values.	2014-06-17 21:08:22 +02:00
Vincent Richard	4aefcca374	Removed useless 'virtual' inheritance (fixed issue #84).	2014-06-06 19:26:01 +02:00
Vincent Richard	30ea54f269	Fixed parsing of empty lines in header field value.	2014-06-01 20:46:17 +02:00
Vincent Richard	ef892af655	Do not make calls to setlocale() in a library. Use default user locale in tests and examples.	2014-01-16 00:27:51 +01:00
Vincent Richard	7e265b05f4	Simplified types for better readability. Use appropriate types (size_t, byte_t...). Minor warning fixes.	2013-12-10 08:52:51 +01:00
Vincent Richard	f9913fa28a	Boost/C++11 shared pointers.	2013-11-21 22:16:57 +01:00
Vincent Richard	29954e5e50	Fixed group parsing in mailboxList.	2013-10-16 19:47:24 +02:00
Vincent Richard	b886cd4864	Refactored the way embedded objects are referenced in MHTML.	2013-07-11 18:06:26 +02:00
Vincent Richard	86f0a63802	Do not QP-encode CRLFs when content type is text.	2013-06-27 13:56:55 +02:00
Vincent Richard	de659db112	Removed debug printf.	2013-06-27 07:54:33 +02:00
Vincent Richard	1a30cfe41b	Unit tests for content handlers.	2013-06-26 21:41:42 +02:00
Vincent Richard	895b07cae9	Added support for SIZE SMTP extension (RFC-1870).	2013-06-24 15:32:40 +02:00
Vincent Richard	2e5574b146	Added support for transport padding in boundary (issue #38).	2013-06-13 12:00:42 +02:00
Vincent Richard	02e1cf65ab	Fixed comment.	2013-06-09 10:24:56 +02:00
Vincent Richard	9d2703c376	Added support for charset conversion with ICU (thanks to Mehmet Bozkurt).	2013-03-25 12:32:48 +01:00
Vincent Richard	32eb1ebe34	Strip spaces at end of header lines (Zarafa).	2013-03-24 15:50:16 +01:00
Vincent Richard	21945be4c4	Fixed warnings and 64-bit issues.	2013-03-24 12:30:26 +01:00
Vincent Richard	495526a5e6	Let whitespace break the value of a parameterized header field, not just a ';' (thanks to Zarafa).	2013-03-24 11:35:08 +01:00
Vincent Richard	84415da8e1	Fixed parsing header field value on next line.	2013-03-24 10:02:23 +01:00
Vincent Richard	da2797702f	Updated tests for charset conversion. Added test for UTF-7 encoding availability. Added test for input buffer underflow in charsetFilteredOutputStream. Refactored charset conversion tests and removed useless tests.	2013-03-18 09:35:04 +01:00
Vincent Richard	32a80f6c1e	Fixed mailbox and mailbox group parsing. Added unit tests.	2013-03-11 10:05:09 +01:00
Vincent Richard	1df8c6cd0e	Refactored unit tests.	2013-03-08 08:19:55 +01:00
Vincent Richard	8378b350df	Throw exception when an invalid value type is set in a header field.	2013-02-27 14:59:37 +01:00

125 Commits