```
mailbox(text("Test München West", charsets::UTF_8), "a@b.de").generate();
```
produces
```
=?us-ascii?Q?Test_?= =?utf-8?Q?M=C3=BCnchen?= =?us-ascii?Q?West?= <a@b.de>
```
The first space between ``Test`` and ``München`` is encoded as an
underscore along with the first word: ``Test_``. The second space
between ``München`` and ``West`` is encoded with neither of the two
words and thus lost. Decoding the text results in ``Test
MünchenWest`` instead of ``Test München West``.
This is caused by how ``vmime::text::createFromString()`` handles
transitions between 7-bit and 8-bit words: If an 8-bit word follows a
7-bit word, a space is appended to the previous word. The opposite
case, a 7-bit word following an 8-bit word, *lacks* this handling.
When one fixes this problem, a follow-up issue appears:
``text::createFromString("a b\xFFc d")`` tokenizes the input into
``m_words={word("a "), word("b\xFFc ", utf8), word("d")}``. This
"right-side alignment" nature of the whitespace is a problem for
word::generate():
As per RFC 2047, spaces between adjacent encoded words are just
separators but not meant to be displayed. A space between an encoded
word and a regular ASCII text is not just a separator but also meant
to be displayed.
When word::generate() outputs the ``b\xFFc`` word, it would have to
strip one space, but only when there is a transition from an encoded
word to an unencoded word. word::generate() does not know whether the
following word (``d`` here) will be encoded or unencoded.
The idea now is that we could change the tokenization of
``text::createFromString`` such that whitespace is at the *start* of
words rather than at the end. With that, word::generate() need not
know anything about the next word, but rather only the *previous*
one.
Thus, in this patch,
1. The tokenization of ``text::createFromString`` is changed to
left-align spaces and the function is fixed to account for
the missing space on transition.
2. ``word::generate`` learns how to steal a space character.
3. Testcases are adjusted to account for the shifted
position of the space.
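The three steps above can be illustrated with a minimal, self-contained
sketch. It does not use vmime: ``sketchWord``, ``tokenizeLeftAligned``,
``qEncode`` and ``generateSketch`` are hypothetical names, the charset is
hard-coded to utf-8, and 7-bit words are emitted literally rather than as
us-ascii encoded words. It only shows why left-aligned whitespace lets the
generator decide by looking at the *previous* word.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for vmime::word: the text (possibly starting with
// one space) and whether it contains 8-bit bytes and so needs encoding.
struct sketchWord {
    std::string text;
    bool eightBit;
};

// Split on spaces; each separating space is attached to the START of the
// word that follows it ("left-aligned" whitespace).
std::vector<sketchWord> tokenizeLeftAligned(const std::string& in) {
    std::vector<sketchWord> out;
    std::string cur;
    bool eight = false;
    for (char ch : in) {
        if (ch == ' ' && !cur.empty()) {
            out.push_back({cur, eight});
            cur.clear();
            eight = false;
        }
        cur += ch;
        if (static_cast<unsigned char>(ch) >= 0x80) eight = true;
    }
    if (!cur.empty()) out.push_back({cur, eight});
    return out;
}

// Simplified RFC 2047 "Q" encoding: ASCII letters and digits are kept,
// a space becomes '_', everything else becomes =XX.
std::string qEncode(const std::string& s) {
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    for (unsigned char c : s) {
        const bool plain = (c >= '0' && c <= '9') || (c >= 'A' && c <= 'Z') ||
                           (c >= 'a' && c <= 'z');
        if (plain) {
            out += static_cast<char>(c);
        } else if (c == ' ') {
            out += '_';
        } else {
            out += '=';
            out += hex[c >> 4];
            out += hex[c & 15];
        }
    }
    return out;
}

// Generate a header value. Only the previous word is inspected.
std::string generateSketch(const std::vector<sketchWord>& words) {
    std::string out;
    for (std::size_t i = 0; i < words.size(); ++i) {
        const sketchWord& w = words[i];
        if (!w.eightBit) {
            out += w.text;  // literal text; its leading space is displayed
            continue;
        }
        std::string payload = w.text;
        if (i > 0) {
            out += ' ';  // separator in front of the encoded word
            const bool prevEncoded = words[i - 1].eightBit;
            // After unencoded text the separator itself is displayed, so
            // the leading space is "stolen" from the encoded payload.
            if (!prevEncoded && !payload.empty() && payload[0] == ' ')
                payload.erase(0, 1);
        }
        out += "=?utf-8?Q?" + qEncode(payload) + "?=";
    }
    return out;
}
```

With this sketch, ``Test München West`` (UTF-8 bytes) comes out as
``Test =?utf-8?Q?M=C3=BCnchen?= West``, so both spaces survive decoding.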
Fixes: #283, #284
Co-authored-by: Vincent Richard <vincent@vincent-richard.net>
Replace the byte sequence with one that is similarly random, but
which happens to decode as valid UTF-8, such that the expected and
actual strings are shown with reasonable characters on a terminal.
* tests: add case for getRecommendedEncoding
* vmime: avoid integer multiply wraparound in wordEncoder::guessBestEncoding
If the input string is 42949673 characters long or larger, there will
be integer overflow on 32-bit platforms when multiplying by 100.
Switch that one computation to floating point.
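The wraparound can be reproduced in isolation. The following is a
minimal sketch, not the code from wordEncoder::guessBestEncoding;
``percentWrapping`` and ``percentFloating`` are hypothetical helpers that
model the 32-bit case with ``std::uint32_t``.

```cpp
#include <cstdint>

// Integer version: count * 100 is evaluated modulo 2^32 and wraps once
// count reaches 42949673 (42949673 * 100 = 4294967300 > 4294967295).
inline std::uint32_t percentWrapping(std::uint32_t count, std::uint32_t total) {
    if (total == 0) return 0;
    const std::uint32_t scaled = static_cast<std::uint32_t>(count * 100u);
    return scaled / total;
}

// Floating-point version: the multiplication happens in double, which
// represents all values in this range exactly.
inline double percentFloating(std::uint32_t count, std::uint32_t total) {
    if (total == 0) return 0.0;
    return static_cast<double>(count) * 100.0 / static_cast<double>(total);
}
```

For ``count == total == 42949673`` the integer version yields 0 instead
of 100, while the floating-point version yields 100.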
* vmime: update comment in wordEncoder::guessBestEncoding
* vmime: avoid changing SEVEN_BIT when encoding::decideImpl sees U+007F
Do not switch to QP/B64 when encountering U+007F.
U+007F is part of ASCII just as much as U+0001 is.
---------
Co-authored-by: Vincent Richard <vincent@vincent-richard.net>
* build: replace class noncopyable by C++11 deleted function declaration
C++11 is mandatory since commit v0.9.2-48-g8564b2f8, therefore we can
exercise the =delete keyword in class declarations to prohibit
copying.
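As an illustration of the pattern (the class name ``uniqueResource`` is
made up for this sketch and is not a vmime class), deleted copy
operations replace inheriting from a noncopyable base:

```cpp
#include <type_traits>

// Copying is prohibited directly in the class declaration.
class uniqueResource {
public:
    uniqueResource() = default;
    uniqueResource(const uniqueResource&) = delete;
    uniqueResource& operator=(const uniqueResource&) = delete;
};

static_assert(!std::is_copy_constructible<uniqueResource>::value,
              "copy construction must be rejected");
static_assert(!std::is_copy_assignable<uniqueResource>::value,
              "copy assignment must be rejected");
```

An attempted copy now fails at compile time with a diagnostic that
names the deleted function.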
* build: resolve -Woverloaded-virtual warnings
context.hpp:109:26: warning: ‘virtual vmime::context&
vmime::context::operator=(const vmime::context&)’ was hidden
[-Woverloaded-virtual=]
109 | virtual context& operator=(const context& ctx);
| ^~~~~~~~
generationContext.hpp:153:28: note: by ‘vmime::generationContext&
vmime::generationContext::operator=(const vmime::generationContext&)’
153 | generationContext& operator=(const generationContext& ctx);
| ^~~~~~~~
AFAICS, there is no point in having "virtual" on an assignment operator.
Any derived classes' operator= has different signature anyway.
It is also the only class with a virtual operator=, so that's an indicator
for oddness as well.
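A reduced sketch of the situation, with hypothetical ``baseContext`` and
``derivedContext`` types standing in for the two vmime classes: once
"virtual" is dropped, the derived operator= simply calls the base one
explicitly and nothing is hidden behind a virtual declaration.

```cpp
// Hypothetical base class with a plain (non-virtual) assignment operator.
struct baseContext {
    int level = 0;

    baseContext() = default;
    baseContext(const baseContext&) = default;
    virtual ~baseContext() = default;

    baseContext& operator=(const baseContext& other) {
        level = other.level;
        return *this;
    }
};

// The derived operator= takes a different parameter type, so it could
// never override a virtual base operator= in the first place.
struct derivedContext : baseContext {
    int width = 78;

    derivedContext() = default;
    derivedContext(const derivedContext&) = default;

    derivedContext& operator=(const derivedContext& other) {
        baseContext::operator=(other);  // copy the base part explicitly
        width = other.width;
        return *this;
    }
};
```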
* build: resolve -Wdeprecated-declarations warnings
encoding.cpp: In static member function "static const vmime::encoding
vmime::encoding::decideImpl(std::__cxx11::basic_string<char>::const_iterator,
std::__cxx11::basic_string<char>::const_iterator)":
encoding.cpp:161:29: warning: "std::binder2nd<_Operation>
std::bind2nd(const _Operation&, const _Tp&) [with _Operation =
less<unsigned char>; _Tp = int]" is deprecated: use "std::bind" instead
[-Wdeprecated-declarations]
161 | std::bind2nd(std::less<unsigned char>(), 127)
| ~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
C++11 is mandatory, so just use a lambda already.
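A minimal sketch of such a replacement, assuming the predicate is used
with ``std::count_if`` (the warning does not show the surrounding call,
and ``countBelow127`` is a hypothetical helper name). The comparison
mirrors the ``std::less<unsigned char>(), 127`` expression quoted in
the warning:

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Counts the bytes whose unsigned value is below 127, using a lambda
// where std::bind2nd(std::less<unsigned char>(), 127) was used before.
inline std::size_t countBelow127(std::string::const_iterator begin,
                                 std::string::const_iterator end) {
    return static_cast<std::size_t>(
        std::count_if(begin, end, [](unsigned char c) { return c < 127; }));
}
```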
* Add parsing feedback via parsingContext
Changes the parsing context to be modifiable to be able to provide
feedback on the parsing. This allows the user to check if header
recovery was necessary, for example, while parsing the current message.
Signed-off-by: Ben Magistro <koncept1@gmail.com>
Co-authored-by: Vincent Richard <vincent@vincent-richard.net>
* Allow appending of local hostname to be configured via parsing context
Signed-off-by: Ben Magistro <koncept1@gmail.com>
Co-authored-by: Vincent Richard <vincent@vincent-richard.net>
When the ``sender`` function argument is the empty object, vmime
would still attempt to use it at ``sender.getEmail().generate()``,
but that produces just ``@``. As sendmail is called with ``-f @``,
this shows up in postfix's logs as ``<""@>``.
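One possible guard is sketched below. This is an assumption about how
the symptom could be avoided, not the code from the patch;
``appendSenderArgs`` is a hypothetical helper, and only the ``-f`` option
mentioned above is taken from the source.

```cpp
#include <string>
#include <vector>

// Appends "-f <address>" to the sendmail argument list only when the
// generated sender address is meaningful. An empty mailbox generates
// just "@", which is treated like an empty string here.
inline void appendSenderArgs(std::vector<std::string>& args,
                             const std::string& senderAddress) {
    if (senderAddress.empty() || senderAddress == "@") {
        return;  // leave the envelope sender to sendmail's default
    }
    args.push_back("-f");
    args.push_back(senderAddress);
}
```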
* build: add FreeBSD compilation support
* build: unbreak compilation with clang libc++
unary_function is obsolete with C++11 and removed in C++17.
gnu-gcc-libstdc++ still has the class, but llvm-clang-libc++ does
not, and there is a compile error.
vmime should have just stopped using unary_function with commit
v0.9.2-48-g8564b2f8.
$ cat x.cpp
$ clang++ -std=c++17 -stdlib=libc++ -c x.cpp
In file included from x.cpp:1:
In file included from /usr/local/include/vmime/net/transport.hpp:34:
In file included from /usr/local/include/vmime/net/service.hpp:36:
In file included from /usr/local/include/vmime/net/session.hpp:40:
In file included from /usr/local/include/vmime/utility/url.hpp:30:
/usr/local/include/vmime/propertySet.hpp:339:33: error: no template named
'unary_function' in namespace 'std'; did you mean '__unary_function'?
class propFinder : public std::unary_function <shared_ptr <property>, bool> {
~~~~~^~~~~~~~~~~~~~
__unary_function
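A reduced sketch of the fix: the functor works with standard algorithms
without any base class. The ``property`` struct and ``findProperty``
function here are simplified stand-ins, not the declarations from
propertySet.hpp.

```cpp
#include <algorithm>
#include <memory>
#include <string>
#include <vector>

// Simplified stand-in for vmime's property type.
struct property {
    std::string name;
};

// No std::unary_function base: std::find_if only needs operator().
class propFinder {
public:
    explicit propFinder(const std::string& name) : m_name(name) {}

    bool operator()(const std::shared_ptr<property>& p) const {
        return p->name == m_name;
    }

private:
    std::string m_name;
};

// Returns the first property with the given name, or a null pointer.
inline std::shared_ptr<property> findProperty(
    const std::vector<std::shared_ptr<property>>& props,
    const std::string& name) {
    auto it = std::find_if(props.begin(), props.end(), propFinder(name));
    if (it == props.end()) {
        return std::shared_ptr<property>();
    }
    return *it;
}
```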
This restructures the CMake files a little bit to only find components
if they are actually enabled. It also rearranges things to better group
some related items. This change also fixes the include directories for
the build target, which allows the library to be embedded and makes the
install step optional.
Signed-off-by: Ben Magistro <koncept1@gmail.com>