#include <opentrep/basic/OTransliterator.hpp>
Public Member Functions
std::string unpunctuate (const std::string &iString) const
std::string unquote (const std::string &iString) const
std::string unaccent (const std::string &iString) const
std::string transliterate (const std::string &iString) const
std::string normalise (const std::string &iString) const
OTransliterator ()
OTransliterator (const OTransliterator &)
~OTransliterator ()
Wrapper around a Unicode transliterator.
Definition at line 18 of file OTransliterator.hpp.
OPENTREP::OTransliterator::OTransliterator ()
Default Constructor.
Definition at line 17 of file OTransliterator.cpp.
Referenced by OTransliterator().
OPENTREP::OTransliterator::OTransliterator (const OTransliterator & iTransliterator)
OPENTREP::OTransliterator::~OTransliterator ()
Destructor.
Definition at line 42 of file OTransliterator.cpp.
std::string OPENTREP::OTransliterator::unpunctuate (const std::string & iString) const
Remove the punctuation from the given string.
iString (const std::string&): The string from which the punctuation must be removed.
Definition at line 156 of file OTransliterator.cpp.
References OPENTREP::getUTF8(), and unpunctuate().
Referenced by normalise(), and unpunctuate().
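To make the effect of punctuation removal concrete, here is a minimal standard-library sketch. It is not the OpenTREP implementation: the function name `stripAsciiPunctuation` is hypothetical, it handles ASCII punctuation only, and it assumes punctuation is deleted rather than replaced, which this page does not specify. The real method works on Unicode text through `OPENTREP::getUTF8()`.

```cpp
#include <cctype>
#include <string>

// Hypothetical stand-in, not part of OpenTREP: drops ASCII punctuation only.
// Bytes >= 0x80 (parts of UTF-8 multi-byte sequences) are copied through
// untouched, so non-ASCII punctuation is NOT removed by this sketch.
std::string stripAsciiPunctuation (const std::string& iString) {
  std::string oString;
  oString.reserve (iString.size());

  for (const char lChar : iString) {
    // std::ispunct() requires a value representable as unsigned char
    const unsigned char lByte = static_cast<unsigned char> (lChar);
    if (lByte < 0x80 && std::ispunct (lByte) != 0) {
      continue;  // skip ASCII punctuation
    }
    oString.push_back (lChar);
  }
  return oString;
}
```

With this sketch, `stripAsciiPunctuation ("Nice, France!")` yields `Nice France`; the space is kept because it is not punctuation.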
std::string OPENTREP::OTransliterator::unquote (const std::string & iString) const
Remove the quote characters from the given string.
iString (const std::string&): The string from which the quote characters must be removed.
Definition at line 177 of file OTransliterator.cpp.
References OPENTREP::getUTF8(), and unquote().
Referenced by normalise(), and unquote().
std::string OPENTREP::OTransliterator::unaccent (const std::string & iString) const
Remove the accents from the given string.
Note that this transformation relies on Unicode normalisation: each character is first decomposed into its base character and its combining accents (NFD), and the result is typically recomposed afterwards (NFC). See for instance http://www.unicode.org/faq/normalization.html
iString (const std::string&): The string from which the accents must be removed.
Definition at line 198 of file OTransliterator.cpp.
References OPENTREP::getUTF8(), and unaccent().
Referenced by normalise(), and unaccent().
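The decompose, strip, recompose idea described for unaccent() can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the name `toyUnaccent`, the four-entry decomposition table, and the restriction to the Combining Diacritical Marks block (U+0300 to U+036F). A real implementation would rely on a Unicode library for complete NFD/NFC data rather than a hand-written table.

```cpp
#include <map>
#include <string>

// Toy excerpt of canonical decompositions (illustration only; the real
// Unicode data covers far more code points).
static const std::map<char32_t, std::u32string> kToyDecomposition = {
  {U'\u00E9', U"e\u0301"},  // e with acute      -> e + combining acute
  {U'\u00FC', U"u\u0308"},  // u with diaeresis  -> u + combining diaeresis
  {U'\u00E7', U"c\u0327"},  // c with cedilla    -> c + combining cedilla
  {U'\u00F1', U"n\u0303"}   // n with tilde      -> n + combining tilde
};

// True for code points in the Combining Diacritical Marks block only.
bool isCombiningMark (const char32_t iCodePoint) {
  return iCodePoint >= 0x0300 && iCodePoint <= 0x036F;
}

// Hypothetical helper, not part of OpenTREP.
std::u32string toyUnaccent (const std::u32string& iString) {
  // Step 1 (NFD-like): split known precomposed characters
  std::u32string lDecomposed;
  for (const char32_t lCodePoint : iString) {
    const auto itEntry = kToyDecomposition.find (lCodePoint);
    if (itEntry != kToyDecomposition.end()) {
      lDecomposed += itEntry->second;
    } else {
      lDecomposed.push_back (lCodePoint);
    }
  }

  // Step 2: drop the combining marks, keeping the base characters
  std::u32string oString;
  for (const char32_t lCodePoint : lDecomposed) {
    if (!isCombiningMark (lCodePoint)) {
      oString.push_back (lCodePoint);
    }
  }

  // Step 3 (NFC-like): no marks remain, so there is nothing to recompose
  return oString;
}
```

Because the marks are stripped after decomposition, input that is already decomposed (a base letter followed by a combining mark) is handled the same way as precomposed input.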
std::string OPENTREP::OTransliterator::transliterate (const std::string & iString) const
Transliterate the given string into the Latin alphabet (e.g., from Chinese or Russian scripts).
iString (const std::string&): The string to be transliterated.
Definition at line 219 of file OTransliterator.cpp.
References OPENTREP::getUTF8(), and transliterate().
Referenced by normalise(), and transliterate().
std::string OPENTREP::OTransliterator::normalise (const std::string & iString) const
Perform all the above operations (unaccent, unquote, unpunctuate, transliterate) on the given string.
iString (const std::string&): The string to be normalised.
Definition at line 233 of file OTransliterator.cpp.
References OPENTREP::getUTF8(), transliterate(), unaccent(), unpunctuate(), and unquote().
Referenced by OPENTREP::Place::addNameToXapianSets(), BOOST_AUTO_TEST_CASE(), and OPENTREP::Place::buildIndexSets().
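Since normalise() chains several string-to-string transformations, the composition can be sketched as follows. This is only an illustration of the chaining pattern: `StringStep_T` and `applySteps` are hypothetical names, and the order in which the real method applies its steps is assumed, not confirmed, to follow the order listed in its description.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical type: one string-to-string transformation step.
using StringStep_T = std::function<std::string (const std::string&)>;

// Hypothetical helper, not part of OpenTREP: feeds the output of each
// step into the next one, in the order the steps are given.
std::string applySteps (const std::string& iString,
                        const std::vector<StringStep_T>& iSteps) {
  std::string oString = iString;
  for (const StringStep_T& lStep : iSteps) {
    oString = lStep (oString);
  }
  return oString;
}
```

With the real class no such helper is needed: callers construct an `OTransliterator` and call `normalise (iString)` on it, as the signatures above show.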