path: root/media/libmedia/CharacterEncodingDetector.cpp
Commit message | Author | Age | Files | Lines
* Fix operator precedence | Glenn Kasten | 2014-03-25 | 1 | -2/+2
  Change-Id: I164708a5b76a341a185467b008ecbec98d58a6df
* Use more tags to help the ICU detector. | Marco Nelissen | 2014-03-19 | 1 | -19/+96
  The detector only gave non-ASCII data to ICU. In some cases that could result in very short data, for which ICU would report a low confidence level for the actual encoding. By padding the data with additional (ASCII) tags, we improve accuracy for such files. Because this can reduce accuracy in other cases, only do this when the initial confidence is low.
  b/13473604
  Change-Id: I63d932043155c310b0e358cdf2d37787961e94b7
* Better character set encoding detection | Marco Nelissen | 2013-12-11 | 1 | -0/+364
  ID3 tags are supposed to be ISO-8859-1 or Unicode, but often aren't. To better detect the real encoding we now use ICU to detect possible encodings for a given byte sequence, then apply additional heuristics to determine the most likely one.
  b/5564857
  Change-Id: I53bc83b006433da5c2f2ccfcd770ddb3a26b64d0
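
The 2014-03-19 commit message describes a two-pass approach: detect on the non-ASCII tag data first, and only when the confidence is low retry with additional ASCII tags as padding. The following is a minimal sketch of that idea, not the code from the commit. The `Guess` type, the `Detector` callback (standing in for ICU), the function names, the threshold of 30, and the choice to keep whichever pass scores higher are all assumptions made for illustration.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical result type: an encoding name plus a 0-100 confidence score.
struct Guess {
    std::string encoding;
    int confidence;
};

// Hypothetical detector callback; in the real file this role is played by ICU.
using Detector = std::function<Guess(const std::string&)>;

// Returns true if any byte of s has the high bit set.
bool hasNonAscii(const std::string& s) {
    for (unsigned char c : s) {
        if (c >= 0x80) return true;
    }
    return false;
}

// First pass: feed the detector only the tags that contain non-ASCII bytes.
// Second pass (only if the first pass is below lowConfidence): feed it all
// tags, so the ASCII-only tags act as padding for very short inputs.
// The default threshold is an assumed value, not taken from the commit.
Guess detectWithPadding(const std::vector<std::string>& tags,
                        const Detector& detect,
                        int lowConfidence = 30) {
    std::string nonAscii;
    std::string all;
    for (const std::string& tag : tags) {
        all += tag;
        all += '\n';
        if (hasNonAscii(tag)) {
            nonAscii += tag;
            nonAscii += '\n';
        }
    }

    Guess first = detect(nonAscii);
    if (first.confidence >= lowConfidence) {
        return first;  // confident enough; padding could reduce accuracy
    }

    Guess padded = detect(all);
    // Assumed tie-break: keep the padded result only if it scores higher.
    return padded.confidence > first.confidence ? padded : first;
}
```

Passing the detector in as a callback keeps the sketch independent of ICU, so the retry logic can be exercised with a stub detector.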