path: root/icu/src/main/java
Commit message | Author | Age | Files | Lines
* java.text.RuleBasedCollator fixes.Elliott Hughes2010-04-221-2/+11
| Add expectations for broken harmony tests, add our own equivalent (but correct) tests, and fix the bug turned up by the correct tests: the icu4jni RuleBasedCollator was using toString to convert a CharacterIterator to a String, resulting in iteration over the result of Object.toString (the class name and identity hash code) rather than the characters of interest.
|
| Also shut javac up about non-ASCII characters in Locale.java.
|
| Bug: 2608742
| Bug: 2608750
| Change-Id: I2171789058c8116eacd7e5815bd483f0bc07c69b
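
The toString pitfall described above can be illustrated with a small sketch. This is not the icu4jni code; the class and method names are hypothetical, and only the standard java.text iterator API is assumed.

```java
import java.text.CharacterIterator;
import java.text.StringCharacterIterator;

final class CharacterIteratorText {
    private CharacterIteratorText() {}

    // Walks the iterator from its first character to DONE and collects the
    // characters. Note that this moves the iterator's current position.
    static String characters(CharacterIterator it) {
        StringBuilder sb = new StringBuilder();
        for (char c = it.first(); c != CharacterIterator.DONE; c = it.next()) {
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        CharacterIterator it = new StringCharacterIterator("abc");
        // The characters of interest.
        System.out.println(characters(it));
        // Object.toString: class name plus identity hash code, not the text.
        System.out.println(it.toString());
    }
}
```

Collecting the characters explicitly works for any CharacterIterator implementation, whereas toString only helps if the implementation happens to override it.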
* Merge LocaleData and Resources, rename Resources to ICU.Elliott Hughes2010-04-164-109/+97
| | | | | | | Also move our ICU tests into our little tree of tests. Bug: 2596471 Change-Id: I73b53d74c26ef9bf670f12cac58b51ba61eefead
* Change DecimalFormatSymbols to have a field per symbol.Elliott Hughes2010-04-161-5/+45
| | | | | | | I'd been wanting to do this for some time, but cleaning up the recent performance changes I made to Formatter was the final straw. Change-Id: I6d516de66a0bed5e759bca590b4cc124ce2eb712
* Fix String.toLowerCase and toUpperCase.Elliott Hughes2010-04-131-0/+2
| | | | | | | | | | | | | | | Rather than try to cope with Lithuanian, let's just hand that one to ICU4C. I've removed my hand-crafted Azeri/Turkish lowercasing too, in favor of ICU. Presence of a high surrogate (which implies a supplemental character) is a good reason to hand over to ICU too. On the uppercasing side, I've kept our existing hard-coded table and just added code to defer to ICU for Azeri, Lithuanian, and Turkish (plus supplemental characters). I don't like the tables, but I don't have proof that they're incorrect. Bug: 2340628 Change-Id: I36b556b0444623a5aacc1afc58ebb4d84211d3dc
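
A minimal sketch of why Azeri, Lithuanian, and Turkish (and supplementary characters) are worth handing to a full Unicode library. The predicate below is a hypothetical illustration of the dispatch idea, not the code from this patch.

```java
import java.util.Locale;

final class CaseMappingSketch {
    private CaseMappingSketch() {}

    // A high surrogate implies a supplementary character is present.
    static boolean hasHighSurrogate(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (Character.isHighSurrogate(s.charAt(i))) {
                return true;
            }
        }
        return false;
    }

    // Hypothetical check: should this conversion skip the simple fast path?
    static boolean needsLocaleAwareMapping(Locale locale, String s) {
        String language = locale.getLanguage();
        return language.equals("az") || language.equals("lt") || language.equals("tr")
                || hasHighSurrogate(s);
    }

    public static void main(String[] args) {
        Locale turkish = Locale.forLanguageTag("tr");
        // In Turkish, 'I' lowercases to dotless i (U+0131), not to 'i'.
        System.out.println("I".toLowerCase(turkish).equals("\u0131"));
        System.out.println("I".toLowerCase(Locale.ENGLISH).equals("i"));
    }
}
```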
* Make String.split 10x faster.Elliott Hughes2010-04-091-0/+2
| Almost all uses of String.split in the Android codebase use trivial single literal character separators. This patch optimizes that case to avoid the use of regular expressions entirely.
|
| The 10x speedup isn't the whole story, because the speedup is really proportional to the number of separators in the input. 10x is easily achievable, but the speedup could be arbitrarily high.
|
| Before:
|
|   benchmark                  us  logarithmic runtime
|   PatternSplitComma        84.8  XXXXXXXXXXXXXX||||||||||||||
|   PatternSplitLiteralDot   85.0  XXXXXXXXXXXXXX||||||||||||||
|   StringSplitComma        166.3  XXXXXXXXXXXXXXXXXXXXXXXXXXXX|
|   StringSplitHard         173.6  XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
|   StringSplitLiteralDot   167.7  XXXXXXXXXXXXXXXXXXXXXXXXXXXX|
|
| After:
|
|   benchmark                  us  logarithmic runtime
|   PatternSplitComma        18.9  XXX|||||||||||||||||||||
|   PatternSplitLiteralDot   19.0  XXX|||||||||||||||||||||
|   StringSplitComma         18.8  XXX|||||||||||||||||||||
|   StringSplitHard         174.2  XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
|   StringSplitLiteralDot    18.8  XXX|||||||||||||||||||||
|
| (The benchmarks starting "Pattern" use a precompiled Pattern for performance. Those starting "String" use String.split and would traditionally entail a temporary Pattern. As you can see, creating Patterns is very expensive for us, and each one throws a finalizer spanner in the GC works too. The new fast path avoids all this. I'll commit the benchmark -- along with all the others I've ever used -- to http://code.google.com/p/dalvik this afternoon.)
|
| Tests? We actually pass _more_ tests after this patch, because the increase in performance means we don't hit timeouts.
|
| Change-Id: I404298e21a78d72cf5ce6ea675844bf251e3825b
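
The fast path described above might look roughly like the following. This is a sketch under stated assumptions, not the code from this patch: the FastSplit class is hypothetical, it only recognizes a one-character separator with no regex meaning (an escaped form such as "\\." still takes the regex path here), and it mirrors String.split's default behavior of dropping trailing empty strings.

```java
import java.util.ArrayList;
import java.util.List;

final class FastSplit {
    // Characters that have special meaning in a regular expression.
    private static final String REGEX_META = ".$|()[{^?*+\\";

    private FastSplit() {}

    static String[] split(String input, String separator) {
        if (separator.length() != 1 || REGEX_META.indexOf(separator.charAt(0)) != -1) {
            // Slow path: let the regex engine handle anything non-trivial.
            return input.split(separator);
        }
        char ch = separator.charAt(0);
        List<String> parts = new ArrayList<>();
        int start = 0;
        int end;
        while ((end = input.indexOf(ch, start)) != -1) {
            parts.add(input.substring(start, end));
            start = end + 1;
        }
        if (start == 0) {
            // No separator found: the result is the input itself.
            return new String[] { input };
        }
        parts.add(input.substring(start));
        // Like String.split with limit 0, drop trailing empty strings.
        int size = parts.size();
        while (size > 0 && parts.get(size - 1).isEmpty()) {
            size--;
        }
        return parts.subList(0, size).toArray(new String[0]);
    }

    public static void main(String[] args) {
        System.out.println(String.join("|", split("a,b,,c,,", ",")));
    }
}
```

The work in the fast path is a single scan with indexOf, so no Pattern object is created at all, which is where the commit message attributes most of the cost.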
* More Charset/ICU cleanup.Elliott Hughes2010-04-023-395/+70
| | | | | | | | | | | | | | | | I've been feeling guilty about leaving broken double-checked locking (missing the "volatile") in harmony's Charset code. A quick investigation showed that the method that it's intended to optimize is basically never called, and the RI's documentation explicitly says "don't call this; it's slow". So this patch fixes that. I've also improved our documentation. I've also deleted a bunch of dead code. I've also tidied up some dodgy native string handling. Change-Id: Iad69ebb3459d9cc4c4ff37b255d458b83fe40132
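
For reference, a correct form of double-checked locking, which needs the "volatile" that the harmony code was missing. The class, field, and load method are hypothetical stand-ins, not the Charset code.

```java
final class LazyNames {
    // "volatile" is what makes double-checked locking safe under the Java
    // memory model; without it a reader may see a stale or partially
    // published value.
    private static volatile String[] names;

    private LazyNames() {}

    static String[] get() {
        String[] result = names;          // single volatile read on the fast path
        if (result == null) {
            synchronized (LazyNames.class) {
                result = names;           // re-check while holding the lock
                if (result == null) {
                    result = load();
                    names = result;
                }
            }
        }
        return result;
    }

    // Hypothetical expensive initialization.
    private static String[] load() {
        return new String[] { "UTF-8", "ISO-8859-1" };
    }
}
```

When the guarded method is almost never called, as the message says was the case here, dropping the optimization and simply synchronizing is the simpler choice.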
* Tidy up our getAvailableLocales methods to actually ask ICU4C.Elliott Hughes2010-04-014-42/+53
| | | | | | | | | These specialized methods are little used, and in several cases ICU itself just returns the list of locales, but that's ICU's business, not ours. As long as ICU is in charge of our locale-specific data, it should be responsible for answering questions about what locale-specific data is available... Change-Id: Idc8a66bbf7fcbc6b06e30929e6a7af3fe30ab7d1
* Add Java 6's java.net.IDN.Elliott Hughes2010-04-011-0/+44
| | | | | | | | | | harmony's tests and my code, though ICU4C does all the hard work. I've added a test of my own to demonstrate some weird RI behavior (that I've emulated in our implementation). Bug: 2497395 Change-Id: I8146f72a8a3204449ee3d0d9065dadc1c1c77fcc
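
A short usage sketch of the java.net.IDN API that this change adds. The IdnSketch class and its helper are hypothetical; the host name is just a common example.

```java
import java.net.IDN;

final class IdnSketch {
    private IdnSketch() {}

    // Converts to the ASCII (Punycode) form and back again.
    static boolean roundTrips(String unicodeHost) {
        String ascii = IDN.toASCII(unicodeHost);
        return IDN.toUnicode(ascii).equals(unicodeHost);
    }

    public static void main(String[] args) {
        // "b\u00fccher.de" is "bücher.de"; its ASCII form uses the "xn--" prefix.
        System.out.println(IDN.toASCII("b\u00fccher.de"));
        System.out.println(roundTrips("b\u00fccher.de"));
    }
}
```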
* Add Java 6's exponent separator to DecimalFormatSymbols.Elliott Hughes2010-03-291-0/+5
| | | | | Bug: 2497395 Change-Id: Ic552fa828649bae882e508a62a44073d1038b5c0
* Java 6 changed CollationKey from final to abstract.Elliott Hughes2010-03-266-1077/+652
| | | | | | | | | | | | | | I've also taken the opportunity to tidy up our implementation a little, though my hands are tied by (a) the fact that our concrete classes are in a separate package from our abstract classes and (b) frameworks/base actually pokes about with our icu4jni collation code (http://b/2417080). I've also tidied up a bunch of dead code. In particular, it's silly for us to check parameters in Java that will be checked in native code (and that one would assume will be valid most of the time anyway). Bug: 1635883 Change-Id: I7db3c1ff1f0d23cb85604f9c8eb995e4488d7c0a
* Start cleaning up the Charset implementation.Elliott Hughes2010-03-263-131/+36
| | | | | | | | | | | | | | This was going to be https://issues.apache.org/jira/browse/HARMONY-6461, but I couldn't resist cleaning up some of the surrounding code, and ended up cleaning up some of our native code too. In the course of the afternoon I spent on this, I lost my conviction that the upstream change makes sense, so I reverted that, leaving this change just pure cleanup. (Note that the cleanup work is incomplete. This is an improvement, but there's plenty left to do. I just don't want to get too distracted until all the Java 6 changes are done.) Change-Id: I56841db5f6c038bbf7942e83a148dca546519269
* Clean up the Java side of the ICU interface a bit.Elliott Hughes2010-03-198-272/+136
| | | | | | | | My original intention was just to add the missing "final" on a few classes, but our BreakIterator implementation struck me as excessively bloated and confusing. Change-Id: I2d2dccafe8ec91124f3c83909c9ec647cc2d51e2
* Add Java 6's DecimalFormat.setRoundingMode (et cetera).Elliott Hughes2010-03-181-0/+17
| | | | | | | | | | | | | | | | | | Format and NumberFormat's bogusly-public constructors became protected with Java 6. DecimalFormat gained more control over rounding behavior. There's a slight mismatch with our ICU4C-based implementation in that ICU4C doesn't support RoundingMode.UNNECESSARY, so I've had to fake that (but I doubt it's used much, if at all). I've pulled out the obviously Android-specific tests from the harmony DecimalFormatTest.java, but I've only brought back the rounding mode changes from the current harmony code to avoid the new tests' dependencies. I've also added one new test of my own, to check that setMaximumFractionDigits affects rounding as it should (since the harmony tests don't test this, and it's somewhat subtle). Bug: 2497395 Change-Id: Ifafc8bb051e078ead988073281f5c33f0aeb130a
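
A small sketch of how setRoundingMode and setMaximumFractionDigits interact, using only the public java.text API. The RoundingSketch helper is hypothetical, and 0.125 is chosen because it is exactly representable as a double, so the tie-breaking is unambiguous.

```java
import java.math.RoundingMode;
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

final class RoundingSketch {
    private RoundingSketch() {}

    static String format(double value, RoundingMode mode, int maxFractionDigits) {
        // Fixed US symbols so the output does not depend on the default locale.
        DecimalFormat df = new DecimalFormat("0.###", DecimalFormatSymbols.getInstance(Locale.US));
        df.setRoundingMode(mode);
        df.setMaximumFractionDigits(maxFractionDigits);
        return df.format(value);
    }

    public static void main(String[] args) {
        System.out.println(format(0.125, RoundingMode.HALF_EVEN, 2)); // tie goes to the even digit
        System.out.println(format(0.125, RoundingMode.HALF_UP, 2));   // tie goes away from zero
        System.out.println(format(0.125, RoundingMode.HALF_UP, 1));   // fewer digits, rounded earlier
    }
}
```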
* Use a manually-synchronized HashMap instead of ConcurrentHashMap in LocaleData.Elliott Hughes2010-03-051-9/+17
| ConcurrentHashMap is our slowest choice at the moment:
|
|   ConcurrentHashMapGet     782  XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
|   HashMapGet               272  XXXXXXXXXX|||||||||||||||
|   HashMapGet_Synchronized  317  XXXXXXXXXXXX|||||||||||||
|   HashtableGet             325  XXXXXXXXXXXX||||||||||||||
|   LinkedHashMapGet         280  XXXXXXXXXX|||||||||||||||
|
| The cost of some commonly-created temporary objects (such as DateFormatSymbols) is dominated by the lookup of the locale data. This patch takes "new DateFormatSymbols" from 3us to 2.3us on passion/froyo (a 23% drop).
|
| Bug: 2492505
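
A manually-synchronized HashMap cache can be sketched as follows. The LocaleDataCache class, its String payload, and the load method are hypothetical placeholders, not the LocaleData code.

```java
import java.util.HashMap;
import java.util.Locale;

final class LocaleDataCache {
    // A plain HashMap; every access goes through the synchronized block below.
    private static final HashMap<String, String> CACHE = new HashMap<>();

    private LocaleDataCache() {}

    static String get(Locale locale) {
        String key = locale.toString();
        synchronized (CACHE) {
            String data = CACHE.get(key);
            if (data == null) {
                data = load(locale);
                CACHE.put(key, data);
            }
            return data;
        }
    }

    // Hypothetical stand-in for the expensive native lookup.
    private static String load(Locale locale) {
        return "data for " + locale;
    }
}
```

Holding the lock across the load keeps the sketch simple and guarantees one load per key, at the cost of blocking other readers while a miss is being filled.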
* Add (but @hide) String.isEmpty and Locale.ROOT.Elliott Hughes2010-03-021-1/+1
|
* am 10ebc7d0: Merge "Implement (but @hide) java.text.Normalizer from Java 6."Elliott Hughes2010-03-011-0/+47
|\ | | | | | | | | | | | | Merge commit '10ebc7d0b84dcb98e1a7eeac96ef06acdfc8d184' into dalvik-dev * commit '10ebc7d0b84dcb98e1a7eeac96ef06acdfc8d184': Implement (but @hide) java.text.Normalizer from Java 6.
| * Implement (but @hide) java.text.Normalizer from Java 6.Elliott Hughes2010-03-011-0/+47
| | | | | | | | | | | | | | | | Based on https://android-git.corp.google.com/g/42516. Includes the harmony tests from their Java 6 branch. Bug: 719001
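
A brief usage sketch of the java.text.Normalizer API named above. The NormalizerSketch helper is hypothetical and only shows why normalization matters when comparing text.

```java
import java.text.Normalizer;

final class NormalizerSketch {
    private NormalizerSketch() {}

    // Compares two strings after composing both to NFC.
    static boolean sameAfterNfc(String a, String b) {
        return Normalizer.normalize(a, Normalizer.Form.NFC)
                .equals(Normalizer.normalize(b, Normalizer.Form.NFC));
    }

    public static void main(String[] args) {
        String decomposed = "e\u0301"; // 'e' followed by a combining acute accent
        String composed = "\u00e9";    // precomposed e-acute
        System.out.println(decomposed.equals(composed));
        System.out.println(sameAfterNfc(decomposed, composed));
    }
}
```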
* | Merge "Fix a few of our FindBugs "high" warnings." into dalvik-devElliott Hughes2010-02-122-7/+13
|\ \
| * | Fix a few of our FindBugs "high" warnings.Elliott Hughes2010-02-122-7/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | RuleBasedBreakIterator was breaking the equals/hashCode contract. Various classes were calling toString on arrays, which isn't very useful. GregorianCalendar was missing a null/instanceof check. (FindBugs complained about the former, but the super.equals would actually take care of that. The lack of the explicit "instanceof" did mean that we could throw ClassCastException if you had a Calendar that wasn't a GregorianCalendar, though. [Not easily testable, and I hope we'll replace our calendars with ICU4J's before we actually have another Calendar subclass.]) Collator's cache was broken, but luckily never had anything inserted into it anyway.
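
The array toString warning mentioned above is easy to reproduce. A minimal sketch, with a hypothetical helper name:

```java
import java.util.Arrays;

final class ArrayToStringSketch {
    private ArrayToStringSketch() {}

    // Arrays.toString prints the elements; array.toString() does not.
    static String describe(int[] values) {
        return Arrays.toString(values);
    }

    public static void main(String[] args) {
        int[] values = { 1, 2, 3 };
        System.out.println(values.toString()); // type tag plus identity hash code
        System.out.println(describe(values));  // the element values
    }
}
```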
* | | Use one method to create a Locale from a String.Elliott Hughes2010-02-123-63/+34
|/ / | | | | | | Bug: 2392157
* | Remove RuleBasedNumberFormat from our icu4jni fork, since we don't need it.Elliott Hughes2010-02-031-258/+0
|/ | | | | | This is ICU API not used by Java, so there's no point pretending to maintain it. Bug: http://b/2377457
* Use DecimalFormatSymbols' new default constructor for speed.Elliott Hughes2010-01-281-30/+19
| | | | | | | | This brings "new DecimalFormat" down to ~80us (from ~260us before this patch, or ~600us this time last week). Also remove some dead code and tighten up some accessibility. Depends on https://android-git.corp.google.com/g/38877.
* Double the speed of DecimalFormat creation.Elliott Hughes2010-01-281-23/+14
| | | | | | | | | | | Our calls to unum_setSymbol were making us O(n^2); switching to the C++ API and doing a bulk update is a huge win. (ICU is really a C++ library with a C wrapper. It's always going to be slightly wasteful to go via C, but here it's especially harmful.) The new ScopedJavaUnicodeString provides a best-of-breed bridge between Java strings on the Java heap and the UnicodeString type that ICU wants. I'll come back and switch more of our ICU JNI over in a later patch.
* Remove commented-out code.Elliott Hughes2010-01-271-21/+1
| | | | Mistakenly left in my previous change when I wasn't certain it was dead.
* Fix NumberFormat's behavior with BigInteger and custom Number subclasses.Elliott Hughes2010-01-271-17/+27
| | | | | | | Also remove a few bits of cruft I ran across, and stop duplicating the documentation between NumberFormat and DecimalFormat. Bug: 2387934
* Rename icu4jni's DecimalFormat to NativeDecimalFormat, to reduce confusion.Elliott Hughes2010-01-271-15/+15
|
* Gut NativeDecimalFormat in favor of icu4jni.DecimalFormat.Elliott Hughes2010-01-272-208/+159
| | | | | | (I'll come back and rename icu4jni.DecimalFormat to NativeDecimalFormat and remove all the fully-qualified names that distinguish between java.text's DecimalFormat and icu4jni's DecimalFormat.)
* Simplify our DecimalFormat.Elliott Hughes2010-01-263-380/+132
| | | | | | | | | | | | | | | Both the is-a and has-a hierarchies for our DecimalFormat implementation were over-complicated. This patch starts to address that, and makes cloning twice as fast (50us versus 100us), but not as fast as I'd like (<10us), and without making much of a dent in the time it takes to create a new NumberFormat (550us versus 600us). The speed of cloning is important because Formatter has a hack that uses it, and I want to change NumberFormat so that it always hands out clones... at least until I have time to make "new NumberFormat" acceptably fast. Also fixes DecimalFormat.applyLocalizedPattern (which used to behave as if you'd called applyPattern).
* Minor tidy-up of some of the ICU interface.Elliott Hughes2010-01-213-39/+4
| | | | | | Dead code, a class that shouldn't be instantiated, work that's probably better done all on the native side, and some slightly improved error reporting.
* Improve the DecimalFormat JNI.Elliott Hughes2010-01-211-13/+7
| | | | | | | | | | | | | | | | | | | We don't need two identical copies of the code for double and long; ICU uses overloading, and we should take advantage of that. We can also improve the code to remove unnecessary heap allocation, remove unnecessary temporary copies, and only make JNI calls and ask for the attribute data when necessary. I've also switched the code from the thread-unsafe strtok(3) to strtok_r(3). I've also removed unnecessary temporary char[]s and copying in DecimalFormat. I've also fixed another instance of the "if (doubleValue == longValue) longPath" anti-pattern that gets -0.0 wrong. (It's also worth noting that caliper says the difference between the double and long paths is very small, on the order of 2us.) (The new code takes about 20us per call compared to 60us for the old code, measured on passion-eng.)
* Speed up Character.Elliott Hughes2010-01-141-134/+22
| | | | | | | | Remove a useless layer of indirection in UCharacter (which is the bridge between java.lang.Character and ICU). We're not at the stage where the JIT can do this for us, and even if it could, why give it extra work to do? Also fix the incorrect copyright header which was probably copied from a file where it made sense.
* Fix Date.toString.Elliott Hughes2010-01-141-5/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Date.toString was using the TimeZone id ("America/Los_Angeles") rather than the time zone short name ("PDT" or "PST", depending on time of year). The naive fix made things 5x slower, so I improved Resources.getDisplayTimeZone so the fixed Date.toString is only 2x slower. This could be improved further with a faster getDisplayTimeZone. I hoped to replace the body of Date.toString with a call to SimpleDateFormat, but that turns out to be 40x slower. This patch also optimizes SimpleDateFormat to bring the gap down to 8x by using Resources.getDisplayTimeZone instead of asking for all the strings. (Note that these improvements refer to the hopefully common case of localized strings for the default locale. If you have the misfortune to need strings for other locales, the new code will be more like 600x faster. At 0.5s a call on the fastest current hardware, I hope no-one's actually doing that. Dalvik Explorer -- available on the Market -- needs to do it when generating summary reports, and it is indeed ridiculously slow. It uses two SimpleDateFormat objects per locale, so it takes 1s per locale, for about 60 locales. I've tested Dalvik Explorer with this patch, and it does fix that pathological behavior.) Also fix a bug I introduced in https://android-git.corp.google.com/g/36242 that meant that our zone names String[][] contained incorrect values (accidentally concatenating each successive value in a row), found by existing tests now we use more of those values. Also replace a couple of "new Integer" calls with Integer.valueOf for a modest speedup. Also factor out some duplication. Bug: http://code.google.com/p/android/issues/detail?id=6013
* Remove the last bits of the ICU ResourceBundle hack.Elliott Hughes2010-01-131-41/+0
| | | | | | | This patch switches us over to calling ICU directly for localized currency symbols, and then removes all the mechanism for sneaking fake ResourceBundle implementations in. The code's a lot simpler too, because ICU's default behavior is what we want anyway.
* Support non-default negative patterns in NumberFormat.getIntegerInstance.Elliott Hughes2010-01-111-2/+7
| | | | Spotted while rewriting the associated JNI recently.
* Last bunch of NumberFormat speedups.Elliott Hughes2010-01-053-123/+60
| | | | | | | | | | | | | | | | | | | | | | Don't mess around with setCurrency in DecimalFormat.copySymbols when we're going to override any effect that call will have had in the next few lines: we always call setCurrencySymbol and setInternationalCurrencySymbol, so setCurrency is just wasting time. Replaces the NativeDecimalFormat.UNumberFormatSymbol enum -- which was only used for getting ints to pass to native code, using Enum.ordinal -- with ints. Adds a constructor to the java.text DecimalFormat so we can avoid cloning the DecimalFormatSymbols object we create for its private use. This is another 10% shaved off. I've also removed an unused local from the icu4jni DecimalFormat, so I can remove a then-unused getLocale method from the ICU DecimalFormatSymbols. I've rewritten the icu4jni DecimalFormatSymbols.clone to remove the scary constructor that took an arbitrary int and treated it as a uintptr_t when talking to native code.
* Speed up DecimalFormatSymbols.Elliott Hughes2010-01-042-29/+51
| | | | | | | | | | We don't need to create temporary String objects; we can just pass a char directly. We also don't need to initialize aspects of our native peer if we know we're going to overwrite them straight away, and making copying into ICU the responsibility of the icu4jni class rather than the java.text is slightly cleaner. Together, these changes make creating a new NumberFormat about 20% faster.
* Fix build (accidental API leak).Elliott Hughes2010-01-041-0/+30
| | | | | Move a couple of methods into LocaleData -- where they should have been from the beginning -- so they're automatically hidden from our users.
* Stop using ResourceBundle for locale data.Elliott Hughes2010-01-043-76/+218
| | | | This offers an additional speed increase and gets rid of a lot of native code.
* Speed up the way we access ICU's locale data.Elliott Hughes2009-12-212-260/+89
| This patch makes creating a new NumberFormat or new SimpleDateFormat 2x faster. Basically, the ResourceBundle mechanism is really expensive in several ways:
|
| 1. The two-level caching is unnecessary for locale data, and expensive because it burns through a lot of temporary objects.
| 2. The PrivilegedAction stuff is unnecessary and expensive because it too burns quite a few temporary objects (including an ArrayList for each call; should we consider removing support for SecurityManager so we can remove this cruft from our code?).
| 3. The caching in most cases doesn't cache anything useful; the ResourceBundles simply forward all questions straight to native code anyway, all we're caching is an unnecessary forwarding object (in a cache where lookups cost more than just creating a new unnecessary forwarding object would cost).
|
| I've left CurrencyResourceBundle on the slow (ResourceBundle.getBundle) path because I'm not yet sure how much of that path's semantics it relies on.
|
| I still return LocaleResourceBundle instances (albeit via a much faster path) but we should fix that. The native code returns an array which ResourceBundle stuffs into a Hashtable and the calling code accesses via hash table lookups. This despite the fact that the keys are a small fixed set known in advance. We could make the native layer and the calling layer simpler and faster by using a "struct", and doing so would make the middle layer go away completely.
* Depessimize string conversions.Elliott Hughes2009-12-182-14/+14
| Why does this idiom persist? It's ugly, and it's the least efficient way to do it. (I found the ones in DecimalFormatSymbols while investigating why "new SimpleDateFormat()" burns through so many StringBuilders. grep(1) found the rest.)
|
| The DocumentBuilderImpl removes an unnecessary level of indirection, since we implement Character.toString in terms of String.valueOf. (I wouldn't have bothered except this was the only use of Character.toString in the core libraries, and I added it myself a few weeks ago.)
* Fix java.util.Formatter formatting of -0.0.Elliott Hughes2009-12-091-36/+17
| | | | | | | | | | | | The active ingredient here is the two changes to stop comparing longValue with doubleValue and formatting the long if the two compare equal. This causes us to lose the sign of 0 (because there's no long -0, but -0.0d == 0). Instead, we explicitly test for boxed Double and Float arguments (because the number of integral types is larger, they get the "else" clause). The other changes are just minor cosmetic changes made as I followed the code. Bug found by jtreg, so no new test.
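
The sign loss described above can be reproduced in a few lines. Both methods are hypothetical reductions of the pattern, not the Formatter code itself.

```java
final class NegativeZeroSketch {
    private NegativeZeroSketch() {}

    // Anti-pattern: -0.0 compares equal to 0L, so the sign is lost.
    static String brokenFormat(Number n) {
        if (n.doubleValue() == n.longValue()) {
            return Long.toString(n.longValue());
        }
        return Double.toString(n.doubleValue());
    }

    // Fix: decide by the boxed type, not by comparing values.
    static String fixedFormat(Number n) {
        if (n instanceof Double || n instanceof Float) {
            return Double.toString(n.doubleValue());
        }
        return Long.toString(n.longValue());
    }

    public static void main(String[] args) {
        System.out.println(brokenFormat(-0.0)); // sign dropped
        System.out.println(fixedFormat(-0.0));  // sign preserved
    }
}
```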
* CharsetDecoderICU/CharsetEncoderICU should take arrayOffset into account.Elliott Hughes2009-11-032-10/+22
| | | | | | | | | | | | | CharsetDecoderICU and CharsetEncoderICU special-case array-backed ByteBuffers and CharBuffers for performance reasons, but they shouldn't assume that the backing array always has offset 0. An external user hit this while using the jAudioTagger library. Test cases from user submission: http://code.google.com/p/android/issues/detail?id=4237 See also: 2234697
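
A sketch of the arrayOffset pitfall using only java.nio. The two accessor methods are hypothetical; a sliced buffer is one common way to get a non-zero array offset.

```java
import java.nio.ByteBuffer;

final class ArrayOffsetSketch {
    private ArrayOffsetSketch() {}

    // Wrong: assumes the buffer's data starts at index 0 of the backing array.
    static byte firstRemainingBroken(ByteBuffer buf) {
        return buf.array()[buf.position()];
    }

    // Right: the buffer's position is relative to arrayOffset().
    static byte firstRemaining(ByteBuffer buf) {
        return buf.array()[buf.arrayOffset() + buf.position()];
    }

    public static void main(String[] args) {
        byte[] backing = { 10, 20, 30, 40 };
        // The slice starts at index 2 of the backing array: arrayOffset() == 2.
        ByteBuffer slice = ByteBuffer.wrap(backing, 2, 2).slice();
        System.out.println(slice.arrayOffset());
        System.out.println(firstRemainingBroken(slice));
        System.out.println(firstRemaining(slice));
    }
}
```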
* Fix a few FindBugs warnings in code that isn't upstream.Elliott Hughes2009-10-273-6/+6
| | | | Bugs: 2099642, 2099637
* Fix icu4jni Resources ("Locale") to not expose its internals.Elliott Hughes2009-10-141-6/+14
| | | | | | We shouldn't expose internal arrays without copying. Bug: 2102273
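
Defensive copying of an internal array, as a minimal sketch. The class and its contents are hypothetical, not the icu4jni Resources code.

```java
final class IsoLanguagesSketch {
    // Internal state: callers must never get a reference to this array.
    private static final String[] ISO_LANGUAGES = { "de", "en", "fr" };

    private IsoLanguagesSketch() {}

    // Returns a copy, so callers can modify the result freely.
    static String[] getISOLanguages() {
        return ISO_LANGUAGES.clone();
    }

    public static void main(String[] args) {
        String[] copy = getISOLanguages();
        copy[0] = "xx";                            // only the copy changes
        System.out.println(getISOLanguages()[0]);  // internal array is intact
    }
}
```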
* Bug fixing for NumberFormat and BigDecimal.Jesse Wilson2009-10-121-0/+23
| 1. Fixed the bug that DecimalFormat does not handle multiplier.
| 2. Fixed the bug that DecimalFormat does not handle precision.
|
| This is a copy of the original Eclair change, https://android-git.corp.google.com/g/26297
|
| Bug: 1897917.
* Make Resources$DefaultTimeZones preloadable again.Elliott Hughes2009-09-151-18/+20
| http://s9/81864 was a premature optimization that stopped the timezone data being loaded in the zygote. So instead of paying the (admittedly large) time and space costs once in the zygote, we now pay them once per application. Revert the problematic parts of that change.
|
| Note that this isn't simply a reverse patch:
|
| 1. I've changed the comment to make it clear that although this *looks* like idiomatic lazy initialization, it's actually the opposite. A comment to that effect might have prevented this code from being broken.
| 2. I've left the last two hunks of the original patch stand, because they appear reasonable but unrelated.
|
| Bug: 1941311, 1819285.
* InputStreamReader forgets to convert incomplete multibyte characters at the buffer boundary.Urs Grob2009-09-041-11/+4
| BUG=2033986
* Bug 1844104: Fix buffer overwrite bugs in CharsetEncoderICU and CharsetDecoderICU.Mihai Preda2009-05-262-12/+47
| And add unit test.
* AI 148694: Manually copied from cupcake_dcm CL 148669-p9.Andy Stadler2009-05-111-0/+4
| When resetting the mode of the encoder, clear the leftover input & output buffers. This claims to fix buffer overwriting we're seeing during account sync and message download.
|
| BUG=1822859
| Automated import of CL 148694
* auto import from //branches/cupcake/...@137873The Android Open Source Project2009-03-111-34/+0
|