| Commit message | Author | Age | Files | Lines |
| |
Add expectations for broken harmony tests, add our own equivalent (but correct)
tests, and fix the bug turned up by the correct tests: the icu4jni
RuleBasedCollator was using toString to convert a CharacterIterator to a
String, resulting in iteration over the result of Object.toString (the class
name and identity hash code) rather than the characters of interest.
Also shut javac up about non-ASCII characters in Locale.java.
Bug: 2608742
Bug: 2608750
Change-Id: I2171789058c8116eacd7e5815bd483f0bc07c69b
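For illustration only, here is a minimal sketch of converting a CharacterIterator to a String by walking its characters instead of calling toString. The class and method names (CharacterIterators, characters) are hypothetical and are not taken from the icu4jni code.

```java
import java.text.CharacterIterator;
import java.text.StringCharacterIterator;

final class CharacterIterators {
    private CharacterIterators() {}

    // Collects the characters in the iterator's range and restores its position.
    // Looping on the index (rather than comparing against DONE) keeps a literal
    // U+FFFF in the text from ending the loop early.
    static String characters(CharacterIterator it) {
        int savedIndex = it.getIndex();
        StringBuilder sb = new StringBuilder(it.getEndIndex() - it.getBeginIndex());
        for (it.first(); it.getIndex() < it.getEndIndex(); it.next()) {
            sb.append(it.current());
        }
        it.setIndex(savedIndex);
        return sb.toString();
    }

    public static void main(String[] args) {
        CharacterIterator it = new StringCharacterIterator("collate me");
        System.out.println(characters(it)); // the text itself
        System.out.println(it.toString());  // whatever the iterator class's toString returns
    }
}
```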
| |
Also move our ICU tests into our little tree of tests.
Bug: 2596471
Change-Id: I73b53d74c26ef9bf670f12cac58b51ba61eefead
| |
I'd been wanting to do this for some time, but cleaning up the recent
performance changes I made to Formatter was the final straw.
Change-Id: I6d516de66a0bed5e759bca590b4cc124ce2eb712
| |
Rather than try to cope with Lithuanian, let's just hand that one to ICU4C.
I've removed my hand-crafted Azeri/Turkish lowercasing too, in favor of ICU.
Presence of a high surrogate (which implies a supplemental character) is a
good reason to hand over to ICU too.
On the uppercasing side, I've kept our existing hard-coded table and just
added code to defer to ICU for Azeri, Lithuanian, and Turkish (plus
supplemental characters). I don't like the tables, but I don't have proof
that they're incorrect.
Bug: 2340628
Change-Id: I36b556b0444623a5aacc1afc58ebb4d84211d3dc
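A rough sketch of the kind of check described above, assuming a hypothetical helper named needsFullCaseMapping. It is not the actual libcore code. It only shows the decision (Azeri, Lithuanian, Turkish, or any high surrogate means "hand over to the full case mapper") and why Turkish needs special handling.

```java
import java.util.Locale;

final class CaseMappingSketch {
    private CaseMappingSketch() {}

    // Hypothetical predicate: true when a simple per-char table is not enough
    // and the work should go to a full case-mapping implementation.
    static boolean needsFullCaseMapping(String s, Locale locale) {
        String language = locale.getLanguage();
        if (language.equals("az") || language.equals("lt") || language.equals("tr")) {
            return true;
        }
        for (int i = 0; i < s.length(); i++) {
            // A high surrogate implies a supplementary character.
            if (Character.isHighSurrogate(s.charAt(i))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Locale turkish = Locale.forLanguageTag("tr");
        System.out.println(needsFullCaseMapping("TITLE", turkish));
        // In Turkish, 'I' lowercases to dotless i (U+0131) rather than 'i'.
        System.out.println("TITLE".toLowerCase(turkish));
        System.out.println("TITLE".toLowerCase(Locale.ROOT));
    }
}
```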
| |
Almost all uses of String.split in the Android codebase use trivial single
literal character separators. This patch optimizes that case to avoid the
use of regular expressions entirely.
The 10x speedup isn't the whole story, because the speedup is really
proportional to the number of separators in the input. 10x is easily
achievable, but the speedup could be arbitrarily high.
Before:
  benchmark                  us  logarithmic runtime
  PatternSplitComma        84.8  XXXXXXXXXXXXXX||||||||||||||
  PatternSplitLiteralDot   85.0  XXXXXXXXXXXXXX||||||||||||||
  StringSplitComma        166.3  XXXXXXXXXXXXXXXXXXXXXXXXXXXX|
  StringSplitHard         173.6  XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
  StringSplitLiteralDot   167.7  XXXXXXXXXXXXXXXXXXXXXXXXXXXX|
After:
  benchmark                  us  logarithmic runtime
  PatternSplitComma        18.9  XXX|||||||||||||||||||||
  PatternSplitLiteralDot   19.0  XXX|||||||||||||||||||||
  StringSplitComma         18.8  XXX|||||||||||||||||||||
  StringSplitHard         174.2  XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
  StringSplitLiteralDot    18.8  XXX|||||||||||||||||||||
(The benchmarks starting "Pattern" use a precompiled Pattern for performance.
Those starting "String" use String.split and would traditionally entail a
temporary Pattern. As you can see, creating Patterns is very expensive for
us, and each one throws a finalizer spanner in the GC works too. The new
fast path avoids all this. I'll commit the benchmark -- along with all the
others I've ever used -- to http://code.google.com/p/dalvik this afternoon.)
Tests? We actually pass _more_ tests after this patch, because the increase
in performance means we don't hit timeouts.
Change-Id: I404298e21a78d72cf5ce6ea675844bf251e3825b
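To make the idea concrete, here is a minimal sketch of a regex-free split on a single literal character. This is an assumption about the general shape of such a fast path, not the code from the patch; SingleCharSplit and its split method are hypothetical names. It mirrors the default String.split behavior of dropping trailing empty strings.

```java
import java.util.ArrayList;
import java.util.List;

final class SingleCharSplit {
    private SingleCharSplit() {}

    // Splits on a literal separator using indexOf only; no Pattern is created.
    static String[] split(String input, char separator) {
        if (input.indexOf(separator) == -1) {
            return new String[] { input }; // no separator: the input is the only piece
        }
        List<String> parts = new ArrayList<>();
        int start = 0;
        int end;
        while ((end = input.indexOf(separator, start)) != -1) {
            parts.add(input.substring(start, end));
            start = end + 1;
        }
        parts.add(input.substring(start));
        // Like String.split with limit 0, discard trailing empty strings.
        int size = parts.size();
        while (size > 0 && parts.get(size - 1).isEmpty()) {
            size--;
        }
        return parts.subList(0, size).toArray(new String[0]);
    }

    public static void main(String[] args) {
        for (String part : split("a,b,,c,,", ',')) {
            System.out.println("[" + part + "]");
        }
    }
}
```

Note that the separator here is always taken literally, so '.' means a dot, whereas passing "." to String.split would be interpreted as a regular expression.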
| |
I've been feeling guilty about leaving broken double-checked locking (missing
the "volatile") in harmony's Charset code. A quick investigation showed that
the method that it's intended to optimize is basically never called, and the
RI's documentation explicitly says "don't call this; it's slow". So this patch
fixes that.
I've also improved our documentation.
I've also deleted a bunch of dead code.
I've also tidied up some dodgy native string handling.
Change-Id: Iad69ebb3459d9cc4c4ff37b255d458b83fe40132
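For reference, this is what double-checked locking looks like when the "volatile" is present. It is a generic sketch with a hypothetical class name (ExpensiveTable), and it does not claim to reflect how the Charset code ended up after this patch.

```java
final class ExpensiveTable {
    // Without "volatile", another thread could observe a non-null reference
    // to an object whose construction is not yet visible to it.
    private static volatile ExpensiveTable instance;

    private ExpensiveTable() {
        // Imagine costly initialization here.
    }

    static ExpensiveTable get() {
        ExpensiveTable result = instance; // single volatile read on the fast path
        if (result == null) {
            synchronized (ExpensiveTable.class) {
                result = instance;
                if (result == null) {
                    result = new ExpensiveTable();
                    instance = result;
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(get() == get()); // same instance both times
    }
}
```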
| |
These specialized methods are little used, and in several cases ICU itself
just returns the list of locales, but that's ICU's business, not ours. As
long as ICU is in charge of our locale-specific data, it should be responsible
for answering questions about what locale-specific data is available...
Change-Id: Idc8a66bbf7fcbc6b06e30929e6a7af3fe30ab7d1
| |
harmony's tests and my code, though ICU4C does all the hard work.
I've added a test of my own to demonstrate some weird RI behavior (that I've
emulated in our implementation).
Bug: 2497395
Change-Id: I8146f72a8a3204449ee3d0d9065dadc1c1c77fcc
| |
Bug: 2497395
Change-Id: Ic552fa828649bae882e508a62a44073d1038b5c0
| |
I've also taken the opportunity to tidy up our implementation a little,
though my hands are tied by (a) the fact that our concrete classes are
in a separate package from our abstract classes and (b) frameworks/base
actually pokes about with our icu4jni collation code (http://b/2417080).
I've also tidied up a bunch of dead code. In particular, it's silly for
us to check parameters in Java that will be checked in native code (and
that one would assume will be valid most of the time anyway).
Bug: 1635883
Change-Id: I7db3c1ff1f0d23cb85604f9c8eb995e4488d7c0a
| |
This was going to be https://issues.apache.org/jira/browse/HARMONY-6461,
but I couldn't resist cleaning up some of the surrounding code, and ended
up cleaning up some of our native code too. In the course of the afternoon
I spent on this, I lost my conviction that the upstream change makes
sense, so I reverted that, leaving this change just pure cleanup.
(Note that the cleanup work is incomplete. This is an improvement, but
there's plenty left to do. I just don't want to get too distracted until
all the Java 6 changes are done.)
Change-Id: I56841db5f6c038bbf7942e83a148dca546519269
| |
My original intention was just to add the missing "final" on a few classes,
but our BreakIterator implementation struck me as excessively bloated and
confusing.
Change-Id: I2d2dccafe8ec91124f3c83909c9ec647cc2d51e2
| |
Format and NumberFormat's bogusly-public constructors became protected with
Java 6. DecimalFormat gained more control over rounding behavior. There's a
slight mismatch with our ICU4C-based implementation in that ICU4C doesn't
support RoundingMode.UNNECESSARY, so I've had to fake that (but I doubt it's
used much, if at all).
I've pulled out the obviously Android-specific tests from the harmony
DecimalFormatTest.java, but I've only brought back the rounding mode changes
from the current harmony code to avoid the new tests' dependencies. I've also
added one new test of my own, to check that setMaximumFractionDigits affects
rounding as it should (since the harmony tests don't test this, and it's
somewhat subtle).
Bug: 2497395
Change-Id: Ifafc8bb051e078ead988073281f5c33f0aeb130a
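The interaction between setMaximumFractionDigits and the rounding mode can be sketched with the public java.text API as follows. This is an illustration against the standard DecimalFormat contract, not the test added by the patch; RoundingSketch and its format helper are hypothetical names, and Locale.US symbols are assumed so the decimal separator is predictable.

```java
import java.math.RoundingMode;
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

final class RoundingSketch {
    private RoundingSketch() {}

    // Formats with at most maxFractionDigits fraction digits using the given mode.
    static String format(double value, int maxFractionDigits, RoundingMode mode) {
        DecimalFormat df = new DecimalFormat("0.###", DecimalFormatSymbols.getInstance(Locale.US));
        df.setMaximumFractionDigits(maxFractionDigits);
        df.setRoundingMode(mode);
        return df.format(value);
    }

    public static void main(String[] args) {
        // 0.25 is exactly representable, so it is a genuine tie at one fraction digit.
        System.out.println(format(0.25, 1, RoundingMode.HALF_EVEN));
        System.out.println(format(0.25, 1, RoundingMode.UP));
        try {
            // UNNECESSARY asserts that no rounding is needed; here it is needed.
            format(0.25, 1, RoundingMode.UNNECESSARY);
        } catch (ArithmeticException expected) {
            System.out.println("rounding was necessary");
        }
    }
}
```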
| |
ConcurrentHashMap is our slowest choice at the moment:
  ConcurrentHashMapGet     782  XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
  HashMapGet               272  XXXXXXXXXX|||||||||||||||
  HashMapGet_Synchronized  317  XXXXXXXXXXXX|||||||||||||
  HashtableGet             325  XXXXXXXXXXXX||||||||||||||
  LinkedHashMapGet         280  XXXXXXXXXX|||||||||||||||
The cost of some commonly-created temporary objects (such as
DateFormatSymbols) is dominated by the lookup of the locale data. This patch
takes "new DateFormatSymbols" from 3us to 2.3us on passion/froyo (a 23% drop).
Bug: 2492505
|
| |
|
|\
| |
| |
| |
| |
| |
| | |
Merge commit '10ebc7d0b84dcb98e1a7eeac96ef06acdfc8d184' into dalvik-dev
* commit '10ebc7d0b84dcb98e1a7eeac96ef06acdfc8d184':
Implement (but @hide) java.text.Normalizer from Java 6.
|
| |
| |
| |
| |
| |
| |
| |
| | |
Based on https://android-git.corp.google.com/g/42516.
Includes the harmony tests from their Java 6 branch.
Bug: 719001
|
|\ \ |
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
RuleBasedBreakIterator was breaking the equals/hashCode contract.
Various classes were calling toString on arrays, which isn't very useful.
GregorianCalendar was missing a null/instanceof check. (FindBugs complained about
the former, but the super.equals would actually take care of that. The lack of
the explicit "instanceof" did mean that we could throw ClassCastException if you
had a Calendar that wasn't a GregorianCalendar, though. [Not easily testable,
and I hope we'll replace our calendars with ICU4J's before we actually have
another Calendar subclass.])
Collator's cache was broken, but luckily never had anything inserted into it
anyway.
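The three classes of problem mentioned above (equals/hashCode disagreement, toString on arrays, and a missing instanceof check) can be illustrated with one small hypothetical class. RuleTable is an invented name and is not one of the classes touched here.

```java
import java.util.Arrays;

final class RuleTable {
    private final int[] rules;

    RuleTable(int[] rules) {
        this.rules = rules.clone();
    }

    @Override public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        // instanceof rejects null and foreign types, so the cast cannot throw.
        if (!(o instanceof RuleTable)) {
            return false;
        }
        return Arrays.equals(rules, ((RuleTable) o).rules);
    }

    // Must agree with equals: equal tables produce equal hash codes.
    @Override public int hashCode() {
        return Arrays.hashCode(rules);
    }

    // Arrays.toString prints the contents; calling toString on the array
    // itself would print only its type and identity hash code.
    @Override public String toString() {
        return "RuleTable" + Arrays.toString(rules);
    }

    public static void main(String[] args) {
        RuleTable a = new RuleTable(new int[] {1, 2});
        RuleTable b = new RuleTable(new int[] {1, 2});
        System.out.println(a.equals(b) && a.hashCode() == b.hashCode());
        System.out.println(a);
    }
}
```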
|
|/ /
| |
| |
| | |
Bug: 2392157
|
|/
|
|
|
|
| |
This is ICU API not used by Java, so there's no point pretending to maintain it.
Bug: http://b/2377457
| |
This brings "new DecimalFormat" down to ~80us (from ~260us before this patch,
or ~600us this time last week). Also remove some dead code and tighten up some
accessibility.
Depends on https://android-git.corp.google.com/g/38877.
| |
Our calls to unum_setSymbol were making us O(n^2); switching to the C++ API
and doing a bulk update is a huge win. (ICU is really a C++ library with a
C wrapper. It's always going to be slightly wasteful to go via C, but here
it's especially harmful.)
The new ScopedJavaUnicodeString provides a best-of-breed bridge between Java
strings on the Java heap and the UnicodeString type that ICU wants. I'll come
back and switch more of our ICU JNI over in a later patch.
| |
Mistakenly left in my previous change when I wasn't certain it was dead.
| |
Also remove a few bits of cruft I ran across, and stop duplicating the
documentation between NumberFormat and DecimalFormat.
Bug: 2387934
|
| |
| |
(I'll come back and rename icu4jni.DecimalFormat to NativeDecimalFormat and
remove all the fully-qualified names that distinguish between java.text's
DecimalFormat and icu4jni's DecimalFormat.)
| |
Both the is-a and has-a hierarchies for our DecimalFormat implementation were
over-complicated. This patch starts to address that, and makes cloning twice
as fast (50us versus 100us), but not as fast as I'd like (<10us), and without
making much of a dent in the time it takes to create a new NumberFormat (550us
versus 600us).
The speed of cloning is important because Formatter has a hack that uses it,
and I want to change NumberFormat so that it always hands out clones... at
least until I have time to make "new NumberFormat" acceptably fast.
Also fixes DecimalFormat.applyLocalizedPattern (which used to behave as if
you'd called applyPattern).
| |
Dead code, a class that shouldn't be instantiated, work that's probably
better done all on the native side, and some slightly improved error
reporting.
| |
We don't need two identical copies of the code for double and long; ICU uses
overloading, and we should take advantage of that. We can also improve the code
to remove unnecessary heap allocation, remove unnecessary temporary copies, and
only make JNI calls and ask for the attribute data when necessary.
I've also switched the code from the thread-unsafe strtok(3) to strtok_r(3).
I've also removed unnecessary temporary char[]s and copying in DecimalFormat.
I've also fixed another instance of the "if (doubleValue == longValue) longPath"
anti-pattern that gets -0.0 wrong. (It's also worth noting that caliper says
the difference between the double and long paths is very small, on the order
of 2us.)
(The new code takes about 20us per call compared to 60us for the old code,
measured on passion-eng.)
| |
Remove a useless layer of indirection in UCharacter (which is the bridge
between java.lang.Character and ICU). We're not at the stage where the
JIT can do this for us, and even if it could, why give it extra work to
do? Also fix the incorrect copyright header which was probably copied from
a file where it made sense.
| |
Date.toString was using the TimeZone id ("America/Los_Angeles") rather than
the time zone short name ("PDT" or "PST", depending on time of year). The
naive fix made things 5x slower, so I improved Resources.getDisplayTimeZone
so the fixed Date.toString is only 2x slower. This could be improved further
with a faster getDisplayTimeZone.
I hoped to replace the body of Date.toString with a call to SimpleDateFormat,
but that turns out to be 40x slower. This patch also optimizes SimpleDateFormat
to bring the gap down to 8x by using Resources.getDisplayTimeZone instead of
asking for all the strings.
(Note that these improvements refer to the hopefully common case of localized
strings for the default locale. If you have the misfortune to need strings for
other locales, the new code will be more like 600x faster. At 0.5s a call on
the fastest current hardware, I hope no-one's actually doing that.
Dalvik Explorer -- available on the Market -- needs to do it when generating
summary reports, and it is indeed ridiculously slow. It uses two
SimpleDateFormat objects per locale, so it takes 1s per locale, for about 60
locales. I've tested Dalvik Explorer with this patch, and it does fix that
pathological behavior.)
Also fix a bug I introduced in https://android-git.corp.google.com/g/36242 that
meant that our zone names String[][] contained incorrect values (accidentally
concatenating each successive value in a row), found by existing tests now we
use more of those values.
Also replace a couple of "new Integer" calls with Integer.valueOf for a modest
speedup.
Also factor out some duplication.
Bug: http://code.google.com/p/android/issues/detail?id=6013
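The difference between a zone id and a zone short name can be seen with the public java.util.TimeZone API alone. The sketch below uses a hypothetical helper, ZoneNames.shortName, and says nothing about how Resources.getDisplayTimeZone is implemented. The exact short names returned depend on the runtime's locale data.

```java
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

final class ZoneNames {
    private ZoneNames() {}

    // Picks the standard or daylight short name depending on the instant.
    static String shortName(TimeZone zone, Date when, Locale locale) {
        return zone.getDisplayName(zone.inDaylightTime(when), TimeZone.SHORT, locale);
    }

    public static void main(String[] args) {
        TimeZone zone = TimeZone.getTimeZone("America/Los_Angeles");
        System.out.println(zone.getID());                             // the id
        System.out.println(shortName(zone, new Date(), Locale.US));   // the short name
    }
}
```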
| |
This patch switches us over to calling ICU directly for localized currency
symbols, and then removes all the mechanism for sneaking fake ResourceBundle
implementations in. The code's a lot simpler too, because ICU's default
behavior is what we want anyway.
| |
Spotted while rewriting the associated JNI recently.
| |
Don't mess around with setCurrency in DecimalFormat.copySymbols when we're
going to override any effect that call will have had in the next few lines:
we always call setCurrencySymbol and setInternationalCurrencySymbol, so
setCurrency is just wasting time.
Replaces the NativeDecimalFormat.UNumberFormatSymbol enum -- which was only used
for getting ints to pass to native code, using Enum.ordinal -- with ints.
Adds a constructor to the java.text DecimalFormat so we can avoid cloning the
DecimalFormatSymbols object we create for its private use.
This is another 10% shaved off.
I've also removed an unused local from the icu4jni DecimalFormat, so I can
remove a then-unused getLocale method from the ICU DecimalFormatSymbols.
I've rewritten the icu4jni DecimalFormatSymbols.clone to remove the scary
constructor that took an arbitrary int and treated it as a uintptr_t when
talking to native code.
| |
We don't need to create temporary String objects; we can just pass a char
directly. We also don't need to initialize aspects of our native peer if
we know we're going to overwrite them straight away, and making copying
into ICU the responsibility of the icu4jni class rather than the java.text
is slightly cleaner.
Together, these changes make creating a new NumberFormat about 20% faster.
| |
Move a couple of methods into LocaleData -- where they should have been from
the beginning -- so they're automatically hidden from our users.
|
| |
This offers an additional speed increase and gets rid of a lot of native code.
| |
This patch makes creating a new NumberFormat or new SimpleDateFormat 2x faster.
Basically, the ResourceBundle mechanism is really expensive in several ways:
1. The two-level caching is unnecessary for locale data, and expensive because
it burns through a lot of temporary objects.
2. The PrivilegedAction stuff is unnecessary and expensive because it too burns
quite a few temporary objects (including an ArrayList for each call; should
we consider removing support for SecurityManager so we can remove this cruft
from our code?).
3. The caching in most cases doesn't cache anything useful; the ResourceBundles
simply forward all questions straight to native code anyway, all we're
caching is an unnecessary forwarding object (in a cache where lookups cost
more than just creating a new unnecessary forwarding object would cost).
I've left CurrencyResourceBundle on the slow (ResourceBundle.getBundle) path
because I'm not yet sure how much of that path's semantics it relies on.
I still return LocaleResourceBundle instances (albeit via a much faster path)
but we should fix that. The native code returns an array which ResourceBundle
stuffs into a Hashtable and the calling code accesses via hash table lookups.
This despite the fact that the keys are a small fixed set known in advance.
We could make the native layer and the calling layer simpler and faster by
using a "struct", and doing so would make the middle layer go away completely.
| |
Why does this idiom persist? It's ugly, and it's the least efficient way to do
it. (I found the ones in DecimalFormatSymbols while investigating why
"new SimpleDateFormat()" burns through so many StringBuilders. grep(1) found
the rest.)
The DocumentBuilderImpl removes an unnecessary level of indirection, since we
implement Character.toString in terms of String.valueOf. (I wouldn't have
bothered except this was the only use of Character.toString in the core
libraries, and I added it myself a few weeks ago.)
| |
The active ingredient here is the two changes to stop comparing longValue
with doubleValue and formatting the long if the two compare equal. This
causes us to lose the sign of 0 (because there's no long -0, but -0.0d == 0).
Instead, we explicitly test for boxed Double and Float arguments (because
the number of integral types is larger, they get the "else" clause).
The other changes are just minor cosmetic changes made as I followed the code.
Bug found by jtreg, so no new test.
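A small sketch of why the comparison loses the sign of zero, and of the type-based dispatch described above. The class and method names (NegativeZero, formatLossy, formatByType) are hypothetical, and the real Formatter code is more involved.

```java
final class NegativeZero {
    private NegativeZero() {}

    // The anti-pattern: -0.0 == 0L is true, so -0.0 takes the long path
    // and its sign disappears.
    static String formatLossy(Number n) {
        double d = n.doubleValue();
        long l = n.longValue();
        return (d == l) ? Long.toString(l) : Double.toString(d);
    }

    // Dispatch on the boxed type instead; integral types share the "else".
    static String formatByType(Number n) {
        if (n instanceof Double) {
            return Double.toString(n.doubleValue());
        } else if (n instanceof Float) {
            return Float.toString(n.floatValue());
        } else {
            return Long.toString(n.longValue());
        }
    }

    public static void main(String[] args) {
        System.out.println(formatLossy(-0.0));  // sign lost
        System.out.println(formatByType(-0.0)); // sign kept
    }
}
```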
| |
CharsetDecoderICU and CharsetEncoderICU special-case array-backed ByteBuffers
and CharBuffers for performance reasons, but they shouldn't assume that the
backing array always has offset 0.
An external user hit this while using the jAudioTagger library.
Test cases from user submission:
http://code.google.com/p/android/issues/detail?id=4237
See also: 2234697
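As an illustration of the offset problem, a sliced array-backed buffer has a non-zero arrayOffset, so any code that indexes the backing array directly has to add it. The helper below (BackingArrays.remainingBytes) is hypothetical and uses only the public java.nio API; it is not the CharsetDecoderICU code.

```java
import java.nio.ByteBuffer;

final class BackingArrays {
    private BackingArrays() {}

    // Copies the remaining bytes without moving the buffer's position.
    static byte[] remainingBytes(ByteBuffer buffer) {
        byte[] out = new byte[buffer.remaining()];
        if (buffer.hasArray()) {
            // Fast path: index 0 of the buffer lives at arrayOffset() in the array.
            int start = buffer.arrayOffset() + buffer.position();
            System.arraycopy(buffer.array(), start, out, 0, out.length);
        } else {
            // Direct or read-only buffers: read through a duplicate instead.
            buffer.duplicate().get(out);
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] data = {10, 11, 12, 13, 14};
        ByteBuffer slice = ByteBuffer.wrap(data, 2, 3).slice();
        System.out.println(slice.arrayOffset());
        System.out.println(java.util.Arrays.toString(remainingBytes(slice)));
    }
}
```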
| |
Bugs: 2099642, 2099637
| |
We shouldn't expose internal arrays without copying.
Bug: 2102273
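A generic sketch of the defensive-copy rule, using a hypothetical class (MonthNames) rather than the class this change touched:

```java
final class MonthNames {
    private final String[] names;

    MonthNames(String[] names) {
        this.names = names.clone(); // copy in, so the caller cannot mutate our state later
    }

    String[] getNames() {
        return names.clone(); // copy out, so callers cannot mutate the internal array
    }

    public static void main(String[] args) {
        MonthNames months = new MonthNames(new String[] {"Jan", "Feb"});
        months.getNames()[0] = "oops"; // only changes the caller's copy
        System.out.println(months.getNames()[0]);
    }
}
```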
| |
1. Fixed the bug where DecimalFormat did not handle the multiplier.
2. Fixed the bug where DecimalFormat did not handle precision.
This is a copy of the original Eclair change,
https://android-git.corp.google.com/g/26297
Bug: 1897917.
| |
http://s9/81864 was a premature optimization that stopped the timezone data
being loaded in the zygote. So instead of paying the (admittedly large) time
and space costs once in the zygote, we now pay them once per application.
Revert the problematic parts of that change. Note that this isn't simply a
reverse patch:
1. I've changed the comment to make it clear that although
this *looks* like idiomatic lazy initialization, it's actually the opposite.
A comment to that effect might have prevented this code from being broken.
2. I've let the last two hunks of the original patch stand, because they
appear reasonable but unrelated.
Bug: 1941311, 1819285.
| |
buffer boundary
BUG=2033986
| |
CharsetDecoderICU.
And add unit test.
| |
mode of the encoder, clear the leftover input & output buffers.
This claims to fix buffer overwriting we're seeing during account
sync and message download.
BUG=1822859
Automated import of CL 148694
|
| |
|