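I basically never do Android coding myself, but I have some notes and possible ideas for you to consider, since this is plain Java.
Your reader does far too much work while reading each element.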
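First, you don't need to create a Gson instance each time you need one:
- It is immutable and thread-safe.
- It is relatively expensive to create.
- Instantiating a Gson instance also hits the heap, which costs more time to execute and then to garbage-collect.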
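Next, there is a difference between plain deserialization in Gson and JSON stream reading: the former may use a heavy composition of type adapters under the hood, whereas the latter can simply parse a JSON document token by token.
Having said that, you can gain better performance by reading the JSON stream yourself: your JSON file is known to have a very strict structure, so a high-level parser can be implemented much more simply.
Suppose a simple test suite with different implementations for your problem: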
Data objects
City.java
final class City {
@SerializedName("_id")
final int id;
@SerializedName("country")
final String country;
@SerializedName("name")
final String name;
@SerializedName("coord")
final Coordinates coordinates;
private City(final int id, final String country, final String name, final Coordinates coordinates) {
this.id = id;
this.country = country;
this.name = name;
this.coordinates = coordinates;
}
static City of(final int id, final String country, final String name, final Coordinates coordinates) {
return new City(id, country, name, coordinates);
}
@Override
public boolean equals(final Object o) {
if ( this == o ) {
return true;
}
if ( o == null || getClass() != o.getClass() ) {
return false;
}
final City that = (City) o;
return id == that.id;
}
@Override
public int hashCode() {
return id;
}
@SuppressWarnings("ConstantConditions")
public static int compareByName(final City city1, final City city2) {
return city1.name.compareTo(city2.name);
}
}
Coordinates.java
final class Coordinates {
@SerializedName("lat")
final double latitude;
@SerializedName("lon")
final double longitude;
private Coordinates(final double latitude, final double longitude) {
this.latitude = latitude;
this.longitude = longitude;
}
static Coordinates of(final double latitude, final double longitude) {
return new Coordinates(latitude, longitude);
}
@Override
public boolean equals(final Object o) {
if ( this == o ) {
return true;
}
if ( o == null || getClass() != o.getClass() ) {
return false;
}
final Coordinates that = (Coordinates) o;
return Double.compare(that.latitude, latitude) == 0
&& Double.compare(that.longitude, longitude) == 0;
}
@Override
public int hashCode() {
final long latitudeBits = Double.doubleToLongBits(latitude);
final long longitudeBits = Double.doubleToLongBits(longitude);
final int latitudeHash = (int) (latitudeBits ^ latitudeBits >>> 32);
final int longitudeHash = (int) (longitudeBits ^ longitudeBits >>> 32);
return 31 * latitudeHash + longitudeHash;
}
}
Test infrastructure
ITest.java
interface ITest {
@Nonnull
default String getName() {
return getClass().getSimpleName();
}
@Nonnull
Collection<City> test(@Nonnull JsonReader jsonReader)
throws IOException;
}
main
public static void main(final String... args)
throws IOException {
final Iterable<ITest> tests = ImmutableList.of(
FirstTest.get(),
ReadAsWholeListTest.get(),
ReadAsWholeTreeSetTest.get(),
ReadJsonStreamIntoListTest.get(),
ReadJsonStreamIntoTreeSetTest.get(),
ReadJsonStreamIntoListChunksTest.get()
);
for ( int i = 0; i < 3; i++ ) {
for ( final ITest test : tests ) {
try ( final ZipInputStream zipInputStream = new ZipInputStream(Resources.getPackageResourceInputStream(Q49273660.class, "cities.json.zip")) ) {
for ( ZipEntry zipEntry = zipInputStream.getNextEntry(); zipEntry != null; zipEntry = zipInputStream.getNextEntry() ) {
if ( zipEntry.getName().equals("cities.json") ) {
final JsonReader jsonReader = new JsonReader(new InputStreamReader(zipInputStream)); // do not close
System.out.printf("%1$35s : ", test.getName());
final Stopwatch stopwatch = Stopwatch.createStarted();
final Collection<City> cities = test.test(jsonReader);
System.out.printf("in %d ms with %d elements\n", stopwatch.elapsed(TimeUnit.MILLISECONDS), cities.size());
assertSorted(cities, City::compareByName);
}
}
}
}
System.out.println("--------------------");
}
}
private static <E> void assertSorted(final Iterable<? extends E> iterable, final Comparator<? super E> comparator) {
final Iterator<? extends E> iterator = iterable.iterator();
if ( !iterator.hasNext() ) {
return;
}
E a = iterator.next();
if ( !iterator.hasNext() ) {
return;
}
do {
final E b = iterator.next();
if ( comparator.compare(a, b) > 0 ) {
throw new AssertionError(a + " " + b);
}
a = b;
} while ( iterator.hasNext() );
}
Tests
FirstTest.java
This is the slowest one.
It is just an adaptation of the code in your question to the test suite.
final class FirstTest
implements ITest {
private static final ITest instance = new FirstTest();
private FirstTest() {
}
static ITest get() {
return instance;
}
@Nonnull
@Override
public List<City> test(@Nonnull final JsonReader jsonReader)
throws IOException {
jsonReader.beginArray();
final List<City> cities = new ArrayList<>();
while ( jsonReader.hasNext() ) {
final City city = new Gson().fromJson(jsonReader, City.class);
cities.add(city);
}
jsonReader.endArray();
cities.sort(City::compareByName);
return cities;
}
}
ReadAsWholeListTest.java
This is most likely how you would implement it.
It is not the winner, but it is the simplest one, and it uses the default sorting.
final class ReadAsWholeListTest
implements ITest {
private static final ITest instance = new ReadAsWholeListTest();
private ReadAsWholeListTest() {
}
static ITest get() {
return instance;
}
private static final Gson gson = new Gson();
private static final Type citiesListType = new TypeToken<List<City>>() {
}.getType();
@Nonnull
@Override
public List<City> test(@Nonnull final JsonReader jsonReader) {
final List<City> cities = gson.fromJson(jsonReader, citiesListType);
cities.sort(City::compareByName);
return cities;
}
}
ReadAsWholeTreeSetTest.java
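Another idea, if you are not bound to lists, is to use an already-sorted collection such as TreeSet.
Since I don't know of a way to tell Gson which comparator a new TreeSet should use, this test has to use a custom type adapter factory (it would not be necessary if City were already comparable by name, but that is not flexible).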
final class ReadAsWholeTreeSetTest
implements ITest {
private static final ITest instance = new ReadAsWholeTreeSetTest();
private ReadAsWholeTreeSetTest() {
}
static ITest get() {
return instance;
}
@SuppressWarnings({ "rawtypes", "unchecked" })
private static final TypeToken<TreeSet<?>> rawTreeSetType = (TypeToken) TypeToken.get(TreeSet.class);
private static final Map<Type, Comparator<?>> comparatorsRegistry = ImmutableMap.of(
City.class, (Comparator<City>) City::compareByName
);
private static final Gson gson = new GsonBuilder()
.registerTypeAdapterFactory(new TypeAdapterFactory() {
@Override
public <T> TypeAdapter<T> create(final Gson gson, final TypeToken<T> typeToken) {
if ( !TreeSet.class.isAssignableFrom(typeToken.getRawType()) ) {
return null;
}
final Type elementType = ((ParameterizedType) typeToken.getType()).getActualTypeArguments()[0];
@SuppressWarnings({ "rawtypes", "unchecked" })
final Comparator<Object> comparator = (Comparator) comparatorsRegistry.get(elementType);
if ( comparator == null ) {
return null;
}
final TypeAdapter<TreeSet<?>> originalTreeSetTypeAdapter = gson.getDelegateAdapter(this, rawTreeSetType);
final TypeAdapter<?> originalElementTypeAdapter = gson.getDelegateAdapter(this, TypeToken.get(elementType));
final TypeAdapter<TreeSet<Object>> treeSetTypeAdapter = new TypeAdapter<TreeSet<Object>>() {
@Override
public void write(final JsonWriter jsonWriter, final TreeSet<Object> treeSet)
throws IOException {
originalTreeSetTypeAdapter.write(jsonWriter, treeSet);
}
@Override
public TreeSet<Object> read(final JsonReader jsonReader)
throws IOException {
jsonReader.beginArray();
final TreeSet<Object> elements = new TreeSet<>(comparator);
while ( jsonReader.hasNext() ) {
final Object element = originalElementTypeAdapter.read(jsonReader);
elements.add(element);
}
// consume the closing bracket that matches beginArray() above
jsonReader.endArray();
return elements;
}
}.nullSafe();
@SuppressWarnings({ "rawtypes", "unchecked" })
final TypeAdapter<T> castTreeSetTypeAdapter = (TypeAdapter<T>) treeSetTypeAdapter;
return castTreeSetTypeAdapter;
}
})
.create();
private static final Type citiesSetType = new TypeToken<TreeSet<City>>() {
}.getType();
@Nonnull
@Override
public Set<City> test(@Nonnull final JsonReader jsonReader) {
return gson.fromJson(jsonReader, citiesSetType);
}
}
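The parenthetical alternative mentioned before this test can be sketched with plain Java collections, no Gson involved. The following is only an illustration under assumptions: `NamedCity` is a hypothetical stand-in for `City`, not a class from the test suite. It shows why a natural ordering is less flexible: a `Comparable` element fixes one ordering for every `TreeSet`, while an explicit comparator can differ per set.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.TreeSet;

final class TreeSetOrderingSketch {

    private TreeSetOrderingSketch() {
    }

    /** Hypothetical element type whose natural ordering is by name. */
    static final class NamedCity implements Comparable<NamedCity> {
        final int id;
        final String name;

        NamedCity(final int id, final String name) {
            this.id = id;
            this.name = name;
        }

        @Override
        public int compareTo(final NamedCity other) {
            return name.compareTo(other.name);
        }
    }

    /** No comparator needed: the set falls back to NamedCity.compareTo. */
    static TreeSet<NamedCity> byNaturalOrder(final NamedCity... cities) {
        return new TreeSet<>(Arrays.asList(cities));
    }

    /** Any other ordering still requires an explicit comparator. */
    static TreeSet<NamedCity> byIdDescending(final NamedCity... cities) {
        final TreeSet<NamedCity> set = new TreeSet<>(Comparator.comparingInt((NamedCity c) -> c.id).reversed());
        set.addAll(Arrays.asList(cities));
        return set;
    }

    public static void main(final String... args) {
        final NamedCity[] cities = {
                new NamedCity(2, "Oslo"), new NamedCity(1, "Athens"), new NamedCity(3, "Madrid")
        };
        System.out.println(byNaturalOrder(cities).first().name);
        System.out.println(byIdDescending(cities).first().name);
    }
}
```

Either way, a `TreeSet` treats elements that compare as equal as duplicates, so two cities with the same name would end up as a single element under the by-name ordering.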
JSON stream reader tests
The class below is a special reader test that uses a simplified strategy for reading the cities JSON.
AbstractJsonStreamTest.java
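It is probably as simple as it can be (in terms of JSON structure analysis), and it requires the JSON document to be very strict: every property must appear in exactly the expected order.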
abstract class AbstractJsonStreamTest
implements ITest {
protected static void read(final JsonReader jsonReader, final Consumer<? super City> cityConsumer)
throws IOException {
jsonReader.beginArray();
while ( jsonReader.hasNext() ) {
jsonReader.beginObject();
require(jsonReader, "country");
final String country = jsonReader.nextString();
require(jsonReader, "name");
final String name = jsonReader.nextString();
require(jsonReader, "_id");
final int id = jsonReader.nextInt();
require(jsonReader, "coord");
jsonReader.beginObject();
require(jsonReader, "lon");
final double longitude = jsonReader.nextDouble();
require(jsonReader, "lat");
final double latitude = jsonReader.nextDouble();
jsonReader.endObject();
jsonReader.endObject();
final City city = City.of(id, country, name, Coordinates.of(latitude, longitude));
cityConsumer.accept(city);
}
jsonReader.endArray();
}
private static void require(final JsonReader jsonReader, final String expectedName)
throws IOException {
final String actualName = jsonReader.nextName();
if ( !actualName.equals(expectedName) ) {
throw new JsonParseException("Expected " + expectedName + " but was " + actualName);
}
}
}
ReadJsonStreamIntoListTest.java
This one is very much like ReadAsWholeListTest, but it uses the simplified deserialization mechanism.
final class ReadJsonStreamIntoListTest
extends AbstractJsonStreamTest {
private static final ITest instance = new ReadJsonStreamIntoListTest();
private ReadJsonStreamIntoListTest() {
}
static ITest get() {
return instance;
}
@Nonnull
@Override
public Collection<City> test(@Nonnull final JsonReader jsonReader)
throws IOException {
final List<City> cities = new ArrayList<>();
read(jsonReader, cities::add);
cities.sort(City::compareByName);
return cities;
}
}
ReadJsonStreamIntoTreeSetTest.java
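Like the previous one, this is just another implementation of its more expensive counterpart (ReadAsWholeTreeSetTest), but it does not require a custom type adapter.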
final class ReadJsonStreamIntoTreeSetTest
extends AbstractJsonStreamTest {
private static final ITest instance = new ReadJsonStreamIntoTreeSetTest();
private ReadJsonStreamIntoTreeSetTest() {
}
static ITest get() {
return instance;
}
@Nonnull
@Override
public Collection<City> test(@Nonnull final JsonReader jsonReader)
throws IOException {
final Collection<City> cities = new TreeSet<>(City::compareByName);
read(jsonReader, cities::add);
return cities;
}
}
ReadJsonStreamIntoListChunksTest.java
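The following test is based on your initial idea, but it does not sort the chunks in parallel (I'm not sure it would help, but you can give it a try).
I still think the previous two are simpler, probably easier to maintain, and likely to give more of a performance gain.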
final class ReadJsonStreamIntoListChunksTest
extends AbstractJsonStreamTest {
private static final ITest instance = new ReadJsonStreamIntoListChunksTest();
private ReadJsonStreamIntoListChunksTest() {
}
static ITest get() {
return instance;
}
@Nonnull
@Override
public List<City> test(@Nonnull final JsonReader jsonReader)
throws IOException {
final Collection<List<City>> cityChunks = new ArrayList<>();
final AtomicReference<List<City>> cityChunkRef = new AtomicReference<>(new ArrayList<>());
read(jsonReader, city -> {
final List<City> cityChunk = cityChunkRef.get();
cityChunk.add(city);
if ( cityChunk.size() >= 10000 ) {
cityChunks.add(cityChunk);
cityChunkRef.set(new ArrayList<>());
}
});
if ( !cityChunkRef.get().isEmpty() ) {
cityChunks.add(cityChunkRef.get());
}
for ( final List<City> cities : cityChunks ) {
Collections.sort(cities, City::compareByName);
}
return merge(cityChunks, City::compareByName);
}
/**
* <p>Adapted from:</p>
* <ul>
* <li>Original question: https://stackoverflow.com/questions/1774256/java-code-review-merge-sorted-lists-into-a-single-sorted-list</li>
* <li>Accepted answer: https://stackoverflow.com/questions/1774256/java-code-review-merge-sorted-lists-into-a-single-sorted-list/1775748#1775748</li>
* </ul>
*/
@SuppressWarnings("MethodCallInLoopCondition")
private static <E> List<E> merge(final Iterable<? extends List<E>> lists, final Comparator<? super E> comparator) {
int totalSize = 0;
for ( final List<E> l : lists ) {
totalSize += l.size();
}
final List<E> result = new ArrayList<>(totalSize);
while ( result.size() < totalSize ) { // while we still have something to add
List<E> lowest = null;
for ( final List<E> l : lists ) {
if ( !l.isEmpty() ) {
if ( lowest == null || comparator.compare(l.get(0), lowest.get(0)) <= 0 ) {
lowest = l;
}
}
}
assert lowest != null;
result.add(lowest.get(0));
lowest.remove(0);
}
return result;
}
}
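If you do want to try the parallel variant, a minimal sketch could look like the one below. It is not part of the measured tests, and its names (`ChunkMergeSketch`, `Cursor`, `sortAndMerge`) are hypothetical. It assumes the chunks are mutable lists, sorts each chunk on the common fork-join pool via `parallelStream()`, and then merges them with a `PriorityQueue` of cursors instead of repeatedly calling `remove(0)` on an `ArrayList`, which shifts all remaining elements on every call. Whether this is actually faster on a given device would have to be measured.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

final class ChunkMergeSketch {

    private ChunkMergeSketch() {
    }

    /** One sorted chunk plus the index of its next unread element. */
    private static final class Cursor<E> {
        final List<E> chunk;
        int index;

        Cursor(final List<E> chunk) {
            this.chunk = chunk;
        }

        E head() {
            return chunk.get(index);
        }
    }

    static <E> List<E> sortAndMerge(final List<List<E>> chunks, final Comparator<? super E> comparator) {
        // Each chunk is sorted by exactly one worker thread, so no extra synchronization is needed.
        chunks.parallelStream().forEach(chunk -> chunk.sort(comparator));
        int totalSize = 0;
        for ( final List<E> chunk : chunks ) {
            totalSize += chunk.size();
        }
        // The heap always exposes the cursor whose current head is the smallest.
        final PriorityQueue<Cursor<E>> heap = new PriorityQueue<>(
                Math.max(1, chunks.size()),
                (c1, c2) -> comparator.compare(c1.head(), c2.head())
        );
        for ( final List<E> chunk : chunks ) {
            if ( !chunk.isEmpty() ) {
                heap.add(new Cursor<>(chunk));
            }
        }
        final List<E> result = new ArrayList<>(totalSize);
        while ( !heap.isEmpty() ) {
            final Cursor<E> lowest = heap.poll();
            result.add(lowest.head());
            lowest.index++;
            if ( lowest.index < lowest.chunk.size() ) {
                heap.add(lowest); // re-insert the cursor so it is ordered by its new head
            }
        }
        return result;
    }

    public static void main(final String... args) {
        final List<List<String>> chunks = new ArrayList<>();
        chunks.add(new ArrayList<>(Arrays.asList("Oslo", "Berlin")));
        chunks.add(new ArrayList<>(Arrays.asList("Zagreb", "Athens", "Madrid")));
        System.out.println(sortAndMerge(chunks, Comparator.<String>naturalOrder()));
    }
}
```

With k chunks and n elements in total, the merge does O(n log k) comparisons and never removes elements from the chunks themselves.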
Test results
On my desktop JRE I get the following test results:
FirstTest : in 5797 ms with 209557 elements
ReadAsWholeListTest : in 796 ms with 209557 elements
ReadAsWholeTreeSetTest : in 733 ms with 162006 elements
ReadJsonStreamIntoListTest : in 461 ms with 209557 elements
ReadJsonStreamIntoTreeSetTest : in 452 ms with 162006 elements
ReadJsonStreamIntoListChunksTest : in 607 ms with 209557 elements
--------------------
FirstTest : in 3396 ms with 209557 elements
ReadAsWholeListTest : in 493 ms with 209557 elements
ReadAsWholeTreeSetTest : in 520 ms with 162006 elements
ReadJsonStreamIntoListTest : in 385 ms with 209557 elements
ReadJsonStreamIntoTreeSetTest : in 377 ms with 162006 elements
ReadJsonStreamIntoListChunksTest : in 540 ms with 209557 elements
--------------------
FirstTest : in 3448 ms with 209557 elements
ReadAsWholeListTest : in 429 ms with 209557 elements
ReadAsWholeTreeSetTest : in 421 ms with 162006 elements
ReadJsonStreamIntoListTest : in 400 ms with 209557 elements
ReadJsonStreamIntoTreeSetTest : in 385 ms with 162006 elements
ReadJsonStreamIntoListChunksTest : in 480 ms with 209557 elements
--------------------
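As you can see, creating excessive Gson instances is definitely a bad idea.
The more optimized tests achieve better performance.
However, splitting a large list into sorted chunks (non-parallel) to be merged later does not give much of a performance gain in my environment.
For the sake of simplicity, which is probably the best choice, I would go with ReadJsonStreamInto_Collection_Test, depending on the collection you need.
I'm not really sure how well it would work in a real Android environment, but you can simply do the JSON deserialization yourself a bit better than Gson does using its internals.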
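By the way:
- I'm not really sure, but did you notice the 162006 unique cities? Your JSON file probably has some duplicates (at least if _id is the identity). Keep in mind that the TreeSet tests compare by name only, so cities that merely share a name also collapse into one element.
- What if you just generated a sorted version of cities.json on your workstation ahead of time, before using it on the Android device? Also, if my assumption above is correct, you may want to filter out the duplicates.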
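To check the duplicates assumption, it may help to count distinct names and distinct ids separately, because a TreeSet ordered by name drops every element whose name is already present. The sketch below is an assumption-laden illustration: `Place` is a hypothetical stand-in for `City`, and `distinctByIdSortedByName` is just one possible way to filter duplicates by id before sorting.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

final class DedupSketch {

    private DedupSketch() {
    }

    /** Hypothetical stand-in for City with only the fields needed here. */
    static final class Place {
        final int id;
        final String name;

        Place(final int id, final String name) {
            this.id = id;
            this.name = name;
        }
    }

    private static final Comparator<Place> BY_NAME = Comparator.comparing((Place p) -> p.name);

    /** Keeps the first occurrence of each id, then sorts the survivors by name. */
    static List<Place> distinctByIdSortedByName(final Iterable<Place> places) {
        final Map<Integer, Place> byId = new LinkedHashMap<>();
        for ( final Place place : places ) {
            byId.putIfAbsent(place.id, place);
        }
        final List<Place> result = new ArrayList<>(byId.values());
        result.sort(BY_NAME);
        return result;
    }

    /** Counts distinct names: this is what a TreeSet ordered by name reports as its size. */
    static int distinctNameCount(final Iterable<Place> places) {
        final TreeSet<Place> byName = new TreeSet<>(BY_NAME);
        for ( final Place place : places ) {
            byName.add(place);
        }
        return byName.size();
    }

    public static void main(final String... args) {
        final List<Place> places = Arrays.asList(
                new Place(1, "Springfield"),
                new Place(2, "Springfield"), // same name, different id
                new Place(1, "Springfield"), // true duplicate by id
                new Place(3, "Athens")
        );
        System.out.println("distinct ids:   " + distinctByIdSortedByName(places).size());
        System.out.println("distinct names: " + distinctNameCount(places));
    }
}
```

If the two counts differ for the real file, some of the elements missing from the TreeSet results are distinct cities that happen to share a name rather than true duplicates.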