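I recently switched my application to Spring Boot 2. I rely on Spring Data JPA to handle all transactions, and I noticed a huge speed difference between this and my old configuration. Storing around 1000 elements used to take about 6 seconds and now takes over 25 seconds. I have seen posts about batching with Data JPA, but none of them worked.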

Let me show you the two configurations:


    @Table(name = "category")
    public class CategoryDB implements Serializable {
        private static final long serialVersionUID = -7047292240228252349L;

        @Column(name = "category_id", length = 24)
        private String category_id;

        @Column(name = "category_name", length = 50)
        private String name;

        @Column(name = "category_plural_name", length = 50)
        private String pluralName;

        @Column(name = "url_icon", length = 200)
        private String url;

        @Column(name = "parent_category", length = 24)
        @JoinColumn(name = "parent_category", referencedColumnName = "category_id")
        private String parentID;

        //Getters & Setters
    }



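    public Set<String> insert(Set<CategoryDB> element) {
        Set<String> ids = new HashSet<>();
        Transaction tx = session.beginTransaction();
        for (CategoryDB category : element) {
            String id = (String) session.save(category);
            ids.add(id); // collect the identifier returned by save()
        }
        tx.commit(); // commit once, after all entities have been saved
        return ids;
    }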

Old Hibernate XML configuration file:

    <property name="show_sql">true</property>
    <property name="format_sql">true</property>

    <!-- connection information -->
    <property name="hibernate.connection.driver_class">com.mysql.cj.jdbc.Driver</property>
    <property name="hibernate.dialect">org.hibernate.dialect.MySQLDialect</property>

    <!-- database pooling information -->
    <property name="connection_provider_class">org.hibernate.connection.C3P0ConnectionProvider</property>
    <property name="hibernate.c3p0.min_size">5</property>
    <property name="hibernate.c3p0.max_size">100</property>
    <property name="hibernate.c3p0.timeout">300</property>
    <property name="hibernate.c3p0.max_statements">50</property>
    <property name="hibernate.c3p0.idle_test_period">3000</property>


18949156 nanoseconds spent acquiring 2 JDBC connections;
5025322 nanoseconds spent releasing 2 JDBC connections;
33116643 nanoseconds spent preparing 942 JDBC statements;
3185229893 nanoseconds spent executing 942 JDBC statements;
0 nanoseconds spent executing 0 JDBC batches;
0 nanoseconds spent performing 0 L2C puts;
0 nanoseconds spent performing 0 L2C hits;
0 nanoseconds spent performing 0 L2C misses;
3374152568 nanoseconds spent executing 1 flushes (flushing a total of 941 entities and 0 collections);
6485 nanoseconds spent executing 1 partial-flushes (flushing a total of 0 entities and 0 collections)


    public interface CategoryRepository extends JpaRepository<CategoryDB,String> {
        @Query("SELECT cat.parentID FROM CategoryDB cat WHERE cat.category_id = :#{#category.category_id}")
        String getParentID(@Param("category") CategoryDB category);
    }





spring.jpa.properties.hibernate.generate_statistics = true


24543605 nanoseconds spent acquiring 1 JDBC connections;
0 nanoseconds spent releasing 0 JDBC connections;
136919170 nanoseconds spent preparing 942 JDBC statements;
5457451561 nanoseconds spent executing 941 JDBC statements;
19985781508 nanoseconds spent executing 19 JDBC batches;
0 nanoseconds spent performing 0 L2C puts;
0 nanoseconds spent performing 0 L2C hits;
0 nanoseconds spent performing 0 L2C misses;
20256178886 nanoseconds spent executing 3 flushes (flushing a total of 2823 entities and 0 collections);
0 nanoseconds spent executing 0 partial-flushes (flushing a total of 0 entities and 0 collections)

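Probably I have misconfigured something on the Spring side. This is a huge performance difference and I have hit a dead end. Any hints about what is going wrong here are very much appreciated.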

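Let's merge the statistics so they can be compared easily. Old rows are prefixed with o, new ones with n. Rows where both counts are 0 are omitted. The nanosecond values are formatted so that the milliseconds sit before the space.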

o:    18 949156 nanoseconds spent acquiring 2 JDBC connections;
n:    24 543605 nanoseconds spent acquiring 1 JDBC connections;

o:    33 116643 nanoseconds spent preparing 942 JDBC statements;
n:   136 919170 nanoseconds spent preparing 942 JDBC statements;

o:  3185 229893 nanoseconds spent executing 942 JDBC statements;
n:  5457 451561 nanoseconds spent executing 941 JDBC statements; // losing ~2sec

o:            0 nanoseconds spent executing 0 JDBC batches;
n: 19985 781508 nanoseconds spent executing 19 JDBC batches; // losing ~20sec

o:  3374 152568 nanoseconds spent executing 1 flushes (flushing a total of 941 entities and 0 collections);
n: 20256 178886 nanoseconds spent executing 3 flushes (flushing a total of 2823 entities and 0 collections); // losing ~20sec, processing 3 times the entities

o:         6485 nanoseconds spent executing 1 partial-flushes (flushing a total of 0 entities and 0 collections)
n:            0 nanoseconds spent executing 0 partial-flushes (flushing a total of 0 entities and 0 collections)
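
The formatting used in the merged statistics can be reproduced with a few lines of Java. This is only a sketch of the convention described above; the class name `NanoFormat` is made up for illustration, and values below one millisecond are printed unchanged, as in the rows above.

```java
import java.util.Locale;

final class NanoFormat {
    /**
     * Splits a nanosecond count into "<milliseconds> <remaining nanoseconds>".
     * Values below one millisecond are returned as plain nanoseconds.
     */
    static String format(long nanos) {
        long millis = nanos / 1_000_000;
        long rest = nanos % 1_000_000;
        if (millis == 0) {
            return Long.toString(nanos);
        }
        // Pad the remainder to six digits so the space always marks the millisecond boundary.
        return String.format(Locale.ROOT, "%d %06d", millis, rest);
    }

    public static void main(String[] args) {
        // Statement execution time of the old setup, taken from the statistics.
        System.out.println(format(3_185_229_893L));
    }
}
```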


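  • The new version has 19 batches that take 20 seconds; in the old version there are no batches at all.

  • The new version has 3 flushes instead of 1, which together take 20 seconds, about 6 times as long. This is probably more or less the same extra time as the batches, since the batches are certainly part of these flushes.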
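
To put rough numbers on these two observations, the totals from the statistics can simply be divided out. The class and method names below are invented for this sketch; the input values are the ones from the statistics.

```java
import java.util.Locale;

final class FlushStats {
    /** Average time per unit in milliseconds, given a total in nanoseconds. */
    static double avgMillis(long totalNanos, long count) {
        return totalNanos / 1_000_000.0 / count;
    }

    /** How many times longer the new run took compared with the old one. */
    static double ratio(long newNanos, long oldNanos) {
        return (double) newNanos / oldNanos;
    }

    public static void main(String[] args) {
        // New setup: 19 batches in 19985781508 ns.
        System.out.println(String.format(Locale.ROOT, "ms per batch (new): %.0f",
                avgMillis(19_985_781_508L, 19)));
        // Old setup: 942 statements in 3185229893 ns.
        System.out.println(String.format(Locale.ROOT, "ms per statement (old): %.1f",
                avgMillis(3_185_229_893L, 942)));
        // Flush time of the new setup relative to the old one.
        System.out.println(String.format(Locale.ROOT, "flush ratio: %.1f",
                ratio(20_256_178_886L, 3_374_152_568L)));
    }
}
```

With the figures above this comes to roughly 1052 ms per batch in the new setup, about 3.4 ms per statement in the old one, and a flush-time ratio of about 6.0. A full second per batch is far more than the sum of the individual statements it contains would have cost in the old setup, which points at the batches themselves.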

Although batching is supposed to make things faster, there are reports of it making things slower, especially with MySQL: Why Spring's jdbcTemplate.batchUpdate() so slow? https://stackoverflow.com/questions/20360574/why-springs-jdbctemplate-batchupdate-so-slow
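
One setting that is often mentioned for slow batch inserts on MySQL is the Connector/J connection property `rewriteBatchedStatements=true`, which lets the driver rewrite a batch of inserts into multi-row statements. Whether it helps in this particular case is untested. The helper below is only a sketch of how such a property can be appended to a JDBC URL; the class name `MySqlUrl`, the host and the database name are placeholders.

```java
final class MySqlUrl {
    /** Appends a key=value connection property to a JDBC URL, using '?' or '&' as needed. */
    static String withParam(String jdbcUrl, String key, String value) {
        char separator = jdbcUrl.contains("?") ? '&' : '?';
        return jdbcUrl + separator + key + "=" + value;
    }

    public static void main(String[] args) {
        // Placeholder URL; replace host, port and schema with the real ones.
        String url = withParam("jdbc:mysql://localhost:3306/mydb", "rewriteBatchedStatements", "true");
        System.out.println(url);
    }
}
```

In a Spring Boot application the resulting URL would typically go into `spring.datasource.url`.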


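  • Disable batching to test whether you are actually suffering from some kind of slow-batch problem.
  • Use the linked SO post to speed up batching.
  • Log the SQL statements that are actually executed in order to find the differences. Since this will produce rather long logs, try to extract only the SQL statements into two files and compare them with a diff tool.
  • Log the flushes to find out what triggers the extra flushes.
  • Use breakpoints and a debugger, or extra logging, to find out which entities are being flushed and why the second variant flushes more of them.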
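
The "extract only the SQL statements" step could look roughly like the following. It is a sketch under assumptions: each statement is logged on a single line (so `format_sql` would have to be switched off for the comparison), and any line containing one of the four keywords is treated as SQL, which can produce false positives. The class name `SqlLogDiff` and the two file arguments are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SqlLogDiff {
    // Matches from the first SQL keyword to the end of the log line.
    private static final Pattern SQL =
            Pattern.compile("\\b(insert|select|update|delete)\\b.*", Pattern.CASE_INSENSITIVE);

    /** Keeps only the SQL part of each log line; lines without SQL are dropped. */
    static List<String> extractSql(List<String> logLines) {
        List<String> statements = new ArrayList<>();
        for (String line : logLines) {
            Matcher m = SQL.matcher(line);
            if (m.find()) {
                statements.add(m.group().trim());
            }
        }
        return statements;
    }

    /** Counts statements per leading keyword, for example {insert=2, select=1}. */
    static Map<String, Integer> countByVerb(List<String> statements) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String statement : statements) {
            String verb = statement.split("\\s+", 2)[0].toLowerCase(Locale.ROOT);
            counts.merge(verb, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.err.println("usage: SqlLogDiff <old-log> <new-log>");
            return;
        }
        // args[0] and args[1] are the log files of the old and the new setup.
        List<String> oldSql = extractSql(Files.readAllLines(Paths.get(args[0])));
        List<String> newSql = extractSql(Files.readAllLines(Paths.get(args[1])));
        System.out.println("old: " + countByVerb(oldSql));
        System.out.println("new: " + countByVerb(newSql));
    }
}
```

The per-keyword counts give a first impression, for instance whether the new setup issues additional selects; the extracted lists can then be written to two files and fed to a regular diff tool.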

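All of the proposals above operate on JPA. But your statistics and the content of your question suggest that you are doing simple inserts into one or a few tables. Doing this with JDBC, for example with a JdbcTemplate, might be more efficient and would at least be easier to understand.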
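
As a rough illustration of the JDBC route, here is a sketch that uses plain `java.sql` from the JDK instead of Spring's JdbcTemplate, so that it has no dependencies; with a JdbcTemplate, one of the `batchUpdate` methods would play the same role. The column names come from the `@Column` annotations in the question. The class name, the row layout as a `String[]` and the batch size are assumptions. 19 batches for 941 entities would be consistent with a batch size of 50, but that is a guess.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Types;
import java.util.List;

final class CategoryJdbcWriter {
    static final String INSERT_SQL =
            "INSERT INTO category (category_id, category_name, category_plural_name, url_icon, parent_category) "
            + "VALUES (?, ?, ?, ?, ?)";

    /** Number of executeBatch() calls needed for the given row count and batch size. */
    static int batchCount(int rows, int batchSize) {
        return (rows + batchSize - 1) / batchSize;
    }

    /**
     * Inserts all rows in one transaction and sends them to the driver in chunks.
     * Each row is {id, name, pluralName, url, parentId}; this layout is assumed for the sketch.
     * Returns the number of batches that were executed.
     */
    static int insertAll(Connection con, List<String[]> rows, int batchSize) throws SQLException {
        boolean oldAutoCommit = con.getAutoCommit();
        con.setAutoCommit(false);
        int executedBatches = 0;
        try (PreparedStatement ps = con.prepareStatement(INSERT_SQL)) {
            int pending = 0;
            for (String[] row : rows) {
                for (int i = 0; i < 5; i++) {
                    if (row[i] == null) {
                        ps.setNull(i + 1, Types.VARCHAR); // e.g. a category without a parent
                    } else {
                        ps.setString(i + 1, row[i]);
                    }
                }
                ps.addBatch();
                if (++pending == batchSize) {
                    ps.executeBatch();
                    executedBatches++;
                    pending = 0;
                }
            }
            if (pending > 0) {
                ps.executeBatch(); // send the last, partially filled batch
                executedBatches++;
            }
            con.commit();
        } catch (SQLException e) {
            con.rollback();
            throw e;
        } finally {
            con.setAutoCommit(oldAutoCommit);
        }
        return executedBatches;
    }
}
```

Timing `insertAll` against the same data would show directly whether the slowdown comes from the database and driver or from the JPA layer on top of them.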


