问题现象
某天开发人员告诉我,spark大数据任务执行时,docker 环境上的mysql 连接不上了。于是乎,打开spark查看日志,发现如下错误:
22/10/18 13:52:56 WARN DruidDataSource: get connection timeout retry : 1
22/10/18 13:52:58 ERROR DruidDataSource: create connection SQLException, url: jdbc:mysql://10.192.11.13:3345/yundb?useUnicode=true&characterEncoding=utf8&serverTimezone=Asia/Shanghai&allowMultiQueries=true&connectTimeout=5000&socketTimeout=5000&autoReconnect=true&maxReconnects=5&failOverReadOnly=false, errorCode 0, state 08001
java.sql.SQLNonTransientConnectionException: Could not create connection to database server. Attempted reconnect 5 times. Giving up.
at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:110)
at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:97)
at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:89)
at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:63)
at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:73)
at com.mysql.cj.jdbc.ConnectionImpl.connectWithRetries(ConnectionImpl.java:906)
at com.mysql.cj.jdbc.ConnectionImpl.createNewIO(ConnectionImpl.java:831)
at com.mysql.cj.jdbc.ConnectionImpl.<init>(ConnectionImpl.java:456)
at com.mysql.cj.jdbc.ConnectionImpl.getInstance(ConnectionImpl.java:246)
at com.mysql.cj.jdbc.NonRegisteringDriver.connect(NonRegisteringDriver.java:199)
at com.alibaba.druid.pool.DruidAbstractDataSource.createPhysicalConnection(DruidAbstractDataSource.java:1513)
at com.alibaba.druid.pool.DruidAbstractDataSource.createPhysicalConnection(DruidAbstractDataSource.java:1578)
at com.alibaba.druid.pool.DruidDataSource$CreateConnectionThread.run(DruidDataSource.java:2466)
Caused by: com.mysql.cj.exceptions.CJException: null, message from server: "Host '10.192.11.19' is blocked because of many connection errors; unblock with 'mysqladmin flush-hosts'"
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
……
at com.mysql.cj.NativeSession.connect(NativeSession.java:157)
at com.mysql.cj.jdbc.ConnectionImpl.connectWithRetries(ConnectionImpl.java:850)
... 7 more
22/10/18 13:53:02 ERROR DefaultWorkflowEngine: workflow execute error. instance id=7a7699ef-4346-4e5a-a787-221d5629a8ff, current activity=cn.tongdun.yuntu.client.workflow.model.definition.Activity@541bdb14, e=com.alibaba.druid.pool.GetConnectionTimeoutException: wait millis 6000, active 0, maxActive 8, creating 1
at com.alibaba.druid.pool.DruidDataSource.getConnectionInternal(DruidDataSource.java:1508)
at com.alibaba.druid.pool.DruidDataSource.getConnectionDirect(DruidDataSource.java:1253)
at com.alibaba.druid.pool.DruidDataSource.getConnection(DruidDataSource.java:1233)
at com.alibaba.druid.pool.DruidDataSource.getConnection(DruidDataSource.java:1223)
at cn.txxx.yxxxx.hai.storage.db.mysql.ConnectionPool.getConnection(ConnectionPool.scala:49)
……
at cn.txxx.yxxxx.suan.handler.main.BatchMain.main(BatchMain.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:685)
Caused by: java.sql.SQLNonTransientConnectionException: Could not create connection to database server. Attempted reconnect 5 times. Giving up.
at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:110)
at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:97)
at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:89)
at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:63)
关键错误信息:Caused by: com.mysql.cj.exceptions.CJException: null, message from server: “Host ‘10.192.11.19’ is blocked because of many connection errors; unblock with ‘mysqladmin flush-hosts’”
看到这个,大概猜测原因是该mysql启动时间太长了,有可能存在某些使用不合理,有连接泄漏的现象。
因连接数过多,导致建立新的连接不成功。
解决办法
重启mysql 后,连接长长。
分析原因
查看mysql 上的连接数,有很多很早以前的连接,一直没有释放。
可能是有不合理使用数据库连接的地方。有代码写的不合理,发生连接泄漏。
当然,如果下一次再出现,就可以根据连接信息里的IP和端口,去调查是哪个应用产生的连接泄漏。
本文内容由网友自发贡献,版权归原作者所有,本站不承担相应法律责任。如您发现有涉嫌抄袭侵权的内容,请联系:hwhale#tublm.com(使用前将#替换为@)