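When trying to run Sonar from ant, I get sporadic failures due to a SocketTimeoutException.
Setup: I am running Sonar 4.0 on a RHEL 6 box, configured to run with Postgres 9.2. I use Jenkins 1.544 to build 175 projects nightly. I had Jenkins building the projects serially using a single executor on a slave, but I recently added a second slave via swarm, so now I have two nodes (each with a single executor) building in parallel. I run Sonar from ant using the Sonar ant task, and this works well for the most part.
When I only had the one slave, a job would occasionally fail with a SocketTimeoutException while trying to load the bootstrap properties. Now that I have added a second node, this seems to happen more frequently. Interestingly, I see the same failure when I build a project from the command line using ant. It does not appear to be a server resource issue: I am the only user of the Sonar server, and I can get this error without any real load.
Here is the stack trace I got this morning when running from the command line: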
[exec] BUILD FAILED
[exec] /var/lib/jenkins/sonar.buildfile:113: org.sonar.runner.impl.RunnerException: Unable to execute Sonar
[exec] at org.sonar.runner.impl.BatchLauncher$1.delegateExecution(BatchLauncher.java:79)
[exec] at org.sonar.runner.impl.BatchLauncher$1.run(BatchLauncher.java:63)
[exec] at java.security.AccessController.doPrivileged(Native Method)
[exec] at org.sonar.runner.impl.BatchLauncher.doExecute(BatchLauncher.java:57)
[exec] at org.sonar.runner.impl.BatchLauncher.execute(BatchLauncher.java:50)
[exec] at org.sonar.runner.api.EmbeddedRunner.doExecute(EmbeddedRunner.java:71)
[exec] at org.sonar.runner.api.Runner.execute(Runner.java:89)
[exec] at org.sonar.ant.SonarTask.launchAnalysis(SonarTask.java:53)
[exec] at org.sonar.ant.SonarTask.execute(SonarTask.java:48)
[exec] at org.apache.tools.ant.UnknownElement.execute(UnknownElement.java:291)
[exec] at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
[exec] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[exec] at java.lang.reflect.Method.invoke(Method.java:601)
[exec] at org.apache.tools.ant.dispatch.DispatchUtils.execute(DispatchUtils.java:106)
[exec] at org.apache.tools.ant.Task.perform(Task.java:348)
[exec] at org.apache.tools.ant.Target.execute(Target.java:392)
[exec] at org.apache.tools.ant.Target.performTasks(Target.java:413)
[exec] at org.apache.tools.ant.Project.executeSortedTargets(Project.java:1399)
[exec] at org.apache.tools.ant.Project.executeTarget(Project.java:1368)
[exec] at org.apache.tools.ant.helper.DefaultExecutor.executeTargets(DefaultExecutor.java:41)
[exec] at org.apache.tools.ant.Project.executeTargets(Project.java:1251)
[exec] at org.apache.tools.ant.Main.runBuild(Main.java:811)
[exec] at org.apache.tools.ant.Main.startAnt(Main.java:217)
[exec] at org.apache.tools.ant.launch.Launcher.run(Launcher.java:280)
[exec] at org.apache.tools.ant.launch.Launcher.main(Launcher.java:109)
[exec] Caused by: org.sonar.api.utils.SonarException: Unable to request: /batch_bootstrap/properties?dryRun=false
[exec] at org.sonar.batch.bootstrap.ServerClient.request(ServerClient.java:92)
[exec] at org.sonar.batch.bootstrap.ServerClient.request(ServerClient.java:82)
[exec] at org.sonar.batch.bootstrap.ServerClient.request(ServerClient.java:78)
[exec] at org.sonar.batch.bootstrap.BatchSettings.downloadSettings(BatchSettings.java:97)
[exec] at org.sonar.batch.bootstrap.BatchSettings.init(BatchSettings.java:72)
[exec] at org.sonar.batch.bootstrap.BatchSettings.<init>(BatchSettings.java:55)
[exec] at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
[exec] at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
[exec] at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
[exec] at java.lang.reflect.Constructor.newInstance(Constructor.java:525)
[exec] at org.picocontainer.injectors.AbstractInjector.newInstance(AbstractInjector.java:145)
[exec] at org.picocontainer.injectors.ConstructorInjector$1.run(ConstructorInjector.java:342)
[exec] at org.picocontainer.injectors.AbstractInjector$ThreadLocalCyclicDependencyGuard.observe(AbstractInjector.java:270)
[exec] at org.picocontainer.injectors.ConstructorInjector.getComponentInstance(ConstructorInjector.java:364)
[exec] at org.picocontainer.injectors.AbstractInjectionFactory$LifecycleAdapter.getComponentInstance(AbstractInjectionFactory.java:56)
[exec] at org.picocontainer.behaviors.AbstractBehavior.getComponentInstance(AbstractBehavior.java:64)
[exec] at org.picocontainer.behaviors.Stored.getComponentInstance(Stored.java:91)
[exec] at org.picocontainer.DefaultPicoContainer.getInstance(DefaultPicoContainer.java:698)
[exec] at org.picocontainer.DefaultPicoContainer.getComponent(DefaultPicoContainer.java:646)
[exec] at org.picocontainer.DefaultPicoContainer.getComponent(DefaultPicoContainer.java:631)
[exec] at org.picocontainer.parameters.BasicComponentParameter$1.resolveInstance(BasicComponentParameter.java:118)
[exec] at org.picocontainer.parameters.ComponentParameter$1.resolveInstance(ComponentParameter.java:136)
[exec] at org.picocontainer.injectors.SingleMemberInjector.getParameter(SingleMemberInjector.java:78)
[exec] at org.picocontainer.injectors.SingleMemberInjector.getMemberArguments(SingleMemberInjector.java:61)
[exec] at org.picocontainer.injectors.MethodInjector.getMemberArguments(MethodInjector.java:100)
[exec] at org.picocontainer.injectors.MethodInjector$2.run(MethodInjector.java:112)
[exec] at org.picocontainer.injectors.AbstractInjector$ThreadLocalCyclicDependencyGuard.observe(AbstractInjector.java:270)
[exec] at org.picocontainer.injectors.MethodInjector.decorateComponentInstance(MethodInjector.java:120)
[exec] at org.picocontainer.injectors.CompositeInjector.decorateComponentInstance(CompositeInjector.java:58)
[exec] at org.picocontainer.injectors.Reinjector.reinject(Reinjector.java:142)
[exec] at org.picocontainer.injectors.ProviderAdapter.getComponentInstance(ProviderAdapter.java:96)
[exec] at org.picocontainer.DefaultPicoContainer.getInstance(DefaultPicoContainer.java:698)
[exec] at org.picocontainer.DefaultPicoContainer.getComponent(DefaultPicoContainer.java:646)
[exec] at org.picocontainer.DefaultPicoContainer.getComponent(DefaultPicoContainer.java:631)
[exec] at org.picocontainer.parameters.BasicComponentParameter$1.resolveInstance(BasicComponentParameter.java:118)
[exec] at org.picocontainer.parameters.ComponentParameter$1.resolveInstance(ComponentParameter.java:136)
[exec] at org.picocontainer.injectors.SingleMemberInjector.getParameter(SingleMemberInjector.java:78)
[exec] at org.picocontainer.injectors.ConstructorInjector$CtorAndAdapters.getParameterArguments(ConstructorInjector.java:309)
[exec] at org.picocontainer.injectors.ConstructorInjector$1.run(ConstructorInjector.java:335)
[exec] at org.picocontainer.injectors.AbstractInjector$ThreadLocalCyclicDependencyGuard.observe(AbstractInjector.java:270)
[exec] at org.picocontainer.injectors.ConstructorInjector.getComponentInstance(ConstructorInjector.java:364)
[exec] at org.picocontainer.injectors.AbstractInjectionFactory$LifecycleAdapter.getComponentInstance(AbstractInjectionFactory.java:56)
[exec] at org.picocontainer.behaviors.AbstractBehavior.getComponentInstance(AbstractBehavior.java:64)
[exec] at org.picocontainer.behaviors.Stored.getComponentInstance(Stored.java:91)
[exec] at org.picocontainer.DefaultPicoContainer.getInstance(DefaultPicoContainer.java:698)
[exec] at org.picocontainer.DefaultPicoContainer.getComponent(DefaultPicoContainer.java:646)
[exec] at org.picocontainer.DefaultPicoContainer.getComponent(DefaultPicoContainer.java:631)
[exec] at org.picocontainer.parameters.BasicComponentParameter$1.resolveInstance(BasicComponentParameter.java:118)
[exec] at org.picocontainer.parameters.ComponentParameter$1.resolveInstance(ComponentParameter.java:136)
[exec] at org.picocontainer.injectors.SingleMemberInjector.getParameter(SingleMemberInjector.java:78)
[exec] at org.picocontainer.injectors.ConstructorInjector$CtorAndAdapters.getParameterArguments(ConstructorInjector.java:309)
[exec] at org.picocontainer.injectors.ConstructorInjector$1.run(ConstructorInjector.java:335)
[exec] at org.picocontainer.injectors.AbstractInjector$ThreadLocalCyclicDependencyGuard.observe(AbstractInjector.java:270)
[exec] at org.picocontainer.injectors.ConstructorInjector.getComponentInstance(ConstructorInjector.java:364)
[exec] at org.picocontainer.injectors.AbstractInjectionFactory$LifecycleAdapter.getComponentInstance(AbstractInjectionFactory.java:56)
[exec] at org.picocontainer.behaviors.AbstractBehavior.getComponentInstance(AbstractBehavior.java:64)
[exec] at org.picocontainer.behaviors.Stored.getComponentInstance(Stored.java:91)
[exec] at org.picocontainer.DefaultPicoContainer.instantiateComponentAsIsStartable(DefaultPicoContainer.java:1033)
[exec] at org.picocontainer.DefaultPicoContainer.addAdapterIfStartable(DefaultPicoContainer.java:1025)
[exec] at org.picocontainer.DefaultPicoContainer.startAdapters(DefaultPicoContainer.java:1002)
[exec] at org.picocontainer.DefaultPicoContainer.start(DefaultPicoContainer.java:766)
[exec] at org.sonar.api.platform.ComponentContainer.startComponents(ComponentContainer.java:91)
[exec] at org.sonar.api.platform.ComponentContainer.execute(ComponentContainer.java:77)
[exec] at org.sonar.batch.bootstrapper.Batch.startBatch(Batch.java:92)
[exec] at org.sonar.batch.bootstrapper.Batch.execute(Batch.java:74)
[exec] at org.sonar.runner.batch.IsolatedLauncher.execute(IsolatedLauncher.java:45)
[exec] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[exec] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
[exec] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[exec] at java.lang.reflect.Method.invoke(Method.java:601)
[exec] at org.sonar.runner.impl.BatchLauncher$1.delegateExecution(BatchLauncher.java:75)
[exec] ... 24 more
[exec] Caused by: java.net.SocketTimeoutException: Read timed out
[exec] at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
[exec] at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
[exec] at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
[exec] at java.lang.reflect.Constructor.newInstance(Constructor.java:525)
[exec] at sun.net.www.protocol.http.HttpURLConnection$6.run(HttpURLConnection.java:1674)
[exec] at sun.net.www.protocol.http.HttpURLConnection$6.run(HttpURLConnection.java:1672)
[exec] at java.security.AccessController.doPrivileged(Native Method)
[exec] at sun.net.www.protocol.http.HttpURLConnection.getChainedException(HttpURLConnection.java:1670)
[exec] at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1243)
[exec] at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:468)
[exec] at org.sonar.api.utils.HttpDownloader$BaseHttpDownloader$HttpInputSupplier.getInput(HttpDownloader.java:274)
[exec] at org.sonar.api.utils.HttpDownloader$BaseHttpDownloader$HttpInputSupplier.getInput(HttpDownloader.java:235)
[exec] at org.sonar.batch.bootstrap.ServerClient.request(ServerClient.java:88)
[exec] ... 94 more
[exec] Caused by: java.net.SocketTimeoutException: Read timed out
[exec] at java.net.SocketInputStream.socketRead0(Native Method)
[exec] at java.net.SocketInputStream.read(SocketInputStream.java:150)
[exec] at java.net.SocketInputStream.read(SocketInputStream.java:121)
[exec] at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
[exec] at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
[exec] at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
[exec] at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:633)
[exec] at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:579)
[exec] at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1322)
[exec] at sun.net.www.protocol.http.HttpURLConnection.getHeaderField(HttpURLConnection.java:2677)
[exec] at java.net.URLConnection.getContentEncoding(URLConnection.java:533)
[exec] at org.sonar.api.utils.HttpDownloader$BaseHttpDownloader$HttpInputSupplier.getInput(HttpDownloader.java:272)
[exec] ... 96 more
[exec]
[exec] Total time: 23 seconds
I looked at the Sonar access.log and can see the initial connection and the redirect from the failure listed above:
172.20.2.172 - - [01/01/2014:09:45:25 -0500] "GET /api/server/version HTTP/1.1" 200 3 "-" "Ant/2.1"
172.20.2.172 - - [01/01/2014:09:45:26 -0500] "GET /batch_bootstrap/index HTTP/1.1" 302 125 "-" "Ant/2.1"
172.20.2.172 - - [01/01/2014:09:45:26 -0500] "GET /deploy/bootstrap/index.txt HTTP/1.1" 200 3899 "-" "Ant/2.1"
Interestingly, when I looked at the log again later, it appeared to show the connection failing 30 minutes after I started the build:
172.20.2.172 - - [01/01/2014:09:45:25 -0500] "GET /api/server/version HTTP/1.1" 200 3 "-" "Ant/2.1"
172.20.2.172 - - [01/01/2014:09:45:26 -0500] "GET /batch_bootstrap/index HTTP/1.1" 302 125 "-" "Ant/2.1"
172.20.2.172 - - [01/01/2014:09:45:26 -0500] "GET /deploy/bootstrap/index.txt HTTP/1.1" 200 3899 "-" "Ant/2.1"
172.20.2.172 - - [01/01/2014:10:16:20 -0500] "GET /batch_bootstrap/properties?dryRun=false HTTP/1.1" 401 41 "-" "Sonar null/null"
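To put a number on that gap, here is a minimal Python sketch that computes the time between the last successful bootstrap request and the 401 entry. The timestamp format string is an assumption based on how the dates appear in the excerpts above (day/month/year), and `gap_seconds` is a hypothetical helper name, not part of any Sonar tooling.

```python
from datetime import datetime

# Timestamp layout as it appears in the log excerpts (assumed day/month/year).
LOG_TIME_FORMAT = "%d/%m/%Y:%H:%M:%S %z"


def gap_seconds(earlier, later):
    """Return the number of seconds between two access-log timestamps."""
    t0 = datetime.strptime(earlier, LOG_TIME_FORMAT)
    t1 = datetime.strptime(later, LOG_TIME_FORMAT)
    return (t1 - t0).total_seconds()


gap = gap_seconds("01/01/2014:09:45:26 -0500", "01/01/2014:10:16:20 -0500")
minutes, seconds = divmod(int(gap), 60)
print("%d min %d s" % (minutes, seconds))  # 30 min 54 s
```

So the properties request shows up in the access log almost 31 minutes after the index.txt request, while the ant run itself reported a total time of only 23 seconds.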
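Now, I am using the LDAP plugin to authenticate Sonar users against Active Directory, and I have read that this can cause slowness and possibly even timeouts. I added nscd but did not notice any improvement. I also upgraded to sonar-ant-task-2.1.jar but saw no change. I can access /batch_bootstrap/properties?dryRun=false from a browser, though some requests take noticeably longer than others. If I re-run a failed job, it almost always succeeds on the second try.
I am at a loss as to what to try next. I would like to expand the number of nodes in my cluster, but I am afraid that would just cause more failures. I do not like telling developers to ignore build failure emails, and with the number of jobs I have, I hit this problem often. If anyone has logging or configuration changes that might help avoid or isolate the issue, I am willing to try them.
Thanks - Sam
UPDATE: I created a short script that fetches /batch_bootstrap/properties?dryRun=false using curl and sends the result to /dev/null. I set curl --max-time to 2 seconds and put the script on cron to run every 2 minutes. Nearly every request completes before the 2 second timeout (only 26 failures in 48 hours, all during 00:30 - 01:00, when I am running some maintenance tasks). Since I started pinging the server this way, I have not had a single build failure due to the SocketTimeoutException. I do not really like this solution, but it seems to work for now. I am still interested in alternatives if anyone has something to try.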
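For anyone who wants to try the same kind of probe without curl, here is a rough Python equivalent. It is only a sketch: the host name in the usage note is a placeholder, `probe` is a hypothetical helper name, and the `timeout` passed to `urlopen` applies to each blocking socket operation rather than to the whole transfer, so it is not an exact match for curl's --max-time.

```python
import http.client
import time
import urllib.error
import urllib.request


def probe(url, timeout_s=2.0):
    """Fetch url, discard the body, and return (ok, elapsed_seconds)."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            # Read and throw away the body, like redirecting to /dev/null.
            while resp.read(8192):
                pass
        ok = True
    except urllib.error.HTTPError:
        # The server answered with an error status (for example 401); curl
        # without --fail would also count this as a completed request.
        ok = True
    except (OSError, http.client.HTTPException):
        # Connection refused, DNS failure, read timeout, truncated response.
        ok = False
    return ok, time.monotonic() - start
```

Called from a small script as `probe("http://sonar.example.com:9000/batch_bootstrap/properties?dryRun=false")` (placeholder host) and scheduled from cron, the returned pair can be written to a log to see which requests fail or run slowly, and when.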
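UPDATE: I was able to lower the cron frequency to every 5 minutes without seeing the problem. When I tried every 10 minutes, I saw the SocketTimeoutException failures again. I also did a general yum update for good measure, but that did not appear to improve the situation.
Thanks - Sam
UPDATE: Things got progressively worse with Sonar 4.5, when Elasticsearch was split out into a separate process. I was getting several of the following failures in the Sonar log file: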
2014.12.24 03:42:28 ERROR web[o.s.s.s.SearchClient] could not execute request: org.elasticsearch.action.get.GetRequestBuilder@48c9ec56
org.elasticsearch.client.transport.NoNodeAvailableException: No node available
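I logged the load over time and found that it was higher than I expected during the nightly build cycle. Other than the Sonar failures, I still did not see any other odd behavior. Some new hardware recently became available, so I moved Sonar onto a new box by itself. Since doing so, I have not seen any timeout-related build failures.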
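The post does not say how the load was recorded, so the following is only one possible way to do it: a small Python sketch that appends timestamped load averages to a file. The function names, file path, and sampling interval are all illustrative assumptions, and `os.getloadavg()` is available on Unix-like systems only.

```python
import os
import time


def load_line(now=None):
    """Return one log line: timestamp plus the 1, 5 and 15 minute load averages."""
    one, five, fifteen = os.getloadavg()  # Unix only
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(now))
    return "%s %.2f %.2f %.2f" % (stamp, one, five, fifteen)


def record(path, samples, interval_s, sleep=time.sleep):
    """Append `samples` load lines to `path`, waiting interval_s between them."""
    with open(path, "a") as out:
        for i in range(samples):
            out.write(load_line() + "\n")
            out.flush()
            if i < samples - 1:
                sleep(interval_s)
```

For example, `record("/tmp/sonar-load.log", 720, 60)` would take one sample per minute for twelve hours, which is enough to cover a nightly build cycle and line the load up against the times of the failures.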
So, if anyone sees similar errors, this type of failure appears to be a sign that the server needs more resources.
Thanks - Sam