ForEach-Object -Parallel https://learn.microsoft.com/en-us/powershell/module/microsoft.powershell.core/foreach-object?view=powershell-7.2 and Start-ThreadJob https://docs.microsoft.com/en-us/powershell/module/threadjob/start-threadjob?view=powershell-7.2 both have built-in functionality to limit the number of threads that can run at the same time. The same applies to a Runspace https://learn.microsoft.com/en-us/dotnet/api/system.management.automation.runspaces.runspace?view=powershellsdk-7.0.0 used with a RunspacePool https://learn.microsoft.com/en-us/dotnet/api/system.management.automation.runspaces.runspacepool?view=powershellsdk-7.0.0 which is what both cmdlets use behind the scenes.
Start-Job https://learn.microsoft.com/en-us/powershell/module/microsoft.powershell.core/start-job?view=powershell-7.2 does not offer such functionality because each job runs in a separate process, unlike the cmdlets mentioned above, which run their work in different threads within the same process. I would also personally not consider it a parallelism alternative: it is very slow, and in most cases a linear loop will be faster. Serialization and deserialization https://learn.microsoft.com/en-us/powershell/module/microsoft.powershell.core/about/about_remote_output?view=powershell-7.2#deserialized-objects can also be a problem in some cases.
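To make the serialization point concrete, here is a minimal sketch (not part of the original answer) comparing what comes back from Start-Job and from Start-ThreadJob. It assumes the ThreadJob module is available, as it is by default in PowerShell 7; the variable names are only illustrative.

```powershell
# Start-Job runs in a child process, so its output is serialized and comes back
# as a property-only copy of the original object.
$fromJob = Start-Job { Get-Process -Id $PID } | Receive-Job -Wait -AutoRemoveJob

# Start-ThreadJob runs in a thread of the current process, so the live object is returned.
$fromThread = Start-ThreadJob { Get-Process -Id $PID } | Receive-Job -Wait -AutoRemoveJob

# The first type name is typically prefixed with "Deserialized." for the Start-Job result.
$fromJob.pstypenames[0]
$fromThread.pstypenames[0]

# Only the thread job result is still a real System.Diagnostics.Process with its methods.
$fromJob -is [System.Diagnostics.Process]
$fromThread -is [System.Diagnostics.Process]
```

Code that calls methods on the returned objects will only work with the thread-based result, which is one reason the deserialized copies can be a problem.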
How to limit the number of running threads?
Both cmdlets offer the -ThrottleLimit parameter for this.
- https://learn.microsoft.com/en-us/powershell/module/threadjob/start-threadjob?view=powershell-7.2#-throttlelimit
- https://learn.microsoft.com/en-us/powershell/module/microsoft.powershell.core/foreach-object?view=powershell-7.2#-throttlelimit
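As a rough way to see the throttle in action, the following sketch (an illustration added here, assuming PowerShell 7 or later for ForEach-Object -Parallel) times six one-second sleeps with a limit of three concurrent threads.

```powershell
# Six items that each sleep for one second, with at most three running at once.
$elapsed = Measure-Command {
    1..6 | ForEach-Object -Parallel {
        Start-Sleep -Seconds 1
    } -ThrottleLimit 3
}

# With the limit at 3 the work runs in roughly two batches, so the total should be
# well under the 6 seconds a sequential loop would need (plus some startup overhead).
'{0:N1} seconds' -f $elapsed.TotalSeconds
```

Raising or lowering -ThrottleLimit and re-running the measurement is a quick check that the limit is actually being applied.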
How would the code look?
$dir = (New-Item "000" -ItemType Directory -Force).FullName

# ForEach-Object -Parallel
$zipfiles | ForEach-Object -Parallel {
    $name = [IO.Path]::GetFileNameWithoutExtension($_)
    7z.exe x -o $name .\$name
    Move-Item $_ $using:dir -Force
    7z.exe a $_ .\$name\*.*
} -ThrottleLimit 5

# Start-ThreadJob
$jobs = foreach ($i in $zipfiles) {
    Start-ThreadJob {
        $name = [IO.Path]::GetFileNameWithoutExtension($using:i)
        7z.exe x -o $name .\$name
        Move-Item $using:i $using:dir -Force
        7z.exe a $using:i .\$name\*.*
    } -ThrottleLimit 5
}
$jobs | Receive-Job -Wait -AutoRemoveJob
How to achieve the same when only PowerShell 5.1 is available and no new modules can be installed?
The RunspacePool https://learn.microsoft.com/en-us/dotnet/api/system.management.automation.runspaces.runspacepool?view=powershellsdk-7.0.0 offers this same functionality, either through its .SetMaxRunspaces(Int32) method https://learn.microsoft.com/en-us/dotnet/api/system.management.automation.runspaces.runspacepool.setmaxrunspaces?view=powershellsdk-7.0.0#system-management-automation-runspaces-runspacepool-setmaxrunspaces(system-int32) or by targeting one of the RunspaceFactory.CreateRunspacePool overloads https://learn.microsoft.com/en-us/dotnet/api/system.management.automation.runspaces.runspacefactory.createrunspacepool?view=powershellsdk-7.0.0#overloads that take a maxRunspaces limit as an argument.
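For the first of those two options, a minimal sketch of adjusting the limit with SetMaxRunspaces might look like the following. The starting limits (1 and 2) and the variable names are arbitrary choices for illustration, not taken from the original answer.

```powershell
# Create a pool with a minimum of 1 and a maximum of 2 runspaces, then open it.
$pool = [runspacefactory]::CreateRunspacePool(1, 2)
$pool.Open()

try {
    $before   = $pool.GetMaxRunspaces()
    # SetMaxRunspaces returns a boolean indicating whether the new limit was accepted.
    $accepted = $pool.SetMaxRunspaces(5)
    $after    = $pool.GetMaxRunspaces()

    "Max runspaces before: $before, after: $after (accepted: $accepted)"
}
finally {
    # Always release the pool, even if something above throws.
    $pool.Dispose()
}
```

Passing the limit directly to CreateRunspacePool is usually simpler when the limit is known up front; SetMaxRunspaces is mostly useful when it has to change later.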
How would the code look?
$dir = (New-Item "000" -ItemType Directory -Force).FullName
$limit = 5
$iss = [initialsessionstate]::CreateDefault2()
$pool = [runspacefactory]::CreateRunspacePool(1, $limit, $iss, $Host)
$pool.ThreadOptions = [Management.Automation.Runspaces.PSThreadOptions]::ReuseThread
$pool.Open()

$tasks = foreach ($i in $zipfiles) {
    $ps = [powershell]::Create().AddScript({
        param($path, $dir)

        $name = [IO.Path]::GetFileNameWithoutExtension($path)
        7z.exe x -o $name .\$name
        Move-Item $path $dir -Force
        7z.exe a $path .\$name\*.*
    }).AddParameters(@{ path = $i; dir = $dir })
    $ps.RunspacePool = $pool

    @{ Instance = $ps; AsyncResult = $ps.BeginInvoke() }
}

foreach($task in $tasks) {
    $task['Instance'].EndInvoke($task['AsyncResult'])
    $task['Instance'].Dispose()
}
$pool.Dispose()
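Note that for all the examples it is unclear whether the 7zip code is correct; this answer attempts to demonstrate how async is done in PowerShell, not how to zip files or folders.
Below is a helper function that can simplify the process of parallel invocations. It tries to emulate ForEach-Object -Parallel and is compatible with PowerShell 5.1, though it should not be taken as a robust solution:
NOTE: This Q&A https://stackoverflow.com/questions/74257556/is-there-an-easier-way-to-run-commands-in-parallel-while-keeping-it-efficient-in offers a much better and more robust alternative to the following function.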
using namespace System.Management.Automation
using namespace System.Management.Automation.Runspaces
using namespace System.Collections.Generic

function Invoke-Parallel {
    [CmdletBinding()]
    param(
        [Parameter(Mandatory, ValueFromPipeline, DontShow)]
        [object] $InputObject,

        [Parameter(Mandatory, Position = 0)]
        [scriptblock] $ScriptBlock,

        [Parameter()]
        [int] $ThrottleLimit = 5,

        [Parameter()]
        [hashtable] $ArgumentList
    )

    begin {
        $iss = [initialsessionstate]::CreateDefault2()
        if($PSBoundParameters.ContainsKey('ArgumentList')) {
            foreach($argument in $ArgumentList.GetEnumerator()) {
                $iss.Variables.Add([SessionStateVariableEntry]::new($argument.Key, $argument.Value, ''))
            }
        }
        $pool = [runspacefactory]::CreateRunspacePool(1, $ThrottleLimit, $iss, $Host)
        $tasks = [List[hashtable]]::new()
        $pool.ThreadOptions = [PSThreadOptions]::ReuseThread
        $pool.Open()
    }
    process {
        try {
            $ps = [powershell]::Create().AddScript({
                $args[0].InvokeWithContext($null, [psvariable]::new("_", $args[1]))
            }).AddArgument($ScriptBlock.Ast.GetScriptBlock()).AddArgument($InputObject)

            $ps.RunspacePool = $pool
            $invocationInput = [PSDataCollection[object]]::new(1)
            $invocationInput.Add($InputObject)

            $tasks.Add(@{
                Instance    = $ps
                AsyncResult = $ps.BeginInvoke($invocationInput)
            })
        }
        catch {
            $PSCmdlet.WriteError($_)
        }
    }
    end {
        try {
            foreach($task in $tasks) {
                $task['Instance'].EndInvoke($task['AsyncResult'])
                if($task['Instance'].HadErrors) {
                    $task['Instance'].Streams.Error
                }
                $task['Instance'].Dispose()
            }
        }
        catch {
            $PSCmdlet.WriteError($_)
        }
        finally {
            if($pool) { $pool.Dispose() }
        }
    }
}
An example of how it works:
# Hashtable Key becomes the Variable Name inside the Runspace!
$outsideVariables = @{ Message = 'Hello from {0}' }
0..10 | Invoke-Parallel {
    "[Item $_] - " + $message -f [runspace]::DefaultRunspace.InstanceId
    Start-Sleep 5
} -ArgumentList $outsideVariables -ThrottleLimit 3