Why does the C# MemoryStream reserve so much memory?

2024-01-04

Our software decompresses certain byte data through a GZipStream, which reads its input from a MemoryStream. The data is decompressed in 4 KB blocks and written into another MemoryStream.

We have realized that the memory the process allocates is much higher than the amount of data actually decompressed.

Example: a compressed byte array of 2,425,536 bytes decompresses to 23,050,718 bytes. The memory profiler we use shows that the method MemoryStream.set_Capacity(Int32 value) allocated 67,104,936 bytes. That is a factor of 2.9 between the memory reserved and the memory actually written.

Note: MemoryStream.set_Capacity is called from MemoryStream.EnsureCapacity, which is itself called from MemoryStream.Write in our function.

Why does the MemoryStream reserve so much capacity, even though it only ever appends blocks of 4 KB?

Here is the code snippet that decompresses the data:

private byte[] Decompress(byte[] data)
{
    using (MemoryStream compressedStream = new MemoryStream(data))
    using (GZipStream zipStream = new GZipStream(compressedStream, CompressionMode.Decompress))
    using (MemoryStream resultStream = new MemoryStream())
    {
        byte[] buffer = new byte[4096];
        int iCount = 0;

        while ((iCount = zipStream.Read(buffer, 0, buffer.Length)) > 0)
        {
            resultStream.Write(buffer, 0, iCount);
        }
        return resultStream.ToArray();
    }
}

Note: in case it is relevant, this is the system configuration:

  • Windows XP 32-bit,
  • .NET 3.5
  • compiled with Visual Studio 2008

Because this is the algorithm it uses to expand its capacity. See http://referencesource.microsoft.com/#mscorlib/system/io/memorystream.cs#1416df83d2368912 for the source:

public override void Write(byte[] buffer, int offset, int count) {

    //... error checking removed for this example

    int i = _position + count;
    // Check for overflow
    if (i < 0)
        throw new IOException(Environment.GetResourceString("IO.IO_StreamTooLong"));

    if (i > _length) {
        bool mustZero = _position > _length;
        if (i > _capacity) {
            bool allocatedNewArray = EnsureCapacity(i);
            if (allocatedNewArray)
                mustZero = false;
        }
        if (mustZero)
            Array.Clear(_buffer, _length, i - _length);
        _length = i;
    }

    //... 
}

private bool EnsureCapacity(int value) {
    // Check for overflow
    if (value < 0)
        throw new IOException(Environment.GetResourceString("IO.IO_StreamTooLong"));
    if (value > _capacity) {
        int newCapacity = value;
        if (newCapacity < 256)
            newCapacity = 256;
        if (newCapacity < _capacity * 2)
            newCapacity = _capacity * 2;
        Capacity = newCapacity;
        return true;
    }
    return false;
}

public virtual int Capacity 
{
    //...

    set {
         //...

        // MemoryStream has this invariant: _origin > 0 => !expandable (see ctors)
        if (_expandable && value != _capacity) {
            if (value > 0) {
                byte[] newBuffer = new byte[value];
                if (_length > 0) Buffer.InternalBlockCopy(_buffer, 0, newBuffer, 0, _length);
                _buffer = newBuffer;
            }
            else {
                _buffer = null;
            }
            _capacity = value;
        }
    }
}

So every time the stream hits its capacity limit, it doubles the capacity. It does this because the Buffer.InternalBlockCopy operation is slow for large arrays; if it had to resize on every Write call, performance would drop significantly.
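
To see where the 67,104,936 bytes in the question come from, you can replay that growth rule by hand. Writing 23,050,718 bytes in 4 KB chunks drives the buffer through 4,096, 8,192, ..., 33,554,432 bytes, and since each resize allocates a brand-new array, the profiler counts the sum of all of them, roughly twice the final capacity. Here is a minimal sketch of that arithmetic (mine, not code from the MemoryStream source):

using System;

class CapacityTrace
{
    static void Main()
    {
        const long target = 23050718; // decompressed size from the question
        long written = 0, capacity = 0, totalAllocated = 0;

        while (written < target)
        {
            long chunk = Math.Min(4096, target - written);
            if (written + chunk > capacity)
            {
                // EnsureCapacity picks max(requested, 256, capacity * 2)
                long newCapacity = Math.Max(Math.Max(written + chunk, 256), capacity * 2);
                totalAllocated += newCapacity; // every resize allocates a fresh array
                capacity = newCapacity;
            }
            written += chunk;
        }

        Console.WriteLine(capacity);       // 33,554,432: a 32 MB final buffer for 23 MB of data
        Console.WriteLine(totalAllocated); // 67,104,768: essentially the 67 MB the profiler reports
    }
}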

A couple of things you can do to improve performance: set the initial capacity to at least the size of the compressed array, and then grow it by a factor smaller than 2.0 to reduce the amount of memory you reserve.

const double ResizeFactor = 1.25;

private byte[] Decompress(byte[] data)
{
    using (MemoryStream compressedStream = new MemoryStream(data))
    using (GZipStream zipStream = new GZipStream(compressedStream, CompressionMode.Decompress))
    using (MemoryStream resultStream = new MemoryStream((int)(data.Length * ResizeFactor))) // Initial capacity: the compressed size plus 25%. The constructor takes an int, hence the cast.
    {
        byte[] buffer = new byte[4096];
        int iCount = 0;

        while ((iCount = zipStream.Read(buffer, 0, buffer.Length)) > 0)
        {
            if (resultStream.Capacity < resultStream.Length + iCount)
               resultStream.Capacity = (int)(resultStream.Capacity * ResizeFactor); // Grow by 25% instead of 100%. Capacity is an int, so cast the double product back.

            resultStream.Write(buffer, 0, iCount);
        }
        return resultStream.ToArray();
    }
}
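
With the numbers from the question, this starts the buffer at roughly 3 MB and tops out around 28 MB instead of 32 MB, so the unused reservation at the end stays within about 25% of the output. The trade-off is more resize copies along the way, which is why the factor should not be pushed too close to 1.0.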

If you wanted to, you could do even fancier algorithms, such as resizing based on the current compression ratio:

const double MinResizeFactor = 1.05;

private byte[] Decompress(byte[] data)
{
    using (MemoryStream compressedStream = new MemoryStream(data))
    using (GZipStream zipStream = new GZipStream(compressedStream, CompressionMode.Decompress))
    using (MemoryStream resultStream = new MemoryStream((int)(data.Length * MinResizeFactor))) // Initial capacity: the compressed size plus the minimum resize factor. The constructor takes an int, hence the cast.
    {
        byte[] buffer = new byte[4096];
        int iCount = 0;

        while ((iCount = zipStream.Read(buffer, 0, buffer.Length)) > 0)
        {
            if (resultStream.Capacity < resultStream.Length + iCount)
            {
               double sizeRatio = ((double)resultStream.Position + iCount) / (compressedStream.Position + 1); // The +1 prevents divide-by-zero errors; it may not be necessary in practice.

               // Grow to whichever is larger: the current capacity times the minimum
               // resize factor, or the compressed stream length times the observed
               // compression ratio plus the minimum resize factor.
               resultStream.Capacity = (int)Math.Max(resultStream.Capacity * MinResizeFactor,
                                                     (sizeRatio + (MinResizeFactor - 1)) * compressedStream.Length);
            }

            resultStream.Write(buffer, 0, iCount);
        }
        return resultStream.ToArray();
    }
}
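
One more option, not part of the original answer, if you can rely on the input being a single, complete gzip stream: the gzip format (RFC 1952) stores the uncompressed length modulo 2^32 in its last four bytes, the little-endian ISIZE field. Since the whole compressed array is already in memory, you could read that field and pre-size the result stream exactly, avoiding resizes altogether. A sketch, assuming a single-member stream whose output fits in an int, with the same usings as the snippets above:

private byte[] DecompressPresized(byte[] data)
{
    // ISIZE: the last 4 bytes of a gzip stream, little-endian, uncompressed length mod 2^32.
    // Assumes a single-member stream and an output smaller than 2 GB; on a big-endian
    // platform the bytes would need swapping.
    int uncompressedSize = BitConverter.ToInt32(data, data.Length - 4);

    using (MemoryStream compressedStream = new MemoryStream(data))
    using (GZipStream zipStream = new GZipStream(compressedStream, CompressionMode.Decompress))
    using (MemoryStream resultStream = new MemoryStream(uncompressedSize))
    {
        byte[] buffer = new byte[4096];
        int iCount;

        while ((iCount = zipStream.Read(buffer, 0, buffer.Length)) > 0)
        {
            resultStream.Write(buffer, 0, iCount);
        }
        return resultStream.ToArray();
    }
}

The usual caveats apply: multi-member streams and outputs of 4 GB or more make ISIZE unreliable, so a sanity check, or a fallback to one of the resizing loops above, would be prudent.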