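As I said in my comment, packing binary data into a string format (like JSON) is wasteful: if you use base64 you increase the data transfer size by 33%, and it also makes the JSON harder to decode, since the decoder has to stream through the whole structure just to extract the indices.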
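To put a number on that overhead, here is a quick check. The 3000 random bytes are just a stand-in for real file contents (an assumption for illustration):

```python
import base64
import os

raw = os.urandom(3000)           # stand-in for binary file contents
encoded = base64.b64encode(raw)  # what would have to be embedded in the JSON string

# base64 turns every 3 input bytes into 4 output characters
print(len(raw), len(encoded))    # 3000 4000
overhead = len(encoded) / len(raw) - 1
print("overhead: {:.0%}".format(overhead))  # overhead: 33%
```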
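It's much better to send them separately: the JSON as JSON, and then the file contents straight as binary. Of course, you'll need a way to tell the two apart, and the easiest is to prefix the JSON data with its length when sending it, so that the server knows how many bytes to read to obtain the JSON and can then read the rest as the file contents. That makes for a very simple protocol with packets formed as: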
[JSON LENGTH][JSON][FILE CONTENTS]
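Assuming the JSON will never be larger than 4GB (and if it is, you'll have much bigger problems, as parsing it would be a nightmare), it's enough to make JSON LENGTH a fixed 4 bytes (32 bits) holding an unsigned integer (you can even go for 16 bits if you don't expect the JSON to go over 64KB), so the whole strategy works on the client side as follows: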
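- Create the payload
- Encode it to JSON, and then encode that to bytes using UTF-8
- Get the length of those bytes and send it as the first 4 bytes of the stream
- Send the JSON bytes
- Read and send the file contents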
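As a rough sketch of the first four steps, the helper below builds the length prefix and the JSON bytes in memory. The name `frame_json` and its `length_format` parameter are made up for this illustration; passing `">H"` gives the 16-bit variant mentioned above.

```python
import json
import struct

def frame_json(payload, length_format=">I"):
    """Return the length prefix followed by the UTF-8 encoded JSON.

    `length_format` is a struct format: ">I" is a big-endian 32-bit unsigned
    int, ">H" a 16-bit one (enough if the JSON never exceeds 64KB).
    """
    json_data = json.dumps(payload).encode("utf-8")
    # the largest length that fits into the chosen prefix
    max_length = 2 ** (8 * struct.calcsize(length_format)) - 1
    if len(json_data) > max_length:
        raise ValueError("JSON payload too large for the length prefix")
    return struct.pack(length_format, len(json_data)) + json_data

framed = frame_json({"id": 12, "filename": "cat.png", "message": "So cute!"})
print(framed[:4])  # the 4-byte length prefix
print(framed[4:])  # the JSON bytes that follow it
```

The file contents are then simply sent after these bytes, with no extra framing.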
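And on the server side you go through the matching process:
- Read the first 4 bytes of the received data to get the JSON payload length
- Read exactly that many bytes next
- Decode them to a string using UTF-8, and then decode the JSON to get the payload
- Read the rest of the streamed data and store it into a file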
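The parsing side can be tried out without any sockets by feeding a hand-built packet through an in-memory stream. This is only a sketch: `read_message` is a hypothetical helper, and it relies on `io.BytesIO.read(n)` returning everything that is available, which a real socket's `recv()` does not guarantee.

```python
import io
import json
import struct

def read_message(stream):
    """Split a framed stream into (payload dict, file bytes)."""
    prefix = stream.read(4)
    if len(prefix) < 4:
        raise ValueError("stream ended before the length prefix was complete")
    (json_length,) = struct.unpack(">I", prefix)
    json_data = stream.read(json_length)
    if len(json_data) < json_length:
        raise ValueError("stream ended before the JSON was complete")
    payload = json.loads(json_data.decode("utf-8"))
    file_contents = stream.read()  # whatever is left is the file
    return payload, file_contents

# build a sample packet by hand: [JSON LENGTH][JSON][FILE CONTENTS]
json_bytes = json.dumps({"filename": "cat.png"}).encode("utf-8")
packet = struct.pack(">I", len(json_bytes)) + json_bytes + b"\x89PNG fake image bytes"

payload, file_contents = read_message(io.BytesIO(packet))
print(payload["filename"], len(file_contents))
```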
Or in code, the client:
import json
import os
import socket
import struct
BUFFER_SIZE = 4096  # a uniform buffer size to use for our transfers
# pick up an absolute path from the script folder, not necessary though
file_path = os.path.abspath(os.path.join(os.path.dirname(__file__), "downloads", "cat.png"))
# let's first prepare the payload to send over
payload = {"id": 12, "filename": os.path.basename(file_path), "message": "So cute!"}
# now JSON encode it and then turn it into a bytes stream by encoding it as UTF-8
json_data = json.dumps(payload).encode("utf-8")
# then connect to the server and send everything
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:  # create a socket
    print("Connecting...")
    s.connect(("127.0.0.1", 1234))  # connect to the server
    print("Sending `{filename}` with a message: {message}.".format(**payload))
    # first send the JSON payload length
    s.sendall(struct.pack(">I", len(json_data)))  # pack as BE 32-bit unsigned int
    # now send the JSON payload itself
    s.sendall(json_data)  # let Python deal with the buffer on its own for the JSON...
    # finally, open the file and 'stream' it to the socket
    with open(file_path, "rb") as f:
        chunk = f.read(BUFFER_SIZE)
        while chunk:
            s.sendall(chunk)  # sendall() keeps going until the whole chunk is out; send() may stop early
            chunk = f.read(BUFFER_SIZE)
        # alternatively, if you're using Python 3.5+ you can just use socket.sendfile() instead
    print("Sent.")
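The `socket.sendfile()` alternative mentioned in the last comment could look roughly like this. The function name `send_with_sendfile` is made up for the sketch; the socket has to be a connected `SOCK_STREAM` socket and the file has to be opened in binary mode.

```python
import json
import socket
import struct

def send_with_sendfile(sock, payload, file_path):
    """Send [JSON LENGTH][JSON][FILE CONTENTS] over a connected stream socket."""
    json_data = json.dumps(payload).encode("utf-8")
    sock.sendall(struct.pack(">I", len(json_data)))  # 4-byte big-endian length
    sock.sendall(json_data)
    with open(file_path, "rb") as f:
        # replaces the manual read/send loop (Python 3.5+);
        # returns the number of file bytes sent
        return sock.sendfile(f)
```

With the client above, the whole `with open(...)` loop would shrink to a single `s.sendfile(f)` call.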
And the server:
import json
import os
import socket
import struct
BUFFER_SIZE = 4096  # a uniform buffer size to use for our transfers
target_path = os.path.abspath(os.path.join(os.path.dirname(__file__), "fileCache"))
os.makedirs(target_path, exist_ok=True)  # make sure the target folder exists
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    s.bind(("127.0.0.1", 1234))  # bind to the 1234 port on localhost
    s.listen(0)  # minimal backlog, we handle one connection at a time so we don't have to deal with data separation
    while True:
        print("Waiting for a connection...")
        connection, address = s.accept()  # wait for and accept the incoming connection
        with connection:  # close the connection once we're done with it
            print("Connection from `{}` accepted.".format(address))
            # read the starting 32 bits and unpack them into an int to get the JSON length
            header = connection.recv(4)
            if len(header) < 4:  # recv() may return less than asked for
                continue  # just skip this client for now, deal with it properly in production
            json_length = struct.unpack(">I", header)[0]
            # now read the JSON data of the given size and JSON decode it
            json_data = b""  # initiate an empty bytes structure
            while len(json_data) < json_length:
                chunk = connection.recv(min(BUFFER_SIZE, json_length - len(json_data)))
                if not chunk:  # no data, possibly broken connection/bad protocol
                    break  # just exit for now, you should deal with this case in production
                json_data += chunk
            payload = json.loads(json_data.decode("utf-8"))  # JSON decode the payload
            # now read the rest and store it into a file at the target path
            # basename() stops a client-supplied name from pointing outside of target_path
            file_path = os.path.join(target_path, os.path.basename(payload["filename"]))
            with open(file_path, "wb") as f:  # open the target file for writing...
                chunk = connection.recv(BUFFER_SIZE)  # and stream the socket data to it...
                while chunk:
                    f.write(chunk)
                    chunk = connection.recv(BUFFER_SIZE)
            # finally, let's print out that we received the data
            print("Received `{filename}` with a message: {message}".format(**payload))
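NOTE: Keep in mind that this is Python 3.x code. For Python 2.x you'll have to handle the context management yourself instead of relying on the with ... block to open/close your sockets.
That's all there is to it. Of course, in a real setting you'll need to deal with disconnections, multiple clients and so on, but this is the underlying process.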
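As one possible starting point for the disconnection handling, a small helper can keep reading until it has exactly the number of bytes it was asked for, and raise if the peer goes away in the middle. `recv_exact` is a name invented for this sketch, not something the socket module provides.

```python
import socket

def recv_exact(sock, size):
    """Read exactly `size` bytes from `sock` or raise ConnectionError."""
    data = bytearray()
    while len(data) < size:
        chunk = sock.recv(size - len(data))
        if not chunk:  # an empty read means the peer closed the connection
            raise ConnectionError(
                "connection closed after {} of {} bytes".format(len(data), size))
        data += chunk
    return bytes(data)
```

On the server, both the 4-byte length and the JSON body could then be read through it, for example `struct.unpack(">I", recv_exact(connection, 4))[0]`, with the `ConnectionError` caught around the handling of each client.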