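I'm testing the machine learning waters and used the TensorFlow Inception model to retrain the network to classify my desired objects.
Initially, my predictions ran on locally stored images, and I realized that it took anywhere between 2 and 5 seconds to unpersist the graph from a file, and around the same time again to run the actual prediction.
I then adapted my code to take the camera feed from OpenCV, but with the timings noted above, video lag is inevitable.
A time hit was expected during the initial graph load, which is why initialSetup() is run beforehand, but 2-5 seconds is just absurd.
I feel that for my current application, real-time classification, this isn't the best way of loading the graph. Is there another way of doing this? I know that for mobile builds TensorFlow recommends slimming down the graph. Would slimming it down be the way to go here? In case it matters, my graph is currently 87.4 MB.
Along with this, is there a way of speeding up the prediction process?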
import os
import cv2
import timeit
import numpy as np
import tensorflow as tf

camera = cv2.VideoCapture(0)

# Loads label file, strips off carriage return
label_lines = [line.rstrip() for line
               in tf.gfile.GFile('retrained_labels.txt')]


def grabVideoFeed():
    grabbed, frame = camera.read()
    return frame if grabbed else None


def initialSetup():
    os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
    start_time = timeit.default_timer()

    # This takes 2-5 seconds to run
    # Unpersists graph from file
    with tf.gfile.FastGFile('retrained_graph.pb', 'rb') as f:
        graph_def = tf.GraphDef()
        graph_def.ParseFromString(f.read())
        tf.import_graph_def(graph_def, name='')

    print('Took {} seconds to unpersist the graph'.format(timeit.default_timer() - start_time))


def classify(image_data):
    print('********* Session Start *********')

    # A new session is created on every call
    with tf.Session() as sess:
        start_time = timeit.default_timer()

        # Look up the output tensor of the retrained graph
        softmax_tensor = sess.graph.get_tensor_by_name('final_result:0')
        print('Tensor {}'.format(softmax_tensor))
        print('Took {} seconds to feed data to graph'.format(timeit.default_timer() - start_time))

        start_time = timeit.default_timer()

        # This takes 2-5 seconds as well
        # Feed the image_data as input to the graph and get first prediction
        predictions = sess.run(softmax_tensor, {'Mul:0': image_data})
        print('Took {} seconds to perform prediction'.format(timeit.default_timer() - start_time))

        start_time = timeit.default_timer()

        # Sort to show labels of first prediction in order of confidence
        top_k = predictions[0].argsort()[-len(predictions[0]):][::-1]
        print('Took {} seconds to sort the predictions'.format(timeit.default_timer() - start_time))

        for node_id in top_k:
            human_string = label_lines[node_id]
            score = predictions[0][node_id]
            print('%s (score = %.5f)' % (human_string, score))

    print('********* Session Ended *********')


initialSetup()

while True:
    frame = grabVideoFeed()

    if frame is None:
        raise SystemError('Issue grabbing the frame')

    frame = cv2.resize(frame, (299, 299), interpolation=cv2.INTER_CUBIC)

    # Adhere to the TensorFlow graph input structure
    numpy_frame = np.asarray(frame)
    numpy_frame = cv2.normalize(numpy_frame.astype('float'), None, -0.5, .5, cv2.NORM_MINMAX)
    numpy_final = np.expand_dims(numpy_frame, axis=0)

    classify(numpy_final)

    cv2.imshow('Main', frame)

    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

camera.release()
cv2.destroyAllWindows()
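The script above repeats the same start_time / print pair around every step it measures. As a minimal sketch, that pattern could be factored into a small context manager. The names `timed` and `log` are hypothetical helpers introduced here for illustration; they are not part of TensorFlow or OpenCV.

```python
import timeit
from contextlib import contextmanager


@contextmanager
def timed(label, log=None):
    """Time the enclosed block and print how long it took.

    `timed` and `log` are hypothetical names. If `log` is a list,
    a (label, seconds) pair is appended to it as well.
    """
    start = timeit.default_timer()
    try:
        yield
    finally:
        elapsed = timeit.default_timer() - start
        if log is not None:
            log.append((label, elapsed))
        print('Took {:.4f} seconds to {}'.format(elapsed, label))


if __name__ == '__main__':
    # Stand-in workload; in the script this would wrap sess.run(...) etc.
    with timed('sum a range'):
        sum(range(100000))
```

Wrapping a step as `with timed('perform prediction'):` keeps the measurements comparable between runs without scattering timer variables through the loop.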
EDIT 1
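After debugging the code, I realized that session creation is both a resource-consuming and a time-consuming operation.
In the prior code, a new session was created for every OpenCV frame, on top of running the prediction.
Wrapping the OpenCV operations inside a single session gives a massive time improvement, but there is still a large overhead on the initial run: the first prediction takes 2-3 seconds. After that, each prediction takes around 0.5 seconds, which still leaves the camera feed laggy.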
import os
import cv2
import timeit
import numpy as np
import tensorflow as tf

camera = cv2.VideoCapture(0)

# Loads label file, strips off carriage return
label_lines = [line.rstrip() for line
               in tf.gfile.GFile('retrained_labels.txt')]


def grabVideoFeed():
    grabbed, frame = camera.read()
    return frame if grabbed else None


def initialSetup():
    os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
    start_time = timeit.default_timer()

    # This takes 2-5 seconds to run
    # Unpersists graph from file
    with tf.gfile.FastGFile('retrained_graph.pb', 'rb') as f:
        graph_def = tf.GraphDef()
        graph_def.ParseFromString(f.read())
        tf.import_graph_def(graph_def, name='')

    print('Took {} seconds to unpersist the graph'.format(timeit.default_timer() - start_time))


initialSetup()

# One session is shared by every frame
with tf.Session() as sess:
    start_time = timeit.default_timer()

    # Look up the output tensor of the retrained graph
    softmax_tensor = sess.graph.get_tensor_by_name('final_result:0')
    print('Took {} seconds to feed data to graph'.format(timeit.default_timer() - start_time))

    while True:
        frame = grabVideoFeed()

        if frame is None:
            raise SystemError('Issue grabbing the frame')

        frame = cv2.resize(frame, (299, 299), interpolation=cv2.INTER_CUBIC)
        cv2.imshow('Main', frame)

        # Adhere to the TensorFlow graph input structure
        numpy_frame = np.asarray(frame)
        numpy_frame = cv2.normalize(numpy_frame.astype('float'), None, -0.5, .5, cv2.NORM_MINMAX)
        numpy_final = np.expand_dims(numpy_frame, axis=0)

        start_time = timeit.default_timer()

        # 2-3 seconds on the first run, around 0.5 seconds afterwards
        predictions = sess.run(softmax_tensor, {'Mul:0': numpy_final})
        print('Took {} seconds to perform prediction'.format(timeit.default_timer() - start_time))

        start_time = timeit.default_timer()

        # Sort to show labels of first prediction in order of confidence
        top_k = predictions[0].argsort()[-len(predictions[0]):][::-1]
        print('Took {} seconds to sort the predictions'.format(timeit.default_timer() - start_time))

        for node_id in top_k:
            human_string = label_lines[node_id]
            score = predictions[0][node_id]
            print('%s (score = %.5f)' % (human_string, score))

        print('********* Session Ended *********')

        if cv2.waitKey(1) & 0xFF == ord('q'):
            sess.close()
            break

camera.release()
cv2.destroyAllWindows()
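Even with a single session, each prediction blocks the loop for about 0.5 seconds, so the preview window can only refresh about twice a second. One possible way to decouple the two, sketched here under assumptions and not tested against this model, is to classify on a background thread and hand it only the most recent frame, dropping frames that arrive while a prediction is still running. `LatestFrameSlot`, `classification_worker`, and `classify` are hypothetical names; `classify` stands in for the sess.run(...) call and the preprocessing around it.

```python
import threading


class LatestFrameSlot(object):
    """Holds only the most recent frame; unread older frames are dropped."""

    def __init__(self):
        self._cond = threading.Condition()
        self._frame = None
        self._closed = False

    def put(self, frame):
        # Overwrite whatever is waiting so the consumer never sees stale frames.
        with self._cond:
            self._frame = frame
            self._cond.notify()

    def close(self):
        # Wake the consumer so it can exit once the slot is drained.
        with self._cond:
            self._closed = True
            self._cond.notify_all()

    def get(self):
        # Block until a frame arrives; return None once closed and empty.
        with self._cond:
            while self._frame is None and not self._closed:
                self._cond.wait()
            frame, self._frame = self._frame, None
            return frame


def classification_worker(slot, classify, results):
    # `classify` is a hypothetical callable wrapping the prediction step.
    while True:
        frame = slot.get()
        if frame is None:
            break
        results.append(classify(frame))
```

In the capture loop, the idea would be to call `slot.put(frame)` right after `cv2.imshow(...)` and start `classification_worker` once in a `threading.Thread`, calling `slot.close()` when the user quits. The preview then runs at camera speed while predictions arrive at whatever rate the model can sustain.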
EDIT 2
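After some fiddling, I stumbled on graph quantization and graph transformation, and these were the results:
Original graph: 87.4 MB
Quantized graph: 87.5 MB
Transformed graph: 87.1 MB
Eight-bit calculation: 22 MB, but ran into this upon usage.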
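As a rough sanity check on these sizes: 22 MB is about a quarter of 87.4 MB, which is what one would expect if the file is dominated by 32-bit float weights that the eight-bit variant stores as one byte each. That is an assumption about this graph, not something measured; the function name below is hypothetical.

```python
BYTES_PER_FLOAT32 = 4
BYTES_PER_EIGHT_BIT = 1


def estimated_eight_bit_size_mb(original_mb):
    # Assumes nearly all of the file is float32 weights (an assumption,
    # not a measurement) and that each weight shrinks from 4 bytes to 1.
    return original_mb * BYTES_PER_EIGHT_BIT / float(BYTES_PER_FLOAT32)


if __name__ == '__main__':
    estimate = estimated_eight_bit_size_mb(87.4)
    print('Estimated eight-bit size: %.2f MB' % estimate)
```

By the same reasoning, the nearly unchanged sizes of the other two variants suggest they still store the weights as 32-bit floats.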