Recognition of MNIST handwritten digits

2023-11-12

Download MNIST and Load the Data

MNIST can be downloaded from this website: http://yann.lecun.com/exdb/mnist/.

After downloading the data set, decompress each file. The archives are plain gzip files rather than tar archives, so use: gunzip '****.gz'.
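The decompression step can also be sketched in Python with the standard gzip module. The helper name gunzip_file is made up for this example, and it assumes each archive is named like its target file plus a .gz suffix.

```python
import gzip
import shutil


def gunzip_file(gz_path):
    """Decompress gz_path (ending in .gz) next to itself and return the new path."""
    if not gz_path.endswith('.gz'):
        raise ValueError('expected a .gz file: ' + gz_path)
    out_path = gz_path[:-3]  # strip the '.gz' suffix
    # Stream the decompressed bytes so the whole file never sits in memory.
    with gzip.open(gz_path, 'rb') as src, open(out_path, 'wb') as dst:
        shutil.copyfileobj(src, dst)
    return out_path
```

Calling gunzip_file once per downloaded archive leaves the decompressed file in the same directory.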

This gives two datasets (a training set and a test set) in four files, which we use in the following steps.

t10k-images.idx3-ubyte, t10k-labels.idx1-ubyte

train-images.idx3-ubyte, train-labels.idx1-ubyte

There are two ways provided here to extract the data.
One is implemented in MATLAB.
The other is implemented in Python.

function images = loadMNISTImages(filename)
%loadMNISTImages returns a [number of pixels]x[number of MNIST images]
%matrix containing the raw MNIST images, rescaled to [0,1]

fp = fopen(filename, 'rb');
assert(fp ~= -1, ['Could not open ', filename, '']);

magic = fread(fp, 1, 'int32', 0, 'ieee-be');
assert(magic == 2051, ['Bad magic number in ', filename, '']);

numImages = fread(fp, 1, 'int32', 0, 'ieee-be');
numRows = fread(fp, 1, 'int32', 0, 'ieee-be');
numCols = fread(fp, 1, 'int32', 0, 'ieee-be');

images = fread(fp, inf, 'unsigned char');
images = reshape(images, numCols, numRows, numImages);
images = permute(images,[2 1 3]);

fclose(fp);

% Reshape to #pixels x #examples
images = reshape(images, size(images, 1) * size(images, 2), size(images, 3));
% Convert to double and rescale to [0,1]
images = double(images) / 255;

end
function labels = loadMNISTLabels(filename)
%loadMNISTLabels returns a [number of MNIST images]x1 matrix containing
%the labels for the MNIST images

fp = fopen(filename, 'rb');
assert(fp ~= -1, ['Could not open ', filename, '']);

magic = fread(fp, 1, 'int32', 0, 'ieee-be');
assert(magic == 2049, ['Bad magic number in ', filename, '']);

numLabels = fread(fp, 1, 'int32', 0, 'ieee-be');

labels = fread(fp, inf, 'unsigned char');

assert(size(labels,1) == numLabels, 'Mismatch in label count');

fclose(fp);

end

The MATLAB code above is provided by Prof. Andrew Ng.
The following code extracts the data in Python.
First, we import the libraries we need.

# import libs we need
import numpy as np
import struct
import matplotlib.pyplot as plt

LeCun's website describes in detail how the pictures are stored in the files.

[Figure: lenet-5dataintroduc, the description of the data file format]
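To make the layout concrete, here is a small sketch that builds a tiny fake image file in memory and parses it. The magic number 2051 and the big-endian 32-bit header fields are the ones checked by the MATLAB loader above. The helper name parse_idx_images and the 2x2 toy images are made up for illustration.

```python
import struct
import numpy as np


def parse_idx_images(buf):
    """Parse an image buffer into a (count, rows, cols) uint8 array."""
    # Header: four big-endian unsigned 32-bit integers.
    magic, count, rows, cols = struct.unpack_from('>IIII', buf, 0)
    if magic != 2051:
        raise ValueError('bad magic number: %d' % magic)
    offset = struct.calcsize('>IIII')  # the header is 16 bytes long
    # The pixels follow as one unsigned byte each, image after image.
    pixels = np.frombuffer(buf, dtype=np.uint8,
                           count=count * rows * cols, offset=offset)
    return pixels.reshape(count, rows, cols)


# Two fake 2x2 "images" instead of real 28x28 digits.
header = struct.pack('>IIII', 2051, 2, 2, 2)
body = bytes(bytearray([0, 255, 128, 64, 1, 2, 3, 4]))
images = parse_idx_images(header + body)
print(images.shape)  # (2, 2, 2)
```

Reading all pixels with a single np.frombuffer call avoids a Python loop over images; this is an alternative to the loop used in this post, not a replacement the original author proposed.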

# load data

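def loadimg(imgfilename):
    # read the whole file into memory as bytes
    with open(imgfilename, 'rb') as imgfile:
        datastr = imgfile.read()

    # header: magic number, image count, rows, columns
    # (four big-endian unsigned 32-bit integers)
    index = 0
    mgc_num, img_num, row_num, col_num = struct.unpack_from('>IIII', datastr, index)
    index += struct.calcsize('>IIII')

    # each image is stored as 28*28 = 784 unsigned bytes
    image_array = np.zeros((img_num, row_num, col_num))
    for img_idx in range(img_num):
        img = struct.unpack_from('>784B', datastr, index)
        index += struct.calcsize('>784B')
        image_array[img_idx,:,:] = np.reshape(np.array(img), (28,28))
    # rescale pixel values to [0, 1]
    image_array = image_array/255.0
    # the first 6 characters of the file name become the output prefix,
    # e.g. 'train-' + 'image-py' gives train-image-py.npy
    np.save(imgfilename[:6]+'image-py', image_array)
    return None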

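def loadlabel(labelfilename):
    # read the whole file into memory as bytes
    with open(labelfilename, 'rb') as labelfile:
        datastr = labelfile.read()

    # header: magic number and label count (big-endian unsigned 32-bit integers)
    index = 0
    mgc_num, label_num = struct.unpack_from('>II', datastr, index)
    index += struct.calcsize('>II')

    # the labels follow as one unsigned byte each
    label = struct.unpack_from('{}B'.format(label_num), datastr, index)
    index += struct.calcsize('{}B'.format(label_num))

    label_array = np.array(label)

    # the first 5 characters of the file name become the output prefix,
    # e.g. 'train' + 'label-py' gives trainlabel-py.npy
    np.save(labelfilename[:5]+'label-py', label_array)
    return None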

The two functions above read the raw files and save the data in NumPy's own format (.npy).

loadimg('train-images.idx3-ubyte')
loadimg('t10k-images.idx3-ubyte')
loadlabel('train-labels.idx1-ubyte')
loadlabel('t10k-labels.idx1-ubyte')

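After that, the data is easy to load with np.load. The file names below are the ones produced by the file-name slicing in loadimg and loadlabel, which is why they look irregular.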

train_image = np.load('train-image-py.npy')
train_label = np.load('trainlabel-py.npy')
test_image = np.load('t10k-iimage-py.npy')
test_label = np.load('t10k-label-py.npy')
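The .npy suffix in these file names is appended by np.save when the given name has no such extension. A minimal round trip, using a made-up file name in a temporary directory, shows this:

```python
import os
import tempfile
import numpy as np

arr = np.arange(6, dtype=np.float64).reshape(2, 3) / 255.0

# The name is given without an extension; np.save appends '.npy' itself.
base = os.path.join(tempfile.mkdtemp(), 'demo-image-py')
np.save(base, arr)

restored = np.load(base + '.npy')
print(np.array_equal(arr, restored))  # True
```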

What are the dimensions of our arrays?

print(train_image.shape)
print(train_label.shape)
print(test_image.shape)
print(test_label.shape)
(60000, 28, 28)
(60000,)
(10000, 28, 28)
(10000,)

To check that the data was saved correctly, we load one of the pictures and display it with matplotlib. The result should be a picture of a digit.

# check data
%matplotlib inline
im = train_image[9,:,:]
im = 255*im
plt.imshow(im, cmap='gray')
plt.show()

[Figure: 4.png, the displayed digit]

print(train_label[9])
4
im = test_image[17,:,:]
im = 255*im
plt.imshow(im, cmap='gray')
plt.show()
print(test_label[17])

[Figure: 7.png, the displayed digit]

7

OK. Have we finished the data-preparation stage?

import tensorflow as tf
from six.moves import reduce
image_size = 28
num_labels = 10
num_channels = 1 # gray scale

reformat = lambda data,labels: (data.reshape((-1, image_size, image_size, 1)).astype(np.float32),(np.arange(num_labels) == labels[:,None]).astype(np.float32))

Not yet.
To train the network, we have to change the dimensions of our data: each picture gets an extra channel axis. We also need a second variable that stores the label of each picture in One-Hot encoding. The reformat function defined above does both.

If you are unsure about convolutional networks or One-Hot Encoding, you can find detailed explanations in my earlier posts: Convolution Networks, One-Hot Encoding
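The one-hot step inside reformat relies on NumPy broadcasting. Here is a small sketch on three made-up labels, together with the reshape that adds the channel axis:

```python
import numpy as np

num_labels = 10
labels = np.array([4, 0, 9])  # three made-up digit labels

# labels[:, None] has shape (3, 1). Comparing it with arange(10), shape (10,),
# broadcasts to a (3, 10) boolean matrix with exactly one True per row.
one_hot = (np.arange(num_labels) == labels[:, None]).astype(np.float32)
print(one_hot.shape)  # (3, 10)
print(one_hot[0])     # the 1 sits at index 4

# Images gain a trailing channel axis: (N, 28, 28) becomes (N, 28, 28, 1).
images = np.zeros((3, 28, 28))
print(images.reshape((-1, 28, 28, 1)).shape)  # (3, 28, 28, 1)
```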

train_dataset, train_labels = reformat(train_image, train_label)
test_dataset, test_labels = reformat(test_image, test_label)
print('train_dataset size: ', train_dataset.shape)
print('train_labels size: ', train_labels.shape)
print('test_dataset size: ', test_dataset.shape)
print('test_labels size: ', test_labels.shape)
('train_dataset size: ', (60000, 28, 28, 1))
('train_labels size: ', (60000, 10))
('test_dataset size: ', (10000, 28, 28, 1))
('test_labels size: ', (10000, 10))

At this point the preparation is finished.

We now have train_dataset, train_labels, test_dataset and test_labels in the right format.

accuracy = lambda pred, labels: (100.0 * np.sum(np.argmax(pred,1) == np.argmax(labels,1))/pred.shape[0] )

The function accuracy computes the percentage of examples for which the predicted class matches the label.
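A small worked example, with made-up predictions for a 3-class problem, shows what the function returns:

```python
import numpy as np

accuracy = lambda pred, labels: (
    100.0 * np.sum(np.argmax(pred, 1) == np.argmax(labels, 1)) / pred.shape[0])

# Four made-up softmax outputs; the predicted classes are 1, 0, 2, 0.
pred = np.array([[0.1, 0.7, 0.2],
                 [0.8, 0.1, 0.1],
                 [0.2, 0.3, 0.5],
                 [0.6, 0.3, 0.1]])
# One-hot labels; the true classes are 1, 0, 1, 0.
labels = np.array([[0, 1, 0],
                   [1, 0, 0],
                   [0, 1, 0],
                   [1, 0, 0]], dtype=np.float32)

print(accuracy(pred, labels))  # 75.0, since 3 of 4 rows match
```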

Training and Testing

Our convolutional network architecture is based on the following picture.
[Figure: arch]

We define the architecture in the function model.

Because our input pictures are 28×28, we skip the first convolution shown in the picture. The code uses a 1×1 convolution with 6 filters in its place, which keeps the 28×28 size.
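The spatial sizes through the network can be checked by hand. The sketch below applies TensorFlow's SAME and VALID size rules to the layer sequence used by model in this post. The helper name out_size is made up for this example.

```python
import math


def out_size(size, kernel, stride, padding):
    """Spatial output size of a convolution or pooling layer."""
    if padding == 'SAME':
        # SAME pads the input so only the stride shrinks it.
        return int(math.ceil(size / float(stride)))
    if padding == 'VALID':
        # VALID uses no padding, so the kernel must fit inside the input.
        return int(math.ceil((size - kernel + 1) / float(stride)))
    raise ValueError('unknown padding: ' + padding)


size = 28
layers = [
    ('conv 1x1', 1, 1, 'SAME'),
    ('avg pool 2x2', 2, 2, 'SAME'),
    ('conv 5x5', 5, 1, 'VALID'),
    ('avg pool 2x2', 2, 2, 'SAME'),
    ('conv 5x5', 5, 1, 'VALID'),
]
sizes = [size]
for name, kernel, stride, padding in layers:
    size = out_size(size, kernel, stride, padding)
    sizes.append(size)
    print('%-12s -> %dx%d' % (name, size, size))
```

The sizes run 28, 28, 14, 10, 5, 1, so the last convolution leaves a 1×1 map with 120 channels, which is why the first fully connected weight matrix has 120 rows.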

For the gradient descent step we use the Adam optimizer:

tf.train.AdamOptimizer(learning_rate=0.001).minimize(loss)
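For readers who want to see what an Adam update does, here is a NumPy sketch of a single step in its textbook form. It is an illustration only: TensorFlow's internal formulation differs slightly in how it applies the bias correction and epsilon, and the helper name adam_step and the toy objective are made up for this example.

```python
import numpy as np


def adam_step(param, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One textbook Adam update; t is the 1-based step count."""
    m = beta1 * m + (1.0 - beta1) * grad        # running mean of gradients
    v = beta2 * v + (1.0 - beta2) * grad ** 2   # running mean of squared gradients
    m_hat = m / (1.0 - beta1 ** t)              # bias correction for the zero start
    v_hat = v / (1.0 - beta2 ** t)
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v


# Toy problem: minimize f(x) = x^2, whose gradient is 2x.
x = np.array([5.0])
m = np.zeros_like(x)
v = np.zeros_like(x)
x, m, v = adam_step(x, 2.0 * x, m, v, t=1)
print(x)  # very close to 4.999
```

On the first step the corrected ratio m_hat / sqrt(v_hat) is the sign of the gradient, so the parameter moves by about the learning rate regardless of how large the gradient is.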

batch_size = 128

num_steps = 4501

graph = tf.Graph()  
with graph.as_default():  
    # Input data.  
    tf_train_dataset = tf.placeholder(tf.float32, shape=(batch_size, image_size, image_size, num_channels)) # num_channels=1 grayscale   
    tf_train_labels = tf.placeholder(tf.float32, shape=(batch_size, num_labels))  
    tf_test_dataset = tf.constant(test_dataset)  

    # Variables.

    filter1 = tf.Variable(tf.truncated_normal([1,1,1,6], stddev=0.1))  
    biases1 = tf.Variable(tf.zeros([6]))  

    filter2 = tf.Variable(tf.truncated_normal( [5,5,6,16], stddev=0.1))  
    biases2 = tf.Variable(tf.constant(1.0, shape=[16]))  

    filter3 = tf.Variable(tf.truncated_normal([5,5, 16, 120], stddev=0.1))  
    biases3 = tf.Variable(tf.constant(1.0, shape=[120]))  

    weights1 = tf.Variable(tf.truncated_normal([120, 84], stddev=0.1))  
    w_biases1 = tf.Variable(tf.zeros([84]))  
    weights2 = tf.Variable(tf.truncated_normal([84, 10], stddev=0.1)) 
    w_biases2 = tf.Variable(tf.zeros([10]))  

    def model(data):
        # data shape (batch, 28, 28, 1)
        # filter1 shape (1, 1, 1, 6)
        conv = tf.nn.conv2d(data, filter1, [1,1,1,1], padding='SAME')
        conv = tf.nn.tanh(conv + biases1)
        # data reshaped to (batch, 28, 28, 1)
        # filter1 reshaped to (1*1*1, 6)
        # conv shape (batch, 28, 28, 6)
        # sub-sampling
        conv = tf.nn.avg_pool(conv, [1,2,2,1], [1,2,2,1], padding='SAME')
        # conv shape (batch, 14, 14, 6)
        # filter2 shape (5, 5, 6, 16)
        conv = tf.nn.conv2d(conv, filter2, [1,1,1,1], padding='VALID')
        # conv reshaped to (batch, 10, 10, 5*5*6)
        # filter2 reshaped to (5*5*6, 16)
        # conv shape (batch, 10, 10, 16)
        conv = tf.nn.tanh(conv + biases2)
        # conv shape (batch, 10, 10, 16)
        conv = tf.nn.avg_pool(conv, [1,2,2,1], [1,2,2,1], padding='SAME')
        # conv shape (batch, 5, 5, 16)
        # filter3 shape (5, 5, 16, 120)
        conv = tf.nn.conv2d(conv, filter3, [1,1,1,1], padding='VALID')
        # conv reshaped to (batch, 1, 1, 5*5*16)
        # filter3 reshaped to (5*5*16, 120)
        # conv shape (batch, 1, 1, 120)
        conv = tf.nn.tanh(conv + biases3)
        # flatten to (batch, 120) for the fully connected layers
        shape = conv.get_shape().as_list()
        reshape = tf.reshape(conv, (shape[0], reduce(lambda a,b:a*b, shape[1:])))
        hidden = tf.nn.relu(tf.matmul(reshape, weights1) + w_biases1)
        # dropout with keep probability 0.8; note that it is also active
        # when model() is applied to the test set below
        hidden = tf.nn.dropout(hidden, 0.8)
        logits = tf.matmul(hidden, weights2) + w_biases2
        return logits

    # Training computation.
    logits = model(tf_train_dataset)
    # keyword arguments are used because TensorFlow 1.x rejects positional ones here
    loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=tf_train_labels, logits=logits))

    # Optimizer.
    optimizer = tf.train.AdamOptimizer(learning_rate=0.001).minimize(loss)

    # Predictions for the training and test data.
    train_prediction = tf.nn.softmax(logits)
    test_prediction = tf.nn.softmax(model(tf_test_dataset))



with tf.Session(graph=graph) as session:
    tf.initialize_all_variables().run()
    print('Initialized')
    for step in range(num_steps):
        offset = (step * batch_size) % (train_labels.shape[0] - batch_size)
        batch_data = train_dataset[offset:(offset + batch_size), :, :, :]
        batch_labels = train_labels[offset:(offset + batch_size), :]
        feed_dict = {tf_train_dataset : batch_data, tf_train_labels : batch_labels}
        _, l, predictions = session.run([optimizer, loss, train_prediction], feed_dict=feed_dict)
        if (step % 500 == 0):
            print('Minibatch loss at step %d: %f' % (step, l))
            print('Minibatch accuracy: %.1f%%' % accuracy(predictions, batch_labels))
    print('Test accuracy: %.1f%%' % accuracy(test_prediction.eval(), test_labels))
Initialized
Minibatch loss at step 0: 2.441036
Minibatch accuracy: 10.9%
Minibatch loss at step 500: 0.214182
Minibatch accuracy: 92.2%
Minibatch loss at step 1000: 0.086537
Minibatch accuracy: 97.7%
Minibatch loss at step 1500: 0.107810
Minibatch accuracy: 96.9%
Minibatch loss at step 2000: 0.088142
Minibatch accuracy: 97.7%
Minibatch loss at step 2500: 0.150886
Minibatch accuracy: 96.1%
Minibatch loss at step 3000: 0.088806
Minibatch accuracy: 98.4%
Minibatch loss at step 3500: 0.039191
Minibatch accuracy: 97.7%
Minibatch loss at step 4000: 0.018480
Minibatch accuracy: 99.2%
Minibatch loss at step 4500: 0.010719
Minibatch accuracy: 100.0%
Test accuracy: 98.2%
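The training loop picks its minibatches with a sliding offset. The sketch below reproduces that arithmetic on its own, using the sizes from this post. The helper name batch_offsets is made up for this example.

```python
def batch_offsets(num_examples, batch_size, num_steps):
    """Start index of each minibatch, computed as in the training loop."""
    return [(step * batch_size) % (num_examples - batch_size)
            for step in range(num_steps)]


offsets = batch_offsets(num_examples=60000, batch_size=128, num_steps=4501)
print(offsets[:3])                  # [0, 128, 256]
print(max(offsets) + 128 <= 60000)  # True: no batch runs past the data
print(offsets[468])                 # 32: the second pass starts shifted
print(4501 * 128 / 60000.0)         # roughly 9.6 passes over the training set
```

Because the modulus is 60000 - 128 rather than 60000, every slice is a full batch, and each later pass over the data is shifted relative to the previous one.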

The test accuracy is about 98%, which still leaves room for improvement. Several aspects of our model could be tuned:
1. subsampling type: max pooling, average pooling, etc.
2. activation function: relu, tanh, etc.
3. initial values of the weights and biases.
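To illustrate the first item, here is a NumPy sketch that compares max pooling and average pooling on a made-up 4×4 feature map. In the model itself, trying max pooling would mean swapping tf.nn.avg_pool for tf.nn.max_pool. Whether that improves the accuracy is not tested here. The helper name pool2x2 is made up for this example.

```python
import numpy as np


def pool2x2(x, mode):
    """Non-overlapping 2x2 pooling over a 2-D array with even height and width."""
    h, w = x.shape
    # Split the array into 2x2 blocks: blocks[i, a, j, b] == x[2*i + a, 2*j + b].
    blocks = x.reshape(h // 2, 2, w // 2, 2)
    if mode == 'max':
        return blocks.max(axis=(1, 3))
    if mode == 'avg':
        return blocks.mean(axis=(1, 3))
    raise ValueError('unknown mode: ' + mode)


x = np.array([[1.0, 2.0, 0.0, 0.0],
              [3.0, 4.0, 0.0, 8.0],
              [0.0, 0.0, 1.0, 1.0],
              [0.0, 4.0, 1.0, 1.0]])
print(pool2x2(x, 'max'))  # keeps the strongest response in each block
print(pool2x2(x, 'avg'))  # smooths each block to its mean
```

In the top-right block the single strong value 8 survives max pooling unchanged but is averaged down to 2 by average pooling.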
