0 Preface





1 Environment setup

1.1 Python environment


git clone https://github.com/ultralytics/yolov5
cd yolov5


conda create -n yolov5py37 python=3.7
conda activate yolov5py37

pip install torch==1.7.1+cu110 torchvision==0.8.2+cu110 torchaudio==0.7.2 -f https://download.pytorch.org/whl/torch_stable.html

pip install -r requirements.txt
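
Before going further, it can help to confirm that the install steps above produced a working PyTorch with CUDA support. The following is a minimal sketch, not part of the YOLOv5 repository; the function name describe_environment is my own, and it only uses torch.__version__ and torch.cuda.is_available().

```python
import importlib
import platform


def describe_environment():
    """Report the Python version and, if importable, the torch version and CUDA status."""
    info = {"python": platform.python_version(), "torch": None, "cuda_available": None}
    try:
        torch = importlib.import_module("torch")
    except (ImportError, OSError):
        # torch is missing, or its native libraries failed to load
        return info
    info["torch"] = torch.__version__
    info["cuda_available"] = torch.cuda.is_available()
    return info


if __name__ == "__main__":
    print(describe_environment())
```

If torch is None, the pip step did not install into the active environment; if cuda_available is False, the CPU build was probably installed or the driver is not visible.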

1.2 Example from the official GitHub repository

1.2.1 Printing the detection results


import torch

# Model
model = torch.hub.load('ultralytics/yolov5', 'yolov5s')  # or yolov5m, yolov5l, yolov5x, custom

# Images
img = 'https://ultralytics.com/images/zidane.jpg'  # or file, Path, PIL, OpenCV, numpy, list

# Inference
results = model(img)

# Results
results.print()  # or .show(), .save(), .crop(), .pandas(), etc.


python inference.py
(yolov5py37) meng@meng:~/deeplearning/yolov5$ python inference.py 
Downloading: "https://github.com/ultralytics/yolov5/archive/master.zip" to /home/meng/.cache/torch/hub/master.zip
Downloading https://ultralytics.com/assets/Arial.ttf to /home/meng/.config/Ultralytics/Arial.ttf...
fatal: not a git repository (or any of the parent directories): .git
YOLOv5 🚀 2022-3-12 torch 1.7.1+cu110 CUDA:0 (NVIDIA GeForce RTX 3070, 7960MiB)

Downloading https://github.com/ultralytics/yolov5/releases/download/v6.1/yolov5s.pt to yolov5s.pt...
100%|█████████████████████████████████████| 14.1M/14.1M [00:07<00:00, 2.06MB/s]

Fusing layers... 
Model Summary: 213 layers, 7225885 parameters, 0 gradients, 16.5 GFLOPs
Adding AutoShape... 
image 1/1: 720x1280 2 persons, 2 ties
Speed: 7411.2ms pre-process, 8.4ms inference, 1.2ms NMS per image at shape (1, 3, 384, 640)
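
The summary line "2 persons, 2 ties" can also be reproduced from the raw detections, which is handy when the counts are needed in code rather than on screen. The sketch below assumes the detections were exported with results.pandas().xyxy[0].to_dict("records"); the helper name count_by_class and the sample confidence values are made up for illustration, and the 0.25 default simply mirrors the conf_thres value shown in the detect.py log.

```python
from collections import Counter


def count_by_class(records, min_conf=0.25):
    """Count detections per class name, ignoring those below min_conf."""
    return dict(Counter(r["name"] for r in records if r["confidence"] >= min_conf))


# Hypothetical rows shaped like results.pandas().xyxy[0].to_dict("records");
# the confidence values are invented for this example.
sample = [
    {"name": "person", "confidence": 0.88},
    {"name": "person", "confidence": 0.67},
    {"name": "tie", "confidence": 0.66},
    {"name": "tie", "confidence": 0.26},
    {"name": "tie", "confidence": 0.10},
]

print(count_by_class(sample))  # {'person': 2, 'tie': 2}
```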

1.2.2 Displaying the detection results


2 Running detection with detect.py


python detect.py --source 0  # webcam
                          img.jpg  # image
                          vid.mp4  # video
                          path/  # directory
                          path/*.jpg  # glob
                          'https://youtu.be/Zgi9g1ksQHc'  # YouTube
                          'rtsp://example.com/media.mp4'  # RTSP, RTMP, HTTP stream
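
To make the list of accepted --source values more concrete, here is a simplified sketch of how such a string could be sorted into the categories above. It is not the code detect.py uses; the function name classify_source and the short extension sets are assumptions for illustration (the real lists of supported formats are longer).

```python
from pathlib import Path

# Assumed, shortened extension sets; YOLOv5 accepts more formats than these.
IMG_EXTS = {"jpg", "jpeg", "png", "bmp"}
VID_EXTS = {"mp4", "avi", "mov", "mkv"}
URL_PREFIXES = ("rtsp://", "rtmp://", "http://", "https://")


def classify_source(source):
    """Return a rough category for a --source value."""
    if source.isnumeric():
        return "webcam"  # e.g. "0" is a local camera index
    if source.lower().startswith(URL_PREFIXES):
        return "stream"  # YouTube, RTSP, RTMP, HTTP
    if "*" in source:
        return "glob"
    suffix = Path(source).suffix[1:].lower()
    if suffix in IMG_EXTS:
        return "image"
    if suffix in VID_EXTS:
        return "video"
    return "directory"


for s in ["0", "img.jpg", "vid.mp4", "path/", "path/*.jpg", "rtsp://example.com/media.mp4"]:
    print(f"{s!r:35} -> {classify_source(s)}")
```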


2.1 Webcam


python detect.py --source 0


(yolov5py37) meng@meng:~/deeplearning/yolov5$ python detect.py --source 0
detect: weights=yolov5s.pt, source=0, data=data/coco128.yaml, imgsz=[640, 640], conf_thres=0.25, iou_thres=0.45, max_det=1000, device=, view_img=False, save_txt=False, save_conf=False, save_crop=False, nosave=False, classes=None, agnostic_nms=False, augment=False, visualize=False, update=False, project=runs/detect, name=exp, exist_ok=False, line_thickness=3, hide_labels=False, hide_conf=False, half=False, dnn=False
YOLOv5 🚀 v6.1-28-gc6b4f84 torch 1.7.1+cu110 CUDA:0 (NVIDIA GeForce RTX 3070, 7960MiB)

Fusing layers... 
Model Summary: 213 layers, 7225885 parameters, 0 gradients, 16.5 GFLOPs
1/1: 0...  Success (inf frames 640x480 at 30.00 FPS)

0: 480x640 1 person, 1 cup, 2 chairs, 2 tvs, Done. (0.501s)

2.2 Visualizing the detection process

python detect.py --visualize





import numpy as np

Running python new.py prints the matrix data stored inside, though there is a lot of it.
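
Since the full dump is long, a short summary is often easier to read. The sketch below assumes the data in question is a NumPy .npy file written next to the feature-map images by --visualize; that assumption, the function name summarize_array, and the path in the usage comment are mine and are not taken from the original new.py.

```python
import numpy as np


def summarize_array(path):
    """Load a .npy file and return its shape, dtype and basic statistics."""
    arr = np.load(path)
    return {
        "shape": arr.shape,
        "dtype": str(arr.dtype),
        "min": float(arr.min()),
        "max": float(arr.max()),
        "mean": float(arr.mean()),
    }


# Usage (hypothetical path, adjust to the file produced by your own run):
# print(summarize_array("runs/detect/exp/your_feature_file.npy"))
```

Printing only the shape and a few statistics shows what a layer produced without scrolling through every value.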

3 Training with train.py

3.1 First error

python train.py --data coco.yaml --cfg yolov5n.yaml --weights '' --batch-size 128


Traceback (most recent call last):
  File "train.py", line 643, in <module>
  File "train.py", line 539, in main
    train(opt.hyp, opt, device, callbacks)
  File "train.py", line 227, in train
    prefix=colorstr('train: '), shuffle=True)
  File "/home/meng/deeplearning/yolov5/utils/datasets.py", line 109, in create_dataloader
  File "/home/meng/deeplearning/yolov5/utils/datasets.py", line 433, in __init__
    assert nf > 0 or not augment, f'{prefix}No labels in {cache_path}. Can not train without labels. See {HELP_URL}'
AssertionError: train: No labels in /home/meng/deeplearning/datasets/coco/train2017.cache. Can not train without labels. See https://github.com/ultralytics/yolov5/wiki/Train-Custom-Data
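
The assertion fires when the data loader ends up with no usable image/label pairs. A quick pre-flight check of the dataset folder can show whether images, labels, or both are missing. The following is a minimal sketch under the assumption that the dataset follows the usual YOLOv5 layout, where an image under images/ has its label under labels/ with the same name and a .txt suffix; the helper names and the shortened extension set are my own.

```python
from pathlib import Path

# Assumed, shortened list of image extensions.
IMG_EXTS = {".jpg", ".jpeg", ".png", ".bmp"}


def label_path_for(image_path):
    """Map .../images/.../x.jpg to .../labels/.../x.txt (last 'images' component only)."""
    parts = list(image_path.parts)
    if "images" in parts:
        idx = len(parts) - 1 - parts[::-1].index("images")
        parts[idx] = "labels"
    return Path(*parts).with_suffix(".txt")


def count_pairs(dataset_root):
    """Return (number of images, number of images that have a label file)."""
    image_dir = Path(dataset_root) / "images"
    images = [p for p in image_dir.rglob("*") if p.suffix.lower() in IMG_EXTS]
    labelled = sum(label_path_for(p).is_file() for p in images)
    return len(images), labelled


# Usage, with the dataset root taken from the error message above:
# print(count_pairs("/home/meng/deeplearning/datasets/coco"))
```

A result of (0, 0) points to missing images, while a large first number with a zero second number points to missing labels.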


3.2 Trying a different command

Reference: Train Custom Data · ultralytics/yolov5 Wiki · GitHub (https://github.com/ultralytics/yolov5/wiki/Train-Custom-Data)

python train.py --img 640 --batch 16 --epochs 3 --data coco128.yaml --weights yolov5s.pt

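This command runs to completion, although it first reports: Dataset not found, missing paths: ['/home/meng/deeplearning/datasets/coco128/images/train2017']. As the log shows, the dataset is then downloaded automatically.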

meng@meng:~/deeplearning/yolov5$ python train.py --img 640 --batch 16 --epochs 3 --data coco128.yaml --weights yolov5s.pt
train: weights=yolov5s.pt, cfg=, data=coco128.yaml, hyp=data/hyps/hyp.scratch-low.yaml, epochs=3, batch_size=16, imgsz=640, rect=False, resume=False, nosave=False, noval=False, noautoanchor=False, evolve=None, bucket=, cache=None, image_weights=False, device=, multi_scale=False, single_cls=False, optimizer=SGD, sync_bn=False, workers=8, project=runs/train, name=exp, exist_ok=False, quad=False, cos_lr=False, label_smoothing=0.0, patience=100, freeze=[0], save_period=-1, local_rank=-1, entity=None, upload_dataset=False, bbox_interval=-1, artifact_alias=latest
github: skipping check (offline), for updates see https://github.com/ultralytics/yolov5
YOLOv5 🚀 v6.1-28-gc6b4f84 torch 1.7.1+cu110 CUDA:0 (NVIDIA GeForce RTX 3070, 7960MiB)

hyperparameters: lr0=0.01, lrf=0.01, momentum=0.937, weight_decay=0.0005, warmup_epochs=3.0, warmup_momentum=0.8, warmup_bias_lr=0.1, box=0.05, cls=0.5, cls_pw=1.0, obj=1.0, obj_pw=1.0, iou_t=0.2, anchor_t=4.0, fl_gamma=0.0, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, degrees=0.0, translate=0.1, scale=0.5, shear=0.0, perspective=0.0, flipud=0.0, fliplr=0.5, mosaic=1.0, mixup=0.0, copy_paste=0.0
Weights & Biases: run 'pip install wandb' to automatically track and visualize YOLOv5 🚀 runs (RECOMMENDED)
TensorBoard: Start with 'tensorboard --logdir runs/train', view at http://localhost:6006/

Dataset not found, missing paths: ['/home/meng/deeplearning/datasets/coco128/images/train2017']
Downloading https://ultralytics.com/assets/coco128.zip to coco128.zip...
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 6.66M/6.66M [00:02<00:00, 2.44MB/s]
Dataset autodownload success, saved to /home/meng/deeplearning/datasets

                 from  n    params  module                                  arguments                     
  0                -1  1      3520  models.common.Conv                      [3, 32, 6, 2, 2]              
  1                -1  1     18560  models.common.Conv                      [32, 64, 3, 2]                
  2                -1  1     18816  models.common.C3                        [64, 64, 1]                   
  3                -1  1     73984  models.common.Conv                      [64, 128, 3, 2]               
  4                -1  2    115712  models.common.C3                        [128, 128, 2]                 
  5                -1  1    295424  models.common.Conv                      [128, 256, 3, 2]              
  6                -1  3    625152  models.common.C3                        [256, 256, 3]                 
  7                -1  1   1180672  models.common.Conv                      [256, 512, 3, 2]              
  8                -1  1   1182720  models.common.C3                        [512, 512, 1]                 
  9                -1  1    656896  models.common.SPPF                      [512, 512, 5]                 
 10                -1  1    131584  models.common.Conv                      [512, 256, 1, 1]              
 11                -1  1         0  torch.nn.modules.upsampling.Upsample    [None, 2, 'nearest']          
 12           [-1, 6]  1         0  models.common.Concat                    [1]                           
 13                -1  1    361984  models.common.C3                        [512, 256, 1, False]          
 14                -1  1     33024  models.common.Conv                      [256, 128, 1, 1]              
 15                -1  1         0  torch.nn.modules.upsampling.Upsample    [None, 2, 'nearest']          
 16           [-1, 4]  1         0  models.common.Concat                    [1]                           
 17                -1  1     90880  models.common.C3                        [256, 128, 1, False]          
 18                -1  1    147712  models.common.Conv                      [128, 128, 3, 2]              
 19          [-1, 14]  1         0  models.common.Concat                    [1]                           
 20                -1  1    296448  models.common.C3                        [256, 256, 1, False]          
 21                -1  1    590336  models.common.Conv                      [256, 256, 3, 2]              
 22          [-1, 10]  1         0  models.common.Concat                    [1]                           
 23                -1  1   1182720  models.common.C3                        [512, 512, 1, False]          
 24      [17, 20, 23]  1    229245  models.yolo.Detect                      [80, [[10, 13, 16, 30, 33, 23], [30, 61, 62, 45, 59, 119], [116, 90, 156, 198, 373, 326]], [128, 256, 512]]
Model Summary: 270 layers, 7235389 parameters, 7235389 gradients, 16.5 GFLOPs

Transferred 349/349 items from yolov5s.pt
Scaled weight_decay = 0.0005
optimizer: SGD with parameter groups 57 weight (no decay), 60 weight, 60 bias
train: Scanning '/home/meng/deeplearning/datasets/coco128/labels/train2017' images and labels...128 found, 0 missing, 2 empty, 0 corrupt: 100%|███████████████████████████████████████████████████████████████████████████████████████████████| 128/128 [00:00<00:00, 11289.71it/s]
train: New cache created: /home/meng/deeplearning/datasets/coco128/labels/train2017.cache
val: Scanning '/home/meng/deeplearning/datasets/coco128/labels/train2017.cache' images and labels... 128 found, 0 missing, 2 empty, 0 corrupt: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████| 128/128 [00:00<?, ?it/s]
Plotting labels to runs/train/exp8/labels.jpg... 

AutoAnchor: 4.27 anchors/target, 0.994 Best Possible Recall (BPR). Current anchors are a good fit to dataset ✅
Image sizes 640 train, 640 val
Using 8 dataloader workers
Logging results to runs/train/exp8
Starting training for 3 epochs...

     Epoch   gpu_mem       box       obj       cls    labels  img_size
       0/2      3.3G   0.04377   0.06153   0.01789       226       640: 100%|██████████| 8/8 [00:03<00:00,  2.39it/s]                                                                                                                                                              
               Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|██████████| 4/4 [00:00<00:00,  6.85it/s]                                                                                                                                              
                 all        128        929      0.759      0.618      0.717      0.474

     Epoch   gpu_mem       box       obj       cls    labels  img_size
       1/2     4.27G   0.04464    0.0689   0.01817       207       640: 100%|██████████| 8/8 [00:01<00:00,  4.49it/s]                                                                                                                                                              
               Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|██████████| 4/4 [00:00<00:00,  6.95it/s]                                                                                                                                              
                 all        128        929      0.694      0.678      0.732      0.487

     Epoch   gpu_mem       box       obj       cls    labels  img_size
       2/2     4.27G   0.04492   0.06209   0.01751       241       640: 100%|██████████| 8/8 [00:01<00:00,  4.52it/s]                                                                                                                                                              
               Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|██████████| 4/4 [00:00<00:00,  6.98it/s]                                                                                                                                              
                 all        128        929      0.704      0.674      0.737       0.49

3 epochs completed in 0.003 hours.
Optimizer stripped from runs/train/exp8/weights/last.pt, 14.9MB
Optimizer stripped from runs/train/exp8/weights/best.pt, 14.9MB

Validating runs/train/exp8/weights/best.pt...
Fusing layers... 
Model Summary: 213 layers, 7225885 parameters, 0 gradients, 16.5 GFLOPs
               Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|██████████| 4/4 [00:03<00:00,  1.08it/s]                                                                                                                                              
                 all        128        929      0.699      0.679      0.738      0.491
              person        128        254      0.828      0.736      0.811       0.52
             bicycle        128          6      0.778      0.593      0.648      0.407
                 car        128         46      0.655      0.454      0.576      0.248
          motorcycle        128          5      0.571        0.8      0.866      0.705
            airplane        128          6      0.921          1      0.995      0.736
                 bus        128          7      0.626      0.714      0.738      0.626
               train        128          3      0.613      0.667      0.806      0.571
               truck        128         12      0.454      0.417      0.495       0.27
                boat        128          6      0.794      0.333      0.464      0.173
       traffic light        128         14      0.648      0.266      0.362      0.216
           stop sign        128          2      0.751          1      0.995      0.796
               bench        128          9      0.633      0.556      0.624      0.231
                bird        128         16      0.899          1      0.995      0.634
                 cat        128          4      0.704          1      0.995      0.797
                 dog        128          9      0.782      0.667      0.851      0.567
               horse        128          2      0.723          1      0.995      0.672
            elephant        128         17      0.945      0.882      0.934      0.694
                bear        128          1      0.635          1      0.995      0.895
               zebra        128          4      0.848          1      0.995      0.947
             giraffe        128          9      0.704      0.778       0.94      0.687
            backpack        128          6      0.751        0.5      0.779      0.362
            umbrella        128         18      0.873      0.765      0.899      0.513
             handbag        128         19      0.599      0.238      0.335      0.142
                 tie        128          7      0.708      0.714       0.81      0.498
            suitcase        128          4      0.726          1      0.995      0.563
             frisbee        128          5      0.688        0.8        0.8        0.7
                skis        128          1      0.598          1      0.995      0.398
           snowboard        128          7      0.796      0.714      0.848      0.567
         sports ball        128          6      0.613      0.667      0.603      0.309
                kite        128         10      0.777      0.698      0.629      0.249
        baseball bat        128          4      0.381        0.5        0.4      0.135
      baseball glove        128          7      0.527      0.429      0.457      0.309
          skateboard        128          5          1      0.571       0.69      0.476
       tennis racket        128          7      0.438      0.448      0.534      0.291
              bottle        128         18      0.695      0.635        0.6      0.281
          wine glass        128         16      0.605          1      0.916      0.469
                 cup        128         36      0.795      0.753      0.845      0.542
                fork        128          6      0.866      0.333      0.445      0.314
               knife        128         16      0.731      0.688      0.656      0.367
               spoon        128         22      0.695      0.545      0.645       0.35
                bowl        128         28      0.834      0.719      0.741      0.505
              banana        128          1      0.465          1      0.995      0.298
            sandwich        128          2          1          0      0.606      0.535
              orange        128          4      0.801          1      0.995      0.703
            broccoli        128         11      0.461      0.455      0.476       0.35
              carrot        128         24      0.633      0.625      0.736      0.473
             hot dog        128          2      0.436          1      0.828      0.712
               pizza        128          5      0.833        0.8      0.962      0.677
               donut        128         14      0.667          1       0.96       0.82
                cake        128          4      0.823          1      0.995      0.846
               chair        128         35      0.485      0.629      0.587      0.297
               couch        128          6          1      0.761      0.881       0.54
        potted plant        128         14      0.712      0.786      0.858      0.471
                 bed        128          3       0.43      0.287      0.597      0.374
        dining table        128         13      0.798      0.608      0.598      0.389
              toilet        128          2      0.791          1      0.995      0.895
                  tv        128          2      0.575          1      0.995      0.796
              laptop        128          3      0.965      0.333      0.665      0.399
               mouse        128          2          1          0     0.0923     0.0462
              remote        128          8      0.806      0.625      0.635       0.54
          cell phone        128          8      0.474      0.375      0.365      0.214
           microwave        128          3      0.749          1      0.995        0.7
                oven        128          5      0.413        0.4       0.44      0.289
                sink        128          6      0.377      0.333       0.34      0.217
        refrigerator        128          5      0.593        0.8      0.808       0.55
                book        128         29      0.462      0.356      0.321      0.158
               clock        128          9      0.671      0.778      0.879      0.728
                vase        128          2      0.408          1      0.995      0.895
            scissors        128          1          1          0      0.332     0.0663
          teddy bear        128         21      0.764      0.667      0.787      0.496
          toothbrush        128          5      0.802          1      0.962      0.621
Results saved to runs/train/exp8


3.3 Comparing the datasets used by the two commands above


# YOLOv5 🚀 by Ultralytics, GPL-3.0 license
# COCO 2017 dataset http://cocodataset.org by Microsoft
# Example usage: python train.py --data coco.yaml
# parent
# ├── yolov5
# └── datasets
#     └── coco  ← downloads here

# Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
path: ../datasets/coco  # dataset root dir
train: train2017.txt  # train images (relative to 'path') 118287 images
val: val2017.txt  # val images (relative to 'path') 5000 images
test: test-dev2017.txt  # 20288 of 40670 images, submit to https://competitions.codalab.org/competitions/20794

# Classes
nc: 80  # number of classes
names: ['person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus', 'train', 'truck', 'boat', 'traffic light',
        'fire hydrant', 'stop sign', 'parking meter', 'bench', 'bird', 'cat', 'dog', 'horse', 'sheep', 'cow',
        'elephant', 'bear', 'zebra', 'giraffe', 'backpack', 'umbrella', 'handbag', 'tie', 'suitcase', 'frisbee',
        'skis', 'snowboard', 'sports ball', 'kite', 'baseball bat', 'baseball glove', 'skateboard', 'surfboard',
        'tennis racket', 'bottle', 'wine glass', 'cup', 'fork', 'knife', 'spoon', 'bowl', 'banana', 'apple',
        'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza', 'donut', 'cake', 'chair', 'couch',
        'potted plant', 'bed', 'dining table', 'toilet', 'tv', 'laptop', 'mouse', 'remote', 'keyboard', 'cell phone',
        'microwave', 'oven', 'toaster', 'sink', 'refrigerator', 'book', 'clock', 'vase', 'scissors', 'teddy bear',
        'hair drier', 'toothbrush']  # class names

# Download script/URL (optional)
download: |
  from utils.general import download, Path

  # Download labels
  segments = False  # segment or box labels
  dir = Path(yaml['path'])  # dataset root dir
  url = 'https://github.com/ultralytics/yolov5/releases/download/v1.0/'
  urls = [url + ('coco2017labels-segments.zip' if segments else 'coco2017labels.zip')]  # labels
  download(urls, dir=dir.parent)

  # Download data
  urls = ['http://images.cocodataset.org/zips/train2017.zip',  # 19G, 118k images
          'http://images.cocodataset.org/zips/val2017.zip',  # 1G, 5k images
          'http://images.cocodataset.org/zips/test2017.zip']  # 7G, 41k images (optional)
  download(urls, dir=dir / 'images', threads=3)



# YOLOv5 🚀 by Ultralytics, GPL-3.0 license
# COCO128 dataset https://www.kaggle.com/ultralytics/coco128 (first 128 images from COCO train2017) by Ultralytics
# Example usage: python train.py --data coco128.yaml
# parent
# ├── yolov5
# └── datasets
#     └── coco128  ← downloads here

# Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
path: ../datasets/coco128  # dataset root dir
train: images/train2017  # train images (relative to 'path') 128 images
val: images/train2017  # val images (relative to 'path') 128 images
test:  # test images (optional)

# Classes
nc: 80  # number of classes
names: ['person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus', 'train', 'truck', 'boat', 'traffic light',
        'fire hydrant', 'stop sign', 'parking meter', 'bench', 'bird', 'cat', 'dog', 'horse', 'sheep', 'cow',
        'elephant', 'bear', 'zebra', 'giraffe', 'backpack', 'umbrella', 'handbag', 'tie', 'suitcase', 'frisbee',
        'skis', 'snowboard', 'sports ball', 'kite', 'baseball bat', 'baseball glove', 'skateboard', 'surfboard',
        'tennis racket', 'bottle', 'wine glass', 'cup', 'fork', 'knife', 'spoon', 'bowl', 'banana', 'apple',
        'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza', 'donut', 'cake', 'chair', 'couch',
        'potted plant', 'bed', 'dining table', 'toilet', 'tv', 'laptop', 'mouse', 'remote', 'keyboard', 'cell phone',
        'microwave', 'oven', 'toaster', 'sink', 'refrigerator', 'book', 'clock', 'vase', 'scissors', 'teddy bear',
        'hair drier', 'toothbrush']  # class names

# Download script/URL (optional)
download: https://ultralytics.com/assets/coco128.zip
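
The practical difference between the two files is easier to see once the path, train and val entries are resolved side by side. The sketch below copies those entries from the two YAML files into plain dictionaries (to avoid depending on a YAML parser); the helper names resolve_split and split_kind are hypothetical.

```python
from pathlib import Path

# Entries copied from coco.yaml and coco128.yaml above.
COCO = {"path": "../datasets/coco", "train": "train2017.txt",
        "val": "val2017.txt", "test": "test-dev2017.txt"}
COCO128 = {"path": "../datasets/coco128", "train": "images/train2017",
           "val": "images/train2017", "test": None}


def resolve_split(cfg, split):
    """Join the dataset root with the entry for one split, or return None if unset."""
    entry = cfg.get(split)
    if not entry:
        return None
    return Path(cfg["path"]) / entry


def split_kind(path):
    """Say whether a split points at a list file or at a directory of images."""
    if path is None:
        return "not set"
    return "image list file" if path.suffix == ".txt" else "image directory"


for name, cfg in [("coco", COCO), ("coco128", COCO128)]:
    for split in ("train", "val", "test"):
        p = resolve_split(cfg, split)
        print(f"{name:8} {split:5} {split_kind(p):16} {p}")
```

Two things stand out: coco.yaml points at .txt list files while coco128.yaml points at an image directory, and coco128 uses the same 128 images for both training and validation.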

3.4 The first error, half solved


http://images.cocodataset.org/zips/train2017.zip # 19G, 118k images
http://images.cocodataset.org/zips/val2017.zip   # 1G, 5k images


Also delete the download section from coco.yaml (after making a backup).


python train.py --data coco.yaml --cfg yolov5n.yaml --weights '' --batch-size 128

The image files are now found correctly, but training fails with CUDA out of memory; my GPU has only 8 GB of memory.

RuntimeError: CUDA out of memory. Tried to allocate 200.00 MiB (GPU 0; 7.77 GiB total capacity; 5.70 GiB already allocated; 177.62 MiB free; 5.92 GiB reserved in total by PyTorch)
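
A common way to deal with this is to retry with a smaller --batch-size. The sketch below only generates the candidate commands by halving the batch size; it makes no claim about which value fits in 8 GB, and the helper names and the lower bound of 8 are assumptions for illustration.

```python
def halved_batch_sizes(start, minimum=8):
    """Return start, start/2, start/4, ... down to (and not below) minimum."""
    if start < 1 or minimum < 1:
        raise ValueError("start and minimum must be positive")
    sizes = []
    size = start
    while size >= minimum:
        sizes.append(size)
        size //= 2
    return sizes


def train_command(batch_size):
    """Build the training command from this section with a different batch size."""
    return ("python train.py --data coco.yaml --cfg yolov5n.yaml "
            f"--weights '' --batch-size {batch_size}")


# Skip the first entry (128), which is the value that already failed.
for size in halved_batch_sizes(128)[1:]:
    print(train_command(size))
```

Trying the printed commands from the top down finds the largest batch size that still fits in memory.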






