An Overall Introduction to T5 [Hands-On Code]

2023-10-31

0. Preface

This post is an overview of the T5 pre-trained model, together with runnable demos of the tasks it can perform; the complete notebook is linked in the reference at the end.

1. Header

import torch
from torch import nn
import torch.nn.functional as F
import transformers
# from transformers_utils import get_params
from transformers import pipeline
# ~/.cache/huggingface/hub
from transformers import AutoTokenizer, AutoConfig, AutoModel
# ~/.cache/huggingface/datasets
from datasets import load_dataset
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib as mpl
from IPython.display import Image

# trainable parameters of the model
def get_params(model):
    model_parameters = filter(lambda p: p.requires_grad, model.parameters())  # keep only parameters that require gradients
    params = sum([np.prod(p.size()) for p in model_parameters])  # np.prod(p.size()) = number of elements in one tensor; sum over all trainable tensors
    return params

# default: 100
mpl.rcParams['figure.dpi'] = 150
device = 'cuda' if torch.cuda.is_available() else 'cpu'
device

'cuda'

2. Summary

Image('t5.png')

[Figure: t5.png — the T5 text-to-text framework, with task prefixes for translation, CoLA, STS-B and summarization]
As the figure shows, the tasks covered include: 1. translation; 2. judging whether a sentence is acceptable (CoLA); 3. computing the similarity between two sentences (STS-B); 4. summarization.

  • CoLA: Linguistic Acceptability
    • CoLA (The Corpus of Linguistic Acceptability) is an English acceptability-judgment dataset introduced in 2018. It serves as a benchmark for how well NLP models can judge whether a sentence is linguistically acceptable and fluent.

    • The dataset contains 10,657 English sentences drawn from examples in the published linguistics literature. Each sentence is labeled as acceptable or unacceptable: acceptable sentences are grammatical and natural, while unacceptable ones exhibit problems such as syntactic violations or semantic incoherence.

    • CoLA is a classic binary-classification task that probes a model's grasp of English syntax and, to some extent, semantics. It is challenging because the unacceptable sentences cover a wide range of error types.

    • CoLA is widely used in NLP research on language understanding and syntactic/semantic analysis, and it is one of the standard GLUE tasks for evaluating and comparing pre-trained models.

  • STSB: Semantic Textual Similarity Benchmark
    • STS-B (The Semantic Textual Similarity Benchmark) is a dataset and benchmark for evaluating how well NLP models capture the semantic similarity of sentence pairs. It was compiled from the SemEval STS shared tasks run between 2012 and 2017.

    • The dataset contains roughly 8,600 sentence pairs drawn from sources such as news headlines and image and video captions. Each pair is annotated with a similarity score between 0 and 5 (0 means the sentences are unrelated, 5 means they are semantically equivalent).

    • The model has to predict the similarity of each pair, i.e. output a real-valued score rather than a class label, so STS-B is a regression problem.

    • STS-B is one of the standard benchmarks for sentence- and text-similarity tasks and has been used to evaluate a wide range of NLP models; it is directly relevant to applications such as question answering and text matching. A short sketch of how CoLA and STS-B are phrased as T5 text-to-text prompts follows below.
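
As a concrete illustration (not part of the original notebook), here is a minimal sketch of how these two GLUE tasks are phrased as text-to-text prompts. The prefixes follow the conventions of the T5 paper; t5-small is used only to keep the example light.

from transformers import AutoTokenizer, T5ForConditionalGeneration

tok = AutoTokenizer.from_pretrained("t5-small")
m = T5ForConditionalGeneration.from_pretrained("t5-small")

examples = [
    # CoLA: the target is the word "acceptable" or "unacceptable"
    "cola sentence: The book was written by John.",
    # STS-B: the target is the similarity score rendered as text, e.g. "3.8"
    "stsb sentence1: A man is playing a guitar. sentence2: A person plays an instrument.",
]
for text in examples:
    ids = tok(text, return_tensors="pt").input_ids
    out = m.generate(ids, max_length=8)
    print(tok.decode(out[0], skip_special_tokens=True))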

3. T5 model

  • vocabulary size: 32128

model      params   attention dim (d_kv*heads) -> d_model   encoder/decoder layers
t5-small   61M      512   (64*8)     -> 512                  6
t5-base    223M     768   (64*12)    -> 768                  12
t5-large   738M     1024  (64*16)    -> 1024                 24
t5-3b      2.85B    4096  (128*32)   -> 1024                 24
t5-11b     11B      16384 (128*128)  -> 1024                 24
# available checkpoints:
# t5-small
# t5-base
# t5-large
# t5-3b
# t5-11b
model_ckpt = 't5-3b'
# AutoModel loads the bare T5Model; T5ForConditionalGeneration (used later) adds an extra
# hidden_state -> vocab_size projection (the lm_head) on top of it
model = AutoModel.from_pretrained(model_ckpt)
tokenizer = AutoTokenizer.from_pretrained(model_ckpt)
config = AutoConfig.from_pretrained(model_ckpt)
config

T5Config {
  "_name_or_path": "t5-3b",
  "architectures": [
    "T5WithLMHeadModel"
  ],
  "d_ff": 16384,
  "d_kv": 128,
  "d_model": 1024,
  "decoder_start_token_id": 0,
  "dense_act_fn": "relu",
  "dropout_rate": 0.1,
  "eos_token_id": 1,
  "feed_forward_proj": "relu",
  "initializer_factor": 1.0,
  "is_encoder_decoder": true,
  "is_gated_act": false,
  "layer_norm_epsilon": 1e-06,
  "model_type": "t5",
  "n_positions": 512,
  "num_decoder_layers": 24,
  "num_heads": 32,
  "num_layers": 24,
  "output_past": true,
  "pad_token_id": 0,
  "relative_attention_max_distance": 128,
  "relative_attention_num_buckets": 32,
  "task_specific_params": {
    "summarization": {
      "early_stopping": true,
      "length_penalty": 2.0,
      "max_length": 200,
      "min_length": 30,
      "no_repeat_ngram_size": 3,
      "num_beams": 4,
      "prefix": "summarize: "
    },
    "translation_en_to_de": {
      "early_stopping": true,
      "max_length": 300,
      "num_beams": 4,
      "prefix": "translate English to German: "
    },
    "translation_en_to_fr": {
      "early_stopping": true,
      "max_length": 300,
      "num_beams": 4,
      "prefix": "translate English to French: "
    },
    "translation_en_to_ro": {
      "early_stopping": true,
      "max_length": 300,
      "num_beams": 4,
      "prefix": "translate English to Romanian: "
    }
  },
  "transformers_version": "4.28.0",
  "use_cache": true,
  "vocab_size": 32128
}

The config therefore ships with four ready-made task prefixes: 1. summarize:; 2. translate English to German:; 3. translate English to French:; 4. translate English to Romanian:.
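
These prefixes and their suggested decoding settings can also be read back from the config at run time; a small sketch (not in the original notebook):

summ = config.task_specific_params["summarization"]
print(summ["prefix"])  # "summarize: "
gen_kwargs = {k: v for k, v in summ.items() if k != "prefix"}
# gen_kwargs now holds num_beams, max_length, min_length, length_penalty, etc.
# and can later be passed to T5ForConditionalGeneration.generate(..., **gen_kwargs)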

model

T5Model(
  (shared): Embedding(32128, 1024)
  (encoder): T5Stack(
    (embed_tokens): Embedding(32128, 1024)
    (block): ModuleList(
      (0): T5Block(
        (layer): ModuleList(
          (0): T5LayerSelfAttention(
            (SelfAttention): T5Attention(
              (q): Linear(in_features=1024, out_features=4096, bias=False)
              (k): Linear(in_features=1024, out_features=4096, bias=False)
              (v): Linear(in_features=1024, out_features=4096, bias=False)
              (o): Linear(in_features=4096, out_features=1024, bias=False)
              (relative_attention_bias): Embedding(32, 32)
            )
            (layer_norm): T5LayerNorm()
            (dropout): Dropout(p=0.1, inplace=False)
          )
          (1): T5LayerFF(
            (DenseReluDense): T5DenseActDense(
              (wi): Linear(in_features=1024, out_features=16384, bias=False)
              (wo): Linear(in_features=16384, out_features=1024, bias=False)
              (dropout): Dropout(p=0.1, inplace=False)
              (act): ReLU()
            )
            (layer_norm): T5LayerNorm()
            (dropout): Dropout(p=0.1, inplace=False)
          )
        )
      )
      (1-23): 23 x T5Block(
        (layer): ModuleList(
          (0): T5LayerSelfAttention(
            (SelfAttention): T5Attention(
              (q): Linear(in_features=1024, out_features=4096, bias=False)
              (k): Linear(in_features=1024, out_features=4096, bias=False)
              (v): Linear(in_features=1024, out_features=4096, bias=False)
              (o): Linear(in_features=4096, out_features=1024, bias=False)
            )
            (layer_norm): T5LayerNorm()
            (dropout): Dropout(p=0.1, inplace=False)
          )
          (1): T5LayerFF(
            (DenseReluDense): T5DenseActDense(
              (wi): Linear(in_features=1024, out_features=16384, bias=False)
              (wo): Linear(in_features=16384, out_features=1024, bias=False)
              (dropout): Dropout(p=0.1, inplace=False)
              (act): ReLU()
            )
            (layer_norm): T5LayerNorm()
            (dropout): Dropout(p=0.1, inplace=False)
          )
        )
      )
    )
    (final_layer_norm): T5LayerNorm()
    (dropout): Dropout(p=0.1, inplace=False)
  )
  (decoder): T5Stack(
    (embed_tokens): Embedding(32128, 1024)
    (block): ModuleList(
      (0): T5Block(
        (layer): ModuleList(
          (0): T5LayerSelfAttention(
            (SelfAttention): T5Attention(
              (q): Linear(in_features=1024, out_features=4096, bias=False)
              (k): Linear(in_features=1024, out_features=4096, bias=False)
              (v): Linear(in_features=1024, out_features=4096, bias=False)
              (o): Linear(in_features=4096, out_features=1024, bias=False)
              (relative_attention_bias): Embedding(32, 32)
            )
            (layer_norm): T5LayerNorm()
            (dropout): Dropout(p=0.1, inplace=False)
          )
          (1): T5LayerCrossAttention(
            (EncDecAttention): T5Attention(
              (q): Linear(in_features=1024, out_features=4096, bias=False)
              (k): Linear(in_features=1024, out_features=4096, bias=False)
              (v): Linear(in_features=1024, out_features=4096, bias=False)
              (o): Linear(in_features=4096, out_features=1024, bias=False)
            )
            (layer_norm): T5LayerNorm()
            (dropout): Dropout(p=0.1, inplace=False)
          )
          (2): T5LayerFF(
            (DenseReluDense): T5DenseActDense(
              (wi): Linear(in_features=1024, out_features=16384, bias=False)
              (wo): Linear(in_features=16384, out_features=1024, bias=False)
              (dropout): Dropout(p=0.1, inplace=False)
              (act): ReLU()
            )
            (layer_norm): T5LayerNorm()
            (dropout): Dropout(p=0.1, inplace=False)
          )
        )
      )
      (1-23): 23 x T5Block(
        (layer): ModuleList(
          (0): T5LayerSelfAttention(
            (SelfAttention): T5Attention(
              (q): Linear(in_features=1024, out_features=4096, bias=False)
              (k): Linear(in_features=1024, out_features=4096, bias=False)
              (v): Linear(in_features=1024, out_features=4096, bias=False)
              (o): Linear(in_features=4096, out_features=1024, bias=False)
            )
            (layer_norm): T5LayerNorm()
            (dropout): Dropout(p=0.1, inplace=False)
          )
          (1): T5LayerCrossAttention(
            (EncDecAttention): T5Attention(
              (q): Linear(in_features=1024, out_features=4096, bias=False)
              (k): Linear(in_features=1024, out_features=4096, bias=False)
              (v): Linear(in_features=1024, out_features=4096, bias=False)
              (o): Linear(in_features=4096, out_features=1024, bias=False)
            )
            (layer_norm): T5LayerNorm()
            (dropout): Dropout(p=0.1, inplace=False)
          )
          (2): T5LayerFF(
            (DenseReluDense): T5DenseActDense(
              (wi): Linear(in_features=1024, out_features=16384, bias=False)
              (wo): Linear(in_features=16384, out_features=1024, bias=False)
              (dropout): Dropout(p=0.1, inplace=False)
              (act): ReLU()
            )
            (layer_norm): T5LayerNorm()
            (dropout): Dropout(p=0.1, inplace=False)
          )
        )
      )
    )
    (final_layer_norm): T5LayerNorm()
    (dropout): Dropout(p=0.1, inplace=False)
  )
)

format(get_params(model), ',')

'2,851,598,336'

So t5-3b indeed has about 2.85B parameters.
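
A quick added sketch of how that count splits across the top-level submodules. Note that the input embedding is weight-tied, so shared, encoder and decoder each report the same 32128 x 1024 matrix and the per-module sums add up to more than the total above.

for name, child in model.named_children():
    n = sum(p.numel() for p in child.parameters())
    print(f"{name:8s} {n:,}")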

3.1 forward

input_ids = tokenizer(
    "Studies have been shown that owning a dog is good for you", return_tensors="pt"
).input_ids  # Batch size 1
decoder_input_ids = tokenizer("Studies show that", return_tensors="pt").input_ids  
# preprocess: Prepend decoder_input_ids with start token which is pad token for T5Model.
# This is not needed for torch's T5ForConditionalGeneration as it does this internally using labels arg.
decoder_input_ids = model._shift_right(decoder_input_ids)
input_ids

tensor([[6536, 43, 118, 2008, 24, 293, 53, 3, 9, 1782, 19, 207,
21, 25, 1]])
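
For reference, a hand-rolled sketch of what model._shift_right does: prepend the decoder start token (which T5 sets to the pad token, id 0) and drop the last position, so the length stays the same; when applied to labels it additionally replaces -100 with the pad id.

raw = tokenizer("Studies show that", return_tensors="pt").input_ids
start = torch.full((raw.shape[0], 1), model.config.decoder_start_token_id, dtype=raw.dtype)
manual_shift = torch.cat([start, raw[:, :-1]], dim=-1)
# manual_shift should match model._shift_right(raw)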

model.eval()
# forward pass
outputs = model(input_ids=input_ids, decoder_input_ids=decoder_input_ids)
last_hidden_states = outputs.last_hidden_state
last_hidden_states

tensor([[[-0.1611, -0.0524, 0.2812, …, -0.0113, -0.5561, -0.1034],
[-0.0441, 0.0494, 0.0101, …, 0.2337, 0.1868, 0.0204],
[-0.1586, -0.0830, -0.0067, …, 0.1704, 0.0040, 0.1689],
[-0.0349, -0.0160, 0.0020, …, 0.1688, -0.0871, 0.1037]]],
grad_fn=<MulBackward0>)

def t5_forward(model, input_ids, decoder_input_ids):
    encoder_outputs = model.encoder(input_ids=input_ids)
#     print(encoder_outputs)
    hidden_states = encoder_outputs[0]
    decoder_outputs = model.decoder(input_ids=decoder_input_ids, 
                                    encoder_hidden_states=hidden_states,)
    return decoder_outputs.last_hidden_state
t5_forward(model, input_ids, decoder_input_ids)

tensor([[[-0.1611, -0.0524, 0.2812, …, -0.0113, -0.5561, -0.1034],
[-0.0441, 0.0494, 0.0101, …, 0.2337, 0.1868, 0.0204],
[-0.1586, -0.0830, -0.0067, …, 0.1704, 0.0040, 0.1689],
[-0.0349, -0.0160, 0.0020, …, 0.1688, -0.0871, 0.1037]]],
grad_fn=<MulBackward0>)

As expected, the two results are identical: calling the encoder and decoder by hand reproduces the full forward pass.
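
A numerical check of that claim (a small added sanity test):

with torch.no_grad():
    full = model(input_ids=input_ids, decoder_input_ids=decoder_input_ids).last_hidden_state
    manual = t5_forward(model, input_ids, decoder_input_ids)
print(torch.allclose(full, manual, atol=1e-5))  # True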

3.2 Pre-training tasks

  • Unsupervised denoising training
    • MLM (masked language modeling)
    • span masking with sentinel tokens (see the sketch below)
  • Supervised training
    • seq2seq
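
Before running the code, it helps to see the shape of a span-masked example. The sketch below is hand-written (following the format described in the T5 paper and the Hugging Face docs, not taken from the original notebook): consecutive dropped spans are replaced in the encoder input by sentinel tokens <extra_id_0>, <extra_id_1>, ..., and the decoder target lists the dropped spans, each introduced by its sentinel and terminated by the next one.

# original sentence
text = "Thank you for inviting me to your party last week ."
# encoder input: the masked spans are replaced by sentinel tokens
corrupted_input = "Thank you <extra_id_0> me to your party <extra_id_1> week ."
# decoder target: the dropped spans, delimited by the sentinels
denoising_target = "<extra_id_0> for inviting <extra_id_1> last <extra_id_2>"
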
from transformers import T5ForConditionalGeneration
model = T5ForConditionalGeneration.from_pretrained(model_ckpt)
model

T5ForConditionalGeneration(
  (shared): Embedding(32128, 1024)
  (encoder): T5Stack(
    (embed_tokens): Embedding(32128, 1024)
    (block): ModuleList(
      (0): T5Block(
        (layer): ModuleList(
          (0): T5LayerSelfAttention(
            (SelfAttention): T5Attention(
              (q): Linear(in_features=1024, out_features=4096, bias=False)
              (k): Linear(in_features=1024, out_features=4096, bias=False)
              (v): Linear(in_features=1024, out_features=4096, bias=False)
              (o): Linear(in_features=4096, out_features=1024, bias=False)
              (relative_attention_bias): Embedding(32, 32)
            )
            (layer_norm): T5LayerNorm()
            (dropout): Dropout(p=0.1, inplace=False)
          )
          (1): T5LayerFF(
            (DenseReluDense): T5DenseActDense(
              (wi): Linear(in_features=1024, out_features=16384, bias=False)
              (wo): Linear(in_features=16384, out_features=1024, bias=False)
              (dropout): Dropout(p=0.1, inplace=False)
              (act): ReLU()
            )
            (layer_norm): T5LayerNorm()
            (dropout): Dropout(p=0.1, inplace=False)
          )
        )
      )
      (1-23): 23 x T5Block(
        (layer): ModuleList(
          (0): T5LayerSelfAttention(
            (SelfAttention): T5Attention(
              (q): Linear(in_features=1024, out_features=4096, bias=False)
              (k): Linear(in_features=1024, out_features=4096, bias=False)
              (v): Linear(in_features=1024, out_features=4096, bias=False)
              (o): Linear(in_features=4096, out_features=1024, bias=False)
            )
            (layer_norm): T5LayerNorm()
            (dropout): Dropout(p=0.1, inplace=False)
          )
          (1): T5LayerFF(
            (DenseReluDense): T5DenseActDense(
              (wi): Linear(in_features=1024, out_features=16384, bias=False)
              (wo): Linear(in_features=16384, out_features=1024, bias=False)
              (dropout): Dropout(p=0.1, inplace=False)
              (act): ReLU()
            )
            (layer_norm): T5LayerNorm()
            (dropout): Dropout(p=0.1, inplace=False)
          )
        )
      )
    )
    (final_layer_norm): T5LayerNorm()
    (dropout): Dropout(p=0.1, inplace=False)
  )
  (decoder): T5Stack(
    (embed_tokens): Embedding(32128, 1024)
    (block): ModuleList(
      (0): T5Block(
        (layer): ModuleList(
          (0): T5LayerSelfAttention(
            (SelfAttention): T5Attention(
              (q): Linear(in_features=1024, out_features=4096, bias=False)
              (k): Linear(in_features=1024, out_features=4096, bias=False)
              (v): Linear(in_features=1024, out_features=4096, bias=False)
              (o): Linear(in_features=4096, out_features=1024, bias=False)
              (relative_attention_bias): Embedding(32, 32)
            )
            (layer_norm): T5LayerNorm()
            (dropout): Dropout(p=0.1, inplace=False)
          )
          (1): T5LayerCrossAttention(
            (EncDecAttention): T5Attention(
              (q): Linear(in_features=1024, out_features=4096, bias=False)
              (k): Linear(in_features=1024, out_features=4096, bias=False)
              (v): Linear(in_features=1024, out_features=4096, bias=False)
              (o): Linear(in_features=4096, out_features=1024, bias=False)
            )
            (layer_norm): T5LayerNorm()
            (dropout): Dropout(p=0.1, inplace=False)
          )
          (2): T5LayerFF(
            (DenseReluDense): T5DenseActDense(
              (wi): Linear(in_features=1024, out_features=16384, bias=False)
              (wo): Linear(in_features=16384, out_features=1024, bias=False)
              (dropout): Dropout(p=0.1, inplace=False)
              (act): ReLU()
            )
            (layer_norm): T5LayerNorm()
            (dropout): Dropout(p=0.1, inplace=False)
          )
        )
      )
      (1-23): 23 x T5Block(
        (layer): ModuleList(
          (0): T5LayerSelfAttention(
            (SelfAttention): T5Attention(
              (q): Linear(in_features=1024, out_features=4096, bias=False)
              (k): Linear(in_features=1024, out_features=4096, bias=False)
              (v): Linear(in_features=1024, out_features=4096, bias=False)
              (o): Linear(in_features=4096, out_features=1024, bias=False)
            )
            (layer_norm): T5LayerNorm()
            (dropout): Dropout(p=0.1, inplace=False)
          )
          (1): T5LayerCrossAttention(
            (EncDecAttention): T5Attention(
              (q): Linear(in_features=1024, out_features=4096, bias=False)
              (k): Linear(in_features=1024, out_features=4096, bias=False)
              (v): Linear(in_features=1024, out_features=4096, bias=False)
              (o): Linear(in_features=4096, out_features=1024, bias=False)
            )
            (layer_norm): T5LayerNorm()
            (dropout): Dropout(p=0.1, inplace=False)
          )
          (2): T5LayerFF(
            (DenseReluDense): T5DenseActDense(
              (wi): Linear(in_features=1024, out_features=16384, bias=False)
              (wo): Linear(in_features=16384, out_features=1024, bias=False)
              (dropout): Dropout(p=0.1, inplace=False)
              (act): ReLU()
            )
            (layer_norm): T5LayerNorm()
            (dropout): Dropout(p=0.1, inplace=False)
          )
        )
      )
    )
    (final_layer_norm): T5LayerNorm()
    (dropout): Dropout(p=0.1, inplace=False)
  )
  (lm_head): Linear(in_features=1024, out_features=32128, bias=False)
)

Compared with T5Model, T5ForConditionalGeneration adds one final layer, (lm_head): Linear(in_features=1024, out_features=32128, bias=False), i.e. the language-model head that maps hidden states to vocabulary logits.

# Unsupervised denoising training
# mlm
input_ids = tokenizer("The <extra_id_0> walks in <extra_id_1> park", return_tensors="pt").input_ids
labels = tokenizer("<extra_id_0> cute dog <extra_id_1> the <extra_id_2>", return_tensors="pt").input_ids

# the forward function automatically creates the correct decoder_input_ids
loss = model(input_ids=input_ids, labels=labels).loss
loss.item()

1.9458732604980469
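
As a follow-up sketch (not in the original notebook), the model can be asked to actually fill in the masked spans; the generated text stays in the sentinel format, starting with <extra_id_0> followed by the model's guess for the first span.

filled = model.generate(input_ids, max_length=20)
print(tokenizer.decode(filled[0], skip_special_tokens=False))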

# Supervised training
# seq2seq

input_ids = tokenizer("translate English to German: The house is wonderful.", return_tensors="pt").input_ids
labels = tokenizer("Das Haus ist wunderbar.", return_tensors="pt").input_ids

# the forward function automatically creates the correct decoder_input_ids
loss = model(input_ids=input_ids, labels=labels).loss
loss.item()

0.9009745717048645

3.2.1 multi sentence pairs

# the following 2 hyperparameters are task-specific
max_source_length = 512
max_target_length = 128

# Suppose we have the following 2 training examples:
input_sequence_1 = "Welcome to NYC"
output_sequence_1 = "Bienvenue à NYC"

input_sequence_2 = "HuggingFace is a company"
output_sequence_2 = "HuggingFace est une entreprise"

# encode the inputs
task_prefix = "translate English to French: "
input_sequences = [input_sequence_1, input_sequence_2]

encoding = tokenizer(
    [task_prefix + sequence for sequence in input_sequences],
    padding="longest",
    max_length=max_source_length,
    truncation=True,
    return_tensors="pt",
)

input_ids, attention_mask = encoding.input_ids, encoding.attention_mask

# encode the targets
target_encoding = tokenizer(
    [output_sequence_1, output_sequence_2],
    padding="longest",
    max_length=max_target_length,
    truncation=True,
    return_tensors="pt",
)
labels = target_encoding.input_ids

# replace padding token id's of the labels by -100 so it's ignored by the loss
labels[labels == tokenizer.pad_token_id] = -100

# forward pass
loss = model(input_ids=input_ids, attention_mask=attention_mask, labels=labels).loss
loss.item()

0.19245588779449463
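
Everything needed for fine-tuning is now in place. A minimal, hypothetical training step on this two-example batch might look as follows (the optimizer choice and learning rate are illustrative, not part of the original post; the T5 paper itself fine-tunes with AdaFactor and a constant learning rate).

from torch.optim import AdamW

optimizer = AdamW(model.parameters(), lr=1e-4)
model.train()
loss = model(input_ids=input_ids, attention_mask=attention_mask, labels=labels).loss
loss.backward()
optimizer.step()
optimizer.zero_grad()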

3.3 Performing the tasks

input_ids = tokenizer.encode("translate English to German: Hello, my dog is cute", return_tensors="pt") 
result = model.generate(input_ids)
tokenizer.decode(result[0])

' Hallo, mein Hund ist süß'
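
The same sentence can be routed through the other built-in translation prefixes; a quick sketch (translation quality varies by language pair):

for prefix in ("translate English to French: ", "translate English to Romanian: "):
    ids = tokenizer(prefix + "Hello, my dog is cute", return_tensors="pt").input_ids
    print(tokenizer.decode(model.generate(ids, max_length=40)[0], skip_special_tokens=True))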

# inference
input_ids = tokenizer(
    "summarize: Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodology, and practice. In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts every language problem into a text-to-text format. Our systematic study compares pretraining objectives, architectures, unlabeled datasets, transfer approaches, and other factors on dozens of language understanding tasks. By combining the insights from our exploration with scale and our new “Colossal Clean Crawled Corpus”, we achieve state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more. To facilitate future work on transfer learning for NLP, we release our dataset, pre-trained models, and code.", 
    return_tensors="pt"
).input_ids  # Batch size 1
outputs = model.generate(input_ids, max_length=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
# expected output: a short summary of the abstract above

transfer learning has emerged as a powerful technique in natural language processing (NLP) in this paper, we explore the landscape of transfer learning techniques for NLP. we introduce a unified framework that converts every language problem into a text-to-text format.
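
The pipeline import at the top of the post was never used; as a closing sketch, the same summarization can also be run through the high-level pipeline API (article_text below is a placeholder for the abstract string passed to the tokenizer above).

summarizer = pipeline("summarization", model=model, tokenizer=tokenizer)
article_text = "Transfer learning, where a model is first pre-trained on a data-rich task ..."  # the full abstract from above
print(summarizer(article_text, max_length=128, min_length=30)[0]["summary_text"])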

Reference:
https://github.com/chunhuizhang/bert_t5_gpt/blob/main/tutorials/09_t5_overall.ipynb
