What is hp_metric in TensorBoard and how do I get rid of it?

I am new to TensorBoard.

I am running an experiment with fairly simple code, and this is the output:

[TensorBoard screenshot showing an unexpected hp_metric chart]

I don't remember asking for an hp_metric chart, but there it is.

What is it, and how do I get rid of it?


Full code to reproduce, using PyTorch Lightning (not that I think anyone should need to reproduce this to answer):

Note that the only line that references TensorBoard is

self.logger.experiment.add_scalars("losses",{"train_loss": loss},global_step=self.current_epoch)
import torch
from torch import nn
import torch.nn.functional as F
from typing import List,Optional
from pytorch_lightning.core.lightning import LightningModule
from Testing.Research.toy_datasets.ClustersDataset import ClustersDataset
from torch.utils.data import DataLoader
from Testing.Research.config.ConfigProvider import ConfigProvider
from pytorch_lightning import Trainer,seed_everything
from torch import optim
import os
from pytorch_lightning.loggers import TensorBoardLogger


class VAEFC(LightningModule):
    # see https://towardsdatascience.com/understanding-variational-autoencoders-vaes-f70510919f73
    # for possible upgrades,see https://arxiv.org/pdf/1602.02282.pdf
    # https://stats.stackexchange.com/questions/332179/how-to-weight-kld-loss-vs-reconstruction-loss-in-variational-auto-encoder
    def __init__(self,encoder_layer_sizes: List,decoder_layer_sizes: List,config):
        super(VAEFC,self).__init__()
        self._config = config
        self.logger: Optional[TensorBoardLogger] = None

        assert len(encoder_layer_sizes) >= 3,"must have at least 3 layers (2 hidden)"
        # encoder layers
        self._encoder_layers = nn.ModuleList()
        for i in range(1,len(encoder_layer_sizes) - 1):
            enc_layer = nn.Linear(encoder_layer_sizes[i - 1],encoder_layer_sizes[i])
            self._encoder_layers.append(enc_layer)

        # predict mean and covariance vectors
        self._mean_layer = nn.Linear(encoder_layer_sizes[
                                         len(encoder_layer_sizes) - 2],encoder_layer_sizes[len(encoder_layer_sizes) - 1])
        self._logvar_layer = nn.Linear(encoder_layer_sizes[
                                           len(encoder_layer_sizes) - 2],encoder_layer_sizes[len(encoder_layer_sizes) - 1])

        # decoder layers
        self._decoder_layers = nn.ModuleList()
        for i in range(1,len(decoder_layer_sizes)):
            dec_layer = nn.Linear(decoder_layer_sizes[i - 1],decoder_layer_sizes[i])
            self._decoder_layers.append(dec_layer)

        self._recon_function = nn.MSELoss(reduction='mean')

    def _encode(self,x):
        for i in range(len(self._encoder_layers)):
            layer = self._encoder_layers[i]
            x = F.relu(layer(x))

        mean_output = self._mean_layer(x)
        logvar_output = self._logvar_layer(x)
        return mean_output,logvar_output

    def _reparametrize(self,mu,logvar):
        if not self.training:
            return mu
        std = logvar.mul(0.5).exp_()
        if std.is_cuda:
            eps = torch.cuda.FloatTensor(std.size()).normal_()
        else:
            eps = torch.FloatTensor(std.size()).normal_()
        reparameterized = eps.mul(std).add_(mu)
        return reparameterized

    def _decode(self,z):
        for i in range(len(self._decoder_layers) - 1):
            layer = self._decoder_layers[i]
            z = F.relu((layer(z)))

        decoded = self._decoder_layers[len(self._decoder_layers) - 1](z)
        # decoded = F.sigmoid(self._decoder_layers[len(self._decoder_layers)-1](z))
        return decoded

    def _loss_function(self,recon_x,x,mu,logvar,reconstruction_function):
        """
        recon_x: generating images
        x: origin images
        mu: latent mean
        logvar: latent log variance
        """
        binary_cross_entropy = reconstruction_function(recon_x,x)  # mse loss TODO see if mse or cross entropy
        # loss = 0.5 * sum(1 + log(sigma^2) - mu^2 - sigma^2)
        kld_element = mu.pow(2).add_(logvar.exp()).mul_(-1).add_(1).add_(logvar)
        kld = torch.sum(kld_element).mul_(-0.5)
        # KL divergence Kullback–Leibler divergence,regularization term for VAE
        # It is a measure of how different two probability distributions are different from each other.
        # We are trying to force the distributions closer while keeping the reconstruction loss low.
        # see https://towardsdatascience.com/understanding-variational-autoencoders-vaes-f70510919f73

        # read on weighting the regularization term here:
        # https://stats.stackexchange.com/questions/332179/how-to-weight-kld-loss-vs-reconstruction-loss-in-variational
        # -auto-encoder
        return binary_cross_entropy + kld * self._config.regularization_factor

    def training_step(self,batch,batch_index):
        orig_batch,noisy_batch,_ = batch
        noisy_batch = noisy_batch.view(noisy_batch.size(0),-1)

        recon_batch,mu,logvar = self.forward(noisy_batch)

        loss = self._loss_function(
            recon_batch,orig_batch,mu,logvar,reconstruction_function=self._recon_function
        )
        # self.logger.experiment.add_scalars("losses",{"train_loss": loss})
        self.logger.experiment.add_scalars("losses",{"train_loss": loss},global_step=self.current_epoch)
        # self.logger.experiment.add_scalar("train_loss",loss,self.current_epoch)
        self.logger.experiment.flush()
        return loss

    def train_dataloader(self):
        default_dataset,train_dataset,test_dataset = ClustersDataset.clusters_dataset_by_config()
        train_dataloader = DataLoader(train_dataset,batch_size=self._config.batch_size,shuffle=True)
        return train_dataloader

    def test_dataloader(self):
        default_dataset,train_dataset,test_dataset = ClustersDataset.clusters_dataset_by_config()
        test_dataloader = DataLoader(test_dataset,shuffle=True)
        return test_dataloader

    def configure_optimizers(self):
        optimizer = optim.Adam(self.parameters(),lr=self._config.learning_rate)
        return optimizer

    def forward(self,x):
        mu,logvar = self._encode(x)
        z = self._reparametrize(mu,logvar)
        decoded = self._decode(z)
        return decoded,mu,logvar


if __name__ == "__main__":
    config = ConfigProvider.get_config()
    seed_everything(config.random_seed)
    latent_dim = config.latent_dim
    enc_layer_sizes = config.enc_layer_sizes + [latent_dim]
    dec_layer_sizes = [latent_dim] + config.dec_layer_sizes
    model = VAEFC(config=config,encoder_layer_sizes=enc_layer_sizes,decoder_layer_sizes=dec_layer_sizes)

    logger = TensorBoardLogger(save_dir='tb_logs',name='VAEFC')
    logger.hparams = config  # TODO only put here relevant stuff
    # trainer = Trainer(gpus=1)
    trainer = Trainer(deterministic=config.is_deterministic,
                      # auto_lr_find=config.auto_lr_find,
                      # log_gpu_memory='all',
                      # min_epochs=99999,
                      max_epochs=config.num_epochs,
                      default_root_dir=os.getcwd(),
                      logger=logger
                      )
    # trainer.tune(model)
    trainer.fit(model)
    print("done training vae with lightning")

ClustersDataset.py

from torch.utils.data import Dataset
import matplotlib.pyplot as plt
import torch
import numpy as np
from Testing.Research.config.ConfigProvider import ConfigProvider


class ClustersDataset(Dataset):
    __default_dataset = None
    __default_dataset_train = None
    __default_dataset_test = None

    def __init__(self,cluster_size: int,noise_factor: float = 0,transform=None,n_clusters=2,centers_radius=4.0):
        super(ClustersDataset,self).__init__()
        self._cluster_size = cluster_size
        self._noise_factor = noise_factor
        self._n_clusters = n_clusters
        self._centers_radius = centers_radius
        # self._transform = transform
        self._size = self._cluster_size * self._n_clusters

        self._create_data_clusters()
        self._combine_clusters_to_array()
        self._normalize_data()
        self._add_noise()

        # self._plot()
        pass

    @staticmethod
    def clusters_dataset_by_config():
        if ClustersDataset.__default_dataset is not None:
            return \
                ClustersDataset.__default_dataset,\
                ClustersDataset.__default_dataset_train,\
                ClustersDataset.__default_dataset_test
        config = ConfigProvider.get_config()
        default_dataset = ClustersDataset(
            cluster_size=config.cluster_size,noise_factor=config.noise_factor,n_clusters=config.n_clusters,centers_radius=config.centers_radius
        )
        
        train_size = int(config.train_size * len(default_dataset))
        test_size = len(default_dataset) - train_size
        train_dataset,test_dataset = torch.utils.data.random_split(default_dataset,[train_size,test_size])

        ClustersDataset.__default_dataset = default_dataset
        ClustersDataset.__default_dataset_train = train_dataset
        ClustersDataset.__default_dataset_test = test_dataset

        return default_dataset,train_dataset,test_dataset

    def _create_data_clusters(self):
        self._clusters = [torch.zeros((self._cluster_size,2)) for _ in range(self._n_clusters)]
        centers_radius = self._centers_radius
        for i,c in enumerate(self._clusters):
            r,x,y = 3.0,centers_radius * np.cos(i * np.pi * 2 / self._n_clusters),centers_radius * np.sin(
                i * np.pi * 2 / self._n_clusters)
            cluster_length = 1.1
            cluster_start = i * 2 * np.pi / self._n_clusters
            cluster_end = cluster_length * (i + 1) * 2 * np.pi / self._n_clusters
            cluster_inds = torch.linspace(start=cluster_start,end=cluster_end,steps=self._cluster_size,dtype=torch.float)
            c[:,0] = r * torch.sin(cluster_inds) + y
            c[:,1] = r * torch.cos(cluster_inds) + x

    def _plot(self):
        plt.figure()
        plt.scatter(self._noisy_values[:,0],self._noisy_values[:,1],s=1,color='b',label="noisy_values")
        plt.scatter(self._values[:,0],self._values[:,1],s=1,color='r',label="values")
        plt.legend(loc="upper left")
        plt.show()

    def _combine_clusters_to_array(self):
        size = self._size
        self._values = torch.zeros(size,2)
        self._labels = torch.zeros(size,dtype=torch.long)
        for i,c in enumerate(self._clusters):
            self._values[i * self._cluster_size: (i + 1) * self._cluster_size,:] = self._clusters[i]
            self._labels[i * self._cluster_size: (i + 1) * self._cluster_size] = i

    def _add_noise(self):
        size = self._size

        mean = torch.zeros(size,2)
        std = torch.ones(size,2)
        noise = torch.normal(mean,std)
        self._noisy_values = torch.zeros(size,2)
        self._noisy_values[:] = self._values
        self._noisy_values = self._noisy_values + noise * self._noise_factor

    def _normalize_data(self):
        values_min,values_max = torch.min(self._values),torch.max(self._values)
        self._values = (self._values - values_min) / (values_max - values_min)
        self._values = self._values * 2 - 1

    def __len__(self):
        return self._size  # number of samples in the dataset

    def __getitem__(self,index):
        item = self._values[index,:]
        noisy_item = self._noisy_values[index,:]
        # if self._transform is not None:
        #     noisy_item = self._transform(item)
        return item,noisy_item,self._labels[index]

    @property
    def values(self):
        return self._values

    @property
    def noisy_values(self):
        return self._noisy_values

Config values (ConfigProvider simply returns these as an object):

num_epochs: 15
batch_size: 128
learning_rate: 0.0001
auto_lr_find: False

noise_factor: 0.1
regularization_factor: 0.0

cluster_size: 5000
n_clusters: 5
centers_radius: 4.0
train_size: 0.8

latent_dim: 8

enc_layer_sizes: [2,200,200]
dec_layer_sizes: [200,2]

retrain_vae: False
random_seed: 11
is_deterministic: True

Solution

This is the default behaviour of the TensorBoard logger in PyTorch Lightning. You can set default_hp_metric to False to get rid of this metric.

TensorBoardLogger(save_dir='tb_logs',name='VAEFC',default_hp_metric=False)

hp_metric helps you track model performance across different sets of hyperparameters. You can view it under hparams in TensorBoard.
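For context, when default_hp_metric is left at True the logger writes a placeholder value of -1 under the name hp_metric next to your hyperparameters, which is the extra chart you are seeing. Below is a minimal sketch of either disabling it or logging a real value instead, using TensorBoardLogger.log_hyperparams(params, metrics); the hyperparameter and metric values shown are purely illustrative:

from pytorch_lightning.loggers import TensorBoardLogger

# option 1: disable the placeholder entirely
logger = TensorBoardLogger(save_dir='tb_logs', name='VAEFC', default_hp_metric=False)

# option 2: log the hyperparameters together with a real metric for this run;
# the metric then appears next to the hparams in TensorBoard's HPARAMS tab
logger.log_hyperparams(
    {"learning_rate": 0.0001, "latent_dim": 8},  # hyperparameters to compare across runs
    {"hp_metric": 0.123},                        # any metric value you care about (illustrative)
)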


hp_metric (the hyperparameter metric) is there to help you tune your hyperparameters.

You can set this metric to whatever you like, as described in the pytorch official docs.

You can then look at your hyperparameters and see which run gave the best result according to whichever metric you chose.
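For example (a sketch only, not taken from the question's code): inside the LightningModule you can overwrite the placeholder by logging a value under the reserved name hp_metric, say at the end of validation. The hook and the "val_loss" key are assumptions about how validation_step would be written:

    # inside VAEFC (sketch); assumes validation_step returns {"val_loss": ...}
    def validation_epoch_end(self, outputs):
        avg_val_loss = torch.stack([o["val_loss"] for o in outputs]).mean()
        # values logged as "hp_metric" replace the -1 placeholder and are shown
        # next to this run's hyperparameters in the HPARAMS tab
        self.log("hp_metric", avg_val_loss)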

Alternatively, if you don't want it, you can disable it as suggested in @joe32140's answer:

You can set default_hp_metric to False to get rid of this metric.

TensorBoardLogger(save_dir='tb_logs',default_hp_metric=False)

