Unable to create a Windows node pool on a GKE cluster using the Google Terraform GKE module


I am trying to provision a GKE cluster with a Windows node_pool using the Google module. I am calling this module:

  source  = "terraform-google-modules/kubernetes-engine/google//modules/beta-private-cluster-update-variant"
  version = "9.2.0"

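I have to define two pools: the Linux pool that GKE requires and the Windows pool that we need. Terraform always provisions the Linux node_pool successfully, but it fails to provision the Windows one, with the following error message: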

module.gke.google_container_cluster.primary: Still modifying... [id=projects/uk-xxx-xx-xxx-b821/locations/europe-west2/clusters/gke-nonpci-dev,24m31s elapsed]
module.gke.google_container_cluster.primary: Still modifying... [id=projects/uk-xxx-xx-xxx-b821/locations/europe-west2/clusters/gke-nonpci-dev,24m41s elapsed]
module.gke.google_container_cluster.primary: Still modifying... [id=projects/uk-xxx-xx-xxx-b821/locations/europe-west2/clusters/gke-nonpci-dev,24m51s elapsed]
module.gke.google_container_cluster.primary: Modifications complete after 24m58s [id=projects/xx-xxx-xx-xxx-b821/locations/europe-west2/clusters/gke-nonpci-dev]
module.gke.google_container_node_pool.pools["windows-node-pool"]: Creating...

Error: error creating NodePool: googleapi: Error 400: Workload Identity is not supported on Windows nodes. Create the nodepool without workload identity by specifying --workload-metadata=GCE_METADATA.,badRequest

  on .terraform\modules\gke\terraform-google-kubernetes-engine-9.2.0\modules\beta-private-cluster-update-variant\cluster.tf line 341,in resource "google_container_node_pool" "pools":
 341: resource "google_container_node_pool" "pools" {

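I tried setting that metadata value in many places, but none of them seems to be the right one.

From the Terraform side:

I tried adding this metadata in many places, both in the module itself and in the main.tf file where I call the module. I tried adding it to node_config and to the Windows node_pool entry in the node_pools list, but it was not accepted; I got a message saying that setting WORKLOAD IDENTITY is not expected there.

I also tried setting enable_shielded_nodes = false, but that did not help much.

I also tried from the command line to test whether it works at all. Here is my command-line session: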

C:\>gcloud container node-pools --region europe-west2 list
NAME                    MACHINE_TYPE   DISK_SIZE_GB  NODE_VERSION
default-node-pool-d916  n1-standard-2  100           1.17.9-gke.600

 
C:\>gcloud container node-pools --region europe-west2 create window-node-pool --cluster=gke-nonpci-dev --image-type=WINDOWS_SAC --no-enable-autoupgrade --machine-type=n1-standard-2
WARNING: Starting in 1.12,new node pools will be created with their legacy Compute Engine instance metadata APIs disabled by default. To create a node pool with legacy instance metadata endpoints disabled,run `node-pools create` with the flag `--metadata disable-legacy-endpoints=true`.
This will disable the autorepair feature for nodes. Please see https://cloud.google.com/kubernetes-engine/docs/node-auto-repair for more information on node autorepairs.
ERROR: (gcloud.container.node-pools.create) ResponseError: code=400,message=Workload Identity is not supported on Windows nodes. Create the nodepool without workload identity by specifying --workload-metadata=GCE_METADATA.

C:\>gcloud container node-pools --region europe-west2 create window-node-pool --cluster=gke-nonpci-dev --image-type=WINDOWS_SAC --no-enable-autoupgrade --machine-type=n1-standard-2 --workload-metadata=GCE_METADATA --metadata disable-legacy-endpoints=true
This will disable the autorepair feature for nodes. Please see https://cloud.google.com/kubernetes-engine/docs/node-auto-repair for more information on node autorepairs.
ERROR: (gcloud.container.node-pools.create) ResponseError: code=400,message=Service account "874988475980-compute@developer.gserviceaccount.com" does not exist.

C:\>gcloud auth list
                       Credentialed Accounts
ACTIVE  ACCOUNT
*       tf-xxx-xxx-xx-xxx@xx-xxx-xx-xxx-xxxx.iam.gserviceaccount.com

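The service account shown by gcloud auth list is the one I use to run Terraform, but I do not know where the service account in the error message comes from. Even creating the Windows node pool from the command line, as shown above, does not work. I am a bit stuck and do not know what to do next.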

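Module version 9.2.0 has been stable for all the Linux-based clusters we set up previously, so I thought it might be too old for a Windows node_pool. I tried 11.0.0 to see whether it would make a difference, but ended up with a different error: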

module.gke.google_container_node_pool.pools["default-node-pool"]: Refreshing state... [id=projects/uk-tix-p1-npe-b821/locations/europe-west2/clusters/gke-nonpci-dev/nodePools/default-node-pool-d916]

Error: failed to execute ".terraform/modules/gke.gcloud_delete_default_kube_dns_configmap/terraform-google-gcloud-1.4.1/scripts/check_env.sh": fork/exec .terraform/modules/gke.gcloud_delete_default_kube_dns_configmap/terraform-google-gcloud-1.4.1/scripts/check_env.sh: %1 is not a valid Win32 application.

  on .terraform\modules\gke.gcloud_delete_default_kube_dns_configmap\terraform-google-gcloud-1.4.1\main.tf line 70,in data "external" "env_override":
  70: data "external" "env_override" {

Error: failed to execute ".terraform/modules/gke.gcloud_wait_for_cluster/terraform-google-gcloud-1.3.0/scripts/check_env.sh": fork/exec .terraform/modules/gke.gcloud_wait_for_cluster/terraform-google-gcloud-1.3.0/scripts/check_env.sh: %1 is not a valid Win32 application.

  on .terraform\modules\gke.gcloud_wait_for_cluster\terraform-google-gcloud-1.3.0\main.tf line 70,in data "external" "env_override":
  70: data "external" "env_override" {

This is how I set the node_pools parameters:


  node_pools = [
    {
      name               = "linux-node-pool"
      machine_type       = var.nodepool_instance_type
      min_count          = 1
      max_count          = 10
      disk_size_gb       = 100
      disk_type          = "pd-standard"
      image_type         = "COS"                                  
      auto_repair        = true                                   
      auto_upgrade       = true                                 
      service_account    = google_service_account.gke_cluster_sa.email
      preemptible        = var.preemptible
      initial_node_count = 1
    },{
      name               = "windows-node-pool"
      machine_type       = var.nodepool_instance_type
      min_count          = 1
      max_count          = 10
      disk_size_gb       = 100
      disk_type          = "pd-standard"
      image_type         = var.nodepool_image_type                
      auto_repair        = true                                   
      auto_upgrade       = true                                   
      service_account    = google_service_account.gke_cluster_sa.email
      preemptible        = var.preemptible
      initial_node_count = 1
  
    }
  ]

  cluster_resource_labels = var.cluster_resource_labels           

  # health check and webhook firewall rules
  node_pools_tags = {
    all = [
      "xx-xxx-xxx-local-xxx",]
  }

  node_pools_metadata = {
    all = {
//      workload-metadata = "GCE_METADATA"
    }

    linux-node-pool = {
      ssh-keys = join("\n",[for user,key in var.node_ssh_keys : "${user}:${key}"])
      block-project-ssh-keys = true
    }

    windows-node-pool = {
      workload-metadata = "GCE_METADATA"
    }

  }

  • This is a Shared VPC in which I provision the cluster. Cluster version: 1.17.9-gke.600

Solution

See https://github.com/terraform-google-modules/terraform-google-kubernetes-engine/issues/632 for the solution.

The error message is ambiguous, and GKE has an internal bug tracking this issue. We will improve the error message soon.
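
For illustration only, here is a minimal sketch of how the two hints in the gcloud errors above (use GCE_METADATA for workload metadata, and do not rely on a missing default Compute Engine service account) might be expressed directly on a google_container_node_pool resource. It is not taken from the linked issue and may differ from the fix described there. The resource label "windows" is hypothetical; the cluster name, region, machine type and image type are copied from the question. In the provider resource, workload metadata is configured through a workload_metadata_config block, not through instance metadata key/value pairs, which may be why the workload-metadata entry under node_pools_metadata had no effect. The mode argument assumes a recent Google provider; to the best of my knowledge older 3.x releases used a node_metadata argument instead, so check the documentation for the provider version you have pinned.

```hcl
# Sketch only: a Windows node pool declared outside the module.
# Assumes google_service_account.gke_cluster_sa is defined elsewhere,
# as in the question's configuration.
resource "google_container_node_pool" "windows" {
  name     = "windows-node-pool"
  cluster  = "gke-nonpci-dev"
  location = "europe-west2"

  initial_node_count = 1

  node_config {
    image_type   = "WINDOWS_SAC"
    machine_type = "n1-standard-2"

    # Set the node service account explicitly so the pool does not
    # fall back to the default Compute Engine service account.
    service_account = google_service_account.gke_cluster_sa.email

    # Expose the plain GCE metadata server instead of the Workload
    # Identity (GKE) metadata server, as the API error asks.
    # Assumption: argument name valid for recent provider versions.
    workload_metadata_config {
      mode = "GCE_METADATA"
    }
  }
}
```

If the pool has to stay inside the module's node_pools list, check the README of the module version in use for a per-pool setting that maps to this block before adding a separate resource.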
