- NebulaGraph version: v3.5.0
- Operator version: v1.4.2
- Deployment type: cloud
- Installation method: K8s
- In production: N
- Problem description:
In a K8s deployment I changed the default log level, setting minloglevel to 1 and v to 3, but only the graphd configuration was actually updated; the metad and storaged configurations were not changed.
Hi, could you paste your nebula-cluster YAML?
nebula:
  logRotate:
    rotate: 5
    size: 200M
  graphd:
    affinity:
      podAntiAffinity:
        preferredDuringSchedulingIgnoredDuringExecution:
        - podAffinityTerm:
            labelSelector:
              matchLabels:
                app.kubernetes.io/component: graphd
            topologyKey: kubernetes.io/hostname
          weight: 100
    config:
      auth_type: password
      enable_authorize: "true"
      max_sessions_per_ip_per_user: "300"
      minloglevel: "2"
      session_idle_timeout_secs: "120"
      system_memory_high_watermark_ratio: "0.9"
      v: "2"
    env:
    - name: TZ
      value: Asia/Shanghai
    image: vesoft/nebula-graphd
    logStorage: 50Gi
    nodeSelector:
      nebula: ""
    replicas: 3
    resources:
      limits:
        cpu: "8"
        memory: 16Gi
      requests:
        cpu: "8"
        memory: 16Gi
    sidecarContainers:
    - command:
      - sh
      - -ce
      - |-
        version=3.5.0
        wget -O /usr/local/bin/nebula-console https://ghproxy.com/github.com/vesoft-inc/nebula-console/releases/download/v$version/nebula-console-linux-amd64-v$version
        chmod a+x /usr/local/bin/nebula-console
        while true; do find logs/ -size +1048576k -type f -delete;sleep 1h; done
      image: alpine:edge
      name: nebula-console
      volumeMounts:
      - mountPath: /logs
        name: graphd-log
        subPath: logs
  imagePullPolicy: IfNotPresent
  metad:
    affinity:
      podAntiAffinity:
        preferredDuringSchedulingIgnoredDuringExecution:
        - podAffinityTerm:
            labelSelector:
              matchLabels:
                app.kubernetes.io/component: metad
            topologyKey: kubernetes.io/hostname
          weight: 100
    config:
      minloglevel: "2"
      v: "2"
    dataStorage: 100Gi
    env:
    - name: TZ
      value: Asia/Shanghai
    image: vesoft/nebula-metad
    logStorage: 50Gi
    nodeSelector:
      nebula: ""
    replicas: 3
    resources:
      limits:
        cpu: "8"
        memory: 16Gi
      requests:
        cpu: "8"
        memory: 16Gi
    sidecarContainers:
    - command:
      - sh
      - -ce
      - while true; do find logs/ -size +1048576k -type f -delete;sleep 1h; done
      image: alpine:edge
      name: clean-logs
      volumeMounts:
      - mountPath: /logs
        name: metad-log
        subPath: logs
  schedulerName: default-scheduler # default-scheduler, nebula-scheduler
  storageClassName: local-path-retain
  storaged:
    affinity:
      podAntiAffinity:
        preferredDuringSchedulingIgnoredDuringExecution:
        - podAffinityTerm:
            labelSelector:
              matchLabels:
                app.kubernetes.io/component: storaged
            topologyKey: kubernetes.io/hostname
          weight: 100
    config:
      minloglevel: "2"
      v: "2"
    dataStorage: 100Gi
    env:
    - name: TZ
      value: Asia/Shanghai
    image: vesoft/nebula-storaged
    logStorage: 50Gi
    nodeSelector:
      nebula: ""
    replicas: 3
    resources:
      limits:
        cpu: "8"
        memory: 16Gi
      requests:
        cpu: "8"
        memory: 16Gi
    sidecarContainers:
    - command:
      - sh
      - -ce
      - while true; do find logs/ -size +1048576k -type f -delete;sleep 1h; done
      image: alpine:edge
      name: clean-logs
      volumeMounts:
      - mountPath: /logs
        name: storaged-log
        subPath: logs
  version: v3.5.0
Hi, the YAML looks fine. Could you also paste the metad and storaged ConfigMaps? If the setting is missing from the ConfigMap, try deleting the pod manually and see whether the ConfigMap gets updated.
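(For anyone following along, something like the sketch below can be used to check whether the overrides made it into the rendered ConfigMaps; the namespace and ConfigMap names are taken from the operator log further down, and the pod name just follows the usual StatefulSet pattern, so adjust as needed.)

# Look for the overridden flags in the rendered config files
kubectl -n nebula get configmap nebula-cluster-metad -o yaml | grep -E 'minloglevel|--v='
kubectl -n nebula get configmap nebula-cluster-storaged -o yaml | grep -E 'minloglevel|--v='

# If they are missing, delete a pod and watch whether the operator re-renders the ConfigMap
kubectl -n nebula delete pod nebula-cluster-metad-0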
I have since tried every approach, including deleting the pods and deleting the ConfigMaps so the operator recreates them, but the log-related configuration still does not get updated.
Hi, my local test works fine: after modifying the config field of the NebulaCluster, the ConfigMap is updated and the containers restart. Also, operator 1.5.0 was released last week; you could upgrade to the latest version and try again.
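(Roughly, the upgrade looks like the sketch below; the Helm release, chart, and namespace names are assumptions, and the CRD usually needs to be re-applied separately when moving between operator versions, so check the nebula-operator release notes for the exact steps.)

helm repo update
helm upgrade nebula-operator nebula-operator/nebula-operator \
  --namespace nebula-operator-system --version 1.5.0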
Upgrading to 1.5.0 throws an error:
I0815 07:23:11.151002 1 cm.go:98] configMap [nebula/nebula-cluster-metad] updated successfully
I0815 07:23:11.151130 1 nebula_cluster_controller.go:143] Finished reconciling NebulaCluster [nebula/nebula-cluster], spendTime: (4.440012706s)
I0815 07:23:11.151206 1 controller.go:118] "msg"="Observed a panic in reconciler: runtime error: invalid memory address or nil pointer dereference" "NebulaCluster"={"name":"nebula-cluster","n
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x20 pc=0x14a32a8]
Hi, was the CRD updated as well? Could you paste the full error log? There may be a compatibility issue.
The CRD had not been updated; after fixing that the operator runs now, but the configuration update still fails.
After deleting all the ConfigMaps, still only graphd, which is created last, had its minloglevel and v modified; metad and storaged, which are created first, are still at the defaults of 0 and 0.
Understood. Could you run kubectl get nc -o yaml manually and confirm whether the config in the spec has actually changed?
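(Concretely, something like the following, using the cluster and namespace names from the operator log above; "nc" is the short name registered for the NebulaCluster custom resource.)

kubectl -n nebula get nc nebula-cluster -o yaml | grep -A 4 'config:'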
I confirmed with our ops engineer: in the output of kubectl get nc -o yaml the config values have indeed all been changed to 3, but in practice only graphd's configuration is actually 3; the others are still 0.
Hi, could you exec into the storaged container and curl the ip:19779/flags endpoint to check whether the corresponding config has been updated? Looking at the source code, configs are currently split into two categories: those that require a restart and those that do not. Only restart-required configs update the ConfigMap; the others are applied directly through the HTTP interface.
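(For example, from inside the storaged pod; 19779 is the storaged HTTP port mentioned above, and the exact output format of the flags endpoint can vary between versions.)

# Dump the currently effective gflags and pick out the log-related ones
curl -s http://127.0.0.1:19779/flags | grep -E 'minloglevel|"v"|^v='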
I manually changed the log config of all three services through the flags endpoint with curl, but after a restart the config that had been changed via curl was reset back to 0 by the operator (@_@)
Er, our ops team looked into it carefully and found that the problem is in the dynamic configuration handling.
Later we added a log_dir=logs entry (effectively a no-op, since that is the default) to the config, and then the metad and storaged configuration files did get modified.
The problem is, it would still be acceptable if the dynamic update at least applied the right values, but once the operator resets them to the defaults it just stops there and never applies the custom configuration ...
Look, after logging "reset dynamic flags successfully" it returns nil straight away and ignores the custom configuration.
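(For reference, the workaround described above looks roughly like the sketch below in the cluster config; log_dir: logs is already the default and is only there to push the change onto the restart/ConfigMap path, and the log-level values are the targets from the original post.)

  metad:
    config:
      log_dir: logs      # same as the default; added only so the operator treats the change as restart-required
      minloglevel: "1"
      v: "3"
  storaged:
    config:
      log_dir: logs
      minloglevel: "1"
      v: "3"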