12
博客obdiag display-dba必备工具箱

obdiag display-c7电子娱乐

obdiag 是一款适用于 oceanbase 数据库的黑屏诊断工具,obdiag 现有功能包含了对于 oceanbase 数据库日志、sql audit 以及 oceanbase 数据库进程堆栈等信息进行的扫描、收集和分析,可以在 oceanbase 集群不同的部署模式下(ocp,obd 或用户根据文档手工部署)实现一键执行,完成诊断信息的收集和分析。

obdiagdisplay 在2.5版本由sig的黄尖老师和靖永栋老师开发和完善了基本功能。

obdiagdisplay 取名一键洞察,希望可以一键展示一些集群信息等,帮助dba方便排查问题和收集信息。

这个功能我心心念念了好久,这次找时间把玩下。

有很多人戏称是脚本dba,现在可以根据自己的使用习惯去diy自己的工具箱了

展示:

首先安装很简单,rpm安装下直接执行下init.sh的脚本就部署上了

obdaigcs:/home #b rpm -ivh oceanbase-diagnostic-tool-2.6.0-12024112811.el7.x86_64.rpm
preparing...                          ################################# [100%]
updating / installing...
   1:oceanbase-diagnostic-tool-2.6.0-1################################# [100%]
init obdiag finished
/
please execute the following command to init obdiag:
 source /usr/local/oceanbase-diagnostic-tool/init.sh 
 
obdaigcs:/home #b  source /usr/local/oceanbase-diagnostic-tool/init.sh 
init obdiag finished
/home

可以直接通过obdiag配置环境的config信息,因为我只需要集群的一些信息,所以主机的一些配置没有设置。

obdaigcs:/home #b obdiag config -h 168.88.1.111 -uroot@sys#zrycs -p'passwd##321' -p2883
obdaigcs:/home #b cat ~/.obdiag/config.yml 
obcluster:
  ob_cluster_name: obcluster
  db_host: 168.88.1.111
  db_port: '2883'
  tenant_sys:
    user: root@sys#zrycs
    password: passwd##321
  servers:
    nodes:
    - ip: 168.88.1.111
    - ip: 168.88.1.113
    - ip: 168.88.1.115
    global:
    .
    .
    .

通过obdiag display scene list命令展示下支持的功能,还有一些demo语句。按照demo的样式执行就可以查询集群一些信息了。

obdaigcs:/home #b obdiag display scene list
[observer problem display scenes]:
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
command                                                                                                                                                                 info_en                info_cn          
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
obdiag display scene run --scene=observer.cluster_info                                                                                                                  [cluster info]         [集群信息展示]         
obdiag display scene run --scene=observer.database_datasize --env tenant_id=1 --env database_name=test                                                                  [database data_size]   [查看库占用磁盘大小]      
obdiag display scene run --scene=observer.event --env tenant_name=test                                                                                                  [event]                [event信息]        
obdiag display scene run --scene=observer.lockholder --env db_connect='-h127.0.0.1 -p2881 -utest@test -p****** -dtest'                                                  [lock holder info]     [查看锁等待]          
obdiag display scene run --scene=observer.plan  --env tenant_name=test --env sqlid=test                                                                                 [plan]                 [查看sql的执行计划]     
obdiag display scene run --scene=observer.processlist --env tenant_name=test                                                                                            [processlist]          [查看processlist]  
obdiag display scene run --scene=observer.rs                                                                                                                            [rs]                   [查看rootservice信息]
obdiag display scene run --scene=observer.server_info                                                                                                                   [server info]          [server 信息展示]    
obdiag display scene run --scene=observer.slowsql --env tenant_name=test --env mtime=10                                                                                 [slowsql]              [查看慢sql]         
obdiag display scene run --scene=observer.table_datasize --env tenant_id=1 --env database_name=test --env table_name=test                                               [table data_size]      [查看表占用磁盘大小]      
obdiag display scene run --scene=observer.table_info --env db_connect='-h127.0.0.1 -p2881 -utest@test -p****** -dtest' --env database_name=test --env table_name=test   [table info]           [表信息展示]          
obdiag display scene run --scene=observer.tenant_info --env tenant_name=test                                                                                            [tenant info]          [租户信息展示]         
obdiag display scene run --scene=observer.topsql --env tenant_name=test --env mtime=10                                                                                  [topsql info]          [查看topsql]       
obdiag display scene run --scene=observer.unit_info                                                                                                                     [unit info]            [unit 信息展示]      
obdiag display scene run --scene=observer.zone_info                                                                                                                     [zone info]            [zone 信息展示]      
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
trace id: 3600d0ee-bb7b-11ef-b4d8-00505689a0ae
if you want to view detailed obdiag logs, please run: obdiag display-trace 3600d0ee-bb7b-11ef-b4d8-00505689a0ae	

按照demo的样式执行就可以查询集群一些信息了。

obdaigcs:/home #b obdiag display scene run --scene=observer.cluster_info  
display_scenes_run start ...
execute tasks: observer.cluster_info
 ----------------------- --------------------------------------------------------------------------------------------------------- 
|       stat_name       |                                                  value                                                  |
 ----------------------- --------------------------------------------------------------------------------------------------------- 
|      cluster_name     |                                                  zrycs                                                  |
|       cluster_id      |                                                1711434374                                               |
|        version        |                                                 4.3.3.0                                                 |
|       observers       |                                                    3                                                    |
|         zones         |                                                    3                                                    |
|        tenants        |                                                    2                                                    |
|        rs_list        |                   168.88.1.111:2882:2881;168.88.1.113:2882:2881;168.88.1.115:2882:2881                  |
|       config_url      | http://168.88.1.109:8080/services?action=obrootserviceinfo&user_id=alibaba&uid=ocpmaster&obregion=zrycs |
|      memory_limit     |                                                  74.17g                                                 |
|  data_disk_allocated  |                                                 505.78g                                                 |
|       cpu_total       |                                                    96                                                   |
|      cpu_assigned     |                                                    57                                                   |
|    cpu_assinged_per   |                                                   59%                                                   |
|      memory_total     |                                                  44.17g                                                 |
|    memory_assigned    |                                                  33.00g                                                 |
|  memory_assigned_per  |                                                  75.00%                                                 |
|     log_disk_total    |                                                 501.75g                                                 |
|   log_disk_assigned   |                                                 103.00g                                                 |
| log_disk_assigned_per |                                                  21.00%                                                 |
|    log_disk_in_use    |                                                  64.38g                                                 |
|  log_disk_in_use_per  |                                                  63.00%                                                 |
|    data_disk_total    |                                                 505.78g                                                 |
|    data_disk_in_use   |                                                  1.00g                                                  |
|  data_disk_in_use_per |                                                  0.00%                                                  |
 ----------------------- --------------------------------------------------------------------------------------------------------- 
trace id: 4388c8c0-bb7b-11ef-9b93-00505689a0ae
if you want to view detailed obdiag logs, please run: obdiag display-trace 4388c8c0-bb7b-11ef-9b93-00505689a0ae

diy方法:

直接进入obdiag的display路径下就可以 自己创建yaml文件。

obdaigcs:/home #b cd ~/.obdiag/display/tasks/observer/
obdaigcs:~/.obdiag/display/tasks/observer #b cp rs.yaml   zrycs.ymal
obdaigcs:~/.obdiag/display/tasks/observer #b vi zrycs.ymal
obdaigcs:~/.obdiag/display/tasks/observer #b cat zrycs.yaml
info_en: "[zrycs]"
info_cn: "[zrycs信息]"
command: obdiag display scene run --scene=observer.zrycs
task:
  - version: "[4.0.0.0, *]"
    steps:
      - type: sql
        sql: "select now() chenck_time, t.tenant_name,t.tenant_id,ca.svr_ip,
  round(sum(case when stat_id='140006' then ca.value else 0 end)/100,2) cpu_usage,
  round(sum(case when stat_id='140005' then ca.value else 0 end)/100,2) cpu_max ,
  round(100*(sum(case when stat_id='140006' then ca.value else 0 end)/sum(case when stat_id='140005' then ca.value else 0 end)),2) cpu_percent
 from 
(select s.con_id, s.svr_ip,s.stat_id, sum(s.value) value
from oceanbase.gv$sysstat s where s.stat_id in(140005,140006) and s.con_id>1000
group by s.con_id, s.svr_ip,s.stat_id, s.name) ca,oceanbase.__all_tenant t
where t.tenant_id=ca.con_id
group by t.tenant_name,ca.svr_ip;"
        global: true

我创建了一个测试的yaml,通过list就可以看到信息了。

obdaigcs:~/.obdiag/display/tasks/observer #b obdiag display scene list
[warn] currently executing in obdiag home directory!
[observer problem display scenes]:
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
command                                                                                                                                                                 info_en                info_cn          
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
obdiag display scene run --scene=observer.cluster_info                                                                                                                  [cluster info]         [集群信息展示]         
obdiag display scene run --scene=observer.database_datasize --env tenant_id=1 --env database_name=test                                                                  [database data_size]   [查看库占用磁盘大小]      
obdiag display scene run --scene=observer.event --env tenant_name=test                                                                                                  [event]                [event信息]        
obdiag display scene run --scene=observer.lockholder --env db_connect='-h127.0.0.1 -p2881 -utest@test -p****** -dtest'                                                  [lock holder info]     [查看锁等待]          
obdiag display scene run --scene=observer.plan  --env tenant_name=test --env sqlid=test                                                                                 [plan]                 [查看sql的执行计划]     
obdiag display scene run --scene=observer.processlist --env tenant_name=test                                                                                            [processlist]          [查看processlist]  
obdiag display scene run --scene=observer.rs                                                                                                                            [rs]                   [查看rootservice信息]
obdiag display scene run --scene=observer.server_info                                                                                                                   [server info]          [server 信息展示]    
obdiag display scene run --scene=observer.slowsql --env tenant_name=test --env mtime=10                                                                                 [slowsql]              [查看慢sql]         
obdiag display scene run --scene=observer.table_datasize --env tenant_id=1 --env database_name=test --env table_name=test                                               [table data_size]      [查看表占用磁盘大小]      
obdiag display scene run --scene=observer.table_info --env db_connect='-h127.0.0.1 -p2881 -utest@test -p****** -dtest' --env database_name=test --env table_name=test   [table info]           [表信息展示]          
obdiag display scene run --scene=observer.tenant_info --env tenant_name=test                                                                                            [tenant info]          [租户信息展示]         
obdiag display scene run --scene=observer.topsql --env tenant_name=test --env mtime=10                                                                                  [topsql info]          [查看topsql]       
obdiag display scene run --scene=observer.unit_info                                                                                                                     [unit info]            [unit 信息展示]      
obdiag display scene run --scene=observer.zone_info                                                                                                                     [zone info]            [zone 信息展示]      
obdiag display scene run --scene=observer.zrycs                                                                                                                         [zrycs]                [zrycs信息]        
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
trace id: 9f2f61ba-bb7c-11ef-ba1a-00505689a0ae
if you want to view detailed obdiag logs, please run: obdiag display-trace 9f2f61ba-bb7c-11ef-ba1a-00505689a0ae

我执行下新创建的查询cpu的demo,就可以正常执行了。

obdaigcs:~/.obdiag/display/tasks/observer #b obdiag display scene run --scene=observer.zrycs 
[warn] currently executing in obdiag home directory!
display_scenes_run start ...
execute tasks: observer.zrycs
 --------------------- ------------- ----------- -------------- ----------- --------- ------------- 
|     chenck_time     | tenant_name | tenant_id |    svr_ip    | cpu_usage | cpu_max | cpu_percent |
 --------------------- ------------- ----------- -------------- ----------- --------- ------------- 
| 2024-12-16 15:09:05 |  meta$1002  |    1001   | 168.88.1.111 |    0.00   |   1.50  |     0.00    |
| 2024-12-16 15:09:05 |     zry     |    1002   | 168.88.1.111 |    0.00   |  15.00  |     0.00    |
| 2024-12-16 15:09:05 |  meta$1002  |    1001   | 168.88.1.113 |    0.00   |   1.50  |     0.00    |
| 2024-12-16 15:09:05 |     zry     |    1002   | 168.88.1.113 |    0.01   |  15.00  |     0.07    |
| 2024-12-16 15:09:05 |  meta$1002  |    1001   | 168.88.1.115 |    0.00   |   1.50  |     0.00    |
| 2024-12-16 15:09:05 |     zry     |    1002   | 168.88.1.115 |    0.00   |  15.00  |     0.00    |
 --------------------- ------------- ----------- -------------- ----------- --------- ------------- 
trace id: a31b458c-bb7c-11ef-ab33-00505689a0ae
if you want to view detailed obdiag logs, please run: obdiag display-trace a31b458c-bb7c-11ef-ab33-00505689a0ae

再创建个检查server的状态的demo。

obdaigcs:~/.obdiag/display/tasks/observer #b cat status.yaml 
info_en: "[status]"
info_cn: "[server status信息]"
command: obdiag display scene run --scene=observer.status
task:
  - version: "[*,*]"
    steps:
      - type: sql
        sql: "select  b.info cluster_name,a.svr_ip,a.with_rootserver rs,a.zone,
ora_decode(a.start_service_time,0,null,date_format(usec_to_time(a.start_service_time),'%y-%m-%d %h:%i:%s')) start_service_time,
ora_decode(a.last_offline_time,0,null,date_format(usec_to_time(a.last_offline_time),'%y-%m-%d %h:%i:%s'))  last_offline_time,
ora_decode(a.stop_time,0,null,date_format(usec_to_time(a.stop_time),'%y-%m-%d %h:%i:%s')) stop_time,
a.status,substr(a.build_version, 1, instr(build_version, '-') - 1) build_version  
from  oceanbase.__all_server a ,oceanbase.__all_zone b where  b.name='cluster' order by a.zone,a.svr_ip; "
        global: true
5-observer-cs-06:~/.obdiag/display/tasks/observer #b obdiag display scene run --scene=observer.status
[warn] currently executing in obdiag home directory!
display_scenes_run start ...
execute tasks: observer.status
 -------------- -------------- ---- ------- --------------------- ------------------- ----------- -------- ---------------------------- 
| cluster_name |    svr_ip    | rs |  zone |  start_service_time | last_offline_time | stop_time | status |       build_version        |
 -------------- -------------- ---- ------- --------------------- ------------------- ----------- -------- ---------------------------- 
|    zrycs     | 168.88.1.111 | 1  | zone1 | 2024-10-09 10:45:25 |        none       |    none   | active | 4.3.3.0_100000362024093001 |
|    zrycs     | 168.88.1.113 | 0  | zone2 | 2024-10-09 10:45:26 |        none       |    none   | active | 4.3.3.0_100000362024093001 |
|    zrycs     | 168.88.1.115 | 0  | zone3 | 2024-10-09 11:31:47 |        none       |    none   | active | 4.3.3.0_100000362024093001 |
 -------------- -------------- ---- ------- --------------------- ------------------- ----------- -------- ---------------------------- 
trace id: 90f69dc8-bc5a-11ef-9e78-00505689a0ae

我们如果环境比较多,还可以多配置几个config文件。调用的时候指定-c就可以了。

obdaigcs:~/.obdiag/display/tasks/observer #b cp  ~/.obdiag/config.yml    ~/.obdiag/configzry.yml 
obdaigcs:~/.obdiag/display/tasks/observer #b vi  ~/.obdiag/configzry.yml 
obdaigcs:~/.obdiag/display/tasks/observer #b cat  ~/.obdiag/configzry.yml 
obcluster:
  ob_cluster_name: obcluster
  db_host: 168.88.1.112
  db_port: '2881'
  tenant_sys:
    user: root@sys
    password: passwd##321
  servers:
    nodes:
    - ip: 168.88.1.112
obdaigcs:~/.obdiag/display/tasks/observer #b obdiag display scene run --scene=observer.status -c ~/.obdiag/configzry.yml
[warn] currently executing in obdiag home directory!
display_scenes_run start ...
execute tasks: observer.status
 -------------- -------------- ---- ------- --------------------- ------------------- ----------- -------- ---------------------------- 
| cluster_name |    svr_ip    | rs |  zone |  start_service_time | last_offline_time | stop_time | status |       build_version        |
 -------------- -------------- ---- ------- --------------------- ------------------- ----------- -------- ---------------------------- 
|    zry_cc    | 168.88.1.112 | 1  | zone1 | 2024-12-17 11:28:08 |        none       |    none   | active | 3.2.3.3_110050012023112210 |
 -------------- -------------- ---- ------- --------------------- ------------------- ----------- -------- ---------------------------- 
trace id: 46e4d384-bc5b-11ef-bdad-00505689a0ae
if you want to view detailed obdiag logs, please run: obdiag display-trace 46e4d384-bc5b-11ef-bdad-00505689a0ae
    

案例展示:

我简单做了个小场景的diy案例,我们日常运维中经常有优化sql的需求,查看执行历史和执行计划都是必不可少的操作,所以我这个案例就是找到sqlid后通过display去看历史执行信息和执行计划的。

我创建了两个yaml一个splan查看历史执行信息的一个aplan查看执行计划的,aplan需要的env参数都可以通过splan来获取到。两个yaml都分别配置了3.x和4.x不同的适配语句。

obdaigcs:~/.obdiag/display/tasks/observer #b obdiag display scene list
[warn] currently executing in obdiag home directory!
[observer problem display scenes]:
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
command                                                                                                                                                                 info_en                info_cn          
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
obdiag display scene run --scene=observer.aplan  --env svr_ip=test --env tenant_id=test --env plan_id=test                                                              [aplan]                [实际执行计划信息]       
.
.
        
obdiag display scene run --scene=observer.splan  --env sqlid=test                                                                                                       [splan]                [执行记录信息]         
.
.
       
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
trace id: 61ba1686-bc62-11ef-97f7-00505689a0ae
if you want to view detailed obdiag logs, please run: obdiag display-trace 61ba1686-bc62-11ef-97f7-00505689a0ae

查询4.x的历史信息和计划:

obdaigcs:~/.obdiag/display/tasks/observer #b obdiag display scene run --scene=observer.splan  --env   sqlid=ceeb8efd0bda15aa584b19d366fb25ec
[warn] currently executing in obdiag home directory!
display_scenes_run start ...
execute tasks: observer.splan
 --------------- -------------- --------- ----------- ------ ---------------------------------- --------------------- --------------------- ------------------ ------------ ----------------- ----------------- --------------------- ------------ ------------ -------- ------------ 
| database_name |    svr_ip    | plan_id | tenant_id | type |              sql_id              |   first_load_time   |   last_active_time  | avg_exe_usec(ms) | executions | elapsed_time(s) | slowest_time(s) |   slowest_exe_time  | slow_count | ps_stmt_id |  rows  | outline_id |
 --------------- -------------- --------- ----------- ------ ---------------------------------- --------------------- --------------------- ------------------ ------------ ----------------- ----------------- --------------------- ------------ ------------ -------- ------------ 
|      sys      | 168.88.1.111 |   9611  |    1002   |  1   | ceeb8efd0bda15aa584b19d366fb25ec | 2024-12-17 18:05:59 | 2024-12-17 18:05:59 |      3.2830      |     1      |    0.00328300   |       0.00      | 2024-12-17 18:05:59 |     0      |     -1     | 1.0000 |     -1     |
 --------------- -------------- --------- ----------- ------ ---------------------------------- --------------------- --------------------- ------------------ ------------ ----------------- ----------------- --------------------- ------------ ------------ -------- ------------ 
trace id: 3d6c43bc-bc62-11ef-83a6-00505689a0ae
if you want to view detailed obdiag logs, please run: obdiag display-trace 3d6c43bc-bc62-11ef-83a6-00505689a0ae
obdaigcs:~/.obdiag/display/tasks/observer #b obdiag display scene run --scene=observer.aplan  --env svr_ip=168.88.1.111 --env tenant_id=1002 --env lan_id=9611
[warn] currently executing in obdiag home directory!
display_scenes_run start ...
execute tasks: observer.aplan
 --------- -------------------- ------------ ------ ------ 
| plan_id |      operator      |    name    | rows | cost |
 --------- -------------------- ------------ ------ ------ 
|   9611  | phy_vec_table_scan | t1(idx_ti) |  1   |  2   |
 --------- -------------------- ------------ ------ ------ 
trace id: 478dbefc-bc62-11ef-bac6-00505689a0ae
if you want to view detailed obdiag logs, please run: obdiag display-trace 478dbefc-bc62-11ef-bac6-00505689a0ae

查询3.x的历史信息和计划:

obdaigcs:~/.obdiag/display/tasks/observer #b obdiag display scene run --scene=observer.splan  -c ~/.obdiag/configzry.yml --env   sqlid=ceeb8efd0bda5aa584b19d366fb25ec 
[warn] currently executing in obdiag home directory!
display_scenes_run start ...
execute tasks: observer.splan
 --------------- -------------- --------- ----------- ------ ---------------------------------- --------------------- --------------------- ------------------ ------------ ----------------- ----------------- --------------------- ------------ ------------ -------- ------------ 
| database_name |    svr_ip    | plan_id | tenant_id | type |              sql_id              |   first_load_time   |   last_active_time  | avg_exe_usec(ms) | executions | elapsed_time(s) | slowest_time(s) |   slowest_exe_time  | slow_count | ps_stmt_id |  rows  | outline_id |
 --------------- -------------- --------- ----------- ------ ---------------------------------- --------------------- --------------------- ------------------ ------------ ----------------- ----------------- --------------------- ------------ ------------ -------- ------------ 
|      sys      | 168.88.1.112 |   177   |    1001   |  1   | ceeb8efd0bda15aa584b19d366fb25ec | 2024-12-17 18:08:48 | 2024-12-17 18:12:34 |      8.8620      |     2      |    0.01772400   |       0.02      | 2024-12-17 18:08:48 |     0      |     -1     | 0.0000 |     -1     |
 --------------- -------------- --------- ----------- ------ ---------------------------------- --------------------- --------------------- ------------------ ------------ ----------------- ----------------- --------------------- ------------ ------------ -------- ------------ 
trace id: 3ff04ae8-bc62-11ef-a058-00505689a0ae
if you want to view detailed obdiag logs, please run: obdiag display-trace 3ff04ae8-bc62-11ef-a058-00505689a0ae
obdaigcs:~/.obdiag/display/tasks/observer #b obdiag display scene run --scene=observer.aplan   -c ~/.obdiag/configzry.yml --env svr_ip=168.88.1.112--env tenant_id=1001 --env plan_id=177
[warn] currently executing in obdiag home directory!
display_scenes_run start ...
execute tasks: observer.aplan
 --------- ---------------- ------ -------- ------- 
| plan_id |    operator    | name |  rows  |  cost |
 --------- ---------------- ------ -------- ------- 
|   177   | phy_table_scan |  t1  | 100000 | 38680 |
 --------- ---------------- ------ -------- ------- 
trace id: 5c5b59f2-bc62-11ef-b734-00505689a0ae
if you want to view detailed obdiag logs, please run: obdiag display-trace 5c5b59f2-bc62-11ef-b734-00505689a0ae
obdaigcs:~/.obdiag/display/tasks/observer #b obdiag display scene list

总结:

我后期会把我日常工作中经常用的功能都做成yaml工具箱的方式加到display里,这是成为一个合格的ob脚本dba的必修课。

行之所向,莫问远方。

点赞12
收藏

声明

本网站下的“博客”等板块为技术爱好者提供分享、交流的平台。发布者发布的任何内容、信息等,并不反映或代表本网站的观点、立场或政策。本网站不对其任何内容和信息的错误以及由此产生的损失或损坏承担任何责任。

尊重知识产权是本网站的基本原则之一,如您在使用本网站过程中发现本网站中存在侵犯您或其他第三人合法知识产权的情况,请您即可将侵权材料及初步证据提交至下述邮箱:obcompliance@oceanbase.com 。本网站将在收到材料后尽快进行审核及处理。

行之所向,莫问远方

已发布 32 篇博文

网站地图