
Deploying the Big-Data Time-Series Database Apache Druid on Windows 10 with Docker Desktop

Related tips | Source: Internet | Author: anonymous | Published: 2025-02-18 21:19:55
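Deploying the latest Apache Druid 32.0.0, a big-data time-series database, on Windows 10 with Docker Desktop

Preface

A common scenario in big-data analytics is time-series data. In short, a record is never modified once it has been produced, and as time passes every point in time gets a new state value. The volume of such data is often staggering. Raw monitoring data from sensors is one example: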

https://lizhiyong.blog.csdn.net/article/details/114898620

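A single simple accelerometer produces roughly 3.1 billion records a year! If manufacturing sensor data goes straight to the edge-computing gateway without being preprocessed by a lower-level device such as a PLC, the load is enormous even over MQTT!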
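A quick back-of-the-envelope check of that figure, as a minimal sketch. The 100 Hz sampling rate is an assumption made here for illustration (the text above does not state one); it is simply a rate that lands on the same order of magnitude.

```python
# Seconds in a non-leap year: 365 days * 24 h * 60 min * 60 s = 31,536,000.
SECONDS_PER_YEAR = 365 * 24 * 60 * 60

# Hypothetical sampling rate; not taken from the article.
ASSUMED_RATE_HZ = 100


def records_per_year(sample_rate_hz: int) -> int:
    """Number of readings one sensor channel emits in a non-leap year."""
    return sample_rate_hz * SECONDS_PER_YEAR


print(f"{records_per_year(ASSUMED_RATE_HZ):,} records per year")
# prints "3,153,600,000 records per year"
```

A multi-axis sensor or a higher sampling rate multiplies this number accordingly.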

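Raw server monitoring data is similar. With common tools such as Prometheus and Zabbix, a large number of clusters means a large number of monitored items, and monitoring may also be deployed inside virtualized containers and virtual machines, so the data volume becomes very considerable. Tens of billions of records per hour is nothing unusual!

However, storing all of the raw monitoring data is frighteningly expensive, and its information density is extremely low. Most of the time we only care about the recent hot data in full, for example for online model training. Having a person look through thousands of records per second is not realistic either. In practice, a simple second-level or minute-level aggregate satisfies most analysis scenarios, and cold data older than one day has little timeliness left.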
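To make the idea of a minute-level statistic concrete, here is a minimal sketch in plain Python. It only illustrates the concept of collapsing raw readings into per-minute aggregates (Druid calls its ingestion-time version of this "rollup"); it is not how Druid implements it, and the function name and the `(epoch_seconds, value)` input shape are hypothetical.

```python
from collections import defaultdict


def rollup_by_minute(readings):
    """Aggregate (epoch_seconds, value) pairs into per-minute count/sum/min/max.

    Returns a dict keyed by the epoch second at which each minute starts.
    """
    buckets = defaultdict(
        lambda: {"count": 0, "sum": 0.0, "min": float("inf"), "max": float("-inf")}
    )
    for ts, value in readings:
        minute_start = ts - ts % 60  # truncate the timestamp to its minute
        bucket = buckets[minute_start]
        bucket["count"] += 1
        bucket["sum"] += value
        bucket["min"] = min(bucket["min"], value)
        bucket["max"] = max(bucket["max"], value)
    return dict(buckets)


# Three raw readings collapse into two per-minute rows.
sample = [(0, 1.0), (30, 3.0), (60, 5.0)]
for minute_start, stats in sorted(rollup_by_minute(sample).items()):
    print(minute_start, stats)
```

With thousands of readings per second, each minute-level row replaces a very large number of raw rows, which is where the storage saving comes from.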

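This kind of scenario calls for a database with high throughput and pre-aggregation. After load testing Apache Druid, ClickHouse, and Kylin, I chose the first one, Apache Druid. Leave specialized work to specialized components!

Application developers who do not work on the database kernel or on custom forks should mostly focus on APIs, features, and usage, and should not spend much effort on deployment! I had already set up Docker Desktop earlier: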

https://lizhiyong.blog.csdn.net/article/details/145580868

Today, let's set up the latest Apache Druid on Windows 10 and play with it.

Choosing a version

Official site:

https://druid.apache.org/

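Note that this is not Alibaba's database connection pool, which is also called Druid!

As of 2025-02-13, the latest Apache Druid release is 32.0.0.

Preparing resources

See the official site: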

https://druid.apache.org/docs/latest/tutorials/docker

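The official site provides a tutorial that orchestrates the containers with docker-compose.yml. Druid is a real-time component, so plenty of memory is a must! Still, starting 7 containers (ZooKeeper + PostgreSQL + 5 Druid services), each using at most 7 GB of memory, is no big deal!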

https://raw.githubusercontent.com/apache/druid/32.0.0/distribution/docker/docker-compose.yml

This gives us the following resource file:

version: "2.2"

volumes:
  metadata_data: {}
  middle_var: {}
  historical_var: {}
  broker_var: {}
  coordinator_var: {}
  router_var: {}
  druid_shared: {}

services:
  postgres:
    container_name: postgres
    image: postgres:latest
    ports:
      - "5432:5432"
    volumes:
      - metadata_data:/var/lib/postgresql/data
    environment:
      - POSTGRES_PASSWORD=FoolishPassword
      - POSTGRES_USER=druid
      - POSTGRES_DB=druid

  # Need 3.5 or later for container nodes
  zookeeper:
    container_name: zookeeper
    image: zookeeper:3.5.10
    ports:
      - "2181:2181"
    environment:
      - ZOO_MY_ID=1

  coordinator:
    image: apache/druid:32.0.0
    container_name: coordinator
    volumes:
      - druid_shared:/opt/shared
      - coordinator_var:/opt/druid/var
    depends_on:
      - zookeeper
      - postgres
    ports:
      - "8081:8081"
    command:
      - coordinator
    env_file:
      - environment

  broker:
    image: apache/druid:32.0.0
    container_name: broker
    volumes:
      - broker_var:/opt/druid/var
    depends_on:
      - zookeeper
      - postgres
      - coordinator
    ports:
      - "8082:8082"
    command:
      - broker
    env_file:
      - environment

  historical:
    image: apache/druid:32.0.0
    container_name: historical
    volumes:
      - druid_shared:/opt/shared
      - historical_var:/opt/druid/var
    depends_on:
      - zookeeper
      - postgres
      - coordinator
    ports:
      - "8083:8083"
    command:
      - historical
    env_file:
      - environment

  middlemanager:
    image: apache/druid:32.0.0
    container_name: middlemanager
    volumes:
      - druid_shared:/opt/shared
      - middle_var:/opt/druid/var
    depends_on:
      - zookeeper
      - postgres
      - coordinator
    ports:
      - "8091:8091"
      - "8100-8105:8100-8105"
    command:
      - middleManager
    env_file:
      - environment

  router:
    image: apache/druid:32.0.0
    container_name: router
    volumes:
      - router_var:/opt/druid/var
    depends_on:
      - zookeeper
      - postgres
      - coordinator
    ports:
      - "3012:8888" # host port changed to 3012 by the author so it does not occupy a more useful port
    command:
      - router
    env_file:
      - environment
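Since the file publishes quite a few host ports, it can help to list them before starting anything. The following is a minimal sketch, not a general Compose parser: the helper name `host_ports` is hypothetical, the mapping strings are copied by hand from the file above, and only the plain "HOST:CONTAINER" form (single port or range) is handled.

```python
def host_ports(mapping: str) -> list[int]:
    """Return the host ports published by a "HOST:CONTAINER" mapping string.

    Handles single ports ("3012:8888") and ranges ("8100-8105:8100-8105").
    Mappings with an IP prefix or a protocol suffix are not handled.
    """
    host_part = mapping.split(":")[0]
    if "-" in host_part:
        start, end = (int(p) for p in host_part.split("-"))
        return list(range(start, end + 1))
    return [int(host_part)]


# Port mappings copied from the docker-compose.yml shown above.
PUBLISHED = [
    "5432:5432",            # postgres
    "2181:2181",            # zookeeper
    "8081:8081",            # coordinator
    "8082:8082",            # broker
    "8083:8083",            # historical
    "8091:8091",            # middlemanager
    "8100-8105:8100-8105",  # middlemanager task ports
    "3012:8888",            # router (web console)
]

all_ports = sorted(p for m in PUBLISHED for p in host_ports(m))
print(all_ports)
```

If any of these host ports is already in use on the machine, only the left-hand side of the corresponding mapping needs to change, as was done for the router.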

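See also another page on the official site:

https://druid.apache.org/docs/latest/configuration/

For just playing around, these runtime settings can be left unchanged for now. Since everything runs in containers, redeploying later is also very easy!

We also need:

https://raw.githubusercontent.com/apache/druid/32.0.0/distribution/docker/environment

which serves as the second configuration file: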

# Java tuning
#DRUID_XMX=1g
#DRUID_XMS=1g
#DRUID_MAXNEWSIZE=250m
#DRUID_NEWSIZE=250m
#DRUID_MAXDIRECTMEMORYSIZE=6172m
DRUID_SINGLE_NODE_CONF=micro-quickstart
druid_emitter_logging_logLevel=debug
druid_extensions_loadList=["druid-histogram", "druid-datasketches", "druid-lookups-cached-global", "postgresql-metadata-storage", "druid-multi-stage-query"]
druid_zk_service_host=zookeeper
druid_metadata_storage_host=
druid_metadata_storage_type=postgresql
druid_metadata_storage_connector_connectURI=jdbc:postgresql://postgres:5432/druid
druid_metadata_storage_connector_user=druid
druid_metadata_storage_connector_password=FoolishPassword
druid_indexer_runner_javaOptsArray=["-server", "-Xmx1g", "-Xms1g", "-XX:MaxDirectMemorySize=3g", "-Duser.timezone=UTC", "-Dfile.encoding=UTF-8", "-Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager"]
druid_indexer_fork_property_druid_processing_buffer_sizeBytes=256MiB
druid_storage_type=local
druid_storage_storageDirectory=/opt/shared/segments
druid_indexer_logs_type=file
druid_indexer_logs_directory=/opt/shared/indexing-logs
druid_processing_numThreads=2
druid_processing_numMergeBuffers=2
DRUID_LOG4J=<?xml version="1.0" encoding="UTF-8" ?><Configuration status="WARN"><Appenders><Console name="Console" target="SYSTEM_OUT"><PatternLayout pattern="%d{ISO8601} %p [%t] %c - %m%n"/></Console></Appenders><Loggers><Root level="info"><AppenderRef ref="Console"/></Root><Logger name="org.apache.druid.jetty.RequestLog" additivity="false" level="DEBUG"><AppenderRef ref="Console"/></Logger></Loggers></Configuration>
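The lowercase `druid_...` names in this file correspond to ordinary Druid runtime properties: as far as the Docker tutorial describes it, a property such as `druid.metadata.storage.type` is written as an environment variable by replacing each `.` with `_`. The sketch below reverses that mapping so the file can be compared against the configuration reference. The function name is hypothetical, and the treatment of uppercase `DRUID_...` variables as "not a runtime property" is a simplifying assumption of this sketch.

```python
def env_to_property(line: str):
    """Map one line of the `environment` file to a (property, value) pair.

    Returns None for blank lines, comments, and variables that do not start
    with the lowercase "druid_" prefix (for example DRUID_XMX or DRUID_LOG4J).
    """
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    name, _, value = line.partition("=")  # split on the first "=" only
    if not name.startswith("druid_"):
        return None
    return name.replace("_", "."), value


for raw in (
    "druid_zk_service_host=zookeeper",
    "druid_storage_type=local",
    "DRUID_SINGLE_NODE_CONF=micro-quickstart",
):
    print(raw, "->", env_to_property(raw))
```

This makes it easier to look up an entry such as `druid.processing.numThreads` on the configuration page linked earlier.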

Small as they are, the deployment files contain everything that is needed!

Deployment

PS C:\Users\zhiyong> cd E:\dockerData\volume\druid1
PS E:\dockerData\volume\druid1> ls

    Directory: E:\dockerData\volume\druid1

Mode                 LastWriteTime         Length Name
----                 -------------         ------ ----
-a----        2025-02-13     23:26           2980 docker-compose.yml
-a----        2025-02-13     23:33           1576 environment

PS E:\dockerData\volume\druid1> docker compose up -d
time="2025-02-13T23:34:39+08:00" level=warning msg="E:\\dockerData\\volume\\druid1\\docker-compose.yml: the attribute `version` is obsolete, it will be ignored, please remove it to avoid potential confusion"
[+] Running 72/15
 ✔ router Pulled                                          230.7s
 ✔ coordinator Pulled                                     230.7s
 ✔ postgres Pulled                                        181.0s
 ✔ historical Pulled                                      230.7s
 ✔ broker Pulled                                          230.7s
 ✔ middlemanager Pulled                                   230.7s
 ✔ zookeeper Pulled                                        85.7s
[+] Running 15/15
 ✔ Network druid1_default           Created                 0.1s
 ✔ Volume "druid1_druid_shared"     Created                 0.0s
 ✔ Volume "druid1_historical_var"   Created                 0.0s
 ✔ Volume "druid1_middle_var"       Created                 0.0s
 ✔ Volume "druid1_router_var"       Created                 0.0s
 ✔ Volume "druid1_metadata_data"    Created                 0.0s
 ✔ Volume "druid1_coordinator_var"  Created                 0.0s
 ✔ Volume "druid1_broker_var"       Created                 0.0s
 ✔ Container postgres               Started                 2.4s
 ✔ Container zookeeper              Started                 2.4s
 ✔ Container coordinator            Started                 1.6s
 ✔ Container router                 Started                 2.5s
 ✔ Container broker                 Started                 2.3s
 ✔ Container historical             Started                 2.5s
 ✔ Container middlemanager          Started                 2.8s
PS E:\dockerData\volume\druid1>

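Once the images have been pulled, the containers come up very quickly.

Well, well. The ports of the other components were exposed along the way too.

So we also get a PostgreSQL and a ZooKeeper instance for free!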
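Whether those published ports are really reachable from the Windows host can be checked with a plain TCP connect. This is a minimal sketch using only the standard library; the helper name `is_port_open` is hypothetical, and a successful connect only shows that something is listening, not that the service behind it is fully ready.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable, or timed out
        return False


# Host ports published by the compose file for the two "free" services.
EXTRA_SERVICES = {"postgres": 5432, "zookeeper": 2181}
```

For example, `{name: is_port_open("localhost", port) for name, port in EXTRA_SERVICES.items()}` would be expected to report both as open while the stack is running.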

Verification

http://localhost:3012/unified-console.html#

Very nice, we now have the latest Apache Druid 32.0.0 up and running!
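Besides opening the console in a browser, the check can be scripted. Druid services expose a `/status/health` endpoint that answers `true` when the process is up. The sketch below queries it on the router through the host port 3012 chosen above. The helper names are hypothetical, and it only checks the one service it is pointed at, not the whole cluster.

```python
import urllib.error
import urllib.request


def health_url(host: str = "localhost", port: int = 3012) -> str:
    """Build the health-check URL for a Druid service (router by default)."""
    return f"http://{host}:{port}/status/health"


def is_healthy(url: str, timeout: float = 3.0) -> bool:
    """Return True if the endpoint answers HTTP 200 with the body `true`."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200 and resp.read().strip() == b"true"
    except (urllib.error.URLError, OSError):
        return False  # not reachable, refused, timed out, or HTTP error
```

Calling `is_healthy(health_url())` right after `docker compose up -d` may return False for a short while, since the JVMs inside the containers need some time to start.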

If you repost this article, please credit the source: https://lizhiyong.blog.csdn.net/article/details/145622903

