Storm Environment Setup and Demo

17-07-06

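Storm is an open-source distributed real-time computation system that processes large data streams simply and reliably; it is often called "the Hadoop of real time". Typical use cases include real-time analytics, online machine learning, continuous computation, distributed RPC, and ETL. Storm scales horizontally, is highly fault tolerant, guarantees that every message is processed, and is fast (on a small cluster, each node can process on the order of a million messages per second).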

* File downloads


1. ZooKeeper download

Download URL: http://apache.fayea.com/zookeeper/zookeeper-3.3.6/zookeeper-3.3.6.tar.gz

2. Storm download

Download URL: http://mirrors.hust.edu.cn/apache/storm/apache-storm-1.0.3/apache-storm-1.0.3.tar.gz

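* System environment setup and configuration


1. Configure ZooKeeper (single-node configuration used here)


    * Upload zookeeper-3.3.6.tar.gz to the /home/temp directory on the CentOS server
    * Extract it: tar -zxvf zookeeper-3.3.6.tar.gz
    * Move it to /usr/lib/zookeeper: mv zookeeper-3.3.6 /usr/lib/zookeeper
    * Create zoo.cfg: cd /usr/lib/zookeeper/conf, then cp zoo_sample.cfg zoo.cfg. Sample configuration for reference: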
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

maxClientCnxns=50
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
dataDir=/var/lib/zookeeper/data
# the port at which the clients will connect
clientPort=2181
# the directory where the transaction logs are stored.
dataLogDir=/var/lib/zookeeper

server.1=hadooplearn:2888:3888
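    * Create the data directory if it does not exist yet: mkdir -p /var/lib/zookeeper/data. Create the myid file with echo 1 > myid, then move it into the data directory: mv myid /var/lib/zookeeper/data/
    * Start the ZooKeeper service: cd /usr/lib/zookeeper, then bin/zkServer.sh start
    * Verify the ZooKeeper service: bin/zkServer.sh status
    * The ZooKeeper service is now deployed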
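
The configuration steps above can be sketched as one small shell function. This is only an illustration under stated assumptions: write_zk_config is a hypothetical helper name (not part of ZooKeeper), it writes just the settings from the sample above, and it leaves out dataLogDir, in which case ZooKeeper keeps its transaction logs in dataDir.

```shell
#!/bin/sh
# Sketch: write a minimal single-node zoo.cfg and the matching myid file.
# write_zk_config is a hypothetical helper, not part of ZooKeeper.
write_zk_config() {
    conf_dir=$1   # e.g. /usr/lib/zookeeper/conf
    data_dir=$2   # e.g. /var/lib/zookeeper/data
    host=$3       # e.g. hadooplearn

    # Create both directories so the later writes cannot fail on a missing path.
    mkdir -p "$conf_dir" "$data_dir"

    cat > "$conf_dir/zoo.cfg" <<EOF
maxClientCnxns=50
tickTime=2000
initLimit=10
syncLimit=5
dataDir=$data_dir
clientPort=2181
server.1=$host:2888:3888
EOF

    # The id must match the N in "server.N" above.
    # Keep the spaces: "echo 1>myid" redirects stdout and writes only a blank line.
    echo 1 > "$data_dir/myid"
}
```

With the paths used in this article, the call would be write_zk_config /usr/lib/zookeeper/conf /var/lib/zookeeper/data hadooplearn.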




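2. Configure Storm (single node used here)


    * Upload apache-storm-1.0.3.tar.gz to the /home/temp directory on the CentOS server
    * Extract it: tar -zxvf apache-storm-1.0.3.tar.gz
    * Move apache-storm-1.0.3 to /usr/lib/apache-storm: mv apache-storm-1.0.3 /usr/lib/apache-storm
    * Edit storm.yaml: cd /usr/lib/apache-storm/conf. Sample configuration for reference: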
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements.  See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership.  The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License.  You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
########### These MUST be filled in for a storm configuration
storm.zookeeper.servers:
    - "hadooplearn"
#    - "server2"

nimbus.seeds: ["hadooplearn"]
#
#
# ##### These may optionally be filled in:
#   
## List of custom serializations
# topology.kryo.register:
#     - org.mycompany.MyType
#     - org.mycompany.MyType2: org.mycompany.MyType2Serializer
#
## List of custom kryo decorators
# topology.kryo.decorators:
#     - org.mycompany.MyDecorator
#
## Locations of the drpc servers
# drpc.servers:
#     - "server1"
#     - "server2"

## Metrics Consumers
# topology.metrics.consumer.register:
#   - class: "org.apache.storm.metric.LoggingMetricsConsumer"
#     parallelism.hint: 1
#   - class: "org.mycompany.MyMetricsConsumer"
#     parallelism.hint: 1
#     argument:
#       - endpoint: "metrics-collector.mycompany.org"
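    * Start Nimbus: cd /usr/lib/apache-storm/bin, then ./storm nimbus
    * Start the Storm web UI: cd /usr/lib/apache-storm/bin, then ./storm ui
    * Start the supervisor: cd /usr/lib/apache-storm/bin, then ./storm supervisor
    * Each of these commands runs in the foreground. To run them as background services instead: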

nohup ./storm nimbus 1>/dev/null 2>&1 &
nohup ./storm ui 1>/dev/null 2>&1 &
nohup ./storm supervisor 1>/dev/null 2>&1 &
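
When a daemon fails to start, it can help to keep its console output rather than discard it. The following is a minimal sketch of the same background start with per-daemon output files; start_storm_daemons and the output directory are assumptions made for this example, and Storm's own log files are written separately under its installation directory.

```shell
#!/bin/sh
# Sketch: start nimbus, ui and supervisor in the background and keep each
# daemon's console output in its own file instead of /dev/null.
# start_storm_daemons is a hypothetical helper name.
start_storm_daemons() {
    storm_bin=$1   # e.g. /usr/lib/apache-storm/bin
    out_dir=$2     # any writable directory (assumed, not from the article)

    mkdir -p "$out_dir"
    for daemon in nimbus ui supervisor; do
        # nohup keeps the daemon alive after the login shell exits.
        nohup "$storm_bin/storm" "$daemon" > "$out_dir/$daemon.out" 2>&1 &
        echo "started $daemon (pid $!)"
    done
}
```

For example: start_storm_daemons /usr/lib/apache-storm/bin /tmp/storm-console.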

3. Storm demo

Download URL: http://dl.download.csdn.net/down11/20170705/66e92331900ddcf5d5b88c37650b23dd.zip?response-content-disposition=attachment%3Bfilename%3D%22weekend-storm.zip%22&OSSAccessKeyId=9q6nvzoJGowBj4q1&Expires=1499244457&Signature=N%2BB5mmaOVn6OCxmCA5M6yLjjmJI%3D

* Q&A


1. How do I kill a Storm job (topology)?

cd /usr/lib/apache-storm/bin, then run ./storm kill 'topology name'. For example, the topology in the demo is named demotopo, so the command is ./storm kill demotopo
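
The storm kill command also accepts a -w flag that overrides how many seconds Storm waits between deactivating the topology's spouts and shutting down its workers. A minimal wrapper sketch, where kill_topology and the STORM_CMD variable are hypothetical names chosen for this example:

```shell
#!/bin/sh
# Sketch: kill a topology, optionally overriding the wait time in seconds.
# kill_topology and STORM_CMD are hypothetical names for this example.
STORM_CMD=${STORM_CMD:-/usr/lib/apache-storm/bin/storm}

kill_topology() {
    name=$1
    wait_secs=$2   # optional

    if [ -z "$name" ]; then
        echo "usage: kill_topology <name> [wait-secs]" >&2
        return 1
    fi
    if [ -n "$wait_secs" ]; then
        "$STORM_CMD" kill "$name" -w "$wait_secs"
    else
        "$STORM_CMD" kill "$name"
    fi
}
```

For the demo topology, kill_topology demotopo 10 would run storm kill demotopo -w 10.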

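2. How do I develop a spout?

Extend BaseRichSpout and implement three methods: open, declareOutputFields, and nextTuple.

3. How do I develop a bolt?

Extend BaseBasicBolt and implement declareOutputFields and execute; override prepare when the bolt needs one-time initialization.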

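4. What does the Storm job submission process look like?

    * Create a TopologyBuilder instance, builder
    * Register the spout with builder.setSpout
    * Register each bolt with builder.setBolt. Every downstream component must name the component it takes its input from, for example: builder.setBolt("upperbolt", new UpperBolt(), 4).shuffleGrouping("randomspout");
    * Use builder to create the topology: StormTopology demotop = builder.createTopology();
    * Configure the parameters the topology runs with on the cluster (the conf object)
    * Submit the topology to the Storm cluster: StormSubmitter.submitTopology("demotopo", conf, demotop);

Run the jar to submit the topology. Command format: storm jar <jar path> <topology package.topology class> [arguments]. Everything after the class name is passed to the class's main method, for example the topology name; the Nimbus address is read from nimbus.seeds in storm.yaml. e.g.: storm jar /home/storm/storm-starter.jar org.apache.storm.starter.WordCountTopology wordcountTop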
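
The submit command can be wrapped in a small function that checks the jar before calling storm jar. This is a sketch only: submit_topology and STORM_CMD are hypothetical names, and the check is a convenience rather than anything Storm requires.

```shell
#!/bin/sh
# Sketch: submit a topology jar with "storm jar", checking the jar first.
# submit_topology and STORM_CMD are hypothetical names for this example.
STORM_CMD=${STORM_CMD:-/usr/lib/apache-storm/bin/storm}

submit_topology() {
    if [ "$#" -lt 2 ]; then
        echo "usage: submit_topology <jar> <main-class> [args...]" >&2
        return 1
    fi
    jar=$1
    main_class=$2
    shift 2

    if [ ! -f "$jar" ]; then
        echo "jar not found: $jar" >&2
        return 1
    fi
    # Everything left in "$@" is handed to the main class unchanged.
    "$STORM_CMD" jar "$jar" "$main_class" "$@"
}
```

For example: submit_topology /home/storm/storm-starter.jar org.apache.storm.starter.WordCountTopology wordcountTop (the package name shown is the one used by storm-starter in Storm 1.x).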
