Environment

Three servers running CentOS Linux release 7.2.1511:

192.168.1.58 (master)
192.168.1.59 (slave)
192.168.1.60 (slave)

I. Installing Java (JDK 8)

1. File: jdk-8u131-linux-x64.tar.gz
2. Upload the package to the server.
3. Extract it:
tar -zxvf jdk-8u131-linux-x64.tar.gz
4. Edit the environment variables (vim /etc/profile) and append:
JAVA_HOME=/usr/java/jdk1.8.0_131
PATH=$PATH:$JAVA_HOME/bin
CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
5. Apply the changes immediately:
source /etc/profile
6. Check the JDK version:
java -version
Output like the figure confirms the JDK is configured.

II. Hadoop installation and configuration

1. File preparation
hadoop-2.8.0.tar.gz. Download address:

2. Installation

2.1 Set up passwordless SSH. On the master node (192.168.1.58):
ssh-keygen -t rsa        (press Enter at every prompt)
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys        (enables passwordless login to the local machine)

2.2 Copy id_rsa.pub to 192.168.1.59 and 192.168.1.60:
scp ~/.ssh/id_rsa.pub root@192.168.1.59:/root/
scp ~/.ssh/id_rsa.pub root@192.168.1.60:/root/
Then run the following on each of the two slaves:
cat /root/id_rsa.pub >> ~/.ssh/authorized_keys
If the ~/.ssh directory does not exist, create it first (mkdir /root/.ssh).

2.3 Upload hadoop-2.8.0.tar.gz to the /usr/nacp directory on the master and both slaves, and extract it:
tar -zxvf hadoop-2.8.0.tar.gz

2.4 On the master, set the environment variables (vim /etc/profile) and append:

# Hadoop Env
export PATH USER LOGNAME MAIL HOSTNAME HISTSIZE HISTCONTROL
export JAVA_HOME=/usr/java/jdk1.8.0_131
export PATH=/usr/nacp/hadoop-2.8.0/bin:$PATH
export PATH=/usr/nacp/hadoop-2.8.0/sbin:$PATH
export PATH=/usr/java/jdk1.8.0_131/bin:$PATH
export PATH=/usr/nacp/spark-2.2.0-bin-hadoop2.7/bin:$PATH
export PATH=/usr/nacp/spark-2.2.0-bin-hadoop2.7/sbin:$PATH
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export SPARK_HOME=/usr/nacp/spark-2.2.0-bin-hadoop2.7
export SCALA_HOME=/usr/nacp/scala-2.12.2
export PATH=/usr/nacp/scala-2.12.2/bin:$PATH

Apply the changes immediately (note: use the actual JDK install path on your machine):
source /etc/profile
Append the same lines to the environment variable file on both slave nodes and apply them there as well.

2.5 Check the Hadoop version:
hadoop version
Output like the figure above confirms the Hadoop environment is set up.

3. Hadoop configuration

Change to the /usr/nacp/hadoop-2.8.0/etc/hadoop directory:
cd /usr/nacp/hadoop-2.8.0/etc/hadoop

3.1 Append the following to hadoop-env.sh (vim hadoop-env.sh):
export JAVA_HOME=/usr/java/jdk1.8.0_131
export HADOOP_PREFIX=/usr/nacp/hadoop-2.8.0

3.2 Append to yarn-env.sh:
export JAVA_HOME=/usr/java/jdk1.8.0_131

3.3 core-site.xml
Create a tmp directory:
mkdir /usr/nacp/hadoop-2.8.0/tmp/
Edit core-site.xml (vim core-site.xml) and add:
<property>
    <name>fs.defaultFS</name>
    <value>hdfs://192.168.1.58:9000</value>
</property>
<property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/nacp/hadoop-2.8.0/tmp</value>
</property>

3.4 hdfs-site.xml
Edit hdfs-site.xml (vim hdfs-site.xml) and add:
<property>
    <name>dfs.replication</name>
    <value>1</value>
</property>
<property>
    <name>dfs.data.dir</name>
    <value>/home/test/hdfsData</value>
</property>
(Note: the dfs.data.dir path should be changed to a directory on a volume with ample storage on your server.)

3.5 mapred-site.xml
Edit mapred-site.xml (if the file does not exist, copy mapred-site.xml.template to mapred-site.xml), vim mapred-site.xml, and add:
<property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
</property>
<property>
    <name>yarn.nodemanager.webapp.address</name>
    <value>192.168.1.58:8042</value>
</property>
<property>
    <name>yarn.nodemanager.address</name>
    <value>192.168.1.58:8041</value>
</property>
<property>
    <name>yarn.resourcemanager.address</name>
    <value>192.168.1.58:8032</value>
</property>
<property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>192.168.1.58:8030</value>
</property>
<property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>192.168.1.58:8035</value>
</property>
<property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>192.168.1.58:8033</value>
</property>
<property>
    <name>yarn.nodemanager.localizer.address</name>
    <value>192.168.1.58:8040</value>
</property>
Example: (see figure)

3.6 yarn-site.xml
Edit yarn-site.xml (vim yarn-site.xml) and add:
<property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
</property>
<property>
    <name>yarn.resourcemanager.hostname</name>
    <value>192.168.1.58</value>
</property>
<property>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>1024</value>
</property>
<property>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>16000</value>
</property>
<property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>8048</value>
</property>
<property>
    <name>yarn.nodemanager.vmem-pmem-ratio</name>
    <value>8048</value>
</property>
Example: (see figure)

3.7 slaves
Edit the slaves file (vim slaves). For example:
192.168.1.59
192.168.1.60

3.8 Copy the configuration files modified above to both slave nodes.

III. Using Hadoop

1. Format the NameNode. On the master node, run:
hdfs namenode -format

2. Start HDFS (NameNode, DataNode). On the master node, run:
start-dfs.sh
Use the jps command on the master and on both slaves to check the Java processes. The master should show processes such as NameNode and SecondaryNameNode (see figure); the two slaves should show DataNode (see figure).

3. Start YARN (ResourceManager, NodeManager). On the master node, run:
start-yarn.sh
Check with jps again on the master and both slaves: the master should now also show ResourceManager (see figure), and the slaves should show NodeManager (see figure).

IV. Scala

1. Upload scala-2.12.2.tgz to the /usr/nacp directory and extract it:
tar -zxvf scala-2.12.2.tgz
2. Modify the environment variables and apply them; add:
export SCALA_HOME=/usr/nacp/scala-2.12.2
export PATH=/usr/nacp/scala-2.12.2/bin:$PATH

3. Check the Scala version:
scala -version

V. Spark installation and configuration

1. Upload spark-2.2.0-bin-hadoop2.7.tgz to the /usr/nacp/ directory and extract it:
tar -zxvf spark-2.2.0-bin-hadoop2.7.tgz
Modify the environment variables and apply them; add:
export SPARK_HOME=/usr/nacp/spark-2.2.0-bin-hadoop2.7
export PATH=/usr/nacp/spark-2.2.0-bin-hadoop2.7/bin:$PATH
export PATH=/usr/nacp/spark-2.2.0-bin-hadoop2.7/sbin:$PATH

2. Spark configuration
Change to the /usr/nacp/spark-2.2.0-bin-hadoop2.7/conf directory.
spark-env.sh: rename spark-env.sh.template to spark-env.sh:
mv spark-env.sh.template spark-env.sh
Open spark-env.sh with the vim editor and append at the end of the file:
export SPARK_LOCAL_IP=192.168.1.58
export SPARK_WORKER_PORT=10095
export JAVA_HOME=/usr/java/jdk1.8.0_131
export SCALA_HOME=/usr/nacp/scala-2.12.2
export SPARK_MASTER
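Section 2.4 above notes that the same environment lines must be added by hand on both slave nodes. As a convenience, building the block once and distributing it can be scripted. The following is a minimal sketch, not part of the original guide, using the paths from this document; the scp/ssh loop is commented out because it needs the live hosts and the passwordless SSH set up in section 2.1:

```shell
#!/bin/sh
# Hypothetical helper: collect the environment additions from section 2.4
# in one snippet file, so every node receives identical lines.
ENV_SNIPPET=/tmp/hadoop_env.sh
cat > "$ENV_SNIPPET" <<'EOF'
# Hadoop Env
export JAVA_HOME=/usr/java/jdk1.8.0_131
export SCALA_HOME=/usr/nacp/scala-2.12.2
export SPARK_HOME=/usr/nacp/spark-2.2.0-bin-hadoop2.7
export PATH=/usr/nacp/hadoop-2.8.0/bin:/usr/nacp/hadoop-2.8.0/sbin:$PATH
export PATH=$JAVA_HOME/bin:$SCALA_HOME/bin:$SPARK_HOME/bin:$SPARK_HOME/sbin:$PATH
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
EOF
# On the live cluster, push and append it on every node (uncomment to use):
# for host in 192.168.1.58 192.168.1.59 192.168.1.60; do
#   scp "$ENV_SNIPPET" root@"$host":/tmp/
#   ssh root@"$host" 'cat /tmp/hadoop_env.sh >> /etc/profile'
# done
grep -c '^export' "$ENV_SNIPPET"    # sanity check: prints 6, one per export line
```

After appending, run source /etc/profile on each node so the variables take effect in the current shell, as in sections I and 2.4.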