
Big Data Learning: Installing the JDK and an Entry-Level Hadoop Setup

Posted: 2021/1/13 19:59:47


Preface:

This article revisits the hands-on exercises from the fall 2020 semester, both to consolidate what we learned and to show our teacher that his efforts were not wasted. My thanks to Mr. Xu for his careful instruction.


Materials needed

  • jdk-8u112-linux-x64.tar.gz
    Download from the official JDK site
  • hadoop-2.7.6.tar.gz
    Download from the official Hadoop site
    or from the Tsinghua mirror
    Older versions may no longer be available there; if so, here is my Baidu Netdisk link → https://pan.baidu.com/s/1F0dl4368hlWR18mHzoZKwA
    Extraction code → t123

Prerequisites

  • Start the virtual machine and connect to it with FinalShell
  • Under the root directory, create a directory for your own files (mine is opt), and under it a directory for the installation packages (mine is soft)
  • Drag the two installation packages into the soft directory
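Assuming the same layout as above (the names opt and soft are the author's choices, not a requirement), the directory setup can be sketched as:

```shell
# Create the working directory under / and a subdirectory
# for the installation packages.
mkdir -p /opt/soft
ls -ld /opt/soft
```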

Now for the hands-on part. With the directories created and the archives uploaded, we are ready to go.

Installing the JDK

Here's a mantra for installing software → upload, extract, rename, environment variables, source to take effect.
The upload is already done, so next comes extraction. Since I want to extract into the opt directory and our archive sits in /opt/soft/, my extraction command is:

tar -zxvf jdk-8u112-linux-x64.tar.gz -C ../

Rename it!

mv jdk1.8.0_112/ jdk

Environment variables
Since this is our first time configuring environment variables, it's worth understanding how Linux handles them so we can work more efficiently. We'll put our variables in /etc/profile.d/bigdata-etc.sh (scripts under /etc/profile.d/ are sourced on login).
Here's how:

vim /etc/profile.d/bigdata-etc.sh

Note: put a space after the # in comments; that's a lesson from experience!!!
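The screenshot showed the file's contents; a minimal JDK-only version of /etc/profile.d/bigdata-etc.sh (assuming the archive was extracted to /opt and renamed to jdk, as above) might look like:

```shell
# Configure the JDK environment variables
# (note the space after each #, per the tip above)
export JAVA_HOME=/opt/jdk
export PATH=$PATH:$JAVA_HOME/bin
```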

Then source it to take effect:

source /etc/profile.d/bigdata-etc.sh
You can verify the JDK installation by checking the versions:
→ javac -version
→ java -version

Installing Hadoop

With the JDK experience behind us, is Hadoop still hard?
(If it were that simple, why treat it as a separate step at all?)

Here are the hands-on steps

It's the same routine (extract, rename, environment variables), so I won't repeat the details.
Note that both bin and sbin must be added to the PATH!!

# Configure the JDK environment variables
export JAVA_HOME=/opt/jdk
export PATH=$PATH:$JAVA_HOME/bin

# Configure the Hadoop environment variables
export HADOOP_HOME=/opt/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

Finally, source it to take effect:

source /etc/profile.d/bigdata-etc.sh 

To wrap up, a handy FinalShell trick:
clear the screen with Ctrl+L.
And a Linux trick:
the ↑/↓ keys recall history commands; the number of history entries kept can be customized, which I won't cover here.
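For the curious, the history size is controlled by the HISTSIZE shell variable; a quick sketch (assuming bash, the CentOS default shell):

```shell
# Keep 2000 commands in the shell history;
# add this line to ~/.bashrc to make it permanent.
export HISTSIZE=2000
echo "$HISTSIZE"
```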

Testing

Test 1

hadoop version

Test 2

Run the MapReduce example program grep.
Goal: count the occurrences of content beginning with dfs in all files under the source directory (input).
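For intuition, what this example job computes can be approximated on a single machine with plain grep (a sketch only; it assumes the input files already exist under ~/input):

```shell
# Print every match of the pattern, one per line, then count the
# distinct matches, mirroring the (match, count) pairs the job emits.
grep -ohE 'dfs[a-z.]' ~/input/* | sort | uniq -c
```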
① Prepare the input data

[root@Mymaster hadoop]# mkdir ~/input
[root@Mymaster ~]# cd input/
[root@Mymaster input]# vim Test.txt


hadoop jar /opt/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.6.jar grep ~/input ~/output 'dfs[a-z.]'
21/01/13 18:34:20 INFO Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id
21/01/13 18:34:20 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
21/01/13 18:34:20 INFO input.FileInputFormat: Total input paths to process : 1
21/01/13 18:34:20 INFO mapreduce.JobSubmitter: number of splits:1
21/01/13 18:34:20 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_local437625394_0001
21/01/13 18:34:21 INFO mapreduce.Job: The url to track the job: http://localhost:8080/
21/01/13 18:34:21 INFO mapreduce.Job: Running job: job_local437625394_0001
21/01/13 18:34:21 INFO mapred.LocalJobRunner: OutputCommitter set in config null
21/01/13 18:34:21 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
21/01/13 18:34:21 INFO mapred.LocalJobRunner: OutputCommitter is org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
21/01/13 18:34:21 INFO mapred.LocalJobRunner: Waiting for map tasks
21/01/13 18:34:21 INFO mapred.LocalJobRunner: Starting task: attempt_local437625394_0001_m_000000_0
21/01/13 18:34:21 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
21/01/13 18:34:21 INFO mapred.Task:  Using ResourceCalculatorProcessTree : [ ]
21/01/13 18:34:21 INFO mapred.MapTask: Processing split: file:/root/input/Test.txt:0+31
21/01/13 18:34:21 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
21/01/13 18:34:21 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
21/01/13 18:34:21 INFO mapred.MapTask: soft limit at 83886080
21/01/13 18:34:21 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
21/01/13 18:34:21 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
21/01/13 18:34:21 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
21/01/13 18:34:21 INFO mapred.LocalJobRunner: 
21/01/13 18:34:21 INFO mapred.MapTask: Starting flush of map output
21/01/13 18:34:21 INFO mapred.MapTask: Spilling map output
21/01/13 18:34:21 INFO mapred.MapTask: bufstart = 0; bufend = 39; bufvoid = 104857600
21/01/13 18:34:21 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend = 26214388(104857552); length = 9/6553600
21/01/13 18:34:21 INFO mapred.MapTask: Finished spill 0
21/01/13 18:34:21 INFO mapred.Task: Task:attempt_local437625394_0001_m_000000_0 is done. And is in the process of committing
21/01/13 18:34:21 INFO mapred.LocalJobRunner: map
21/01/13 18:34:21 INFO mapred.Task: Task 'attempt_local437625394_0001_m_000000_0' done.
21/01/13 18:34:21 INFO mapred.Task: Final Counters for attempt_local437625394_0001_m_000000_0: Counters: 18
        File System Counters
                FILE: Number of bytes read=296008
                FILE: Number of bytes written=588377
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
        Map-Reduce Framework
                Map input records=6
                Map output records=3
                Map output bytes=39
                Map output materialized bytes=36
                Input split bytes=90
                Combine input records=3
                Combine output records=2
                Spilled Records=2
                Failed Shuffles=0
                Merged Map outputs=0
                GC time elapsed (ms)=90
                Total committed heap usage (bytes)=212860928
        File Input Format Counters 
                Bytes Read=31
21/01/13 18:34:21 INFO mapred.LocalJobRunner: Finishing task: attempt_local437625394_0001_m_000000_0
21/01/13 18:34:21 INFO mapred.LocalJobRunner: map task executor complete.
21/01/13 18:34:21 INFO mapred.LocalJobRunner: Waiting for reduce tasks
21/01/13 18:34:21 INFO mapred.LocalJobRunner: Starting task: attempt_local437625394_0001_r_000000_0
21/01/13 18:34:21 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
21/01/13 18:34:21 INFO mapred.Task:  Using ResourceCalculatorProcessTree : [ ]
21/01/13 18:34:21 INFO mapred.ReduceTask: Using ShuffleConsumerPlugin: org.apache.hadoop.mapreduce.task.reduce.Shuffle@480c6baa
21/01/13 18:34:21 INFO reduce.MergeManagerImpl: MergerManager: memoryLimit=334338464, maxSingleShuffleLimit=83584616, mergeThreshold=220663392, ioSortFactor=10, memToMemMergeOutputsThreshold=10
21/01/13 18:34:21 INFO reduce.EventFetcher: attempt_local437625394_0001_r_000000_0 Thread started: EventFetcher for fetching Map Completion Events
21/01/13 18:34:21 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local437625394_0001_m_000000_0 decomp: 32 len: 36 to MEMORY
21/01/13 18:34:21 INFO reduce.InMemoryMapOutput: Read 32 bytes from map-output for attempt_local437625394_0001_m_000000_0
21/01/13 18:34:21 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 32, inMemoryMapOutputs.size() -> 1, commitMemory -> 0, usedMemory ->32
21/01/13 18:34:21 INFO reduce.EventFetcher: EventFetcher is interrupted.. Returning
21/01/13 18:34:21 INFO mapred.LocalJobRunner: 1 / 1 copied.
21/01/13 18:34:21 INFO reduce.MergeManagerImpl: finalMerge called with 1 in-memory map-outputs and 0 on-disk map-outputs
21/01/13 18:34:21 INFO mapred.Merger: Merging 1 sorted segments
21/01/13 18:34:21 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 25 bytes
21/01/13 18:34:21 INFO reduce.MergeManagerImpl: Merged 1 segments, 32 bytes to disk to satisfy reduce memory limit
21/01/13 18:34:21 INFO reduce.MergeManagerImpl: Merging 1 files, 36 bytes from disk
21/01/13 18:34:21 INFO reduce.MergeManagerImpl: Merging 0 segments, 0 bytes from memory into reduce
21/01/13 18:34:21 INFO mapred.Merger: Merging 1 sorted segments
21/01/13 18:34:21 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 25 bytes
21/01/13 18:34:21 INFO mapred.LocalJobRunner: 1 / 1 copied.
21/01/13 18:34:21 INFO Configuration.deprecation: mapred.skip.on is deprecated. Instead, use mapreduce.job.skiprecords
21/01/13 18:34:21 INFO mapred.Task: Task:attempt_local437625394_0001_r_000000_0 is done. And is in the process of committing
21/01/13 18:34:21 INFO mapred.LocalJobRunner: 1 / 1 copied.
21/01/13 18:34:21 INFO mapred.Task: Task attempt_local437625394_0001_r_000000_0 is allowed to commit now
21/01/13 18:34:21 INFO output.FileOutputCommitter: Saved output of task 'attempt_local437625394_0001_r_000000_0' to file:/root/input/grep-temp-2131710143/_temporary/0/task_local437625394_0001_r_000000
21/01/13 18:34:21 INFO mapred.LocalJobRunner: reduce > reduce
21/01/13 18:34:21 INFO mapred.Task: Task 'attempt_local437625394_0001_r_000000_0' done.
21/01/13 18:34:21 INFO mapred.Task: Final Counters for attempt_local437625394_0001_r_000000_0: Counters: 24
        File System Counters
                FILE: Number of bytes read=296112
                FILE: Number of bytes written=588553
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
        Map-Reduce Framework
                Combine input records=0
                Combine output records=0
                Reduce input groups=2
                Reduce shuffle bytes=36
                Reduce input records=2
                Reduce output records=2
                Spilled Records=2
                Shuffled Maps =1
                Failed Shuffles=0
                Merged Map outputs=1
                GC time elapsed (ms)=0
                Total committed heap usage (bytes)=212860928
        Shuffle Errors
                BAD_ID=0
                CONNECTION=0
                IO_ERROR=0
                WRONG_LENGTH=0
                WRONG_MAP=0
                WRONG_REDUCE=0
        File Output Format Counters 
                Bytes Written=140
21/01/13 18:34:21 INFO mapred.LocalJobRunner: Finishing task: attempt_local437625394_0001_r_000000_0
21/01/13 18:34:21 INFO mapred.LocalJobRunner: reduce task executor complete.
21/01/13 18:34:22 INFO mapreduce.Job: Job job_local437625394_0001 running in uber mode : false
21/01/13 18:34:22 INFO mapreduce.Job:  map 100% reduce 100%
21/01/13 18:34:22 INFO mapreduce.Job: Job job_local437625394_0001 completed successfully
21/01/13 18:34:22 INFO mapreduce.Job: Counters: 30
        File System Counters
                FILE: Number of bytes read=592120
                FILE: Number of bytes written=1176930
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
        Map-Reduce Framework
                Map input records=6
                Map output records=3
                Map output bytes=39
                Map output materialized bytes=36
                Input split bytes=90
                Combine input records=3
                Combine output records=2
                Reduce input groups=2
                Reduce shuffle bytes=36
                Reduce input records=2
                Reduce output records=2
                Spilled Records=4
                Shuffled Maps =1
                Failed Shuffles=0
                Merged Map outputs=1
                GC time elapsed (ms)=90
                Total committed heap usage (bytes)=425721856
        Shuffle Errors
                BAD_ID=0
                CONNECTION=0
                IO_ERROR=0
                WRONG_LENGTH=0
                WRONG_MAP=0
                WRONG_REDUCE=0
        File Input Format Counters 
                Bytes Read=31
        File Output Format Counters 
                Bytes Written=140
21/01/13 18:34:22 INFO jvm.JvmMetrics: Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
21/01/13 18:34:22 INFO input.FileInputFormat: Total input paths to process : 1
21/01/13 18:34:22 INFO mapreduce.JobSubmitter: number of splits:1
21/01/13 18:34:22 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_local685543909_0002
21/01/13 18:34:22 INFO mapreduce.Job: The url to track the job: http://localhost:8080/
21/01/13 18:34:22 INFO mapreduce.Job: Running job: job_local685543909_0002
21/01/13 18:34:22 INFO mapred.LocalJobRunner: OutputCommitter set in config null
21/01/13 18:34:22 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
21/01/13 18:34:22 INFO mapred.LocalJobRunner: OutputCommitter is org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
21/01/13 18:34:22 INFO mapred.LocalJobRunner: Waiting for map tasks
21/01/13 18:34:22 INFO mapred.LocalJobRunner: Starting task: attempt_local685543909_0002_m_000000_0
21/01/13 18:34:22 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
21/01/13 18:34:22 INFO mapred.Task:  Using ResourceCalculatorProcessTree : [ ]
21/01/13 18:34:22 INFO mapred.MapTask: Processing split: file:/root/input/grep-temp-2131710143/part-r-00000:0+128
21/01/13 18:34:22 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
21/01/13 18:34:22 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
21/01/13 18:34:22 INFO mapred.MapTask: soft limit at 83886080
21/01/13 18:34:22 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
21/01/13 18:34:22 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
21/01/13 18:34:22 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
21/01/13 18:34:22 INFO mapred.LocalJobRunner: 
21/01/13 18:34:22 INFO mapred.MapTask: Starting flush of map output
21/01/13 18:34:22 INFO mapred.MapTask: Spilling map output
21/01/13 18:34:22 INFO mapred.MapTask: bufstart = 0; bufend = 26; bufvoid = 104857600
21/01/13 18:34:22 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend = 26214392(104857568); length = 5/6553600
21/01/13 18:34:22 INFO mapred.MapTask: Finished spill 0
21/01/13 18:34:22 INFO mapred.Task: Task:attempt_local685543909_0002_m_000000_0 is done. And is in the process of committing
21/01/13 18:34:22 INFO mapred.LocalJobRunner: map
21/01/13 18:34:22 INFO mapred.Task: Task 'attempt_local685543909_0002_m_000000_0' done.
21/01/13 18:34:22 INFO mapred.Task: Final Counters for attempt_local685543909_0002_m_000000_0: Counters: 17
        File System Counters
                FILE: Number of bytes read=592255
                FILE: Number of bytes written=1175758
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
        Map-Reduce Framework
                Map input records=2
                Map output records=2
                Map output bytes=26
                Map output materialized bytes=36
                Input split bytes=115
                Combine input records=0
                Spilled Records=2
                Failed Shuffles=0
                Merged Map outputs=0
                GC time elapsed (ms)=60
                Total committed heap usage (bytes)=385351680
        File Input Format Counters 
                Bytes Read=140
21/01/13 18:34:22 INFO mapred.LocalJobRunner: Finishing task: attempt_local685543909_0002_m_000000_0
21/01/13 18:34:22 INFO mapred.LocalJobRunner: map task executor complete.
21/01/13 18:34:22 INFO mapred.LocalJobRunner: Waiting for reduce tasks
21/01/13 18:34:22 INFO mapred.LocalJobRunner: Starting task: attempt_local685543909_0002_r_000000_0
21/01/13 18:34:22 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
21/01/13 18:34:22 INFO mapred.Task:  Using ResourceCalculatorProcessTree : [ ]
21/01/13 18:34:22 INFO mapred.ReduceTask: Using ShuffleConsumerPlugin: org.apache.hadoop.mapreduce.task.reduce.Shuffle@2b831f1f
21/01/13 18:34:22 INFO reduce.MergeManagerImpl: MergerManager: memoryLimit=334338464, maxSingleShuffleLimit=83584616, mergeThreshold=220663392, ioSortFactor=10, memToMemMergeOutputsThreshold=10
21/01/13 18:34:22 INFO reduce.EventFetcher: attempt_local685543909_0002_r_000000_0 Thread started: EventFetcher for fetching Map Completion Events
21/01/13 18:34:22 INFO reduce.LocalFetcher: localfetcher#2 about to shuffle output of map attempt_local685543909_0002_m_000000_0 decomp: 32 len: 36 to MEMORY
21/01/13 18:34:22 INFO reduce.InMemoryMapOutput: Read 32 bytes from map-output for attempt_local685543909_0002_m_000000_0
21/01/13 18:34:22 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 32, inMemoryMapOutputs.size() -> 1, commitMemory -> 0, usedMemory ->32
21/01/13 18:34:22 INFO reduce.EventFetcher: EventFetcher is interrupted.. Returning
21/01/13 18:34:22 WARN io.ReadaheadPool: Failed readahead on ifile
EBADF: Bad file descriptor
        at org.apache.hadoop.io.nativeio.NativeIO$POSIX.posix_fadvise(Native Method)
        at org.apache.hadoop.io.nativeio.NativeIO$POSIX.posixFadviseIfPossible(NativeIO.java:267)
        at org.apache.hadoop.io.nativeio.NativeIO$POSIX$CacheManipulator.posixFadviseIfPossible(NativeIO.java:146)
        at org.apache.hadoop.io.ReadaheadPool$ReadaheadRequestImpl.run(ReadaheadPool.java:206)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
21/01/13 18:34:22 INFO mapred.LocalJobRunner: 1 / 1 copied.
21/01/13 18:34:22 INFO reduce.MergeManagerImpl: finalMerge called with 1 in-memory map-outputs and 0 on-disk map-outputs
21/01/13 18:34:22 INFO mapred.Merger: Merging 1 sorted segments
21/01/13 18:34:22 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 22 bytes
21/01/13 18:34:22 INFO reduce.MergeManagerImpl: Merged 1 segments, 32 bytes to disk to satisfy reduce memory limit
21/01/13 18:34:22 INFO reduce.MergeManagerImpl: Merging 1 files, 36 bytes from disk
21/01/13 18:34:22 INFO reduce.MergeManagerImpl: Merging 0 segments, 0 bytes from memory into reduce
21/01/13 18:34:22 INFO mapred.Merger: Merging 1 sorted segments
21/01/13 18:34:22 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 22 bytes
21/01/13 18:34:22 INFO mapred.LocalJobRunner: 1 / 1 copied.
21/01/13 18:34:22 INFO mapred.Task: Task:attempt_local685543909_0002_r_000000_0 is done. And is in the process of committing
21/01/13 18:34:22 INFO mapred.LocalJobRunner: 1 / 1 copied.
21/01/13 18:34:22 INFO mapred.Task: Task attempt_local685543909_0002_r_000000_0 is allowed to commit now
21/01/13 18:34:22 INFO output.FileOutputCommitter: Saved output of task 'attempt_local685543909_0002_r_000000_0' to file:/root/output/_temporary/0/task_local685543909_0002_r_000000
21/01/13 18:34:22 INFO mapred.LocalJobRunner: reduce > reduce
21/01/13 18:34:22 INFO mapred.Task: Task 'attempt_local685543909_0002_r_000000_0' done.
21/01/13 18:34:22 INFO mapred.Task: Final Counters for attempt_local685543909_0002_r_000000_0: Counters: 24
        File System Counters
                FILE: Number of bytes read=592359
                FILE: Number of bytes written=1175820
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
        Map-Reduce Framework
                Combine input records=0
                Combine output records=0
                Reduce input groups=2
                Reduce shuffle bytes=36
                Reduce input records=2
                Reduce output records=2
                Spilled Records=2
                Shuffled Maps =1
                Failed Shuffles=0
                Merged Map outputs=1
                GC time elapsed (ms)=0
                Total committed heap usage (bytes)=385351680
        Shuffle Errors
                BAD_ID=0
                CONNECTION=0
                IO_ERROR=0
                WRONG_LENGTH=0
                WRONG_MAP=0
                WRONG_REDUCE=0
        File Output Format Counters 
                Bytes Written=26
21/01/13 18:34:22 INFO mapred.LocalJobRunner: Finishing task: attempt_local685543909_0002_r_000000_0
21/01/13 18:34:22 INFO mapred.LocalJobRunner: reduce task executor complete.
21/01/13 18:34:23 INFO mapreduce.Job: Job job_local685543909_0002 running in uber mode : false
21/01/13 18:34:23 INFO mapreduce.Job:  map 100% reduce 100%
21/01/13 18:34:23 INFO mapreduce.Job: Job job_local685543909_0002 completed successfully
21/01/13 18:34:23 INFO mapreduce.Job: Counters: 30
        File System Counters
                FILE: Number of bytes read=1184614
                FILE: Number of bytes written=2351578
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
        Map-Reduce Framework
                Map input records=2
                Map output records=2
                Map output bytes=26
                Map output materialized bytes=36
                Input split bytes=115
                Combine input records=0
                Combine output records=0
                Reduce input groups=2
                Reduce shuffle bytes=36
                Reduce input records=2
                Reduce output records=2
                Spilled Records=4
                Shuffled Maps =1
                Failed Shuffles=0
                Merged Map outputs=1
                GC time elapsed (ms)=60
                Total committed heap usage (bytes)=770703360
        Shuffle Errors
                BAD_ID=0
                CONNECTION=0
                IO_ERROR=0
                WRONG_LENGTH=0
                WRONG_MAP=0
                WRONG_REDUCE=0
        File Input Format Counters 
                Bytes Read=140
        File Output Format Counters 
                Bytes Written=26

Verify the result

cat output/*

Test 3: run the MapReduce example program wordcount
Goal:
Count the total number of occurrences of each word across all files in the given source directory. (The same word appearing in different files is accumulated together.)

Prepare the input data

[root@Mymaster ~]# mkdir input2
[root@Mymaster ~]# cd input2/
[root@Mymaster input2]# vim heheda.txt


[root@Mymaster input2]# vim haha.txt

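Before running the job, here is a rough single-machine equivalent of what wordcount will compute (a sketch only; the path assumes the two files created above):

```shell
# Split every file on whitespace, one word per line, then count
# how many times each distinct word appears across all files.
cat ~/input2/*.txt | tr -s '[:space:]' '\n' | sort | uniq -c
```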

hadoop jar /opt/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.6.jar wordcount ~/input2 ~/output2
21/01/13 19:01:01 INFO Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id
21/01/13 19:01:01 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
21/01/13 19:01:01 INFO input.FileInputFormat: Total input paths to process : 2
21/01/13 19:01:01 INFO mapreduce.JobSubmitter: number of splits:2
21/01/13 19:01:01 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_local1702293927_0001
21/01/13 19:01:01 INFO mapreduce.Job: The url to track the job: http://localhost:8080/
21/01/13 19:01:01 INFO mapreduce.Job: Running job: job_local1702293927_0001
21/01/13 19:01:01 INFO mapred.LocalJobRunner: OutputCommitter set in config null
21/01/13 19:01:01 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
21/01/13 19:01:01 INFO mapred.LocalJobRunner: OutputCommitter is org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
21/01/13 19:01:01 INFO mapred.LocalJobRunner: Waiting for map tasks
21/01/13 19:01:01 INFO mapred.LocalJobRunner: Starting task: attempt_local1702293927_0001_m_000000_0
21/01/13 19:01:01 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
21/01/13 19:01:01 INFO mapred.Task:  Using ResourceCalculatorProcessTree : [ ]
21/01/13 19:01:01 INFO mapred.MapTask: Processing split: file:/root/input2/heheda.txt:0+95
21/01/13 19:01:01 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
21/01/13 19:01:01 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
21/01/13 19:01:01 INFO mapred.MapTask: soft limit at 83886080
21/01/13 19:01:01 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
21/01/13 19:01:01 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
21/01/13 19:01:01 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
21/01/13 19:01:01 INFO mapred.LocalJobRunner: 
21/01/13 19:01:01 INFO mapred.MapTask: Starting flush of map output
21/01/13 19:01:01 INFO mapred.MapTask: Spilling map output
21/01/13 19:01:01 INFO mapred.MapTask: bufstart = 0; bufend = 155; bufvoid = 104857600
21/01/13 19:01:01 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend = 26214340(104857360); length = 57/6553600
21/01/13 19:01:01 INFO mapred.MapTask: Finished spill 0
21/01/13 19:01:01 INFO mapred.Task: Task:attempt_local1702293927_0001_m_000000_0 is done. And is in the process of committing
21/01/13 19:01:01 INFO mapred.LocalJobRunner: map
21/01/13 19:01:01 INFO mapred.Task: Task 'attempt_local1702293927_0001_m_000000_0' done.
21/01/13 19:01:01 INFO mapred.Task: Final Counters for attempt_local1702293927_0001_m_000000_0: Counters: 18
        File System Counters
                FILE: Number of bytes read=296179
                FILE: Number of bytes written=589373
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
        Map-Reduce Framework
                Map input records=5
                Map output records=15
                Map output bytes=155
                Map output materialized bytes=191
                Input split bytes=93
                Combine input records=15
                Combine output records=15
                Spilled Records=15
                Failed Shuffles=0
                Merged Map outputs=0
                GC time elapsed (ms)=6
                Total committed heap usage (bytes)=212860928
        File Input Format Counters 
                Bytes Read=95
21/01/13 19:01:01 INFO mapred.LocalJobRunner: Finishing task: attempt_local1702293927_0001_m_000000_0
21/01/13 19:01:01 INFO mapred.LocalJobRunner: Starting task: attempt_local1702293927_0001_m_000001_0
21/01/13 19:01:01 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
21/01/13 19:01:01 INFO mapred.Task:  Using ResourceCalculatorProcessTree : [ ]
21/01/13 19:01:01 INFO mapred.MapTask: Processing split: file:/root/input2/haha.txt:0+83
21/01/13 19:01:01 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
21/01/13 19:01:01 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
21/01/13 19:01:01 INFO mapred.MapTask: soft limit at 83886080
21/01/13 19:01:01 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
21/01/13 19:01:01 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
21/01/13 19:01:01 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
21/01/13 19:01:01 INFO mapred.LocalJobRunner: 
21/01/13 19:01:01 INFO mapred.MapTask: Starting flush of map output
21/01/13 19:01:01 INFO mapred.MapTask: Spilling map output
21/01/13 19:01:01 INFO mapred.MapTask: bufstart = 0; bufend = 131; bufvoid = 104857600
21/01/13 19:01:01 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend = 26214352(104857408); length = 45/6553600
21/01/13 19:01:01 INFO mapred.MapTask: Finished spill 0
21/01/13 19:01:01 INFO mapred.Task: Task:attempt_local1702293927_0001_m_000001_0 is done. And is in the process of committing
21/01/13 19:01:01 INFO mapred.LocalJobRunner: map
21/01/13 19:01:01 INFO mapred.Task: Task 'attempt_local1702293927_0001_m_000001_0' done.
21/01/13 19:01:01 INFO mapred.Task: Final Counters for attempt_local1702293927_0001_m_000001_0: Counters: 18
        File System Counters
                FILE: Number of bytes read=296465
                FILE: Number of bytes written=589556
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
        Map-Reduce Framework
                Map input records=4
                Map output records=12
                Map output bytes=131
                Map output materialized bytes=151
                Input split bytes=91
                Combine input records=12
                Combine output records=11
                Spilled Records=11
                Failed Shuffles=0
                Merged Map outputs=0
                GC time elapsed (ms)=0
                Total committed heap usage (bytes)=318242816
        File Input Format Counters 
                Bytes Read=83
21/01/13 19:01:01 INFO mapred.LocalJobRunner: Finishing task: attempt_local1702293927_0001_m_000001_0
21/01/13 19:01:01 INFO mapred.LocalJobRunner: map task executor complete.
21/01/13 19:01:01 INFO mapred.LocalJobRunner: Waiting for reduce tasks
21/01/13 19:01:01 INFO mapred.LocalJobRunner: Starting task: attempt_local1702293927_0001_r_000000_0
21/01/13 19:01:01 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
21/01/13 19:01:01 INFO mapred.Task:  Using ResourceCalculatorProcessTree : [ ]
21/01/13 19:01:01 INFO mapred.ReduceTask: Using ShuffleConsumerPlugin: org.apache.hadoop.mapreduce.task.reduce.Shuffle@5caf2357
21/01/13 19:01:01 INFO reduce.MergeManagerImpl: MergerManager: memoryLimit=334338464, maxSingleShuffleLimit=83584616, mergeThreshold=220663392, ioSortFactor=10, memToMemMergeOutputsThreshold=10
21/01/13 19:01:01 INFO reduce.EventFetcher: attempt_local1702293927_0001_r_000000_0 Thread started: EventFetcher for fetching Map Completion Events
21/01/13 19:01:01 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local1702293927_0001_m_000000_0 decomp: 187 len: 191 to MEMORY
21/01/13 19:01:01 INFO reduce.InMemoryMapOutput: Read 187 bytes from map-output for attempt_local1702293927_0001_m_000000_0
21/01/13 19:01:01 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 187, inMemoryMapOutputs.size() -> 1, commitMemory -> 0, usedMemory ->187
21/01/13 19:01:01 WARN io.ReadaheadPool: Failed readahead on ifile
EBADF: Bad file descriptor
        at org.apache.hadoop.io.nativeio.NativeIO$POSIX.posix_fadvise(Native Method)
        at org.apache.hadoop.io.nativeio.NativeIO$POSIX.posixFadviseIfPossible(NativeIO.java:267)
        at org.apache.hadoop.io.nativeio.NativeIO$POSIX$CacheManipulator.posixFadviseIfPossible(NativeIO.java:146)
        at org.apache.hadoop.io.ReadaheadPool$ReadaheadRequestImpl.run(ReadaheadPool.java:206)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
21/01/13 19:01:01 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local1702293927_0001_m_000001_0 decomp: 147 len: 151 to MEMORY
21/01/13 19:01:01 INFO reduce.InMemoryMapOutput: Read 147 bytes from map-output for attempt_local1702293927_0001_m_000001_0
21/01/13 19:01:01 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 147, inMemoryMapOutputs.size() -> 2, commitMemory -> 187, usedMemory ->334
21/01/13 19:01:01 WARN io.ReadaheadPool: Failed readahead on ifile
EBADF: Bad file descriptor
        at org.apache.hadoop.io.nativeio.NativeIO$POSIX.posix_fadvise(Native Method)
        at org.apache.hadoop.io.nativeio.NativeIO$POSIX.posixFadviseIfPossible(NativeIO.java:267)
        at org.apache.hadoop.io.nativeio.NativeIO$POSIX$CacheManipulator.posixFadviseIfPossible(NativeIO.java:146)
        at org.apache.hadoop.io.ReadaheadPool$ReadaheadRequestImpl.run(ReadaheadPool.java:206)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
21/01/13 19:01:01 INFO reduce.EventFetcher: EventFetcher is interrupted.. Returning
21/01/13 19:01:01 INFO mapred.LocalJobRunner: 2 / 2 copied.
21/01/13 19:01:01 INFO reduce.MergeManagerImpl: finalMerge called with 2 in-memory map-outputs and 0 on-disk map-outputs
21/01/13 19:01:01 INFO mapred.Merger: Merging 2 sorted segments
21/01/13 19:01:01 INFO mapred.Merger: Down to the last merge-pass, with 2 segments left of total size: 312 bytes
21/01/13 19:01:01 INFO reduce.MergeManagerImpl: Merged 2 segments, 334 bytes to disk to satisfy reduce memory limit
21/01/13 19:01:01 INFO reduce.MergeManagerImpl: Merging 1 files, 336 bytes from disk
21/01/13 19:01:01 INFO reduce.MergeManagerImpl: Merging 0 segments, 0 bytes from memory into reduce
21/01/13 19:01:01 INFO mapred.Merger: Merging 1 sorted segments
21/01/13 19:01:01 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 321 bytes
21/01/13 19:01:01 INFO mapred.LocalJobRunner: 2 / 2 copied.
21/01/13 19:01:01 INFO Configuration.deprecation: mapred.skip.on is deprecated. Instead, use mapreduce.job.skiprecords
21/01/13 19:01:01 INFO mapred.Task: Task:attempt_local1702293927_0001_r_000000_0 is done. And is in the process of committing
21/01/13 19:01:01 INFO mapred.LocalJobRunner: 2 / 2 copied.
21/01/13 19:01:01 INFO mapred.Task: Task attempt_local1702293927_0001_r_000000_0 is allowed to commit now
21/01/13 19:01:01 INFO output.FileOutputCommitter: Saved output of task 'attempt_local1702293927_0001_r_000000_0' to file:/root/output2/_temporary/0/task_local1702293927_0001_r_000000
21/01/13 19:01:01 INFO mapred.LocalJobRunner: reduce > reduce
21/01/13 19:01:01 INFO mapred.Task: Task 'attempt_local1702293927_0001_r_000000_0' done.
21/01/13 19:01:01 INFO mapred.Task: Final Counters for attempt_local1702293927_0001_r_000000_0: Counters: 24
        File System Counters
                FILE: Number of bytes read=297207
                FILE: Number of bytes written=590035
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
        Map-Reduce Framework
                Combine input records=0
                Combine output records=0
                Reduce input groups=16
                Reduce shuffle bytes=342
                Reduce input records=26
                Reduce output records=16
                Spilled Records=26
                Shuffled Maps =2
                Failed Shuffles=0
                Merged Map outputs=2
                GC time elapsed (ms)=0
                Total committed heap usage (bytes)=318242816
        Shuffle Errors
                BAD_ID=0
                CONNECTION=0
                IO_ERROR=0
                WRONG_LENGTH=0
                WRONG_MAP=0
                WRONG_REDUCE=0
        File Output Format Counters 
                Bytes Written=143
21/01/13 19:01:01 INFO mapred.LocalJobRunner: Finishing task: attempt_local1702293927_0001_r_000000_0
21/01/13 19:01:01 INFO mapred.LocalJobRunner: reduce task executor complete.
21/01/13 19:01:02 INFO mapreduce.Job: Job job_local1702293927_0001 running in uber mode : false
21/01/13 19:01:02 INFO mapreduce.Job:  map 100% reduce 100%
21/01/13 19:01:02 INFO mapreduce.Job: Job job_local1702293927_0001 completed successfully
21/01/13 19:01:02 INFO mapreduce.Job: Counters: 30
        File System Counters
                FILE: Number of bytes read=889851
                FILE: Number of bytes written=1768964
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
        Map-Reduce Framework
                Map input records=9
                Map output records=27
                Map output bytes=286
                Map output materialized bytes=342
                Input split bytes=184
                Combine input records=27
                Combine output records=26
                Reduce input groups=16
                Reduce shuffle bytes=342
                Reduce input records=26
                Reduce output records=16
                Spilled Records=52
                Shuffled Maps =2
                Failed Shuffles=0
                Merged Map outputs=2
                GC time elapsed (ms)=6
                Total committed heap usage (bytes)=849346560
        Shuffle Errors
                BAD_ID=0
                CONNECTION=0
                IO_ERROR=0
                WRONG_LENGTH=0
                WRONG_MAP=0
                WRONG_REDUCE=0
        File Input Format Counters 
                Bytes Read=178
        File Output Format Counters 
                Bytes Written=143

[root@Mymaster ~]# cd output2
[root@Mymaster output2]# ll
total 4
-rw-r--r-- 1 root root 131 Jan 13 19:01 part-r-00000
-rw-r--r-- 1 root root   0 Jan 13 19:01 _SUCCESS
[root@Mymaster output2]# cat part-r-00000 
Khaleesi        2
Link    2
My      2
are     1
bye     2
hello   1
hi      1
home    2
https://blog.csdn.net/m0_52080234       2
is      2
my      2
name    2
ok      1
page    2
to      2
you     1
[root@Mymaster output2]# 

Done!
If you got this far, Hadoop itself is working fine.
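The `word<TAB>count` pairs in `part-r-00000` above are classic word-count output. As a plain-shell illustration of the same map/shuffle/reduce idea (this is just coreutils, not Hadoop; the sample input lines are made up):

```shell
# Emulate word count with coreutils:
# tokenize -> sort (the "shuffle") -> count per key (the "reduce").
printf 'hello my name is Link\nbye bye\n' \
  | tr -s ' ' '\n' \
  | LC_ALL=C sort \
  | uniq -c \
  | awk '{print $2 "\t" $1}' > /tmp/wc_demo.txt
cat /tmp/wc_demo.txt
```

Each output line is a word, a tab, and its count, which is exactly the shape of the reducer output file above.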

Installing a Hadoop pseudo-distributed cluster

A brief introduction

After the Hadoop pseudo-distributed cluster is set up:
one server runs several daemons at once:
NameNode → receives clients' read and write requests and stores metadata (data that describes the data). Analogy: the project manager.
DataNode → performs the actual storing and fetching of data on HDFS. Analogy: the programmers on the project team.
SecondaryNameNode → assists the NameNode in managing the metadata. Analogy: the manager's assistant (secretary).
Analogy: a company with a single employee who plays every role (boss, employee, manager, programmer, cashier, ...).
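Once the cluster is configured and HDFS is started (in the steps that follow), running `jps` on this single node should list all three daemons, roughly like this (the PIDs shown here are illustrative, not from the source):

```
1234 NameNode
1356 DataNode
1542 SecondaryNameNode
1830 Jps
```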

Now, on to the actual hands-on work.

Passwordless SSH login

Introduction

First, here is the state before any setup:

[root@Mymaster output2]# cd
[root@Mymaster ~]# ll ~/.ssh/
ls: cannot access /root/.ssh/: No such file or directory
[root@Mymaster ~]# ll .ssh/
ls: cannot access .ssh/: No such file or directory
[root@Mymaster ~]# ssh Mymaster
The authenticity of host 'mymaster (192.168.8.201)' can't be established.
ECDSA key fingerprint is SHA256:S18Xnq5jGlaByGMauuqmae8WCIN88kze704KfHa40jY.
ECDSA key fingerprint is MD5:e6:c6:37:60:2d:dd:d3:e5:bd:8d:00:cb:32:38:00:26.
Are you sure you want to continue connecting (yes/no)? y
Please type 'yes' or 'no': yes
Warning: Permanently added 'mymaster,192.168.8.201' (ECDSA) to the list of known hosts.
root@mymaster's password:
Last login: Wed Jan 13 16:54:28 2021 from 192.168.8.1
[root@Mymaster ~]#

Getting started
Generate the public/private key pair (RSA → asymmetric encryption).
Private key: after generation it stays on the current node and is never sent over the network.
Public key: after generation it is copied to trusted hosts; it does travel over the network, and even if intercepted it causes no harm.

Just press Enter at every prompt:
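As a side note, the same key pair can be generated with no prompts at all (a sketch using standard OpenSSH flags; this demo writes to /tmp so it cannot clobber a real key, whereas on the real node you would target ~/.ssh/id_rsa):

```shell
# Non-interactive key generation: -N "" sets an empty passphrase, -q is quiet.
rm -rf /tmp/ssh_demo && mkdir -p /tmp/ssh_demo
ssh-keygen -t rsa -b 2048 -N "" -f /tmp/ssh_demo/id_rsa -q
ls -l /tmp/ssh_demo   # id_rsa (private) and id_rsa.pub (public)
```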

[root@Mymaster ~]# ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:+EgMgcNF/l7UsdKrTAgsPh/40esK5c9DT3rVXLJDnxY root@Mymaster
The key's randomart image is:
+---[RSA 2048]----+
| . ++ . |
| +o . o o |
| …= o + |
| . o * + . .o E |
| + + B S .+ = o |
| * =.B… = + |
| . +.+++. o |
| . +o o |
| …+o |
+----[SHA256]-----+
[root@Mymaster ~]#

Checking again shows the key files were generated; take a look for yourself:

[root@Mymaster ~]# exit
logout
Connection to mymaster closed.
[root@Mymaster ~]# ll .ssh/
total 12
-rw------- 1 root root 1675 Jan 13 19:35 id_rsa
-rw-r--r-- 1 root root  395 Jan 13 19:35 id_rsa.pub
-rw-r--r-- 1 root root  184 Jan 13 19:32 known_hosts
[root@Mymaster ~]#

Next, copy the public key to the trusted host:

[root@Mymaster ~]# ssh-copy-id -i root@Mymaster

Enter the password when prompted, and it's done:

[root@Mymaster ~]# ssh-copy-id -i root@Mymaster
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/root/.ssh/id_rsa.pub"
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
root@mymaster's password:
Number of key(s) added: 1
Now try logging into the machine, with: "ssh 'root@Mymaster'"
and check to make sure that only the key(s) you wanted were added.
[root@Mymaster ~]#

Verify it:

[root@Mymaster ~]# ssh Mymaster
Last login: Wed Jan 13 19:41:38 2021 from mymaster
[root@Mymaster ~]# 

No password prompt means it worked.
Now look at the files again:

[root@Mymaster ~]# ll .ssh/
total 16
-rw------- 1 root root  395 Jan 13 19:42 authorized_keys
-rw------- 1 root root 1675 Jan 13 19:35 id_rsa
-rw-r--r-- 1 root root  395 Jan 13 19:35 id_rsa.pub
-rw-r--r-- 1 root root  184 Jan 13 19:32 known_hosts
[root@Mymaster ~]# 
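Note the new `authorized_keys` file: what `ssh-copy-id` did was simply append the public key to that file on the target host, with strict permissions. A sketch of the manual equivalent, run here against a scratch directory with a placeholder key (on a real node the directory would be ~/.ssh and the key your own id_rsa.pub):

```shell
# What ssh-copy-id does on the target host, demonstrated in a scratch dir.
DEMO=/tmp/authkeys_demo
rm -rf "$DEMO" && mkdir -p "$DEMO/.ssh" && chmod 700 "$DEMO/.ssh"
echo 'ssh-rsa AAAAB3...demo-key root@Mymaster' > "$DEMO/id_rsa.pub"  # placeholder public key
cat "$DEMO/id_rsa.pub" >> "$DEMO/.ssh/authorized_keys"               # append the key
chmod 600 "$DEMO/.ssh/authorized_keys"                               # sshd rejects looser permissions
```

The `chmod 700` / `chmod 600` steps matter: sshd silently ignores `authorized_keys` if the file or directory is writable by anyone else.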

The following understanding is offered for reference:
Pseudo-distributed mode is undemanding on hardware, and a pseudo-distributed cluster is plenty for learning big data fundamentals; but if you want to learn big data technology in its authentic form, building a fully distributed cluster is recommended. That was Mr. Xu's advice, and it certainly rings true now!

That wraps up this review session!
Written on 2021-01-13


Original article: http://www.dtmao.cc/news_show_600229.shtml
