Best books for Linux-
1. RHCSA/RHCE Red Hat Linux Certification Study Guide (Exams EX200 & EX300)
2. Red Hat RHCSA/RHCE 7 Cert Guide: Red Hat Enterprise Linux 7 (EX200 and EX300), 1st Edition
3. Linux All-In-One For Dummies, 6th Edition
Saturday, 6 July 2019
Best Book for Hadoop
The best books for Hadoop are listed below-
1. Hadoop: The Definitive Guide, 4th Edition
2. Hadoop for Dummies
3. Hadoop Real-World Solutions Cookbook
4. Hadoop Operations and Cluster Management Cookbook
5. Hadoop 2 Quick-Start Guide: Learn the Essentials of Big Data Computing in the Apache Hadoop 2 Ecosystem, 1st Edition
Monday, 1 July 2019
How to install Ansible on Red Hat / Ubuntu

How to install Ansible on Red Hat-
If your system is already registered with Subscription Manager, run the commands below-
To enable the Ansible Engine repository for RHEL 7, run the following command:
#sudo subscription-manager repos --enable rhel-7-server-ansible-2.8-rpms
# sudo yum repolist
#sudo yum install ansible
# ansible --version
If you have a Red Hat Ansible Engine subscription, register the control node with Red Hat Subscription Manager (RHSM) from the command line and subscribe to the Ansible Engine repository:
First, register your system with RHSM:
# subscription-manager register
Attach your Red Hat Ansible Engine subscription. This command will help you find the Red Hat Ansible
Engine subscription:
# subscription-manager list --available
Grab the pool id of the subscription and run the following:
# subscription-manager attach --pool=<pool id here of engine subscription>
Enable the Red Hat Ansible Engine Repository:
# subscription-manager repos --enable rhel-7-server-ansible-VERSION-rpms (on RHEL 7)
# subscription-manager repos --enable ansible-VERSION-for-rhel-8-x86_64-rpms (on RHEL 8)
Install Ansible Engine:
# yum install ansible
# ansible --version
========================
Install Ansible on Ubuntu
$ sudo apt update
$ sudo apt install software-properties-common
$ sudo apt-add-repository --yes --update ppa:ansible/ansible
$ sudo apt update
$ sudo apt install ansible
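Once Ansible is installed (on either distribution), a quick sanity check is an ad-hoc ping. This is only a minimal sketch; "node1" and the root user are placeholders for whatever hosts you actually manage:
$ ansible --version
$ ansible localhost -m ping (runs the ping module against the implicit localhost)
$ ansible all -i "node1," -m ping -u root (ad-hoc ping of a remote host using an inline inventory; note the trailing comma)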
Saturday, 8 June 2019
HOW TO UNINSTALL MYSQL AND REINSTALL:
sudo apt-get remove --purge 'mysql*'
sudo apt-get purge 'mysql*'
sudo apt-get autoremove
sudo apt-get autoclean
sudo apt-get remove dbconfig-mysql
sudo apt-get dist-upgrade
sudo apt-get install mysql-server
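After the reinstall, you can confirm the service came up (a quick check, assuming a systemd-based Ubuntu; the unit name may differ on other releases):
sudo systemctl status mysql
mysql --version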
============================
HOW TO CHANGE MYSQL ROOT PASSWORD-
$sudo apt install mysql-server
After installing MySQL, if you do not get a prompt to set a root user name and password, a default maintenance user is created that you can log in with. To see it, type-
$ sudo cat /etc/mysql/debian.cnf
Example-
mdeka@master:~$ sudo cat /etc/mysql/debian.cnf
# Automatically generated for Debian scripts. DO NOT TOUCH!
[client]
host = localhost
user = debian-sys-maint
password = 5Ge6hlZwxM0IYJBZ
socket = /var/run/mysqld/mysqld.sock
[mysql_upgrade]
host = localhost
user = debian-sys-maint
password = 5Ge6hlZwxM0IYJBZ
socket = /var/run/mysqld/mysqld.sock
----
Use the credentials from that file to log in-
mysql -u debian-sys-maint -p
Enter password: 5Ge6hlZwxM0IYJBZ
Now -
CHANGE YOUR MYSQL ROOT PASSWORD
-----------
You can simply reset the root password by running the server with --skip-grant-tables and logging in without a password by running the following as root or with sudo:
service mysql stop
mysqld_safe --skip-grant-tables &
mysql -u root
---
Or log in with mysql -u debian-sys-maint -p and do the steps below-
mysql> use mysql;
mysql> update user set authentication_string=PASSWORD("YOUR-NEW-ROOT-PASSWORD") where User='root';
mysql> flush privileges;
mysql> quit
# service mysql stop
# service mysql start
$ mysql -u root -p
Enter Password .......
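Note: the UPDATE ... PASSWORD() statement above only works on older MySQL releases; PASSWORD() is deprecated from 5.7.6 and removed in 8.0. On those versions the usual way is ALTER USER (if the server was started with --skip-grant-tables, run FLUSH PRIVILEGES first so ALTER USER is accepted), for example:
mysql> FLUSH PRIVILEGES;
mysql> ALTER USER 'root'@'localhost' IDENTIFIED BY 'YOUR-NEW-ROOT-PASSWORD';
mysql> FLUSH PRIVILEGES;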
:) enjoy
Friday, 17 August 2018
Steps to install Cloudera on Google Cloud
( For personal use )
Create nodes with 4 vCPU cores each:
node1, node2, node3, node4
Go to API Manager -> service account file.
SSH into all the nodes, become root with sudo -i, set a password with passwd, and keep the same password on all nodes.
Log in as root.
Open /etc/ssh/sshd_config with nano and change the following four settings:
1) PermitRootLogin yes
2) AuthorizedKeysFile /root/.ssh/authorized_keys
3) PasswordAuthentication yes
4) ChallengeResponseAuthentication yes
Press Ctrl+X, then Y, then Enter to save.
service ssh restart
ssh-keygen
service ssh restart
(Do the same steps on all nodes.)
Run ssh-copy-id from every node to every other node, for example:
root@node1# ssh-copy-id root@node2
Do it in both directions between all nodes.
Then run the following on all nodes:
sysctl -w vm.swappiness=0
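sysctl -w only changes the value until the next reboot. To make it persistent, append the setting to /etc/sysctl.conf and reload (a small sketch; note that some Cloudera versions recommend a low non-zero value such as 1 instead of 0):
echo "vm.swappiness=0" >> /etc/sysctl.conf
sysctl -p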
wget https://archive.cloudera.com/cm5/installer/latest/cloudera-manager-installer.bin
Run ls to check that the installer downloaded.
chmod u+x cloudera-manager-installer.bin
sudo ./cloudera-manager-installer.bin (or run it directly without sudo if you are already logged in as root)
NETWORK CHANGE
Networking -> Firewall rules -> default-allow-internet -> Source filter: allow from any source (0.0.0.0/0), then Save.
Do this for all nodes.
This fixes the SSH connectivity issue.
gcloud compute instances add-metadata node1 --metadata enable-oslogin=TRUE
Sqoop Commands
1. sqoop import --connect jdbc:mysql://localhost/sqoop --username root --password 'root123' --table emp --m 1
2. sqoop import --connect jdbc:mysql://localhost/sqoop --username root --password '' --table emp --target-dir /user/root/sqoop
3. sqoop import --connect jdbc:mysql://localhost/sqoop --username root --password '' --table emp --where "sex='f'"
4. sqoop import --connect jdbc:mysql://localhost/sqoop --username root --table emp -P
5. sqoop import --connect jdbc:mysql://localhost/sqoop --username root --password '' --table emp --as-sequencefile
6. sqoop import --connect jdbc:mysql://localhost/sqoop --username root --password '' --table emp --as-avrodatafile
7. sqoop import --connect jdbc:mysql://localhost/sqoop --username root --password '' --table emp --compress
8. sqoop import --connect jdbc:mysql://localhost/sqoop --username root --password '' --table emp --direct
9. sqoop import --connect jdbc:mysql://localhost/sqoop --username root --password '' --table emp --map-column-java id=Long
10. sqoop import --connect jdbc:mysql://localhost/sqoop --username root --password '' --table emp --num-mappers 10
11. sqoop import --connect jdbc:mysql://localhost/sqoop --username root --password '' --table emp --null-string '\\N' \
--null-non-string '\\N'
12. sqoop import-all-tables --connect jdbc:mysql://localhost/sqoop --username root --password ''
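The commands above only cover imports. Going the other way (from HDFS back into MySQL) uses sqoop export; a minimal sketch, assuming the emp table already exists in MySQL and the imported data sits under /user/root/sqoop:
sqoop export --connect jdbc:mysql://localhost/sqoop --username root --password 'root123' --table emp --export-dir /user/root/sqoop --num-mappers 1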
Wednesday, 20 June 2018
Frequently Used HDFS Shell Commands
1.$ hadoop version
2.$hadoop fs -ls /usr (or: hdfs dfs -ls /)
3.$hadoop fs -ls -R /usr (list the contents of all directories recursively)
4.$hadoop fs -ls -R -h /usr (use "-h" to show file sizes in a human-readable format)
5.$hadoop fs -mkdir /usr/hadoop/dir1 (if the /usr/hadoop directory already exists)
6.$hadoop fs -mkdir -p /usr/hadoop/dir1 (if /usr/hadoop does not exist, we must use "-p")
7.$hadoop fs -mkdir -p /usr/hadoop/dir1 /usr/hadoop/dir2 /usr/hadoop/dir3 (create multiple directories at once)
8.$hadoop fs -put <source> <destination> ( Syntax)
9.$hadoop fs -put localfile /usr/hadoop ( single file copy)
10.$hadoop fs -put localfile1 localfile2 /usr/hadoop (multiple file copy)
11.$hadoop fs -put localfile hdfs://host:port/usr/hadoop (single file, with the HDFS schema specified)
12.$hadoop fs -put - hdfs://host:port/usr/hadoop/ (read the user's input from stdin and write it to HDFS)
13.$hadoop fs -get <hdfs_source> <local_destination> (Syntax)
14.$hadoop fs -get /usr/hadoop/SampleTest1.csv localfile (copy the SampleTest1.csv file from HDFS)
15.$hadoop fs -get hdfs://host:port/usr/hadoop/SampleTest1.csv localfile (copy SampleTest1.csv, with the HDFS schema specified, to the local file system)
16.$hadoop fs -cat <path (file name )> (Syntax)
17.$hadoop fs -cat /usr/hadoop/cmd.txt (show the content of the "cmd.txt" file)
18.$hadoop fs -cp [-f] URI [URI ...] <destination> (allows multiple sources; the destination must be a directory; [-f] overwrites) (Syntax)
19.$hadoop fs -cp /usr/hadoop/Test1.txt /usr/hadoop/dir1 (copy test1.txt to dir1 directory)
20.$hadoop fs -cp /usr/hadoop/Test2.txt /usr/hadoop/Test3.txt /usr/hadoop/dir1 ( copy Test2.txt & Test3.txt on /usr/hadoop/dir1)
21.$hadoop fs -copyFromLocal [-f] <localsrc> URI (copy a file from the local file system to HDFS; "-f" forces overwrite) (Syntax)
22.$hadoop fs -copyFromLocal localfile hdfs://host:port/user/hadoop/ (copy a file to HDFS)
23.$hadoop fs -copyFromLocal /localdir hdfs://host:port/usr/hadoop (copy a directory/folder to HDFS)
24.$hadoop fs -copyFromLocal localfile1 localfile2 /usr/hadoop/ (copy multiple files to HDFS)
25.$hadoop fs -copyFromLocal - hdfs://host:port/usr/hadoop/ (read user input from stdin and write it to HDFS)
26.$hadoop fs -copyToLocal URI <localdst> (copy a file/folder from HDFS to the local file system) (Syntax)
27.$hadoop fs -copyToLocal hdfs://host:port/usr/hadoop/Test1.txt (copy the Test1.txt file from HDFS to the local directory)
28.$hadoop fs -copyToLocal hdfs://host:port/usr/hadoop/hdfsdir localdirpath (copy the hdfsdir directory from HDFS to a local path)
29.$hadoop fs -mv URI [ URI…] <dest> (Syntax)
30.$hadoop fs -mv /usr/hadoop/Test1.txt /usr/hadoop/dir
31.$hadoop fs -mv hdfs://host:port/Test1.txt /usr/hadoop/dir
32.$hadoop fs -mv hdfs://host:port/Test2.txt hdfs://host:port/Test3.txt hdfs://host:port/dir (move multiple files Test2.txt ,Test3.txt to dir)
33.$hadoop fs -mv /usr/hadoop/Test4.txt /usr/hadoop/Test4rename.txt (rename)
34.$hadoop fs -rm [-r] [-R] [-skipTrash] URI [URI ...] (Syntax) (if you want a permanent delete, use the -skipTrash option)
35.$hadoop fs -rm /usr/hadoop/dir1/test5.txt
36.$hadoop fs -rm /usr/Hadoop/dir1/test5.text
37.$hadoop fs -rm -r hdfs://host:port/Test7.txt /usr/Hadoop/dir2 (recursively delete a file and a directory)
38.$hadoop fs -rm -skipTrash /usr/Hadoop/Test6.txt
39.$hadoop fs -rmdir [--ignore-fail-on-non-empty] URI [URI ...] (Syntax) (--ignore-fail-on-non-empty: don't fail if a directory still contains files)
40.$hadoop fs -rmdir /usr/Hadoop/dir3 (delete an empty directory)
41.$hadoop fs -rmdir /usr/Hadoop/dir5 hdfs://host:port/usr/Hadoop/dir6 (delete multiple empty directories 'dir5' and 'dir6')
42.$hadoop fs -rmdir /usr/Hadoop/dir5 hdfs://master:9000/usr/Hadoop/dir6 (delete multiple directories)
43.$hadoop fs -rmdir --ignore-fail-on-non-empty /usr/Hadoop/dir4 (do not fail even if the directory is not empty)
44.$hadoop fs -tail [-f] URI (Syntax)
45.$hadoop fs -tail /usr/Hadoop/test1
46.$hadoop fs -du [-s] [-h] URI [URI ...] (Syntax) (to check disk usage)
47.$hadoop fs -du /usr/Hadoop/sampletest1
48.$hadoop fs -du /usr/Hadoop/dir1
49.$hadoop fs -du hdfs://host:port/usr/Hadoop/dir
50.$hadoop fs -du -h /usr/Hadoop/dir (human readable format)
51.$hadoop fs -du -s /usr/Hadoop/dir (summarized format)
52.$hadoop fs -expunge (to delete Trash files)
------------------------------------------------
<property>
<name>fs.trash.interval</name>
<value>1448</value>
</property>
(property in core-site.xml that enables the trash; the value is in minutes)
-------------------------------------------------
53.$hadoop fs -touchz URI [URI…] (Syntax)
54.$hadoop fs -touchz /usr/Hadoop/test1.txt ( create empty file)
55.$hadoop fs -touchz /usr/Hadoop/test2.txt /usr/Hadoop/test3.txt (create multiple empty files)
56.$hadoop fs -stat [format] <path>… (Syntax)
Formatting options:
%b – size in bytes
%F – returns "file", "directory", or "symlink" depending on the type
%g – group name
%n – file name
%o – HDFS block size
%r – replication factor
%u – user name
%y – UTC date as "yyyy-MM-dd HH:mm:ss"
%Y – milliseconds since January 1, 1970 UTC
57. $hadoop fs -stat "%b%F%g%n%o%r%u%y%Y" /usr/Hadoop/hdfs.txt (use stat to collect specific information about a file)
58. $hadoop fs -stat "%b%F%g%n%o%r%u%y%Y" /usr/Hadoop/hdfsdir (use stat to collect information about a directory)
59. $hadoop fs -setrep [-w] <numReplicas> <path> (Syntax) (the -w flag requests that the command wait for the replication to complete)
60. $hadoop fs -setrep 4 /usr/Hadoop/hdfsdir ( change replication for a specific directory )
** Changing “dfs.replication” will only apply to new files you create, but will not modify the replication factor for the already existing files
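For reference, the dfs.replication default mentioned here is set in hdfs-site.xml; a typical entry (3 is the usual default) looks like:
<property>
<name>dfs.replication</name>
<value>3</value>
</property>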
61. $hadoop fs -count [-q] [-h] <path> (Syntax) (count of directories, files, and bytes under the given path and file pattern; -q shows quotas)
62. $hadoop fs -count hdfs://host:port/Test1.txt hdfs://host:port/Test2.txt
63. $hadoop fs -count -q hdfs://host:port/usr/Hadoop/hdfsdir
64. $hadoop fs -count -q -h hdfs://host:port/usr/Hadoop/hdfsdir
65. Hadoop FSCK Commands-
hadoop fsck / (file system check in HDFS)
hadoop fsck / -files (display files during the check)
hadoop fsck / -files -blocks (display files and blocks during the check)
hadoop fsck / -files -blocks -locations (display files, blocks, and their locations during the check)
hadoop fsck /Mdeka/Data -files -blocks -locations -racks (display the network topology for the DataNode locations)
hadoop fsck / -delete (delete corrupted files)
hadoop fsck / -move (move corrupted files to the /lost+found directory)
66. Hadoop Job Commands
hadoop job -submit <job-file> Submit the job
hadoop job -status <job-id> Print job status completion percentage
hadoop job -list all List all jobs
hadoop job -list-active-trackers List all available TaskTrackers
hadoop job -set-priority <job-id> <priority> Set priority for a job. Valid priorities: VERY_HIGH, HIGH, NORMAL, LOW, VERY_LOW
hadoop job -kill-task <task-id> Kill a task
hadoop job -history Display job history including job details, failed and killed jobs
67. Hadoop dfsadmin Commands-
** important
hadoop dfsadmin -report Report filesystem info and statistics
hadoop dfsadmin -metasave file.txt Save namenode’s primary data structures to file.txt
hadoop dfsadmin -setQuota 10 /quotatest Set Hadoop directory quota to only 10 files
hadoop dfsadmin -clrQuota /quotatest Clear Hadoop directory quota
hadoop dfsadmin -refreshNodes Re-read the hosts and exclude files to update which datanodes are allowed to connect to the namenode. Mostly used to commission or decommission nodes
hadoop fs -count -q /mydir Check quota space on directory /mydir
hadoop dfsadmin -setSpaceQuota /mydir 100M Set quota to 100M on hdfs directory named /mydir
hadoop dfsadmin -clrSpaceQuota /mydir Clear quota on a HDFS directory
hadoop dfsadmin -saveNamespace Back up metadata (fsimage & edits). Put the cluster in safe mode before running this command
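Putting that last point together, a typical metadata backup sequence looks like this (a sketch; run it as the HDFS superuser):
hadoop dfsadmin -safemode enter
hadoop dfsadmin -saveNamespace
hadoop dfsadmin -safemode leave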
68. Hadoop Safe Mode (Maintenance Mode) Commands-
hadoop dfsadmin -safemode enter Enter safe mode
hadoop dfsadmin -safemode leave Leave safe mode
hadoop dfsadmin -safemode get Get the status of mode
hadoop dfsadmin -safemode wait Wait until HDFS finishes data block replication
69.Hadoop mradmin Commands-
hadoop mradmin -safemode get Check Job tracker status
hadoop mradmin -refreshQueues Reload mapreduce configuration
hadoop mradmin -refreshNodes Reload active TaskTrackers
hadoop mradmin -refreshServiceAcl Force Jobtracker to reload service ACL
hadoop mradmin -refreshUserToGroupsMappings Force the JobTracker to reload user-to-group mappings
70.Hadoop Balancer Commands -
start-balancer.sh Balance the cluster
hadoop dfsadmin -setBalancerBandwidth <bandwidthinbytes> Adjust bandwidth used by the balancer
hadoop balancer -threshold 20 Run the balancer with a 20% threshold (balance until each node's utilization is within 20% of the cluster average)
71. Hadoop Filesystem Commands
hadoop fs -mkdir mydir Create a directory (mydir) in HDFS
hadoop fs -ls List files and directories in HDFS
hadoop fs -cat myfile View a file content
hadoop fs -du Check disk space usage in HDFS
hadoop fs -expunge Empty trash on HDFS
hadoop fs -chgrp hadoop file1 Change group membership of a file
hadoop fs -chown huser file1 Change file ownership
hadoop fs -rm file1 Delete a file in HDFS
hadoop fs -touchz file2 Create an empty file
hadoop fs -stat file1 Check the status of a file
hadoop fs -test -e file1 Check if file exists on HDFS
hadoop fs -test -z file1 Check if file is empty on HDFS
hadoop fs -test -d file1 Check if file1 is a directory on HDFS
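The -test variants print nothing and instead return an exit status, so they are mainly useful in shell scripts, for example:
hadoop fs -test -e file1 && echo "file1 exists"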
72. Additional Hadoop Filesystem Commands
hadoop fs -copyFromLocal <source> <destination> Copy from the local filesystem to HDFS
hadoop fs -copyFromLocal file1 data e.g: Copies file1 from local FS to data dir in HDFS
hadoop fs -copyToLocal <source> <destination> copy from hdfs to local filesystem
hadoop fs -copyToLocal data/file1 /var/tmp e.g: Copies file1 from HDFS data directory to /var/tmp on local FS
hadoop fs -put <source> <destination> Copy from remote location to HDFS
hadoop fs -get <source> <destination> Copy from HDFS to remote directory
hadoop distcp hdfs://192.168.0.8:8020/input hdfs://192.168.0.8:8020/output Copy data from one cluster to another using the cluster URL
hadoop fs -mv file:///data/datafile /user/hduser/data Move data file from the local directory to HDFS
hadoop fs -setrep -w 3 file1 Set the replication factor for file1 to 3
hadoop fs -getmerge mydir bigfile Merge the files in the mydir directory and download them as one big file
73. Hadoop Namenode Commands
hadoop namenode -format Format HDFS filesystem from Namenode
hadoop namenode -upgrade Upgrade the NameNode
start-dfs.sh Start HDFS Daemons
stop-dfs.sh Stop HDFS Daemons
start-mapred.sh Start MapReduce Daemons
stop-mapred.sh Stop MapReduce Daemons
hadoop namenode -recover -force Recover namenode metadata after a cluster failure (may lose data)