11. log into the terminal/Putty , go to downloads folder
Download the sqoop .tar file using the below
Please note that if you do not get this version available , download any other stable version from Apache's mirror http://apache.mirrors.hoobly.com/sqoop/
2. Unzip the file using below:
tar -xzvf
sqoop-1.4.4.bin__hadoop-1.0.0.tar.gz
a 3. Move the unzipped file to the install folder :
4. Log into the Oracle's website below
Download the ojdbc6.jar by accepting the terms and conditions .
Please note that you will have to sign up with oracle to download the file .
5. Copy and paste the ojdbc6.jar to the "~/install/sqoop/lib" folder as given below
6. Go to "/home/hadoop/lab/install/sqoop-1.4.3.bin__hadoop-1.0.0/conf "
open the sqoop-env.sh using a VI editor
Or do the below
cp sqoop-env-template.sh sqoop-env.sh ( This will create the file if it does not exist )
vi sqoop-env.sh
Edit the below path as you have in your machine .
7. Start Hadoop by going to
hadoop@ubuntu-server64:~/lab/install/hadoop-1.0.4/bin$ ./start-all.sh
"jps" should list all the processes running as given above .
For Hadoop Installation , please check my blog on Hadoop installation .
8. Go to "/home/hadoop/lab/install/sqoop-1.4.3.bin__hadoop-1.0.0/bin " , Test the sqoop to be working or not ( This is for oracle for others RDBMS please revisit #4 above and download the appropriate jar)
./sqoop list-databases --connect jdbc:oracle:thin:@192.168.10.100:1521:PROD --username SYSTEM --password manager
If the connection is working fine , it is time to go to the next step . Else There can be couple of issues
1. Correct .JAR is not placed in the sqoop/lib folder
2. User/pass using JDBC doesn't have access to the database .
This can be one type of error
================ERROR=====================================================
hadoop@ubuntu-server64:~/lab/install/sqoop-1.4.3.bin__hadoop-1.0.0/bin$ ./sqoop list-databases --connect jdbc:oracle:thin:@192.168.0.41:1521:PROD --username SPENDOMETER -P
Warning: /usr/lib/hbase does not exist! HBase imports will fail.
Please set $HBASE_HOME to the root of your HBase installation.
Enter password:
13/12/16 14:48:08 INFO manager.SqlManager: Using default fetchSize of 1000
13/12/16 14:48:09 INFO manager.OracleManager: Time zone has been set to GMT
13/12/16 14:48:09 ERROR manager.OracleManager: The catalog view DBA_USERS was not found. This may happen if the user does not have DBA privileges. Please check privileges and try again.
================ERROR=====================================================
9. Execute the below command to pull the data and load into Hive .
./sqoop import --connect jdbc:oracle:thin:@192.168.0.41:1521:PROD --table TEMP_FIRST_NAME --username SPENDOMETER --password spendometer --hive-import --split-by first_name --target-dir /user/hive/warehouse/name
TEMP_FIRST_NAME -> Name of the table
username -> This must always be CAPITIAL LETTER (SPENDOMETER )
password -> This must always be small letter (spendometer )
10. To check if the file has been created or not , use the below command
Hadoop fs -ls /user/hive/warehouse/temp_first_name
This should contain the folders inside which the data will be available
11. Start the hive with command as given below .
If the hive starts and you still don't see the table , you are pointing to the wrong metastore_DB from hive .
12. start the hive using "bash" command as below
hadoop@ubuntu-server64:~/lab/install/sqoop-1.4.3.bin__hadoop-1.0.0/bin$ bash /home/hadoop/lab/install/hive-0.9.0/bin/hive
This should now fetch you data as below
what should we do with the error?
ReplyDeleteWhen I try to list Databases it gave me the DBA users was not found error too.
ERROR manager.OracleManager: The catalog view DBA_USERS was not found. This may happen if the user does not have DBA privileges. Please check privileges and try again.