Oracle to HDFS using Sqoop


1. Download Oracle Express Edition from the below link and unzip it.

https://docs.google.com/a/edureka.in/file/d/0B2-rlCGKD40NNW5BcHZMTkdtcmc/edit

2. After extracting, you will find an executable file of Oracle XE Edition, as in the below image.

3. Double-click on OracleXEUniv to install the Oracle database on your system and click on Run.

4. Click on Next

5. Accept the license agreement and click on Next.

6. By default, Oracle uses xe as the database name and system as the administrative account. Let us enter a password for this account.

Enter Password -> system

Confirm Password -> system

Click on Next.

7. Click on Install

8. Click on Finish

9. Let us edit the sqlnet.ora file present in the folder

C:\oraclexe\app\oracle\product\10.2.0\server\NETWORK\ADMIN

When you open it, you will find the below content. Edit it as shown in the below screenshot.
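The edited file itself is only shown in the screenshot. A common edit at this point (an assumption on my part, not taken from the original screenshot) is to relax the Windows-native authentication setting so that plain username/password logins from external clients, such as the Sqoop JDBC connection used later, are accepted:

# C:\oraclexe\app\oracle\product\10.2.0\server\NETWORK\ADMIN\sqlnet.ora
# Typically this line reads SQLNET.AUTHENTICATION_SERVICES = (NTS); change it to:
SQLNET.AUTHENTICATION_SERVICES = (NONE)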

10. Let us start the SQL Command Line of Oracle.

Go to Start menu -> All Programs -> Oracle Database 10g Express Edition -> Run SQL Command Line and double-click on it.

11. You will get the CLI of the Oracle database.

12. Let us connect to the Oracle database.

Username: system

Password: system

You will receive the message that the connection was successful.
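For reference, the same login can be done in a single command at the SQL prompt (a small sketch, using the password set in step 6):

Command:

connect system/system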

13. Let us create a simple table.

Command:

create table emp (id number);

14. Let us insert some values into it using the Insert command.

Command:

insert into emp values(2);

15. Let us check whether the data was inserted into the table using the Select command.

Command:

select * from emp;

16. Let us commit the data.

Command:

commit;
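Taken together, steps 13 to 16 look like the following at the SQL prompt (a sketch of the expected session; the select output contains just the single row inserted above):

SQL> create table emp (id number);

Table created.

SQL> insert into emp values(2);

1 row created.

SQL> select * from emp;

        ID
----------
         2

SQL> commit;

Commit complete.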

17. To import the data from the Oracle database using Sqoop, we need to add the Oracle JDBC connector (ojdbc6_g.jar) to Sqoop.

You can download the jar from the below link.

https://docs.google.com/a/edureka.in/file/d/0B2-rlCGKD40Nekw3ZXBRWUU5Y1E/edit

18. Open the Cloudera CDH3 VM and move the Oracle connector to it (to the Desktop) using FileZilla.

Use the below link to understand how to move a file from Windows to the Cloudera CDH3 VM.

https://www.edureka.co/blog/transfer-files-windows-cloudera-demo-vm/

19. Once the Oracle connector is present on the Cloudera CDH3 Desktop, move it to the lib folder of Sqoop by executing the below command:

Command:

sudo cp /home/cloudera/Desktop/ojdbc6_g.jar /usr/lib/sqoop/lib/
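Optionally, you can confirm that the connector is now in place (a quick check, not part of the original steps):

Command:

ls /usr/lib/sqoop/lib/ | grep ojdbc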

20. Change the directory to Sqoop by executing the below command:

Command:

cd /usr/lib/sqoop/

21. Open Command Prompt (CMD) on Windows and check the IPv4 address by executing the below command:

Command:

ipconfig
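In the ipconfig output, look for the IPv4 Address line of the network adapter that the Cloudera VM can reach. The adapter name below is only illustrative; 192.168.46.1 is the example address used in the rest of this tutorial:

Ethernet adapter VMware Network Adapter VMnet8:

   IPv4 Address. . . . . . . . . . . : 192.168.46.1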

22. Import the data of the table emp present in the Oracle database into HDFS by executing the below command:

Required items for the command:

IPv4 Address – your IPv4 address. In my case it is 192.168.46.1

Database SID – xe (the default SID for Oracle Express Edition; the emp table lives in the system schema)

Table Name – emp

Username – system

Password – system

Output Directory – could be any. I have used sqoopoutput1

Command:

sudo bin/sqoop import --connect jdbc:oracle:thin:system/system@192.168.46.1:1521:xe --username system -P --table system.emp --columns "ID" --target-dir /sqoopoutput1 -m 1
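If the import fails, it can help to first confirm that Sqoop can reach the database over the same JDBC connection. One way to do that (not part of the original steps) is Sqoop's eval tool, which runs a single query and prints the result:

Command:

sudo bin/sqoop eval --connect jdbc:oracle:thin:@192.168.46.1:1521:xe --username system -P --query "SELECT COUNT(*) FROM system.emp"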

23. When the command has executed successfully, you will see the 'Retrieved Records' message, as in the below image.

24. Open the Browser and go to the below URL:

URL: http://localhost:50070/dfshealth.jsp

Click on Browse the filesystem

25. Click on sqoopoutput1 directory

26. Click on part-m-00000 file:

27. Below is the data that was imported from the Oracle database:
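The same file can also be inspected from the Cloudera terminal with the HDFS shell, as an alternative to the web UI:

Command:

hadoop fs -ls /sqoopoutput1

hadoop fs -cat /sqoopoutput1/part-m-00000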

Congratulations! You have successfully imported data from the Oracle database to HDFS using Sqoop!
