Highest Voted 'hdfs' Questions - Server Fault Stack Exchange

13

votes

4 answers

In Hadoop, how to show current progress of -copyFromLocal

I am still new to Hadoop, and this time I was trying to process a 106GB file. I used -copyFromLocal to copy that big file to my Hadoop DFS, but since the file is big I have to wait for a long time without a clue about the current…

hadoop hdfs

asked Apr 11 '14 at 04:15

Bang Dao

233
2
6

7

votes

2 answers

HBASE Space Used Started Climbing Rapidly

Update 4,215: After looking at space usage inside of HDFS, I see that .oldlogs is using a lot of space: 1485820612766 /hbase/.oldlogs So new questions: What is it? How do I clean it up? How do I keep it from growing again? What caused it to…

hdfs hbase cloudera opentsdb

asked Dec 04 '14 at 22:41

Kyle Brandt

82,107
71
302
444

7

votes

2 answers

Hadoop HDFS Backup & DR Strategy

We are preparing to implement our first Hadoop cluster. As such, we are starting out small with a four-node setup (1 master node and 3 worker nodes). Each node will have 6TB of storage (6 x 1TB disks). We went with a SuperMicro 4-node chassis so…

backup disaster-recovery hadoop hdfs

asked Aug 13 '13 at 23:32

Matt Keller

221
4
7

6

votes

2 answers

Hadoop HDFS: set file block size from commandline?

I need to set the block size of a file when I load it into HDFS, to some value lower than the cluster block size. For example, if HDFS is using 64MB blocks, I may want a large file to be copied in with 32MB blocks. I've done this before within a…

hadoop block hdfs

asked Aug 11 '11 at 15:22

BigChief

398
1
2
12

5

votes

1 answer

Forward-sync to HDFS? (OR continue an incomplete hdfs upload?)

Does anyone have a good suggestion for doing a forward sync to HDFS? ("Forward sync" in contrast to "bi-directional sync".) Basically, I have a large number of files I want to put into HDFS. It's so large that I'll often, say, lose connectivity before…

rsync synchronization hadoop hdfs

asked Sep 14 '09 at 15:52

Nate Murray

973
1
7
7

5

votes

2 answers

How to fix Hadoop HDFS cluster with missing blocks after one node was reinstalled?

I have a 5-slave Hadoop cluster (using CDH4); slaves are where DataNode and TaskNode run. Each slave has 4 partitions dedicated to HDFS storage. One of the slaves needed a reinstall and this caused one of the HDFS partitions to be lost. At this…

partition hadoop reinstall hdfs

asked Aug 10 '13 at 12:36

Dolan Antenucci

329
1
4
16

5

votes

1 answer

Ceph: Why is a greater number of "placement groups" a "bad thing"?

I have been researching distributed databases and file systems, and while I was originally mostly interested in Hadoop/HBase because I'm a Java programmer, I found this very interesting document about Ceph, which as a major plus point, is now…

distributed-filesystems hdfs

asked Apr 22 '11 at 11:20

monster

608
2
10
17

4

votes

1 answer

mount.nfs: mount system call failed

I am trying to mount HDFS on my local machine running Ubuntu using the following command: sudo mount -t nfs -o vers=3,proto=tcp,nolock 192.168.170.52:/ /mnt/hdfs_mount/ But I am getting this error: mount.nfs: mount system call failed Output…

ubuntu nfs mount hadoop hdfs

asked Jun 28 '17 at 06:23

Bhavya Jain

141
1
1
3

4

votes

2 answers

Upload large files with curl without RAM cache.

I'm using curl to upload large files (from 5 to 20GB) to HOOP based on HDFS (Hadoop Cluster) as follows: curl -f --data-binary "@$file" "$HOOP_HOST$UPLOAD_PATH?user.name=$HOOP_USER&op=create" But when curl uploads a large file, it tries to fully…

http curl hdfs

asked May 24 '15 at 07:53

Gening D.

81
1
5

4

votes

3 answers

Is there a way to grep gzipped content in hdfs without extracting it?

I'm looking for a way to zgrep HDFS files, something like: hadoop fs -zcat hdfs://myfile.gz | grep "hi" or hadoop fs -cat hdfs://myfile.gz | zgrep "hi". Neither really works for me. Is there any way to achieve that from the command line?

hadoop hdfs

asked Jan 22 '15 at 10:49

Jas

701
4
13
23

4

votes

0 answers

java.lang.NullPointerException When Doing A Read in HDFS

I have had a 10-node HBase cluster up and running for the past 4 months. The cluster was set up on VMs in a corporate environment which I do not control, but everything has been working great... until today. Today, every part of the system was down. I…

hdfs hbase

asked Apr 15 '14 at 22:04

JasCav

233
1
12

4

votes

1 answer

Can't connect to HDFS in pseudo-distributed mode

I followed the instructions here for installing Hadoop in pseudo-distributed mode. However, I'm having trouble connecting to HDFS. When I execute this command: ./hadoop fs -ls / I get a directory listing just like I should. However, when I execute…

linux hadoop hdfs hbase

asked Aug 23 '12 at 22:53

sangfroid

193
1
3
10

4

votes

3 answers

What is meant by "streaming data access" in HDFS?

According to the HDFS Architecture page, HDFS was designed for "streaming data access". I'm not sure what that means exactly, but would guess it means an operation like seek is either disabled or has sub-optimal performance. Would this be…

filesystems streaming hadoop hdfs

asked Jul 14 '09 at 10:13

Van Gale

472
1
5
10

3

votes

0 answers

How can I launch hdfs on Mesos without DC/OS?

From my understanding, DC/OS is a freemium managed service. I'd rather have a raw Mesos implementation and not be dependent on DC/OS, so I just want to know how to implement HDFS on Mesos without it. Unfortunately Google is…

hadoop hdfs apache-mesos

asked Feb 23 '17 at 21:10

Dr.Knowitall

209
1
10

3

votes

1 answer

Linux Network tuning to prevent tcp rcvpruned and backlogdrop?

The datanodes in my HBase cluster are triggering some tcp rcvpruned and backlog drops from time to time. There seem to be at least two angles from which to approach this: tune HBase/HDFS etc. so that these are not triggered, or tune the Linux network…

linux networking kernel hdfs hbase

asked Sep 19 '14 at 13:33

Kyle Brandt

82,107
71
302
444
