
We have a Hadoop cluster (Ambari platform, HDP version 2.6.4).

We ran a verification step to check whether we have under-replicated blocks.

The first check was:

su hdfs
hdfs fsck /

It gives these results:


 Total size:    17653549013347 B (Total open files size: 854433698229 B)
 Total dirs:    843714
 Total files:   11752836
 Total symlinks:                0 (Files currently being written: 16)
 Total blocks (validated):      11792203 (avg. block size 1497052 B) (Total open file blocks (not validated): 6381)
 Minimally replicated blocks:   11792203 (100.00001 %)
 Over-replicated blocks:        0 (0.0 %)
 Under-replicated blocks:       0 (0.0 %)
 Mis-replicated blocks:         0 (0.0 %)
 Default replication factor:    3
 Average block replication:     3.0
 Corrupt blocks:                0
 Missing replicas:              0 (0.0 %)
 Number of data-nodes:          6
 Number of racks:               1

As we can see above, Under-replicated blocks is 0.

BUT

when we run the next check:

hdfs dfsadmin -report

we get:

Configured Capacity: 141275429535744 (128.49 TB)
Present Capacity: 140886991802565 (128.14 TB)
DFS Remaining: 84748655941292 (77.08 TB)
DFS Used: 56138335861273 (51.06 TB)
DFS Used%: 39.85%
Under replicated blocks: 4212067
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0

From the above, Under replicated blocks is 4212067.

We want to know which under-replicated block count is correct:

Why do hdfs fsck / and hdfs dfsadmin -report give different results?

BTW, Ambari shows approximately the same results as hdfs dfsadmin -report.
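To put the two counters side by side, here is a minimal sketch in Python. The helper name `under_replicated` is hypothetical (it is not part of Hadoop), and the sample strings are simply the relevant lines copied from the two outputs above; it assumes the counter labels look exactly as they do in this post.

```python
import re

def under_replicated(text):
    """Return the under-replicated block count found in fsck or dfsadmin output."""
    # fsck prints "Under-replicated blocks:", dfsadmin prints "Under replicated blocks:"
    match = re.search(r"Under[- ]replicated blocks:\s+(\d+)", text)
    if match is None:
        raise ValueError("no under-replicated counter found")
    return int(match.group(1))

# Lines copied from the outputs shown in this question
fsck_sample = " Under-replicated blocks:       0 (0.0 %)"
dfsadmin_sample = "Under replicated blocks: 4212067"
total_blocks = 11792203  # "Total blocks (validated)" from fsck

fsck_count = under_replicated(fsck_sample)
report_count = under_replicated(dfsadmin_sample)

print("fsck:", fsck_count)
print("dfsadmin:", report_count)
print("difference:", report_count - fsck_count)
print(f"dfsadmin count as share of validated blocks: {report_count / total_blocks:.1%}")
```

In practice the full text of each command's output could be passed to the same helper instead of the one-line samples.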


King David
