Apache HBase

HBase is an open-source, non-relational, distributed database modeled after Google's Bigtable and written in Java. It is developed as part of the Apache Software Foundation's Apache Hadoop project and runs on top of HDFS (Hadoop Distributed File System) or Alluxio, providing Bigtable-like capabilities for Hadoop. That is, it provides a fault-tolerant way of storing large quantities of sparse data: small amounts of information caught within a large collection of empty or unimportant data, such as finding the 50 largest items in a group of 2 billion records, or finding the non-zero items representing less than 0.1% of a huge collection.
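To make the idea of sparse data more concrete, the following is a minimal Python sketch of a table that stores only the cells that actually hold a value. It is an illustration of the concept only, not HBase's implementation or client API; the class name `SparseTable`, its methods, and the row and column names are all hypothetical.

```python
from collections import defaultdict


class SparseTable:
    """Toy sparse table: only cells that hold a value take up space.

    Hypothetical illustration; this is not the HBase client API.
    """

    def __init__(self):
        # row key -> {column name -> value}; absent cells are simply not stored
        self._rows = defaultdict(dict)

    def put(self, row, column, value):
        """Store a single cell."""
        self._rows[row][column] = value

    def get(self, row, column, default=None):
        """Read a single cell; .get() on the outer mapping avoids creating empty rows."""
        return self._rows.get(row, {}).get(column, default)

    def cell_count(self):
        """Number of cells that are actually stored."""
        return sum(len(cols) for cols in self._rows.values())


table = SparseTable()
table.put("row-1", "info:name", "alice")
table.put("row-7", "stats:clicks", 42)
```

In this sketch, the storage cost depends on the number of populated cells rather than on the number of possible rows and columns, which is the property that makes a sparse layout attractive for mostly empty datasets.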

Developer(s): Apache Software Foundation
Initial release: March 28, 2008
Stable release:
1.3.x: 1.3.6 / 20 October 2019[1]
1.4.x: 1.4.13 / 29 February 2020[1]
1.6.x: 1.6.0 / 6 March 2020[1]
2.2.x: 2.2.5 / 21 May 2020[1]
Repository: HBase Repository
Written in: Java
Operating system: Cross-platform
Type: Distributed database
License: Apache License 2.0
Website: hbase.apache.org

HBase features compression, in-memory operation, and Bloom filters on a per-column basis, as outlined in the original Bigtable paper.[2] Tables in HBase can serve as the input and output for MapReduce jobs run in Hadoop, and may be accessed through the Java API as well as through REST, Avro, or Thrift gateway APIs. HBase is a column-oriented key-value data store and has been widely adopted because of its lineage with Hadoop and HDFS. It runs on top of HDFS and is well suited to fast read and write operations on large datasets, with high throughput and low input/output latency.
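A Bloom filter, one of the features mentioned above, is a compact probabilistic structure that can answer "definitely not present" or "possibly present" for a key, which lets a store skip reads that cannot succeed. The following is a minimal Python sketch of the general technique, not HBase's own implementation; the class name `BloomFilter`, its parameters, and the sample row keys are assumptions made for illustration.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter: no false negatives, occasional false positives.

    Hypothetical illustration of the general technique only.
    """

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        # One byte per bit keeps the sketch simple; a real filter packs bits
        self.bits = bytearray(num_bits)

    def _positions(self, key):
        # Derive several bit positions by salting one hash function with an index
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        # False means the key was never added; True means it may have been
        return all(self.bits[pos] for pos in self._positions(key))


bf = BloomFilter()
for row_key in ("row-1", "row-2", "row-3"):
    bf.add(row_key)
```

A key that was added always reports as possibly present, while a key that was never added usually, but not always, reports as absent. Larger bit arrays reduce the false-positive rate at the cost of memory.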

HBase is not a direct replacement for a classic SQL database; however, the Apache Phoenix project provides a SQL layer for HBase as well as a JDBC driver that can be integrated with various analytics and business intelligence applications. The Apache Trafodion project provides a SQL query engine with ODBC and JDBC drivers, along with distributed ACID transaction protection across multiple statements, tables, and rows, using HBase as a storage engine.

HBase serves several data-driven websites,[3] although Facebook's Messaging Platform migrated from HBase to MyRocks in 2018.[4][5] Unlike traditional relational databases, HBase does not support SQL scripting; the equivalent logic is instead written in Java, in a style similar to a MapReduce application.

In the parlance of Eric Brewer’s CAP theorem, HBase is a CP-type system: it favors consistency and partition tolerance over availability.

History

Apache HBase began as a project by the company Powerset out of a need to process massive amounts of data for the purposes of natural-language search. It is now a top-level Apache project.

Facebook elected to implement its new messaging platform using HBase in November 2010, but migrated away from HBase in 2018.[4]

As of February 2017, the 1.2.x series is the current stable release line.

Use cases & production deployments

Enterprises that use HBase

The following is a list of notable enterprises that have used or are using HBase:

This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.