14

I am having some trouble with MongoDB and disk space usage. I used to have a large collection of about 600 million records, totaling 110+ GB on disk. I recently decided to drop it because the data was outdated, and I did so through rockmongo's web interface. rockmongo no longer shows the collection, but my disk usage hasn't changed at all.

Is there a cleanup operation I am not aware of that must be run to synchronize the database with the database files on disk?

I have tried to perform a "repair", but the system complains that there is not enough space on disk, because all of it is used by MongoDB.

user9517
tunnuz

4 Answers

21

As with most database systems, the database files do not shrink when you delete data. The data is just removed or marked as deleted, and the space is reused.

You'll need to run db.repairDatabase() to compact the space, as noted here.

nos
  • 2
    Hard disk space was too low to do that. However, I solved it this way: `mongodump`, `oldDatabase.dropDatabase()`, `mongorestore --db newDatabase dump/oldDatabase`. – tunnuz May 12 '11 at 13:43
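The workaround in the comment above can be sketched as a small shell script. This is only a sketch under assumptions that are not in the original comment: the dump directory must sit on a volume with enough free space (the original disk was full), the shell binary is assumed to be `mongosh` (older installs ship `mongo`), and the `run` helper with its `DRY_RUN` switch is a hypothetical safety wrapper that prints commands instead of executing them.

```shell
#!/bin/sh
# Sketch of the dump / drop / restore workaround.
# Assumptions: DRY_RUN, MONGO_SHELL and run() are illustrative helpers,
# not part of MongoDB. With DRY_RUN=1 (the default) nothing is executed.
DRY_RUN="${DRY_RUN:-1}"
MONGO_SHELL="${MONGO_SHELL:-mongosh}"

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# usage: dump_drop_restore OLD_DB NEW_DB DUMP_DIR
dump_drop_restore() {
    old_db="$1"
    new_db="$2"
    dump_dir="$3"   # should live on a disk that still has free space

    # 1. Export the data that is still wanted.
    run mongodump --db "$old_db" --out "$dump_dir" || return 1
    # 2. Drop the old database, which removes its data files.
    run "$MONGO_SHELL" "$old_db" --eval 'db.dropDatabase()' || return 1
    # 3. Import the dump into a fresh database with compact files.
    run mongorestore --db "$new_db" "$dump_dir/$old_db"
}
```

Calling `dump_drop_restore oldDatabase newDatabase dump` in dry-run mode only lists the three commands, which makes it easy to review them before setting `DRY_RUN=0`.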
7

While the above mongodump/drop/mongorestore approach will work fine from a technical perspective, it will require you to take the database offline while you do so, which would be a service-affecting event.

If you would like to do this without downtime AND you are using MongoDB Replica Sets[1], you could do so like this:

  1. Select a member and stop MongoDB there (service mongodb stop). If this was the PRIMARY, wait for another member to be elected PRIMARY.
  2. Remove the data files on this member (cd /var/lib/mongodb; rm *).
  3. Start the MongoDB service again (service mongodb start).
  4. Wait for the member to resync from the PRIMARY (check with rs.status()). The resync rebuilds only the required (smaller) data files.

Then repeat the above steps for each of the other members in the Replica Set.
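The per-member procedure might look roughly like the following script. It is a sketch, not a tested runbook: the service name and data directory are taken from the steps above and may differ on your system, `mongosh` is an assumed shell binary, `find ... -mindepth 1 -delete` is swapped in for `rm *` so that subdirectories are emptied too, and `run` with `DRY_RUN` is a hypothetical wrapper that only prints commands by default.

```shell
#!/bin/sh
# Sketch: resync one replica set member from scratch.
# Assumptions: DRY_RUN, DATA_DIR, MONGO_SHELL and run() are illustrative.
DRY_RUN="${DRY_RUN:-1}"
DATA_DIR="${DATA_DIR:-/var/lib/mongodb}"
MONGO_SHELL="${MONGO_SHELL:-mongosh}"

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

resync_member() {
    # Refuse to continue with an empty or root data directory.
    [ -n "$DATA_DIR" ] && [ "$DATA_DIR" != "/" ] || return 1

    # 1. Stop MongoDB on this member.
    run service mongodb stop || return 1
    # 2. Empty the data directory (files and subdirectories).
    run find "$DATA_DIR" -mindepth 1 -delete || return 1
    # 3. Start MongoDB again; it begins an initial sync.
    run service mongodb start || return 1
    # 4. Print each member's state; repeat until this one is SECONDARY.
    run "$MONGO_SHELL" --quiet --eval \
        'rs.status().members.forEach(function (m) { print(m.name, m.stateStr); })'
}
```

Run `resync_member` on one member at a time, and only move on once the member reports SECONDARY again.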

[1] https://docs.mongodb.org/manual/tutorial/deploy-replica-set

James Mernin
0

To reclaim disk space in newer versions of MongoDB, use compact rather than repairDatabase. It rewrites and defragments all data and indexes in a collection.

The WiredTiger storage engine maintains lists of empty records in data files as it deletes documents. This space can be reused by WiredTiger, but will not be returned to the operating system except under very specific circumstances.

The amount of empty space available for reuse by WiredTiger is reflected in the output of db.collection.stats() under the heading wiredTiger.block-manager.file bytes available for reuse.
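As a rough illustration, the reusable-space counter can be read from the command line. The sketch below assumes the `mongosh` shell and uses placeholder database and collection names; `reuse_query` and `reusable_bytes` are hypothetical helper names, while the field path is the one quoted above.

```shell
#!/bin/sh
# Sketch: read "file bytes available for reuse" for one collection.
# Assumption: MONGO_SHELL and both function names are illustrative.
MONGO_SHELL="${MONGO_SHELL:-mongosh}"

# Build the shell expression for a given collection name.
reuse_query() {
    printf 'db.getCollection("%s").stats().wiredTiger["block-manager"]["file bytes available for reuse"]' "$1"
}

# usage: reusable_bytes DATABASE COLLECTION
reusable_bytes() {
    "$MONGO_SHELL" "$1" --quiet --eval "$(reuse_query "$2")"
}
```

For example, `reusable_bytes mydb events` would print the number of bytes that WiredTiger could reuse in the hypothetical `events` collection.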

To allow the WiredTiger storage engine to release this empty space to the operating system, you can de-fragment your data file. This can be achieved using the compact command. For more information on its behavior and other considerations, see compact.
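A minimal sketch of invoking compact from the command line follows. It assumes the `mongosh` shell and placeholder names; `compact_cmd` and `compact_collection` are hypothetical helpers that wrap the real `compact` database command.

```shell
#!/bin/sh
# Sketch: run the compact command on one collection.
# Assumption: MONGO_SHELL and both function names are illustrative.
MONGO_SHELL="${MONGO_SHELL:-mongosh}"

# Build the shell expression that issues compact for a collection.
compact_cmd() {
    printf 'printjson(db.runCommand({ compact: "%s" }))' "$1"
}

# usage: compact_collection DATABASE COLLECTION
compact_collection() {
    "$MONGO_SHELL" "$1" --quiet --eval "$(compact_cmd "$2")"
}
```

Since compact works per collection, it would have to be called once for each collection whose space should be released, for example `compact_collection mydb events`.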

Before MongoDB 4.4, running compact blocks all operations on the database. Starting with 4.4, it only blocks collection drops and index creation/deletion.

Always take a backup before running this kind of command.

Preview
0

According to this FAQ https://docs.mongodb.com/manual/faq/storage/#faq-disk-size

the only way is to do the following:

  • set up a fresh, empty replica
  • sync it with the master
  • make it the master
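In replica set terms, these three steps could be sketched as below. This is an assumption-laden outline: the host names are placeholders, `mongosh` is an assumed shell binary, and `run` with `DRY_RUN` is a hypothetical wrapper that only prints commands by default. Note that `rs.stepDown()` merely triggers an election; it does not guarantee that the fresh member wins it.

```shell
#!/bin/sh
# Sketch: add a fresh member, watch it sync, then step down the primary.
# Assumptions: DRY_RUN, MONGO_SHELL, run() and all hosts are illustrative.
DRY_RUN="${DRY_RUN:-1}"
MONGO_SHELL="${MONGO_SHELL:-mongosh}"

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# usage: add_fresh_member PRIMARY_HOST FRESH_HOST
add_fresh_member() {
    run "$MONGO_SHELL" --host "$1" --quiet --eval "rs.add(\"$2\")"
}

# usage: show_member_states PRIMARY_HOST
show_member_states() {
    run "$MONGO_SHELL" --host "$1" --quiet --eval \
        'rs.status().members.forEach(function (m) { print(m.name, m.stateStr); })'
}

# usage: step_down_primary PRIMARY_HOST
step_down_primary() {
    run "$MONGO_SHELL" --host "$1" --quiet --eval 'rs.stepDown()'
}
```

The functions are kept separate on purpose: `step_down_primary` should only be called by hand once `show_member_states` reports the fresh member as SECONDARY.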
Rubycon