Working with MongoDB Capped Collections from Java

Published: February 04, 2018  •  Updated: February 05, 2018  •  mongodb, java, database

In addition to normal collections, into which you can insert any number of documents, MongoDB also offers capped collections that are limited in size. When an insert into such a collection would exceed the maximum size, MongoDB deletes the oldest documents to make room. The age of a document is determined by its insertion order.

For the following examples I use the Java MongoDB driver and a MongoDB 3.6.2 instance running on localhost:27017.

<dependency>
  <groupId>org.mongodb</groupId>
  <artifactId>mongodb-driver</artifactId>
  <version>3.6.2</version>
</dependency>

pom.xml


Create

In contrast to normal collections, capped collections need to be created with MongoDatabase.createCollection() before the application starts inserting documents.

The following example creates a capped collection log in the test database and limits the size to 256 bytes.

  MongoClient mongoClient = new MongoClient();
  MongoDatabase db = mongoClient.getDatabase("test");

  Set<String> collectionNames = new HashSet<>();
  db.listCollectionNames().into(collectionNames);

  if (!collectionNames.contains("log")) {
    db.createCollection("log", new CreateCollectionOptions()
                                    .capped(true)
                                    .sizeInBytes(256));
  }

Example1.java

256 bytes is the minimum size of a capped collection. If you try to set a smaller size, MongoDB automatically changes the value to 256. The size also has to be a multiple of 256; for example, if you set the size to 1000 bytes, MongoDB automatically rounds the value up to 1024.
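The rounding rules above can be sketched as a small helper. Note that effectiveCappedSize is a made-up name for this sketch; it simply reproduces the behavior described in this section (minimum of 256 bytes, rounded up to the next multiple of 256) and is not part of the MongoDB driver.

```java
public class CappedSize {

  // Reproduces the size rounding described above: MongoDB enforces a
  // minimum of 256 bytes and rounds the size up to a multiple of 256.
  static long effectiveCappedSize(long requestedBytes) {
    long size = Math.max(requestedBytes, 256);
    return ((size + 255) / 256) * 256;
  }

  public static void main(String[] args) {
    System.out.println(effectiveCappedSize(100));  // 256
    System.out.println(effectiveCappedSize(1000)); // 1024
    System.out.println(effectiveCappedSize(1024)); // 1024
  }
}
```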

MongoDB also automatically creates an index on the _id field. If you want to improve insert performance, you can disable this index with autoIndex(false).

  db.createCollection("log", new CreateCollectionOptions()
                                  .capped(true)
                                  .autoIndex(false)
                                  .sizeInBytes(256));

Note that you can only disable this index when the collection is not part of a replica set.


Insert

There is no difference in the code between inserting documents into a capped collection and into a normal collection. But behind the scenes MongoDB automatically deletes the oldest documents in a capped collection when an insert would exceed the specified maximum size.

  for (int j = 0; j < 10; j++) {
    Document logMessage = new Document();
    logMessage.append("index", j);
    logMessage.append("message", "User sr");
    logMessage.append("loggedIn", new Date());
    logMessage.append("loggedOut", new Date());
    collection.insertOne(logMessage);
  }

Example1.java

When we now check the documents in the collection, we see that only the last 2 are stored. Our test document has a size of about 90 bytes, and only 2 documents fit into the 256-byte limit.

  collection.find().forEach((Block<Document>) block -> System.out.println(block.get("index")));
  // 8
  // 9

Example1.java

The find() method returns the documents in insertion order, even without an index on _id. MongoDB guarantees this for capped collections.

To retrieve documents in reverse insertion order, issue a find() with a sort on $natural set to -1.

  Document last = collection.find().sort(Sorts.descending("$natural")).first();
  System.out.println(last.get("index")); // 9

Example1.java


Limit number of documents

It is possible to limit a capped collection not only by size but also by the number of documents. To configure the maximum number, specify the maxDocuments option. The following example limits the collection to a maximum of 3 documents.

  db.createCollection("log",
        new CreateCollectionOptions()
             .capped(true)
             .maxDocuments(3)
             .sizeInBytes(512));

Example2.java

You cannot omit sizeInBytes; this option is mandatory when you create a capped collection. You therefore need to choose a size that is big enough to hold the maximum number of documents. If the size is too small, the maxDocuments limit is never reached, because MongoDB always enforces sizeInBytes first and deletes the oldest documents when that limit is reached. In our example each document has a size of about 90 bytes, so storing 3 of them requires at least 270 bytes. Because the value has to be a multiple of 256, we set it to 512 for this example.
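This calculation can be sketched as a helper that applies the 256-byte rounding from the previous section. The name minSizeInBytes is made up for this sketch; it only estimates a lower bound from the average document size, so in practice you may want to add some headroom.

```java
public class CappedSizing {

  // Smallest valid sizeInBytes that can hold maxDocuments documents of
  // the given average size: at least 256 bytes and a multiple of 256.
  static long minSizeInBytes(int maxDocuments, long avgDocSizeBytes) {
    long needed = Math.max((long) maxDocuments * avgDocSizeBytes, 256);
    return ((needed + 255) / 256) * 256;
  }

  public static void main(String[] args) {
    // 3 documents of about 90 bytes need at least 270 bytes,
    // which the rounding rule raises to 512
    System.out.println(minSizeInBytes(3, 90)); // 512
  }
}
```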

If we add 10 documents and then fetch all documents from the collection we see that only the last 3 documents are stored in the database.

  MongoCollection<Document> collection = db.getCollection("log");

  for (int j = 0; j < 10; j++) {
    Document logMessage = new Document();
    logMessage.append("index", j);
    logMessage.append("message", "User sr");
    logMessage.append("loggedIn", new Date());
    logMessage.append("loggedOut", new Date());
    collection.insertOne(logMessage);
  }

  collection.find()
        .forEach((Block<Document>) block -> System.out.println(block.get("index")));
  // 7
  // 8
  // 9

Example2.java

To figure out the size of your documents, you can run the collStats command, which returns statistics for a particular collection. The field avgObjSize contains the average document size in bytes and gives you a good estimate.

  MongoClient mongoClient = new MongoClient();
  MongoDatabase db = mongoClient.getDatabase("test");

  Document collStats = db.runCommand(new Document("collStats", "log"));
  System.out.println(collStats.toJson());
  System.out.println("Number of Documents: " + collStats.get("count"));
  System.out.println("Size in Bytes: " + collStats.get("size"));
  System.out.println("Average Object size in Bytes : " + collStats.get("avgObjSize"));

Stat.java

The collStats command is also useful if you need to figure out if a collection is capped or not.

  MongoClient mongoClient = new MongoClient();
  MongoDatabase db = mongoClient.getDatabase("test");

  Document collStats = db.runCommand(new Document("collStats", "log"));
  System.out.println("Is capped: " + collStats.get("capped"));
  System.out.println("Max. Documents: " + collStats.get("max"));
  System.out.println("Max. Size in Bytes: " + collStats.get("maxSize"));

Capped.java

The capped field contains true if the collection is capped. The fields max and maxSize contain the configured maximum number of documents and the maximum size in bytes, respectively.


Tailable Cursors

For capped collections you may use a special kind of cursor: a tailable cursor. A tailable cursor remains open after the client has read all results from the initial query and continues to return documents that other clients insert into the collection. This is similar to the tail command on Linux and Unix with the follow option (-f).

The following example creates a capped collection log, then starts a thread that inserts a new document every second.

    try (MongoClient mongoClient = new MongoClient()) {
      MongoDatabase db = mongoClient.getDatabase("test");
      db.drop();

      db.createCollection("log",
          new CreateCollectionOptions().capped(true).sizeInBytes(512));

      MongoCollection<Document> collection = db.getCollection("log");

      AtomicInteger index = new AtomicInteger(0);
      Thread insertThread = new Thread(() -> {
        while (true) {
          try {
            TimeUnit.SECONDS.sleep(1);
          }
          catch (InterruptedException e) {
            // ignore this
          }

          Document logMessage = new Document();
          logMessage.append("index", index.incrementAndGet());
          logMessage.append("message", "User sr");
          logMessage.append("loggedIn", new Date());
          logMessage.append("loggedOut", new Date());
          collection.insertOne(logMessage);
        }
      });
      insertThread.start();

The application then opens a tailable cursor. We have to wrap this call in an outer loop, because a tailable cursor opened on an empty collection is immediately exhausted and closed. The outer loop therefore waits two seconds before it opens a new tailable cursor.

      while (true) {
        try (MongoCursor<Document> cursor = collection.find()
            .cursorType(CursorType.TailableAwait).noCursorTimeout(true).iterator()) {
          while (cursor.hasNext()) {
            Document doc = cursor.next();
            System.out.println(doc.get("index"));
          }
        }
        TimeUnit.SECONDS.sleep(2);
      }
    }

Tail.java

As soon as there are documents in the collection, hasNext() returns true and next() returns the new document. hasNext() then blocks the caller until new documents arrive.
By default, a tailable cursor times out after an inactivity period of 10 minutes. To prevent that, you can set noCursorTimeout(true) and the cursor remains open indefinitely.

See the official documentation for more information about tailable cursors.


Conversion

To convert a normal collection to a capped collection, you can run the database command convertToCapped.
Note that this command locks the whole database; other commands that also lock the whole database have to wait until it finishes.

  Document doc = new Document("convertToCapped", "log");
  doc.append("size", 1024);
  Document result = db.runCommand(doc);

Convert.java

MongoDB reads the documents in natural order and loads them into the capped collection. If not all documents from the source collection fit into the given size, MongoDB automatically deletes the oldest ones based on insertion order.


Instead of converting a collection you can run the command cloneCollectionAsCapped to create a capped copy of a normal collection.

  Document doc = new Document("cloneCollectionAsCapped", "log");
  doc.append("toCollection", "logCapped");
  doc.append("size", 1024);
  Document result = db.runCommand(doc);

Clone.java

This command does not affect the documents in the source collection.


As far as I know, there is no command to convert a capped collection back to a normal one. But you can combine rename, copy, and drop operations to achieve the same result. The following example converts the capped collection log into an uncapped collection with the same name.

  MongoNamespace newName = new MongoNamespace("test", "logOld");
  collection.renameCollection(newName);

  collection = db.getCollection("logOld");
  collection.aggregate(Arrays.asList(Aggregates.out("log")))
            .forEach((Block<Document>) block -> {});
  db.getCollection("logOld").drop();

ToNormal.java

The code first renames the collection to logOld, then copies the documents with the $out aggregation pipeline stage into the newly created log collection, and finally drops logOld.


Limitation

Capped collections have a few limitations.

You cannot delete documents from a capped collection. If you need to delete certain documents, the only way is to copy the documents you want to keep to another collection, then drop() the source collection and re-create it.
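A sketch of this workaround with the Java driver used in this post might look like the following. The collection name logKeep and the index filter are made up for this example, and the code assumes a running MongoDB instance, so treat it as an outline rather than a drop-in solution.

```java
  MongoCollection<Document> capped = db.getCollection("log");

  // 1. copy the documents we want to keep into a temporary collection
  //    ($out cannot write to a capped collection, so the temporary
  //    collection is a normal one)
  capped.aggregate(Arrays.asList(
            Aggregates.match(Filters.gte("index", 5)),
            Aggregates.out("logKeep")))
        .forEach((Block<Document>) block -> {});

  // 2. drop and re-create the capped collection
  capped.drop();
  db.createCollection("log",
      new CreateCollectionOptions().capped(true).sizeInBytes(512));

  // 3. insert the kept documents back in natural order and clean up
  MongoCollection<Document> keep = db.getCollection("logKeep");
  MongoCollection<Document> recreated = db.getCollection("log");
  keep.find().forEach((Block<Document>) doc -> recreated.insertOne(doc));
  keep.drop();
```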

You can update documents in a capped collection, but you have to make sure that the size of the document does not change.

If we try to update the message of our test document to "User sra", we would increase the size of the document by 1 byte. The MongoDB driver throws a MongoWriteException if we run this code.

  collection.updateOne(
       Filters.eq("_id", new ObjectId("5a7364a426ae0a26a81bd4e3")),
       Updates.set("message", "User sra"));
  // com.mongodb.MongoWriteException: Cannot change the size of a document in a capped collection

But this update operation succeeds because the size does not change:

  collection.updateOne(
       Filters.eq("_id", new ObjectId("5a7364a426ae0a26a81bd4e3")),
       Updates.set("message", "User fa"));

If you need to update documents in a capped collection, make sure that you create an index. Without an index, the update operation requires a collection scan.


More information

All the code examples from this blog are stored on GitHub:
https://github.com/ralscha/blog/tree/master/capped

Official MongoDB documentation about capped collections:
https://docs.mongodb.com/manual/core/capped-collections/index.html