Working with MongoDB Capped Collections from Java

Published: February 04, 2018  •  Updated: February 05, 2018  •  mongodb, java, database

In addition to normal collections, into which you can insert any number of documents, MongoDB also offers capped collections that are limited in size. When an insert into such a collection would exceed the maximum size, MongoDB deletes the oldest documents to make room. The age of a document is determined by its insertion order.

For the following examples I use the Java MongoDB driver and a MongoDB 3.6.2 instance running on localhost:27017.

<dependency>
  <groupId>org.mongodb</groupId>
  <artifactId>mongodb-driver</artifactId>
  <version>3.6.2</version>
</dependency>

pom.xml


Create

In contrast to normal collections, capped collections need to be created with MongoDatabase.createCollection() before the application starts inserting documents.

The following example creates a capped collection log in the test database and limits the size to 256 bytes.

  MongoClient mongoClient = new MongoClient();
  MongoDatabase db = mongoClient.getDatabase("test");

  Set<String> collectionNames = new HashSet<>();
  db.listCollectionNames().into(collectionNames);

  if (!collectionNames.contains("log")) {
    db.createCollection("log", new CreateCollectionOptions()
                                    .capped(true)
                                    .sizeInBytes(256));
  }

Example1.java

256 bytes is the minimum size of a capped collection. If you try to set a smaller size, MongoDB automatically changes the value to 256. The size also has to be a multiple of 256; for example, if you set the size to 1000 bytes, MongoDB automatically rounds the value up to 1024.
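The rounding rules above can be sketched as a small helper. Note that effectiveCappedSize is a made-up name for this sketch; it simply reproduces the behavior described in this section (minimum of 256 bytes, rounded up to the next multiple of 256) and is not part of the MongoDB driver.

```java
public class CappedSize {

  // Reproduces the size rounding described above: MongoDB enforces a
  // minimum of 256 bytes and rounds the size up to a multiple of 256.
  static long effectiveCappedSize(long requestedBytes) {
    long size = Math.max(requestedBytes, 256);
    return ((size + 255) / 256) * 256;
  }

  public static void main(String[] args) {
    System.out.println(effectiveCappedSize(100));  // 256
    System.out.println(effectiveCappedSize(1000)); // 1024
    System.out.println(effectiveCappedSize(1024)); // 1024
  }
}
```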

MongoDB also automatically creates an index on the _id field. If you want to improve insert performance, you can disable this index with autoIndex(false).

  db.createCollection("log", new CreateCollectionOptions()
                                  .capped(true)
                                  .autoIndex(false)
                                  .sizeInBytes(256));

Note that you can only disable this index when the collection is not part of a replica set.


Insert

There is no difference in the code between inserting documents into a capped collection and into a normal collection. But behind the scenes MongoDB automatically deletes the oldest documents in a capped collection when an insert would exceed the specified maximum size.

  for (int j = 0; j < 10; j++) {
    Document logMessage = new Document();
    logMessage.append("index", j);
    logMessage.append("message", "User sr");
    logMessage.append("loggedIn", new Date());
    logMessage.append("loggedOut", new Date());
    collection.insertOne(logMessage);
  }

Example1.java

When we now check the documents in the collection, we see that only the last 2 are stored. Our test document has a size of about 90 bytes, and only 2 documents fit into the 256-byte limit.

  collection.find().forEach((Block<Document>) block -> System.out.println(block.get("index")));
  // 8
  // 9

Example1.java

The find() method returns the documents in insertion order, even without an index on _id. MongoDB guarantees this for capped collections.

To retrieve documents in reverse insertion order, issue a find() with a sort on $natural set to -1.

  Document last = collection.find().sort(Sorts.descending("$natural")).first();
  System.out.println(last.get("index")); // 9

Example1.java


Limit number of documents

It is possible to limit a capped collection not only by size but also by the number of documents. To configure the maximum number, specify the maxDocuments option. The following example limits the collection to a maximum of 3 documents.

  db.createCollection("log",
        new CreateCollectionOptions()
             .capped(true)
             .maxDocuments(3)
             .sizeInBytes(512));

Example2.java

You cannot omit sizeInBytes; this option is mandatory when you create a capped collection. You therefore need to choose a size that is big enough to hold the maximum number of documents. If the size is too small, the maxDocuments limit is never reached, because MongoDB always enforces sizeInBytes first and deletes the oldest documents when that limit is reached. In our example each document has a size of about 90 bytes, so storing 3 of them requires at least 270 bytes. Because the value has to be a multiple of 256, we set it to 512 for this example.
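This calculation can be sketched as a helper that applies the 256-byte rounding from the previous section. The name minSizeInBytes is made up for this sketch; it only estimates a lower bound from the average document size, so in practice you may want to add some headroom.

```java
public class CappedSizing {

  // Smallest valid sizeInBytes that can hold maxDocuments documents of
  // the given average size: at least 256 bytes and a multiple of 256.
  static long minSizeInBytes(int maxDocuments, long avgDocSizeBytes) {
    long needed = Math.max((long) maxDocuments * avgDocSizeBytes, 256);
    return ((needed + 255) / 256) * 256;
  }

  public static void main(String[] args) {
    // 3 documents of about 90 bytes need at least 270 bytes,
    // which the rounding rule raises to 512
    System.out.println(minSizeInBytes(3, 90)); // 512
  }
}
```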

If we add 10 documents and then fetch all documents from the collection we see that only the last 3 documents are stored in the database.

  MongoCollection<Document> collection = db.getCollection("log");

  for (int j = 0; j < 10; j++) {
    Document logMessage = new Document();
    logMessage.append("index", j);
    logMessage.append("message", "User sr");
    logMessage.append("loggedIn", new Date());
    logMessage.append("loggedOut", new Date());
    collection.insertOne(logMessage);
  }

  collection.find()
        .forEach((Block<Document>) block -> System.out.println(block.get("index")));
  // 7
  // 8
  // 9

Example2.java

To figure out the size of your documents, you can run the collStats command, which returns statistics for a particular collection. The field avgObjSize contains the average document size in bytes and gives you a good estimate.

  MongoClient mongoClient = new MongoClient();
  MongoDatabase db = mongoClient.getDatabase("test");

  Document collStats = db.runCommand(new Document("collStats", "log"));
  System.out.println(collStats.toJson());
  System.out.println("Number of Documents: " + collStats.get("count"));
  System.out.println("Size in Bytes: " + collStats.get("size"));
  System.out.println("Average Object size in Bytes : " + collStats.get("avgObjSize"));

Stat.java

The collStats command is also useful if you need to figure out if a collection is capped or not.

  MongoClient mongoClient = new MongoClient();
  MongoDatabase db = mongoClient.getDatabase("test");

  Document collStats = db.runCommand(new Document("collStats", "log"));
  System.out.println("Is capped: " + collStats.get("capped"));
  System.out.println("Max. Documents: " + collStats.get("max"));
  System.out.println("Max. Size in Bytes: " + collStats.get("maxSize"));

Capped.java

The capped field contains true if the collection is capped. The fields max and maxSize contain the configured maximum number of documents and the maximum size in bytes, respectively.


Tailable Cursors

For capped collections you may use a special kind of cursor: a tailable cursor. A tailable cursor remains open after the client has read all results from the initial query and continues to return documents that other clients insert into the collection. This is similar to the tail command on Linux and Unix with the follow option (-f).

The following example creates a capped collection log, then starts a thread that inserts a new document every second.

    try (MongoClient mongoClient = new MongoClient()) {
      MongoDatabase db = mongoClient.getDatabase("test");
      db.drop();

      db.createCollection("log",
          new CreateCollectionOptions().capped(true).sizeInBytes(512));

      MongoCollection<Document> collection = db.getCollection("log");

      AtomicInteger index = new AtomicInteger(0);
      Thread insertThread = new Thread(() -> {
        while (true) {
          try {
            TimeUnit.SECONDS.sleep(1);
          }
          catch (InterruptedException e) {
            // ignore this
          }

          Document logMessage = new Document();
          logMessage.append("index", index.incrementAndGet());
          logMessage.append("message", "User sr");
          logMessage.append("loggedIn", new Date());
          logMessage.append("loggedOut", new Date());
          collection.insertOne(logMessage);
        }
      });
      insertThread.start();

The application then opens a tailable cursor. We have to wrap this call in an outer loop, because a tailable cursor opened on an empty collection is immediately exhausted and closed. The outer loop therefore waits two seconds before it opens a new tailable cursor.

      while (true) {
        try (MongoCursor<Document> cursor = collection.find()
            .cursorType(CursorType.TailableAwait).noCursorTimeout(true).iterator()) {
          while (cursor.hasNext()) {
            Document doc = cursor.next();
            System.out.println(doc.get("index"));
          }
        }
        TimeUnit.SECONDS.sleep(2);
      }
    }

Tail.java

As soon as there are documents in the collection, hasNext() returns true and next() returns the new document. hasNext() then blocks the caller until new documents arrive.
By default, a tailable cursor times out after an inactivity period of 10 minutes. To prevent that, you can set noCursorTimeout(true) and the cursor remains open indefinitely.

See the official documentation for more information about tailable cursors.


Conversion

To convert a normal collection to a capped collection, you can run the database command convertToCapped.
Note that this command locks the whole database; other commands that also lock the whole database have to wait until it finishes.

  Document doc = new Document("convertToCapped", "log");
  doc.append("size", 1024);
  Document result = db.runCommand(doc);

Convert.java

MongoDB reads the documents in natural order and loads them into the capped collection. If not all documents from the source collection fit into the given size, MongoDB automatically deletes the oldest ones based on insertion order.


Instead of converting a collection you can run the command cloneCollectionAsCapped to create a capped copy of a normal collection.

  Document doc = new Document("cloneCollectionAsCapped", "log");
  doc.append("toCollection", "logCapped");
  doc.append("size", 1024);
  Document result = db.runCommand(doc);

Clone.java

This command does not affect the documents in the source collection.


As far as I know, there is no command to convert a capped collection back to a normal one. But you can combine rename, copy, and drop operations to achieve the same result. The following example converts the capped collection log into an uncapped collection with the same name.

  MongoNamespace newName = new MongoNamespace("test", "logOld");
  collection.renameCollection(newName);

  collection = db.getCollection("logOld");
  collection.aggregate(Arrays.asList(Aggregates.out("log")))
            .forEach((Block<Document>) block -> {});
  db.getCollection("logOld").drop();

ToNormal.java

The code first renames the collection to logOld, then copies the documents with the $out aggregation pipeline stage into the newly created log collection, and finally drops logOld.


Limitation

Capped collections have a few limitations.

You cannot delete documents from a capped collection. If you need to delete certain documents, the only way is to copy the documents you want to keep to another collection, then drop() the source collection and re-create it.
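A sketch of this workaround with the Java driver used in this post might look like the following. The collection name logKeep and the index filter are made up for this example, and the code assumes a running MongoDB instance, so treat it as an outline rather than a drop-in solution.

```java
  MongoCollection<Document> capped = db.getCollection("log");

  // 1. copy the documents we want to keep into a temporary collection
  //    ($out cannot write to a capped collection, so the temporary
  //    collection is a normal one)
  capped.aggregate(Arrays.asList(
            Aggregates.match(Filters.gte("index", 5)),
            Aggregates.out("logKeep")))
        .forEach((Block<Document>) block -> {});

  // 2. drop and re-create the capped collection
  capped.drop();
  db.createCollection("log",
      new CreateCollectionOptions().capped(true).sizeInBytes(512));

  // 3. insert the kept documents back in natural order and clean up
  MongoCollection<Document> keep = db.getCollection("logKeep");
  MongoCollection<Document> recreated = db.getCollection("log");
  keep.find().forEach((Block<Document>) doc -> recreated.insertOne(doc));
  keep.drop();
```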

You can update documents in a capped collection, but you have to make sure that the size of the document does not change.

If we try to update the message of our test document to "User sra", we would increase the size of the document by 1 byte. The MongoDB driver throws a MongoWriteException if we run this code.

  collection.updateOne(
       Filters.eq("_id", new ObjectId("5a7364a426ae0a26a81bd4e3")),
       Updates.set("message", "User sra"));
  // com.mongodb.MongoWriteException: Cannot change the size of a document in a capped collection

But this update operation succeeds because the size does not change:

  collection.updateOne(
       Filters.eq("_id", new ObjectId("5a7364a426ae0a26a81bd4e3")),
       Updates.set("message", "User fa"));

If you need to update documents in a capped collection, make sure that you create an index. Without an index, the update operation requires a collection scan.


More information

All the code examples from this blog are stored on GitHub:
https://github.com/ralscha/blog/tree/master/capped

Official MongoDB documentation about capped collections:
https://docs.mongodb.com/manual/core/capped-collections/index.html