
Searching "Have I Been Pwned?" passwords locally with Java

Published: 1. March 2018  •  Updated: 11. March 2023  •  java

haveibeenpwned.com is a service that hosts passwords from data breaches. Currently (March 2023), over 840 million passwords are stored in this database.

On their website, you can check if your passwords are on this list. The service does something clever here: it checks whether a password is on the list without the password, or even its full hash, ever being revealed to the service.

A JavaScript application on the website first calculates the SHA-1 hash and sends the first five characters of the hash to the service. The service responds with a list of all password hashes that start with these five characters. The JavaScript code in the browser then checks if the SHA-1 hash of the password in question matches one on the list. Read more about the algorithm in this blog post from Troy Hunt (the developer of Have I Been Pwned).
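The client-side part of this scheme can be sketched in Java as follows. This is only an illustration, not the code the website runs: the class and method names are made up for this example, and the response lines are invented stand-ins for what the service would return.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HexFormat;
import java.util.List;

public class RangeCheckSketch {

  // Upper-case hex SHA-1 of the password (40 characters).
  static String sha1Hex(String password) {
    try {
      MessageDigest md = MessageDigest.getInstance("SHA-1");
      byte[] digest = md.digest(password.getBytes(StandardCharsets.UTF_8));
      return HexFormat.of().withUpperCase().formatHex(digest);
    }
    catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException("SHA-1 not available", e);
    }
  }

  // Returns the count of the line whose suffix matches, or 0 if none does.
  static int findCount(List<String> responseLines, String suffix) {
    for (String line : responseLines) {
      int colon = line.indexOf(':');
      if (colon > 0 && line.substring(0, colon).equals(suffix)) {
        return Integer.parseInt(line.substring(colon + 1).trim());
      }
    }
    return 0;
  }

  public static void main(String[] args) {
    String hash = sha1Hex("123456");
    String prefix = hash.substring(0, 5); // the only part sent to the service
    String suffix = hash.substring(5);    // never leaves the client

    // Invented response lines; the counts are NOT real data.
    List<String> response = List.of(
        "0".repeat(34) + "A:3",
        suffix + ":42");

    System.out.println(prefix + " -> " + findCount(response, suffix));
    // prints "7C4A8 -> 42"
  }
}
```

The comparison against the returned list happens entirely on the client, so the service only ever learns the five-character prefix.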

The service also provides an API you can access with any HTTP client. Here is an example in Java with the OkHttp library.

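    // md is a MessageDigest for "SHA-1"; HexUtil is a hex-encoding helper not shown here
    String password = "123456";
    // Explicit charset so the hash does not depend on the platform default
    byte[] passwordBytes = md.digest(password.getBytes(StandardCharsets.UTF_8));
    String hex = HexUtil.byteArrayToString(passwordBytes).toUpperCase();
    String prefixHash = hex.substring(0, 5); // sent to the API
    String suffixHash = hex.substring(5);    // compared locally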

    OkHttpClient client = new OkHttpClient();
    String url = "https://api.pwnedpasswords.com/range/" + prefixHash;

    Request request = new Request.Builder().url(url).build();
    try (Response response = client.newCall(request).execute();
        ResponseBody body = response.body()) {
      String hashes = body.string();
      String[] lines = hashes.split("\\r?\\n");

      for (String line : lines) {
        if (line.startsWith(suffixHash)) {
          System.out
              .println("password found, count: " + line.substring(line.indexOf(":") + 1));
          return;
        }
      }
      System.out.println("password not found");

    }

  }

ApiExample.java

You can find more information about the API on this page: https://haveibeenpwned.com/API/v2


If you would rather not use a third-party service and send anything about your passwords over the internet (even if it's only a part of the SHA-1 hash), you can download the whole password database onto your computer and search it locally. This is what we're going to do in this blog post.

Download

The HIBP service does not provide a file with all the passwords that you can download. Instead, the service allows everybody to download the database with the range API. HIBP provides an official downloader, but for this blog post, I want to show you how to write a downloader in Java.

To download the whole database, the program must send a request to the range API for every five-character hex string from 00000 up to FFFFF. That makes 16^5 = 1,048,576 requests. To speed up the process, we can send requests in parallel. Java provides the ExecutorService to create thread pools easily. In this program, I create a thread pool with a fixed number of threads (CPU cores * 4).

  public static void main(String[] args) throws IOException {
    int numThreads = Runtime.getRuntime().availableProcessors() * 4;

    OkHttpClient httpClient = new OkHttpClient();

    try (ExecutorService executor = Executors.newFixedThreadPool(numThreads)) {
      Path outputDir = Paths.get("./pwned");
      Files.createDirectories(outputDir);

      int max = 1024 * 1024;
      for (int i = 0; i < max; i++) {
        String range = getRange(i);
        executor.execute(() -> downloadRange(httpClient, range, outputDir));
      }
    }
  }

Download.java
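The getRange helper called in the loop is not shown above. A minimal sketch, assuming it simply formats the loop index as a five-digit upper-case hex string (the class name is made up for this example):

```java
public class RangeSketch {

  // Assumed behavior of getRange: 0 -> "00000", 1048575 -> "FFFFF".
  static String getRange(int i) {
    return String.format("%05X", i);
  }

  public static void main(String[] args) {
    System.out.println(getRange(0));               // prints "00000"
    System.out.println(getRange(255));             // prints "000FF"
    System.out.println(getRange(1024 * 1024 - 1)); // prints "FFFFF"
  }
}
```

Zero-padding matters here: without it, the first ranges would be requested as "0", "1" and so on, which are not valid five-character prefixes.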

I use the OkHttp library to call the range API. The response of each request is stored in a text file in the folder ./pwned with the name <hash_prefix>.txt

  private static void downloadRange(OkHttpClient httpClient, String hashPrefix,
      Path outputDir) {
    Request request = new Request.Builder().url(RANGE_API + hashPrefix).build();
    try (Response response = httpClient.newCall(request).execute();
        ResponseBody body = response.body();
        InputStream bodyIs = body.byteStream()) {
      // Don't store an error response as if it were a list of hashes
      if (!response.isSuccessful()) {
        System.err.println("Range " + hashPrefix + " failed: HTTP " + response.code());
        return;
      }
      Files.copy(bodyIs, outputDir.resolve(hashPrefix + ".txt"),
          StandardCopyOption.REPLACE_EXISTING);
    }
    catch (IOException e) {
      e.printStackTrace();
    }
  }

Download.java

Depending on your internet connection, this can take a while. Also, make sure that you have enough space on your disk. Storing all the files requires around 33 GB of disk space.

Import

Every file we downloaded in the previous step contains a list of SHA-1 hashes. After the hash follows a colon (:) and the count of how many times the password has been seen in all the data breaches.

0005AD76BD555C1D6D771DE417A4B87E4B4:10
000A8DAE4228F821FB418F59826079BF368:4
000DD7F2A1C68A35673713783CA390C9E93:873
001E225B908BAC31C56DB04D892E47536E0:6
006BAB7FC3113AA73DE3589630FC08218E7:3
...

Note that the hash does not contain the first five characters, the prefix we sent to the range API.
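To make the format concrete, here is a small sketch that turns one line back into a full 40-character hash and its count. The record and method names are made up for this example, and the prefix "00000" is an assumption about which file the sample line came from.

```java
public class HashLineSketch {

  // Full SHA-1 hash (prefix + suffix) together with its breach count.
  record PwnedHash(String sha1, int count) {}

  static PwnedHash parse(String prefix, String line) {
    int colon = line.indexOf(':');
    String suffix = line.substring(0, colon);
    int count = Integer.parseInt(line.substring(colon + 1).trim());
    return new PwnedHash(prefix + suffix, count);
  }

  public static void main(String[] args) {
    // Assumed prefix; in practice it is the name of the file without ".txt"
    PwnedHash h = parse("00000", "000DD7F2A1C68A35673713783CA390C9E93:873");
    System.out.println(h.sha1().length() + " characters, seen " + h.count() + " times");
    // prints "40 characters, seen 873 times"
  }
}
```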

If we want to check if a password is in the list, we could search through all downloaded files, but that would take too much time. So instead, we are going to import the downloaded text files into a database. With that, we can query passwords much faster.

For this example, I wrote an application that stores the data in a Xodus database, a transactional schema-less embedded database from JetBrains, the developers of IntelliJ and Kotlin.

    <dependency>
      <groupId>org.jetbrains.xodus</groupId>
      <artifactId>xodus-environment</artifactId>
      <version>2.0.1</version>
    </dependency>

pom.xml

Xodus is written in Java and Kotlin and can be embedded into any Java application. It is a key/value store and a good fit for this use case. We're using the password SHA-1 hash as the key, and as value, we store the count.

Xodus organizes the data in stores inside environments. Every environment can hold multiple stores; we only need one environment with one store and call it "passwords". Also, note that every database operation in Xodus needs to run inside a transaction.

The import application first lists and sorts all the files in the download folder. This step is important because the application needs to import the hashes in ascending order. Next, the application loops over all files, opens each one, and uses the Files.lines() method to read its content line by line.

    try (Environment env = Environments.newInstance("./pwned_db")) {
      env.executeInTransaction((@NotNull final Transaction txn) -> {
        Store store = env.openStore("passwords", StoreConfig.WITHOUT_DUPLICATES, txn);
        Path inputDir = Paths.get("./pwned");

        final AtomicLong importCounter = new AtomicLong(0L);
        final AtomicLong fileCounter = new AtomicLong(0L);

        List<String> hashFiles = listAllFiles(inputDir);
        int totalFiles = hashFiles.size();
        for (String hashFile : hashFiles) {
          Path inputFile = inputDir.resolve(Paths.get(hashFile));
          try (var linesReader = Files.lines(inputFile)) {
            linesReader.forEach(line -> {
              long c = importCounter.incrementAndGet();
              if (c > 10_000_000) {
                txn.flush();
                System.out.println(
                    "Processed no of files " + fileCounter.get() + " of " + totalFiles);
                importCounter.set(0L);
              }
              String hashPrefix = hashFile.substring(0, hashFile.lastIndexOf("."));
              handleLine(store, txn, hashPrefix, line);
            });

          }
          catch (IOException e) {
            throw new RuntimeException(e);
          }

          fileCounter.incrementAndGet();
        }

        txn.commit();
      });
    }
  }

Importer.java
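The listAllFiles helper is not shown above. A minimal sketch, assuming it returns the sorted names of the .txt files in the download folder; the class name and the demo method are made up for this example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Stream;

public class ListFilesSketch {

  // Sorted file names, e.g. [00000.txt, 00001.txt, ...]. Wraps IOException so
  // the method can be called from inside a lambda.
  static List<String> listAllFiles(Path dir) {
    try (Stream<Path> files = Files.list(dir)) {
      return files
          .filter(Files::isRegularFile)
          .map(p -> p.getFileName().toString())
          .filter(name -> name.endsWith(".txt"))
          .sorted()
          .toList();
    }
    catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Creates three empty files in a temp folder and lists them.
  static List<String> demo() {
    try {
      Path dir = Files.createTempDirectory("pwned-sketch");
      for (String name : List.of("000FF.txt", "00000.txt", "0000A.txt")) {
        Files.createFile(dir.resolve(name));
      }
      return listAllFiles(dir);
    }
    catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(demo()); // prints "[00000.txt, 0000A.txt, 000FF.txt]"
  }
}
```

Plain string sorting is enough here because all names have the same length and use upper-case hex digits, so lexicographic order matches numeric order.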

For each line, the application calls the handleLine method.

  static void handleLine(Store store, Transaction txn, String prefix, String line) {
    String sha1 = line.substring(0, 35);
    int count = Integer.parseInt(line.substring(36).trim());

    ByteIterable key = new ArrayByteIterable(hexStringToByteArray(prefix + sha1));
    store.putRight(txn, key, IntegerBinding.intToCompressedEntry(count));
  }

  private static byte[] hexStringToByteArray(String s) {
    byte[] data = new byte[20];
    for (int i = 0; i < 40; i += 2) {
      data[i / 2] = (byte) ((Character.digit(s.charAt(i), 16) << 4)
          + Character.digit(s.charAt(i + 1), 16));
    }
    return data;
  }

Importer.java

Everything an application wants to store in Xodus must be a ByteIterable, a combination of a byte array and an iterable. The Xodus library provides convenience methods and classes that convert from common Java types to ByteIterable.

The SHA-1 hash is a 40-character hex string, but because Xodus works with byte arrays, we can save space by converting the hex string into a byte array that only occupies 20 bytes and storing that as the key. As mentioned before, each line in the file only contains 35 characters of the hash, everything after the prefix. Therefore, the program adds the prefix (the filename minus the suffix .txt) to the front of the hash string.

Because the application reads the hashes in order, it can store the key/value with the store.putRight method. This method performs much better because it does not perform a search before insertion. This is only possible when the key you want to insert is greater than any other key in the store.
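Why does sorting hex strings give ascending keys? The sketch below illustrates it under one assumption that is not verified here: that Xodus orders keys by comparing their bytes as unsigned values. It uses only the JDK, and the class and method names are made up for this example.

```java
import java.util.Arrays;
import java.util.HexFormat;

public class KeyOrderSketch {

  // True if the bytes of hexA sort strictly before the bytes of hexB
  // when every byte is treated as an unsigned value (0..255).
  static boolean isAscending(String hexA, String hexB) {
    byte[] a = HexFormat.of().parseHex(hexA);
    byte[] b = HexFormat.of().parseHex(hexB);
    return Arrays.compareUnsigned(a, b) < 0;
  }

  public static void main(String[] args) {
    // As strings "7F" < "80", and the same holds for the unsigned bytes.
    System.out.println(isAscending("7F", "80"));     // prints "true"
    // A signed comparison would get this wrong: (byte) 0x80 is -128.
    System.out.println((byte) 0x7F < (byte) 0x80);   // prints "false"
    System.out.println(isAscending("00FF", "0100")); // prints "true"
  }
}
```

Under that assumption, importing the files in sorted order, with the lines inside each file already sorted, always appends a key greater than all previous ones, which is exactly what putRight requires.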

Ensure you have enough free space on your disk before starting the importer. After importing all hashes, the database occupies about 24 GB of space on the disk.

The final application is the search, which checks whether a given password is stored in the database.

First, we need an SHA-1 encoder that converts a plaintext password into a hash. Java has built-in support for that with the MessageDigest class.

  private static MessageDigest md;
  static {
    try {
      md = MessageDigest.getInstance("SHA-1");
    }
    catch (NoSuchAlgorithmException e) {
      // Fail fast instead of continuing with md == null
      throw new IllegalStateException("SHA-1 not available", e);
    }
  }

Search.java

The method that queries for a password takes the Xodus Environment, which represents the database, and the plaintext password. As mentioned before, every database operation needs to run inside a transaction; because this method only reads data, it starts a read-only transaction. The method then calculates the SHA-1 hash with the MessageDigest and queries the database with store.get(), which returns the value (in our case, the count) if the key exists, or null if it does not.

  private static Integer haveIBeenPwned(Environment env, String password) {
    return env.computeInReadonlyTransaction(txn -> {
      Store store = env.openStore("passwords", StoreConfig.WITHOUT_DUPLICATES, txn);
      // Explicit charset so the hash does not depend on the platform default
      byte[] passwordBytes = md.digest(password.getBytes(StandardCharsets.UTF_8));
      ByteIterable key = new ArrayByteIterable(passwordBytes);
      ByteIterable bi = store.get(txn, key);
      if (bi != null) {
        return IntegerBinding.compressedEntryToInt(bi);
      }
      return null;
    });
  }

Search.java

In the main method, we instantiate the Xodus Environment and point it to the location where the database is stored on disk. The method then iterates through a few example passwords, checks if they exist in the database, and prints out their count.

  public static void main(String[] args) {
    try (Environment env = Environments.newInstance("./pwned_db")) {
      for (String pw : Arrays.asList("123456", "password", "654321", "qwerty",
          "letmein")) {
        long start = System.currentTimeMillis();
        Integer count = haveIBeenPwned(env, pw);
        if (count != null) {
          System.out.println("I have been pwned. Number of occurrences: " + count);
        }
        else {
          System.out.println("Password not found");
        }
        System.out.println(System.currentTimeMillis() - start + " ms");
        System.out.println();
      }
    }
  }

Search.java

On my computer, a single lookup takes between 1 and 10 milliseconds.

This concludes our excursion into downloading and storing the HIBP password database locally.

You can find the complete source code on GitHub:
https://github.com/ralscha/blog/tree/master/pwnd