A few people asked me what software I use for this blog. Here is a blog post that answers those questions and walks you through the different parts of the application.
Requirements ¶
I had a few requirements for the blog software.
- Write something from scratch with Java. It's always fun to play with new libraries and technologies
- The generated blog posts should be static HTML files. No JavaScript on the pages.
- Blog posts should be stored in Markdown files. One file per blog post.
- Blog posts should be stored in a Git repository.
- Full-text search
- Code syntax highlighting
Overview ¶
I wrote a Java application with Spring Boot called Gitblog that covers all these requirements. Here is how it works.
I write the blog posts on my computer in Markdown files (1). Then I commit and push the files to a self-hosted private Git repository (Gitea) (2). The Git server sends a POST request to the Gitblog application (3). Gitblog listens for these requests and pulls the changes from the Git repository into a local repository (4). It then figures out what is new or changed and creates HTML files for all new and changed Markdown files (5). The generated HTML files are stored in the filesystem (6). Additionally, Gitblog writes new sitemap.xml, feed.atom, and feed.rss files and "pings" the Google and Bing search engines with a GET request (7). Nginx handles incoming HTTP requests from readers (8) and sends the generated HTML back (9).
Not all pages are static. Exceptions are the index and feedback pages, which are dynamically generated by Gitblog. Gitblog also maintains a full-text search index with Lucene so users can run full-text queries over all blog posts. Gitblog takes user feedback and sends it to me by email.
Blog Post format ¶
I needed a way to add metadata to each blog post, like the title and date. Because this system only transfers Markdown files and the blog software has no management user interface, I add this information at the beginning of the Markdown file, enclosed in ---. The following example shows what a blog post looks like with all supported headers.
---
summary: The summary for the index page
tags: [java, database]
title: The title for the index page
draft: false
published: 2020-01-01T10:10:59.167Z
updated: 2020-01-10T06:29:02.130Z
---
... Here the blog post ...
Most of this information is used for the index page and for querying the posts. A blog post with draft: true will be converted to HTML but does not appear on the index.html page, is not included in the full-text search index, and is not added to sitemap.xml or the RSS and Atom feeds. I use the draft mode to review posts before I publish them. Publishing means removing the draft line or setting the value to false.
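To illustrate the header handling, here is a minimal sketch of how a ----enclosed header can be split from the body with a regular expression. This is an illustration only; the class name, method, and pattern here are my own, and the actual headerPattern used by Gitblog may differ.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FrontMatterSketch {

    // Matches a header enclosed in --- lines at the start of the file,
    // followed by the post body. DOTALL lets '.' span line breaks.
    static final Pattern HEADER = Pattern.compile(
            "\\A---\\R(.*?)\\R---\\R?(.*)\\z", Pattern.DOTALL);

    // Returns { header, body } or null when the file is not a valid post
    public static String[] split(String content) {
        Matcher m = HEADER.matcher(content);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }
}
```

The extracted header string can then be handed to a YAML parser, which is what Gitblog does with SnakeYAML later in the conversion process.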
Implementation ¶
Here is an overview of all the components of Gitblog.
The system consists of these HTTP endpoint controllers.
GiteaWebhook | Handles requests coming from the Git server and starts the conversion process via MainService |
IndexController | Handles requests to / and /index.html. Queries the Lucene index and returns an HTML page with a list of blog posts |
FeedbackController | Handles feedback requests and sends emails to my account |
These components are responsible for the Markdown -> HTML conversion, creating the feed and sitemap files, and updating the Lucene full-text index:
MainService | Runs the other services when changed Markdown files arrive |
GitService | Responsible for cloning and pulling files from the remote Git repository |
FileService | Manages the conversion process with the help of the following three components |
GitHubCodeService | Fetches code from GitHub and embeds it into the blog post |
MarkdownService | Converts Markdown to HTML |
PrismJsService | Syntax highlights code blocks |
FeedService | Creates the feed files feed.rss and feed.atom |
SitemapService | Creates sitemap.xml and pings Google and Microsoft Bing |
LuceneService | Updates and queries the Lucene full-text index |
Two scheduled jobs run tasks regularly:
URLChecker | Runs once a month, checks all URLs of all blog posts, and creates an HTML report with the errors |
S3Backup | Runs daily, backs up the local Git repository, and sends it to Amazon S3 |
You can find the complete source code of the application on GitHub:
https://github.com/ralscha/gitblog
In the following sections, I will go into more detail about how these components work.
Git ¶
For the Git operations, the application leverages JGit, a library that implements Git in pure Java and does not depend on a native Git installation.
Webhook ¶
The webhook POST handler waits for incoming requests from the remote Git repository. In my case, a self-hosted Gitea repository.
@PostMapping("/webhook")
@ResponseStatus(HttpStatus.NO_CONTENT)
public void handleWebhook(@RequestHeader("X-Hub-Signature") String signature,
        @RequestBody String body) {
    byte[] result = this.mac.doFinal(body.getBytes());
    StringBuilder sb = new StringBuilder();
    for (byte b : result) {
        sb.append(String.format("%02x", b));
    }
    String computedSignature = "sha1=" + sb.toString();
    if (signature.equals(computedSignature)) {
        this.mainService.setup();
    }
}
When the signature matches, the handler triggers the MainService to start the conversion process.
I also wrote a webhook handler for GitHub, in case somebody wants to store the Markdown blog posts in a GitHub repository.
The secret has to be configured correctly in src/main/resources/application.properties.
app.webhook-secret=
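For illustration, here is a hedged sketch of how the HMAC-SHA1 signature checked above can be computed with the JDK's javax.crypto API. The class and method names are my own; the actual Mac setup in Gitblog may differ.

```java
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;

public class WebhookSignature {

    // Computes "sha1=" + hex(HMAC-SHA1(secret, body)), the format the
    // X-Hub-Signature header carries. Names here are illustrative.
    public static String sign(String secret, String body) throws Exception {
        Mac mac = Mac.getInstance("HmacSHA1");
        mac.init(new SecretKeySpec(secret.getBytes(StandardCharsets.UTF_8),
                "HmacSHA1"));
        byte[] result = mac.doFinal(body.getBytes(StandardCharsets.UTF_8));
        StringBuilder sb = new StringBuilder("sha1=");
        for (byte b : result) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }
}
```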
Clone ¶
A git clone is performed the first time Gitblog starts up and the local Git repository does not exist. I use username/password authentication for connecting to my private Gitea instance.
public boolean cloneRepositoryIfNotExists() {
    if (!Files.exists(Paths.get(this.appProperties.getWorkDir()))) {
        try {
            Files.createDirectories(Paths.get(this.appProperties.getWorkDir()));
        }
        catch (IOException e) {
            Application.logger.error("create workdir", e);
        }
        CloneCommand gitCommand = Git.cloneRepository()
                .setURI(this.appProperties.getGitRepository())
                .setDirectory(Paths.get(this.appProperties.getWorkDir()).toFile());
        if (StringUtils.hasText(this.appProperties.getGitRepositoryUser())) {
            UsernamePasswordCredentialsProvider credentialsProvider = new UsernamePasswordCredentialsProvider(
                    this.appProperties.getGitRepositoryUser(),
                    this.appProperties.getGitRepositoryPassword());
            gitCommand.setCredentialsProvider(credentialsProvider);
        }
        try (Git result = gitCommand.call()) {
            return true;
        }
        catch (GitAPIException e) {
            Application.logger.error("clone repository", e);
        }
    }
    return false;
}
Address and credentials are stored in src/main/resources/application.properties
app.git-repository=https://github.com/ralscha/gitblog-test.git
#app.git-repository-user=
#app.git-repository-password=
Pull ¶
A git pull operation is executed each time Gitblog receives a webhook request from the Git server.
The method compares the commits before and after the pull to figure out what has changed. The application does not have to worry about Git conflicts because Gitblog never commits any files to the local Git repository.
The method returns a collection of GitChange objects. Each object encapsulates the type of change (add, modify, delete, rename, or copy) and the old and new paths of the file.
public List<GitChange> pull() {
    List<GitChange> changes = new ArrayList<>();
    try (Git git = Git.open(Paths.get(this.appProperties.getWorkDir()).toFile());
            Repository repository = git.getRepository();
            ObjectReader reader = repository.newObjectReader()) {
        ObjectId oldHead = repository.resolve("HEAD^{tree}");
        if (StringUtils.hasText(this.appProperties.getGitRepositoryUser())) {
            git.pull()
                    .setCredentialsProvider(new UsernamePasswordCredentialsProvider(
                            this.appProperties.getGitRepositoryUser(),
                            this.appProperties.getGitRepositoryPassword()))
                    .call();
        }
        else {
            git.pull().call();
        }
        ObjectId head = repository.resolve("HEAD^{tree}");
        CanonicalTreeParser oldTreeIter = new CanonicalTreeParser();
        oldTreeIter.reset(reader, oldHead);
        CanonicalTreeParser newTreeIter = new CanonicalTreeParser();
        newTreeIter.reset(reader, head);
        List<DiffEntry> diffs = git.diff().setNewTree(newTreeIter)
                .setOldTree(oldTreeIter).call();
        for (DiffEntry entry : diffs) {
            changes.add(new GitChange(entry.getChangeType(), entry.getNewPath(),
                    entry.getOldPath()));
        }
    }
    catch (IOException | GitAPIException e) {
        Application.logger.error("pull", e);
    }
    return changes;
}
If you want to learn more about JGit, check out the official documentation and visit my blog post about JGit.
Page Generation ¶
Overview ¶
The conversion from Markdown to HTML happens in multiple steps.
1. My blog posts are mostly about software development. Therefore, I wanted an easy way to embed source code into a blog post. All my example projects are hosted on GitHub. I implemented a mechanism where I insert a link to a code snippet on GitHub into the Markdown file; Gitblog resolves this link, fetches the code from GitHub, and embeds it into the page. Note that this is only a one-way integration. The blog posts do not automatically change when I push changes to my example projects. Usually, when I change source code, the blog post has to change too.
2. Markdown files are converted to HTML with flexmark-java, a Markdown parser implementation in Java.
3. Another requirement was syntax highlighting for the embedded code blocks. I didn't find a Java library for this purpose, so I ended up integrating the JavaScript library Prism. Usually, you add Prism to the client application, but it's also possible to run Prism on the server and create the syntax-highlighted HTML there. It's quite easy to run JavaScript code on the JVM with the JavaScript engine from GraalVM. Java has a built-in JavaScript engine called Nashorn, but it implements an older ECMAScript standard and will be removed in Java 15.
Implementation ¶
The FileService.regenerateHtml() method handles the HTML generation. The method is called after a Git pull or Git clone operation and receives a collection of all the changed files. It loops over them and calls the method readPost().
public List<PostContent> regenerateHtml(Set<String> changedUrls) {
    List<PostContent> posts = new ArrayList<>();
    for (String url : changedUrls) {
        Path mdFile = this.workDir.resolve(url);
        Application.logger.info("creating html for {}", mdFile);
        PostContent content = readPost(mdFile);
        if (content != null) {
            generateHtml(content);
            posts.add(content);
        }
    }
    return posts;
}
The FileService.readPost() method is responsible for converting the Markdown into HTML. It reads the Markdown file into memory, extracts the header from the content, inserts code segments from GitHub (gitHubCodeService.insertCode()), converts the Markdown to HTML with flexmark-java (markdownService.renderHtml()), and runs the result through the Prism syntax highlighter (prismJsService.prism()).
public PostContent readPost(Path mdFile) {
    try {
        // read markdown
        String content = new String(Files.readAllBytes(mdFile),
                StandardCharsets.UTF_8);
        // extract header
        Matcher matcher = this.headerPattern.matcher(content);
        if (!matcher.matches()) {
            // not a valid post, delete an existing html file
            Files.deleteIfExists(PostMetadata.siblingPath(mdFile, "html"));
            return null;
        }
        String headerString = matcher.group(1);
        PostHeader header = this.yaml.loadAs(headerString, PostHeader.class);
        PostMetadata metadata = new PostMetadata(header, this.workDir, mdFile);
        // insert github code
        String markdown = matcher.group(2);
        markdown = this.gitHubCodeService.insertCode(markdown);
        // convert md to html
        String html = this.markdownService.renderHtml(markdown);
        html = this.prismJsService.prism(html);
        return new PostContent(metadata, markdown, html);
    }
    catch (IOException e) {
        Application.logger.error("readPost", e);
    }
    return null;
}
Lastly, the FileService.generateHtml() method writes the HTML file to the filesystem and precompresses it with Gzip and Brotli.
public void generateHtml(PostContent post) {
    try {
        String postHtml = this.postTemplate.execute(post);
        Path htmlFile = PostMetadata.siblingPath(post.getMetadata().getMdFile(),
                "html");
        Files.write(htmlFile, postHtml.getBytes(StandardCharsets.UTF_8));
        gzip(htmlFile);
        brotli(this.brotliCmd, htmlFile);
    }
    catch (IOException e) {
        Application.logger.error("generate", e);
    }
}
1. GitHub code ¶
The GitHubCodeService service is responsible for fetching code snippets from GitHub and embedding them into the Markdown blog post.
I use this syntax in my blog posts.
[github:https://.../FileService.java#L124-L136]
GitHubCodeService uses a regular expression to find these links, downloads the file from GitHub, and embeds the source code as a code element in the Markdown file. If a fragment (e.g. #L124-L136) is specified, the service embeds only those lines into the blog post; otherwise, the whole file.
Because I often reference the same file multiple times in one post, GitHubCodeService stores downloaded files in a Caffeine cache for 5 minutes. This speeds things up and reduces the number of requests to GitHub.
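A minimal sketch of how such links could be matched with a regular expression. This is purely illustrative; the class name and the actual pattern used in GitHubCodeService may differ.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GitHubLinkSketch {

    // Matches [github:URL] with an optional #Lstart-Lend fragment.
    // Group 1 = URL, group 2 = start line, group 3 = end line (2/3 may be null).
    static final Pattern LINK = Pattern.compile(
            "\\[github:(https://[^#\\]]+)(?:#L(\\d+)-L(\\d+))?\\]");

    // Returns { url, startLine, endLine } for the first link, or null
    public static String[] parse(String markdown) {
        Matcher m = LINK.matcher(markdown);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }
}
```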
2. Markdown -> HTML ¶
The MarkdownService converts the Markdown to HTML with the flexmark-java library.
@Component
public class MarkdownService {

    private final Parser parser;

    private final HtmlRenderer renderer;

    public MarkdownService() {
        MutableDataSet options = new MutableDataSet();
        options.set(Parser.EXTENSIONS,
                Arrays.asList(AutolinkExtension.create(), AnchorLinkExtension.create(),
                        TablesExtension.create(), AbbreviationExtension.create(),
                        InsExtension.create(), SuperscriptExtension.create(),
                        EmojiExtension.create(), DefinitionExtension.create(),
                        FootnoteExtension.create(), BlankAnchorLinkExtension.create()));
        this.parser = Parser.builder(options).build();
        this.renderer = HtmlRenderer.builder(options).build();
    }

    public String renderHtml(String markdown) {
        Node document = this.parser.parse(markdown);
        return this.renderer.render(document);
    }

    public String renderText(String markdown) {
        Node document = this.parser.parse(markdown);
        TextCollectingVisitor textCollectingVisitor = new TextCollectingVisitor();
        return textCollectingVisitor.collectAndGetText(document);
    }
}
The service provides two render methods: one that converts the Markdown to HTML (renderHtml()) and one that converts it to plain text (renderText()). The LuceneService uses the latter to add the blog post text to the full-text search index.
3. Syntax Highlighting ¶
Syntax highlighting the code blocks is the last step in the conversion process. PrismJsService performs this task with the help of the JavaScript library Prism.
public String prism(String html) {
    Document doc = Jsoup.parse(html);
    Elements codeElements = doc.select("code[class*=\"language-\"]");
    for (Element codeElement : codeElements) {
        String lang = "markup";
        for (String cl : codeElement.classNames()) {
            if (cl.startsWith("language-")) {
                lang = cl.substring("language-".length());
            }
        }
        codeElement.html(prism(codeElement.wholeText(), lang));
    }
    return doc.body().html();
}
The service first extracts all the <code> blocks from the HTML, feeds them to the prism() method, and gets back new HTML. For the extraction, the service leverages jsoup, an HTML parser written in Java.
The prism() method invokes Prism. The javax.script.* abstraction calls the underlying GraalVM JavaScript engine and runs the Prism JavaScript library.
private String prism(String code, String language) {
    try {
        String lang = this.aliases.get(language);
        if (lang == null) {
            lang = language;
        }
        if (!this.builtin.contains(lang)) {
            Path componentFile = this.prismComponentsDir
                    .resolve("prism-" + lang + ".js");
            if (Files.exists(componentFile)) {
                try (FileReader fr = new FileReader(componentFile.toFile())) {
                    this.engine.eval(fr);
                }
                this.engine.eval("var lang = Prism.languages." + lang);
            }
            else {
                this.engine.eval("var lang = Prism.languages.markup");
            }
        }
        else {
            this.engine.eval("var lang = Prism.languages." + lang);
        }
        Invocable invocable = (Invocable) this.engine;
        Object result = invocable.invokeMethod(this.engine.get("Prism"), "highlight",
                code, this.engine.get("lang"));
        return (String) result;
    }
    catch (NoSuchMethodException | ScriptException | IOException e) {
        Application.logger.error("prism", e);
        return null;
    }
}
PrismJsService automatically downloads Prism from GitHub if it does not exist locally. The download URL and local location are configured in src/main/resources/application.properties.
app.prism-js-download-url=https://github.com/PrismJS/prism/archive/v1.29.0.zip
app.prism-js-version=prism-1.29.0
app.prism-js-workdir=./prismjs
Prism wraps all recognized tokens of the source code in <span> tags. Here is an example of what the HTML looks like after the Prism run.
<span class="token annotation punctuation">@Bean</span>
<span class="token keyword">public</span> <span class="token class-name">PasswordEncoder</span>
....
The Prism CSS classes are added to the main CSS file. See section CSS a bit further below.
Sitemap Files ¶
The sitemap file is an XML file listing all URLs of a website. Search engines read this file to index pages from your site, which helps them find all pages without having to crawl the whole site.
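For illustration, a generated sitemap.xml might look like this (the URLs are made up):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://golb.hplar.ch/index.html</loc>
  </url>
  <url>
    <loc>https://golb.hplar.ch/2020/01/example-post.html</loc>
  </url>
</urlset>
```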
For generating the sitemap XML file, Gitblog leverages the sitemapgen4j library.
The writeSitemap() method receives a collection of all blog posts and writes the sitemap.xml file.
public void writeSitemap(List<PostMetadata> posts) {
    String baseURL = this.appProperties.getBaseUrl();
    Path workDir = Paths.get(this.appProperties.getWorkDir());
    try {
        WebSitemapGenerator wsg = new WebSitemapGenerator(baseURL);
        wsg.addUrl(baseURL + "index.html");
        for (PostMetadata post : posts) {
            wsg.addUrl(baseURL + post.getUrl());
        }
        String result = wsg.writeAsStrings().stream()
                .collect(Collectors.joining("\n"));
        Path sitemapPath = workDir.resolve("sitemap.xml");
        Files.write(sitemapPath, result.getBytes(StandardCharsets.UTF_8));
        FileService.gzip(sitemapPath);
        FileService.brotli(this.appProperties.getBrotliCmd(), sitemapPath);
    }
    catch (IOException e) {
        Application.logger.error("writeSitemap", e);
    }
}
After creating the sitemap.xml file, Gitblog sends a GET request to Google to inform the search engine that a new sitemap has been generated.
public void pingSearchEngines() {
    String baseURL = this.appProperties.getBaseUrl();
    String sitemapUrl = baseURL + "sitemap.xml";
    OkHttpClient httpClient = new OkHttpClient();
    HttpUrl googlePingUrl = new HttpUrl.Builder().scheme("https").host("google.com")
            .addPathSegment("ping").addQueryParameter("sitemap", sitemapUrl).build();
    ping(httpClient, googlePingUrl);
}

private static void ping(OkHttpClient httpClient, HttpUrl pingUrl) {
    Request request = new Request.Builder().url(pingUrl).build();
    httpClient.newCall(request).enqueue(new Callback() {
        @Override
        public void onFailure(Call call, IOException e) {
            Application.logger.error("sitemap controller", e);
        }

        @Override
        public void onResponse(Call call, Response response) throws IOException {
            if (!response.isSuccessful()) {
                Application.logger.error("Unexpected code " + response);
            }
        }
    });
}
RSS and Atom feeds ¶
Gitblog uses the ROME library to create RSS and Atom feeds.
The following methods of the FeedService class receive a list of all blog posts and write the files feed.atom and feed.rss.
private void rss2(List<PostMetadata> posts) {
    Path baseDir = Paths.get(this.appProperties.getWorkDir());
    Path feedFile = baseDir.resolve("feed.rss");
    try (Writer writer = Files.newBufferedWriter(feedFile)) {
        WireFeedOutput output = new WireFeedOutput();
        output.output(createWireFeed(posts, "rss_2.0"), writer);
    }
    catch (IllegalArgumentException | IOException | FeedException e) {
        Application.logger.error("write rss feed", e);
    }
    if (Files.exists(feedFile)) {
        FileService.gzip(feedFile);
        FileService.brotli(this.appProperties.getBrotliCmd(), feedFile);
    }
}

private void atom1(List<PostMetadata> posts) {
    Path baseDir = Paths.get(this.appProperties.getWorkDir());
    Path feedFile = baseDir.resolve("feed.atom");
    try (Writer writer = Files.newBufferedWriter(feedFile)) {
        WireFeedOutput output = new WireFeedOutput();
        output.output(createWireFeed(posts, "atom_1.0"), writer);
    }
    catch (IllegalArgumentException | IOException | FeedException e) {
        Application.logger.error("write atom feed", e);
    }
    if (Files.exists(feedFile)) {
        FileService.gzip(feedFile);
        FileService.brotli(this.appProperties.getBrotliCmd(), feedFile);
    }
}
Full-text Search ¶
The LuceneService leverages the Apache Lucene library to manage a local full-text search index. You can find the code that indexes the pages here.
CSS ¶
For the CSS, I built a simple npm project that takes the CSS from normalize.css, github-markdown-css, and Prism, plus my custom CSS, and concatenates them together with an npm script.
"scripts": {
"prebuild": "shx mkdir -p build && shx rm -rf dist/* && shx cp src/favicon.ico build",
"build": "npx cleancss -o build/blog-6.css node_modules/normalize.css/normalize.css node_modules/github-markdown-css/github-markdown-light.css node_modules/prismjs/themes/prism.css src/blog.css",
"postbuild": "bread-compressor build"
},
The bread-compressor-cli precompresses the files with Gzip and Brotli. I copy the built CSS file into the assets folder of my local blog post Git repository. From there, I can push it to the Git repository, and Gitblog pulls it onto the server.
Dynamic Pages ¶
Not all pages are static HTML files. index.html and the feedback page are dynamically generated.
Index ¶
The index page is a Mustache template that is read and compiled in the IndexController constructor.
public IndexController(Mustache.Compiler mustacheCompiler,
        LuceneService luceneService) throws IOException {
    this.luceneService = luceneService;
    ClassPathResource cpr = new ClassPathResource("/templates/index.mustache");
    try (InputStream is = cpr.getInputStream();
            InputStreamReader isr = new InputStreamReader(is,
                    StandardCharsets.UTF_8)) {
        this.indexTemplate = mustacheCompiler.withFormatter(new Mustache.Formatter() {
            @Override
            public String format(Object value) {
                if (value instanceof ZonedDateTime) {
                    return ((ZonedDateTime) value).format(this._fmt);
                }
                return String.valueOf(value);
            }

            protected DateTimeFormatter _fmt = DateTimeFormatter
                    .ofPattern("MMMM dd, yyyy", Locale.ENGLISH);
        }).compile(isr);
    }
}
The following method handles incoming requests to / and /index.html.
@GetMapping({ "/", "/index.html" })
public ResponseEntity<?> index(
        @RequestParam(name = "tag", required = false) String tag,
        @RequestParam(name = "query", required = false) String query,
        @RequestParam(name = "year", required = false) String yearString) {
    Integer year = null;
    if (StringUtils.hasText(yearString)) {
        try {
            year = Integer.parseInt(yearString);
        }
        catch (NumberFormatException e) {
            // ignore this
        }
    }
Depending on the request parameters, the method searches the Lucene index for blog posts by tag name, search term, or year.
    Set<Integer> years = this.luceneService.getPublishedYears();
    List<YearNavigation> yearNavigation;
    List<PostMetadata> posts;
    String queryString = null;
    if (StringUtils.hasText(tag)) {
        queryString = "tags:" + tag;
        posts = this.luceneService.searchWithTag(tag);
        yearNavigation = years.stream().map(y -> new YearNavigation(y, false))
                .sorted(Comparator.reverseOrder()).toList();
    }
    else if (StringUtils.hasText(query)) {
        posts = this.luceneService.searchWithQuery(query);
        queryString = query;
        yearNavigation = years.stream().map(y -> new YearNavigation(y, false))
                .sorted(Comparator.reverseOrder()).toList();
    }
    else if (year != null) {
        posts = this.luceneService.getPostsOfYear(year);
        final int queryYear = year;
        yearNavigation = years.stream()
                .map(y -> new YearNavigation(y, y == queryYear))
                .sorted(Comparator.reverseOrder()).toList();
    }
    else {
        int currentYear = LocalDate.now().getYear();
        posts = this.luceneService.getPostsOfYear(currentYear);
        if (posts.isEmpty()) {
            currentYear = currentYear - 1;
            posts = this.luceneService.getPostsOfYear(currentYear);
        }
        final int queryYear = currentYear;
        yearNavigation = years.stream()
                .map(y -> new YearNavigation(y, y == queryYear))
                .sorted(Comparator.reverseOrder()).toList();
    }
    SearchResults result = new SearchResults(posts, queryString, yearNavigation);
Lastly, it dynamically creates the index.html and sends the HTML back to the client.
    String indexHtml = this.indexTemplate.execute(result);
    return ResponseEntity.ok().contentType(MediaType.TEXT_HTML)
            .cacheControl(CacheControl.noCache()).body(indexHtml);
}
Feedback ¶
The feedback pages are also Mustache templates, read and compiled in the constructor of the FeedbackController.
When somebody clicks on the Feedback link, this GET endpoint is called, which sends back the HTML code.
@GetMapping("/feedback/{url}")
public ResponseEntity<?> feedback(@PathVariable("url") String url) {
    String feedbackHtml = this.feedbackTemplate.execute(new Object() {
        @SuppressWarnings({ "unused" })
        String postUrl = url;

        @SuppressWarnings("unused")
        String token = FeedbackController.this.hashids
                .encode(System.currentTimeMillis());
    });
    return ResponseEntity.ok().contentType(MediaType.TEXT_HTML)
            .cacheControl(CacheControl.noCache()).body(feedbackHtml);
}
When the user submits the feedback, this POST handler receives the request, creates an email with Spring's email support, and sends it to me. A hidden honeypot field (name) and a time-based token guard against naive spam bots: the submission is discarded when the honeypot is filled in or the form was submitted less than two seconds after it was rendered.
@PostMapping("/submitFeedback")
public ResponseEntity<?> submitFeedback(
        @RequestParam(name = "url", required = false) String url,
        @RequestParam(name = "token", required = false) String token,
        @RequestParam(name = "feedback", required = false) String feedbackStr,
        @RequestParam(name = "email", required = false) String email,
        @RequestParam(name = "name", required = false) String nameHoney) {
    if (StringUtils.hasText(feedbackStr) && StringUtils.hasText(url)
            && StringUtils.hasText(token) && !StringUtils.hasText(nameHoney)) {
        long[] numbers = this.hashids.decode(token);
        long twoSecondsAgo = System.currentTimeMillis() - 2_000;
        if (numbers.length == 1 && numbers[0] < twoSecondsAgo) {
            this.executorService.submit(() -> {
                SimpleMailMessage mailMessage = new SimpleMailMessage();
                mailMessage.setFrom(this.appProperties.getFeedbackFromEmail());
                mailMessage.setTo(this.appProperties.getFeedbackToEmail());
                if (StringUtils.hasText(email)) {
                    mailMessage.setReplyTo(email);
                }
                mailMessage.setSubject("Feedback: " + url);
                mailMessage.setText(feedbackStr);
                this.mailSender.send(mailMessage);
            });
        }
    }
    String feedbackOkHtml = this.feedbackOkTemplate.execute(null);
    return ResponseEntity.ok().contentType(MediaType.TEXT_HTML)
            .cacheControl(CacheControl.noCache()).body(feedbackOkHtml);
}
In the end, the feedback handler sends back a confirmation HTML page.
URL Checker ¶
My blog posts contain many links, and to keep them up to date I wanted a process that regularly checks them.
The URLChecker goes through all HTML files, extracts the links with the autolink library, and sends a GET request with OkHttp to each URL. The URL checker runs once a month and creates a static HTML report.
One issue was that the blog posts contain many links that can't be checked (for example, http://localhost and file:// URLs). So I needed a list of URLs the checker should ignore. The URL checker looks for a file named ignore-urls.txt in the root of the blog post repository, reads it, and ignores all listed URLs. This way, I can easily add new URLs by inserting them into the text file and committing/pushing the file to the blog post Git repository. Thanks to the Git webhook, the blog software automatically pulls the change.
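For illustration, an ignore-urls.txt might look like this (the entries are made up; check the source for the exact matching rules):

```text
http://localhost
file://
https://example.com/flaky-page.html
```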
Backup ¶
Gitblog automatically creates a backup of the blog post Git repository and stores it in an S3 bucket.
The process is triggered by a Spring @Scheduled method that runs every day at 12:00.
@Scheduled(cron = "0 0 12 * * *")
public void backup() {
The backup method clones the repository as a bare repository into a temporary directory with JGit, zips everything together, and uploads the zip file with the Amazon AWS SDK for Java into an Amazon S3 bucket.
The backup functionality is encapsulated in the S3Backup class.
Production Setup ¶
A few notes about the installation. I run this application on a virtual private server (VPS) under Debian 10.
First, I installed a Java Virtual Machine (JVM). You can download one from many different places; I usually use an OpenJDK build. Other options are AdoptOpenJDK, Oracle, Azul, Red Hat, Amazon, and SAP. I currently use a Java 14 JVM installed in the /opt/java folder.
My blog software is installed in the /opt/gitblog folder. I package the application on my development computer with mvn package and then copy the jar file to my server with SCP.
The jar file is a so-called Spring Boot executable jar. It contains a launch script, which you can see when you print the jar to the console: head -n 290 gitblog.jar. See the Spring Boot documentation for more information.
For the installation, I created an application.properties file in the same directory. This overrides the development configuration file that is packaged into the jar.
Because no environment variable points to the installed JVM, I created a gitblog.conf file with the location of Java.
JAVA_HOME=/opt/java
The .conf file is used to configure the launch script in the jar file. The file has to have the same name as the jar, with the suffix .conf. Visit the Spring documentation for more information about this topic.
The application is managed by systemd. To enable that, I created a systemd service file: gitblog.service
[Unit]
Description=gitblog
After=network.target
After=nginx.service
[Service]
Restart=always
RestartSec=10
ExecStart=/opt/gitblog/gitblog.jar
SuccessExitStatus=143
[Install]
WantedBy=multi-user.target
This configuration is straight from the Spring Boot documentation.
You place this file in the /etc/systemd/system folder and start and enable the application with the systemctl command.
systemctl daemon-reload
systemctl start gitblog
systemctl enable gitblog
Check the status of the application with systemctl status gitblog.
The web server, as mentioned before, is Nginx. Here is the Nginx configuration I use for my blog. Nginx adds some headers, especially for caching.
Let's Encrypt provides the TLS certificates. The webroot points to the /opt/gitblog/posts/ folder, where the Git repository of the blog posts is located. The Spring Boot application listens on port 48899, and Nginx forwards the dynamic URLs (for example /index.html) to the Spring Boot application.
Because the webroot is a Git working directory, I configured Nginx to hide the .git directory and the Markdown files (.md).
server {
    listen 80;
    listen [::]:80;
    server_name golb.hplar.ch blog.rasc.ch blog.ralscha.ch;

    location / {
        return 301 https://golb.hplar.ch$request_uri;
    }
}

server {
    server_name golb.hplar.ch;
    listen [::]:443 ssl http2;
    listen 443 ssl http2;

    ssl_certificate /etc/letsencrypt/live/golb.hplar.ch/fullchain.pem; # managed by Certbot
    ssl_certificate_key /etc/letsencrypt/live/golb.hplar.ch/privkey.pem; # managed by Certbot

    add_header Strict-Transport-Security "max-age=63072000; includeSubDomains; preload";
    add_header Referrer-Policy "no-referrer";
    add_header X-Frame-Options DENY;
    add_header X-Content-Type-Options nosniff;
    add_header X-XSS-Protection "1; mode=block";

    location ~* \.(?:rss|atom|xml)$ {
        root /opt/gitblog/posts/;
        expires 1h;
        add_header Cache-Control "public";
        access_log off;
    }

    location ~* \.(?:css|js|jpg|jpeg|gif|png|ico|cur|gz|svg|svgz|mp4|ogg|ogv|webm|htc|eot|ttf|woff|woff2)$ {
        root /opt/gitblog/posts/;
        expires 1y;
        access_log off;
        add_header Cache-Control "public";
    }

    location / {
        root /opt/gitblog/posts/;
    }

    location = / {
        proxy_pass http://localhost:48899;
    }

    location = /index.html {
        proxy_pass http://localhost:48899;
    }

    location = /submitFeedback {
        access_log off;
        proxy_pass http://localhost:48899;
    }

    location /feedback/ {
        access_log off;
        proxy_pass http://localhost:48899;
    }

    location = /webhook {
        access_log off;
        proxy_pass http://localhost:48899;
    }

    location = /ignore-urls.txt {
        return 404;
    }

    location ~ \.md$ {
        return 404;
    }

    location /.git {
        return 404;
    }
}
That concludes the overview of my blog software. You can find the source code in this GitHub repository: https://github.com/ralscha/gitblog
An example repository with blog posts is available here:
https://github.com/ralscha/gitblog-test
This repository is cloned by the Gitblog application and, at the same time, serves as the webroot directory for the HTTP server. You are free to organize the files in this directory however you want.
If you have more questions, send me a message. If you find bugs or have ideas for enhancements, open an issue and/or create a pull request. Contributions are always welcome.