
Daily Learnings: Wed, Jan 10, 2024

Time you enjoy wasting, was not wasted. — John Lennon

Notes on Astro - API Routes & Static Site Generation

While preparing my personal website for its initial launch, I’ve learned a lot about Astro, the JavaScript meta-framework I’m using to generate the static assets.

To bootstrap the site I used the open-source AstroPaper site template, which came with a TON of great code out of the box. One of its features is dynamic OpenGraph (OG) image generation using satori, another open-source package from the Vercel team.

While reviewing the code located at pages/posts/[slug]/index.png.ts (see snippet below), I was concerned that the OG images were being generated dynamically at “runtime”, that is, whenever a given HTML page loads, meaning that I’d need to run a server or serverless function to create the images. For context, my plan was to use a static-site generator, as I don’t want my site to depend on any servers that I’ll have to maintain, so I did some further digging.

import type { APIRoute } from "astro";
import { getCollection, type CollectionEntry } from "astro:content";

import { generateOgImageForPost } from "@utils/generateOgImages";
import { slugifyStr } from "@utils/slugify";

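export async function getStaticPaths() {
  // Skip drafts and posts that already define their own OG image.
  const posts = await getCollection("blog").then(p =>
    p.filter(({ data }) => !data.draft && !data.ogImage)
  );

  // One route (and therefore one PNG) per remaining post.
  return posts.map(post => ({
    params: { slug: slugifyStr(post.data.title) },
    props: post,
  }));
}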

export const GET: APIRoute = async ({ props }) =>
  new Response(await generateOgImageForPost(props as CollectionEntry<"blog">), {
    headers: { "Content-Type": "image/png" },
  });

After reviewing the Astro Docs on Endpoints, I was immediately relieved:

Astro lets you create custom endpoints to serve any kind of data. You can use this to generate images, expose an RSS document, or use them as API Routes to build a full API for your site.

In statically-generated sites, your custom endpoints are called at build time to produce static files. If you opt in to SSR mode, custom endpoints turn into live server endpoints that are called on request. Static and SSR endpoints are defined similarly, but SSR endpoints support additional features.

After reading this I ran a full build of my site (which, again, isn’t published yet) and lo and behold: PNGs were generated for every HTML page automatically, and I had no reason to fear.
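To make the build-time behavior concrete, here is a toy model of what a static build does with an endpoint file. It is only a sketch: every name in it (buildStaticFiles, renderEndpoint, the sample titles, the slug logic) is hypothetical, the handler is synchronous for brevity where the real one is async, and none of it is Astro's actual implementation.

```typescript
// Toy model of a static build visiting an endpoint. All names are hypothetical.

type StaticPath = { params: { slug: string }; props: { title: string } };

// Stand-in for the endpoint's getStaticPaths(): one entry per post.
function getStaticPaths(): StaticPath[] {
  const titles = ["Hello World", "Astro Notes"];
  return titles.map(title => ({
    params: { slug: title.toLowerCase().replace(/\s+/g, "-") },
    props: { title },
  }));
}

// Stand-in for the GET handler: returns the contents of one output file.
// (The real handler is async and returns PNG bytes in a Response.)
function renderEndpoint(props: { title: string }): string {
  return `fake image for ${props.title}`;
}

// The build calls the handler once per path and keeps the result as a file.
// Nothing in this function runs again when a visitor requests the file.
function buildStaticFiles(): Map<string, string> {
  const files = new Map<string, string>();
  for (const { params, props } of getStaticPaths()) {
    files.set(`posts/${params.slug}/index.png`, renderEndpoint(props));
  }
  return files;
}

const builtFiles = buildStaticFiles();
```

The point of the model is that the handler's work happens inside the build loop, so the output directory ends up holding plain files that any static host can serve.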

Key Takeaways

Disallowing LLMs from Crawling Your Site

I was doing some research on how to update my robots.txt to block the web scrapers used by LLMs from collecting my site’s data for training purposes. Today I came across a really good article on how to accomplish this. It’s well written and gave me the exact info that I needed. I’ll add the following to my robots.txt:

User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

The article also recognized that this is not a reliable way to block all LLMs, since it’s ultimately up to the companies training AI to respect your robots.txt settings. Further, we don’t know what crawlers Anthropic or Facebook or others are even using, so it’s kind of a losing battle right now.
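Since the list of crawlers worth blocking will probably grow, one option is to generate the rules from a list instead of editing the file by hand. The sketch below assumes a hypothetical helper named buildRobotsTxt; it is not part of AstroPaper or Astro, and it only produces the text of the file.

```typescript
// Hypothetical helper: builds robots.txt text that disallows the given
// crawlers from the whole site, one User-agent block per crawler.
function buildRobotsTxt(blockedAgents: string[]): string {
  const blocks = blockedAgents.map(agent => `User-agent: ${agent}\nDisallow: /`);
  // Blank line between blocks, trailing newline at the end of the file.
  return blocks.join("\n\n") + "\n";
}

// The two crawlers named in this post; add more as they become known.
const robotsTxt = buildRobotsTxt(["GPTBot", "CCBot"]);
```

In an Astro project, a string like this could presumably be returned from a static endpoint in the same way as the PNG route earlier in this post, so the file would still be produced at build time.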

I do hope that we see some level of regulation start to assist in mandating that large tech companies respect the privacy of us common folk.
