Migrating from Mastodon botsin.space: self-hosted vs hosting service alternatives

Given the news that the bot-friendly Mastodon instance https://botsin.space/home is shutting down, I need to decide what my next steps should be for the bots that have accounts on that instance:

  • abandon them
  • migrate their accounts to another Mastodon instance, or somewhere else like Bluesky
  • set up and run my own Mastodon instance
  • pay for a hosted Mastodon instance

Developing bots is a fun personal project for getting up to speed with developing and running services in the cloud. Even if I don’t continue running my current bots, I’ll likely deploy something else bot-related in the future, so I’m most likely going to migrate them somewhere.

I’ve already migrated a few of my bots from Twitter to Mastodon, and now that I’m faced with another move, the option of running my own Mastodon instance seems more appealing than relying on someone else’s instance that may or may not be running months from now. Given that I already host other things in the cloud, including this blog, I thought I’d try setting up a Docker-based Mastodon instance. The source project provides a Dockerfile and docker-compose.yml, so I thought it would be relatively easy. The docs are more detailed for installing on a bare OS though, so it’s not as obvious what you need to do to configure a Docker-based instance and get it running successfully.

I followed multiple guides, each covering different parts of the install and setup; these two were the most comprehensive:

Despite following these guides, I ran into many, many issues, and as I found solutions I started putting together my own step-by-step guide below. Several times I discovered that the issues I was running into were because there was an additional step I needed to run first that wasn’t mentioned elsewhere, and even though I found workarounds, it was easier to throw the install away and start fresh, adding the step(s) I’d missed before.

The tl;dr conclusion

After spending a few hours over several days, I got to the point of having an instance up and running on GCP, but an e2-small instance was too slow, and while upgrading to an e2-medium ran OK, that instance type would have been too expensive to leave up 24×7 for a hobby project. Even though it was up and running, I couldn’t seem to search for or follow anyone on another instance, or get any relays successfully added.

To run a self-hosted instance I’d also need an SMTP service for notification emails, so I decided that the cheapest ‘Moon’ hosting plan from https://masto.host/ would be more than enough for my projects, and I’ve set up my own instance with them. Sign up was effortless, and my instance was up and running in a couple of minutes. It’s at: https://mastodon.kevinhooke.com/home

docker-compose Mastodon setup steps:

As explained above, despite getting to the point of a running server, it still had issues that I didn’t want to spend more time investigating. I’ll leave these notes here in case they’re useful for someone else running into similar issues, but please take them with a grain of salt: there’s no guarantee that you’ll get a working server as a result.

  1. Clone the Mastodon repo
  2. cp .env.production.sample .env.production
  3. Run the secret generation steps from the comments in .env.production and paste the generated values into .env.production, using:

     docker compose run --rm web bin/rails db:encryption:init

     and (run this one twice, for SECRET_KEY_BASE and OTP_SECRET):

     docker compose run --rm web bundle exec rails secret

     and this one for VAPID_PUBLIC_KEY and VAPID_PRIVATE_KEY:

     docker compose run --rm web bundle exec rails mastodon:webpush:generate_vapid_key

  4. Replace any localhost references in .env.production with the name of the corresponding Docker Compose service, for example:

     REDIS_HOST=redis
     DB_HOST=db
     ES_HOST=es

  5. Run the db setup step:

     docker compose run --rm web bundle exec rails db:setup

    I’d previously missed this step, and managed to get the db set up via several manual steps instead. Skip these if you run db:setup: run psql in the db service container and manually create a mastodon user:

    CREATE USER mastodon WITH PASSWORD '<password>' CREATEDB; 

    Run the db:create script. If you get an error that the db already exists, run the db:migrate script.

    Mounted Volume ownership

    Within your mastodon dir, change the ownership of the following folders, which get mounted as volumes.

    For static content accessed by the web container:

    sudo chown -R 991:991 public

    For elasticsearch runtime data:

    sudo chown -R 1000:root elasticsearch 

    … this avoids an error in the es logs about being unable to access the mounted volume:

    AccessDeniedException: /usr/share/elasticsearch/data/nodes

    ElasticSearch vm.max_map_count error

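    bootstrap check failure [1] of [1]: max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]

    This is a kernel setting on the Docker host, not in the container. The usual fix is to raise it on the host, for example with sudo sysctl -w vm.max_map_count=262144, and add it to /etc/sysctl.conf to persist across reboots.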

    ElasticSearch error in Admin page

    Elasticsearch index mappings are outdated. Please run tootctl search deploy --only=instances tags

    To fix, ‘docker exec -it container-id bash’ into the web container and run the tootctl command from the message.

    Post-install setup

    Create an initial Owner account with tootctl:

    RAILS_ENV=production bin/tootctl accounts create \
    alice \
    --email alice@example.com \
    --confirmed \
    --role Owner

    Troubleshooting

    On startup, if you get database connection errors such as the one below, revisit the earlier step about replacing localhost with Docker container names:

    Did you not create the database, or did you delete it? To create the database, run: bin/rails db:create
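Since a leftover localhost value is easy to miss, a quick check of .env.production before starting the containers may help. The following is a minimal Python sketch and not part of Mastodon: the function name find_localhost_hosts is made up for this example, and it assumes the simple KEY=value format used in .env.production.sample.

```python
def find_localhost_hosts(env_text: str) -> list[str]:
    """Return the names of *_HOST settings that still point at localhost."""
    local_values = {"localhost", "127.0.0.1"}
    flagged = []
    for raw_line in env_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and anything that is not KEY=value
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        value = value.strip().strip("'\"")
        if key.endswith("_HOST") and value in local_values:
            flagged.append(key)
    return flagged


# Made-up sample content for illustration
sample = """\
# Redis
REDIS_HOST=localhost
DB_HOST=db
ES_HOST=127.0.0.1
LOCAL_DOMAIN=example.com
"""
print(find_localhost_hosts(sample))
```

Any setting it reports is a candidate for replacing with the matching service name from docker-compose.yml.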

    Planning Twitter bot to Mastodon migration / updates – what do I have running right now?

    The odd thing about personal bot projects is that after you’ve deployed them and they’re up and running, unless APIs change and the bots need to be updated, there’s not much needed to keep them running, if anything. Some of the first bots I deployed as AWS Lambdas have been running several times a day for 5 years. In that time AWS Lambda supported runtimes have come and gone, so the Node 6 runtime I was originally using is now well past its official support.

    This is mostly a todo list of the bots I need to look at as part of my migration from Twitter to Mastodon, but if you search you can find my previous posts that describe how these were built.

    @kevinhookebot

    Mostly migrated to @kevinhookebot@botsin.space on Mastodon, but currently running on both Twitter and Mastodon. It sends the same generated text to both at the same time, but replying to the bot on either Twitter or Mastodon interacts only with the bot on that platform.

    My first Twitter bot project, which has now tweeted over 11k times since it went live in 2018. It comprises multiple Lambdas that provide different features:

    • A trained RNN text generation model generates random text and tweets every ~3 hours. One scheduled AWS Lambda generates the text and inserts it into a DynamoDB table. Another scheduled Lambda reads the next tweet from the table and tweets it using Twitter’s APIs.
    • A scheduled Lambda runs every minutes calling a Twitter api to check for replies and tweets at this account. It replies with one of a number of canned replies
    • If you tweet at this bot with ‘go north|south|east|west’ it replies with a generated response typical of a text-based adventure game. The replies are generated with a template and randomly inserted words (it isn’t actually a game).
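The ‘go north|south|east|west’ reply in the last bullet could be sketched roughly as below. This is an illustration in Python rather than the bot’s actual code: the function name, the template, and the word lists are all made up for the example.

```python
import random
import re

# Hypothetical word lists; the real bot's templates and words are not shown here
PLACES = ["a dusty cellar", "a moonlit clearing", "a narrow tunnel"]
OBJECTS = ["a rusty key", "a brass lantern", "an old map"]

DIRECTION_PATTERN = re.compile(r"\bgo (north|south|east|west)\b", re.IGNORECASE)


def adventure_reply(text, rng=random):
    """Return an adventure-style reply, or None if no direction was given."""
    match = DIRECTION_PATTERN.search(text)
    if match is None:
        return None
    direction = match.group(1).lower()
    # Fill the template with randomly chosen words; no game state is kept
    return (
        f"You go {direction} and find yourself in {rng.choice(PLACES)}. "
        f"There is {rng.choice(OBJECTS)} here."
    )


print(adventure_reply("@kevinhookebot go north"))
```

Passing in a seeded random.Random instance as rng makes the output repeatable, which is handy when testing.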

    @productnamebot

    Tweets randomly generated product names using lists of key words. Not yet migrated to Mastodon. Has tweeted 7k times since 2018.

    @blackjackcard

    A Blackjack card game bot. Not migrated to Mastodon yet. @ the bot with ‘deal’ to start a game. Tracks game state per player in DynamoDB. Uses Twitter APIs to check for replies to the game bot every 5 minutes.

    Getting started with the Mastodon APIs – notifications

    The docs for the Mastodon APIs are pretty good, but there’s a surprising lack of working examples online (compared to the Twitter APIs), which means that starting out I’ve been stumped several times already trying to work out how to do what seem to be simple things.

    Publishing a new status (a ‘Toot’, equivalent in Twitter terms to a ‘Tweet’) is easy enough with POST /statuses. Getting a list of who has mentioned you in a status was not that obvious though.

    I took a look at getting my timeline with various options, using GET /timelines, before realizing what I was probably looking for was GET /notifications which can be filtered by various types, including mentions, using

    GET /notifications?types[]=mention

    Note the types array parameter, with [] following the name. I hadn’t seen this convention before, but it is described in the API docs.
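As a rough illustration of the types[] convention, the query string can be built by repeating the types[] key once per type. This Python sketch uses only the standard library; the helper name and the instance URL are placeholders, and /api/v1/notifications is the full path of the endpoint abbreviated above as /notifications.

```python
from urllib.parse import urlencode


def notifications_url(base_url, types):
    """Build a notifications URL, repeating types[] once per requested type."""
    # A list of (key, value) pairs lets the same key appear more than once
    query = urlencode([("types[]", t) for t in types])
    return f"{base_url}/api/v1/notifications?{query}"


# "https://mastodon.example" is a placeholder instance
print(notifications_url("https://mastodon.example", ["mention"]))
# urlencode percent-encodes the brackets, so this prints:
# https://mastodon.example/api/v1/notifications?types%5B%5D=mention
```

The request itself also needs an Authorization: Bearer header carrying the bot account’s access token.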

    Each entry returned by GET /notifications looks like this (abbreviated):

    {
      id: 'unique-id',
      type: 'mention',
      created_at: '2022-11-20T04:46:33.902Z',
      account: {
        // details about the account that posted this status
      },
      status: {
        id: 'unique-id-for-this-status',
        created_at: '2022-11-20T04:46:22.000Z',
        in_reply_to_id: null,
        in_reply_to_account_id: null,
        content: '...' // content of the status here, as an HTML string
      }
    }

    Note that the type is mention here, as this is what we filtered for with the types[] parameter.
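Because the status content arrives as HTML, a bot usually needs the plain text before it can match commands in a mention. Below is a minimal Python sketch of that step using the standard library’s HTMLParser; the function names and the sample data are made up for illustration, and only the fields shown above are assumed to be present.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects only the text nodes of an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def html_to_text(html):
    """Strip the tags from an HTML fragment and return the remaining text."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts).strip()


def extract_mentions(notifications):
    """Return (status id, plain text) for each mention notification."""
    return [
        (n["status"]["id"], html_to_text(n["status"]["content"]))
        for n in notifications
        if n["type"] == "mention" and n.get("status")
    ]


# Made-up sample data in the shape shown above
sample = [
    {
        "id": "1",
        "type": "mention",
        "status": {"id": "100", "content": "<p><span>@bot</span> go north</p>"},
    },
    {"id": "2", "type": "follow"},
]
print(extract_mentions(sample))
```

The status id is kept alongside the text because a reply needs it for the in_reply_to_id of the new status.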

    Read AWS IAM permission errors carefully – they tell you everything you need to know (Twitter to Mastodon bot migration)

    Migrating my @kevinhookebot Twitter bot to Mastodon, I made some updates to how the Lambda queries a source DynamoDB table for new messages to be posted and ran into this error:

    "errorType": "AccessDeniedException",
        "errorMessage": "User: arn:aws:sts::account-id:assumed-role/lambda-kevinhookebot-role/kevinhooketwitterbot-v2-dev-sendTweet is not authorized to perform: dynamodb:Query on resource: arn:aws:dynamodb:us-west-1:account-id:table/tweetbottweets/index/tweetdate-createdate-index because no identity-based policy allows the dynamodb:Query action"

    The IAM role I’m reusing does have dynamodb:Query, but only on these resources:

    "Resource": [
      "arn:aws:dynamodb:us-west-1:account-id:table/tweetbottweets",
      "arn:aws:dynamodb:us-west-1:account-id:table/tweetbottweets/index/Index",
      "arn:aws:dynamodb:us-west-1:account-id:table/tweetbotreplies"
    ]

    This only includes the table itself, the primary index called Index, and another table, tweetbotreplies.

    Notice this part of the message:

    is not authorized to perform: dynamodb:Query on resource: arn:aws:dynamodb:us-west-1:account-id:table/tweetbottweets/index/tweetdate-createdate-index

    The issue is that this role does not include Query on a new index I added, called tweetdate-createdate-index. To resolve this, add the ARN of this index to the role’s list of Resources.
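To make the mismatch concrete, here is a small Python sketch that checks a resource ARN against a policy’s Resource list. It is a simplification, not how IAM evaluates policies: it only does wildcard matching on the Resource strings (via fnmatch, which is close to, but not identical to, IAM’s * and ? wildcards) and ignores actions, conditions, and explicit denies. The function name is made up for this example.

```python
from fnmatch import fnmatchcase


def resource_allowed(resource_arn, policy_resources):
    """Return True if any policy Resource pattern matches the ARN."""
    return any(fnmatchcase(resource_arn, pattern) for pattern in policy_resources)


table = "arn:aws:dynamodb:us-west-1:account-id:table/tweetbottweets"
new_index = f"{table}/index/tweetdate-createdate-index"

# The Resource entries for this table from the existing role
current = [table, f"{table}/index/Index"]
print(resource_allowed(new_index, current))  # False: the new index is not listed

# Either list the new index explicitly, or use a wildcard for all indexes on the table
updated = current + [f"{table}/index/*"]
print(resource_allowed(new_index, updated))  # True
```

Listing the index explicitly keeps the role tighter; the /index/* wildcard avoids repeating this fix each time an index is added.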