Write a simple Matrix bot in Scheme (or any other language) - Part 1
A while ago I rewrote a bot we used in a (now defunct) Matrix community, and in
the process I took the opportunity to throw away matrix-nio, a Python library
for writing Matrix clients, and learn a bit about Matrix’s client-server API. I
was happy with the result but I have no use for the bot now, so I thought I’d
share the knowledge with the rest of the world so that my efforts won’t go to
waste.
In Part 2 I’ll be using Guile, but you can follow this guide in whatever language you like, provided it has a reasonable way to make HTTP requests and parse JSON.
Prerequisites
Set up a homeserver or choose a public one, then make an account for your bot
(it will be a regular user, so no special action is needed here) and join the
rooms you want to use the bot in. If you are using Guile, install guile-json.
We won’t be dealing with encryption, so if you plan on using the bot inside
end-to-end encrypted rooms you should also set up pantalaimon.
Finally, curl and jq will come in handy for exploring the API.
How Matrix’s API works
Matrix clients talk to servers by exchanging JSON objects through HTTP requests. The client-server API documentation describes the endpoints we’ll need. Don’t get discouraged by the size of the document: the protocol makes an effort to support both full-featured clients that satisfy modern IM expectations and simple automated ones like ours. Believe it or not, we only need three endpoints: one to log in, one to read incoming events, and one to send messages.
First, make sure you have the URL of your homeserver. For instance, mine is at
https://matrix.alsd.eu:8448, but you may want to use a local instance of
pantalaimon (e.g. http://localhost:8009). You can check that everything works
by requesting /_matrix/client/versions:
$ curl https://matrix.alsd.eu:8448/_matrix/client/versions | jq
{
  "versions": [
    "r0.0.1",
    "r0.1.0",
    "r0.2.0",
    "r0.3.0",
    "r0.4.0",
    "r0.5.0",
    "r0.6.0"
  ],
  "unstable_features": {
    "org.matrix.label_based_filtering": true,
    "org.matrix.e2e_cross_signing": true,
    "org.matrix.msc2432": true,
    "uk.half-shot.msc2666": true,
    "io.element.e2ee_forced.public": false,
    "io.element.e2ee_forced.private": false,
    "io.element.e2ee_forced.trusted_private": false
  }
}
So far so good! All the endpoints we’ll be accessing start with
/_matrix/client/r0, so we’ll say:
$ base=https://matrix.alsd.eu/_matrix/client/r0
Logging in
We can log in by POSTing some JSON to $base/login:
$ curl -d @- $base/login <<END | jq
> {
>   "type": "m.login.password",
>   "identifier": {
>     "type": "m.id.user",
>     "user": "testbot"
>   },
>   "password": "BOT_PASSWORD",
>   "device_id": "bot"
> }
> END
{
  "user_id": "@testbot:alsd.eu",
  "access_token": "MDAxNWxvY2F0aW9uIGFsc2...",
  "home_server": "alsd.eu",
  "device_id": "bot"
}
Here you should replace testbot with the id you chose, and the same goes for
BOT_PASSWORD. In the response object there’s an access token that we’ll use
to authenticate further operations. The device id will be shown in the session
list in the bot’s profile. Every time you log in with the same id, the previous
token associated with that id is revoked.
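If you’d rather assemble the login request in code, here is a minimal Python sketch of the JSON handling only. The helper names login_body and access_token_from are hypothetical (they don’t come from any Matrix library), and the HTTP POST itself is left to whatever client you prefer.

```python
import json


def login_body(user, password, device_id):
    """Build the JSON object POSTed to $base/login for a password login."""
    return {
        "type": "m.login.password",
        "identifier": {"type": "m.id.user", "user": user},
        "password": password,
        "device_id": device_id,
    }


def access_token_from(response_text):
    """Extract the access token from the raw JSON text of a login response."""
    return json.loads(response_text)["access_token"]


# Serialized request body, ready to be sent with any HTTP client.
# "testbot", "BOT_PASSWORD" and "bot" are the placeholders used in the curl example.
payload = json.dumps(login_body("testbot", "BOT_PASSWORD", "bot"))
```

The token returned by access_token_from is what goes into every later request.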
From now on we’ll provide the token inside an Authorization header:
$ token="MDAxNWxvY2F0aW9uIGFsc2..."
$ curl -H "Authorization: Bearer $token" $base/...
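The same header can be attached in code. Below is a rough sketch using Python’s standard urllib; authorized_request and send are hypothetical helper names, and error handling is omitted entirely.

```python
import json
import urllib.request


def authorized_request(url, token, method="GET", body=None):
    """Build a request carrying the access token in an Authorization header."""
    data = None if body is None else json.dumps(body).encode("utf-8")
    req = urllib.request.Request(url, data=data, method=method)
    req.add_header("Authorization", "Bearer " + token)
    if data is not None:
        req.add_header("Content-Type", "application/json")
    return req


def send(req):
    """Perform the request and parse the JSON response (no error handling)."""
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Building the request and sending it are kept separate so that the first part can be inspected without touching the network.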
Synchronizing state
Matrix isn’t designed to simply pass messages between clients, but to keep the state of a room synchronized across clients and servers. When a client GETs $base/sync, the response contains, for every room the user has joined, the latest events that happened in that room. It also contains two tokens: one to retrieve events sent before the first one in the response, and one to tell the server where to start reporting events the next time the client syncs.
The docs have an example to help visualize the process of retrieving events:
First, the client makes an initial sync, and receives events [E2] to [E5] from the server. The response also contains the prev_batch and next_batch tokens.
[E0]->[E1]->[E2]->[E3]->[E4]->[E5]
           ^                      ^
           |                      |
  prev_batch: '1-2-3'    next_batch: 'a-b-c'
The next time the client syncs, it will provide the next_batch token received earlier. The server replies with [E6], the only event generated since the last request.
[E0]->[E1]->[E2]->[E3]->[E4]->[E5]->[E6]
                                   ^    ^
                                   |    |
                                   |  next_batch: 'x-y-z'
                         prev_batch: 'a-b-c'
However, it may happen that many events have been sent in the meantime: in that case, the server only sends the most recent ones, and the client has a gap in its knowledge of the room’s history.
                                        | gap |
                                        | <-> |
[E0]->[E1]->[E2]->[E3]->[E4]->[E5]->[E6]->[E7]->[E8]->[E9]->[E10]
                                               ^                 ^
                                               |                 |
                                      prev_batch: 'd-e-f'   next_batch: 'u-v-w'
The gap can be filled with the help of the prev_batch token.
The server also makes sure that the client always receives enough information about the room’s state (who is in the room, has the description changed…), even if the corresponding events fall into the aforementioned gap, by putting state events that don’t fit in the returned timeline in a separate response field.
For simplicity, we’ll assume that:
- the bot will never receive enough messages for the gap to be a problem;
- we don’t care about what happens when the bot is not running.
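Even if we ignore gaps, it can be useful to at least notice them. The sketch below assumes the server marks a truncated timeline by setting limited to true inside that room’s timeline object (which is how I read the client-server API), and that joined rooms live under rooms.join in the parsed sync response. The name rooms_with_gaps is hypothetical.

```python
def rooms_with_gaps(sync_response):
    """Return the ids of joined rooms whose timeline was truncated by the server."""
    joined = sync_response.get("rooms", {}).get("join", {})
    return [
        room_id
        for room_id, room in joined.items()
        # Assumption: "limited": true means older events were left out.
        if room.get("timeline", {}).get("limited", False)
    ]
```

A bot could simply log the returned room ids and carry on, which matches the two assumptions above.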
Make sure your bot has joined at least one room, then try out the following:
$ curl -H "Authorization: Bearer $token" $base/sync | jq
# probably very long output
At the end of the output you should see the next_batch token. Now let’s try
putting it in the request:
$ next_batch=$(curl -H "Authorization: Bearer $token" $base/sync | jq -r .next_batch)
$ curl -H "Authorization: Bearer $token" "$base/sync?since=$next_batch" | jq
{
  "account_data": {
    "events": []
  },
  "to_device": {
    "events": []
  },
  "device_lists": {
    "changed": [],
    "left": []
  },
  "presence": {
    "events": []
  },
  "rooms": {
    "join": {},
    "invite": {},
    "leave": {}
  },
  "groups": {
    "join": {},
    "invite": {},
    "leave": {}
  },
  "device_one_time_keys_count": {},
  "org.matrix.msc2732.device_unused_fallback_key_types": [],
  "next_batch": "s152848_7628336_4261_148566_26667_43_66472_186015_5"
}
As you can see, I provided the token using the since query parameter. This
time the output is much shorter: in this case, nothing happened between the two
syncs, so the response object is mostly empty. This gives us a chance
to familiarize ourselves with its structure: what we’re interested in is the
.rooms.join object. Try writing something in a room the bot’s in and syncing
again:
$ curl -H "Authorization: Bearer $token" "$base/sync?since=$next_batch" | jq .rooms.join
{
  "!PXeSeufpLzIQnfleAn:alsd.eu": {
    "timeline": {
      "events": [
        {
          "type": "m.room.message",
          "sender": "@dalz:alsd.eu",
          "content": {
            "msgtype": "m.text",
            "body": "hello there"
          },
          "origin_server_ts": 1611324778904,
          "unsigned": {
            "age": 38842
          },
          "event_id": "$rwoCYM9CitktykunRqT_v2ta8aenebgOM-aHD20EKZ0"
        }
      ],
      "prev_batch": "s152848_7628433_4263_148573_26671_43_66472_186015_5",
      "limited": false
    },
    "state": {
      "events": []
    },
    "account_data": {
      "events": []
    },
    "ephemeral": {
      "events": [
        {
          "type": "m.typing",
          "content": {
            "user_ids": []
          }
        }
      ]
    },
    "unread_notifications": {
      "notification_count": 1,
      "highlight_count": 0
    },
    "summary": {},
    "org.matrix.msc2654.unread_count": 1
  }
}
Here I used jq to keep only the interesting part. .rooms.join is an object
that maps room identifiers to updates on the room’s content: most importantly,
a list of events sent to the room since the last sync. All events of type
m.room.message must have a textual .content.body, which we’ll use later to
make our bot react to incoming messages.
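To make the structure concrete, here is a small Python sketch that walks a parsed sync response and pulls out the text of every message. The name text_messages is hypothetical, and the check that the body is a string is just defensive: it skips any event that lacks one instead of crashing.

```python
def text_messages(sync_response):
    """Yield (room_id, sender, body) for each m.room.message in joined rooms."""
    joined = sync_response.get("rooms", {}).get("join", {})
    for room_id, room in joined.items():
        for event in room.get("timeline", {}).get("events", []):
            if event.get("type") != "m.room.message":
                continue  # ignore other event types
            body = event.get("content", {}).get("body")
            if isinstance(body, str):
                yield room_id, event.get("sender"), body
```

Fed the response shown above, it would yield a single tuple with the room id, @dalz:alsd.eu and the text hello there.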
Lastly, this endpoint supports long polling: you can specify a timeout in
milliseconds as a query parameter (like
$base/sync?since=$next_batch&timeout=30000) so that the server will wait for
up to the specified interval if it has no new events to report.
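Putting since and timeout together gives the skeleton of a bot’s main loop. The following Python sketch is only one way to arrange it: sync_url and poll are hypothetical names, fetch stands for any function that GETs a URL with the Authorization header and returns the parsed JSON, and max_rounds exists only so the loop can be stopped in tests. The first sync is used just to obtain a starting token, in line with the assumption that we don’t care about what happened while the bot was offline.

```python
from urllib.parse import urlencode


def sync_url(base, since=None, timeout_ms=None):
    """Build the /sync URL, adding only the query parameters that are set."""
    params = {}
    if since is not None:
        params["since"] = since
    if timeout_ms is not None:
        params["timeout"] = timeout_ms
    if not params:
        return base + "/sync"
    return base + "/sync?" + urlencode(params)


def poll(fetch, handle, base, timeout_ms=30000, max_rounds=None):
    """Long-poll /sync, passing each response to handle(); return the last token."""
    # Initial sync: its events are discarded, we only keep next_batch.
    since = fetch(sync_url(base))["next_batch"]
    rounds = 0
    while max_rounds is None or rounds < max_rounds:
        response = fetch(sync_url(base, since, timeout_ms))
        handle(response)
        since = response["next_batch"]
        rounds += 1
    return since
```

Thanks to the timeout, the loop doesn’t hammer the server: each request returns either as soon as something happens or after the interval expires.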
Sending messages
Let’s see some action now: we’ll send a message using the PUT endpoint
$base/rooms/{roomId}/send/{eventType}/{txnId}.
First you need to find out the id of the room the message will be sent to: you
can copy it from the .rooms.join object we retrieved earlier, or look it up
from Element (room settings > advanced > internal room id).
$ room='!PXeSeufpLzIQnfleAn:alsd.eu'
$ curl -H "Authorization: Bearer $token" "$base/rooms/$room/send/m.room.message/0" -X PUT -d @- <<END
> {
>   "msgtype": "m.text",
>   "body": "Hello, world!"
> }
> END
You should now see the message in the chat. A couple of things to note:
- the 0 at the end of the URL is a transaction id that should be unique as long as you reuse the same access token. We’ll take the easiest approach and use a monotonically increasing integer;
- what we’re sending is an m.room.message event (described in the docs) of type m.text, which also supports a formatted_body (more on this later).
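For reference, the URL and body of that request could be built in Python along these lines. All names here (send_url, next_txn_id, text_message) are hypothetical. Percent-encoding the room id is a precaution of mine, since it contains characters like ! and : (the curl call above passed it unencoded). Note also that this counter restarts from zero with the process, so it is only safe if the bot logs in afresh on every start or stores the counter somewhere.

```python
import itertools
from urllib.parse import quote

# Monotonically increasing transaction ids, starting from 0 in each process.
_txn_counter = itertools.count()


def next_txn_id():
    """Return a fresh transaction id as a string."""
    return str(next(_txn_counter))


def send_url(base, room_id, txn_id, event_type="m.room.message"):
    """Build the PUT URL $base/rooms/{roomId}/send/{eventType}/{txnId}."""
    return "{}/rooms/{}/send/{}/{}".format(
        base,
        quote(room_id, safe=""),
        quote(event_type, safe=""),
        quote(str(txn_id), safe=""),
    )


def text_message(body):
    """Build the content of a plain m.text message."""
    return {"msgtype": "m.text", "body": body}
```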
Aaand we’re done curling and jqing! Head over to Part 2 to put all this into
practice.