How Moderation would work in Darcy

“Good Moderation” is one of the key promises that we make here at Darcy. But what does this mean? Who will do this work, and how? This blog post tries to give an overview of what we are planning. While community moderation is usually about deleting content, Darcy focuses on preserving spaces instead. With all the information stored in self-hosted Solid Pods, harmful content is not deleted. Instead, it is unlinked and removed from the space, in order to preserve the community.

Before we start, though, a few reminders on how the Darcy infrastructure differs from places like Facebook and Twitter, but also from the Fediverse:

Everything is decentralised

In the Darcy environment, decentralisation creates a few layers and components:

Your own Solid Pod

This is essentially your own web server (hosted by yourself or one of the various Solid Pod providers springing up), storing all your content: text, comments, images, your social graph, and so on. If you don’t know what this is, and don’t care, the instance you join can transparently create such a Pod for you during the signup process.
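
To make the “your Pod is your own web server” idea concrete, here is a minimal sketch, assuming plain authenticated HTTP and entirely hypothetical names (a real Darcy client might use a Solid client library instead):

```typescript
// Hypothetical sketch: a Solid Pod is addressable storage you control,
// so publishing a post is just an authenticated write to a URL you own.
interface PodResource {
  url: string;         // e.g. "https://alice.example/public/posts/hello"
  contentType: string; // "text/turtle", "image/png", ...
  body: string;
}

async function saveToPod(resource: PodResource, authFetch: typeof fetch): Promise<void> {
  // PUT the document into the user's own storage; no central server involved.
  const res = await authFetch(resource.url, {
    method: "PUT",
    headers: { "Content-Type": resource.contentType },
    body: resource.body,
  });
  if (!res.ok) throw new Error(`Failed to save ${resource.url}: ${res.status}`);
}
```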

You will have full control over this, and no one will be able to censor or delete content from it. That also means that you will have full responsibility for it: if you use it to host illegal things, for example, your local law enforcement might want to have a word with you.

The Darcy instance(s) you joined

This is similar to a Fediverse instance, in that all your Darcy interactions are mediated through it. Your instance helps you discover new content and people and keeps track of all interactions for moderation purposes. Each instance can have its own content and conduct policies, but in order to join the “Darcyverse”, they have to agree to a common set of standards. You can host your own one-person instance if you want, and then federate with the “Darcyverse”. (Our http://shepherd.darcy.is prototype is actually such a one-person instance, as it runs entirely in your browser, storing no information outside of your own Solid Pod.)

The nice part about this is that you can change instances at any time without losing your content, social graph, or your identity, as all of these things are tied to and stored in your Solid Pod.

The “Darcyverse”

The totality of all Darcy instances that have agreed to abide by the common set of policy standards. This includes a self-categorisation into “Highly Permissive”, “Moderately Permissive”, and “Restricted”. (More on that later)

The Moderation Service

Darcy provides an API that enables instance owners to outsource moderation to a moderation service. That means that moderation requests and the content in question are made available to the moderators of the chosen moderation service, so they can act on them.
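
As a rough illustration, a request handed to such a service might look like the sketch below. All names and fields here are assumptions for this example, not the actual Darcy API: the instance forwards the report plus a link to the content, and the service answers with an action for the instance to apply.

```typescript
// Hypothetical shapes for an outsourced moderation exchange.
interface ModerationRequest {
  reportId: string;
  reportedBy: string;   // WebID of the reporting user
  contentUrl: string;   // where the offending content lives (a Solid Pod)
  instance: string;     // instance the content was surfaced on
  reason: string;
}

type ModerationDecision =
  | { action: "dismiss" }
  | { action: "unlink"; contentUrl: string }
  | { action: "suspend"; userWebId: string; until?: string }
  | { action: "ban"; userWebId: string };

async function submitReport(serviceUrl: string, req: ModerationRequest): Promise<ModerationDecision> {
  const res = await fetch(`${serviceUrl}/reports`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(req),
  });
  if (!res.ok) throw new Error(`Moderation service error: ${res.status}`);
  return res.json();
}
```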

The Fediverse

As we hope to eventually add ActivityPub to Darcy, this will mean that Darcy users can interact with the even wider Fediverse.

Any moderation action works by unlinking content from an instance, rather than deleting it. This means that harmful content will not be visible in moderated spaces, but it will still exist in Solid Pods.
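
A minimal sketch of what “unlink, don’t delete” could look like in an instance’s data model (the types are hypothetical): the instance only keeps references to content stored in users’ Pods, so moderation removes the reference while the resource itself stays untouched.

```typescript
// Hypothetical instance-side index of linked content.
interface InstanceIndex {
  // Post and comment URLs currently linked into the instance's timelines.
  linked: Set<string>;
}

function unlinkContent(index: InstanceIndex, contentUrl: string): void {
  // The document at contentUrl still exists in its owner's Solid Pod;
  // it simply stops being discoverable through this instance.
  index.linked.delete(contentUrl);
}
```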

In terms of interactions with content, Darcy uses the post-and-comments model as seen in Diaspora, Facebook and similar, rather than the “everything is a post” model of Mastodon or Twitter. Because comments are attached to a post, the post’s author keeps a handle on the conversation underneath it, which enables a lot more self-care capabilities for individual users.
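
A sketch of that difference, again with hypothetical types: comments hang off a post, so the post’s author can curate the thread under their own content.

```typescript
// Hypothetical post-and-comments data model.
interface Comment {
  author: string;   // WebID of the commenter
  url: string;      // stored in the commenter's own Pod
  hidden: boolean;  // the post author can hide it from their thread
}

interface Post {
  author: string;
  url: string;
  comments: Comment[];
}

// Self-care hook: the post author decides which comments stay visible on their post.
function visibleComments(post: Post): Comment[] {
  return post.comments.filter((c) => !c.hidden);
}
```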

The different self-categorisations of Instances

Each Darcy instance that wants to join the Darcyverse has to self-categorise into one of three modes. Highly Permissive is basically the full-on adult version, where people can express themselves in nearly any way they want, provided they are not engaging in harmful behaviour like bullying, harassment, or hate speech. Moderately Permissive instances are for mature people, but without “adult” content; this is judged from a progressive and inclusive perspective that allows for nudity but not pornography.

Restricted Mode is for instances that cater to all audiences. Mature and adult content is forbidden here; this is what people colloquially call “Safe For Work”. This mode is intended for instances that want to be completely free of sexual and otherwise mature content.

We do recognise that it is virtually impossible to algorithmically figure out whether content coming from a Highly Permissive environment is actually free of NSFW or mature content. As a minimum, therefore, we have to implement things so that adults can follow and potentially interact with Restricted Mode users, while Restricted Mode users do not get access to mature spaces.

The following matrix shows what can be posted where, and what is visible to whom (a code sketch of the same rules follows the table).

| Posted by | Content type | Visible on Highly Permissive | Visible on Moderately Permissive | Visible on Restricted Mode |
| --- | --- | --- | --- | --- |
| Highly Permissive | Adult content with CW | visible | not visible | not visible |
| Highly Permissive | Adult content w/o CW | visible | not visible | not visible |
| Highly Permissive | Mature content with CW | visible | not visible | not visible |
| Highly Permissive | Mature content w/o CW | visible | not visible | not visible |
| Highly Permissive | Safe content | visible | not visible | not visible |
| Moderately Permissive | Adult content with CW | forbidden | forbidden | forbidden |
| Moderately Permissive | Adult content w/o CW | forbidden | forbidden | forbidden |
| Moderately Permissive | Mature content with CW | visible | visible | not visible |
| Moderately Permissive | Mature content w/o CW | visible | visible | not visible |
| Moderately Permissive | Safe content | visible | visible | not visible |
| Restricted Mode | Adult content with CW | forbidden | forbidden | forbidden |
| Restricted Mode | Adult content w/o CW | forbidden | forbidden | forbidden |
| Restricted Mode | Mature content with CW | forbidden | forbidden | forbidden |
| Restricted Mode | Mature content w/o CW | forbidden | forbidden | forbidden |
| Restricted Mode | Safe content | visible | visible | visible |
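
For readers who prefer code over tables, the matrix above can be expressed as a lookup like the following sketch. The mode and content-class names are illustrative only; note that the content warning flag does not change the outcome at this level.

```typescript
// Sketch of the visibility matrix as a function.
type InstanceMode = "highly-permissive" | "moderately-permissive" | "restricted";
type ContentClass = "adult" | "mature" | "safe";
type Verdict = "visible" | "not-visible" | "forbidden";

function visibility(postedOn: InstanceMode, contentClass: ContentClass, viewedOn: InstanceMode): Verdict {
  // Adult content may only be posted on Highly Permissive instances.
  if (contentClass === "adult" && postedOn !== "highly-permissive") return "forbidden";
  // Mature content may not be posted on Restricted Mode instances.
  if (contentClass === "mature" && postedOn === "restricted") return "forbidden";
  // Content from a Highly Permissive instance never reaches the other modes,
  // because it cannot be reliably verified as safe.
  if (postedOn === "highly-permissive") {
    return viewedOn === "highly-permissive" ? "visible" : "not-visible";
  }
  // Content from a Moderately Permissive instance stays out of Restricted Mode.
  if (postedOn === "moderately-permissive") {
    return viewedOn === "restricted" ? "not-visible" : "visible";
  }
  // Remaining case: safe content from a Restricted Mode instance is visible everywhere.
  return "visible";
}
```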

This does not mean that Restricted Mode instances are immediately safe spaces for children, though: there are other issues as well, such as the lack of age verification, or even more unwelcome things like grooming or cyberstalking. If we cannot come up with a workable solution for these and similar cases, we might discover that a child-friendly Darcy version is not in the cards for now.

It is entirely possible for individual users to move their identity upwards. So when a kid grows up and wants to join a Moderately or Highly Permissive instance, they can do so and take their existing social graph and content along. (Or create a new one, leaving their “child friendly” identity behind if they want to.) This is again not to claim that we can do age verification beyond the bare minimum of asking for a birthday: if a given user claims that they are fit to join a mature environment, we will have to believe them.

Also, please be aware that this is just the model for how the different classes of instances make content available at the base level. On top of that, every user has fine-grained controls over who can see or interact with their content. That can then create protected timelines, open or closed groups, and so on.
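
As an illustration only, such per-post controls could be modelled as an audience setting layered on top of the instance-level matrix; none of these names reflect Darcy’s actual implementation.

```typescript
// Hypothetical per-post audience setting.
type Audience =
  | { kind: "public" }                      // anyone on a compatible instance
  | { kind: "followers" }                   // protected timeline
  | { kind: "members"; webIds: string[] };  // closed group or hand-picked list

function mayView(audience: Audience, viewerWebId: string, followers: Set<string>): boolean {
  switch (audience.kind) {
    case "public":
      return true;
    case "followers":
      return followers.has(viewerWebId);
    case "members":
      return audience.webIds.includes(viewerWebId);
  }
}
```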

The moderation ladder

Moderation is staggered across three general levels:

Self-care

Each Darcy user will have tools to help manage and moderate their own experience. This allows them to mute or block others, selectively hide others’ interactions with their own posts, set up automated rules for blocking or muting based on certain criteria, review past interactions, or escalate by using the Report function to call in moderators.

Users are encouraged to use those functions as a form of self-care. When they do, they hide or unlink the unwanted interactions from their content, making them invisible. The comments stay saved on the blocked or muted person’s Solid Pod, so they do not lose their content. For the general audience of the Darcyverse, however, that content is unavailable.
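
To give a feel for what an automated self-care rule could look like, here is a hypothetical sketch; the criteria and field names are made up for this example.

```typescript
// Hypothetical automated self-care rule: matching interactions are muted,
// blocked, or hidden, without deleting anything from the other person's Pod.
interface SelfCareRule {
  action: "mute" | "block" | "hide-from-my-posts";
  match: {
    keywords?: string[];        // topics or slurs the user never wants to see
    maxAccountAgeDays?: number; // e.g. match accounts younger than N days
    noSharedContacts?: boolean; // match accounts with no overlap in the social graph
  };
}

interface Interaction {
  authorWebId: string;
  text: string;
  authorAccountAgeDays: number;
  sharedContacts: number;
}

// A rule matches only if every criterion it specifies holds.
function matches(rule: SelfCareRule, i: Interaction): boolean {
  const { keywords, maxAccountAgeDays, noSharedContacts } = rule.match;
  if (keywords && !keywords.some((k) => i.text.toLowerCase().includes(k.toLowerCase()))) return false;
  if (maxAccountAgeDays !== undefined && i.authorAccountAgeDays >= maxAccountAgeDays) return false;
  if (noSharedContacts && i.sharedContacts > 0) return false;
  return true;
}
```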

Content moderation of individual users

Once users hit the Report function, moderators are called to action. Reporting alerts the moderators of the instance the offending interaction comes from. This can either be the instance owner, moderators deputised by them, or, if the instance has subscribed to the service, the global Darcyverse moderation team. (We want to eventually have teams that are embedded in the different local cultures around the globe, to make sure that they not only understand the language, but also the shorthand and the cultural and social norms.)

Whoever the report ends up with will then review the content and the interaction. They can act on it by unlinking the offending content, issuing warnings, or excluding users from the Darcyverse, either for a limited time or permanently. Cases of child sexual abuse and credible threats of violence will be escalated to the authorities.

When any of this is done by the Darcyverse moderation team, this creates an entry in the Darcyverse moderation log. That means that the offending users cannot simply sign on to another Darcyverse instance and continue.
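
A moderation log entry in this scheme might look roughly like the following sketch (all field names are hypothetical). A shared log means a ban issued through one instance can be checked by every other federated instance at sign-up time.

```typescript
// Hypothetical shape of a Darcyverse moderation log entry.
interface ModerationLogEntry {
  subjectWebId: string;                            // the sanctioned identity
  action: "warning" | "unlink" | "suspension" | "ban";
  scope: "instance" | "darcyverse";
  issuedBy: string;                                // moderator or moderation team
  issuedAt: string;                                // ISO 8601 timestamp
  expiresAt?: string;                              // absent for permanent exclusions
  reason: string;
}

function isCurrentlyExcluded(log: ModerationLogEntry[], webId: string, now = new Date()): boolean {
  return log.some(
    (e) =>
      e.subjectWebId === webId &&
      (e.action === "ban" || e.action === "suspension") &&
      e.scope === "darcyverse" &&
      (e.expiresAt === undefined || new Date(e.expiresAt) > now)
  );
}
```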

Policy moderation of instances

The use of the central Darcyverse moderation service is optional. This means that instance owners can choose to do this work themselves. Most hobbyist and small instances will probably do so, at least at first. But that also means they might not adhere properly to the standards of the wider Darcyverse, might not act in a timely manner, or might play favourites and let their friends get away with violating policies.

If that is the case, there will be an escalation mechanism that eventually involves the Darcyverse moderators. They will not and cannot moderate individual users on those instances, but they can eventually block the whole instance from federating with the wider Darcyverse.

Drawbacks and edge cases

This system isn’t perfect. We think it is important to address potential issues so that people go in with their eyes open.

As there is, for example, no extra identity verification system, people can create extra identities on new Solid Pods to circumvent blocks and bans, leading to a not-so-fun game of whack-a-mole. The whole concept of a federated social space is also unfamiliar to people who learned online behaviour on platforms like Twitter or Facebook. (Although as soon as one realises the similarity to email addresses, things get a lot easier.)

Currently, we are on the fence regarding private messages: we do not want to create a system where our moderators can read private messages. But if we were to build private messages that way, moderating them would be impossible: our moderators could not sanction or exclude people from the platform based on things said in a private message. Right now, that means we simply do not have private messages, which is a problem in itself. Further research into this is clearly necessary.

And, as there can always be instances that do not follow the Darcyverse policies, it is important to make it easy for end users to figure out whether the instance they want to use actually applies the policies they are after. It is conceivable that instances are set up as traps or scams, tricking end users into believing they are part of the Darcyverse when they actually are not. This needs to be solved before mass adoption is feasible. The result might be a simple published list of approved instances, but time will tell.

Further thoughts

As you can see, the Darcyverse is created from a bunch of different components that work with each other. Most of them are interchangeable, even to the point where other groups can create their own version of the ‘verse, with their own policies and guidelines. Yes, this can lead to fragmentation, and that is a good thing: it is possible to build connection without centralisation. Social interactions are too varied and complex to fit into one model.

The upside is that, thanks to shared technical protocols and the fluidity of connections, savvy users can participate in a variety of instances and ‘verses at the same time. This broadens their social media horizons without ever locking them into a walled garden.