DEDUP // BOT

Legal // Early Access

Privacy

Dedup reads repository and pull request content only to analyze duplicate-code risk, run verifier checks, publish GitHub check runs, and maintain the service.

This policy applies to the Dedup GitHub App and dedup.bot. For privacy or deletion requests, contact hello@dedup.bot.

GDPR

We comply with the GDPR. To request a full copy of your data, or to have your data deleted, email hello@dedup.bot.

We do not sell data. We do not share data except with the providers explicitly listed in this policy: Scaleway and OpenRouter.

Services we use

  • GitHub installation access is scoped through the GitHub App permissions granted by the repository owner.
  • Scaleway is our primary infrastructure and hosts the production API, workers, Postgres database, object storage, container registry, and Kubernetes workloads in the fr-par region in France.
  • For availability and resiliency, we may route inference traffic through OpenRouter. When OpenRouter is used, every request is sent with zero data retention (ZDR) enabled.
  • GitHub receives check-run and pull-request comment output when Dedup publishes findings back to a repository.
  • Pro plans are coming soon. Free repositories do not require billing details.

Data we process

  • Repository metadata: organization, repository name, repository id, default branch, installation id, commit SHAs, pull request numbers, and webhook delivery ids.
  • Repository content: source files, normalized code chunks, hashes, token shingles, tree-like structural shingles, embeddings, file paths, and line ranges.
  • Analysis output: candidate pairs, verifier findings, confidence, duplicate type, evidence, check-run summaries, and inline comment text.
  • Operational data: queue records, timestamps, error messages, worker logs, and artifact object keys.

Where data lives

Production storage is in Scaleway's fr-par region in France unless a future customer-specific deployment states otherwise. OpenRouter is only used as external infrastructure for additional availability, and when used, Dedup enables ZDR for every OpenRouter request.

Retention

  • Repository indexes, snippets, traces, raw content references, logs, and artifacts are kept for at most 30 days after they were last used, unless a shorter retention period is configured.
  • GitHub check runs and pull request comments remain in GitHub according to the repository's GitHub retention and audit settings.
  • Webhook queues and analysis records may be retained while needed for idempotency, debugging, abuse prevention, billing limits, and legal obligations.
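
As an illustration of the 30-day rule above, here is a minimal sketch of how a "last used" retention check could be expressed. It is not Dedup's actual implementation: the names MAX_RETENTION and is_expired are hypothetical, and the sketch assumes a configured setting can only shorten the window, never extend it.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical constant mirroring the 30-day ceiling stated in this policy.
MAX_RETENTION = timedelta(days=30)


def is_expired(last_used_at: datetime, now: datetime,
               retention: timedelta = MAX_RETENTION) -> bool:
    """Return True if an item is past its retention window.

    A configured retention longer than 30 days is capped at 30 days
    (assumption: settings may only shorten the window).
    """
    effective = min(retention, MAX_RETENTION)
    return now - last_used_at > effective


now = datetime(2024, 3, 1, tzinfo=timezone.utc)
print(is_expired(now - timedelta(days=31), now))  # past the 30-day window
print(is_expired(now - timedelta(days=8), now, retention=timedelta(days=7)))
```

Because the clock is measured from the last use rather than from creation, an index that is still being read keeps resetting its window.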

Deletion requests

Email hello@dedup.bot with the GitHub organization, repository, installation id if available, and the data you want deleted. We may verify that you control or administer the relevant repository before deleting data. You can also uninstall the GitHub App from GitHub to stop future access. Data is kept for a maximum of 30 days after it was last used.

Security

GitHub webhooks are signature-verified. GitHub App access uses short-lived installation tokens. Production workers run in Kubernetes with scoped service accounts and non-root containers.
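
For readers unfamiliar with webhook signature verification, the following is a minimal sketch of the general technique. GitHub signs each delivery with HMAC-SHA256 over the raw request body and sends the result in the X-Hub-Signature-256 header as "sha256=<hex digest>". The function name and the way the secret is passed in are assumptions for illustration; this is not Dedup's production code.

```python
import hashlib
import hmac


def verify_github_signature(secret: bytes, payload: bytes,
                            signature_header: str) -> bool:
    """Check an X-Hub-Signature-256 header value against the raw body.

    secret: the webhook secret configured for the GitHub App.
    payload: the raw, unparsed request body bytes.
    signature_header: the header value, expected as "sha256=<hex>".
    """
    if not signature_header or not signature_header.startswith("sha256="):
        return False
    digest = hmac.new(secret, payload, hashlib.sha256).hexdigest()
    expected = ("sha256=" + digest).encode("utf-8")
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected, signature_header.encode("utf-8"))


# Usage with made-up values: a correctly signed body passes, a tampered one fails.
secret = b"example-secret"
body = b'{"action": "opened"}'
header = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
print(verify_github_signature(secret, body, header))
print(verify_github_signature(secret, body + b" ", header))
```

The check must run on the raw bytes as received, because re-serializing parsed JSON can change whitespace or key order and invalidate the digest.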