DocuFlag is a document compliance assistant for immigration professionals, not an immigration advisory service. Consulate requirements may change. Always verify with official sources.

Data Protection Impact Assessment

GDPR Article 35 — Last updated: March 2026

This assessment evaluates the data protection risks of DocuFlag's AI-powered document analysis service. It is conducted in accordance with Article 35 of the General Data Protection Regulation (EU) 2016/679.

1. Description of processing operations

1.1 Nature of processing

DocuFlag assists immigration professionals (visa agencies, consultants) in verifying that client documents meet published consulate requirements for Schengen visa applications. The service uses AI (large language models) to extract structured data from uploaded documents and compare it against a requirements database.

1.2 Roles

  • Data controller: The visa agency or immigration professional using DocuFlag
  • Data processor: DocuFlag (operated by DocuFlag)
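  • Sub-processor: OpenAI (EU endpoint) for AI-powered document analysis
  • Sub-processor: Hetzner Online GmbH for EU hosting of the analysis proxy and database
  • Sub-processor: AWS (Amazon Web Services EMEA SARL) for optional E2EE cloud storage
  • Sub-processor: Stripe for payment processing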

1.3 Data flow

Documents are processed using an EU-hosted relay architecture that ensures original document content is never stored on DocuFlag's servers:

  1. The immigration professional adds a document in the browser
  2. The document is stored locally in the browser (IndexedDB)
  3. The browser sends the document to DocuFlag's EU-hosted analysis server, which forwards it to OpenAI's EU endpoint (eu.api.openai.com) for analysis
  4. The document is processed in-memory only by the analysis server — never written to disk, logged, or cached
  5. OpenAI processes the request within EU infrastructure with zero data retention
  6. The structured analysis result (field extractions, compliance observations) is returned to the browser
  7. Only the structured result (no original document content) is stored by DocuFlag
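
The relay step in this flow can be sketched as follows. This is a minimal illustration under stated assumptions, not DocuFlag's implementation: `AnalysisResult`, `analyze_via_relay`, `fake_model`, and the field names are hypothetical, and the model call is stubbed so the example runs offline.

```python
from dataclasses import asdict, dataclass
from typing import Callable


@dataclass(frozen=True)
class AnalysisResult:
    """Structured output only; deliberately has no field for raw document bytes."""
    document_type: str
    fields: dict
    observations: list


def analyze_via_relay(document: bytes, call_model: Callable[[bytes], dict]) -> AnalysisResult:
    """Forward the document to the model and return only the structured result.

    The document bytes exist only in this function's local scope: this sketch
    does not write them to disk, log them, or cache them.
    """
    raw = call_model(document)  # hypothetical stand-in for the EU endpoint call
    return AnalysisResult(
        document_type=str(raw["document_type"]),
        fields=dict(raw["fields"]),
        observations=list(raw["observations"]),
    )


def fake_model(document: bytes) -> dict:
    """Offline stub that mimics a structured model response (illustrative values)."""
    return {
        "document_type": "passport",
        "fields": {"expiry_date": "2030-01-31"},
        "observations": ["expiry_date extracted"],
    }


# Only this dictionary would be persisted; the original bytes are not part of it.
stored_record = asdict(analyze_via_relay(b"%PDF-1.7 raw document bytes", fake_model))
```

Keeping the persisted type free of any bytes field makes the "structured result only" rule a property of the data model rather than a convention that each caller has to remember.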

1.3a Optional E2EE cloud storage (separate from analysis flow)

Users on Professional, Agency, or Enterprise plans may optionally enable E2EE cloud storage on a per-case basis. This flow is entirely separate from the analysis flow described above:

  1. The user enables “Cloud Sync” for a case (disabled by default)
  2. Case data is serialised and encrypted client-side using the organisation's AES-256-GCM data encryption key (DEK)
  3. The DEK is itself wrapped by a user RSA-OAEP 4096-bit keypair, which is derived from a passphrase via PBKDF2 (600,000 iterations)
  4. The encrypted blob is uploaded to AWS S3 (EU region: eu-west-1 or eu-central-1)
  5. The server stores only the encrypted blob, encrypted key material, and metadata (org ID, application ID, blob size, timestamps, S3 key, expiry). The server cannot decrypt any case data
  6. Blobs are subject to a 180-day TTL and automatically deleted upon expiry
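
Two of the parameters above, the PBKDF2 iteration count and the 180-day TTL, can be illustrated with the Python standard library. This is a sketch only: the PRF (HMAC-SHA-256), the 32-byte output length, the 16-byte salt, and the function names are assumptions not stated in this assessment. The AES-256-GCM and RSA-OAEP steps are omitted because they need a third-party cryptography library, and the way the derived key material relates to the RSA keypair is not shown.

```python
import hashlib
import os
from datetime import datetime, timedelta, timezone

PBKDF2_ITERATIONS = 600_000      # iteration count stated in step 3
BLOB_TTL = timedelta(days=180)   # TTL stated in step 6


def derive_key(passphrase: str, salt: bytes) -> bytes:
    """Stretch a passphrase into 32 bytes of key material with PBKDF2.

    Assumption: HMAC-SHA-256 as the PRF and a 32-byte output.
    """
    return hashlib.pbkdf2_hmac(
        "sha256", passphrase.encode("utf-8"), salt, PBKDF2_ITERATIONS, dklen=32
    )


def blob_expiry(uploaded_at: datetime) -> datetime:
    """Return the moment at which an uploaded blob reaches its TTL."""
    return uploaded_at + BLOB_TTL


def is_expired(uploaded_at: datetime, now: datetime) -> bool:
    """True once the blob is due for automatic deletion."""
    return now >= blob_expiry(uploaded_at)


salt = os.urandom(16)  # assumption: random per-user salt stored next to the wrapped key
key = derive_key("example passphrase", salt)
uploaded = datetime(2026, 3, 1, tzinfo=timezone.utc)
expires = blob_expiry(uploaded)
```

The salt is not secret, but it must be stored so that the same key material can be re-derived from the passphrase on another device.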

1.4 Categories of personal data

  • Identity data: Passport data pages (name, date of birth, nationality, passport number, photograph)
  • Financial data: Bank statements (account numbers, balances, transaction history)
  • Employment data: Employment letters (employer name, salary, position)
  • Travel data: Flight itineraries, hotel bookings, travel insurance certificates
  • Correspondence: Cover letters, invitation letters

1.5 Special category data (Article 9)

Passport photographs may constitute biometric data as defined by Article 4(14) GDPR when processed to uniquely identify a natural person. While DocuFlag's AI analysis is not designed to perform facial recognition or biometric identification, the raw passport image is transmitted to OpenAI's EU endpoint for data extraction. This constitutes processing of potential special category data.

2. Necessity and proportionality

2.1 Purpose

The processing enables immigration professionals to efficiently verify that their clients' documents meet published consulate requirements. The AI extracts factual data and compares it against an official requirements database, producing structured observations — not recommendations or predictions about visa outcomes.

2.2 Necessity

Manual document comparison is time-consuming and error-prone. AI-assisted extraction reduces the risk of overlooking discrepancies between documents and published requirements, improving the quality of case preparation for immigration professionals.

2.3 Proportionality

  • Only documents voluntarily uploaded by the immigration professional are processed
  • Original documents are never stored by DocuFlag — they remain in the browser
  • OpenAI processes documents transiently with zero data retention
  • Only structured analysis results (field extractions) are retained by DocuFlag
  • No automated decision-making occurs — all observations are presented as factual comparisons for human review

2.4 Legal basis

  • Article 6(1)(b): Processing is necessary for the performance of the contract between DocuFlag and the visa agency (data controller)
  • Article 9(2)(a): For special category data (passport biometrics), the visa agency (data controller) must obtain explicit consent from the data subject (visa applicant) before uploading their passport to DocuFlag

3. Risk assessment

The following risks have been identified and assessed for likelihood and impact on data subjects' rights and freedoms.

Risk | Likelihood | Impact | Mitigation
Document data stored by DocuFlag | Low | High | Documents transit through DocuFlag's EU-hosted analysis server in-memory only and are never written to disk, logged, or cached. The application database only stores structured analysis results, not original document content.
OpenAI stores or trains on document data | Low | High | OpenAI's EU endpoint operates with zero data retention. A Data Processing Agreement (DPA) and Zero Data Retention amendment are in place. API data is not used for model training.
Unauthorized access to analysis results | Medium | Medium | HTTPS (TLS 1.2+) for all data in transit. Short-lived JWT tokens (5-minute expiry) for analysis session authentication. Role-based access control per organization.
Prompt injection extracts PII from documents | Low | Medium | System prompt enforces factual extraction only. Output is validated against a strict JSON schema. The model is instructed to never include data not present in the uploaded document.
API key abuse by malicious user | Medium | Low | API keys are stored on the EU analysis server and never exposed to the browser. Rate limits and spend caps on the OpenAI project. Per-organization credit system.
Encrypted blob breach (E2EE cloud storage) | Low | Low | All data is encrypted client-side with AES-256-GCM before upload. The server stores only encrypted blobs and cannot decrypt them. 180-day TTL limits exposure window. AWS S3 server-side encryption (SSE) provides an additional layer. Users can delete cloud data at any time.
Passphrase / recovery key loss (E2EE cloud storage) | Medium | High | By design, if both the user passphrase and the 256-bit recovery key (provided at setup) are lost, encrypted data is permanently irrecoverable. This is mitigated by: recovery key provided at setup for disaster recovery, clear documentation of the risk at onboarding, and the fact that cloud storage is optional and separate from the primary local-first data store.
Cross-border data transfer outside EU | Low | High | All AI processing uses OpenAI's EU-dedicated endpoint (eu.api.openai.com). Data is processed within European infrastructure. No transfer to the United States.
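
The prompt-injection mitigation relies on rejecting any model output that does not match the expected structure. A minimal sketch of such a check, using only the standard library, is shown here. The field names in `ALLOWED_FIELDS` and the function `validate_model_output` are hypothetical; the actual schema is not described in this assessment, and a production system would more likely use a dedicated JSON Schema validator.

```python
import json

# Hypothetical schema: top-level field name -> required Python type.
ALLOWED_FIELDS = {
    "document_type": str,
    "fields": dict,
    "observations": list,
}


def validate_model_output(raw_json: str) -> dict:
    """Parse model output and reject anything outside the expected shape.

    Raises ValueError for invalid JSON, unknown or missing keys, and
    values of the wrong type.
    """
    data = json.loads(raw_json)  # json.JSONDecodeError is a ValueError subclass
    if not isinstance(data, dict):
        raise ValueError("top-level value must be an object")
    unexpected = set(data) - set(ALLOWED_FIELDS)
    missing = set(ALLOWED_FIELDS) - set(data)
    if unexpected or missing:
        raise ValueError(f"unexpected={sorted(unexpected)} missing={sorted(missing)}")
    for name, expected_type in ALLOWED_FIELDS.items():
        if not isinstance(data[name], expected_type):
            raise ValueError(f"{name} must be of type {expected_type.__name__}")
    return data
```

Rejecting unknown keys, rather than silently dropping them, means an injected instruction that makes the model emit extra fields causes the whole response to fail instead of being partially stored.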

4. Measures and safeguards

  • EU relay processing: Documents go from the user's browser through DocuFlag's EU-hosted analysis server to OpenAI's EU endpoint. The analysis server processes documents in-memory only; no content is written to disk, logged, or cached. DocuFlag's application database only receives structured analysis results.
  • Zero data retention at sub-processor: OpenAI's EU endpoint does not store API requests or responses at rest. Covered by contractual DPA and Zero Data Retention amendment.
  • Encryption: All data in transit is encrypted via TLS 1.2+. Analysis sessions use short-lived JWT tokens (5-minute expiry) for authentication.
  • Access control: Multi-tenant organization model with role-based access (Owner, Admin, Member). Session-based authentication via NextAuth.
  • Audit logging: All document analysis events are logged with timestamps, action types, and actor IDs. No personal data is included in audit records.
  • Data subject rights: Data subjects can exercise their rights (access, rectification, erasure, portability) through the data controller (visa agency). DocuFlag supports case deletion and data export functionality.
  • Data minimization: Only the minimum data necessary for compliance checking is processed. Analysis results use structured field names rather than reproducing full document content.
  • Regular review: This DPIA is reviewed annually or whenever there is a material change to the processing operations described above.
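
The short-lived session tokens mentioned under Encryption can be illustrated with a hand-rolled HS256 JWT built from the standard library. This is a sketch for explanation only: the signing algorithm (HS256), the claim names, and the functions `issue_token` and `verify_token` are assumptions, and a production system would normally use a maintained JWT library.

```python
import base64
import hashlib
import hmac
import json
import time

TOKEN_TTL_SECONDS = 5 * 60  # 5-minute expiry, as stated above


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Decode base64url text, restoring the stripped padding first."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def issue_token(subject: str, secret: bytes, now: int) -> str:
    """Create a signed token that expires TOKEN_TTL_SECONDS after `now`."""
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {"sub": subject, "iat": now, "exp": now + TOKEN_TTL_SECONDS}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return f"{signing_input}.{_b64url(signature)}"


def verify_token(token: str, secret: bytes, now: int) -> dict:
    """Check the signature and expiry; return the claims or raise ValueError."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
    except ValueError:
        raise ValueError("malformed token") from None
    expected = hmac.new(
        secret, f"{header_b64}.{payload_b64}".encode("ascii"), hashlib.sha256
    ).digest()
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise ValueError("bad signature")
    payload = json.loads(_b64url_decode(payload_b64))
    if now >= payload["exp"]:
        raise ValueError("token expired")
    return payload


token = issue_token("analysis-session-1", b"demo-secret", now=int(time.time()))
claims = verify_token(token, b"demo-secret", now=int(time.time()))
```

Passing `now` explicitly keeps expiry checks testable; a stolen token is only usable until its `exp` claim, which bounds the exposure window to five minutes.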

5. Special category data (Article 9)

Passport photographs transmitted during document analysis may constitute biometric data. The following safeguards apply:

  • Legal basis: Article 9(2)(a) — explicit consent from the data subject (visa applicant), obtained by the data controller (visa agency) prior to uploading the passport to DocuFlag
  • No biometric identification: DocuFlag does not perform facial recognition or biometric matching. The AI extracts textual information (name, date of birth, passport number, expiry date) from passport images.
  • No storage of biometric data: Passport images are processed transiently by OpenAI's EU endpoint with zero data retention. DocuFlag does not store, cache, or retain passport photographs.
  • Controller responsibility: The visa agency (data controller) is responsible for ensuring that appropriate consent has been obtained from the visa applicant before uploading passport documents.

6. Sub-processors

Sub-processor | Purpose | Data processed | Location | DPA
OpenAI (EU endpoint) | Document analysis via GPT-5 | Documents (transient, zero retention) | EU (eu.api.openai.com) | Yes
Hetzner Online GmbH | EU VPS for analysis proxy + database | Documents (transient, in-memory) + case metadata | EU | Yes
AWS (Amazon Web Services EMEA SARL) | E2EE cloud storage (encrypted blobs) | Encrypted blobs only (server cannot decrypt) | EU (eu-west-1 / eu-central-1) | Yes
Stripe | Payment processing | Payment details (no document data) | US/EU | Yes

7. Conclusion

The processing operations described in this assessment present manageable risks to data subjects' rights and freedoms. The EU relay architecture ensures that original document content is never stored by DocuFlag, and OpenAI's EU endpoint with zero data retention provides strong guarantees against unauthorized data persistence. The measures and safeguards described above are considered adequate to mitigate the identified risks.

The primary residual risk relates to the transient processing of passport photographs (potential biometric data) by OpenAI's AI systems. This risk is mitigated by contractual safeguards (DPA, zero data retention), the explicit consent requirement under Article 9(2)(a), and the fact that no biometric identification is performed.

Contact

For questions about this assessment, contact: [email protected]