iLoveMD
100% in your browser

Convert PDF to Markdown

Extract clean Markdown from any PDF, entirely in your browser. Text-based PDFs convert instantly, and scanned PDFs are read with on-device OCR. Your file never leaves your device.

  • Always free
  • No uploads
  • No sign-up
  • No tracking


Text-based PDFs convert instantly. Scanned or image-only PDFs have no text layer, so iLoveMD reads them with OCR that runs entirely in your browser. The first scanned PDF downloads the OCR engine once (about 9 MB); after that it works offline.


PDF to Markdown Docs

iLoveMD has two paths for PDFs. Text-based PDFs (the ones where you can select and copy text with your mouse) extract instantly via PDF.js. Scanned PDFs — photographs, screenshots saved as PDF, many government forms — have no text layer, so the converter automatically falls back to in-browser OCR powered by Tesseract.js. Both paths run entirely on your device.

Text-layer extraction

The fast path. Most PDFs produced by Word, LaTeX, browsers, or any tool that authored real text have an embedded text layer. PDF.js parses it, and the converter rebuilds the document structure as Markdown: headings get # prefixes, list items become bullets, link annotations carry through.

How to tell if your PDF has a text layer: open it in any reader and try to select a sentence with your cursor. If the selection follows the text word by word, you have a text layer. If the whole page selects as one image, you're on the scanned path.
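The structure-rebuilding step can be illustrated with a small sketch. It assumes each extracted line carries its text and glyph height (PDF.js text items expose `str` and `height` fields); the `TextLine` type, the function names, and the height ratios used to detect headings are hypothetical and are not iLoveMD's actual rules.

```typescript
// Hypothetical subset of a PDF.js text item: the text and its glyph height.
interface TextLine {
  str: string;
  height: number;
}

// Assumed rule: lines noticeably taller than body text are headings.
function lineToMarkdown(line: TextLine, bodyHeight: number): string {
  const text = line.str.trim();
  if (text === "") return "";
  const ratio = line.height / bodyHeight;
  if (ratio >= 1.6) return "# " + text;
  if (ratio >= 1.3) return "## " + text;
  // Normalize common bullet glyphs to a Markdown list marker.
  const bullet = /^[•◦▪\-*]\s+(.*)$/.exec(text);
  if (bullet) return "- " + bullet[1];
  return text;
}

// Convert all lines, drop empty ones, and separate blocks with blank lines.
function linesToMarkdown(lines: TextLine[], bodyHeight: number): string {
  return lines
    .map((l) => lineToMarkdown(l, bodyHeight))
    .filter((l) => l !== "")
    .join("\n\n");
}
```

A real converter would also need to group text items into lines and estimate the body height, which this sketch takes as given.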

OCR fallback for scanned PDFs

When PDF.js returns little or no text (the heuristic threshold is ~10 non-whitespace characters total across the document), the converter automatically switches to OCR. The Tesseract engine + English language model load once (~9 MB total) and stay cached for subsequent scans. After the first download, OCR works offline.
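The fallback decision described above can be sketched as a simple character count. The function and constant names are hypothetical, and treating "fewer than 10 non-whitespace characters" as the exact cutoff is an assumption based on the approximate threshold stated here.

```typescript
// Assumed default, taken from the ~10 character heuristic described above.
const MIN_TEXT_CHARS = 10;

function countNonWhitespace(text: string): number {
  return text.replace(/\s/g, "").length;
}

// pageTexts: text extracted from each page's text layer (empty string if none).
// Returns true when the whole document has too little text to trust.
function needsOcr(pageTexts: string[], minChars: number = MIN_TEXT_CHARS): boolean {
  let total = 0;
  for (const text of pageTexts) {
    total += countNonWhitespace(text);
    if (total >= minChars) return false; // enough real text, skip OCR
  }
  return true;
}
```

Counting across the whole document rather than per page means a PDF with a single real text page stays on the fast path.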

OCR is slower than text extraction: pages are processed one at a time, with progress updates in the preview pane. Best results come from clear, high-resolution scans of printed text. Handwriting, very low resolution (below ~150 DPI), and complex layouts (multi-column with images) all reduce accuracy.

Tip: if a scan reads as gibberish, the source is usually too low-resolution. Re-scan at 300 DPI in grayscale and re-run.
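To check a scan against these numbers, the effective resolution can be worked out from the image width in pixels and the page width in PDF points (72 points per inch). This is a minimal sketch; the function names and the "low" / "ok" / "good" labels are hypothetical, while the 150 and 300 DPI cutoffs come from the guidance above.

```typescript
const POINTS_PER_INCH = 72; // PDF default user space: 1 point = 1/72 inch

// imageWidthPx: pixel width of the scanned page image.
// pageWidthPt: page width in PDF points (US Letter is 612, A4 is about 595).
function effectiveDpi(imageWidthPx: number, pageWidthPt: number): number {
  const pageWidthIn = pageWidthPt / POINTS_PER_INCH;
  return imageWidthPx / pageWidthIn;
}

// Assumed labels: below 150 DPI is likely to read poorly, 300 DPI is the target.
function scanQuality(dpi: number): "low" | "ok" | "good" {
  if (dpi < 150) return "low";
  if (dpi < 300) return "ok";
  return "good";
}
```

For example, a US Letter page (612 points, 8.5 inches wide) scanned to a 1275-pixel-wide image works out to 150 DPI, and a 2550-pixel-wide image to 300 DPI.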

What carries over

Headings, paragraphs, ordered and unordered lists, hyperlinks (from text-layer PDFs), and basic table-like structures. The output is a readable Markdown document, not a lossless round trip: even a PDF you produced from Markdown will not extract back unchanged, because the conversion is one-way and lossy by design.

What does NOT carry over: images, signatures, form fields, footnotes (extracted as inline text), multi-column layouts (often collapsed to a single column), and font styling (only the characters themselves are kept). Page numbers and headers/footers usually appear as repeated lines, which are sometimes useful and sometimes noise.
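If the repeated header and footer lines are noise for your use case, they can be filtered out after conversion. The sketch below is a hypothetical post-processing step, not something the converter is stated to do: it assumes you have the output split per page, and it drops any line that appears on most pages, treating digit runs as equal so that "Page 1" and "Page 2" match. The `minShare` default of 0.6 is an arbitrary assumption.

```typescript
// Normalize a line so running page numbers compare as equal.
function lineKey(line: string): string {
  return line.trim().replace(/\d+/g, "#");
}

// pages: Markdown text per page. minShare: fraction of pages a line must
// appear on to count as a running header/footer (assumed default).
function stripRepeatedLines(pages: string[], minShare: number = 0.6): string[] {
  if (pages.length < 3) return pages.slice(); // too few pages to judge
  const pagesWithKey: Record<string, number> = Object.create(null);
  for (const page of pages) {
    const seen: Record<string, boolean> = Object.create(null);
    for (const line of page.split("\n")) {
      const key = lineKey(line);
      if (key === "" || seen[key]) continue; // count each key once per page
      seen[key] = true;
      pagesWithKey[key] = (pagesWithKey[key] || 0) + 1;
    }
  }
  const cutoff = Math.ceil(pages.length * minShare);
  return pages.map((page) =>
    page
      .split("\n")
      .filter((line) => {
        const key = lineKey(line);
        return key === "" || pagesWithKey[key] < cutoff;
      })
      .join("\n")
  );
}
```

Because this matches whole lines only, body text that happens to repeat on most pages would also be removed, so review the result before relying on it.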

Privacy posture for OCR

OCR is the operation people are most nervous about, because many online services upload the scan to their servers and bill per page. iLoveMD does not. The Tesseract engine runs as WebAssembly in your browser; the only network traffic is the one-time download of the engine + language model on your first scan. The scan itself never leaves your device.

This is why the CSP includes wasm-unsafe-eval: Tesseract requires it to compile its WASM module. The token is the minimum relaxation required; it permits WebAssembly compilation only, not arbitrary JS eval. Without it, scanned PDFs would silently fail to read.
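As an illustration of where that token sits, here is a sketch that assembles a policy string. The directive set is a hypothetical minimal example, not iLoveMD's actual policy; only the `'wasm-unsafe-eval'` source expression in `script-src` is taken from the description above, and `buildCsp` is an illustrative helper name.

```typescript
// Join directives into a Content-Security-Policy header value.
function buildCsp(directives: Record<string, string[]>): string {
  return Object.keys(directives)
    .map((name) => name + " " + directives[name].join(" "))
    .join("; ");
}

// Hypothetical policy: same-origin scripts plus WebAssembly compilation.
// Note the absence of 'unsafe-eval', so JavaScript eval() stays blocked.
const exampleCsp = buildCsp({
  "default-src": ["'self'"],
  "script-src": ["'self'", "'wasm-unsafe-eval'"],
});
```

A real deployment would likely need further directives (for example for workers or styles), which are left out here because the source does not describe them.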

Mermaid in source PDFs

If your PDF was produced from Markdown that contained fenced `mermaid` code blocks (for example, exported from our Markdown → PDF converter), the diagrams in the PDF are rasterized images. The reverse converter extracts the surrounding text but not the embedded image, so the diagrams will be missing from the output Markdown.

If you have the original Mermaid source, paste it back into the output via the Mermaid Editor to round-trip. If you only have the PDF and need the diagram, treat the image as documentation in its own right.

How to convert PDF to Markdown

1. Drop your PDF

Drop a .pdf file onto the upload strip, or click Choose Files to pick one.

2. We read it in your browser

iLoveMD reads the PDF text layer locally with PDF.js and formats it as Markdown. If the PDF is scanned and has no text layer, it runs OCR in your browser instead. Nothing is uploaded.

3. Copy or download the Markdown

Edit the result if you like, then copy it or download it as a .md file.

Frequently asked questions

Does my PDF get uploaded anywhere?

No. The conversion runs entirely in your browser. Your PDF never leaves your device, including scanned PDFs read with OCR.

Can I convert a scanned PDF?

Yes. When a PDF has no text layer (scanned documents, screenshots saved as PDF, many government forms), iLoveMD runs OCR to read the text directly in your browser. The first scanned PDF downloads the OCR engine once (about 9 MB); after that it works offline. OCR is slower than text extraction and works best on clear scans.

How do I tell if my PDF is text-based?

If you can select and copy text in the PDF with your mouse, it has a text layer and converts instantly. If the whole page selects as one image, it is scanned, and iLoveMD reads it with OCR instead.

Does it work offline?

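Once the page has loaded, yes. Text-based conversions run locally without an internet connection. OCR for scanned PDFs also works offline after the one-time download of the OCR engine (about 9 MB) on your first scan.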

Can I convert Markdown back into a PDF?

Yes. Use the Markdown to PDF tool to render Markdown as a PDF.