Jay Taylor's notes

oliversauter/pdf2text4extensions

Original source (github.com)
Tags: javascript pdf document-converter github.com
Clipped on: 2018-08-12
Extract all text from a PDF; works in browser extensions; pure JavaScript
JavaScript

README.md

pdf2text4extensions

Simple text extractor for PDFs

Workflow:

  1. Opens the PDF's URL with an XMLHttpRequest: openPDF()

  2. Extracts all the text from each page: getContentPDF(). You can skip the first step and pass it a Blob directly.
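
The two steps above can be sketched roughly as follows. This is not the library's actual source: the function names (`fetchPdfBytes`, `joinTextItems`, `extractPdfText`, `extractTextFromBlob`) and their signatures are hypothetical, and the sketch assumes the caller passes in the PDF.js module as `pdfjsLib`. It relies only on the PDF.js calls `getDocument`, `getPage`, and `getTextContent`.

```javascript
// Sketch only: not this library's actual source. All function names and
// signatures below are hypothetical, chosen for illustration.

// Step 1: download the PDF as raw bytes with XMLHttpRequest (browser only).
function fetchPdfBytes(url) {
  return new Promise((resolve, reject) => {
    const xhr = new XMLHttpRequest();
    xhr.open('GET', url);
    xhr.responseType = 'arraybuffer';
    xhr.onload = () => {
      if (xhr.status >= 200 && xhr.status < 300) {
        resolve(new Uint8Array(xhr.response));
      } else {
        reject(new Error('Request failed with status ' + xhr.status));
      }
    };
    xhr.onerror = () => reject(new Error('Network error while fetching ' + url));
    xhr.send();
  });
}

// Join the text fragments PDF.js reports for one page into a single string,
// dropping fragments that are empty or whitespace-only.
function joinTextItems(items) {
  return items
    .map((item) => item.str)
    .filter((str) => typeof str === 'string' && str.trim() !== '')
    .join(' ');
}

// Step 2: extract the text of every page. `pdfjsLib` is the PDF.js module,
// supplied by the caller; `bytes` is a Uint8Array holding the PDF file.
async function extractPdfText(pdfjsLib, bytes) {
  const pdf = await pdfjsLib.getDocument({ data: bytes }).promise;
  const pages = [];
  // PDF.js page numbers start at 1.
  for (let pageNumber = 1; pageNumber <= pdf.numPages; pageNumber++) {
    const page = await pdf.getPage(pageNumber);
    const textContent = await page.getTextContent();
    pages.push(joinTextItems(textContent.items));
  }
  return pages.join('\n');
}

// Skipping step 1: read a Blob into bytes and extract its text directly.
async function extractTextFromBlob(pdfjsLib, blob) {
  const bytes = new Uint8Array(await blob.arrayBuffer());
  return extractPdfText(pdfjsLib, bytes);
}
```

In a browser or extension context, a call might look like `fetchPdfBytes(url).then((bytes) => extractPdfText(pdfjsLib, bytes))`. Passing `pdfjsLib` in as a parameter keeps the sketch independent of how PDF.js is loaded.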

Background

This library is part of the WorldBrain Project. We are building a search engine that lets you run full-text searches across your browsing history, bookmarks, Evernote, Pocket, Google Drive, etc.

Acknowledgements:

This library is made possible with the help of the PDF.js library.