Jay Taylor's notes


What are good tools to extract key words and/or topics/tags from a random paragraph of text? - Quora

Original source (www.quora.com)
Tags: nlp www.quora.com
Clipped on: 2016-06-29

22 Answers
Abhishek Shivkumar, IBM Watson Labs
25.4k Views · Most Viewed Writer in Natural Language Processing with 60+ answers
First, check the answers to "What is the best text analytics API + service?" The answers there point to good APIs that extract keywords and/or topics.

Also, Topicmarks [1] recently released a service that reads your text and provides facts, summaries, and keywords. I looked at their results, and it is something you might want to explore.

[1]: http://topicmarks.com/ (closed operations)
Vineet Yadav, M.Tech Computational linguistics, IIIT-H Text analytics and Natural language...
27.1k Views · Most Viewed Writer in Natural Language Processing with 120+ answers
  • Open source tools
  • Commercial APIs
Antonio Matarranz, High-tech Marketing, Madrid (Spain)
19.2k Views · Most Viewed Writer in Semantic Web
Textalytics (Meaning as a Service) is a cloud-based semantic API that offers a Topic Extraction service (entities, concepts).
In addition, you can tag your text with theme categories, feature-level sentiment, etc.
Disclosure: I work for Daedalus, the company that develops Textalytics.

UPDATE: Textalytics has been rebranded as MeaningCloud ("Web Services for Text Analytics and Mining | MeaningCloud"), and it features a Topics Extraction service ("Topics extraction & named entity recognition | MeaningCloud").
Sujit Pal, search engineer interested in semantic search, text analytics and NLP, machin...
17.2k Views · Most Viewed Writer in Semantic Web
All great answers, but I see no one has mentioned RAKE (Rapid Automatic Keyword Extraction), so I am mentioning it. It works quite nicely and is computationally light. The algorithm splits the text at stopwords and phrase delimiters into candidate phrases (runs of non-stopwords), scores each word from its frequency and its co-occurrence with other words inside those candidates, and scores each candidate as the sum of its word scores. It can then also join candidates that repeatedly occur next to each other across an interior stopword. It is described in Michael W. Berry's book Text Mining: Applications and Theory (free PDF available if you search or use Amazon's look inside feature). There is also a Python implementation at https://github.com/aneesha/RAKE/...
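
To make the idea concrete, here is a minimal Python sketch of RAKE-style scoring. It is not the code from the repository linked above: the tiny STOPWORDS set, the function names, and the degree/frequency word score are illustrative assumptions, and the optional step of joining candidates across interior stopwords is left out.

```python
import re
from collections import defaultdict

# Tiny illustrative stopword list (an assumption; real use needs a fuller one).
STOPWORDS = {
    "a", "an", "and", "are", "as", "at", "by", "for", "from", "in",
    "is", "it", "of", "on", "or", "over", "the", "to", "with",
}


def candidate_phrases(text):
    """Split text into runs of non-stopwords, also breaking at punctuation."""
    phrases = []
    for fragment in re.split(r"[^\w\s-]+", text.lower()):
        current = []
        for word in fragment.split():
            if word in STOPWORDS:
                if current:
                    phrases.append(current)
                current = []
            else:
                current.append(word)
        if current:
            phrases.append(current)
    return phrases


def rake(text):
    """Return (phrase, score) pairs, highest score first."""
    phrases = candidate_phrases(text)
    freq = defaultdict(int)    # how often each word occurs in candidates
    degree = defaultdict(int)  # co-occurrence count, including the word itself
    for phrase in phrases:
        for word in phrase:
            freq[word] += 1
            degree[word] += len(phrase)
    word_score = {word: degree[word] / freq[word] for word in freq}
    # A candidate's score is the sum of the scores of its words.
    scores = {" ".join(p): sum(word_score[w] for w in p) for p in phrases}
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    sample = ("Keyword extraction is the task of finding important phrases. "
              "Keyword extraction tools are useful.")
    for phrase, score in rake(sample):
        print(f"{score:5.1f}  {phrase}")
```

Because each word is scored by degree divided by frequency, longer candidates tend to rank higher, which is why the method favors multi-word keyphrases over single words.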
Kumar Ishan, Founder @ ReaderDeck
You can also look at jatetoolkit (Java Automatic Term Extraction toolkit).
It implements the following statistical algorithms for keyword/keyphrase extraction:

  • Basic term frequency
  • Average term frequency in the corpus (term frequency / document frequency)
  • TF-IDF
  • RIDF (residual IDF) - Inverse Document Frequency (IDF): A Measure of Deviations from Poisson
  • Weirdness - Weirdness indexing for logical document extrapolation and retrieval
  • C-value - A methodology for automatic term recognition.
  • GlossEx - Glossary extraction and knowledge in large organisations via semantic web technologies.
  • TermEx - Termextractor: a web application to learn the shared terminology of emergent web communities.
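
As a rough illustration of the simplest measures in the list above, the following Python sketch computes TF-IDF over a small set of documents. It is not jatetoolkit code (that toolkit is written in Java); the function names, the regex tokenizer, and the specific weighting (relative term frequency times log(N / document frequency)) are assumptions, and other TF-IDF variants exist.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphanumeric runs as tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tf_idf(documents):
    """Return one {term: score} dict per document."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid dividing by zero on empty documents
        scores.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


def top_terms(documents, k=3):
    """Return the k highest-scoring terms for each document."""
    return [
        sorted(doc_scores, key=lambda term: (-doc_scores[term], term))[:k]
        for doc_scores in tf_idf(documents)
    ]


if __name__ == "__main__":
    docs = [
        "the cat sat on the mat",
        "the dog sat on the log",
        "the cat chased the dog",
    ]
    for doc, terms in zip(docs, top_terms(docs)):
        print(doc, "->", terms)
```

A term that appears in every document gets a score of zero under this weighting, so very common words drop out without needing a stopword list.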
Yura Koroliov, Natural Language Processing developer
12.2k Views · Most Viewed Writer in Information Extraction
There is a good tutorial on significant phrase extraction at the LingPipe site: http://alias-i.com/lingpipe/demo...

The most useful and scalable implementation of the Xtract algorithm I found is in the Dragon Toolkit (dragon.ischool.drexel.edu). It is English-only, but it uses a smart WordNet stemmer and POS taggers in addition to purely probabilistic (chi-square / information gain) phrase scoring.
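
For readers unfamiliar with probabilistic phrase scoring, here is a small Python sketch of chi-square scoring for adjacent word pairs. It is neither the Xtract algorithm nor Dragon Toolkit code; the function name and tokenizer are assumptions, and it skips the stemming and POS filtering mentioned above.

```python
import re
from collections import Counter


def chi_square_bigrams(text):
    """Score adjacent word pairs with Pearson's chi-square on a 2x2 table.

    Only pairs that occur together more often than chance predicts are kept.
    """
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    bigrams = list(zip(tokens, tokens[1:]))
    n = len(bigrams)
    pair_counts = Counter(bigrams)
    first_counts = Counter(w1 for w1, _ in bigrams)
    second_counts = Counter(w2 for _, w2 in bigrams)

    scores = {}
    for (w1, w2), o11 in pair_counts.items():
        o12 = first_counts[w1] - o11   # w1 followed by some other word
        o21 = second_counts[w2] - o11  # some other word followed by w2
        o22 = n - o11 - o12 - o21      # pairs involving neither position
        denom = (o11 + o12) * (o11 + o21) * (o12 + o22) * (o21 + o22)
        if denom == 0 or o11 * o22 <= o12 * o21:
            continue  # degenerate table, or no positive association
        scores[(w1, w2)] = n * (o11 * o22 - o12 * o21) ** 2 / denom
    return scores


if __name__ == "__main__":
    sample = "New York is big. New York is busy. The city is big."
    ranked = sorted(chi_square_bigrams(sample).items(), key=lambda kv: -kv[1])
    for pair, score in ranked[:5]:
        print(f"{score:6.2f}  {' '.join(pair)}")
```

On very small samples the statistic is unreliable, so in practice such scores are usually combined with a minimum frequency threshold.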
