EDIT: Completed the project 🙂 see here
I started working on this project 6-7 months ago. I left it midway when I got busy with something else, but now I am back on it :D. The plan was to create something insanely awesome, but then I recalled some advice I was once given: first create a Minimum Viable Product, and only then go for more features.
Currently, I am working on a pretty naive text summarizer that implements a basic text scoring algorithm with some help from NLTK. Although I’ve worked with Stanford’s CoreNLP before, I wanted to exploit the power of NLTK.
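To give a rough idea of what a basic text scoring summarizer can look like, here is a minimal sketch. It is not the actual script from this project: the function names, the tiny stopword list, and the regex-based splitting are all assumptions made for illustration, and the scoring rule (average word frequency per sentence) is just one common naive choice. It uses only the standard library so it runs anywhere; with NLTK, `nltk.sent_tokenize`, `nltk.word_tokenize`, and the `stopwords` corpus would likely replace the hand-rolled helpers.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption); NLTK's stopwords corpus is much larger.
STOPWORDS = {"a", "an", "and", "are", "as", "at", "by", "for", "in",
             "is", "it", "of", "on", "the", "to", "with"}


def split_sentences(text):
    # Naive splitter: break after '.', '!' or '?' when followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    # Lowercased word tokens with stopwords removed.
    return [w for w in re.findall(r"[a-z']+", sentence.lower()) if w not in STOPWORDS]


def summarize(text, max_sentences=2):
    sentences = split_sentences(text)
    # Word frequencies over the whole text drive the sentence scores.
    freq = Counter(w for s in sentences for w in tokenize(s))

    def score(sentence):
        words = tokenize(sentence)
        # Average frequency, so long sentences are not favoured automatically.
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    # Rank sentence indices by score, keep the best ones, restore document order.
    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:max_sentences])
    return " ".join(sentences[i] for i in chosen)
```

Calling `summarize(article_text, max_sentences=3)` on an article string would return the three highest-scoring sentences in their original order. Averaging instead of summing the frequencies is a deliberate choice here, since a plain sum tends to reward sentences just for being long.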
I’ve tested the script by summarizing some articles from techcrunch.com and comparing the results with those from some online text summarizing websites like:
- http://smmry.com (this one is already pretty bad at summarizing texts)
- http://freesummarizer.com (not so good but better than smmry.com)
- http://autosummarizer.com/index.php (most of my results match this one)
Once this basic implementation works fine, I’ll try to implement some more complex language processing concepts, for which I may need more of NLTK or even CoreNLP (personally, I like Stanford’s CoreNLP more).
I’ve also created a separate branch and initialized it with a Django-based web application. Once the script works fine, I’ll try to host it as a web application for text summarizing. But my priority and focus are on creating a more efficient summarizing script 😀